I’ve tried Copilot Vision: It felt creepy, yet somewhat useful — Here’s my take

Copilot Vision is now available to users in the United States on Windows 11 and Windows 10. It’s an AI-powered assistant designed to help with various tasks on your computer. Curious about its capabilities? Here’s an overview of the feature, how to get it, and my own experience using it.

“Vision” is a feature within the Copilot app that lets you collaborate with the chatbot by sharing your screen. The chatbot can then see the content on your screen and act as a second set of eyes, offering immediate help, answers, and insights based on what’s shown in your apps or web browser.

The feature is opt-in, meaning that you have to turn it on manually. It’s available in Copilot app version 1.25061.104.0 and later. If you use Microsoft Edge, you can also access it through the Copilot integration in the browser.

Keep in mind that the AI can see almost anything on the screen you share, except content protected by Digital Rights Management (DRM) and other restricted material.

On both Windows 10 and Windows 11, the feature is available to all users, whether or not they have a Copilot Pro subscription. However, if you want to use Copilot Vision on iOS or Android devices, you’ll need a subscription.

In this guide, I’ll outline the steps to get started with the feature and share my experience.

How to get and enable Copilot Vision on Windows 11

Copilot Vision is now available in the United States, starting with version 1.25061.104.0 of the Copilot app. To get it, open the Microsoft Store, go to the Downloads section, and click the “Check for updates” button to confirm that your computer has the latest version of the app installed.
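If you want to confirm that the version shown in the Microsoft Store meets the requirement, the comparison can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the sample version numbers other than 1.25061.104.0 are made up, and the sketch assumes you have already read the installed version yourself.

```python
# Minimum Copilot app version cited in this guide.
MIN_VISION_VERSION = "1.25061.104.0"


def parse_version(version: str) -> tuple[int, ...]:
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in version.strip().split("."))


def supports_vision(installed: str, minimum: str = MIN_VISION_VERSION) -> bool:
    """Return True when the installed version is at or above the minimum."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # Hypothetical version numbers, used only to exercise the comparison.
    for candidate in ("1.25061.104.0", "1.25051.98.0", "1.25062.1.0"):
        print(candidate, supports_vision(candidate))
```

The parts are compared as numbers rather than as text because a plain string comparison would rank "1.25061.99.0" above "1.25061.104.0".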

If you previously uninstalled the Copilot app, you can reinstall it from the Microsoft Store.

Microsoft describes Vision as an opt-in feature, which is technically correct because you must explicitly agree to share your screen. However, the feature itself appears to be enabled by default, since there’s no setting to turn it off completely.

On the Copilot settings page, you’ll find a toggle for the “Visual Highlights” option. This setting controls whether the AI can visually point out actions on the screen, but it doesn’t affect whether Vision is available within the app.

If you’re uneasy about Copilot, you can open “Settings”, go to “Apps”, select “Installed apps”, and find “Copilot”. Then choose the “Uninstall” option to remove it from your device.

My experience using Copilot Vision on Windows 11

My first encounter with Copilot Vision took me aback because it was such an unfamiliar experience. Unlike traditional chatbots that need explicit prompts, this one seemed to understand the context on its own and offered help accordingly.

It offers a glimpse of what the future of computing might look like, even though it’s not always accurate.

Getting started

Once I confirmed that Copilot Vision was available on my computer, I opened a few applications, launched the Copilot app from the Start menu, clicked the “Vision” (glasses) button, chose the application to share, and turned on the “Share” toggle.

App question test

In the Notepad app, I asked Copilot to show me how to change the default font for the program. However, the guidance it gave was incorrect.

Copilot first suggested checking the “View” menu, but the font setting isn’t there. Instead, you can find it in the “Edit” menu or by clicking the “Gear” button in the upper-right corner.

Eventually, it landed on the correct option by eliminating the others. Interestingly, the chatbot blamed its error on mixing up different versions of Notepad, yet I don’t recall Notepad ever having font settings in the “View” menu.

Settings question test

Next, I opened the Settings app on Windows 11 and asked Copilot how to install the latest system updates on my device.

In this case, Copilot Vision correctly recognized that I was in the Settings app. It then guided me to the “Windows Update” section and highlighted the “Check for updates” button.

Next, I asked a related but different question: how to stop my computer from sending updates to other devices while it downloads its own. The question isn’t particularly complex, but it tests the assistant’s ability to understand subtler user intent.

On the first try, Vision didn’t grasp the specifics of the request and gave general guidance for updating the device instead. After I reworded the question, it understood the query.

Even so, its instructions contradicted what was visible on the screen. For instance, it claimed that I had disabled the update-sharing feature when I hadn’t done anything of the sort. This suggests that the assistant was making assumptions based on expected behavior instead of examining the actual state of the system.

Recognizing objects test

To test Copilot’s ability to recognize objects on the screen, I showed it an image and asked it to identify the object in the picture.

In this case, the image showed a red jacket. The chatbot identified it accurately and offered additional details about the product when I asked for more information.

Although it knew the jacket was from Amazon, the chatbot didn’t realize that I wasn’t on the product page, since I had opened the image in a separate tab. As a result, it couldn’t provide the product page details.

Extracting text test

Another useful feature of Copilot Vision is the ability to extract text from images. Windows previously lacked this capability, but there are now several ways to do it.

You can already extract text with PowerToys, the Snipping Tool, and Click to Do, and now you can use Copilot Vision as well.

To test this, I opened the Game Mode page in the Settings app and asked the chatbot to read the content aloud. It successfully read out everything shown on that page.

The limitation is that it can’t copy the extracted text to the clipboard or let you select it, as Click to Do does. However, the chatbot did record the whole interaction in the conversation history within the Copilot app.

Writing text test

While sharing your screen, you can ask questions about virtually anything. If you’re working with a document, you can ask the AI to examine charts, graphs, or any other data in it.

You can also ask it to describe any scene, image, landmark, or place shown on the screen.

If you’re working on a piece of text, you can ask it to read the text and suggest improvements.

For my final test, I opened some text in Notepad and asked the chatbot to expand it, and it proposed a different version.

It was impressive that the assistant could read the text on the screen and propose an alternate version. However, it was less precise when it came to understanding what I wanted to do next.

I knew that replacing the text directly with the proposed version wasn’t possible, so I asked whether I could paste the alternate text into the document. Instead of answering, it pointed me to the part of the file where I should make the insertion.

To get the text, you first have to close the Copilot Vision session. Then open the chat history in the Copilot app, and select and copy the text you need.

Final thoughts

As someone with extensive experience writing guides, I find the technology’s ability to understand and help with on-screen content truly remarkable.

On the other hand, it makes a lot of mistakes, which limits its usefulness unless you’re already familiar with the task at hand.

Also, while it may sound knowledgeable, it’s important to remember that it’s an AI program that generates responses based on information it has learned from the internet.

At times, talking to Copilot Vision reminds me of speaking with technical support on the phone. Even though it can see my screen, it walks me through steps the way a support agent would, and it often doesn’t seem to notice that I’ve already completed the task.

In general, you often need to give the AI very precise instructions, spelling out each action exactly. This defeats the purpose of an AI that’s meant to understand user intent naturally. Keep in mind, though, that this feature is aimed primarily at non-technical users, since technical users are less likely to depend on it.

In short, although Vision shows promise as a personal assistant, its ability to understand context and respond to it accurately needs further improvement. Instead of acting on what’s currently on the screen, it often appears to be making educated guesses from the command it was given.

I didn’t include the exact questions I asked in this guide because the specific wording isn’t important. The point of a chatbot is to hold a natural conversation, just as you would with another person.

It’s also important to note that this tool isn’t designed to perform actions for you. It can only examine and interpret the content displayed on your screen. To take actions, the AI would have to work as an agent, and at this point, only Copilot+ PCs offer that capability, in a limited form within the Settings app.

Finally, although this feature is free, it comes with certain limitations. For example, even if you have a Microsoft 365 subscription, the app suggests upgrading to Copilot Pro after a certain number of uses.

In other words, if you’re in the middle of troubleshooting an issue, you might be asked to pay before the AI helper will finish guiding you through the fix.


2025-06-17 14:10