Microsoft's AI learns to press buttons in place of employees

The Copilot logo on a smartphone against the background of the Microsoft logo. Photo: Jonathan Raa/NurPhoto

Microsoft has expanded the capabilities of Copilot Studio by introducing a "computer use" tool. Corporate AI agents can now literally "click" buttons, select menu items, and type text into interface fields, just as an employee would at a computer.

The Verge reports.

How the new feature works in Copilot Studio

The innovation is reminiscent of OpenAI's Operator and of Anthropic's Claude feature, which is also called "computer use", but offers more freedom: the agent can work with any website or application, even one that has no API.

Charles Lamanna, Corporate Vice President of Business & Industry Copilot at Microsoft, explains that "computer use" allows agents to perform tasks that previously required a human. If a user can operate a program manually, Copilot can do the same. The system recognises interface elements, presses the necessary buttons, fills in forms, and keeps working even after the design is updated or elements move to a different place on the screen.

The location of the new "computer use" function in Copilot Studio. Photo: Microsoft

In practice, Copilot Studio can automate routine data entry into a CRM, conduct market research, or process invoices. The function detects changes on the page and adapts to them, so an automation does not "break" when a website or application is updated.

This is not Microsoft's first attempt to teach Copilot to act on the user's behalf. At the beginning of April, the consumer version of Copilot gained the Actions feature, which can book restaurant tables, buy tickets, and place orders in the background. However, Actions works only with a limited number of partners, while Copilot Studio agents can interact with any web or desktop application without additional agreements.

Thus, with "computer use", businesses get a versatile tool for building AI agents that reproduce the way people work with applications and websites, removing the need for manual operations where API integration is difficult or impossible.

As a reminder, Microsoft was earlier reported to be among the candidates to buy TikTok.

We also wrote that Microsoft is advising users of older devices that do not support Windows 11 to dispose of them. In this way, the company is reminding users that Windows 10 support will soon be discontinued.