Google Launches Gemini 2.5 Computer Use: An AI Model with Human-Like Web Browsing Skills
Google has introduced Gemini 2.5 Computer Use, an AI model that browses the web the way a person does: it fills in forms, clicks, scrolls, and types, enabling more advanced online automation.
Gemini 2.5 Computer Use: What It Can Do
Built on Gemini 2.5 Pro, this AI model can perform a wide range of web-based tasks that traditionally required human input. Using a virtual browser, it can:
- Fill and submit online forms
- Click links and buttons
- Scroll through web pages
- Type or use keyboard shortcuts
- Hover the cursor
- Open dropdown menus
Essentially, Gemini 2.5 Computer Use acts like a virtual assistant that can directly operate websites, not just interpret or summarize them.
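To make the idea of "operating" a page concrete, here is a minimal Python sketch of how a client program might carry out actions that a model proposes. It is not Google's API: the action names (click, type_text, scroll, hover), the FakePage class, and the execute function are all hypothetical, and the page is simulated so the example runs without a browser.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FakePage:
    """Stand-in for a browser page (hypothetical); it only records actions."""
    fields: Dict[str, str] = field(default_factory=dict)
    scroll_y: int = 0
    log: List[str] = field(default_factory=list)


def click(page: FakePage, target: str) -> None:
    page.log.append(f"click:{target}")


def type_text(page: FakePage, target: str, text: str) -> None:
    page.fields[target] = text
    page.log.append(f"type:{target}")


def scroll(page: FakePage, dy: int) -> None:
    # Clamp at zero so the page never scrolls above its top edge.
    page.scroll_y = max(0, page.scroll_y + dy)
    page.log.append(f"scroll:{dy}")


def hover(page: FakePage, target: str) -> None:
    page.log.append(f"hover:{target}")


# Hypothetical action names mapped to handlers; not the model's real action set.
HANDLERS: Dict[str, Callable[..., None]] = {
    "click": click,
    "type_text": type_text,
    "scroll": scroll,
    "hover": hover,
}


def execute(page: FakePage, action: dict) -> None:
    """Run one proposed action, rejecting any name that has no handler."""
    name = action["name"]
    if name not in HANDLERS:
        raise ValueError(f"unsupported action: {name}")
    HANDLERS[name](page, **action.get("args", {}))


# A made-up sequence of actions, as a model might propose them for a form.
page = FakePage()
steps = [
    {"name": "type_text", "args": {"target": "email", "text": "a@example.com"}},
    {"name": "scroll", "args": {"dy": 400}},
    {"name": "hover", "args": {"target": "menu"}},
    {"name": "click", "args": {"target": "submit"}},
]
for step in steps:
    execute(page, step)

print(page.fields, page.scroll_y, page.log)
```

Keeping the handlers in a single lookup table means anything outside the known set is refused rather than guessed at, which is one plausible way to keep an agent inside a fixed list of actions.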
According to Google, the model outperforms other AI systems on multiple web and mobile interaction benchmarks while offering lower latency for smoother performance.
How It Works and Where It’s Available
Developers can now access Gemini 2.5 Computer Use through Google AI Studio and Vertex AI. The company showed demo videos (sped up 3x) of the model handling tasks such as organizing sticky notes on a virtual board using drag-and-drop commands.
In the demo, the AI successfully categorized virtual notes on a site called stick-note-jam.web.app when prompted:
“Go to the site and organize the tasks into the right categories. Drag them there if not.”
This demonstration highlights Gemini’s contextual understanding and ability to follow complex multi-step instructions just like a human operator.
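The "drag them there if not" part of that prompt can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the board layout, the note data, the Region and Note types, and the "drag_and_drop" action label are all hypothetical, and the category of each note is given up front, whereas in the demo the model has to infer it from the note's text.

```python
from typing import Dict, List, NamedTuple, Tuple


class Region(NamedTuple):
    """Rectangular category area on the board (hypothetical layout)."""
    x0: int
    y0: int
    x1: int
    y1: int

    def contains(self, x: int, y: int) -> bool:
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1

    def center(self) -> Tuple[int, int]:
        return ((self.x0 + self.x1) // 2, (self.y0 + self.y1) // 2)


class Note(NamedTuple):
    """A sticky note with its intended category and current position."""
    text: str
    category: str
    x: int
    y: int


def plan_drags(notes: List[Note], regions: Dict[str, Region]) -> List[dict]:
    """Return one drag per note that sits outside its category's region."""
    drags = []
    for note in notes:
        region = regions[note.category]
        if region.contains(note.x, note.y):
            continue  # already in the right place, so leave it alone
        drags.append({
            "name": "drag_and_drop",  # hypothetical action label
            "from": (note.x, note.y),
            "to": region.center(),
        })
    return drags


# Made-up board: two categories side by side, one note misplaced.
regions = {
    "Design": Region(0, 0, 100, 100),
    "Logistics": Region(100, 0, 200, 100),
}
notes = [
    Note("Print flyers", "Design", 150, 50),    # sits in Logistics by mistake
    Note("Book venue", "Logistics", 120, 40),   # already correct
]

for drag in plan_drags(notes, regions):
    print(drag)
```

Only the misplaced note produces a drag, which mirrors the conditional wording of the prompt: notes that are already in the right category are left untouched.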
Limitations and Current Capabilities
Currently, Gemini 2.5 Computer Use supports 13 types of actions, and its functionality is limited to the browser environment.
Google clarifies that the model “is not yet optimized for desktop OS-level control,” meaning it can’t yet interact with full desktop applications.
Despite this limitation, the model is a notable step toward AI that operates human-facing interfaces directly, and it offers an early sign of what future autonomous AI agents could achieve.
Practical Use Cases and Future Potential
Google teams are already employing Gemini 2.5 Computer Use for UI testing, automating what were previously manual processes.
The model is also being integrated into various internal and experimental projects, including:
- AI Mode in Search
- Firebase Testing Agent
- Project Mariner, an AI platform where users can assign agents to handle tasks like research, planning, and data entry through natural language.
These integrations show how Google aims to streamline software testing, research, and workflow automation through AI-driven systems.
Conclusion
With Gemini 2.5 Computer Use, Google has taken a significant step toward AI systems that don’t just “think” but also act like humans on the web. While still in its early stages with limited actions, the model’s real-world applications — from testing to digital task management — could redefine how AI interacts with technology in the near future.