Computer Use
Computer use lets your agent control a real desktop. It clicks, types, scrolls, and reads the screen the same way a person would. Use it for anything that needs a browser, GUI app, or terminal.
When it activates
When your account has sandbox execution enabled, every run gets a full Linux desktop alongside your normal tools. The agent gets access to:
- computer: mouse clicks, keyboard input, screenshots
- bash: shell commands, file system, processes
- str_replace_based_edit_tool: read and edit files
Your existing integrations (Slack, GitHub, etc.) are available in the same run. Computer use tools and MCP integrations work side by side.
What the agent can do
The desktop includes a browser, a terminal, and a file system. The agent can:
- Browse the web and fill out forms
- Run scripts and commands
- Open, edit, and save files
- Use any installed GUI application
- Copy, paste, drag, scroll, and use keyboard shortcuts
After every action, the agent sees a screenshot and decides what to do next.
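The act-then-observe cycle described here can be sketched roughly as follows. This is an illustration of the general pattern only, not the platform's implementation: `Action`, `run_loop`, and the `decide`, `perform`, and `take_screenshot` callables are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class Action:
    """One desktop action. The kinds used here are illustrative, not a real schema."""
    kind: str                                   # e.g. "click", "type", "scroll"
    payload: dict = field(default_factory=dict)


def run_loop(
    decide: Callable[[Any, List[Action]], Optional[Action]],
    perform: Callable[[Action], None],
    take_screenshot: Callable[[], Any],
    max_steps: int = 20,
) -> List[Action]:
    """Repeat: look at the latest screenshot, pick an action, perform it.

    All three callables are hypothetical stand-ins: `decide` plays the role
    of the model, `perform` drives the desktop, `take_screenshot` captures it.
    """
    history: List[Action] = []
    screenshot = take_screenshot()              # initial view of the desktop
    for _ in range(max_steps):
        action = decide(screenshot, history)
        if action is None:                      # the agent considers the task done
            break
        perform(action)
        history.append(action)
        screenshot = take_screenshot()          # a fresh screenshot after every action
    return history
```

The `max_steps` cap is a design choice for the sketch: it keeps a confused agent from looping forever, at the cost of cutting off very long tasks.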
Stream events
Computer use runs produce the same SSE event stream as regular runs, plus a few extra events.
Screenshots appear in the chat after each action. The web UI renders them inline, and you can also read them from the stream.
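A minimal sketch of pulling screenshots out of an SSE stream, using only the Python standard library. The line parsing follows the standard SSE wire format (`event:` and `data:` fields, with a blank line ending each event). The event name `screenshot` and the JSON field `image_base64` are assumptions made for illustration; check the actual event schema before relying on them.

```python
import base64
import json
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from raw SSE lines."""
    event, data = "message", []                 # "message" is the SSE default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                          # a blank line terminates one event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):                # comment line, often used as keep-alive
            continue
        name, _, value = line.partition(":")
        if value.startswith(" "):               # one leading space after the colon is dropped
            value = value[1:]
        if name == "event":
            event = value
        elif name == "data":
            data.append(value)


def extract_screenshots(
    lines: Iterable[str],
    event_name: str = "screenshot",             # ASSUMPTION: hypothetical event name
    image_field: str = "image_base64",          # ASSUMPTION: hypothetical payload field
) -> Iterator[bytes]:
    """Yield decoded image bytes for every event matching `event_name`."""
    for event, data in parse_sse(lines):
        if event == event_name:
            yield base64.b64decode(json.loads(data)[image_field])
```

Any iterable of text lines works as input, for example a streamed HTTP response body read line by line. Each yielded `bytes` value can then be written to a file or handed to an image viewer.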
Limitations
- No session resume. Desktop state resets between runs. Files you write are saved and accessible in the run's file list, but the desktop itself starts fresh each time.
- No live view. You see screenshots after each action, not a live video feed.
