Computer Use

Computer use lets your agent control a real desktop. It clicks, types, scrolls, and reads the screen the same way a person would. Use it for anything that needs a browser, GUI app, or terminal.

When it activates

When your account has sandbox execution enabled, every run gets a full Linux desktop alongside your normal tools. The agent gets access to:

`computer`: mouse clicks, keyboard input, screenshots
`bash`: shell commands, file system, processes
`str_replace_based_edit_tool`: read and edit files

Your existing integrations (Slack, GitHub, etc.) are available in the same run. Computer use tools and MCP integrations work side by side.


What the agent can do

Your agent gets a full Linux desktop with a browser, terminal, and file system. It can:

Browse the web and fill out forms
Run scripts and commands
Open, edit, and save files
Use any installed GUI application
Copy, paste, drag, scroll, and use keyboard shortcuts

After every action, the agent sees a screenshot and decides what to do next.
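The act-then-observe cycle described above can be sketched as a small loop. This is only an illustration of the control flow, not the platform's implementation: `run_loop`, `decide`, `execute`, and `max_steps` are hypothetical names introduced here, and the real agent makes its decisions inside the model rather than in user code.

```python
from typing import Callable, Optional

# Hypothetical types for illustration; they are not part of any real SDK.
Action = dict        # e.g. {"name": "computer", "input": {...}}
Screenshot = bytes   # image bytes captured after an action


def run_loop(
    decide: Callable[[Optional[Screenshot]], Optional[Action]],
    execute: Callable[[Action], Screenshot],
    max_steps: int = 20,
) -> int:
    """Alternate between deciding and acting until decide() returns None.

    Returns the number of actions taken.
    """
    screenshot: Optional[Screenshot] = None
    steps = 0
    while steps < max_steps:
        action = decide(screenshot)    # choose the next action from the latest screenshot
        if action is None:             # the agent considers the task finished
            break
        screenshot = execute(action)   # every action produces a fresh screenshot
        steps += 1
    return steps
```

The `max_steps` cap is a design choice for the sketch: it keeps a loop that never decides to stop from running forever.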

Stream events

Computer use runs produce the same SSE event stream as regular runs, plus a few extras:

| Event | When | Key fields |
| --- | --- | --- |
| `sandbox-connecting` | Environment starting | `message` |
| `sandbox-connected` | Desktop ready | `message`, `duration_ms` |
| `tool_use` | Agent takes an action | `name` (computer/bash/editor), `input` |
| `tool_result` | Action result | `content` (screenshot for computer actions) |
| `text` | Agent thinking out loud | `text` delta |
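As a rough sketch of consuming these events, the snippet below parses standard SSE framing and turns each event into a one-line summary. It assumes the event name arrives in the `event:` field and the key fields arrive as a JSON object in `data:`; the actual wire format may differ. `parse_sse` and `describe` are hypothetical helpers, not part of any SDK.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, payload) pairs from SSE-framed text lines.

    ASSUMPTION: each event carries a JSON object in its `data:` field.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                      # a blank line terminates one event
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):          # SSE comment or keep-alive line
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip())


def describe(event: str, payload: dict) -> str:
    """Summarize one event using the key fields from the table above."""
    if event == "sandbox-connecting":
        return f"starting: {payload.get('message', '')}"
    if event == "sandbox-connected":
        return f"desktop ready in {payload.get('duration_ms')} ms"
    if event == "tool_use":
        return f"action: {payload.get('name')}"
    if event == "tool_result":
        return "result received"
    if event == "text":
        return payload.get("text", "")
    return f"other event: {event}"
```

`parse_sse` accepts any iterable of text lines, so it can be fed from a file, a list, or a streaming HTTP response that yields decoded lines.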

Screenshots appear in the chat after each action. The web UI renders them inline, and the same images are available in the stream as the `content` of `tool_result` events.
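One way to pull screenshots out of `tool_result` payloads is sketched below. The shape of `content` is not specified here, so the code assumes a list of blocks in which images look like `{"type": "image", "source": {"data": "<base64>"}}` and are PNG-encoded; treat that shape, and the helper names `extract_screenshots` and `save_screenshots`, as assumptions to adjust against real payloads.

```python
import base64
from pathlib import Path
from typing import List


def extract_screenshots(payload: dict) -> List[bytes]:
    """Return decoded image bytes found in a tool_result payload.

    ASSUMPTION: `content` is a list of blocks, and image blocks look like
    {"type": "image", "source": {"data": "<base64 string>"}}.
    """
    content = payload.get("content")
    if not isinstance(content, list):
        return []
    images = []
    for block in content:
        if not isinstance(block, dict) or block.get("type") != "image":
            continue                          # skip text and unknown blocks
        source = block.get("source")
        if isinstance(source, dict) and isinstance(source.get("data"), str):
            images.append(base64.b64decode(source["data"]))
    return images


def save_screenshots(payload: dict, out_dir: Path, step: int) -> List[Path]:
    """Write each screenshot to out_dir and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, image in enumerate(extract_screenshots(payload)):
        path = out_dir / f"step-{step:03d}-{i}.png"   # assumes PNG data
        path.write_bytes(image)
        paths.append(path)
    return paths
```

Payloads with no image blocks, such as results from `bash` actions, simply yield an empty list.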

Limitations

No session resume. Desktop state resets between runs. Files you write are saved and accessible in the run's file list, but the desktop itself starts fresh each time.
No live view. You see screenshots after each action, not a live video feed.
