Show HN: Orbit – Structured Python control over AI computer use agents
TL;DR
Orbit is an open-source Python framework that gives developers structured control over AI computer use agents (CUAs) instead of treating them as black boxes.
Key Points
- Each workflow step gets its own model, budget, and typed output via Pydantic, while sharing session context across steps.
- Instead of screenshots, Orbit reads the OS accessibility tree, which is pitched as faster and more reliable than relying on pure vision models.
- Developers can mix cheap and expensive models per step and steer the agent mid-task when it gets stuck.
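The per-step structure in the points above might look roughly like the following. This is a minimal sketch, not Orbit's actual API: `Step`, `Session`, `run_step`, `parse_output`, and the model name are all hypothetical, the agent is replaced by a stub, and a stdlib dataclass stands in for the Pydantic model that Orbit reportedly uses for typed output, purely to keep the example dependency-free.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Callable


@dataclass
class Invoice:
    """Typed output for one step (a Pydantic model would play this role)."""
    vendor: str
    total: float


@dataclass
class Step:
    """Hypothetical step definition: its own model, budget, and output type."""
    instruction: str
    model: str
    max_actions: int
    output_type: type


@dataclass
class Session:
    """Context shared across all steps of a workflow."""
    history: list = field(default_factory=list)


def parse_output(output_type: type, raw: dict) -> Any:
    """Build the typed output and reject fields of the wrong type."""
    obj = output_type(**raw)
    for f in fields(obj):
        value = getattr(obj, f.name)
        if not isinstance(value, f.type):
            raise TypeError(f"{f.name} should be {f.type.__name__}, got {value!r}")
    return obj


def run_step(step: Step, session: Session,
             agent: Callable[[Step, Session], tuple]) -> Any:
    """Run one step, enforce its action budget, and record the typed result."""
    actions_used, raw = agent(step, session)
    if actions_used > step.max_actions:
        raise RuntimeError(f"step exceeded its budget of {step.max_actions} actions")
    result = parse_output(step.output_type, raw)
    session.history.append((step.instruction, result))
    return result


def fake_agent(step: Step, session: Session) -> tuple:
    """Stub standing in for a real computer use agent."""
    return 3, {"vendor": "ACME", "total": 129.5}


session = Session()
step = Step("Read the invoice total", model="cheap-model",
            max_actions=5, output_type=Invoice)
result = run_step(step, session, fake_agent)
```

Because each `Step` carries its own `model` and `max_actions`, a later step in the same `Session` could name a stronger model or a larger budget without changing the surrounding Python logic.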
Nauti's Take
One GitHub star and zero HN comments – this screams 'early proof of concept'. That said, the core idea is sound: anyone serious about CUAs in production pipelines needs exactly this kind of structured layer between natural language and Python logic.
Choosing the accessibility tree over screenshots is a smart move – less token-heavy, less brittle. The open question is whether Orbit holds up against the chaos of real desktop environments or only shines in controlled demo setups.
Context
Computer use agents are widely viewed as the next step in AI automation, but in practice they often fail because they are hard to control. Orbit addresses this directly: instead of handing the entire flow to a model, orchestration stays in Python. The 'mix cheap and expensive models per step' principle has real economic relevance, because API costs for CUA workflows can spiral fast.
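To make the cost argument concrete, here is a small back-of-the-envelope calculation. Every number in it is invented for illustration: the token counts per step and the per-million-token prices are assumptions, not real pricing for any model or measurements from Orbit.

```python
# Hypothetical prices in dollars per million tokens (not real pricing).
PRICE_PER_MTOK = {"cheap": 0.5, "expensive": 10.0}

# Hypothetical workflow: (step name, tokens consumed by that step).
STEPS = [
    ("open the app", 20_000),
    ("navigate to the report", 30_000),
    ("interpret the chart", 10_000),
]


def workflow_cost(assignment: list) -> float:
    """Total cost when step i runs on the model tier named in assignment[i]."""
    return sum(
        tokens / 1_000_000 * PRICE_PER_MTOK[tier]
        for (_, tokens), tier in zip(STEPS, assignment)
    )


# Everything on the expensive model versus only the hard step on it.
all_expensive = workflow_cost(["expensive", "expensive", "expensive"])
mixed = workflow_cost(["cheap", "cheap", "expensive"])
savings_ratio = all_expensive / mixed
```

Under these made-up numbers, routing the two routine steps to the cheap tier cuts the bill to a fraction of the all-expensive run; the actual ratio depends entirely on real prices and on how many tokens each step consumes.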
The accessibility tree approach is also more robust than screenshot-based solutions: elements are addressed by role and name rather than by pixel position, so it holds up better when UIs change.
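The robustness claim can be illustrated with a toy tree. This is a simplified sketch under stated assumptions: `Node`, `find`, and `build_window` are hypothetical and do not correspond to any real OS accessibility API or to Orbit's internals. The point is only that a lookup by role and name keeps working after a layout change, while a stored coordinate does not.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Toy accessibility node: a role, a name, and screen bounds."""
    role: str
    name: str
    bounds: tuple  # (x, y, width, height)
    children: list = field(default_factory=list)


def find(node: Node, role: str, name: str) -> Optional[Node]:
    """Depth-first search for the first node matching role and name."""
    if node.role == role and node.name == name:
        return node
    for child in node.children:
        hit = find(child, role, name)
        if hit is not None:
            return hit
    return None


def build_window(button_x: int) -> Node:
    """Hypothetical settings window with a Save button at a given x position."""
    save = Node("button", "Save", (button_x, 520, 80, 30))
    footer = Node("group", "Footer", (0, 500, 800, 100), [save])
    return Node("window", "Settings", (0, 0, 800, 600), [footer])


old_ui = build_window(button_x=700)
new_ui = build_window(button_x=40)  # a redesign moved the button

# The semantic lookup succeeds on both layouts...
old_button = find(old_ui, "button", "Save")
new_button = find(new_ui, "button", "Save")

# ...whereas a click position recorded on the old layout is now stale.
recorded_click_x = old_button.bounds[0]
click_still_valid = recorded_click_x == new_button.bounds[0]
```

Real desktop trees are far messier (missing names, duplicated roles, custom-drawn widgets), which is exactly the open question about how such an approach fares outside controlled demos.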