Show HN: Reticle – Postman for AI Agents
TL;DR
Reticle is a local desktop tool (Tauri + React + SQLite) that consolidates the full LLM agent testing loop into one interface.
Key Points
- You define scenarios with prompts, variables, and tools, run them against multiple models, and inspect the rendered prompts, responses, tool calls, and results in a single view.
- An eval mode checks whether a prompt or model change silently breaks existing behavior – essentially regression testing for AI agents.
- All data stays local: prompts, API keys, and run history are stored in SQLite on your own machine.
- A step-by-step view for agent runs shows exactly why a model made a specific decision.
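To make the scenario workflow concrete, here is a minimal sketch in TypeScript. Reticle's actual schema is not public, so every name here (`Scenario`, `ToolDef`, `renderPrompt`, the `{{var}}` placeholder syntax) is an illustrative assumption, not the tool's real API:

```typescript
// Hypothetical shapes — illustrative only, not Reticle's actual data model.
interface ToolDef {
  name: string;
  description: string;
}

interface Scenario {
  name: string;
  promptTemplate: string;            // e.g. "What is the weather in {{city}}?"
  variables: Record<string, string>; // values substituted into the template
  tools: ToolDef[];                  // tools the agent may call
  models: string[];                  // run the same scenario against each model
}

// Fill {{var}} placeholders in the template with the scenario's variables;
// unknown placeholders are left as-is so the gap is visible in the run view.
function renderPrompt(s: Scenario): string {
  return s.promptTemplate.replace(/\{\{(\w+)\}\}/g, (match, key) =>
    s.variables[key] ?? match
  );
}

const scenario: Scenario = {
  name: "weather-lookup",
  promptTemplate: "What is the weather in {{city}} today?",
  variables: { city: "Berlin" },
  tools: [{ name: "get_weather", description: "Fetch current weather for a city" }],
  models: ["gpt-4o-mini", "claude-3-5-haiku"],
};

console.log(renderPrompt(scenario));
// → "What is the weather in Berlin today?"
```

The point of a structure like this is that one scenario definition can be replayed unchanged against every model in `models`, which is what makes the side-by-side comparison possible.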
Nauti's Take
The Postman analogy is well-chosen and hits a real nerve – tooling for agent development is still lagging far behind the pace at which people are actually building agents. The fact that everything runs locally is not a limitation but a hard requirement for many companies.
The interesting question is whether Reticle can evolve from a useful solo tool into something teams share – collaborative eval sets and shared scenarios would be the logical next step. For now: a solid approach worth watching.
Context
Anyone building agents today mostly debugs in the dark: add logs, run code, manually compare prompts. Reticle targets this pain point directly, positioning itself as 'Postman for AI agents' – an analogy that highlights how badly a standardized testing tool is needed in this space. The eval mode is especially relevant: prompt changes have side effects that often only surface in production.
A local, privacy-friendly tool also lowers the barrier for teams unwilling to send sensitive data to cloud services.