
The tools an AI uses

In OpenHive, the LLM only ever interacts with the outside world in one way — tool_use. Reading a file, calling an external API, even getting a person to do something — they all look like the same kind of decision to the model. This piece is about the "tool surface" the model sees, and who fills it in from where.

The unifying principle: every capability is a tool

Instead of branching by type at the engine level, OpenHive exposes every agent capability through a single interface: a named function with a JSON Schema. What the model sees in the prompt is just tools: [{ name, description, input_schema }, ...]; the kind of capability each entry came from doesn't matter to the model.

As a consequence, routing decisions live inside the model. The engine has no "if this kind of request, call a skill; otherwise delegate" branching. Its job is the dispatch loop: forward a tool_use block to the right executor and feed the result back as tool_result.
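
A minimal sketch of that dispatch step, in TypeScript. The names here (ToolUse, ToolResult, Executor, dispatch) are hypothetical and are not taken from the OpenHive code; the sketch only illustrates that routing is a lookup by name, with no branching on what kind of tool was called.

```typescript
// Hypothetical shapes, loosely modeled on the tool_use / tool_result blocks named above.
interface ToolUse {
  id: string;
  name: string;
  input: Record<string, unknown>;
}

interface ToolResult {
  tool_use_id: string;
  content: string;
  is_error: boolean;
}

// Every capability, whatever its source, is reduced to the same executor signature.
type Executor = (input: Record<string, unknown>) => Promise<string>;

async function dispatch(
  registry: Map<string, Executor>,
  call: ToolUse,
): Promise<ToolResult> {
  const executor = registry.get(call.name);
  if (!executor) {
    // Assumption: unknown names are reported back to the model instead of thrown.
    return { tool_use_id: call.id, content: `unknown tool: ${call.name}`, is_error: true };
  }
  try {
    const content = await executor(call.input);
    return { tool_use_id: call.id, content, is_error: false };
  } catch (err) {
    // Executor failures also become a tool_result, so the model can react to them.
    const message = err instanceof Error ? err.message : String(err);
    return { tool_use_id: call.id, content: message, is_error: true };
  }
}
```

Under this shape, adding a new source of tools means adding entries to the registry; the function above does not change.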

Tools from four sources are exposed to the LLM through a single registry, each in the same JSON-Schema shape.

Four kinds of tool

Tools split into four kinds by where they come from. The model doesn't see the difference, but the engine has to — execution model, concurrency, latency, and cost are all different.

1. Built-in tools

Functions wired directly into the engine code: delegation (delegate_to, delegate_parallel), user questions (ask_user), skill activation and execution (activate_skill, list_skill_files, read_skill_file, run_skill_script), todo manipulation (set_todos, add_todo, complete_todo), history search (search_history, read_history_entry), and the team DB (db_query, db_exec, db_describe, etc.). They are fast and deterministic; the downside is that adding a new capability means changing the engine. That's why file I/O, web fetch, and domain-specific transforms live in skills instead.

2. Skills

A bundle of SKILL.md + optional Python scripts under packages/skills/{name}/. The markdown body is the skill's "manual," and the model has to activate the skill before it can call any of its functions. Disclosure is progressive: discovery and activation control what enters the context, and invocation follows once the skill is active.

Stage 1 — discovery. The system prompt lists each skill by short name and a one-line description only. The body is not loaded.
Stage 2 — activation. When the model calls activate_skill("name"), that skill's SKILL.md body is injected into the system prompt and its Python function signatures are appended to the tool array.
Stage 3 — invocation. The Python function is executed in a per-call subprocess gated by OPENHIVE_PYTHON_CONCURRENCY. Queue states (skill.queued, skill.started) are streamed to the UI.

The point of this structure is that long manuals can sit around for free, in token terms. With 100 skills, the system prompt only carries 100 one-line entries; each body lands in the context only when the model commits to using it.
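
The two disclosure stages can be sketched as follows. This is an illustrative in-memory model under stated assumptions, not the logic in skill-bundles.ts: the Skill and PromptState types and the discoveryListing and activate functions are hypothetical names.

```typescript
// Hypothetical in-memory model of a skill; the real bundle is SKILL.md plus scripts on disk.
interface Skill {
  name: string;
  summary: string;     // one-line description shown at discovery
  body: string;        // full SKILL.md body, added only on activation
  toolNames: string[]; // functions the skill contributes once active
}

interface PromptState {
  systemPrompt: string;
  toolNames: string[];
}

// Stage 1: only the name and one-liner of each skill enter the system prompt.
function discoveryListing(skills: Skill[]): string {
  return skills.map((s) => `- ${s.name}: ${s.summary}`).join("\n");
}

// Stage 2: activation appends the body to the prompt and the skill's functions to the tools.
function activate(state: PromptState, skill: Skill): PromptState {
  return {
    systemPrompt: `${state.systemPrompt}\n\n${skill.body}`,
    toolNames: [...state.toolNames, ...skill.toolNames],
  };
}
```

The token cost of the listing grows with the number of skills, while the cost of the bodies grows only with the number of skills actually activated.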

3. MCP tools (Model Context Protocol)

Tools that run as external processes. GitHub, Slack, Linear, an internal database: wrap any system in an MCP server and it registers automatically into the OpenHive tool surface. The exposed names follow the convention mcp__{server}__{tool} so they live in their own namespace.
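
As an illustration of the naming convention, the helpers below build and parse such names. Both function names are hypothetical, and the parsing rule (a server name contains no double underscore, so the first separator after the prefix ends it) is an assumption rather than documented behaviour.

```typescript
const MCP_PREFIX = "mcp";
const SEPARATOR = "__";

// Builds the exposed name, e.g. server "github" + tool "create_issue".
function mcpToolName(server: string, tool: string): string {
  return [MCP_PREFIX, server, tool].join(SEPARATOR);
}

// Splits an exposed name back into its parts, or returns null for non-MCP names.
function parseMcpToolName(name: string): { server: string; tool: string } | null {
  const prefix = MCP_PREFIX + SEPARATOR;
  if (!name.startsWith(prefix)) return null;
  const rest = name.slice(prefix.length);
  // Assumption: the server name has no "__", so the first separator ends it.
  const cut = rest.indexOf(SEPARATOR);
  if (cut <= 0 || cut + SEPARATOR.length >= rest.length) return null;
  return { server: rest.slice(0, cut), tool: rest.slice(cut + SEPARATOR.length) };
}
```

With a prefix like this, a dispatcher can tell from the name alone that a call belongs to an MCP server, and to which one.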

The MCP manager is lazy. The server process is not spawned until the first call, and once up, it's reused for the rest of the session. Because it's a separate process, the trust boundary is also separate — permissions, timeouts, and input validation are the MCP server's own responsibility.
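
The lazy-start behaviour can be sketched as a small pool that spawns on first use and reuses the handle afterwards. This is a sketch under assumptions, not the code in manager.ts: ServerHandle, Spawn, and LazyServerPool are hypothetical names, and the spawn function stands in for whatever actually starts the server process.

```typescript
// Hypothetical handle for a running MCP server process.
interface ServerHandle {
  call(tool: string, input: Record<string, unknown>): Promise<string>;
}

// Hypothetical factory that starts a server; injected so the pool stays testable.
type Spawn = (server: string) => Promise<ServerHandle>;

class LazyServerPool {
  // The promise is cached, not the resolved handle, so concurrent first calls share one spawn.
  private readonly handles = new Map<string, Promise<ServerHandle>>();
  private readonly spawn: Spawn;

  constructor(spawn: Spawn) {
    this.spawn = spawn;
  }

  async call(server: string, tool: string, input: Record<string, unknown>): Promise<string> {
    let handle = this.handles.get(server);
    if (!handle) {
      // First call for this server: start it now and remember it for the session.
      handle = this.spawn(server);
      this.handles.set(server, handle);
    }
    return (await handle).call(tool, input);
  }
}
```

One limitation of this sketch is that a failed spawn stays cached; a real manager would presumably evict it so a later call can retry.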

4. Subordinates as tools

The tool array also contains delegate_to / delegate_parallel; the assignee parameter is the enum of direct children fixed by the canvas. The result of such a call is not a plain function return: the child's turn loop runs end-to-end and the synthesised tool_result comes back. To the parent model, it just looks like an ordinary tool call has finished. Lifecycle, caps, and child context isolation are covered in Agent runtime & delegation.
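
A sketch of how such a delegate_to definition might be generated from the list of direct children. The delegateToTool function, the description text, and the task field are all hypothetical; only the tool name and the assignee enum come from the text above.

```typescript
interface ToolDef {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

// Returns null when the agent has no direct children, so the tool is simply absent.
function delegateToTool(children: string[]): ToolDef | null {
  if (children.length === 0) return null;
  return {
    name: "delegate_to",
    description: "Hand a task to one direct subordinate and wait for its result.",
    input_schema: {
      type: "object",
      properties: {
        // The enum restricts the model to agents that are actually direct children.
        assignee: { type: "string", enum: [...children] },
        // Hypothetical field: the real parameter set is not specified in this document.
        task: { type: "string" },
      },
      required: ["assignee", "task"],
    },
  };
}
```

Because the allowed assignees are encoded in the schema itself, an invalid delegation target can be rejected by schema validation before any child turn starts.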

Why expose them all in the same shape

Branching moves into the model. Whether to read a file or ask a person is a domain judgement. It belongs in the model's reasoning, not in an engine if/else.
The engine stays small. One dispatcher routes by name, so adding a new kind of tool doesn't change the turn loop.
The UI becomes uniform. A node on the Run canvas receives the same events (tool_call, tool_result) regardless of what kind of tool was called.
You inherit the model's training. Every major LLM is already well-trained on the tool-call format. A unified surface gets that training benefit for free.

Trade-offs across the four kinds

Kind	Latency	Extensibility	Cost profile	Best fit
Built-in	Lowest	None (engine change)	Effectively free	FS · search · core comms
Skill	Cold start	High (drop-in dir)	Python concurrency gate	Domain logic · deterministic transforms
MCP	Network / IPC	Very high	External system quotas	External SaaS · internal DB integrations
Delegation	Highest	Bounded by org chart	Per-turn tokens	Judgement · creative · multi-step work

When to reach for which

There's no strict decision rule; the same job can often be solved by more than one kind of tool. As a rough heuristic:

Can it be solved with code? → Skill. Deterministic, no token cost.
Is there an existing external system? → MCP. Reuse its auth and permissions directly.
Does it need judgement or creativity? → Delegation. Let a different persona's LLM run with isolated context and bring back only the result.
FS / search / basic comms? → Built-in. Don't reinvent it.

How tools end up in the system prompt

Right before an agent starts a turn, the engine assembles the tool array in this order.

tools = [
  ...builtin_tools,                       // always included
  ...active_skill_tools,                  // only skills opened by activate_skill
  ...mcp_tools,                           // whatever the MCP manager registered
  delegate_to({ assignee: enum(children) }), // only when there are direct children
  delegate_parallel(...),                 // ditto
]

The result is a flat array. The model receives it and — without knowing where each entry came from — picks its next action.
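
A runnable version of the assembly step might look like the sketch below. The ToolSources type and the assembleTools function are hypothetical names, and the duplicate-name check is an added assumption: since the dispatcher routes by name, a collision would make routing ambiguous.

```typescript
interface ToolDef {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

// Hypothetical grouping of the four sources described above.
interface ToolSources {
  builtin: ToolDef[];      // always included
  activeSkills: ToolDef[]; // only skills opened by activate_skill
  mcp: ToolDef[];          // whatever the MCP manager registered
  delegation: ToolDef[];   // empty when there are no direct children
}

function assembleTools(sources: ToolSources): ToolDef[] {
  // Same order as the listing above; the result is one flat array.
  const ordered = [
    ...sources.builtin,
    ...sources.activeSkills,
    ...sources.mcp,
    ...sources.delegation,
  ];
  // Assumption: names must be unique because dispatch is by name.
  const seen = new Set<string>();
  for (const tool of ordered) {
    if (seen.has(tool.name)) throw new Error(`duplicate tool name: ${tool.name}`);
    seen.add(tool.name);
  }
  return ordered;
}
```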

Related code

apps/web/lib/server/agents/skill-bundles.ts — skill activation and system-prompt injection
apps/web/lib/server/skills/runner.ts, concurrency.ts — Python subprocess execution and the concurrency gate
apps/web/lib/server/mcp/manager.ts — MCP server lazy start and tool registration
apps/web/lib/server/engine/session.ts — tool-array assembly and the dispatch loop