The XML Tool Protocol
BOR’s tools are an XML protocol that the model emits inline as it streams, as tags such as <say tone="warm"> or <execute_command background="true"> mixed into its output.
The parser (server/protocol/parser.js)
A streaming, index-based state machine (modeled on Sixth/Cline’s parseAssistantMessageV2) with two extensions:
- Attributes: <say tone="warm">, <execute_command background="true">. Short scalars ride on the tag, avoiding noisy child tags.
- CDATA payloads: file contents, widget HTML, and shell commands contain < and >. <![CDATA[ … ]]> lets the model emit them verbatim.
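To make the attribute extension concrete, here is a minimal sketch of splitting an opening tag into a name and an attribute map. parseOpenTag is a hypothetical helper written for this example, not a function from parser.js, and it assumes attribute values are double-quoted and contain no double quotes.

```javascript
// Hypothetical helper for illustration; not the actual parser.js code.
// Splits an opening tag such as <say tone="warm"> into { name, attrs }.
// Assumes attribute values are double-quoted and contain no double quotes.
function parseOpenTag(tag) {
  const m = /^<([A-Za-z_][\w-]*)((?:\s+[\w-]+="[^"]*")*)\s*>$/.exec(tag);
  if (!m) return null; // not a well-formed opening tag (e.g. a close tag)

  const attrs = {};
  const attrRe = /([\w-]+)="([^"]*)"/g;
  let a;
  while ((a = attrRe.exec(m[2])) !== null) {
    attrs[a[1]] = a[2]; // values stay strings; "true" is not coerced to a boolean
  }
  return { name: m[1], attrs };
}

console.log(parseOpenTag('<say tone="warm">'));
console.log(parseOpenTag('<execute_command background="true">'));
```

Keeping attribute values as strings leaves any coercion (for example, treating background="true" as a boolean) to whichever handler consumes them; that split is a choice made for this sketch.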
The parser’s interface is feed() / drain() / flush(), and it emits block objects. In each block, body is the inner text (CDATA-unwrapped), and children lets handlers pull typed sub-tags (e.g. <create_app>’s <html>, <doc>, <thumbnail>).
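The feed() / drain() / flush() cycle can be sketched with a toy parser. Everything below is an assumption made for illustration: the class name ToyStreamParser, the knownTools list, and the block shape { name, attrs, body, children, partial } are not taken from parser.js, and this version uses regex scanning instead of an index-based state machine. Text outside tool tags is simply discarded here.

```javascript
// Toy streaming parser; names and block shape are assumptions, not parser.js.
function parseAttrs(src) {
  const attrs = {};
  for (const m of src.matchAll(/([\w-]+)="([^"]*)"/g)) attrs[m[1]] = m[2];
  return attrs;
}

// Strip a CDATA wrapper when the whole text is wrapped in one.
function unwrap(text) {
  const t = text.trim();
  return t.startsWith('<![CDATA[') && t.endsWith(']]>') ? t.slice(9, -3) : text;
}

// Collect simple <param>…</param> children into a name -> text map.
function parseChildren(body) {
  const children = {};
  for (const m of body.matchAll(/<([\w-]+)>([\s\S]*?)<\/\1>/g)) children[m[1]] = unwrap(m[2]);
  return children;
}

class ToyStreamParser {
  constructor(knownTools) {
    this.knownTools = knownTools;
    this.buffer = '';
    this.ready = [];
  }

  // Append a streamed chunk and extract every block that is now complete.
  feed(chunk) {
    this.buffer += chunk;
    let block;
    while ((block = this._takeBlock()) !== null) this.ready.push(block);
  }

  // Hand over the completed blocks collected so far.
  drain() {
    const out = this.ready;
    this.ready = [];
    return out;
  }

  // End of stream: an opened but unclosed tool is emitted as a partial block.
  flush() {
    const open = this._findOpen();
    if (open) {
      this.ready.push({ ...open.head, body: this.buffer.slice(open.bodyStart), children: {}, partial: true });
    }
    this.buffer = '';
    return this.drain();
  }

  // Earliest complete opening tag of any known tool in the buffer.
  _findOpen() {
    let best = null;
    for (const name of this.knownTools) {
      const m = new RegExp(`<${name}((?:\\s+[\\w-]+="[^"]*")*)\\s*>`).exec(this.buffer);
      if (m && (best === null || m.index < best.index)) {
        best = { index: m.index, bodyStart: m.index + m[0].length, head: { name, attrs: parseAttrs(m[1]) } };
      }
    }
    return best;
  }

  _takeBlock() {
    const open = this._findOpen();
    if (!open) return null;
    const closeTag = `</${open.head.name}>`;
    const closeAt = this.buffer.indexOf(closeTag, open.bodyStart);
    if (closeAt === -1) return null; // close tag not streamed yet; wait for more input
    const body = this.buffer.slice(open.bodyStart, closeAt);
    this.buffer = this.buffer.slice(closeAt + closeTag.length);
    return { ...open.head, body: unwrap(body), children: parseChildren(body), partial: false };
  }
}

const demo = new ToyStreamParser(['say']);
demo.feed('<say tone="wa'); // tag split across two chunks
demo.feed('rm">hello</say>');
console.log(demo.drain());
```

The point of the three-call surface is that feed() can be called with arbitrarily split chunks, drain() can be polled at any time without losing state, and flush() decides what to do with whatever is still open when the stream ends.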
Robustness
The parser is hardened against the ways weak models mangle output:
- Anchor on the close tag, not on CDATA. Tool/param close tags (</write_file>, </content>) never appear in file content, so the parser finds the first </tag> directly; it does not skip CDATA to find it. This means a malformed CDATA close (]] without >, or stray text) can’t make one file’s content swallow every following file.
- Outer-anchored CDATA unwrap. The body is taken from the first <![CDATA[ to the last ]]>, so content that legitimately contains ]]> survives. Content with no CDATA at all is accepted too.
- Last-occurrence for collision-prone tags. <html> is the one param whose content contains its own close (</html>), so the parser uses the last </html> within the bounded parent.
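The three rules above can be sketched as small string helpers. The names findClose, unwrapOuterCdata, and splitBlocks are hypothetical, and the real parser applies these rules inside its state machine, not as standalone functions.

```javascript
// Illustrative helpers; names and details are assumptions, not parser.js code.
const CDATA_OPEN = '<![CDATA[';
const CDATA_CLOSE = ']]>';

// Anchor on the close tag: the first </tag> at or after `from`, or the last
// one in `text` for collision-prone tags such as <html>.
function findClose(text, tag, from = 0, useLast = false) {
  const close = `</${tag}>`;
  if (!useLast) return text.indexOf(close, from);
  const idx = text.lastIndexOf(close);
  return idx >= from ? idx : -1;
}

// Outer-anchored unwrap: first <![CDATA[ to last ]]>. Text without a complete
// CDATA wrapper is returned unchanged.
function unwrapOuterCdata(body) {
  const start = body.indexOf(CDATA_OPEN);
  const end = body.lastIndexOf(CDATA_CLOSE);
  if (start === -1 || end < start + CDATA_OPEN.length) return body;
  return body.slice(start + CDATA_OPEN.length, end);
}

// Split text into the bodies of consecutive <tag>…</tag> blocks, bounding
// each block by its first close tag without looking at CDATA at all.
function splitBlocks(text, tag) {
  const open = `<${tag}>`;
  const bodies = [];
  let pos = 0;
  for (;;) {
    const start = text.indexOf(open, pos);
    if (start === -1) break;
    const end = findClose(text, tag, start + open.length);
    if (end === -1) break;
    bodies.push(unwrapOuterCdata(text.slice(start + open.length, end)));
    pos = end + tag.length + 3; // skip past "</" + tag + ">"
  }
  return bodies;
}

// The first block has a malformed CDATA close (]] without >), yet the second
// block is still recovered because the split never depends on CDATA.
const mangled = '<content><![CDATA[first ]]</content><content><![CDATA[second]]></content>';
console.log(splitBlocks(mangled, 'content'));
```

A parser that instead skipped from <![CDATA[ to the next ]]> before looking for a close tag would, on the mangled input, run past the first </content> and merge both files into one body.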
The registry (server/protocol/registry.js)
The single source of truth for the tool surface. Adding a tool is one entry here plus one handler module. Each entry declares:
buildToolDocs() renders the whole registry into the system prompt, so the model always sees the current, accurate tool list — including any tools you add. Both LLM harnesses read this same registry.
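As a rough picture of how one registry can drive the prompt, the sketch below renders a small registry into tool documentation text. The entry fields (summary, attrs, params), every description string, and the path param name are placeholders invented for this example, and renderToolDocs is a stand-in for buildToolDocs(), whose real output format is not shown here.

```javascript
// Hypothetical registry entries. The field names (summary, attrs, params) and
// all description strings are placeholders invented for this example.
const toyRegistry = {
  say: {
    summary: 'Speak to the user.',
    attrs: { tone: 'optional, e.g. "warm"' },
    params: {},
  },
  write_file: {
    summary: 'Write a file.',
    attrs: {},
    params: { path: 'target path (assumed param name)', content: 'file text, usually wrapped in CDATA' },
  },
};

// Stand-in for buildToolDocs(): render every entry into prompt text.
function renderToolDocs(registry) {
  return Object.entries(registry)
    .map(([name, entry]) => {
      const lines = [`<${name}>: ${entry.summary}`];
      for (const [attr, doc] of Object.entries(entry.attrs)) lines.push(`  attribute ${attr}: ${doc}`);
      for (const [param, doc] of Object.entries(entry.params)) lines.push(`  <${param}>: ${doc}`);
      return lines.join('\n');
    })
    .join('\n\n');
}

console.log(renderToolDocs(toyRegistry));
```

Because the docs are generated from the same object the executor would consult, a newly added entry shows up in the prompt without any separate documentation step.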
The executor (server/protocol/executor.js)
Given a parsed block + a per-request ctx, the executor invokes the registered handler, retrying transient failures up to BOR_TOOL_MAX_ATTEMPTS. A handler returns:
- event + payload: streamed to the presence as a card (event is the SSE name, payload is its data).
- llmEcho: a compact, model-facing string fed back on the next pass (the tool result the model reads).
- llmMedia: optional images (e.g. a screenshot) attached to the model’s next turn as image blocks.
These results reach the model in a <tool_results> block. The executor also enforces cross-tool safety: e.g. any non-browser_action tool auto-closes an open browser session.
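A minimal sketch of the executor loop follows, under several stated assumptions: it is synchronous for brevity (a real executor is likely asynchronous), maxAttempts stands in for BOR_TOOL_MAX_ATTEMPTS, and the err.transient flag, the ctx.browserOpen field, the tool_error event name, and the inner layout of <tool_results> are all hypothetical.

```javascript
// Toy executor; synchronous for brevity. err.transient, ctx.browserOpen, and
// the 'tool_error' event name are assumptions made for this sketch.
function executeBlock(block, ctx, handlers, maxAttempts) {
  // Cross-tool safety: any tool other than browser_action closes an open session.
  if (block.name !== 'browser_action' && ctx.browserOpen) {
    ctx.browserOpen = false;
  }

  const handler = handlers[block.name];
  if (!handler) return { llmEcho: `unknown tool: ${block.name}` };

  const attempts = Math.max(1, maxAttempts);
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return handler(block, ctx); // { event, payload, llmEcho, llmMedia? }
    } catch (err) {
      lastError = err;
      if (!err.transient) break; // only transient failures are retried
    }
  }
  return {
    event: 'tool_error',
    payload: { message: lastError.message },
    llmEcho: `${block.name} failed: ${lastError.message}`,
  };
}

// Gather the model-facing echoes into one block (inner layout is an assumption).
function toToolResults(results) {
  return `<tool_results>\n${results.map((r) => r.llmEcho).join('\n')}\n</tool_results>`;
}

const demoHandlers = {
  say: (block) => ({ event: 'say', payload: { text: block.body }, llmEcho: 'said' }),
};
const demoResult = executeBlock({ name: 'say', body: 'hi' }, { browserOpen: true }, demoHandlers, 3);
console.log(toToolResults([demoResult]));
```

Returning a failure result (instead of throwing) after the last attempt keeps the model informed through llmEcho, so it can react to the error on its next pass; that behavior is a design choice of this sketch.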
The handlers
One module per tool family, under server/protocol/handlers/:
| Module | Tools |
|---|---|
| widget.js | apps (create_app/update_app/launch_app/delete_app) |
| shortcut.js | shortcuts (create_shortcut/…/rename_widget) |
| file.js | file ops (read_file/write_file/replace_in_file/list_files/search_files/…) |
| exec.js | execute_command, wait_until, read_command_tail, list_recent_commands |
| theme.js | set_theme, reset_theme, set_ai_name, reset_chat |
| memory.js | remember, recall, forget |
| computer.js | screen_capture, locate_screen_element, computer_click/type/key, open_app, computer_use |
| browser.js | browser_action |
| tools.js | web_search, web_fetch, view_image, generate_image |
| surface-data.js | list_surfaces, read_surface_data, mutate_surface_data |
| svg.js | create_svg |
| agent.js | skills + MCP (use_skill, list_mcp_*, use_mcp_tool, access_mcp_resource) |
| cron.js / notification.js | cron, push_notification |
| say.js | say, attempt_completion, wait_for_human_input |
| ux-guidelines.js | get_ux_ui_strict_guidelines |