How Consoles Connects to Everything
Sources, deferred loading, and credential isolation. The architecture behind connecting an AI agent to Linear, GitHub, Slack, and anything else with an API.
A recent study from the UK AI Safety Institute counted 177,000 Model Context Protocol (MCP) tools in the wild. Software development accounts for two-thirds of them.
The problem is how fast the cost grows with each connection. Connect five services to an agent and you’re looking at 50+ tool definitions consuming roughly 55,000 tokens before the model reads a single word of your message. Anthropic has documented configurations exceeding 134,000 tokens in definitions alone. Past 30 to 50 tools, the model’s accuracy in selecting the right tool degrades significantly.
Consoles addresses this with sources, a concept borrowed from the open-source Craft Agent framework we build on, combined with deferred loading, which we extended to work across every model we support.
Sources
A source is a connection to an external service. Each one is a folder containing two files, config.json and guide.md.
One folder per source. The agent reads both files before the first interaction.
config.json defines the connection parameters: server URL, authentication type (OAuth, API key, bearer token, multi-header), transport protocol, and display metadata.
guide.md is structured documentation the agent reads before its first interaction with a service: available capabilities, required parameters, known limitations, and pagination behavior.
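As a rough illustration of the folder layout described above, the Python sketch below writes a sample source folder and loads it back. The field names in config.json (`display_name`, `url`, `auth_type`, `transport`), the placeholder URL, and the `load_source` helper are assumptions made for this example; the actual Consoles schema is not shown in this article.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Source:
    """In-memory view of one source folder (hypothetical shape)."""
    name: str
    config: dict
    guide: str


def load_source(folder: Path) -> Source:
    """Read the two files the agent needs before its first interaction."""
    config = json.loads((folder / "config.json").read_text(encoding="utf-8"))
    guide = (folder / "guide.md").read_text(encoding="utf-8")
    return Source(name=folder.name, config=config, guide=guide)


def write_example_source(root: Path) -> Path:
    """Create a sample source folder; every value here is illustrative."""
    folder = root / "linear"
    folder.mkdir()
    config = {
        "display_name": "Linear",
        "url": "https://example.com/mcp",  # placeholder, not a real endpoint
        "auth_type": "oauth",              # metadata only, no secret stored here
        "transport": "http",
    }
    (folder / "config.json").write_text(json.dumps(config, indent=2), encoding="utf-8")
    (folder / "guide.md").write_text(
        "# Linear\n\nList endpoints are paginated.\n", encoding="utf-8"
    )
    return folder


with tempfile.TemporaryDirectory() as tmp:
    source = load_source(write_example_source(Path(tmp)))

print(source.name, source.config["auth_type"])
print(source.guide.splitlines()[0])
```

Note that the sample config carries only the authentication type, never a token, which matches the separation between metadata and secrets described later in this article.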
Setup is conversational. Say “connect Linear” and the agent handles configuration, OAuth, connection testing, and guide generation.
1. “Connect Linear”: you ask the agent in plain language.
2. Config created: the agent writes config.json with URL, auth type, and transport.
3. OAuth triggered: the browser opens, you sign in, and the token is stored in the OS keychain.
4. Connection tested: the agent verifies the connection works and reports the tool count.
5. Guide written: the agent writes guide.md with capabilities, gotchas, and examples.
No config files to edit. No CLI commands. No restart.
Mentioning an inactive source (“check my Slack messages”) triggers automatic activation. No manual toggling required.
For organizations, sources are scoped and gated. Administrators control which connections are available to which teams. Credentials remain isolated per user. Tool invocations are logged. The same experience that feels lightweight for an individual ships with the access controls and audit trail an organization requires.
Deferred loading
Every tool definition includes a full JSON Schema describing its parameters. Loading all of them upfront consumes context that would otherwise go toward the actual conversation.
Token cost at startup
Tool schema tokens loaded before the agent reads your message
| Sources | Tools | Tokens without deferred loading | Tokens with deferred loading | Saved |
|---|---|---|---|---|
| 2 | 12 | ~11,000 | ~500 | 95% |
| 5 | 50 | ~55,000 | ~500 | 99% |
| 10 | 95 | ~105,000 | ~500 | 99.5% |
| 20 | 180 | ~200,000 | ~500 | 99.8% |
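The savings column follows directly from the two token columns. A minimal check in Python, using only the approximate figures from the table (the function name is ours):

```python
BASELINE_TOKENS = 500  # approximate startup cost with deferred loading, per the table


def saved_fraction(tokens_without: int, tokens_with: int = BASELINE_TOKENS) -> float:
    """Fraction of startup tokens avoided by deferring tool definitions."""
    return 1 - tokens_with / tokens_without


# (sources, approximate tokens without deferred loading), taken from the table
rows = [(2, 11_000), (5, 55_000), (10, 105_000), (20, 200_000)]

for sources, without in rows:
    print(f"{sources:>2} sources: {saved_fraction(without):.2%} saved")
```

The computed values agree with the table once rounded. Because the baseline stays flat while the undeferred cost grows with every tool, the saved share keeps climbing as more sources are connected.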
Anthropic addressed this with Tool Search, a mechanism that defers tool definitions and loads them on demand. The result is an 85% reduction in upfront context consumption.
That mechanism is exclusive to Claude. Other models (GPT-4o, Gemini, open-source alternatives) receive every definition at startup.
We built an equivalent implementation for non-Claude models. On Claude, the native SDK handles deferral. On everything else, our tool search provides identical behavior: names only at startup, definitions loaded on demand, tools callable after discovery. The model is unaware of which implementation serves the search.
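The behavior described above (names only at startup, definitions loaded on demand, tools callable after discovery) can be sketched as a small registry. This is an illustrative Python model, not the Consoles implementation: the class names, the substring-based search, and the tool names are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ToolDefinition:
    name: str
    description: str
    schema: dict  # full JSON Schema for the tool's parameters


@dataclass
class DeferredToolRegistry:
    """Expose tool names up front; hand out full definitions only on request."""
    catalog: dict[str, ToolDefinition]
    loaded: dict[str, ToolDefinition] = field(default_factory=dict)

    def startup_context(self) -> list[str]:
        """What the model sees at session start: names only, no schemas."""
        return sorted(self.catalog)

    def search(self, query: str) -> list[ToolDefinition]:
        """Load definitions whose name or description mentions the query."""
        q = query.lower()
        hits = [
            tool for tool in self.catalog.values()
            if q in tool.name.lower() or q in tool.description.lower()
        ]
        for tool in hits:
            self.loaded[tool.name] = tool
        return hits

    def call(self, name: str, arguments: dict) -> dict:
        """Only tools discovered through search are callable."""
        if name not in self.loaded:
            raise LookupError(f"{name} has not been discovered yet")
        return {"tool": name, "arguments": arguments}  # stand-in for a real invocation


# Hypothetical tools for illustration
registry = DeferredToolRegistry(catalog={
    "linear_list_issues": ToolDefinition(
        "linear_list_issues", "List issues in a workspace",
        {"type": "object", "properties": {"state": {"type": "string"}}},
    ),
    "slack_post_message": ToolDefinition(
        "slack_post_message", "Post a message to a channel",
        {"type": "object", "properties": {"channel": {"type": "string"}}},
    ),
})

print(registry.startup_context())
found = registry.search("issue")
print([tool.name for tool in found])
print(registry.call("linear_list_issues", {"state": "in_progress"}))
```

A real implementation would likely rank results rather than match substrings, but the contract the model sees is the same: search first, then call.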
The result: 10 connected sources with roughly 100 tools start at the same ~500 token baseline regardless of which model is running the session.
Credential isolation
Credentials are managed by the host process. The agent does not have access to them.
When the agent invokes a tool, the call routes through the host, which attaches the appropriate token, executes the API call, and returns the result. Tokens do not enter the conversation context. Source configurations store authentication metadata, not secrets. Credentials themselves reside in the OS keychain.
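A minimal sketch of that routing, in Python. The `Host` class, the dictionary standing in for the OS keychain, the fake transport, and the token value are all hypothetical; the point is only that the credential is attached inside the host and never appears in what the model sees.

```python
import json


class Host:
    """Owns credentials and performs API calls on the agent's behalf."""

    def __init__(self, keychain: dict[str, str], transport):
        self._keychain = keychain    # stand-in for the OS keychain
        self._transport = transport  # function that performs the actual request

    def invoke(self, source: str, tool: str, arguments: dict) -> dict:
        token = self._keychain[source]                  # looked up inside the host only
        headers = {"Authorization": f"Bearer {token}"}  # attached at the boundary
        return self._transport(source, tool, arguments, headers)


def fake_transport(source: str, tool: str, arguments: dict, headers: dict) -> dict:
    """Pretend API: requires a credential, returns data only."""
    if not headers.get("Authorization", "").startswith("Bearer "):
        raise PermissionError("missing credential")
    return {"source": source, "tool": tool, "items": ["ENG-1", "ENG-2"]}


host = Host({"linear": "tok_example_123"}, fake_transport)

# The conversation is everything the model gets to read.
conversation = []
request = {"source": "linear", "tool": "list_issues", "arguments": {"state": "in_progress"}}
conversation.append({"tool_call": request})
conversation.append({"tool_result": host.invoke(**request)})

print(json.dumps(conversation, indent=2))
```

The printed transcript contains the request and the result, but no header and no token.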
This model extends to the execution sandbox. When an agent orchestrates calls across multiple services, it can execute a script in a V8 isolate, the same isolation model Cloudflare uses for Workers. The sandbox operates with no network access, no filesystem, and no environment variables. Service calls route exclusively through the host process, which handles credential injection at the boundary.
A script running in the sandbox cannot access credentials because they are never present in the execution environment.
Permission modes (Explore, Ask to Edit, Execute) provide an additional layer of control, determining which operations the agent can perform and which require explicit approval.
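The article names the three modes without spelling out their rules, so the Python sketch below rests on an assumption: Explore allows reads only, Ask to Edit requires approval for anything that changes state, and Execute allows everything. Treat the mapping as a guess at the intent, not as documented behavior.

```python
from enum import Enum


class Mode(Enum):
    EXPLORE = "Explore"
    ASK_TO_EDIT = "Ask to Edit"
    EXECUTE = "Execute"


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"    # pause and request explicit approval
    DENY = "deny"


def decide(mode: Mode, mutates: bool) -> Decision:
    """Gate one operation. ASSUMED semantics, see the paragraph above."""
    if not mutates:
        return Decision.ALLOW  # reads are allowed in every mode
    if mode is Mode.EXPLORE:
        return Decision.DENY   # assumed: Explore is read-only
    if mode is Mode.ASK_TO_EDIT:
        return Decision.ASK    # assumed: writes need approval
    return Decision.ALLOW      # assumed: Execute permits writes


for mode in Mode:
    print(f"{mode.value}: read={decide(mode, False).value}, write={decide(mode, True).value}")
```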
How it comes together
Ask “show me in-progress issues with their PRs and CI status” and the agent discovers the relevant tools from Linear and GitHub on demand, writes a script that runs all the calls in the sandbox, and returns one merged result. The intermediate API responses never enter the conversation. The credentials never enter the sandbox.
Simpler questions get a single direct tool call. Interactive workflows where the agent needs to see each result before deciding the next step stay sequential.
No configuration. The agent reads the question and picks the right approach.
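To make the multi-service case concrete, here is a sketch of the kind of script the agent might write for the in-progress issues question. It is written in Python for readability, while the sandbox described in this article runs scripts in a V8 isolate, and plain Python enforces none of that isolation. The tool names, the `call` signature, and the canned data are hypothetical.

```python
def script(call):
    """Sandboxed script body: `call` is the only capability the host hands in."""
    rows = []
    for issue in call("linear", "list_issues", {"state": "in_progress"}):
        prs = call("github", "find_pull_requests", {"query": issue["id"]})
        if not prs:
            rows.append({"issue": issue["id"], "pr": None, "ci": None})
            continue
        for pr in prs:
            ci = call("github", "get_ci_status", {"pr": pr["number"]})
            rows.append({"issue": issue["id"], "pr": pr["number"], "ci": ci["status"]})
    return rows


# Canned responses standing in for the real services behind the host.
FAKE_RESPONSES = {
    ("linear", "list_issues"): lambda args: [{"id": "ENG-1"}, {"id": "ENG-2"}],
    ("github", "find_pull_requests"): lambda args: {
        "ENG-1": [{"number": 101}],
        "ENG-2": [],
    }[args["query"]],
    ("github", "get_ci_status"): lambda args: {"status": "passing"},
}


def host_call(source: str, tool: str, arguments: dict):
    """Host-side dispatcher; a real host would attach credentials here."""
    return FAKE_RESPONSES[(source, tool)](arguments)


merged = script(host_call)
for row in merged:
    print(row)
```

Only `merged` would be returned to the conversation. The three intermediate responses stay inside the script, and the script itself never handles a token.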