Agent Orchestration & Tool Registry Spec — AI Brain

Last Updated: 2026-04-03
Status: Draft

1. The Architecture: Router + Specialists

1.1 Why Not a Single Agent?

A monolithic agent with one system prompt trying to handle blog writing, analytics diagnosis, content scheduling, and chat management produces mediocre results across the board. The prompt becomes bloated, tool lists grow unwieldy, and the LLM struggles to decide which capability to use.

The solution: A lightweight router that delegates to domain-specific specialist agents.

Role	Model Tier	System Prompt Size	Tools Available	Purpose
Router	Haiku-class	~200 tokens	None (classification only)	Classify intent, pick specialist, orchestrate multi-step
Blog Specialist	Sonnet-class	~800 tokens	`blog.*` tools	Writing, SEO, publishing, blog analytics
Content Specialist	Sonnet-class	~800 tokens	`content.*` tools	Scheduling, social media, campaigns
Analytics Specialist	Sonnet-class	~800 tokens	`analytics.*`, `blog.get_*`, `content.get_*`	Diagnosis, trends, reporting
Chat Specialist	Sonnet-class	~800 tokens	`chat.*` tools	Chatbot config, knowledge base, inbox

1.2 Agent Flow

A request moves through these stages, in order:

1. The router classifies the message and picks a specialist (or the general fallback).
2. Context is assembled for the chosen agent and compressed to fit the token budget.
3. The specialist runs, calling tools from its allowed set as needed.
4. The response is streamed to the user.
5. Post-response work (queue and KV writes) runs asynchronously.

2. The Router Agent

2.1 Router System Prompt

You are an intent classifier for a business platform AI copilot.
Given a user message and minimal context, classify the intent.
 
RESPOND ONLY WITH JSON. No explanation, no conversation.
 
Output format:
{
  "intent": "<intent_name>",
  "confidence": <0.0-1.0>,
  "specialist": "<blog|content|analytics|chat|general>",
  "services": ["<service names needed>"],
  "entities": ["<entity keywords>"],
  "time_scope": "<recent|historical|future|none>",
  "action_required": <true|false>,
  "multi_step": <true|false>
}
 
Intent categories:
- content_creation: writing, drafting, scheduling posts
- content_optimization: SEO, title improvement, updating posts
- analytics_diagnosis: traffic analysis, performance questions
- analytics_reporting: metrics, stats, summaries
- configuration: settings, preferences, setup
- knowledge_query: facts about their business/data
- action_execution: "do X for me", "create", "delete", "schedule"
- general_chat: greetings, thanks, off-topic

2.2 Router Input

The router receives minimal context — just enough to classify:

{
  "message": "Why did my blog traffic drop this week?",
  "tenant_type": "food_bakery",
  "active_services": ["blog", "chat"],
  "recent_topics": ["traffic", "recipes"]
}

No memories, no service data, no full history. This keeps the router fast: roughly 50 input tokens on top of its system prompt, about 30 output tokens, and around 100ms per classification.

2.3 Router Output

{
  "intent": "analytics_diagnosis",
  "confidence": 0.94,
  "specialist": "analytics",
  "services": ["blog"],
  "entities": ["traffic", "blog"],
  "time_scope": "recent",
  "action_required": false,
  "multi_step": false
}
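
Because the router is told to reply with JSON only, the caller can parse and validate the reply before dispatching to a specialist. The following is a minimal sketch under stated assumptions, not the spec's implementation: the `RouterOutput` fields mirror the output format in 2.1, while `parseRouterOutput` and its return-null-on-invalid behavior are hypothetical names and choices introduced here for illustration.

```typescript
// Hypothetical validator for the router's JSON reply (field names from section 2.1).
type Specialist = "blog" | "content" | "analytics" | "chat" | "general";
type TimeScope = "recent" | "historical" | "future" | "none";

interface RouterOutput {
  intent: string;
  confidence: number;
  specialist: Specialist;
  services: string[];
  entities: string[];
  time_scope: TimeScope;
  action_required: boolean;
  multi_step: boolean;
}

const SPECIALISTS: readonly string[] = ["blog", "content", "analytics", "chat", "general"];
const TIME_SCOPES: readonly string[] = ["recent", "historical", "future", "none"];

const isStringArray = (v: unknown): v is string[] =>
  Array.isArray(v) && v.every((x) => typeof x === "string");

// Returns the parsed output, or null when the reply is not valid router JSON.
function parseRouterOutput(raw: string): RouterOutput | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (_err) {
    return null; // the model replied with something other than JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const o = data as Record<string, unknown>;
  const confidence = o["confidence"];
  const specialist = o["specialist"];
  const timeScope = o["time_scope"];
  const valid =
    typeof o["intent"] === "string" &&
    typeof confidence === "number" && confidence >= 0 && confidence <= 1 &&
    typeof specialist === "string" && SPECIALISTS.indexOf(specialist) !== -1 &&
    isStringArray(o["services"]) &&
    isStringArray(o["entities"]) &&
    typeof timeScope === "string" && TIME_SCOPES.indexOf(timeScope) !== -1 &&
    typeof o["action_required"] === "boolean" &&
    typeof o["multi_step"] === "boolean";
  return valid ? (o as unknown as RouterOutput) : null;
}
```

A caller could treat a null result the same way as `specialist: "general"`, though that policy is a design choice the spec does not state.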

2.4 Multi-Step Orchestration

When multi_step is true, the router also returns a steps plan and runs the listed specialists sequentially:

User: "Create a content calendar for next month and analyze which
       topics will perform best"
 
Router classifies:
  multi_step: true
  steps: [
    { specialist: "analytics", task: "analyze_top_performing_topics" },
    { specialist: "blog", task: "create_content_calendar", depends_on: "step_1" }
  ]

Execution:

1. Router calls analytics specialist → gets top-performing topics
2. Router passes analytics output to blog specialist as context
3. Blog specialist creates calendar informed by analytics data
4. Combined response streamed to user

Each step's output is injected into the next step's context. The user sees a single streamed response covering both.
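
The sequential execution described above can be sketched as a simple loop. This is an illustrative sketch only: `PlanStep`, `RunSpecialist`, and `runSteps` are hypothetical names, and the real specialist call (an LLM request with streaming) is replaced by an injected function.

```typescript
// Hypothetical sequential orchestrator: each step's output is made
// available to every later step as prior context.
type SpecialistName = "blog" | "content" | "analytics" | "chat";

interface PlanStep {
  specialist: SpecialistName;
  task: string;
}

// Stand-in for a real specialist call (an LLM request in production).
type RunSpecialist = (step: PlanStep, priorOutputs: string[]) => Promise<string>;

async function runSteps(steps: PlanStep[], runSpecialist: RunSpecialist): Promise<string[]> {
  const outputs: string[] = [];
  for (const step of steps) {
    // Pass a copy so a specialist cannot mutate the orchestrator's record.
    const output = await runSpecialist(step, outputs.slice());
    outputs.push(output);
  }
  return outputs;
}
```

Running steps strictly in order keeps the dependency handling trivial; independent steps could run in parallel, but the spec only describes the sequential case.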

2.5 Fallback: General Agent

If the router classifies specialist: "general" (greeting, thanks, off-topic, ambiguous):

- Use a lightweight general prompt (no specialist system prompt)
- No service data pulled
- Only tenant profile + preferences from memory
- Haiku-class model (fast, cheap)
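
One way to express the difference between the general fallback and a specialist is a small per-request plan. The sketch below is an assumption-laden illustration: `AgentPlan`, `planFor`, and the field names are hypothetical, and the specialist branch simplifies the per-specialist differences listed in section 5.1.

```typescript
// Hypothetical per-request plan derived from the router's "specialist" field.
type SpecialistKind = "blog" | "content" | "analytics" | "chat" | "general";

interface AgentPlan {
  modelTier: "haiku-class" | "sonnet-class";
  useSpecialistPrompt: boolean;
  loadServiceData: boolean;
  memoryScope: "profile_and_preferences" | "domain_memories";
}

function planFor(specialist: SpecialistKind): AgentPlan {
  if (specialist === "general") {
    // Fallback path: cheap model, no service data, only profile + preferences.
    return {
      modelTier: "haiku-class",
      useSpecialistPrompt: false,
      loadServiceData: false,
      memoryScope: "profile_and_preferences",
    };
  }
  // Specialist path (simplified: the analytics specialist actually reads all memories).
  return {
    modelTier: "sonnet-class",
    useSpecialistPrompt: true,
    loadServiceData: true,
    memoryScope: "domain_memories",
  };
}
```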

3. Specialist Agents

3.1 Blog Specialist

System Prompt Core:

You are a blog strategy expert for a business platform. You help
business owners create, optimize, and grow their blog.
 
CAPABILITIES:
- Suggest blog topics based on audience data and trends
- Create draft outlines with SEO-optimized titles
- Analyze post performance and diagnose issues
- Schedule posts for optimal engagement times
- Recommend content updates for declining posts
 
RULES:
- Always reference the tenant's actual metrics, not generic advice
- Recipe/tutorial content format: suggest structured steps
- When creating drafts, match the tenant's writing style from past posts
- For SEO suggestions, focus on title optimization and internal linking
- Never fabricate view counts or engagement metrics

Tools Available:

Tool	Description	Confirmation Required
`blog.list_posts`	List posts with filters (status, category, date range)	No
`blog.get_post`	Get full post details including content	No
`blog.get_analytics`	Get views, engagement metrics for posts	No
`blog.create_draft`	Create a new draft post with title, outline, metadata	No
`blog.update_post`	Update an existing post's content, title, or metadata	No
`blog.schedule_post`	Set or change a post's publish date	No
`blog.delete_post`	Delete a post permanently	Yes
`blog.bulk_update`	Update multiple posts at once	Yes

3.2 Content Specialist

System Prompt Core:

You are a content scheduling and social media expert. You help
business owners plan and automate their content distribution.
 
CAPABILITIES:
- Create and manage content calendars
- Schedule social media posts across platforms
- Analyze engagement across channels
- Suggest optimal posting times based on audience data
- Coordinate blog → social cross-posting
 
RULES:
- Respect the tenant's connected platforms (don't suggest unconnected ones)
- Time-aware: consider timezone, holidays, seasonal events
- Multi-platform formatting: adapt content for each platform's constraints
- If no social accounts connected, suggest blog-only strategies

Tools Available:

Tool	Description	Confirmation Required
`content.list_scheduled`	List scheduled content across platforms	No
`content.create_schedule`	Schedule a new social post	No
`content.update_schedule`	Modify a scheduled post's time or content	No
`content.cancel_schedule`	Cancel a scheduled post	Yes
`content.get_engagement`	Get engagement metrics for published content	No
`content.list_platforms`	List connected social platforms	No

3.3 Analytics Specialist

System Prompt Core:

You are a data analyst for a business platform. You help business
owners understand their metrics and make data-driven decisions.
 
CAPABILITIES:
- Diagnose traffic and engagement changes
- Identify trends and anomalies in performance data
- Compare time periods and content types
- Generate actionable recommendations from data
- Cross-reference metrics across services (blog + chat + content)
 
RULES:
- Always show actual numbers, not just percentages
- Compare against the tenant's own baseline, not industry averages
- When diagnosing drops/spikes, check multiple factors (publish frequency,
  content type, seasonality, specific post performance)
- Present findings as: observation → cause → recommendation
- Never attribute causation without supporting data

Tools Available:

Tool	Description	Confirmation Required
`analytics.blog_overview`	Blog metrics: views, posts, categories (date range)	No
`analytics.post_performance`	Individual post metrics with daily breakdown	No
`analytics.traffic_trends`	Daily/weekly traffic with change detection	No
`analytics.content_comparison`	Compare performance across content types	No
`analytics.chat_overview`	Chat metrics: conversations, resolution rate, topics	No
`analytics.cross_product`	Combined metrics across all services	No

3.4 Chat Specialist

System Prompt Core:

You are a chatbot management expert. You help business owners
configure and optimize their AI chatbot.
 
CAPABILITIES:
- Configure chatbot persona and behavior
- Manage knowledge base (add/remove/update sources)
- Review conversation quality and resolution rates
- Set up lead capture rules
- Manage channel integrations (WhatsApp, website widget)
 
RULES:
- Knowledge base changes affect live chatbot responses — always confirm
- When reviewing conversations, respect end-customer privacy
- For persona changes, suggest testing in sandbox first
- Lead capture rules should be non-intrusive (don't interrupt every conversation)

Tools Available:

Tool	Description	Confirmation Required
`chat.get_config`	Get current chatbot configuration	No
`chat.update_config`	Update chatbot persona, temperature, model	Yes
`chat.list_knowledge`	List knowledge base documents	No
`chat.add_knowledge`	Ingest a new knowledge source (URL, PDF, Q&A)	No
`chat.remove_knowledge`	Remove a knowledge source	Yes
`chat.get_conversations`	List recent chatbot conversations	No
`chat.get_unanswered`	List questions the chatbot couldn't answer	No
`chat.list_leads`	List captured leads	No

4. Tool Registry

4.1 Registration Pattern

Each service registers its tools as a typed module:

// packages/brain/src/tools/blog.ts
 
import { defineTool } from './registry'
 
export const blogTools = [
  defineTool({
    name: 'blog.list_posts',
    service: 'blog',
    description: 'List blog posts with optional filters for status, category, and date range',
    parameters: {
      type: 'object',
      properties: {
        status: { type: 'string', enum: ['draft', 'published', 'scheduled', 'all'] },
        category: { type: 'string', description: 'Category slug to filter by' },
        limit: { type: 'number', default: 10, maximum: 50 },
        offset: { type: 'number', default: 0 }
      }
    },
    requiresConfirmation: false,
    requiredPermission: 'blog.read',
    execute: async (params, ctx) => {
      return ctx.services.blog.listPosts(params, ctx.tenantId)
    }
  }),
 
  defineTool({
    name: 'blog.delete_post',
    service: 'blog',
    description: 'Permanently delete a blog post. This cannot be undone.',
    parameters: {
      type: 'object',
      properties: {
        post_id: { type: 'string', description: 'The ID of the post to delete' }
      },
      required: ['post_id']
    },
    requiresConfirmation: true,  // Agent MUST ask user before calling
    requiredPermission: 'blog.delete',
    execute: async (params, ctx) => {
      return ctx.services.blog.deletePost(params.post_id, ctx.tenantId)
    }
  })
]
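
The spec does not show `defineTool` itself. A plausible minimal version is an identity helper that gives the definition a type and rejects obviously inconsistent entries at load time. Everything below is an assumption for illustration: the `ToolDefinition` and `ToolContext` shapes are inferred from the two examples above (with `ctx.services` omitted), and the two sanity checks are not stated anywhere in the spec.

```typescript
// Hypothetical context passed to execute(); the real one also carries ctx.services.
interface ToolContext {
  tenantId: string;
}

// Hypothetical definition shape, inferred from the fields used in section 4.1.
interface ToolDefinition<P> {
  name: string;
  service: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
  requiresConfirmation: boolean;
  requiredPermission: string;
  execute: (params: P, ctx: ToolContext) => Promise<unknown>;
}

// Identity helper: returns the definition unchanged after basic sanity checks.
function defineTool<P>(def: ToolDefinition<P>): ToolDefinition<P> {
  // Tool names are namespaced by service, e.g. "blog.list_posts".
  if (def.name.indexOf(def.service + ".") !== 0) {
    throw new Error(`Tool "${def.name}" must be prefixed with "${def.service}."`);
  }
  // Every required parameter must be declared under properties.
  for (const key of def.parameters.required || []) {
    if (!(key in def.parameters.properties)) {
      throw new Error(`Tool "${def.name}" requires undeclared parameter "${key}"`);
    }
  }
  return def;
}
```

Failing at definition time means a mislabeled tool is caught when the module loads rather than when an agent first tries to call it.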

4.2 Tool Registry Aggregation

All tool modules are combined into a single registry:

// packages/brain/src/tools/registry.ts
 
import { blogTools } from './blog'
import { contentTools } from './content'
import { analyticsTools } from './analytics'
import { chatTools } from './chat'
 
const allTools = [
  ...blogTools,
  ...contentTools,
  ...analyticsTools,
  ...chatTools
]
 
// Get tools for a specific specialist
export function getToolsForAgent(agentType: AgentType): ToolDefinition[] {
  const toolMap: Record<AgentType, string[]> = {
    blog: ['blog.*'],
    content: ['content.*'],
    analytics: ['analytics.*', 'blog.get_*', 'content.get_*'],
    chat: ['chat.*'],
    general: []  // no tools
  }
  
  return allTools.filter(tool => 
    toolMap[agentType].some(pattern => matchGlob(tool.name, pattern))
  )
}
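
`matchGlob` is used above but not defined in the spec. A minimal sketch, assuming the only wildcard needed is `*` (any run of characters) and that every other character, including the dot, matches literally:

```typescript
// Hypothetical matchGlob: supports only "*" as "any run of characters".
function matchGlob(value: string, pattern: string): boolean {
  const escaped = pattern
    .split("*")
    // Escape regex metacharacters so "." in "blog.get_*" matches a literal dot.
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp("^" + escaped + "$").test(value);
}

// Example: which of these tool names the analytics patterns select.
const sampleToolNames = [
  "blog.list_posts",
  "blog.get_post",
  "blog.get_analytics",
  "blog.delete_post",
  "content.get_engagement",
  "analytics.traffic_trends",
];
const analyticsPatterns = ["analytics.*", "blog.get_*", "content.get_*"];
const analyticsSelection = sampleToolNames.filter((toolName) =>
  analyticsPatterns.some((pattern) => matchGlob(toolName, pattern))
);
// analyticsSelection: blog.get_post, blog.get_analytics,
// content.get_engagement, analytics.traffic_trends
```

With these patterns the analytics specialist gets read-only `get_` tools from other services but none of their write or delete tools.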

4.3 Tool Execution Pipeline

When an agent decides to call a tool, the call goes through an execution pipeline.
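
The individual stages are not enumerated in this section. One plausible ordering, pieced together from the `requiredPermission` and `requiresConfirmation` fields in 4.1 and the failure table in 6.1, is: permission check, confirmation gate, execution, and conversion of any thrown error into a tool result. The sketch below is an assumption, and `PipelineTool`, `CallContext`, `ToolResult`, and `executeTool` are hypothetical names.

```typescript
// Hypothetical execution pipeline; the stage order is an assumption.
interface PipelineTool {
  name: string;
  requiresConfirmation: boolean;
  requiredPermission: string;
  execute: (params: Record<string, unknown>) => Promise<unknown>;
}

interface CallContext {
  permissions: string[];  // permissions held by the current user
  userConfirmed: boolean; // true once the user approved this specific call
}

type ToolResult =
  | { status: "ok"; data: unknown }
  | { status: "permission_denied" }
  | { status: "needs_confirmation" }
  | { status: "error"; message: string };

async function executeTool(
  tool: PipelineTool,
  params: Record<string, unknown>,
  ctx: CallContext
): Promise<ToolResult> {
  // 1. Permission check: never reach the service without the required permission.
  if (ctx.permissions.indexOf(tool.requiredPermission) === -1) {
    return { status: "permission_denied" };
  }
  // 2. Confirmation gate: destructive tools wait for explicit user approval.
  if (tool.requiresConfirmation && !ctx.userConfirmed) {
    return { status: "needs_confirmation" };
  }
  // 3. Execute, returning failures as data so the agent can reason about them.
  try {
    return { status: "ok", data: await tool.execute(params) };
  } catch (err) {
    return { status: "error", message: err instanceof Error ? err.message : String(err) };
  }
}
```

Returning a result object instead of throwing matches the behavior in 6.1, where the agent receives the error as a tool_result and responds to the user.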

4.4 Adding New Tools (Extensibility)

When a new LogicSpike service launches (e.g., newsletter service), extending the AI Brain is straightforward:

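1. Create packages/brain/src/tools/newsletter.ts with tool definitions.
2. Import the module and spread it into the registry. If the service gets its own specialist, also add a toolMap entry for it in getToolsForAgent.
3. Create a newsletter specialist agent config (or extend an existing specialist).
4. Deploy; the router automatically picks up the new specialist.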

No changes to the router's intent classifier are needed if the new domain maps to an existing intent category. For a genuinely new domain, add a new intent category to the router's classification prompt.

5. Agent Context Injection

5.1 What Each Agent Receives

Each specialist receives a different context payload, optimized for its domain:

Context Section	Router	Blog Specialist	Analytics Specialist	Content Specialist	Chat Specialist
System prompt	Classifier	Blog expert	Data analyst	Content planner	Chatbot manager
Tenant profile	Minimal	Full	Full	Full	Full
Service data	None	Blog metrics + posts	All service metrics	Content calendar + social	Chat config + conversations
Memories	None	Blog-linked memories	All memories	Content-linked memories	Chat-linked memories
Tools	None	`blog.*`	`analytics.*`, `blog.get_*`, `content.get_*`	`content.*`	`chat.*`
History	Last 2 turns	Full window (20 turns)	Full window	Full window	Full window

5.2 Context Handoff Between Agents

In multi-step orchestration, the router passes results between specialists:

Step 1: Analytics specialist runs → produces output:
  "Top performing topics: recipes (2.8x), seasonal content (2.1x), tutorials (1.0x baseline)"
 
Step 2: Blog specialist receives this as injected context:
  PRIOR_ANALYSIS: "Analytics determined top performing topics:
  recipes (2.8x), seasonal (2.1x), tutorials (baseline).
  Use this data to inform content calendar creation."

The handoff is explicit — the router constructs the injection, not the specialists. This prevents specialists from hallucinating about other domains.
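
A router-side helper for building that injection might look like the sketch below. It is illustrative only: `PriorStep`, `buildHandoffContext`, and the length cap are hypothetical, and the exact wording of the injected block is an assumption modeled on the PRIOR_ANALYSIS example above.

```typescript
// Hypothetical record of a finished step.
interface PriorStep {
  specialist: string; // e.g. "analytics"
  output: string;     // the specialist's final text
}

// Builds the text the router injects into the next specialist's context.
// maxChars caps the quoted output so a long answer cannot crowd the context window.
function buildHandoffContext(prior: PriorStep, instruction: string, maxChars = 500): string {
  const quoted =
    prior.output.length > maxChars ? prior.output.slice(0, maxChars) + "..." : prior.output;
  return `PRIOR_ANALYSIS (from ${prior.specialist} specialist): "${quoted}"\n${instruction}`;
}
```

Because only the router calls this helper, the second specialist sees a fixed, labeled block rather than free-form text from another agent.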

6. Error Handling & Resilience

6.1 Tool Call Failures

Failure	Recovery
Tool returns error (e.g., post not found)	Agent receives error as tool_result → reasons about it → responds to user
Tool times out (> 10s)	Retry once. If second attempt fails, agent informs user: "I couldn't fetch your blog data right now"
Permission denied	Agent informs user: "You don't have permission for that. Ask your workspace admin."
Rate limit hit	Agent informs user with remaining quota: "You've hit your AI limit for this month"
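
The timeout row (retry once, then give up) can be sketched as a small wrapper. This is a hedged illustration, not the platform's code: `withTimeout`, `callWithOneRetry`, and the sentinel message are hypothetical, and only the 10s default comes from the table above.

```typescript
// Sentinel used to tell timeouts apart from ordinary tool errors (hypothetical).
const TOOL_TIMEOUT_MESSAGE = "tool call timed out";

// Rejects if the call has not settled within ms milliseconds.
function withTimeout<T>(call: () => Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(TOOL_TIMEOUT_MESSAGE)), ms);
    call().then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      }
    );
  });
}

// Retries exactly once, and only when the first attempt timed out.
// Ordinary tool errors (e.g. post not found) propagate immediately.
async function callWithOneRetry<T>(call: () => Promise<T>, timeoutMs = 10000): Promise<T> {
  try {
    return await withTimeout(call, timeoutMs);
  } catch (err) {
    const timedOut = err instanceof Error && err.message === TOOL_TIMEOUT_MESSAGE;
    if (!timedOut) throw err;
    return withTimeout(call, timeoutMs); // second and final attempt
  }
}
```

Note that the timed-out first call is abandoned, not cancelled; a production version would likely pass an abort signal to the underlying request.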

6.2 LLM Provider Failures

Attempt 1: Primary provider (e.g., Claude Sonnet)
  └── Timeout (30s) or 5xx error
      │
Attempt 2: Secondary provider (e.g., GPT-4o)
  └── Timeout (30s) or 5xx error
      │
Attempt 3: Tertiary provider (e.g., Gemini 1.5 Flash)
  └── Timeout (30s) or 5xx error
      │
Graceful failure: "I'm experiencing technical difficulties. Please try again in a few minutes."

Circuit breaker: If a provider fails 3+ times in 5 minutes, skip it for the next 10 minutes. Log the outage.
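
The circuit-breaker rule can be sketched with a per-provider failure window. The thresholds (3 failures, 5 minutes, 10 minutes) come from the text above; the class, its method names, and the idea of passing the clock in as a parameter are assumptions made for illustration.

```typescript
// Hypothetical circuit breaker: 3 failures within 5 minutes opens it for 10 minutes.
class ProviderBreaker {
  private failureTimes: number[] = []; // timestamps (ms) of recent failures
  private openUntil = 0;               // provider is skipped while now < openUntil
  private readonly maxFailures: number;
  private readonly windowMs: number;
  private readonly cooldownMs: number;

  constructor(maxFailures = 3, windowMs = 5 * 60000, cooldownMs = 10 * 60000) {
    this.maxFailures = maxFailures;
    this.windowMs = windowMs;
    this.cooldownMs = cooldownMs;
  }

  recordFailure(now: number): void {
    // Keep only failures that are still inside the sliding window.
    this.failureTimes = this.failureTimes.filter((t) => now - t < this.windowMs);
    this.failureTimes.push(now);
    if (this.failureTimes.length >= this.maxFailures) {
      this.openUntil = now + this.cooldownMs;
      this.failureTimes = [];
    }
  }

  isAvailable(now: number): boolean {
    return now >= this.openUntil;
  }
}

// Returns the providers to try, in priority order, skipping any with an open breaker.
function providersToTry(
  order: string[],
  breakers: Record<string, ProviderBreaker>,
  now: number
): string[] {
  return order.filter((provider) => {
    const breaker = breakers[provider];
    return breaker === undefined || breaker.isAvailable(now);
  });
}
```

Passing `now` explicitly keeps the breaker deterministic under test; a caller would normally supply `Date.now()`.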

6.3 Agent Confusion

If the specialist produces a response that seems off-topic or hallucinatory:

- The router does NOT validate specialist outputs (too expensive)
- Instead, the eval scorer tracks user satisfaction signals (rephrasing, escalation)
- Persistently poor specialist → alert for prompt engineering review

7. Performance Budgets

Stage	Budget	Notes
Router classification	< 150ms	Haiku-class, ~80 tokens total
Context assembly	< 50ms	Parallel DB queries + KV read
Token budget compression	< 5ms	In-memory computation
Specialist first token	< 500ms	Sonnet-class, streaming
Total first token (user perception)	< 700ms	Approximate sum of the stages above (the four budgets add up to 705ms)
Full response	1–5s	Depends on response length
Tool execution (per call)	< 2s	Internal service call via gateway
Post-response async work	Non-blocking	Queue + KV write, invisible to user
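
The four stages that precede the first token have budgets of 150, 50, 5, and 500ms, which add up to 705ms. A small check like the following could flag which stage exceeded its budget on a given request. It is a sketch only: the stage keys, `sumMs`, and `stagesOverBudget` are hypothetical names, and only the numbers are taken from the table.

```typescript
// Stage budgets in milliseconds, copied from the table; key names are hypothetical.
const FIRST_TOKEN_BUDGETS_MS: Record<string, number> = {
  router_classification: 150,
  context_assembly: 50,
  token_budget_compression: 5,
  specialist_first_token: 500,
};

// Sums all stage values.
function sumMs(stages: Record<string, number>): number {
  return Object.keys(stages).reduce((total, key) => total + stages[key], 0);
}

// Lists stages whose measured time met or exceeded the budget.
// Budgets are strict upper bounds ("< 150ms"), so equality counts as a miss.
// A stage with no measurement is treated as 0ms, i.e. within budget.
function stagesOverBudget(
  measuredMs: Record<string, number>,
  budgetsMs: Record<string, number>
): string[] {
  return Object.keys(budgetsMs).filter(
    (stage) => (measuredMs[stage] || 0) >= budgetsMs[stage]
  );
}

const budgetTotalMs = sumMs(FIRST_TOKEN_BUDGETS_MS); // 705
```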