AgentFleet architecture

At a glance

LangGraph-based orchestration sits in the middle. Models, memory, tools, and observability hang off it.
Single integration layer in front of customer systems (TMS/WMS, ERPs, TOS, telephony). One MCP server per integration.
Stateful, audited, model-agnostic. Cloud, on-prem, or hybrid.

Why this matters

Every conversation with a customer eventually reaches “how does this actually work?” If you can sketch this diagram on a whiteboard and explain each block in one sentence, you can answer 80% of architecture questions in discovery and security reviews. The remaining 20% is covered by the Security & compliance and Deployment pages.

The diagram

Reading the diagram

Channels in → Supervisor routes → Agents do the work → Models reason, memory remembers, tools act → Observability records everything.

1. Channels

Voice (via SIP trunking to telephony providers like Avaya, Unifonic, Exotel), WhatsApp, email, and event triggers (e.g. an order status change in the TMS). Same agent, multiple channels; context is preserved across them.
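One way to picture "same agent, multiple channels" is a channel-neutral event envelope feeding a conversation store keyed by customer. The sketch below is illustrative only: `InboundEvent`, `Conversation`, `ConversationStore`, and the idea of keying on a customer ID are assumptions made for this example, not names or design taken from the agent-platform repo.

```python
from dataclasses import dataclass, field


@dataclass
class InboundEvent:
    """Channel-neutral envelope; all field names are hypothetical."""
    channel: str        # "voice", "whatsapp", "email", or "event"
    customer_id: str    # key used to find the shared conversation
    text: str           # transcript, message body, or event description


@dataclass
class Conversation:
    customer_id: str
    history: list = field(default_factory=list)


class ConversationStore:
    """Keeps one conversation per customer, regardless of channel."""

    def __init__(self):
        self._by_customer = {}

    def record(self, event: InboundEvent) -> Conversation:
        # Reuse the existing conversation if this customer has one.
        convo = self._by_customer.setdefault(
            event.customer_id, Conversation(event.customer_id)
        )
        convo.history.append((event.channel, event.text))
        return convo


store = ConversationStore()
store.record(InboundEvent("whatsapp", "cust-42", "Where is my order?"))
convo = store.record(InboundEvent("voice", "cust-42", "Calling about the same order."))
```

After the second call, `convo.history` holds both the WhatsApp message and the voice turn, so the agent handling the call can see what was already asked.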

2. Orchestration

LangGraph-based, open source. A supervisor agent owns the conversation and delegates sub-tasks to specialist agents based on SOPs and policies the customer encodes. The workflow is a multi-node graph that holds state across turns.

Why LangGraph (and not LangChain alone): we need graph-based state machines, not just chains. An agent can loop back, branch, or wait for human input.
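The difference between a chain and a graph-based state machine is easier to see in code. The sketch below is deliberately plain Python rather than LangGraph's actual API, so it runs without dependencies. The node names, the `approval_limit` rule, and the flat pricing rate are all invented for illustration.

```python
from typing import Callable, Dict

State = dict
Node = Callable[[State], str]  # a node updates state and returns the next node's name

END = "end"
WAIT = "wait_for_human"


def supervisor(state: State) -> str:
    # Branch: decide the next step from what the state already contains.
    if state.get("approved") is False:
        return END
    if "quote" not in state:
        return "pricing_agent"
    if state["quote"] > state["approval_limit"] and "approved" not in state:
        return WAIT  # pause until a human decides
    return END


def pricing_agent(state: State) -> str:
    state["quote"] = state["weight_kg"] * 12  # hypothetical flat rate
    return "supervisor"  # loop back to the supervisor


NODES: Dict[str, Node] = {"supervisor": supervisor, "pricing_agent": pricing_agent}


def run(state: State, start: str = "supervisor", max_steps: int = 20) -> str:
    """Advance until the graph ends or pauses for a human; return where it stopped."""
    current = start
    for _ in range(max_steps):
        state.setdefault("trace", []).append(current)
        current = NODES[current](state)
        if current in (END, WAIT):
            return current
    raise RuntimeError("step limit reached; possible infinite loop")


state = {"weight_kg": 100, "approval_limit": 1000}
stopped_at = run(state)      # pauses: the quote of 1200 exceeds the limit
state["approved"] = True     # a human signs off out of band
resumed_at = run(state)      # resumes from the supervisor with the saved state
```

A linear chain cannot express the loop back to the supervisor or the pause for approval. Here both fall out of nodes returning the name of the next node while the state dict persists between runs.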

3. Models

Model-agnostic. The platform routes a given task to the right model:

Task	Typical choice	Why
Conversational reasoning	GPT-4o, Claude Sonnet	Strongest tool use + reasoning
Cheap classification, summarisation	GPT-4o-mini, Gemini Flash	Roughly 10× cheaper than the flagship models; sufficient for these tasks
Voice transcription	Whisper	Quality + language coverage
Vision (PODs, invoices)	GPT-4o vision, Gemini	OCR + layout understanding
Rule-based (e.g. address normalisation)	Proprietary	Deterministic, fast, no LLM needed
Sensitive / on-prem	Llama, Mistral (fine-tuned)	Run inside customer’s VPC

See Models — choosing & switching for the decision matrix.
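The table above can be read as a routing function. The following is a minimal sketch under stated assumptions: the task keys, the `choose_model` function, the `"rule-engine"` label, the fallback order, and the rule that sensitive work always goes on-prem are simplifications invented here. The model names are display labels copied from the table, not provider API identifiers.

```python
# Candidate models per task, in preference order (labels from the table above).
ROUTES = {
    "conversation": ["GPT-4o", "Claude Sonnet"],
    "classification": ["GPT-4o-mini", "Gemini Flash"],
    "summarisation": ["GPT-4o-mini", "Gemini Flash"],
    "transcription": ["Whisper"],
    "vision": ["GPT-4o vision", "Gemini"],
}
ON_PREM = ["Llama", "Mistral"]            # run inside the customer's VPC
RULE_BASED = {"address_normalisation"}    # deterministic, no LLM involved


def choose_model(task: str, sensitive: bool = False,
                 unavailable: frozenset = frozenset()) -> str:
    """Pick the first available candidate for a task (hypothetical policy)."""
    if task in RULE_BASED:
        return "rule-engine"
    # Assumption: sensitive data overrides the task-based choice.
    candidates = ON_PREM if sensitive else ROUTES[task]
    for model in candidates:
        if model not in unavailable:
            return model
    raise LookupError(f"no available model for task {task!r}")


default_choice = choose_model("conversation")
fallback_choice = choose_model("conversation", unavailable=frozenset({"GPT-4o"}))
sensitive_choice = choose_model("classification", sensitive=True)
```

Keeping the routing in data rather than in agent code is what makes switching models a configuration change instead of a rewrite.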

4. Memory

Short-term: the current conversation. Held in process. Cleared on session end.
Long-term: vector DB (Pinecone default; FAISS for on-prem). Stores customer SOPs, product docs, historical context. Queried via RAG.

See Memory layers.
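To make the two layers concrete, here is a toy sketch of a session that keeps short-term turns in process and retrieves from a long-term store. It is not the Pinecone or FAISS API: `LongTermMemory`, `Session`, and the bag-of-words `embed` function are stand-ins invented for this example, and a real deployment would use an embedding model and a vector database.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; stands in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class LongTermMemory:
    """Stand-in for the vector DB: persists across sessions."""

    def __init__(self):
        self._docs = []

    def add(self, text: str) -> None:
        self._docs.append((embed(text), text))

    def search(self, query: str, k: int = 1) -> list:
        q = embed(query)
        ranked = sorted(self._docs, key=lambda d: cosine(q, d[0]), reverse=True)
        return [text for _, text in ranked[:k]]


class Session:
    """Short-term memory: held in process, cleared when the session ends."""

    def __init__(self, long_term: LongTermMemory):
        self.turns = []
        self.long_term = long_term

    def build_prompt(self, user_message: str) -> dict:
        self.turns.append(user_message)
        # RAG step: fetch the most relevant stored document for this turn.
        retrieved = self.long_term.search(user_message, k=1)
        return {"retrieved": retrieved, "history": list(self.turns)}

    def end(self) -> None:
        self.turns.clear()


ltm = LongTermMemory()
ltm.add("SOP: failed delivery attempts are retried the next working day")
ltm.add("Product doc: cold-chain shipments need temperature logs")

session = Session(ltm)
prompt = session.build_prompt("what happens after a failed delivery")
session.end()
```

After `session.end()`, the turn history is gone but the SOP is still retrievable from `ltm`, which is the split the two bullets describe.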

5. Tools (MCP)

Every external action the agent takes goes through a tool. Tools are exposed via MCP servers — one per system. This is the integration layer: adding a new customer system means writing one MCP server, after which every agent can use it.

See Tools & MCP integrations.
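The "one server per system" idea can be sketched as a registry of tool servers. This is a plain-Python illustration of the shape, not the MCP SDK or wire protocol: `ToolServer`, `ToolRegistry`, the `get_order_status` tool, and its canned response are all hypothetical.

```python
class ToolServer:
    """Stand-in for one MCP server fronting one customer system."""

    def __init__(self, system: str):
        self.system = system
        self._tools = {}

    def tool(self, name: str):
        # Decorator that registers a function as a named tool on this server.
        def register(fn):
            self._tools[name] = fn
            return fn
        return register

    def list_tools(self) -> list:
        return sorted(self._tools)

    def call(self, name: str, **arguments):
        return self._tools[name](**arguments)


class ToolRegistry:
    """The integration layer: every agent calls tools through this."""

    def __init__(self):
        self._servers = {}

    def add_server(self, server: ToolServer) -> None:
        self._servers[server.system] = server

    def catalog(self) -> dict:
        return {name: srv.list_tools() for name, srv in self._servers.items()}

    def call(self, system: str, tool: str, **arguments):
        return self._servers[system].call(tool, **arguments)


tms = ToolServer("tms")


@tms.tool("get_order_status")
def get_order_status(order_id: str) -> dict:
    # Hypothetical canned response; a real server would query the TMS.
    return {"order_id": order_id, "status": "out_for_delivery"}


registry = ToolRegistry()
registry.add_server(tms)
result = registry.call("tms", "get_order_status", order_id="ORD-1")
```

Adding a second system means registering another `ToolServer`; no agent code changes, because agents only ever see the registry.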

6. Observability

Every reasoning step, tool call, and decision is logged. The AgentFleet Dashboard surfaces real-time metrics: latency, accuracy, escalation rate, and human-in-the-loop (HITL) queue length. Anomalies trigger alerts.

See Observability & monitoring.
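As a rough sketch of how logged steps could roll up into dashboard metrics and alerts, consider the example below. The `StepRecord` fields, the `summarise` and `alerts` functions, and the 25% escalation threshold are assumptions made for illustration. They do not describe the AgentFleet Dashboard's actual schema or alert rules.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class StepRecord:
    """One logged step; field names are hypothetical."""
    conversation_id: str
    kind: str              # "reasoning", "tool_call", or "decision"
    latency_ms: float
    escalated: bool = False


def summarise(records: list) -> dict:
    """Aggregate step logs into two of the metrics named above."""
    conversations = {r.conversation_id for r in records}
    escalated = {r.conversation_id for r in records if r.escalated}
    return {
        "mean_latency_ms": mean(r.latency_ms for r in records),
        "escalation_rate": len(escalated) / len(conversations),
    }


def alerts(metrics: dict, max_escalation_rate: float = 0.25) -> list:
    """Return alert messages for metrics outside the (assumed) threshold."""
    found = []
    if metrics["escalation_rate"] > max_escalation_rate:
        found.append(
            f"escalation_rate {metrics['escalation_rate']:.2f} exceeds {max_escalation_rate:.2f}"
        )
    return found


log = [
    StepRecord("c1", "reasoning", 100.0),
    StepRecord("c1", "tool_call", 300.0),
    StepRecord("c2", "decision", 200.0, escalated=True),
]
metrics = summarise(log)
raised = alerts(metrics)
```

Here one of two conversations escalated, so the rate is 0.5 and a single alert is raised. `summarise` assumes a non-empty log; a production version would need to handle the empty case.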

What’s not in the diagram

Three things newcomers commonly assume are part of the platform but aren’t:

Customer’s CRM (Salesforce, HubSpot). We integrate, we don’t replace.
Customer’s TOS / OMS / ERP. Same.
The conversational front-end for end-users. The customer’s existing channels (their phone tree, their WhatsApp Business number, their email) front Shipsy — we don’t ship a chat widget.

Try it yourself

Open the agent-platform repo and trace one agent (Clara is the most documented — see agents/clara) from inbound call to outbound response. Map each step to a block in the diagram above. If you can’t map a step, that’s a gap in this page — file an issue.

For a guided walkthrough using Claude Code, see Querying the repo with Claude Code.

Sources

Carrix proposal deck, “Architecture and Key Components” section
BDO Unibank deck, same section
agent-platform repo

Changelog

26 May 2026: Initial draft.
