AgentFleet architecture

At a glance

  • LangGraph-based orchestration sits in the middle. Models, memory, tools, and observability hang off it.
  • Single integration layer in front of customer systems (TMS/WMS, ERPs, TOS, telephony). One MCP server per integration.
  • Stateful, audited, model-agnostic. Cloud, on-prem, or hybrid.

Why this matters

Every conversation with a customer eventually reaches “how does this actually work?” If you can sketch this diagram on a whiteboard and explain each block in one sentence, you can answer 80% of architecture questions in discovery and security reviews. The remaining 20% is covered by the security & compliance and deployment pages.

The diagram

Reading the diagram

Channels in → Supervisor routes → Agents do the work → Models reason, memory remembers, tools act → Observability records everything.

1. Channels

Voice (via SIP trunking to telephony providers like Avaya, Unifonic, Exotel), WhatsApp, email, and event triggers (e.g. an order status change in the TMS). Same agent, multiple channels; context is preserved across them.
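
One way to picture "same agent, multiple channels" is a common event envelope keyed by a stable contact identifier, so every channel appends to the same history. The sketch below is illustrative only: `InboundEvent`, `ConversationStore`, and `contact_id` are hypothetical names, not taken from the agent-platform repo.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class InboundEvent:
    channel: str      # "voice", "whatsapp", "email", or "event"
    contact_id: str   # hypothetical stable identifier for the counterparty
    text: str         # transcript, message body, or event description


@dataclass
class ConversationStore:
    # One shared history per contact, regardless of the channel used.
    histories: dict = field(default_factory=lambda: defaultdict(list))

    def record(self, event: InboundEvent) -> list:
        """Append the event and return the full cross-channel history."""
        self.histories[event.contact_id].append((event.channel, event.text))
        return self.histories[event.contact_id]


store = ConversationStore()
store.record(InboundEvent("whatsapp", "cust-42", "Where is order 1001?"))
context = store.record(InboundEvent("voice", "cust-42", "Calling about the same order."))
```

Here the voice call sees the earlier WhatsApp message because both events resolve to the same `contact_id`.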

2. Orchestration

LangGraph-based, open source. A supervisor agent owns the conversation and delegates sub-tasks to specialist agents based on SOPs and policies the customer encodes. The workflow is a multi-node graph that holds state across turns.

Why LangGraph (and not LangChain alone): we need graph-based state machines, not just chains. An agent can loop back, branch, or wait for human input.
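
To make the loop, branch, and wait behaviour concrete, here is a minimal state-machine sketch in plain Python. It deliberately does not use LangGraph's API; the node names (`supervisor`, `tracking_agent`, `billing_agent`), the state keys, and the routing rule are all assumptions made for illustration.

```python
from typing import Callable, Dict

State = dict
Node = Callable[[State], str]  # a node updates the state and names the next node


def supervisor(state: State) -> str:
    # Branch: finish, pause for a human, or delegate to a specialist.
    if state.get("resolved"):
        return "END"
    if state.get("needs_human") and "human_reply" not in state:
        return "WAIT_FOR_HUMAN"
    return "tracking_agent" if "track" in state["request"] else "billing_agent"


def tracking_agent(state: State) -> str:
    state.setdefault("trace", []).append("tracking_agent")
    if "human_reply" in state:
        state["resolved"] = True
    else:
        state["needs_human"] = True  # assumed policy: a person must approve
    return "supervisor"              # loop back to the supervisor


def billing_agent(state: State) -> str:
    state.setdefault("trace", []).append("billing_agent")
    state["resolved"] = True
    return "supervisor"


NODES: Dict[str, Node] = {
    "supervisor": supervisor,
    "tracking_agent": tracking_agent,
    "billing_agent": billing_agent,
}


def run(state: State, start: str = "supervisor", max_steps: int = 20) -> State:
    node = start
    for _ in range(max_steps):
        if node in ("END", "WAIT_FOR_HUMAN"):
            break
        node = NODES[node](state)
    state["status"] = node  # state survives the pause, so a later run can resume
    return state


state = run({"request": "track order 1001"})
paused_status = state["status"]
state["human_reply"] = "approved"
final = run(state)
```

The first `run` stops at `WAIT_FOR_HUMAN`; once a reply is added to the same state, the second `run` loops through the specialist again and reaches `END`. A plain chain cannot express that pause-and-resume cycle.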

3. Models

Model-agnostic. The platform routes a given task to the right model:

Task                                 | Typical choice              | Why
-------------------------------------|-----------------------------|----------------------------------
Conversational reasoning             | GPT-4o, Claude Sonnet       | Strongest tool use + reasoning
Cheap classification, summarisation  | GPT-4o-mini, Gemini Flash   | 10× cheaper, sufficient
Voice transcription                  | Whisper                     | Quality + language coverage
Vision (PODs, invoices)              | GPT-4o vision, Gemini       | OCR + layout understanding
Rule-based (e.g. address normalize)  | Proprietary                 | Deterministic, fast, no LLM needed
Sensitive / on-prem                  | Llama, Mistral (fine-tuned) | Run inside customer’s VPC
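
The table above can be read as a lookup from task type to an ordered list of candidate models. The sketch below shows that idea under stated assumptions: the task keys, the model label strings, the `sensitive` flag, and the fallback-on-unavailable behaviour are illustrative, not the platform's actual router.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    candidates: tuple      # ordered by preference
    uses_llm: bool = True


# Labels are placeholders for whatever identifiers the platform really uses.
ROUTES = {
    "conversational_reasoning": Route(("gpt-4o", "claude-sonnet")),
    "classification": Route(("gpt-4o-mini", "gemini-flash")),
    "summarisation": Route(("gpt-4o-mini", "gemini-flash")),
    "voice_transcription": Route(("whisper",)),
    "vision": Route(("gpt-4o", "gemini")),
    "rule_based": Route(("proprietary-rules",), uses_llm=False),
}
ON_PREM = Route(("llama-finetuned", "mistral-finetuned"))


def choose_model(task_type: str, sensitive: bool = False,
                 unavailable: frozenset = frozenset()) -> str:
    """Return the first available candidate for the task."""
    route = ROUTES[task_type]
    # Simplification: any sensitive LLM task is sent to the on-prem models.
    if sensitive and route.uses_llm:
        route = ON_PREM
    for model in route.candidates:
        if model not in unavailable:
            return model
    raise LookupError(f"no available model for task {task_type!r}")
```

For example, `choose_model("conversational_reasoning", unavailable=frozenset({"gpt-4o"}))` falls through to the second candidate, which is the practical meaning of model-agnostic here.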

See Models — choosing & switching for the decision matrix.

4. Memory

  • Short-term: the current conversation. Held in process. Cleared on session end.
  • Long-term: vector DB (Pinecone default; FAISS for on-prem). Stores customer SOPs, product docs, historical context. Queried via RAG.
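
The two layers can be sketched as follows. This is a toy: the bag-of-words `embed` function and in-memory list stand in for a real embedding model and for Pinecone or FAISS, and the class names are hypothetical. It only shows the shape of the flow: session turns live in process, while long-term documents are retrieved by similarity and attached to the prompt.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase word counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class LongTermMemory:
    """In-memory stand-in for the vector DB."""

    def __init__(self):
        self.docs = []

    def add(self, text: str) -> None:
        self.docs.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list:
        q = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


class Session:
    """Short-term memory: held in process, cleared on session end."""

    def __init__(self, long_term: LongTermMemory):
        self.turns = []
        self.long_term = long_term

    def build_prompt(self, user_message: str) -> dict:
        self.turns.append(user_message)
        return {"history": list(self.turns),
                "retrieved": self.long_term.search(user_message)}

    def end(self) -> None:
        self.turns.clear()


memory = LongTermMemory()
memory.add("Refunds for damaged goods require a photo within 48 hours")
memory.add("Delivery reattempts are scheduled for the next working day")

session = Session(memory)
prompt = session.build_prompt("customer reports damaged goods")
session.end()
```

The query shares the words "damaged" and "goods" with the first document only, so that document is the one retrieved.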

See Memory layers.

5. Tools (MCP)

Every external action the agent takes goes through a tool. Tools are exposed via MCP servers — one per system. This is the integration layer: adding a new customer system means writing one MCP server, after which every agent can use it.
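
The "one server per system, usable by every agent" idea can be sketched with a small registry. This does not use the MCP SDK or wire protocol; `ToolServer`, `ToolRegistry`, and the `get_order_status` tool with its canned response are hypothetical stand-ins that show only the shape of the integration layer.

```python
from typing import Callable, Dict


class ToolServer:
    """Stand-in for one MCP server: wraps a single external system."""

    def __init__(self, system: str):
        self.system = system
        self.tools: Dict[str, Callable] = {}

    def tool(self, name: str):
        def register(fn: Callable) -> Callable:
            self.tools[name] = fn
            return fn
        return register


class ToolRegistry:
    """The single place agents go to reach any integrated system."""

    def __init__(self):
        self.servers: Dict[str, ToolServer] = {}

    def add_server(self, server: ToolServer) -> None:
        self.servers[server.system] = server

    def list_tools(self) -> list:
        return sorted(f"{name}.{tool}"
                      for name, server in self.servers.items()
                      for tool in server.tools)

    def call(self, system: str, tool: str, **kwargs):
        return self.servers[system].tools[tool](**kwargs)


tms = ToolServer("tms")


@tms.tool("get_order_status")
def get_order_status(order_id: str) -> dict:
    # Canned response; a real server would query the customer's TMS.
    return {"order_id": order_id, "status": "in_transit"}


registry = ToolRegistry()
registry.add_server(tms)
result = registry.call("tms", "get_order_status", order_id="1001")
```

Adding a second system means defining another `ToolServer` and registering it; no agent code changes, because agents only ever talk to the registry.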

See Tools & MCP integrations.

6. Observability

Every reasoning step, tool call, and decision is logged. The AgentFleet Dashboard surfaces real-time metrics: latency, accuracy, escalation rate, HITL queue length. Anomalies trigger alerts.
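
As a rough picture of how logged steps turn into dashboard metrics, the sketch below records structured events and derives two numbers from them. The event fields, the `kind` values, and the metric definitions are assumptions for illustration; the real Dashboard may compute these differently.

```python
import json
import time
from dataclasses import asdict, dataclass, field


@dataclass
class TraceEvent:
    conversation_id: str
    kind: str            # "reasoning", "tool_call", "decision", or "escalation"
    detail: str
    latency_ms: float
    ts: float = field(default_factory=time.time)


class TraceLog:
    def __init__(self):
        self.events = []

    def record(self, event: TraceEvent) -> str:
        """Store the event and return the JSON line that would be shipped."""
        self.events.append(event)
        return json.dumps(asdict(event))

    def escalation_rate(self) -> float:
        # Share of conversations with at least one escalation event.
        conversations = {e.conversation_id for e in self.events}
        escalated = {e.conversation_id for e in self.events
                     if e.kind == "escalation"}
        return len(escalated) / len(conversations) if conversations else 0.0

    def mean_latency_ms(self, kind: str) -> float:
        values = [e.latency_ms for e in self.events if e.kind == kind]
        return sum(values) / len(values) if values else 0.0


log = TraceLog()
log.record(TraceEvent("c1", "reasoning", "classify intent", 300.0))
log.record(TraceEvent("c1", "tool_call", "tms.get_order_status", 120.0))
log.record(TraceEvent("c2", "tool_call", "tms.get_order_status", 80.0))
line = log.record(TraceEvent("c2", "escalation", "handed to human queue", 0.0))
```

With these four events, one of the two conversations escalated, and the two tool calls average 100 ms. An alerting rule would compare such values against thresholds.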

See Observability & monitoring.

What’s not in the diagram

Three things newcomers commonly assume are part of the platform but aren’t:

  • Customer’s CRM (Salesforce, HubSpot). We integrate, we don’t replace.
  • Customer’s TOS / OMS / ERP. Same.
  • The conversational front-end for end-users. The customer’s existing channels (their phone tree, their WhatsApp Business number, their email) front Shipsy — we don’t ship a chat widget.

Try it yourself

Open the agent-platform repo and trace one agent (Clara is the most documented — see agents/clara) from inbound call to outbound response. Map each step to a block in the diagram above. If you can’t map a step, that’s a gap in this page — file an issue.

For a guided walkthrough using Claude Code, see Querying the repo with Claude Code.

Sources

  • Carrix proposal deck, “Architecture and Key Components” section
  • BDO Unibank deck, same section
  • agent-platform repo

Changelog

  • 26 May 2026: Initial draft.