The agent-platform repo
At a glance
- Repo: github.com/shipsy/agent-platform
- Stack: FastAPI + LangGraph + LangChain + PostgreSQL + Redis + TaskIQ
- Language: Python 3.13 (container runs 3.11-slim)
- Three processes: API server, TaskIQ worker, TaskIQ scheduler
- 277 active branches as of May 2026
Why this matters
Every agent Shipsy deploys — WISMO, Clara, Atlas, freight quotation, address intelligence — runs on this single repo. Understanding its layout means you can trace any agent behavior from API call to LLM response to tool execution. You don’t need to be an engineer to navigate it; you need to know where things live.
Repository structure
agent-platform/
├── app/ # Main application code
│ ├── api/ # REST API routes
│ │ ├── dashboard/ # Dashboard endpoints (agent mgmt, jobs, HITL, policies)
│ │ ├── internal/ # Internal API (agent execution, creation, webhooks)
│ │ ├── internal_dashboard/ # Shipsy-employee-only ops (builder library)
│ │ ├── webhook/ # External vendor callbacks
│ │ ├── callback/ # External tool execution (HMAC auth)
│ │ └── public/ # Health checks (no auth)
│ ├── auth/ # Authentication (ProjectX employee auth, API key, HMAC)
│ ├── communication/ # Multi-channel engine (7 channels, 10+ providers)
│ ├── config/ # App config, rate limiting, response formatting
│ ├── core/ # Domain logic — the heart of the platform
│ │ ├── agents/ # Agent base classes and types
│ │ ├── llm/ # LLM service + provider implementations
│ │ ├── mcp/ # MCP integration layer
│ │ ├── memory/ # MemoryStore (key-value with approval workflow)
│ │ ├── middleware/ # Node-level middleware (14 types)
│ │ ├── orchestration/ # AgentOrchestrator + node builders
│ │ ├── policy/ # Policy framework (5 categories)
│ │ └── tools/ # Tool service + registry
│ ├── db/ # SQLAlchemy models, DAOs, checkpointer, sessions
│ ├── functions/ # Tool implementations (Python function tools)
│ ├── integrations/ # External: ProjectX, LIA, Atlas, geocoding
│ ├── mcp/ # MCP server (FastMCP, mounted at /mcp)
│ ├── middleware/ # HTTP middleware (rate limiting, CORS, logging, New Relic)
│ ├── models/ # Pydantic models (state, agent, workflow, job, task, policy)
│ ├── platform/ # Utilities (caching, Elasticsearch, encryption, webhooks)
│ ├── services/ # Business services (agent, job, task, HITL, policy)
│ ├── triggers/ # Workflow trigger engine
│ └── worker/ # TaskIQ worker (background jobs, follow-ups, timeouts)
├── data/ # Static configuration catalogs
│ ├── agent/agents.json # 12 agent templates
│ ├── tool/tools.json # 18 registered tools
│ ├── prompts/ # 16 system prompt files
│ ├── guardrail/ # 4 guardrail types
│ ├── sop/ # Standard Operating Procedures
│ └── llm_models.json # Supported LLM models
├── agents-helper/ # Internal dev docs (5 files)
├── sdd/ # Software Design Documents (50+ feature HLDs)
├── scripts/ # DB migration + table creation
├── tests/ # Unit, integration, middleware, service tests
├── Dockerfile # Production container
├── Jenkinsfile # CI/CD → AWS ECR → ECS Fargate
├── docker-entrypoint.sh # Starts all 3 processes
├── main.py # Application entrypoint
└── requirements.txt # Python dependencies
Key architectural concepts
Agents are database-driven DAGs
Agents are not hardcoded Python classes. They are workflow definitions stored in PostgreSQL — directed acyclic graphs (DAGs) of nodes and edges. Clara, Vera, Nexa, Maya, Atlas are customer-facing brand names for workflows built from reusable templates.
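To make the "workflow as a DAG" idea concrete, here is a minimal sketch of how a set of nodes and edges can be checked for cycles and put into execution order. The `Node` class, the edge tuples, and the node ids are hypothetical illustrations; the actual PostgreSQL schema and the platform's own validation are not shown in this document.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str
    type: str  # a node type from this document, e.g. "start", "agent", "end"


def topological_order(nodes: list[Node], edges: list[tuple[str, str]]) -> list[str]:
    """Return node ids in a valid execution order; raise ValueError on a cycle."""
    indegree = {n.id: 0 for n in nodes}
    successors: dict[str, list[str]] = {n.id: [] for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    # Kahn's algorithm: repeatedly emit nodes with no unprocessed predecessors.
    ready = deque(nid for nid, deg in indegree.items() if deg == 0)
    order: list[str] = []
    while ready:
        nid = ready.popleft()
        order.append(nid)
        for nxt in successors[nid]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(nodes):
        raise ValueError("workflow contains a cycle, so it is not a DAG")
    return order


# Hypothetical three-node workflow: start -> triage (agent) -> end
nodes = [Node("start", "start"), Node("triage", "agent"), Node("end", "end")]
edges = [("start", "triage"), ("triage", "end")]
order = topological_order(nodes, edges)
```

Because the definition is plain data, the same template can back several branded agents: only the stored nodes, edges, and prompts differ.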
Node types
| Type | What it does |
|---|---|
| start | Entry point: receives input data |
| agent | Sub-agent with tools, system message, structured output |
| llm | Direct LLM call (no tool access) |
| tool_node | Execute a registered tool |
| router | Conditional branching (JMESPath-based edge conditions) |
| condition | Conditional logic |
| prompt | Prompt template node |
| agent_graph | Inline subgraph (nested workflow) |
| end | Terminal node |
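As an illustration of how a router node might choose its outgoing edge, the sketch below evaluates edge conditions against the workflow state. It is a simplified stand-in: the platform uses JMESPath expressions, whereas this example only resolves plain dotted paths with the standard library. The state shape, edge tuples, and target node names are all hypothetical.

```python
from typing import Any


def lookup(state: dict[str, Any], path: str) -> Any:
    """Resolve a dotted path such as 'shipment.status'; return None if a key is missing."""
    current: Any = state
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def pick_edge(state: dict[str, Any], edges: list[tuple[str, Any, str]], default: str) -> str:
    """Return the target of the first edge whose (path, expected) condition matches."""
    for path, expected, target in edges:
        if lookup(state, path) == expected:
            return target
    return default


# Hypothetical router configuration: (path, expected value, target node id)
router_edges = [
    ("shipment.status", "delayed", "escalate"),
    ("shipment.status", "delivered", "close"),
]
chosen = pick_edge({"shipment": {"status": "delayed"}}, router_edges, default="end")
```

A real JMESPath condition can express much more (filters, projections, functions); the point here is only that edges are selected by data-driven conditions rather than by hardcoded branches.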
Execution model
1. API receives trigger → creates Job (status: QUEUED)
2. TaskIQ worker picks up job → status: RUNNING
3. AgentOrchestrator builds LangGraph StateGraph from workflow DAG
4. Each node executes through 14 middleware layers
5. Tasks created for each node execution (tracking cost, tokens, latency)
6. On completion → status: SUCCESS (or FAILED/INTERRUPTED for HITL)
API routes
| Prefix | Auth | Purpose |
|---|---|---|
| /api/internal | API key + Shipsy employee | Agent execution, creation, HITL actions |
| /api/dashboard | ProjectX user auth | Agent management, jobs, policies, activity |
| /api/internal-dashboard | Shipsy employee Basic Auth | Builder library, metadata |
| /api/webhook | Per-provider HMAC | External vendor callbacks |
| /api/callback | Provider-specific | External tool execution |
| /api/public | None | Health checks |
| /mcp | API key (provisional) | MCP streamable HTTP |
Running locally
Run Redis and the platform processes in separate terminals (the scheduler is optional):
# 1. Redis
redis-server
# 2. FastAPI server
uvicorn main:app --reload
# 3. TaskIQ worker (processes background jobs)
taskiq worker app.worker.broker:broker \
--ack-type when_executed \
--max-async-tasks 10 \
--log-level INFO
# 4. (Optional) TaskIQ scheduler (periodic tasks)
taskiq scheduler app.worker.broker:scheduler
Access points:
- API: http://localhost:8000
- Swagger docs: http://localhost:8000/docs
- Health check: http://localhost:8000/health
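A quick way to confirm the local server is answering is to probe the health-check URL listed above. This is a minimal sketch using only the standard library; the helper name `is_up` and the 2-second timeout are illustrative choices, and the response body format of the health endpoint is not assumed.

```python
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:8000/health"  # from the access points above


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status, False on any connection or HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and 4xx/5xx responses (HTTPError).
        return False


if __name__ == "__main__":
    print("API server:", "up" if is_up(HEALTH_URL) else "down")
```

A "down" result only says the API server is not reachable; it does not tell you whether Redis or the TaskIQ worker is running.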
Key files to know
| File | Why it matters |
|---|---|
| app/core/orchestration/agent_orchestrator.py | Core graph builder + executor: register_agent(), run_agent(), resume_job() |
| app/models/state.py | LangGraph state definition: every agent shares this State class |
| data/agent/agents.json | 12 pre-built agent templates |
| data/tool/tools.json | 18 registered tools |
| data/prompts/*.txt | System prompts for each agent persona |
| data/guardrail/guardrails.json | Safety guardrail definitions |
| app/mcp/server.py | MCP server integration |
| main.py | App startup, middleware registration, route mounting |
| docker-entrypoint.sh | How the 3 processes start in production |
| agents-helper/ | Internal dev docs: base standards, LangGraph patterns, communication, testing |
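The job statuses named in the execution model (QUEUED, RUNNING, SUCCESS, FAILED, INTERRUPTED), together with the resume_job() entry point in the orchestrator, suggest a small state machine. The sketch below is one plausible reading, not the platform's code: the transition table is an assumption, in particular that an INTERRUPTED job returns to RUNNING once a human-in-the-loop step resolves.

```python
from enum import Enum


class JobStatus(str, Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"
    INTERRUPTED = "INTERRUPTED"


# Assumed transitions; INTERRUPTED -> RUNNING models a resumed HITL job.
ALLOWED: dict[JobStatus, set[JobStatus]] = {
    JobStatus.QUEUED: {JobStatus.RUNNING},
    JobStatus.RUNNING: {JobStatus.SUCCESS, JobStatus.FAILED, JobStatus.INTERRUPTED},
    JobStatus.INTERRUPTED: {JobStatus.RUNNING},
    JobStatus.SUCCESS: set(),
    JobStatus.FAILED: set(),
}


def advance(current: JobStatus, new: JobStatus) -> JobStatus:
    """Return the new status if the transition is allowed, else raise ValueError."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new


# A job that pauses for human review and then completes.
status = JobStatus.QUEUED
for step in (JobStatus.RUNNING, JobStatus.INTERRUPTED, JobStatus.RUNNING, JobStatus.SUCCESS):
    status = advance(status, step)
```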
How to add an agent
- Create a workflow definition (via Dashboard or API) with nodes and edges.
- Each node references a node type and optionally an agent template from data/agent/agents.json.
- Assign tools from data/tool/tools.json to agent nodes.
- Write a system prompt (or use an existing one from data/prompts/).
- Configure policies (retry, HITL, follow-up) at the node or workflow level.
- Trigger via POST /api/trigger/workflow/{workflow_id}/run.
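The final step can be sketched as building the trigger request against a local server. Only the URL path comes from this document; the workflow id `wf_123`, the request body shape, and the absence of an auth header are assumptions for illustration, so check the Swagger docs for the real contract before sending anything.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local API address from "Running locally"


def build_trigger_request(workflow_id: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that triggers a workflow run."""
    url = f"{BASE_URL}/api/trigger/workflow/{workflow_id}/run"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical workflow id and input payload.
request = build_trigger_request("wf_123", {"input": {"awb": "TEST123"}})
```

Passing the request to `urllib.request.urlopen(request)` would send it; that call is left out here because the required authentication is not specified for this route.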
How to add an MCP tool
- Add the tool definition to data/tool/tools.json with external_safe: true.
- Implement the Python function in app/functions/.
- The MCP server auto-registers all external_safe tools on startup.
- Tool is now available over MCP at /mcp and via the HMAC external API.
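The auto-registration step amounts to filtering the tool catalog on the `external_safe` flag. The sketch below shows that filter on an inline sample; the tool names and the assumption that tools.json is a list of objects with a `name` field are hypothetical, and the real file may carry more fields or a different layout.

```python
import json

# Hypothetical excerpt in the spirit of data/tool/tools.json.
TOOLS_JSON = """
[
  {"name": "track_shipment", "external_safe": true},
  {"name": "update_internal_notes", "external_safe": false},
  {"name": "legacy_lookup"}
]
"""


def external_safe_tools(raw: str) -> list[str]:
    """Return the names of tools explicitly marked external_safe: true."""
    tools = json.loads(raw)
    # Tools without the flag are treated as not safe to expose.
    return [tool["name"] for tool in tools if tool.get("external_safe") is True]


exposed = external_safe_tools(TOOLS_JSON)
```

Treating a missing flag as "not exposed" is the conservative default chosen for this sketch; the document does not say how the platform handles tools that omit the field.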
Sources
- agent-platform repo
- agents-helper/ directory (5 internal dev docs)
- sdd/ directory (50+ Software Design Documents)
- See Architecture overview for how this fits the bigger picture
Changelog
- 26 May 2026: Full content from GitHub repo exploration. Directory structure, execution model, API routes, key files.