Memory layers
At a glance
- Short-term: LangGraph State object, which lives for the duration of one job execution
- Checkpointing: AsyncPostgresSaver, which persists state to PostgreSQL for resume, retry, and HITL (human-in-the-loop)
- Long-term: MemoryStore, a key-value store with a proposal/approval workflow
- Vector DB: Pinecone (cloud) or FAISS (on-prem) for semantic retrieval
Why this matters
Memory is what separates a one-shot LLM call from an agent that maintains context across a conversation, resumes after interruption, and remembers a customer from yesterday’s call. When scoping a deployment, the memory architecture determines what the agent can “know” and for how long.
The three layers
Layer 1: LangGraph State (short-term)
Every agent execution shares a common State class (app/models/state.py):
| Field | Type | Purpose |
|---|---|---|
| messages | List[BaseMessage] | Conversation history (inherited from MessagesState) |
| extras | Dict[str, Any] | Arbitrary key-value store for node-to-node data passing |
| responses | List[Dict] | Agent responses collected during execution |
| input_data | Dict[str, Any] | Original input parameters |
| data | Dict[str, Any] | Accumulated data from tool calls and LLM responses |
| execution_variables | Dict | Job context: org_id, workflow_id, ticket_id, customer info |
State uses OverwriteLastValue channels to prevent concurrent update errors when parallel nodes write to the same fields.
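The last-value-wins behaviour can be illustrated with a minimal, stdlib-only sketch. It does not import LangGraph. `overwrite_last_value` and `apply_updates` are hypothetical stand-ins for the platform's OverwriteLastValue channel, whose actual implementation is not shown here, and the reducer annotation on each field is an assumption. The `messages` field is left out because it depends on LangChain's BaseMessage type.

```python
from typing import Annotated, Any, Dict, List, TypedDict, get_type_hints


def overwrite_last_value(current: Any, update: Any) -> Any:
    """Hypothetical reducer: the most recent write replaces the old value."""
    return update


class State(TypedDict, total=False):
    # Field names follow the table above; the reducer annotation is an assumption.
    extras: Annotated[Dict[str, Any], overwrite_last_value]
    responses: Annotated[List[Dict], overwrite_last_value]
    input_data: Annotated[Dict[str, Any], overwrite_last_value]
    data: Annotated[Dict[str, Any], overwrite_last_value]
    execution_variables: Annotated[Dict, overwrite_last_value]


def apply_updates(state: State, updates: List[Dict[str, Any]]) -> State:
    """Merge updates from several nodes, applying each field's reducer in order."""
    hints = get_type_hints(State, include_extras=True)
    merged = dict(state)
    for update in updates:
        for key, value in update.items():
            reducer = hints[key].__metadata__[0]  # first Annotated argument
            merged[key] = reducer(merged.get(key), value)
    return merged


# Two parallel nodes both write to `data` in the same step.
start: State = {"data": {}, "extras": {}}
result = apply_updates(start, [
    {"data": {"source": "node_a"}},
    {"data": {"source": "node_b"}, "extras": {"flag": True}},
])
print(result["data"])  # {'source': 'node_b'}
```

Because the reducer simply keeps the latest write, two nodes updating the same field in one step no longer conflict. The trade-off is that the earlier write is discarded without warning.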
Layer 2: Checkpointing (persistence for resume)
The platform uses LangGraph’s AsyncPostgresSaver backed by psycopg_pool.AsyncConnectionPool:
- What it stores: Full state snapshot after each node execution
- Thread ID: Equals the job ID — each job gets its own checkpoint stream
- Singleton: One checkpointer instance per process
- Used for:
  - HITL resume: When a job is interrupted for human approval, the checkpoint allows resume_job() to pick up exactly where it left off
  - Retry: Failed jobs can restart from the last successful checkpoint
  - Follow-up context: resume_with_context() injects new information into a paused job
Layer 3: Long-term memory
MemoryStore (app/core/memory/MemoryStore.py):
- Simple key-value store with a proposal/approval workflow
- Functions:
propose_memory_update(),approve_memory_update(),reject_memory_update(),get_memory_state() - Agents can propose storing information (e.g., customer preferences); a human or policy can approve/reject
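The proposal/approval workflow can be sketched as follows. Only the four function names come from the platform. The class layout, the signatures, the proposal ID scheme, and the in-memory dictionaries are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict
from uuid import uuid4


@dataclass
class MemoryStore:
    """Hypothetical in-memory version of the proposal/approval workflow."""

    approved: Dict[str, Any] = field(default_factory=dict)
    pending: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def propose_memory_update(self, key: str, value: Any) -> str:
        """Queue a change; nothing is stored until it is approved."""
        proposal_id = uuid4().hex
        self.pending[proposal_id] = {"key": key, "value": value}
        return proposal_id

    def approve_memory_update(self, proposal_id: str) -> None:
        """Move a pending proposal into long-term memory."""
        proposal = self.pending.pop(proposal_id)
        self.approved[proposal["key"]] = proposal["value"]

    def reject_memory_update(self, proposal_id: str) -> None:
        """Discard a pending proposal without storing it."""
        self.pending.pop(proposal_id)

    def get_memory_state(self) -> Dict[str, Any]:
        """Return a copy of the approved memory only."""
        return dict(self.approved)


store = MemoryStore()
accepted = store.propose_memory_update("customer:42:language", "de")
declined = store.propose_memory_update("customer:42:vip", True)
store.approve_memory_update(accepted)
store.reject_memory_update(declined)
print(store.get_memory_state())  # {'customer:42:language': 'de'}
```

Keeping proposals separate from approved memory means an agent's guess never becomes something it "knows" until a human or policy signs off.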
Vector DB (for semantic retrieval):
- Cloud: Pinecone — managed vector database, no infrastructure to maintain
- On-prem: FAISS — Facebook’s local vector search library, runs entirely on customer hardware
- Used for: RAG (retrieval-augmented generation), knowledge base search, document similarity
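Both backends answer the same question: given a query embedding, which stored vectors are closest? The stdlib-only sketch below shows that idea with a brute-force cosine-similarity search. It uses neither Pinecone nor FAISS, and the three-dimensional vectors and document IDs are invented for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: List[float], index: Dict[str, List[float]], k: int = 2
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy index: document ID -> embedding (values invented for illustration).
index = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.0],
    "api-limits": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]
results = top_k(query, index, k=2)
print([doc_id for doc_id, _ in results])  # ['refund-policy', 'shipping-times']
```

A dedicated vector database replaces the full scan with an index structure, which is what keeps latency manageable as the number of stored vectors grows.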
Worked example: HITL resume flow
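The flow can be simulated without LangGraph or PostgreSQL. In the sketch below, an in-memory dictionary stands in for the checkpoint table, keyed by job ID as the thread ID. The three nodes, the `Interrupt` exception, `start_job`, and the `resume_job(job_id, approved)` signature are all assumptions for illustration, not the platform's actual code.

```python
import copy
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Node = Callable[[State], State]

# thread_id (= job ID) -> list of (index of next node to run, state snapshot)
CHECKPOINTS: Dict[str, List[Tuple[int, State]]] = {}


class Interrupt(Exception):
    """Raised by a node that needs human approval before continuing."""


def draft_reply(state: State) -> State:
    return {**state, "draft": "Refund of 40 EUR approved."}


def await_approval(state: State) -> State:
    if "approved" not in state:
        raise Interrupt("human approval required")
    return state


def send_reply(state: State) -> State:
    return {**state, "sent": state["approved"]}


NODES: List[Node] = [draft_reply, await_approval, send_reply]


def _run(job_id: str, start: int, state: State) -> State:
    for i in range(start, len(NODES)):
        try:
            state = NODES[i](state)
        except Interrupt:
            # No checkpoint is written for the interrupted node, so it reruns on resume.
            return {**state, "status": "paused"}
        CHECKPOINTS[job_id].append((i + 1, copy.deepcopy(state)))
    return {**state, "status": "done"}


def start_job(job_id: str, input_data: State) -> State:
    CHECKPOINTS[job_id] = [(0, copy.deepcopy(input_data))]
    return _run(job_id, 0, input_data)


def resume_job(job_id: str, approved: bool) -> State:
    """Reload the latest snapshot, record the human decision, and continue."""
    next_node, snapshot = CHECKPOINTS[job_id][-1]
    return _run(job_id, next_node, {**copy.deepcopy(snapshot), "approved": approved})


paused = start_job("job-123", {"ticket_id": "T-1"})
print(paused["status"])  # paused
done = resume_job("job-123", approved=True)
print(done["status"], done["sent"])  # done True
```

The draft produced before the pause is not recomputed on resume: it comes back from the snapshot, which is the property that makes long waits for human approval cheap.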
Pinecone vs FAISS
| Factor | Pinecone (cloud) | FAISS (on-prem) |
|---|---|---|
| Hosting | Managed SaaS | Runs on customer hardware |
| Scaling | Auto-scales | Manual — need to size GPU/CPU |
| Cost | Pay per vector stored + queries | No per-query cost, but hardware capex |
| Latency | ~50-100ms p95 | Depends on hardware — can be faster |
| Data residency | Data stored in Pinecone’s cloud | Data stays on-prem |
| Best for | Cloud deployments, fast time-to-value | Banking, government, strict data residency |
Sources
- agent-platform repo: app/models/state.py
- agent-platform repo: app/db/checkpointer.py
- agent-platform repo: app/core/memory/MemoryStore.py
- See Deployment modes for Pinecone vs FAISS in each deployment mode
- See The agent-platform repo for the full directory structure
Changelog
- 26 May 2026: Full content from GitHub repo exploration. State model, checkpointing, MemoryStore, vector DB comparison.