Architecture
Jarvis is designed as a layered personal AI infrastructure, where each layer has well-defined responsibilities and can be replaced or extended independently.
Big picture
flowchart TB
subgraph IL[Identity Layer]
AUTH[OAuth · Pairing · Token]
end
subgraph OL[Orchestration & Routing]
LG[LangGraph]
MCP[MCP / A2A protocols]
ROUTE[Capability routing]
end
subgraph ML[Memory Layer]
STM[Short-term]
LTM[Long-term]
SEM[Semantic / vector]
FHIR[Health · FHIR]
end
subgraph DL[Device Mesh]
DESK[Desktop]
MOB[Mobile]
WATCH[Watch]
GLASS[Glasses]
VR[VR]
HOLO[Holographic]
MED[Medical]
end
subgraph PL[Plugins]
PROD[Productivity]
HOME[Smart Home]
DEV[Dev tools]
FIT[Fitness]
end
IL --> OL
OL --> ML
OL --> DL
OL --> PL

The five layers
1. Identity Layer
Responsibilities:
- user authentication (OAuth 2.0, passkeys)
- device registration and pairing
- token management and refresh
- per-device certificates and access scopes
Each device receives a registration like:
{
  "device_id": "uuid",
  "owner_id": "user_id",
  "device_type": "watch",
  "capabilities": ["notifications", "voice", "heartrate"],
  "trust_level": "primary"
}
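As an illustration, the record above could be modelled and validated on the server roughly as follows. This is a minimal sketch, not the project's actual schema: the `DeviceRegistration` class, the `TRUST_LEVELS` set (the source only shows `"primary"`), and the rule that a device must declare at least one capability are all assumptions.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from typing import Tuple

# Assumption: the source only shows "primary"; the other levels are hypothetical.
TRUST_LEVELS = {"primary", "secondary", "guest"}


@dataclass(frozen=True)
class DeviceRegistration:
    owner_id: str
    device_type: str
    capabilities: Tuple[str, ...]
    trust_level: str = "primary"
    # A fresh random UUID is generated when no device_id is supplied.
    device_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def __post_init__(self) -> None:
        if self.trust_level not in TRUST_LEVELS:
            raise ValueError(f"unknown trust level: {self.trust_level!r}")
        if not self.capabilities:
            raise ValueError("a device must declare at least one capability")

    def to_json(self) -> str:
        # Serialise to the same field names as the registration record.
        payload = asdict(self)
        payload["capabilities"] = list(self.capabilities)
        return json.dumps(payload)


watch = DeviceRegistration(
    owner_id="user_id",
    device_type="watch",
    capabilities=("notifications", "voice", "heartrate"),
)
print(watch.to_json())
```

Making the record immutable (`frozen=True`) means a change of capabilities or trust level has to go through a new registration rather than a silent in-place update.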
2. Orchestration & Routing Layer
The brain of the system, built on LangGraph for:
- graph-based workflows with persistent state
- checkpointing and time travel for debugging
- specialised agents orchestrated as nodes
- cross-agent communication via MCP (Anthropic) and A2A (Google)
Routing decides which device executes a task. Example:
| Input | Context | Decision |
|---|---|---|
| "Remind me in 20 minutes" | User is running | Smartwatch (haptic + voice) |
| "Open PR #42" | User at desk | Desktop agent (IDE) |
| "Show me the way" | User driving | Mobile (TTS) + glasses (overlay) |
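The decisions in the table can be approximated by matching the capabilities a task needs against what each device offers in the user's current context. The sketch below is one possible shape for that logic, not the router Jarvis ships: the `Device` class, the `route` function, the context labels, and the capability names are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Device:
    name: str
    capabilities: FrozenSet[str]
    # Hypothetical: user contexts in which this device is usable.
    contexts: FrozenSet[str]


def route(required: FrozenSet[str], context: str, devices: List[Device]) -> Optional[Device]:
    """Return the first device usable in `context` that offers every required capability."""
    for device in devices:
        if context in device.contexts and required <= device.capabilities:
            return device
    return None


# Illustrative mesh; list order acts as a simple priority.
MESH = [
    Device("desktop", frozenset({"ide", "screen", "voice"}), frozenset({"at_desk"})),
    Device("watch", frozenset({"haptic", "voice", "notifications"}),
           frozenset({"running", "at_desk", "driving"})),
    Device("mobile", frozenset({"tts", "screen", "notifications"}),
           frozenset({"driving", "at_desk"})),
]

reminder = route(frozenset({"haptic", "voice"}), "running", MESH)
open_pr = route(frozenset({"ide"}), "at_desk", MESH)
print(reminder.name, open_pr.name)  # watch desktop
```

A first-match scan keeps the example short; a fuller router would likely score candidates and could return several devices, as in the "Mobile + glasses" row.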
3. Memory Layer
Four memory tiers work together:
- Short-term: active session, immediate context (Redis)
- Long-term: history, preferences, profile (PostgreSQL + mem0)
- Semantic: search over personal docs and knowledge (Qdrant)
- Health: medical and biometric data (HAPI FHIR R4/R5)
Default backend is mem0 + Qdrant. Alternatives: Zep (temporal knowledge graph) or Letta (agent-managed paging).
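To make the split between tiers concrete, here is a small in-memory sketch of a tier-aware store. Plain dictionaries stand in for Redis, PostgreSQL + mem0, Qdrant, and HAPI FHIR, so none of those systems' APIs appear here. The `MemoryRouter` class, the `Tier` enum, and the session TTL are hypothetical, and the only behaviour modelled is that short-term entries expire while the other tiers persist.

```python
import time
from enum import Enum
from typing import Any, Dict, Optional, Tuple


class Tier(Enum):
    SHORT_TERM = "short_term"  # Redis in the source
    LONG_TERM = "long_term"    # PostgreSQL + mem0 in the source
    SEMANTIC = "semantic"      # Qdrant in the source
    HEALTH = "health"          # HAPI FHIR in the source


class MemoryRouter:
    """In-memory stand-in: one dict per tier, with expiry only on the short-term tier."""

    def __init__(self, session_ttl_seconds: float = 1800.0) -> None:
        self._ttl = session_ttl_seconds
        self._stores: Dict[Tier, Dict[str, Tuple[Any, Optional[float]]]] = {
            tier: {} for tier in Tier
        }

    def put(self, tier: Tier, key: str, value: Any, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        expires_at = now + self._ttl if tier is Tier.SHORT_TERM else None
        self._stores[tier][key] = (value, expires_at)

    def get(self, tier: Tier, key: str, now: Optional[float] = None) -> Any:
        now = time.time() if now is None else now
        entry = self._stores[tier].get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and now >= expires_at:
            # Expired session data is dropped on read.
            del self._stores[tier][key]
            return None
        return value


memory = MemoryRouter(session_ttl_seconds=60)
memory.put(Tier.SHORT_TERM, "last_intent", "set_reminder", now=0)
memory.put(Tier.LONG_TERM, "preferred_language", "en", now=0)
print(memory.get(Tier.SHORT_TERM, "last_intent", now=30))        # set_reminder
print(memory.get(Tier.SHORT_TERM, "last_intent", now=61))        # None
print(memory.get(Tier.LONG_TERM, "preferred_language", now=61))  # en
```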
4. Device Mesh
Every device runs a local agent that talks to the central server. See the Devices section.
5. Plugins & Integrations
Modular system to extend capabilities: productivity, smart home, dev tools, fitness, finance, web scraping, external APIs.
LLM model strategy
Jarvis does not rely on a single monolithic model but on a tiered hierarchy of models:
| Tier | Example | Use case |
|---|---|---|
| Small | Llama 3.2 1B (Ollama), Phi-3 | Wake word, intent recognition, smartwatch |
| Medium | Llama 3.1 8B, Gemma 2 9B | Quick chats, mobile |
| Large | Claude Sonnet 4.6, GPT-4 | Complex reasoning, coding, orchestration |
Routing is decided by task complexity, device capability, and privacy policy.
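One way to combine the three routing inputs is to pick a tier from task complexity and then cap it by what the device and the privacy policy allow. The sketch below works under stated assumptions: the 0 to 10 complexity scale, its thresholds, the `pick_tier` function, and the choice of "medium" as the largest locally hosted tier are all hypothetical.

```python
TIERS = ["small", "medium", "large"]  # ordered from least to most capable

# Assumption: the large tier is served by a hosted API, so a local-only
# privacy policy caps requests at the medium tier.
LARGEST_LOCAL_TIER = "medium"


def pick_tier(complexity: int, device_max_tier: str, local_only: bool) -> str:
    """Choose a model tier.

    complexity      -- hypothetical score from 0 (trivial) to 10 (hard)
    device_max_tier -- largest tier the requesting device can make use of
    local_only      -- True when privacy policy forbids leaving the premises
    """
    if complexity <= 3:
        wanted = 0
    elif complexity <= 6:
        wanted = 1
    else:
        wanted = 2

    ceiling = TIERS.index(device_max_tier)
    if local_only:
        ceiling = min(ceiling, TIERS.index(LARGEST_LOCAL_TIER))
    return TIERS[min(wanted, ceiling)]


print(pick_tier(2, "large", local_only=False))  # small
print(pick_tier(9, "large", local_only=False))  # large
print(pick_tier(9, "large", local_only=True))   # medium
print(pick_tier(9, "small", local_only=False))  # small
```

Treating device capability and privacy as ceilings rather than as scores means a constraint can only lower the chosen tier, never raise it.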
Privacy & Security
- 🔐 All data stays on-premises by default
- 🔑 End-to-end TLS encryption between device and server
- 🪪 Short-lived JWTs with refresh tokens
- 🛡️ Granular access policies per scope
- 📜 Full audit logging, retained per user policy
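The short-lived token and per-scope access points above can be illustrated with a hand-rolled HS256 JWT built only from the standard library. This is a teaching sketch, not the project's token code: the claim layout (`sub` as the device ID, a space-separated `scope` claim), the scope name `notifications:write`, the 5-minute lifetime, and the helper names are assumptions. A real deployment would normally use a maintained JWT library and add refresh-token handling.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Iterable, Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that the JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret: bytes, device_id: str, scopes: Iterable[str],
                ttl_seconds: int = 300, now: Optional[int] = None) -> str:
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": device_id, "scope": " ".join(scopes),
              "iat": now, "exp": now + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(secret: bytes, token: str, required_scope: str,
                 now: Optional[int] = None) -> Optional[dict]:
    """Return the claims if the signature, expiry, and scope all check out, else None."""
    now = int(time.time()) if now is None else now
    try:
        header_b64, claims_b64, sig_b64 = token.split(".")
        signature = _b64url_decode(sig_b64)
        signing_input = f"{header_b64}.{claims_b64}".encode("ascii")
    except ValueError:
        return None
    # The algorithm is fixed to HMAC-SHA256 here; the header is not trusted to choose it.
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        return None
    claims = json.loads(_b64url_decode(claims_b64))
    if now >= claims["exp"]:
        return None
    if required_scope not in claims["scope"].split():
        return None
    return claims


secret = b"demo-secret-change-me"
token = issue_token(secret, "watch-01", ["notifications:write"], ttl_seconds=300, now=1000)
print(verify_token(secret, token, "notifications:write", now=1100) is not None)  # True
print(verify_token(secret, token, "notifications:write", now=1300))              # None
print(verify_token(secret, token, "health:read", now=1100))                      # None
```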