Skip to content

Architecture

Jarvis is designed as a layered personal AI infrastructure, where each layer has well-defined responsibilities and can be replaced or extended independently.

Big picture

flowchart TB
    subgraph IL[Identity Layer]
        AUTH[OAuth · Pairing · Token]
    end

    subgraph OL[Orchestration & Routing]
        LG[LangGraph]
        MCP[MCP / A2A protocols]
        ROUTE[Capability routing]
    end

    subgraph ML[Memory Layer]
        STM[Short-term]
        LTM[Long-term]
        SEM[Semantic / vector]
        FHIR[Health · FHIR]
    end

    subgraph DL[Device Mesh]
        DESK[Desktop]
        MOB[Mobile]
        WATCH[Watch]
        GLASS[Glasses]
        VR[VR]
        HOLO[Holographic]
        MED[Medical]
    end

    subgraph PL[Plugins]
        PROD[Productivity]
        HOME[Smart Home]
        DEV[Dev tools]
        FIT[Fitness]
    end

    IL --> OL
    OL --> ML
    OL --> DL
    OL --> PL

The five layers

1. Identity Layer

Responsibilities:

  • user authentication (OAuth 2.0, passkeys)
  • device registration and pairing
  • token management and refresh
  • per-device certificates and access scopes

Each device receives a registration like:

{
  "device_id": "uuid",
  "owner_id": "user_id",
  "device_type": "watch",
  "capabilities": ["notifications", "voice", "heartrate"],
  "trust_level": "primary"
}

2. Orchestration & Routing Layer

The brain of the system. Built on LangGraph for:

  • graph-based workflows with persistent state
  • checkpointing and time travel for debugging
  • specialised agents orchestrated as nodes
  • cross-agent communication via MCP (Anthropic) and A2A (Google)

Routing decides which device executes a task. Example:

Input Context Decision
"Remind me in 20 minutes" User is running Smartwatch (haptic + voice)
"Open PR #42" User at desk Desktop agent (IDE)
"Show me the way" User driving Mobile (TTS) + glasses (overlay)

3. Memory Layer

Three memory tiers working together:

  • Short-term: active session, immediate context (Redis)
  • Long-term: history, preferences, profile (PostgreSQL + mem0)
  • Semantic: search over personal docs and knowledge (Qdrant)
  • Health: medical and biometric data (HAPI FHIR R4/R5)

Default backend is mem0 + Qdrant. Alternatives: Zep (temporal knowledge graph) or Letta (agent-managed paging).

4. Device Mesh

Every device runs a local agent that talks to the central server. See the Devices section.

5. Plugin & Integrations

Modular system to extend capabilities: productivity, smart home, dev tools, fitness, finance, web scraping, external APIs.

LLM model strategy

Not a single "monolithic" model, but a distributed hierarchy:

Tier Example Use case
Small Llama 3.2 1B (Ollama), Phi-3 Wake word, intent recognition, smartwatch
Medium Llama 3.1 8B, Gemma 2 9B Quick chats, mobile
Large Claude Sonnet 4.6, GPT-4 Complex reasoning, coding, orchestration

Routing happens by task complexity + device capability + privacy policy.

Privacy & Security

  • 🔐 All data is on-premise by default
  • 🔑 End-to-end TLS encryption between device and server
  • 🪪 Short-lived JWT tokens + refresh
  • 🛡️ Granular access policies per scope
  • 📜 Full audit logging, retained per user policy

Going deeper