Introduction
"A vector database remembers what was said. OpenMemory remembers what it meant, when it happened, how it felt, and why it matters."
This is article #100 in the "One Open Source Project a Day" series. Today's project is OpenMemory, a self-hosted cognitive memory engine for LLM applications and AI agents.
LLMs are stateless by design. Most "memory" solutions are really RAG pipelines in disguise: chunk text, embed it in a vector store, retrieve by similarity. They don't understand the type of memory (is this a fact, an event, a skill, or a feeling?), don't track time (was this true last month?), don't model importance (is this more relevant than that?), and don't maintain associations (these two things are related).
OpenMemory's thesis: AI agents deserve an actual memory system, not a vector database with "memory" in the marketing copy.
What You'll Learn
- The five-sector memory model: episodic, semantic, procedural, emotional, reflective (what each means and how they decay at different rates)
- HMD v2 architecture: how Hierarchical Memory Decomposition works
- Waypoint association graph: single-strongest-path graph and the composite scoring formula
- Temporal knowledge graph: valid_from/valid_to and fact evolution
- The fundamental difference from RAG and vector databases
- Three operating modes: embedded SDK, standalone server, MCP interface
Prerequisites
- Basic understanding of LLM agents
- Familiarity with LangChain, CrewAI, or similar agent frameworks
- Basic understanding of vector embeddings and cosine similarity
Project Background
Overview
OpenMemory is an open-source cognitive memory engine built on the HMD (Hierarchical Memory Decomposition) v2 architecture. It provides persistent, structured memory for LLM applications and AI agents.
It's not a vector database wrapper. It's not a replacement for a cloud memory API. The design philosophy is that memory is not a database but a dynamic system with decay, reinforcement, association, and temporal dimensions.
The project is maintained by CaviraOSS and ships a Python SDK, Node.js SDK, REST API server, VS Code extension, and a native MCP server.
Author / Team
- Organization: CaviraOSS
- Primary language: TypeScript/Node.js (server), Python (SDK)
- License: Apache 2.0
- VS Code Extension: marketplace.visualstudio.com
Project Stats
- License: Apache 2.0
- PyPI: openmemory-py
- npm: openmemory-js
- Integrations: LangChain, CrewAI, AutoGen, Streamlit, MCP, VS Code
Features
What It Does
Traditional RAG memory (vector DB):
"User is allergic to peanuts, prefers coding at night, feels productive"
→ one embedding vector
→ retrieved by similarity
→ no structure, no time, no importance weighting
OpenMemory cognitive memory:
"User prefers coding at night, feels productive"
→ semantic sector: "coding preference" (slow decay)
→ emotional sector: "feels productive" (fastest decay)
→ episodic sector: "time: night" (fast decay)
→ Three sector vectors → mean vector → Waypoint link to related memories
→ Composite score = 0.6×similarity + 0.2×salience + 0.1×recency + 0.1×waypoint weight
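As a rough illustration of the composite score, the sketch below combines the four signals with the weights quoted above. The `Candidate` type, the `composite_score` name, and the assumption that every input is already normalized to the range 0 to 1 are illustrative choices, not OpenMemory's actual code.

```python
from dataclasses import dataclass

# Weights as quoted above; the rest of this sketch is illustrative.
W_SIMILARITY, W_SALIENCE, W_RECENCY, W_WAYPOINT = 0.6, 0.2, 0.1, 0.1


@dataclass
class Candidate:
    text: str
    similarity: float  # cosine similarity to the query, assumed in [0, 1]
    salience: float    # importance after decay, assumed in [0, 1]
    recency: float     # assumed: 1.0 = just seen, 0.0 = very old
    waypoint: float    # weight of the Waypoint link that surfaced it, 0 if none


def composite_score(c: Candidate) -> float:
    """Weighted sum of the four retrieval signals."""
    return (W_SIMILARITY * c.similarity + W_SALIENCE * c.salience
            + W_RECENCY * c.recency + W_WAYPOINT * c.waypoint)


candidates = [
    Candidate("user ate lunch at noon", 0.85, 0.2, 0.5, 0.0),
    Candidate("user is allergic to peanuts", 0.82, 0.9, 0.4, 0.0),
]
ranked = sorted(candidates, key=composite_score, reverse=True)
for c in ranked:
    print(f"{composite_score(c):.3f}  {c.text}")
```

With these made-up inputs, the allergy memory outranks the lunch memory even though its raw similarity is slightly lower, because its salience is much higher. That is the behavior a pure similarity search cannot express.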
Use Cases
- Long-term conversation assistants: Remember user preferences, habits, and history across sessions without repeating context
- Agent framework memory layer: Shared long-term memory store for CrewAI, AutoGen, LangGraph agents
- Knowledge worker tools: Ingest GitHub, Notion, Google Drive content; agents can ask "what was the design decision from last week?"
- Coding assistants: Persist code preferences, project context, tech stack choices across sessions
- Emotion-aware applications: Emotional sector stores sentiment separately, preventing it from polluting factual memory retrieval
Quick Start
Python SDK (local SQLite, zero config):
pip install openmemory-py
import asyncio
from openmemory.client import Memory
async def main():
    mem = Memory()
    # Add memories
    await mem.add("user is allergic to peanuts", user_id="user123")
    await mem.add("user prefers coding at night", user_id="user123")
    # Query memories (composite score ranking)
    results = await mem.search("what dietary restrictions does the user have?", user_id="user123")
    # Reinforce a memory (boost salience); "memory_id" is a placeholder for a real ID
    await mem.reinforce("memory_id")
    # Delete a memory
    await mem.delete("memory_id")
# The SDK calls are coroutines, so a script needs an event loop to run them
asyncio.run(main())
Node.js SDK:
npm install openmemory-js
import { Memory } from "openmemory-js"
const mem = new Memory()
await mem.add("user prefers TypeScript over Python", { user_id: "u1" })
const results = await mem.search("language preference", { user_id: "u1" })
LangChain integration:
from openmemory.integrations.langchain import OpenMemoryChatMessageHistory
history = OpenMemoryChatMessageHistory(memory=mem, user_id="u1")
# Drop-in replacement for LangChain's ConversationBufferMemory
OpenAI interceptor pattern:
from openai import OpenAI
from openmemory.client import Memory
mem = Memory()
client = mem.openai.register(OpenAI(), user_id="u1")
# All subsequent chat.completions.create calls automatically store/retrieve memory
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What language should I use?"}],
)
Ingesting external data sources:
# Ingest a GitHub repository
github = mem.source("github")
await github.connect(token="ghp_...")
await github.ingest_all(repo="owner/repo")
# Ingest Notion pages
notion = mem.source("notion")
await notion.connect(token="secret_...")
await notion.ingest_all(database_id="xxx")
Available connectors: github, notion, google_drive, google_sheets, google_slides, onedrive, web_crawler
MCP integration (Claude Code / Cursor):
# Claude Code
claude mcp add --transport http openmemory http://localhost:8080/mcp
// Cursor .mcp.json
{
  "mcpServers": {
    "openmemory": {
      "type": "http",
      "url": "http://localhost:8080/mcp"
    }
  }
}
Available MCP tools: openmemory_query, openmemory_store, openmemory_list, openmemory_get, openmemory_reinforce
The Five Memory Sectors
| Sector | Meaning | Decay Rate | Weight |
|---|---|---|---|
| episodic | Events and experiences (what happened, when) | 0.015 (fast) | 1.2 |
| semantic | Facts and knowledge (user preferences, domain knowledge) | 0.005 (slow) | 1.0 |
| procedural | Skills and workflows (how to do something) | 0.008 (medium) | 1.1 |
| emotional | Feelings and attitudes (how something felt) | 0.020 (fastest) | 1.3 |
| reflective | Meta-cognition and insights (what was realized) | 0.001 (near-permanent) | 0.8 |
Decay formula: salience × e^(-decay_lambda × days_since_last_seen)
Decay runs every 24 hours. Waypoint links with weight < 0.05 are pruned every 7 days.
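To make the decay rates concrete, here is a minimal sketch that applies the formula to each sector and derives a half-life (ln 2 divided by the decay rate). The rates come from the table above. The function names are hypothetical, and the sketch assumes the rates are expressed per day, as the `days_since_last_seen` term suggests.

```python
import math

# Per-sector decay rates (lambda) from the table; assumed to be per day.
DECAY_LAMBDA = {
    "episodic": 0.015,
    "semantic": 0.005,
    "procedural": 0.008,
    "emotional": 0.020,
    "reflective": 0.001,
}


def decayed_salience(salience: float, sector: str, days_since_last_seen: float) -> float:
    """salience * e^(-lambda * days), using the sector's decay rate."""
    return salience * math.exp(-DECAY_LAMBDA[sector] * days_since_last_seen)


def half_life_days(sector: str) -> float:
    """Days until an untouched memory drops to half its salience."""
    return math.log(2) / DECAY_LAMBDA[sector]


for sector in DECAY_LAMBDA:
    after_30 = decayed_salience(1.0, sector, 30)
    print(f"{sector:<11} after 30 days: {after_30:.3f}   half-life: {half_life_days(sector):.0f} days")
```

Under these rates an untouched emotional memory loses half its salience in roughly 35 days, while a reflective one takes roughly 693 days, which is why the table calls it near-permanent.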
Deep Dive
HMD v2 Architecture
Input content
    ↓
Sector Classifier
├── Identifies primary sector + additional sectors
└── Based on keyword patterns + context
    ↓
Multi-Sector Embedding
├── Independent embedding vector per relevant sector
├── Providers: OpenAI / Gemini / AWS / Ollama / local / synthetic
└── Compute mean vector across all sector vectors
    ↓
Storage (SQLite / Postgres)
├── memories table: content + metadata + salience + decay parameters
├── vectors table: one vector per memory × per sector
└── waypoints table: single strongest associative link per memory
    ↓
Query: Composite Scoring
score = 0.6×similarity + 0.2×salience + 0.1×recency + 0.1×waypoint weight
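The first two stages of the pipeline can be sketched in a few lines. This is a toy under stated assumptions: the keyword patterns, the fallback to the semantic sector, and the function names are all hypothetical, since the article only says classification is "based on keyword patterns + context". The mean vector is a plain element-wise average.

```python
import re
from statistics import fmean

# Hypothetical keyword patterns; the real classifier rules are not shown in the article.
SECTOR_PATTERNS = {
    "episodic": r"\b(?:yesterday|today|last week|at night|met|went)\b",
    "semantic": r"\b(?:is|are|prefers?|allergic|uses?)\b",
    "procedural": r"\b(?:how to|steps?|first|then|run|install)\b",
    "emotional": r"\b(?:feels?|felt|happy|frustrated|productive|love|hate)\b",
    "reflective": r"\b(?:realized|learned|insight|in hindsight)\b",
}


def classify(text: str) -> list[str]:
    """Return the matching sectors, most keyword hits first (primary sector first)."""
    lowered = text.lower()
    hits = {sector: len(re.findall(pattern, lowered))
            for sector, pattern in SECTOR_PATTERNS.items()}
    matched = [s for s in sorted(hits, key=hits.get, reverse=True) if hits[s] > 0]
    return matched or ["semantic"]  # assumed fallback when nothing matches


def mean_vector(vectors: list[list[float]]) -> list[float]:
    """Element-wise average of equally sized sector vectors."""
    return [fmean(dimension) for dimension in zip(*vectors)]


print(classify("User prefers coding at night, feels productive"))
print(mean_vector([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]))
```

The example sentence matches the emotional, episodic, and semantic patterns, so it would receive three sector vectors, and their average is what the storage stage keeps as the mean vector for Waypoint matching.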
The Waypoint Association Graph
This is one of the key architectural choices that distinguishes OpenMemory from vector databases:
Memory A ──0.85──▶ Memory B
(only the single strongest link is kept)
When a memory is added, the system finds the single most similar existing memory (cosine similarity ≥ 0.75) and creates a one-way Waypoint link. Cross-sector links are bidirectional.
At query time, after top-K vector retrieval, a 1-hop graph traversal expands results to include memories linked via Waypoints. Every time a Waypoint is traversed, its weight increases by 0.05 (capped at 1.0), so memories that are frequently recalled together develop stronger links over time.
The practical effect: a query that isn't semantically close to a memory can still surface it if a linked memory scores high. This creates emergent associative recall rather than pure similarity search.
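The linking and expansion rules can be sketched with an in-memory graph. This is a simplified model, not OpenMemory's implementation: the `WaypointGraph` class is hypothetical, the link is assumed to point from the new memory to the existing one, and the initial link weight is assumed to equal the cosine similarity. The 0.75 threshold, the 1-hop expansion, and the 0.05 reinforcement step come from the description above.

```python
import math

LINK_THRESHOLD = 0.75  # minimum cosine similarity to create a Waypoint
REINFORCE_STEP = 0.05  # weight gained each time a Waypoint is traversed


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class WaypointGraph:
    def __init__(self):
        self.vectors = {}    # memory id -> mean vector
        self.waypoints = {}  # src id -> (dst id, weight); one outgoing link per memory

    def add(self, mem_id, vector):
        """Store a memory and link it to its single most similar predecessor."""
        best_id, best_sim = None, LINK_THRESHOLD
        for other_id, other_vec in self.vectors.items():
            sim = cosine(vector, other_vec)
            if sim >= best_sim:
                best_id, best_sim = other_id, sim
        self.vectors[mem_id] = vector
        if best_id is not None:
            self.waypoints[mem_id] = (best_id, best_sim)  # assumed initial weight

    def query(self, vector, k=1):
        """Top-k by cosine similarity, then a 1-hop expansion along Waypoints."""
        ranked = sorted(self.vectors,
                        key=lambda m: cosine(vector, self.vectors[m]), reverse=True)
        results = ranked[:k]
        for mem_id in list(results):
            link = self.waypoints.get(mem_id)
            if link is None:
                continue
            dst, weight = link
            self.waypoints[mem_id] = (dst, min(1.0, weight + REINFORCE_STEP))
            if dst not in results:
                results.append(dst)
        return results


graph = WaypointGraph()
graph.add("A", [1.0, 0.0])
graph.add("B", [0.8, 0.6])  # similarity to A is 0.8, so B links to A
graph.add("C", [0.0, 1.0])  # below the threshold for both, so no link
hits = graph.query([0.6, 0.8], k=1)
print(hits, graph.waypoints)
```

In this toy run the query is closest to B, and A is pulled in through B's Waypoint even though C is more similar to the query than A is. That is the associative recall effect described above, and the traversal also strengthens the link from B to A.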
Temporal Knowledge Graph
OpenMemory treats time as a first-class dimension:
# Add a fact in 2021
POST /api/temporal/fact
{
  "subject": "CompanyX",
  "predicate": "has_CEO",
  "object": "Alice",
  "valid_from": "2021-01-01"
}
# Update in 2024
POST /api/temporal/fact
{
  "subject": "CompanyX",
  "predicate": "has_CEO",
  "object": "Bob",
  "valid_from": "2024-04-10"
}
# Alice's tenure is automatically closed (valid_to = 2024-04-09)
Supported operations:
- valid_from/valid_to truth windows
- Point-in-time queries ("who was CEO of CompanyX in late 2022?")
- Change detection (when did a fact flip?)
- Entity timeline reconstruction
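The truth-window behavior can be modeled in memory as follows. This is a sketch of the idea rather than the server's implementation: the `TemporalStore` class and its method names are hypothetical, and it assumes facts arrive in chronological order and that a new fact closes the previous one on the day before its own `valid_from`, as in the CEO example above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    object: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the fact is still true


class TemporalStore:
    def __init__(self):
        self.facts: list[Fact] = []

    def add_fact(self, subject, predicate, obj, valid_from):
        """Add a fact and close any open fact for the same subject and predicate."""
        for fact in self.facts:
            if (fact.subject, fact.predicate) == (subject, predicate) and fact.valid_to is None:
                fact.valid_to = valid_from - timedelta(days=1)
        self.facts.append(Fact(subject, predicate, obj, valid_from))

    def as_of(self, subject, predicate, when):
        """Point-in-time query: which object was valid on the given date?"""
        for fact in self.facts:
            if (fact.subject, fact.predicate) != (subject, predicate):
                continue
            if fact.valid_from <= when and (fact.valid_to is None or when <= fact.valid_to):
                return fact.object
        return None

    def timeline(self, subject, predicate):
        """Entity timeline: (valid_from, valid_to, object) tuples in date order."""
        rows = [(f.valid_from, f.valid_to, f.object) for f in self.facts
                if (f.subject, f.predicate) == (subject, predicate)]
        return sorted(rows, key=lambda row: row[0])


store = TemporalStore()
store.add_fact("CompanyX", "has_CEO", "Alice", date(2021, 1, 1))
store.add_fact("CompanyX", "has_CEO", "Bob", date(2024, 4, 10))
print(store.as_of("CompanyX", "has_CEO", date(2022, 12, 1)))
print(store.timeline("CompanyX", "has_CEO"))
```

Change detection falls out of the same data: each boundary between consecutive timeline rows marks the date a fact flipped.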
Performance Data
At 100k memories (SQLite with WAL mode):
| Operation | Latency |
|---|---|
| Add memory | 80-120 ms (depends on embedding provider) |
| Single-sector query | 110-130 ms |
| Multi-sector fusion (2-3 sectors) | 150-200 ms |
| Waypoint expansion (per hop) | +30-50 ms |
| Decay process (background) | ~10 sec (every 24 hours) |
Storage estimates:
- Single memory: ~4-6 KB (including all sector vectors)
- 100k memories: ~500 MB
- 1M memories: ~5 GB
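The totals follow directly from the per-memory figure, as this small arithmetic check shows. The helper names are hypothetical, and the 256-dimension, three-sector example is purely an assumption used to show how float32 vectors could account for a few kilobytes per memory; the article does not state the embedding size.

```python
BYTES_PER_FLOAT32 = 4


def vector_bytes(dimensions: int, sectors: int) -> int:
    """Raw size of one float32 vector per sector."""
    return dimensions * BYTES_PER_FLOAT32 * sectors


def total_storage_mb(memories: int, kb_per_memory: float) -> float:
    """Total size in decimal megabytes (1 MB = 1000 KB)."""
    return memories * kb_per_memory / 1000


# Hypothetical embedding size: 256 dimensions across 3 sectors.
print(vector_bytes(256, 3), "bytes of raw vectors per memory")

for count in (100_000, 1_000_000):
    low, high = total_storage_mb(count, 4), total_storage_mb(count, 6)
    print(f"{count:>9,} memories: {low:,.0f} to {high:,.0f} MB")
```

At the 5 KB midpoint, 100k memories come to 500 MB and 1M memories to 5,000 MB, matching the estimates above.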
vs. SaaS Alternatives
| Dimension | OpenMemory | Supermemory | OpenAI Memory |
|---|---|---|---|
| Hosting | Self-hosted | Cloud only | Cloud only |
| Query latency | 110-130 ms | 350-400 ms | ~300 ms |
| Cost per 1M tokens | ~$0.30-0.40 | ~$2.50+ | ~$3.00+ |
| Explainability | Fully transparent (Waypoint trace) | Black-box | Black-box |
| Local embeddings | Yes (Ollama, local models) | No | No |
| Data ownership | 100% yours | Vendor-held | OpenAI-held |
The cost difference comes from running local embeddings (Ollama, BGE, E5) instead of API-billed embedding calls, and from no cloud infrastructure markup.
Migrating from Other Systems
OpenMemory ships a migration tool to import existing memory data:
cd migrate
# Migrate from Mem0
python -m migrate --from mem0 --api-key MEM0_KEY --verify
# Migrate from Zep
python -m migrate --from zep --api-key ZEP_KEY --verify
# Migrate from Supermemory
python -m migrate --from supermemory --api-key SM_KEY --verify
Database Schema
The core SQLite schema makes the architecture transparent:
-- A memory record with salience and decay parameters
CREATE TABLE memories (
    id TEXT PRIMARY KEY,
    content TEXT NOT NULL,
    primary_sector TEXT NOT NULL,
    salience REAL,          -- 0-1 importance score
    decay_lambda REAL,      -- sector-specific decay rate
    last_seen_at INTEGER,   -- for decay calculation
    mean_vec BLOB           -- mean vector for waypoint matching
);
-- One vector per memory per sector
CREATE TABLE vectors (
    id TEXT NOT NULL,
    sector TEXT NOT NULL,
    v BLOB NOT NULL,        -- float32 vector
    PRIMARY KEY (id, sector)
);
-- Single strongest associative link per memory
CREATE TABLE waypoints (
    src_id TEXT PRIMARY KEY,
    dst_id TEXT NOT NULL,
    weight REAL NOT NULL    -- 0-1 link strength
);
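To see how the schema enforces the "single strongest link" rule, the sketch below loads it into an in-memory SQLite database using Python's standard `sqlite3` module. The helper functions are hypothetical, the little-endian float32 encoding is an assumption (the schema only says float32), and the upsert that keeps the stronger link is one plausible way to use the `src_id` primary key, not necessarily how OpenMemory does it.

```python
import sqlite3
import struct

SCHEMA = """
CREATE TABLE memories (
    id TEXT PRIMARY KEY, content TEXT NOT NULL, primary_sector TEXT NOT NULL,
    salience REAL, decay_lambda REAL, last_seen_at INTEGER, mean_vec BLOB
);
CREATE TABLE vectors (
    id TEXT NOT NULL, sector TEXT NOT NULL, v BLOB NOT NULL,
    PRIMARY KEY (id, sector)
);
CREATE TABLE waypoints (
    src_id TEXT PRIMARY KEY, dst_id TEXT NOT NULL, weight REAL NOT NULL
);
"""


def pack_vector(values):
    """Serialize floats as little-endian float32 (byte order is an assumption)."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_vector(blob):
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def upsert_waypoint(conn, src_id, dst_id, weight):
    """Keep one link per source; replace it only if the new link is stronger."""
    conn.execute(
        """INSERT INTO waypoints (src_id, dst_id, weight) VALUES (?, ?, ?)
           ON CONFLICT(src_id) DO UPDATE
           SET dst_id = excluded.dst_id, weight = excluded.weight
           WHERE excluded.weight > waypoints.weight""",
        (src_id, dst_id, weight),
    )


def prune_waypoints(conn, min_weight=0.05):
    """Drop links weaker than the pruning threshold mentioned in the article."""
    conn.execute("DELETE FROM waypoints WHERE weight < ?", (min_weight,))


conn = open_db()
conn.execute("INSERT INTO vectors (id, sector, v) VALUES (?, ?, ?)",
             ("m1", "semantic", pack_vector([0.5, 0.25, -1.0])))
upsert_waypoint(conn, "m1", "m2", 0.80)
upsert_waypoint(conn, "m1", "m3", 0.60)  # weaker than the existing link, so ignored
upsert_waypoint(conn, "m4", "m1", 0.03)  # will be removed by pruning
prune_waypoints(conn)
print(conn.execute("SELECT src_id, dst_id, weight FROM waypoints").fetchall())
```

Because `src_id` is the primary key, a memory can never hold more than one outgoing Waypoint, which keeps 1-hop expansion to a single lookup per result. The `ON CONFLICT` clause needs SQLite 3.24 or newer.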
Resources
Official Links
- GitHub: CaviraOSS/OpenMemory
- PyPI: openmemory-py
- npm: openmemory-js
- VS Code Extension: openmemory-vscode
- Architecture docs: ARCHITECTURE.md (in repo)
Summary
OpenMemory's contribution is treating "memory" as a serious engineering problem rather than a marketing term.
When most developers talk about "AI memory," they mean retrieval-augmented generation β store vectors, retrieve by similarity. OpenMemory's position is that this is search, not memory. A real memory system needs to know whether something is a fact or an emotion, whether it's recent or historical, whether it's still true today, and what else it relates to.
The five-sector model, Waypoint graph, temporal knowledge graph, and composite scoring aren't complexity for its own sake; they map to distinct dimensions of how human memory actually works.
For developers building agent applications that need cross-session memory, OpenMemory is one of the most architecturally coherent open-source options available today: self-hosted, local-first, framework-agnostic, explainable, and performance-predictable. Integration takes about three lines of code, and a full server mode is available when you need it.













