Tencent Open-Sources TencentDB Agent Memory: A 4-Tier Local Memory Pipeline for AI Agents

Tencent has released TencentDB Agent Memory, an open-source memory system for AI agents. The project ships under the MIT license. It targets a problem familiar to anyone shipping long-horizon agents: context bloat and recall failure.

It combines symbolic short-term memory with layered long-term memory. It integrates with OpenClaw as a plugin and with the Hermes Agent through a Gateway adapter. The default backend is local SQLite with the sqlite-vec extension, so no external API is required.

Why agent memory is hard

Most current memory stacks shred data into fragments and dump them into a flat vector store. Recall then becomes a blind similarity search across disconnected fragments, with no macro-level guidance. TencentDB Agent Memory's architecture instead rests on two pillars: memory layering and symbolic memory.

A 4-tier semantic pyramid

For long-term personalization, TencentDB Agent Memory builds a four-level pyramid instead of a flat log. The layers are L0 Conversation, L1 Atom, L2 Scenario, and L3 Persona. These correspond to raw dialogue, atomic facts, scene blocks, and a user profile.

The Persona layer carries day-to-day user preferences and is queried first. The system drills down to Atoms or raw Conversations only when finer detail is needed. Lower layers preserve evidence; upper layers preserve structure.
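The top-down lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `MemoryNode` class, the `drill_down` function, and the substring match are hypothetical stand-ins and not the project's actual schema or retrieval logic.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryNode:
    """One entry in a layer; children sit one layer down (hypothetical schema)."""
    node_id: str
    layer: str  # "L3", "L2", "L1", or "L0"
    text: str
    children: list["MemoryNode"] = field(default_factory=list)


def drill_down(node: MemoryNode, keyword: str, max_depth: int) -> list[MemoryNode]:
    """Return matches, descending a layer only when the current node has none."""
    if keyword.lower() in node.text.lower():
        return [node]  # the upper layer already answers the query
    if max_depth == 0:
        return []
    hits: list[MemoryNode] = []
    for child in node.children:
        hits.extend(drill_down(child, keyword, max_depth - 1))
    return hits


if __name__ == "__main__":
    conv = MemoryNode("c1", "L0", "user: I pinned the toolchain to 1.75 last week")
    atom = MemoryNode("a1", "L1", "User pinned the toolchain to 1.75", [conv])
    scene = MemoryNode("s1", "L2", "Debugging a CI build failure", [atom])
    persona = MemoryNode("p1", "L3", "Prefers concise answers; mostly works in Rust", [scene])

    print([n.node_id for n in drill_down(persona, "rust", 3)])       # ['p1']
    print([n.node_id for n in drill_down(persona, "toolchain", 3)])  # ['a1']
```

The first query stops at the Persona layer, while the second falls through to the Atom that holds the specific fact.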

Storage is heterogeneous. Facts, logs, and traces are persisted in databases for full-text retrieval. Personas, scenes, and canvases are stored as human-readable Markdown files. Layered memory artifacts live under ~/.openclaw/memory-tdai/.

Symbolic short-term memory via Mermaid

Long-running agent tasks consume tokens through verbose tool logs, search results, code, and error traces. TencentDB Agent Memory addresses this through context offloading combined with symbolic memory.

Full tool logs are offloaded to external files under refs/*.md. State transitions are encoded in Mermaid syntax inside a lightweight task canvas. The agent reasons over the symbol graph in its context window.

When it needs the raw text, it greps for a node_id and retrieves the corresponding file. The Tencent dev team describes this as a deterministic drill-down from top-layer symbol to mid-layer index to bottom-layer raw text.
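A minimal Python sketch of this offload-and-drill-down loop follows. The function names, the one-file-per-node layout, and the HTML comment used to tag each file are assumptions made for illustration; the article only states that logs go to refs/*.md and that the agent greps for a node_id.

```python
import tempfile
from pathlib import Path


def offload(refs_dir: Path, node_id: str, raw_log: str) -> Path:
    """Write the full tool log to refs/<node_id>.md and return the path."""
    refs_dir.mkdir(parents=True, exist_ok=True)
    path = refs_dir / f"{node_id}.md"
    path.write_text(f"<!-- node_id: {node_id} -->\n{raw_log}\n", encoding="utf-8")
    return path


def render_canvas(steps: list[tuple[str, str]]) -> str:
    """Encode (node_id, short label) steps as a Mermaid flowchart."""
    lines = ["flowchart TD"]
    for node_id, label in steps:
        lines.append(f'    {node_id}["{label}"]')
    for (src, _), (dst, _) in zip(steps, steps[1:]):
        lines.append(f"    {src} --> {dst}")
    return "\n".join(lines)


def fetch_raw(refs_dir: Path, node_id: str) -> str:
    """Grep-style lookup: return the offloaded file tagged with node_id."""
    marker = f"<!-- node_id: {node_id} -->"
    for path in sorted(refs_dir.glob("*.md")):
        text = path.read_text(encoding="utf-8")
        if marker in text:
            return text
    raise KeyError(node_id)


if __name__ == "__main__":
    refs = Path(tempfile.mkdtemp()) / "refs"
    offload(refs, "n1", "npm test: 3 failed, 41 passed ...")
    offload(refs, "n2", "patched parser.ts, reran tests: 44 passed")
    # Only this compact graph stays in the context window.
    print(render_canvas([("n1", "run tests: 3 failed"), ("n2", "fix parser: all pass")]))
    # The raw log is pulled back only when the agent asks for it.
    print(fetch_raw(refs, "n1"))
```

The canvas costs a few dozen tokens regardless of how long the underlying logs are, which is the point of the design.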

Benchmark numbers

Results are measured over continuous long-horizon sessions, not isolated turns. SWE-bench, for example, runs 50 consecutive tasks per session to simulate context-accumulation pressure.

On WideSearch, integrating the plugin with OpenClaw raises pass rate from 33% to 50%, a 51.52% relative improvement. Token usage drops from 221.31M to 85.64M, a 61.38% reduction.
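The relative pass-rate improvement is the change in pass rate divided by the baseline. A short Python check, with `relative_change` as an illustrative helper name rather than anything from the project:

```python
def relative_change(before: float, after: float) -> float:
    """Percent change relative to the baseline, rounded to two decimals."""
    return round((after - before) / before * 100, 2)


# WideSearch pass rate: 33% baseline, 50% with the plugin.
print(relative_change(33, 50))  # 51.52
```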

On SWE-bench, success climbs from 58.4% to 64.2% while tokens fall from 3474.1M to 2375.4M, a 33.09% reduction. On AA-LCR, the success rate moves from 44.0% to 47.5%. Tokens drop from 112.0M to 77.3M, a 30.98% reduction.

For long-term memory, PersonaMem accuracy rises from 48% to 76%. Note: these numbers come from Tencent’s own evaluations.

Recall and retrieval

Retrieval defaults to a hybrid strategy. The system combines BM25 keyword search with vector embeddings, fused using Reciprocal Rank Fusion (RRF). Developers can switch to pure keyword or embedding mode through a config field. The BM25 tokenizer supports both Chinese (jieba) and English.
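Reciprocal Rank Fusion itself is simple enough to sketch. The version below is a generic implementation, not the plugin's code: the function name is illustrative, and the constant k = 60 is the value commonly used in the RRF literature, since the article does not say what the plugin uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["a", "b", "c"]    # keyword ranking (made-up IDs)
vector_hits = ["c", "a", "d"]  # embedding ranking (made-up IDs)
print(reciprocal_rank_fusion([bm25_hits, vector_hits]))  # ['a', 'c', 'b', 'd']
```

Because RRF works on ranks rather than raw scores, BM25 scores and embedding similarities never need to be normalized against each other.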

Default settings trigger an L1 memory extraction every five turns. A user persona is generated every 50 new memories. Recall returns five items by default with a 5-second timeout. On timeout, the system skips injection rather than blocking the conversation.
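The skip-on-timeout behavior can be sketched with Python's asyncio. The `recall_or_skip` wrapper and the `search` callable are hypothetical; only the defaults (five results, five seconds) come from the article.

```python
import asyncio


async def recall_or_skip(search, query: str, timeout_s: float = 5.0, max_results: int = 5):
    """Return up to max_results memories, or [] if the search times out."""
    try:
        results = await asyncio.wait_for(search(query), timeout=timeout_s)
    except asyncio.TimeoutError:
        return []  # skip injection instead of blocking the conversation
    return results[:max_results]


if __name__ == "__main__":
    async def slow_search(query: str) -> list[str]:
        await asyncio.sleep(0.2)  # stand-in for a slow backend
        return ["late result"]

    # With a short timeout the caller gets an empty list and carries on.
    print(asyncio.run(recall_or_skip(slow_search, "editor preference", timeout_s=0.01)))  # []
```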

Installation and developer surface

The OpenClaw integration ships as a single npm package: @tencentdb-agent-memory/memory-tencentdb. The project requires Node.js 22.16 or higher. Enabling it takes one config flag. The plugin then handles conversation capture, memory extraction, scene aggregation, persona generation, and recall.

For Hermes, a Docker image bundles the agent, the plugin, and the TDAI Memory Gateway. The default model is Tencent Cloud’s DeepSeek-V3.2. Any OpenAI-compatible endpoint works by setting the MODEL_PROVIDER=custom environment variable.

Two tools are exposed to agents during a session: tdai_memory_search and tdai_conversation_search. Both return references with node_id and result_ref fields for traceback. A Tencent Cloud Vector Database (TCVDB) backend is also available as an alternative to local SQLite.

Marktechpost’s Visual Explainer

TencentDB Agent Memory — Preview

01 / OVERVIEW

What is TencentDB Agent Memory?

An MIT-licensed memory system for AI agents that combines symbolic short-term memory with a 4-tier long-term memory pipeline. Runs fully local with zero external API dependencies.

Short-term memory

Offloads verbose tool logs to files and keeps a compact Mermaid task canvas in context.

Long-term memory

Distills conversations into a 4-tier semantic pyramid: L0 → L1 → L2 → L3.

Local backend

Defaults to SQLite + sqlite-vec. Tencent Cloud Vector Database (TCVDB) is optional.

Integrations

Ships as an OpenClaw plugin and a Hermes Agent Docker image.

02 / ARCHITECTURE

The 4-Tier Semantic Pyramid

Long-term memory is layered, not flat. Upper layers carry structure; lower layers preserve evidence.

L3 · Persona: User profile (persona.md)

L2 · Scenario: Scene blocks (Markdown)

L1 · Atom: Atomic facts (JSONL)

L0 · Conversation: Raw dialogue

Drill-down path: Persona → Scenario → Atom → Conversation. References use node_id and result_ref for deterministic traceback.

03 / SYMBOLIC SHORT-TERM

Mermaid task canvas + context offloading

Verbose intermediate logs are the largest token consumers in long tasks. The plugin offloads them to disk and keeps a high-density symbol graph in context.

How it works

Full tool logs are offloaded to refs/*.md under the data directory.
State transitions are encoded in Mermaid syntax inside a lightweight task canvas.
The agent reasons over the symbol graph, then greps a node_id to pull raw text.

Storage path on disk: ~/.openclaw/memory-tdai/. All artifacts are human-readable for white-box debugging.

04 / INSTALL

Install the OpenClaw plugin

Requires Node.js 22.16 or higher and an OpenClaw installation.

# Install the npm package as an OpenClaw plugin
openclaw plugins install @tencentdb-agent-memory/memory-tencentdb
openclaw gateway restart

Zero-config enable

Add the following to ~/.openclaw/openclaw.json to turn it on with default SQLite + sqlite-vec.

{
  "memory-tencentdb": {
    "enabled": true
  }
}

05 / CONFIGURATION

Daily-tuning parameters

Every field has a sensible default. The most common knobs are listed below.

Field	Default	Description
`storeBackend`	sqlite	Storage backend
`recall.strategy`	hybrid	keyword / embedding / hybrid (RRF)
`recall.maxResults`	5	Items returned per recall
`recall.timeoutMs`	5000	Skip injection on timeout
`pipeline.everyNConversations`	5	L1 extraction every N turns
`persona.triggerEveryN`	50	Generate persona every N memories
`offload.enabled`	false	Short-term compression toggle
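The dotted field names suggest nested JSON objects. The Python sketch below expands a few overrides into that shape. It assumes that dotted names map to nested keys under a "config" object, as the offload example in this explainer suggests, and that "enabled" and "config" can sit side by side; neither assumption is confirmed by the article, and `expand_dotted` is an illustrative helper.

```python
import json


def expand_dotted(flat: dict) -> dict:
    """Turn {"recall.maxResults": 5} into {"recall": {"maxResults": 5}}."""
    nested: dict = {}
    for dotted_key, value in flat.items():
        cursor = nested
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            cursor = cursor.setdefault(part, {})
        cursor[leaf] = value
    return nested


# Example overrides using field names from the table above.
overrides = {
    "recall.strategy": "keyword",
    "recall.maxResults": 3,
    "offload.enabled": True,
}
plugin_entry = {"memory-tencentdb": {"enabled": True, "config": expand_dotted(overrides)}}
print(json.dumps(plugin_entry, indent=2))
```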

06 / SHORT-TERM COMPRESSION

Enable Mermaid offloading (v0.3.4+)

Three steps to turn on context offload for long-horizon tasks.

Step 1 · Enable offload in plugin config

{
  "memory-tencentdb": {
    "config": {
      "offload": { "enabled": true }
    }
  }
}

Step 2 · Register the slot so OpenClaw routes offload requests

{
  "plugins": {
    "slots": {
      "contextEngine": "openclaw-context-offload"
    }
  }
}

Step 3 · Apply the runtime patch (once per OpenClaw install)

bash scripts/openclaw-after-tool-call-messages.patch.sh

07 / HERMES DOCKER

Run memory-enabled Hermes in one container

A single Docker image bundles Hermes Agent, the memory_tencentdb plugin, and the TDAI Memory Gateway.

# Build the image
docker build -f Dockerfile.hermes -t hermes-memory .

# Run the container (default model: DeepSeek-V3.2 on Tencent Cloud LKE)
docker run -d \
  --name hermes-memory \
  --restart unless-stopped \
  -p 8420:8420 \
  -e MODEL_API_KEY="your-api-key" \
  -e MODEL_BASE_URL="https://api.lkeap.cloud.tencent.com/v1" \
  -e MODEL_NAME="deepseek-v3.2" \
  -e MODEL_PROVIDER="custom" \
  -v hermes_data:/opt/data \
  hermes-memory

# Health check
curl http://localhost:8420/health

Any OpenAI-compatible endpoint works through MODEL_PROVIDER=custom. Memory data persists in the hermes_data volume.

08 / AGENT TOOLS & RECALL

What the agent sees

Two tools are exposed to the agent during a session. Recall uses BM25 + vector + RRF fusion by default.

tdai_memory_search

Search across L1 Atoms, L2 Scenarios, and L3 Persona.

tdai_conversation_search

Search raw L0 Conversation history.

Retrieval defaults

Hybrid strategy: BM25 keyword + vector embedding, fused via Reciprocal Rank Fusion.
BM25 tokenizer supports Chinese (jieba) and English.
Returns 5 items per recall; 5000 ms timeout; on timeout it skips injection.
References include node_id and result_ref for traceback.

09 / BENCHMARKS

Reported gains with OpenClaw

Measured over continuous long-horizon sessions, not isolated turns. SWE-bench runs 50 consecutive tasks per session.

Benchmark	Baseline	With Plugin	Δ Pass	Δ Tokens
WideSearch	33%	50%	+51.52%	−61.38%
SWE-bench	58.4%	64.2%	+9.93%	−33.09%
AA-LCR	44.0%	47.5%	+7.95%	−30.98%
PersonaMem	48%	76%	+59%	—

Numbers come from Tencent’s own evaluations and reflect the integration with OpenClaw.

10 / RESOURCES

Where to go next

Documentation, source code, and community channels.

Source code

github.com/Tencent/TencentDB-Agent-Memory

npm package

@tencentdb-agent-memory/memory-tencentdb

Roadmap

Portable memory, automatic Skill generation, visual debugging dashboard.

Curated by MARKTECHPOST · AI Research, Engineered for Builders

Key Takeaways

TencentDB Agent Memory is Tencent’s open-source (MIT) memory system for AI agents. It combines symbolic short-term memory with a layered long-term memory pipeline and has zero external API dependencies.
Long-term memory is structured as a 4-tier semantic pyramid (L0 Conversation → L1 Atom → L2 Scenario → L3 Persona), with drill-down via node_id and result_ref instead of flat vector recall.
Short-term memory offloads verbose tool logs to refs/*.md and keeps only a compact Mermaid task canvas in context, cutting token usage while preserving full traceability.
Reported gains when integrated with OpenClaw: WideSearch pass rate 33% → 50% with a 61.38% token reduction, SWE-bench 58.4% → 64.2%, AA-LCR 44.0% → 47.5%, and PersonaMem accuracy 48% → 76%.
Ships as a single npm plugin for OpenClaw and a Docker image for Hermes, with local SQLite + sqlite-vec by default, hybrid BM25 + vector + RRF retrieval, and an optional Tencent Cloud Vector Database (TCVDB) backend.

Michal Sutter

Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.