TencentDB Agent Memory: The Token Cost Problem Has a Memory-Layer Solution Now
The token cost problem in long-running agents now has an open-source memory-layer solution.
Tencent Cloud open-sourced TencentDB Agent Memory under the MIT license in May 2026. Instead of truncating or summarizing context, it offloads verbose tool output to local files and keeps only a lightweight Mermaid task graph in the context window. The agent retrieves the full details by node ID when it needs them.
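To make the offloading idea concrete, here is a minimal Python sketch. It is not the project's API: the `OffloadStore` class, its method names, and the one-file-per-node layout are all hypothetical, chosen only to show verbose output going to disk while a small Mermaid graph stays in context.

```python
import tempfile
from pathlib import Path


class OffloadStore:
    """Hypothetical helper: full tool output lives on disk, short labels stay in context."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)
        self.nodes: list[tuple[str, str]] = []  # (node_id, short label)

    def offload(self, label: str, output: str) -> str:
        """Write verbose output to a local file and record a graph node for it."""
        node_id = f"n{len(self.nodes) + 1}"
        (self.root / f"{node_id}.txt").write_text(output, encoding="utf-8")
        self.nodes.append((node_id, label))
        return node_id

    def graph(self) -> str:
        """Render the compact Mermaid task graph that would stay in the context window."""
        lines = ["graph TD"]
        for node_id, label in self.nodes:
            lines.append(f'    {node_id}["{label}"]')
        # Link steps in the order they ran (a simplification: real task graphs can branch).
        for (a, _), (b, _) in zip(self.nodes, self.nodes[1:]):
            lines.append(f"    {a} --> {b}")
        return "\n".join(lines)

    def retrieve(self, node_id: str) -> str:
        """Load the full output for one node, only when the agent asks for it."""
        return (self.root / f"{node_id}.txt").read_text(encoding="utf-8")


store = OffloadStore(Path(tempfile.mkdtemp()))
first = store.offload("search: vendor list", "row\n" * 5000)
second = store.offload("fetch: pricing page", "<html>...</html>" * 2000)
context_graph = store.graph()  # a few lines of text instead of tens of thousands of characters
```

In this sketch the prompt carries only `context_graph`, and a call like `store.retrieve("n1")` pulls the full output back in when a step actually needs it.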
According to the project's benchmarks, token consumption on WideSearch tasks dropped by 61.38% while the success rate rose from 33% to 50%. On PersonaMem, long-term accuracy rose from 48% to 76%.
Memory is organized into four tiers (raw logs → facts → patterns → personas), and every entry is traceable from the top tier back down to the raw logs. Default storage is local SQLite, with no external APIs required.
The two optimization layers stack: memory reduces tokens per call, and gateway routing reduces cost per token. If your agents call LLMs through an API, both matter.
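The stacking is multiplicative, since cost is tokens times price per token. A rough Python illustration follows. It takes the 61.38% token reduction from the WideSearch benchmark and assumes it carries over to your workload, and the 30% price reduction from routing is a purely hypothetical figure.

```python
def relative_cost(token_reduction: float, price_reduction: float) -> float:
    """Fraction of baseline spend left after both reductions (cost = tokens * price)."""
    return (1 - token_reduction) * (1 - price_reduction)


# 0.6138: WideSearch token reduction reported by the project.
# 0.30: hypothetical saving from routing calls to cheaper models.
remaining = relative_cost(token_reduction=0.6138, price_reduction=0.30)
print(f"{remaining:.1%} of baseline cost")  # prints "27.0% of baseline cost"
```

Under those assumptions, spend falls to about 27% of baseline, compared with about 39% from the memory layer alone.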
How to set up unified API routing for multi-model agents →