I've spent the last several months building ANIMUS, an autonomous system in Rust that gives a local LLM persistent memory. The idea is simple: a knowledge graph that grows on its own, cycle after cycle, as the system reads documents, detects gaps in its knowledge, and fills them in.
For months, the metric I watched most closely was the node count of the graph. It kept climbing. I felt good about that.
Until I ran a full audit and found that 52% of those nodes were undetected duplicates: of 1,892 reported nodes, only 911 were actually unique, which leaves 981 duplicates.
How did this happen?
ANIMUS's autonomous loop actively looks for "gaps" — holes in its knowledge that the system decides to fill on its own. The problem: an overly aggressive filter was excluding certain categories from the gap pool, which trapped the system in a loop of re-exploring the same ~40 topics for thousands of cycles. Each pass generated content that was similar but not identical to the last — different enough to avoid triggering any exact-duplicate check, but substantially the same information rephrased.
The node count kept climbing. Actual knowledge, not so much.
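To make the failure mode concrete, here is a minimal sketch of why an exact-duplicate check misses rephrased content while a similarity check catches it. This is not ANIMUS's actual code: the function names, the word-set Jaccard measure, and the 0.6 threshold are all assumptions for illustration. A lexical measure like this only catches rewordings that reuse the same vocabulary; paraphrases with different words would need something like embedding similarity.

```rust
use std::collections::HashSet;

/// Lowercased set of words in a text, with punctuation stripped.
fn word_set(text: &str) -> HashSet<String> {
    text.split_whitespace()
        .map(|w| {
            w.chars()
                .filter(|c| c.is_alphanumeric())
                .collect::<String>()
                .to_lowercase()
        })
        .filter(|w| !w.is_empty())
        .collect()
}

/// Jaccard similarity of the two word sets: |A ∩ B| / |A ∪ B|.
fn jaccard(a: &str, b: &str) -> f64 {
    let (sa, sb) = (word_set(a), word_set(b));
    if sa.is_empty() && sb.is_empty() {
        return 1.0; // two empty texts are treated as identical
    }
    let inter = sa.intersection(&sb).count() as f64;
    let union = sa.union(&sb).count() as f64;
    inter / union
}

/// Hypothetical threshold; it would need tuning on real graph content.
const NEAR_DUP_THRESHOLD: f64 = 0.6;

fn is_near_duplicate(a: &str, b: &str) -> bool {
    jaccard(a, b) >= NEAR_DUP_THRESHOLD
}

fn main() {
    let old = "Rust ownership ensures memory safety without a garbage collector.";
    let new = "Without a garbage collector, Rust ownership ensures memory safety!";

    // An exact check sees two different strings...
    println!("exact match:    {}", old == new);
    // ...while the word-set comparison sees the same content.
    println!("jaccard:        {:.2}", jaccard(old, new));
    println!("near duplicate: {}", is_near_duplicate(old, new));
}
```

Running a check like this before inserting a node would have flagged the rephrased passes instead of counting them as growth.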
The Rust engineering side
The fix wasn't magic; it was audit work:
- Reopening the gap filter that had been closed too aggressively, so the system would explore genuinely new topics instead of repeating itself.
- Fixing a recency bias in the semantic search (Brain::search): it walked the graph from node 0 with .take(2), which meant it almost always returned stale content from earlier versions of the system. A simple .rev() fixed it.
- Building an "auto-census" process that runs every 37 cycles and generates real statistics about the graph by category, so the system itself (and I) could see with numbers, not intuition, whether it was growing in a healthy way.
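The search fix and the census can be sketched roughly as follows. This is an illustration, not the real Brain::search: the Node struct, the substring match standing in for semantic matching, and the function names are assumptions. It also assumes nodes are stored in insertion order, so index 0 is the oldest. Only .take(2), .rev(), and the 37-cycle interval come from the project itself.

```rust
use std::collections::BTreeMap;

/// Hypothetical node shape; ANIMUS's real types may differ.
#[derive(Debug, Clone)]
struct Node {
    id: usize,
    category: String,
    text: String,
}

/// Old behaviour: scan from node 0 and keep the first `limit` matches,
/// so the oldest matching nodes always win.
fn search_oldest_first<'a>(nodes: &'a [Node], query: &str, limit: usize) -> Vec<&'a Node> {
    nodes.iter().filter(|n| n.text.contains(query)).take(limit).collect()
}

/// Fixed behaviour: `.rev()` scans from the newest node backwards.
fn search_newest_first<'a>(nodes: &'a [Node], query: &str, limit: usize) -> Vec<&'a Node> {
    nodes.iter().rev().filter(|n| n.text.contains(query)).take(limit).collect()
}

/// The census runs every 37 cycles.
const CENSUS_INTERVAL: u64 = 37;

fn census_due(cycle: u64) -> bool {
    cycle > 0 && cycle % CENSUS_INTERVAL == 0
}

/// Node count per category, sorted by category name.
fn census(nodes: &[Node]) -> BTreeMap<&str, usize> {
    let mut counts = BTreeMap::new();
    for n in nodes {
        *counts.entry(n.category.as_str()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    // Illustrative data only: five nodes that all match the query.
    let nodes: Vec<Node> = (0..5)
        .map(|i| Node {
            id: i,
            category: if i % 2 == 0 { "rust".into() } else { "llm".into() },
            text: format!("note {i} about memory"),
        })
        .collect();

    let old: Vec<usize> = search_oldest_first(&nodes, "memory", 2).iter().map(|n| n.id).collect();
    let new: Vec<usize> = search_newest_first(&nodes, "memory", 2).iter().map(|n| n.id).collect();
    println!("oldest first: {old:?}");
    println!("newest first: {new:?}");

    for cycle in 1..=80u64 {
        if census_due(cycle) {
            println!("cycle {cycle}: {:?}", census(&nodes));
        }
    }
}
```

With every node matching, the first version returns nodes 0 and 1 and the second returns nodes 4 and 3, which is the whole difference one .rev() makes.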
Along the way, I also migrated the inference engine: from a Python wrapper to llama-server.exe launched directly from Rust, and from the original model to a quantized Gemma 4 E2B, running at ~77 tokens/second on a consumer GPU (RTX 3050, 4 GB). None of this required the cloud or paid APIs; everything runs locally.
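For readers who want to try the same setup, launching the server from Rust can look roughly like this with std::process::Command. It is a sketch under assumptions, not the project's actual launcher: the model path, the port, and the number of GPU layers are placeholders, and llama-server.exe is assumed to be on the PATH. The -m, --port, and -ngl flags are llama.cpp server options.

```rust
use std::process::Command;

/// Build (but do not run) the command line for llama.cpp's server binary.
/// `model_path` and `port` are hypothetical values supplied by the caller.
fn llama_server_command(model_path: &str, port: u16) -> Command {
    let mut cmd = Command::new("llama-server.exe");
    cmd.arg("-m")
        .arg(model_path)
        .arg("--port")
        .arg(port.to_string())
        // Assumption: request GPU offload for all layers; lower this if VRAM runs out.
        .arg("-ngl")
        .arg("99");
    cmd
}

fn main() {
    let mut cmd = llama_server_command("models/model.gguf", 8080);
    println!("would run: {:?}", cmd);

    match cmd.spawn() {
        Ok(mut child) => {
            // A real system would now send requests to the server over HTTP.
            // Here the child is stopped immediately so the sketch terminates.
            let _ = child.kill();
            let _ = child.wait();
        }
        Err(e) => eprintln!("could not start llama-server.exe: {e}"),
    }
}
```

Owning the child process from Rust means the same program that drives the loop can also start, restart, and stop the model.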
What I learned
The most valuable part of this whole episode wasn't fixing the bug. It was realizing that a metric that only goes up never warns you that something is wrong. Node count was a proxy for "the system is learning," but optimizing that one proxy, with nothing to balance it, ended up producing the opposite: inflated content, not new knowledge.
ANIMUS now runs on several cross-checked signals (verified uniqueness, recency-weighted relevance, source validation) instead of one vanity metric. If two signals start to diverge, the system stops and re-audits instead of continuing to generate.
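As a rough illustration of the "stop when signals diverge" idea, the sketch below compares two of those signals: how many nodes were reported and how many were verified unique. The types, names, and the 0.5 novelty threshold are assumptions made for this example, not ANIMUS's real implementation. Only the 1,892 and 911 figures come from the audit; the other numbers are made up.

```rust
/// Two signals captured at one audit point.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Snapshot {
    reported_nodes: u64,
    unique_nodes: u64,
}

impl Snapshot {
    /// Share of reported nodes that are duplicates, in [0.0, 1.0].
    fn duplicate_ratio(&self) -> f64 {
        if self.reported_nodes == 0 {
            return 0.0;
        }
        let dups = self.reported_nodes.saturating_sub(self.unique_nodes);
        dups as f64 / self.reported_nodes as f64
    }
}

#[derive(Debug, PartialEq)]
enum Action {
    Continue,
    StopAndAudit,
}

/// Hypothetical threshold: at least half of the newly added nodes must be unique.
const MIN_NOVELTY: f64 = 0.5;

/// Compare growth in reported nodes with growth in unique nodes.
/// If the count rises while uniqueness barely moves, the signals have diverged.
fn decide(prev: Snapshot, curr: Snapshot) -> Action {
    let added = curr.reported_nodes.saturating_sub(prev.reported_nodes);
    let added_unique = curr.unique_nodes.saturating_sub(prev.unique_nodes);
    if added == 0 {
        return Action::Continue; // nothing new to judge
    }
    let novelty = added_unique as f64 / added as f64;
    if novelty < MIN_NOVELTY {
        Action::StopAndAudit
    } else {
        Action::Continue
    }
}

fn main() {
    // Figures from the audit described in this post.
    let audit = Snapshot { reported_nodes: 1892, unique_nodes: 911 };
    println!("duplicate ratio: {:.1}%", audit.duplicate_ratio() * 100.0);

    // Made-up snapshots: 100 nodes added, only 5 of them unique.
    let prev = Snapshot { reported_nodes: 100, unique_nodes: 95 };
    let curr = Snapshot { reported_nodes: 200, unique_nodes: 100 };
    println!("decision: {:?}", decide(prev, curr));
}
```

The point of the design is that neither number is trusted alone: node count only matters when verified uniqueness moves with it.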
If you're curious about the full picture (architecture, benchmarks, comparison against a simple vector RAG baseline), the technical paper is open access with a DOI: 10.5281/zenodo.20674981. Code is on GitHub.
ANIMUS is an independent project, developed in Santo Domingo, Dominican Republic.
This post was written with the assistance of an LLM, based on my own project, data, and experience.













