Lately, I’ve noticed something interesting. A lot of people are switching from OpenClaw to Hermes Agent. A few friends of mine did too, and they say there’s no going back. I’ve been using it for over a month, and honestly, it’s pretty solid.
For those who haven’t tried it yet: Hermes Agent is an open-source AI agent framework from Nous Research, and it has already racked up over 100K stars on GitHub. Think of it as an alternative to OpenClaw, but with a focus on running locally on your own device. The core idea is simple: the more you use it, the smarter it gets. It has self-evolving learning loops, memory mechanisms, and support for over 40 chat platforms.
But after a while, I started to notice a problem.
It remembers things, but it remembers them in a messy way. For example, I told it a few weeks ago that I was on a diet and wanted to stay under 1800 calories a day. A week later I gave up and said I was back to normal eating. The next time I asked it to plan my weekend, it still recommended low-calorie recipes, because both memories were there and it couldn’t tell which one was the latest. That kind of thing happens a lot when you chat with it frequently.
Hermes stores every conversation in SQLite and retrieves them with simple text matching. So when you repeat similar information in different chats, the memory base fills up with duplicates. The signal-to-noise ratio just gets worse over time.
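To make the limitation concrete, here’s a minimal sketch of what substring-style retrieval looks like. This is my own toy illustration, not Hermes’ actual schema or code: a `LIKE` query only finds literal keyword hits, so a semantically related memory phrased differently is invisible to it.

```python
import sqlite3

# Toy illustration (hypothetical schema): memories stored as raw rows,
# retrieved with substring matching, the way plain text search behaves.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, text TEXT)")
rows = [
    "I'm on a diet, keep me under 1800 calories a day",
    "I'm back to normal eating",
    "That restaurant had good food",
]
conn.executemany("INSERT INTO memories (text) VALUES (?)", [(r,) for r in rows])

def keyword_search(query: str) -> list[str]:
    # LIKE matching: only literal substring hits, no semantics
    cur = conn.execute(
        "SELECT text FROM memories WHERE text LIKE ?", (f"%{query}%",)
    )
    return [r[0] for r in cur.fetchall()]

print(keyword_search("diet"))   # hits the first entry only
print(keyword_search("place"))  # hits nothing: "restaurant" never matches "place"
```

Both diet memories sit in the table side by side, and a query like “place” comes back empty even though a relevant memory exists. That’s the gap the plugin below targets.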
I started wondering if there was a way to organize those memories properly. And then I found one.
The team behind MemOS, an open-source project with over 8,400 stars on GitHub, built a local memory plugin for Hermes. MemOS has been working on AI memory for a while, and this plugin brings their memory capabilities directly into Hermes. Everything runs locally – no cloud uploads needed.
The plugin solves two core problems: storing memories intelligently and retrieving them accurately.
For storage, it adds a full pipeline: semantic chunking → LLM summarization → vectorization → smart deduplication. The deduplication part is what really makes a difference. It doesn’t just compare strings; it uses an LLM to decide whether the new information is a duplicate, an update, or something completely new. Back to my diet example: when I said “I’m back to normal eating,” the plugin automatically recognized that as an update to the previous diet memory and merged the two into a single entry, keeping the superseded version in that entry’s history.
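The dedup step can be sketched roughly like this. Everything here is hypothetical: the real plugin asks an LLM for the duplicate/update/new verdict, while my stub uses a toy keyword heuristic just to make the control flow runnable.

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    text: str
    history: list = field(default_factory=list)  # superseded versions kept after merges

def classify(new: str, existing: Memory) -> str:
    """Stand-in for the LLM verdict: DUPLICATE, UPDATE, or NEW."""
    if new == existing.text:
        return "DUPLICATE"
    topic = {"diet", "eating", "calories"}  # toy notion of "same topic"
    if set(new.lower().split()) & topic and set(existing.text.lower().split()) & topic:
        return "UPDATE"
    return "NEW"

def store(memories: list, new: str) -> list:
    for m in memories:
        verdict = classify(new, m)
        if verdict == "DUPLICATE":
            return memories           # drop exact repeats
        if verdict == "UPDATE":
            m.history.append(m.text)  # keep the superseded version
            m.text = new              # merge: the latest statement wins
            return memories
    memories.append(Memory(new))      # genuinely new information
    return memories

mems: list = []
store(mems, "I'm on a diet, staying under 1800 calories a day")
store(mems, "I'm back to normal eating")
# one entry survives, with the old diet note preserved in its history
```

The point is the three-way branch: instead of blindly appending, each incoming fact either gets dropped, merges into an existing entry, or creates a new one.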
This keeps the memory base clean and usable, instead of turning into a cluttered mess over time.
For retrieval, Hermes’ native text search can miss things when keywords don’t match. You ask “What was that place you recommended last time?” and it can’t find it because the original text says “That restaurant had good food” – completely different keywords. The MemOS plugin uses a hybrid search engine: full-text search plus vector semantic search, then fusion ranking with diversity deduplication and time decay. The result? Even if you use different words, the semantic channel can pull up the relevant memory.
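Here’s one common way to implement that kind of fusion ranking: reciprocal-rank fusion over the two channels, followed by a time-decay factor. The function names and the exact scoring formula are my assumptions for illustration, not the plugin’s actual API.

```python
def rrf_fuse(keyword_hits: list, vector_hits: list, k: int = 60) -> dict:
    """Reciprocal Rank Fusion: score = sum of 1/(k + rank) across channels."""
    scores: dict = {}
    for hits in (keyword_hits, vector_hits):
        for rank, mem_id in enumerate(hits, start=1):
            scores[mem_id] = scores.get(mem_id, 0.0) + 1.0 / (k + rank)
    return scores

def decayed(score: float, age_days: float, half_life: float = 30.0) -> float:
    """Halve a memory's score every `half_life` days since it was stored."""
    return score * 0.5 ** (age_days / half_life)

# The query "that place you recommended" matches nothing by keyword,
# but the vector channel finds the restaurant memory semantically.
ages = {"restaurant_memory": 7, "old_note": 60}
fused = rrf_fuse(keyword_hits=["old_note"],
                 vector_hits=["restaurant_memory", "old_note"])
ranked = sorted(fused, key=lambda m: decayed(fused[m], ages[m]), reverse=True)
```

In this toy run, `old_note` appears in both channels but is two months stale, so the decay pushes the fresher, semantically matched restaurant memory to the top.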
And here’s a nice touch: at the start of each conversation, the system automatically does a pre-retrieval based on your latest message, injecting relevant memories into the context. If it doesn’t hit, it prompts the agent to actively search. The difference in experience is immediate – before, I’d often get vague answers or “I don’t remember,” but after installing the plugin, answers became noticeably more accurate.
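The pre-retrieval step amounts to something like the sketch below. The function name and prompt wording are mine, not the plugin’s; the shape is what matters: search with the latest message, inject hits into the context, and on a miss tell the agent to search actively.

```python
def build_context(user_message: str, search) -> str:
    """Prepend pre-retrieved memories to the prompt, or fall back to a search hint."""
    hits = search(user_message)
    if hits:
        memory_block = "\n".join(f"- {h}" for h in hits)
        return f"Relevant memories:\n{memory_block}\n\nUser: {user_message}"
    return ("No memories pre-retrieved; call the memory-search tool if needed.\n\n"
            f"User: {user_message}")

# stub retriever standing in for the hybrid search described above
fake_search = lambda q: ["That restaurant had good food"] if "restaurant" in q else []
ctx = build_context("Book that restaurant again", fake_search)
```

When `fake_search` hits, the memory lands in the context before the model ever answers, which is why the replies stop sounding vague.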
Another improvement: Hermes’ built-in skill generation uses the same model for both generation and evaluation, which sometimes leads to low-quality skills. The MemOS plugin supports independent model configuration for three levels: a lightweight model for embedding, a medium model for summarization, and a strong model for skill generation. It also adds a rule-based filter plus LLM evaluation, so only repeatable, valuable tasks get turned into skills.
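Conceptually, that setup looks something like this. The model names, config keys, and thresholds are all placeholders of mine; the idea is routing each job to an independently chosen model and putting cheap rules in front of the LLM verdict.

```python
# Illustrative only: keys and model names are hypothetical stand-ins.
MODEL_ROLES = {
    "embedding": "lightweight-embedder",      # runs on every chunk, must be cheap
    "summarization": "medium-chat-model",     # compresses conversations into memories
    "skill_generation": "strong-chat-model",  # writes and evaluates reusable skills
}

def should_become_skill(task: dict, llm_judge) -> bool:
    """Rule-based pre-filter, then an LLM verdict from the strong model."""
    # cheap rules first: one-off or trivial tasks never reach the LLM
    if task["times_seen"] < 2 or len(task["steps"]) < 2:
        return False
    return llm_judge(task)  # only repeatable, multi-step tasks get judged

# a task seen only once is rejected before any model call is made
one_off = should_become_skill({"times_seen": 1, "steps": ["a", "b"]}, lambda t: True)
```

Splitting the roles also means the evaluator is no longer grading its own homework, which is presumably why the generated skills come out cleaner.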
If you’re already using Hermes, this plugin is definitely worth trying. It doesn’t just add memory – it makes sure the memory actually works for you, instead of against you.