Unified Agentic Memory Across Harnesses Using Hooks

Much of the current discussion in the AI space holds that the main debate isn’t about when the next better model drops, but about who will build the right harness around it. A harness is the scaffolding around the model: the agent loop, tool definitions, context management, memory, prompts, and workflows that turn a raw LLM into a useful product. The model is the engine; the harness is everything that makes it actually drive. Cursor and Claude Desktop are examples of harnesses.

There’s a running debate in the AI coding tool space: does committing to a specific harness mean vendor lock-in? Memory is the sharpest edge of this. If your agent’s memory lives inside a closed harness or behind a proprietary API, you don’t really own it, and switching costs add up fast. But it doesn’t have to be that way.

The idea for this blog post is simple: keep the memory layer outside the harness, and let any harness plug into it.

Unified agentic memory design.

In this post, I’ll show how you can build a single, shared memory layer that works across three different coding agents — Claude Code, OpenAI’s Codex, and Cursor — using hooks as the integration mechanism and Neo4j as the persistent store.

The code for hook integration is available on GitHub.

MCP tools can only get you so far with memory

MCP (Model Context Protocol) servers are the go-to answer for giving agents access to external systems. And they work. You can expose a Neo4j database as an MCP tool and let the agent query it when it decides to.

But MCP tools are agent-initiated. The model has to decide to call the tool, and it has to know when and why to do so. That means:

  • The agent needs to “remember to remember”: it must proactively decide to store something worth recalling later.
  • There’s no guarantee of consistency: one session might log everything, the next might log nothing.
  • You’re relying on the model’s judgment about what’s important for memory, in real time, while it’s busy doing something else.

What you really want is passive, deterministic logging: something that captures every session event regardless of what the model is doing, without consuming any of its context or attention.

This is exactly what hooks give you.

Hooks let you write programmatic, deterministic flows triggered by a predefined set of events.

Enter hooks

Hooks are shell commands that fire automatically on lifecycle events: when a session starts, when the user submits a prompt, before and after every tool use, and when the session ends. The agent doesn’t decide to call them; they run programmatically.

The key insight is that hooks are remarkably standardized across providers. Claude Code, Codex, Cursor, and others all support essentially the same lifecycle events:

  • SessionStart for when the agent session begins
  • UserPromptSubmit (or beforeSubmitPrompt in Cursor) for when the user sends a message
  • PreToolUse / PostToolUse for before and after each tool call
  • Stop for when the session ends

The hook receives a JSON payload on stdin with the session ID, event name, tool details, and user prompt. And the hook can emit JSON on stdout to inject additional context back into the conversation. Same contract, three harnesses/clients.

There are other hooks too, things like notification events, subagent stop, or pre-compact hooks, but we won’t be using those here.

Shared memory layer

Now we need somewhere to persist the memory. Quick disclaimer: I work at Neo4j, so we will be using it in this example.

Session structure.

The model is straightforward. Each agent session is a node, connected to a linked list of event nodes, one per hook invocation. Events are typed by the lifecycle event that triggered them: SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, Stop. A session ends up as an ordered timeline of everything that happened during that run.
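One way to write that linked list from a hook, sketched with the official Neo4j Python driver (`pip install neo4j`); the labels and relationship names here are illustrative, not the exact schema from the repo:

```python
# Append one event to the session's timeline: chain the previous tail
# event with :NEXT and move the session's :LAST pointer to the new event.
APPEND_EVENT = """
MERGE (s:Session {id: $session_id})
CREATE (e:Event {type: $event, client: $client, ts: datetime()})
WITH s, e
OPTIONAL MATCH (s)-[tail:LAST]->(prev:Event)
FOREACH (p IN CASE WHEN prev IS NULL THEN [] ELSE [prev] END |
    CREATE (p)-[:NEXT]->(e))
DELETE tail
CREATE (s)-[:LAST]->(e)
"""

def log_event(driver, session_id: str, event: str, client: str) -> None:
    """`driver` is a neo4j.GraphDatabase.driver(...) instance."""
    with driver.session() as db:
        db.run(APPEND_EVENT, session_id=session_id, event=event, client=client)
```

Because `DELETE` on a null relationship is a no-op in Cypher, the same query handles both the first event of a session and every subsequent one.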

All five event types are written to the store, which gives you a complete audit trail of every session across every harness. Two of them are also injection points. SessionStart fires before the agent reads its system prompt, so anything the hook emits there gets prepended to the system prompt. That is how persistent, agent-level memory makes its way into context. UserPromptSubmit fires just before the user message is sent, and anything emitted there gets appended to the user prompt. That is the hook for turn-level context, like pulling in memories relevant to what the user just typed.

So, what happens if we start a new session in any of these harnesses with active hooks, for example Cursor?

Example interactions in Cursor

If we inspect the results in the Neo4j Browser:

Example session persisted as graph in Neo4j.

One important constraint: hooks run outside the harness’s model session. You cannot reuse the LLM the agent is talking to. If you want LLM-powered work inside a hook you have to make your own model call, which adds latency to every event the agent fires. That is why the hooks here only do two things: log events and inject pre-computed memories. They stay fast and deterministic.

Dream phase

The actual memory work happens in a separate dream phase: extracting facts from sessions, summarizing what happened, updating the graph. This is just a batch job that runs every few hours, reads the events accumulated since the last run, and writes back to the memory store. You could in principle kick off a memory update asynchronously every time a session stops, but that feels like a bit too much; a periodic batch is simpler and works fine for this demonstration.

The dream job pulls every event since the session’s last watermark, hands them to Claude along with the current memory store, and asks it to write back a small set of durable notes. The notes themselves imitate a markdown wiki, the same shape Karpathy and others have been gravitating toward for personal LLM memory and the same shape Anthropic’s skills already use: each memory is a file at a semantic path like profile/role.md, tools/bash/common-flags.md, or project/neo4j-skills.md, with YAML frontmatter on top and prose underneath. Claude is told to merge rather than append, so a path is a living document, not a log; if new events contradict an old note, the old note gets rewritten. The result is a tree of small, self-contained markdown files a future session can read cold, indistinguishable in form from a skill, just authored by the dream phase instead of by hand.
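Concretely, a memory note might look like this; the path, frontmatter fields, and content are illustrative, not taken from the repo:

```markdown
<!-- tools/bash/common-flags.md -->
---
updated: 2025-01-15
source_sessions: 3
---

Prefers `rg` over `grep` and `fd` over `find`. Long-running commands
are usually wrapped in a `timeout` so sessions never hang.
```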

If we run it on our example, we get the following memories created.

Dream phase adding and editing memories.

And now if I opened a different harness, this time Claude Code Desktop with hooks activated, I would get the following response.

Claude Code Desktop using the unified memory layer.

Accessing the memory

The final piece of the puzzle is allowing the agent to access the memory layer. As mentioned, there are two ways to inject information into the agent: hooks and MCP tools.

Agent interacting with the memory layer through hooks and MCP tools.

Hooks are deterministic and run at the start of every session to populate the system prompt. This is where profile information and instructions on how to use memory efficiently should go. You can also append additional context when a user prompt submission event fires, but it’s append-only; you can’t manipulate other parts of the prompt.

MCP tools, on the other hand, give the LLM direct access to the memory layer on demand. Instead of passively receiving context at startup, the agent can search for relevant memories, store new information, and update or remove existing entries. Essentially, it’s basic CRUD over the abstracted markdown files stored in Neo4j.

In the end, I think you’ll almost always need both. In this project we only have hooks, no MCP tools, but you can always just plug in the official Neo4j MCP to let the agent explore the graph.

Getting it to work

Somewhat amusingly, the way I set up the hooks was simply to ask the agent in each harness to install them itself, though I’m sure there are better approaches.

Cursor agent installing hooks.

Summary

If you don’t own your memory, you don’t own your agent. Every harness today builds its own walled garden of context, preferences, and session history. Switch them and you start from zero. That doesn’t have to be the case.

Hooks break that pattern. They let you write integrations that plug into any harness from the outside and the interface is remarkably consistent. Claude Code, Codex, and Cursor all fire the same lifecycle events: session start, prompt submission, tool use, session end. The hook receives JSON on stdin, optionally emits JSON on stdout to inject context, and that’s the entire contract. Because hooks run deterministically on every event, they don’t consume model attention or rely on the agent to decide what’s worth saving. The same two Python scripts handle all three clients; thin shell wrappers that pass a --client flag are the only per-harness glue.

The architecture has three layers:

  1. Hooks (online) — passively log every event into Neo4j as a linked list per session. No model calls, no latency cost, just append.
  2. Dream phase (offline) — a batch job reads accumulated events, asks Claude to distill them into durable markdown memories, and writes them back. Memories are organized by topic and merged rather than appended, so they stay current instead of growing forever.
  3. Injection (online) — on the next session start in any harness, profile memories are loaded into context. On each user prompt, relevant memories are searched and appended automatically.

The result is a memory layer that sits below all three harnesses, works without any of them knowing about the others, and belongs entirely to you. You can switch from Cursor to Claude Code to Codex mid-project and pick up exactly where you left off. Your agent’s understanding of who you are, what you’re working on, and how you prefer to work follows you, not the tool.

Code is available here.

P.S.: All images are created by the author.
