Flows
Code Mode is like vibe-coding a query plan.
What is "code mode", you may ask? It comes from a recent observation by Anthropic and Cloudflare: agents just love to write code, and letting them write code can massively reduce the context overhead of tool calls for MCP interfaces:
> We found agents are able to handle many more tools, and more complex tools, when those tools are presented as a TypeScript API rather than directly...
The approach really shines when an agent needs to string together multiple calls. With the traditional approach, the output of each tool call must feed into the LLM's neural network, just to be copied over to the inputs of the next call, wasting time, energy, and tokens. When the LLM can write code, it can skip all that, and only read back the final results it needs.
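The contrast can be sketched in a few lines of Python. The function names below (`list_zones`, `purge_cache`) are invented stand-ins for REST calls, not real Cloudflare APIs; the point is that the intermediate payload never transits the model's context:

```python
# Hypothetical stand-ins for two REST endpoints. In traditional tool-calling,
# the full output of the first would stream through the LLM's context just to
# be copied into the second call's input.

def list_zones():
    # stand-in for a REST call returning a large payload
    return [{"id": "z1", "name": "example.com", "status": "active"},
            {"id": "z2", "name": "example.org", "status": "paused"}]

def purge_cache(zone_id):
    # stand-in for a second REST call consuming the first one's output
    return {"zone": zone_id, "purged": True}

# In code mode, the agent writes this glue itself; the intermediate zone
# list stays in the sandbox, and only the compact summary returns.
active = [z["id"] for z in list_zones() if z["status"] == "active"]
results = [purge_cache(z) for z in active]
print(results)  # → [{'zone': 'z1', 'purged': True}]
```

Only the final `results` line needs to be read back by the model; everything in between runs next to the APIs.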
These MCP interfaces are often just a wrapper around REST APIs. In the Cloudflare case, for example, the underlying REST APIs cover over 2500 endpoints across dozens of products. That's a lot of surface area.
If the agent generates a script and sends the code instead of calling the APIs directly, there are two big wins:
- The actual call is simple, and compact;
- The code runs in Cloudflare's own sandbox, right next to the APIs.
What would code-mode for data look like?
Actually, we have lots of names for it already. Approximately the whole of SQL (statements, views, stored procedures): a smart language that wraps complex logic in a single call and puts it close to the data. Related techniques: push-down, query optimization.
So: "code mode" is comparable to one-shot generation of a query plan for a data engine (without any statistics!).
Agentic memory in keep isn't a relational model; it's a collection of nodes (conversations, documents, media, semi-structured objects) in a dynamic graph, where the nodes have vector embeddings for semantic similarity, fused with full-text keyword search, and the edges are driven by tags. Languages such as SQL, Cypher, and Gremlin aren't a good match for retrieval or update across this structure.
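As a rough illustration of the hybrid retrieval idea (a toy model, not keep's actual internals), here is what fusing vector similarity with keyword overlap over tagged nodes might look like; the weighting scheme and data structures are all assumptions:

```python
# Toy hybrid retrieval: score = alpha * cosine(embedding) + (1-alpha) * keyword overlap.
# Illustrative only -- keep's real data model and scoring differ.
from dataclasses import dataclass, field
from math import sqrt

@dataclass
class Node:
    id: str
    text: str
    embedding: list
    tags: dict = field(default_factory=dict)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

def keyword_score(query, text):
    words = set(query.lower().split())
    return sum(1 for w in words if w in text.lower()) / len(words)

def fused_search(nodes, query, query_emb, alpha=0.7):
    # weighted fusion of the semantic and lexical signals
    scored = [(alpha * cosine(query_emb, n.embedding)
               + (1 - alpha) * keyword_score(query, n.text), n)
              for n in nodes]
    return sorted(scored, key=lambda s: -s[0])

nodes = [
    Node("a1", "OAuth2 token refresh design decision", [0.9, 0.1]),
    Node("d4", "API key rotation policy for staging", [0.4, 0.6]),
]
ranked = fused_search(nodes, "token refresh", [1.0, 0.0])
print(ranked[0][1].id)  # → a1
```

Neither signal alone is enough: pure vector search misses exact identifiers, pure keyword search misses paraphrase, and the tag-driven edges add a third axis the query languages above don't model well.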
The difference shows up in the structure of interaction: exploratory, curatorial. Of course sometimes you just want "put" and "get" -- but more often, you want to try to find something, discard some results and focus on others, pivot, collect, dig deeper, and maybe update based on that result-set. Update actions sometimes need summarization, classification, specialized processes such as OCR, and deeper analysis to uncover themes and dynamics.
Steering these activities involves layers of processing: mechanical, small-model tasks such as classification, and often decisions that can only be made by a powerful model or a person.
That's why keep runs on a workflow system: Flows.
Here's how flows work.
Every action in the CLI and REST API (put, get, find, move...) is just a thin wrapper that invokes a workflow. The workflow is driven by "state documents": each state-doc is one or more instructions for the activities to be done, and where to go next. These are just documents in the datastore, so you can change or extend the processing flow for any action just by saving a document (somewhat like a trigger for a stored procedure, in SQL-land). Executing a flow is token-budgeted, workload-budgeted, and returns a cursor so you can ask it to continue.
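A toy interpreter makes the state-document idea concrete. Everything below is an illustrative sketch under stated assumptions, not keep's engine: each state doc names actions to run and where to go next, execution is capped by a tick budget, and the return value carries a cursor so the caller can resume:

```python
# Minimal sketch of budgeted, resumable state-doc execution.
# State docs here are plain dicts: {"do": [actions...], "next": state-or-None}.

def run_flow(states, start, budget, resume=None):
    log, current, ticks = [], resume or start, 0
    while current and ticks < budget:
        doc = states[current]             # fetch the "state document"
        for action in doc.get("do", []):  # run the actions it lists
            log.append((current, action))
        ticks += 1
        current = doc.get("next")         # where to go next
    # cursor records where to pick up if the budget ran out
    return {"log": log, "cursor": current, "ticks": ticks}

states = {
    "query-resolve": {"do": ["find"], "next": "query-branch"},
    "query-branch":  {"do": ["rank", "filter"], "next": None},
}

first = run_flow(states, "query-resolve", budget=1)
print(first["cursor"])  # → query-branch  (stopped mid-flow; resumable)

rest = run_flow(states, "query-resolve", budget=2, resume=first["cursor"])
print(rest["log"])      # → [('query-branch', 'rank'), ('query-branch', 'filter')]
```

Because the state docs are just data, swapping one document changes the behavior of the whole flow -- which is the trigger-like extensibility described above.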
All the primitives (find, list, extract links, tag, summarize, run small-model inference, and so on) are just actions that can be called from any state.
And the MCP interface? It's just "run a flow".
The MCP (and the CLI, and the REST API) can either run a predefined flow, such as the built-in processes that handle search and update:
```python
# One call: search with steering, token-budgeted result
keep_flow(
    state="query-resolve",
    params={"query": "authentication design", "bias": {"now": 0}},
    budget=3,
    token_budget=1500
)
# → flow: stopped (3 ticks) via query-resolve > query-branch > query-resolve
# results:
#   - %a1b2c3 (0.94) OAuth2 token refresh design decision...
#   - %d4e5f6 (0.82) API key rotation policy for staging...
# margin: 0.12
# cursor: 5ccf5dd940ac
```
Or it can do something completely custom; just provide the flow state-doc inline.
```python
# Find all open commitments from last week and mark them reviewed
keep_flow(
    state_doc_yaml="""
      match: sequence
      rules:
        - id: found
          do: find
          with:
            query: "{params.query}"
            tags: {act: "commitment", status: "open"}
            since: "P7D"
            limit: 20
        - id: tagged
          do: tag
          with:
            items: "{found.results}"
            tags: {reviewed: "2026-03-15"}
    """,
    params={"query": ""},
    token_budget=500
)
# → flow: done (1 ticks)
# found: 8 items
# tagged: 8 items
```
When memory becomes large and diverse, one-shot queries can do pretty well (as we saw with the LoCoMo benchmark) -- until retrieval quality drops off a cliff, or you need better tagging, or pulling large result-sets into context gets too expensive. That's when you need something that has a small surface area, power to run complex tasks close to the data, and real extensibility.
Flows provide a well-scoped, manageable, agent-extensible way to interact with a memory system of any scale.
Memory that pays attention. Because "information" is a verb, not a noun.
Some documentation:
Try it out:
```shell
uv tool install keep-skill --upgrade
keep config --setup
```
To install the MCP in Claude Desktop:
```shell
keep config mcpb
```
To install the MCP and hooks in Claude Code:
```
/plugin marketplace add https://github.com/keepnotes-ai/keep.git
/plugin install keep@keepnotes-ai
```
Then say to Claude:
> Please read all the `keep_help` documentation, and then use `keep_prompt(name="reflect")` to save some notes about what you learn.