Persistent Memory / Document Planning.md Strategies #323
Replies: 4 comments 1 reply
-
Hey @RhettCreighton, I'm not sure there's an 'ideal known strategy' for orchestrating this type of multi-agent workflow. After running into some rather painful issues with Claude Code, mainly running out of context and spinning out of control, I set up a process using Gemini to help coordinate and control the software engineering pipeline.

Specifically, Gemini and I collaboratively broke the overall project down into sets of development tasks for the Claude Code agent. For each task, Gemini generated a detailed task description, which I manually copied and pasted into Claude through its desktop interface. Claude would then pass the instructions on to the Claude Code agent via MCP, enabling that agent to execute the coding work. This approach allowed steady, reliable progress over several weeks.

Now I'm looking to build on this further. Instead of managing information via files, I plan to integrate a graph database (SurrealDB) to structure and track the tasks and outputs. Initially I designed a solution using Tauri, leveraging its JavaScript integration to call the functions of the current OpenAI Codex implementation in TypeScript. However, with the recent emergence of a pure Rust version of Codex, it's now possible to build a much more robust system entirely in Rust, removing the dependency on Tauri and the JavaScript layers. Do you think it's worth making this an open prototyping project? I'd really appreciate your thoughts.
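To make the idea concrete, here is a minimal sketch of what the task records in that pipeline might look like once moved out of flat files and into structured storage. All names here (`Task`, `TaskStatus`, `next_pending`) are hypothetical, and a real SurrealDB-backed version would replace the in-memory slice with graph queries over task edges; this just illustrates the dependency-tracking shape.

```rust
// Hypothetical task records for the coordinator -> coding-agent pipeline.
// In the real system these would live in SurrealDB; a slice stands in here.
#[derive(Debug, Clone, PartialEq)]
enum TaskStatus {
    Pending,
    InProgress,
    Done,
}

#[derive(Debug, Clone)]
struct Task {
    id: u32,
    description: String,  // the detailed task description the coordinator generates
    depends_on: Vec<u32>, // graph edges: prerequisite task ids
    status: TaskStatus,
}

/// Return the next Pending task whose prerequisites are all Done.
fn next_pending(tasks: &[Task]) -> Option<&Task> {
    tasks.iter().find(|t| {
        t.status == TaskStatus::Pending
            && t.depends_on.iter().all(|dep| {
                tasks
                    .iter()
                    .any(|o| o.id == *dep && o.status == TaskStatus::Done)
            })
    })
}

fn main() {
    let tasks = vec![
        Task { id: 1, description: "Set up schema".into(), depends_on: vec![], status: TaskStatus::Done },
        Task { id: 2, description: "Implement task queue".into(), depends_on: vec![1], status: TaskStatus::Pending },
        Task { id: 3, description: "Wire up agent".into(), depends_on: vec![2], status: TaskStatus::Pending },
    ];
    let next = next_pending(&tasks).expect("a task should be ready");
    println!("next task: {} ({})", next.id, next.description);
}
```

The nice property of modelling dependencies as edges is that "what should the agent do next" becomes a query rather than something the coordinator has to remember across context resets.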
-
codex.md is read by Codex and sent with every request as context. I am personally seeing some gains from using that context space wisely during my own development process.
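As one illustration of "using that context space wisely" (this is just an example, not an official template): since the file is sent with every request, it pays to keep it short and limited to durable rules rather than chat history or task notes. The conventions below are made up for the sake of the example.

```markdown
# codex.md — sent with every request, so keep it small and durable

## Conventions
- Rust 2021; run `cargo fmt` and `cargo clippy` before committing

## Test setup
- `cargo test --workspace`; integration tests live in `tests/`

## Anti-patterns
- No `unwrap()` outside tests
```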
-
The pattern most people I know use today is exactly what you're describing. I just filed #20138 proposing an in-app version of this for the desktop variant: a notes panel with a markdown surface that auto-injects into the agent's system context every turn from durable host state, so it survives compaction by construction. Two regions: shared (read/write by both the user and the agent; replaces the manual-paste loop) and private (UI-only, never sent to the model; your own scratchpad). Cross-linking in case it's relevant to anyone here.
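The two-region split above can be sketched in a few lines. This is only an illustration of the invariant (the private region never reaches the model, and the shared region is rebuilt from durable state every turn), not the actual #20138 design; `NotesPanel` and `system_context` are hypothetical names.

```rust
// Hypothetical two-region notes panel: only the shared region is ever
// injected into the model's context; the private region stays host-side.
struct NotesPanel {
    shared: String,  // read/write by user and agent; sent every turn
    private: String, // UI-only scratchpad; never sent to the model
}

impl NotesPanel {
    /// Build the system-context fragment injected each turn. Because it is
    /// rebuilt from durable host state, it survives compaction by construction.
    fn system_context(&self) -> String {
        format!("## Shared notes\n{}", self.shared)
    }
}

fn main() {
    let panel = NotesPanel {
        shared: "Current task: migrate storage layer".to_string(),
        private: "note to self: review the open PR later".to_string(),
    };
    let ctx = panel.system_context();
    assert!(ctx.contains(&panel.shared));
    assert!(!ctx.contains(&panel.private));
    println!("{}", ctx);
}
```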
-
The AGENTS.md / codex.md file (mentioned above) is the key layer here. Beyond progress notes, it is worth distinguishing two types of persistent content:

- **Type 1: Project rules (behavioral)** — conventions, anti-patterns, test setup. These do not change between sessions and should live permanently in AGENTS.md. They are "always-loaded" context that tells the AI how to behave in this project regardless of task.
- **Type 2: Task state (progress)** — current task, what was done, what is next. This is the planning.md use case: reset and rebuilt between sessions.

The confusion I see most often: teams put Type 2 content (session progress) in the same file as Type 1 content (rules), which degrades both. Rules get stale as they accumulate task-specific notes, and task state gets noisy as the rules grow.

What works: the codex.md approach mentioned above handles the rules layer. For what goes in that file: https://gist.github.com/oliviacraft — free per-stack starters showing which rules produce the most reliable output.
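For the Type 2 layer, a planning.md that gets reset between sessions might look like the following; the section names are just one possible layout, not a prescribed format:

```markdown
# planning.md — session state (reset and rebuilt between sessions)

## Current task
Migrate task tracking from flat files to the database layer.

## Done
- Defined the task schema
- Wrote the migration script

## Next
- Backfill existing tasks
- Delete the legacy files once verified
```

Because this file is disposable, the agent can rewrite it freely at each stopping point without polluting the permanent rules file.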
-
Because LLM models have a 200,000-token context window, developers often need to reset the context window during long development sessions. It is therefore convenient to have the AI document its own progress and later pick up from the last stopping point after the context window is cleared.
Is there some kind of ideal known strategy to optimize performance for this? Can we tune the internal prompts that codex uses to facilitate these operations?