Persistent Memory / Document Planning.md Strategies #323
Replies: 4 comments 1 reply
-
Hey @RhettCreighton, I'm not sure there's an 'ideal known strategy' for orchestrating this type of multi-agent workflow. After running into some rather painful issues with Claude Code, mainly running out of context and spinning out of control, I set up a process using Gemini to help coordinate and control the software engineering pipeline.

Specifically, Gemini and I collaboratively broke the overall project down into sets of development tasks for the Claude Code agent. For each task, Gemini generated a detailed task description, which I manually copied and pasted into Claude through its desktop interface. Claude would then pass the instructions on to the Claude Code agent via MCP, enabling that agent to execute the coding work. This approach allowed steady, reliable progress over several weeks.

Now I'm looking to build on this further. Instead of managing information via files, I plan to integrate a graph database (SurrealDB) to structure and track the tasks and outputs. Initially I designed a solution using Tauri, leveraging its JavaScript integration to call the functions of the current OpenAI Codex implementation in TypeScript. However, with the recent emergence of a pure Rust version of Codex, it's now possible to build a much more robust system entirely in Rust, removing the dependency on Tauri and the JavaScript layers. Do you think it's worth making this an open prototyping project? I'd really appreciate your thoughts.
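To make the idea concrete, here is a minimal sketch of what the task records in that pipeline might look like once moved out of flat files and into structured storage. All names here (`Task`, `TaskStatus`, `next_pending`) are hypothetical, and a real SurrealDB-backed version would replace the in-memory slice with graph queries over task edges; this just illustrates the dependency-tracking shape.

```rust
// Hypothetical task records for the coordinator -> coding-agent pipeline.
// In the real system these would live in SurrealDB; a slice stands in here.
#[derive(Debug, Clone, PartialEq)]
enum TaskStatus {
    Pending,
    InProgress,
    Done,
}

#[derive(Debug, Clone)]
struct Task {
    id: u32,
    description: String,  // the detailed task description the coordinator generates
    depends_on: Vec<u32>, // graph edges: prerequisite task ids
    status: TaskStatus,
}

/// Return the next Pending task whose prerequisites are all Done.
fn next_pending(tasks: &[Task]) -> Option<&Task> {
    tasks.iter().find(|t| {
        t.status == TaskStatus::Pending
            && t.depends_on.iter().all(|dep| {
                tasks
                    .iter()
                    .any(|o| o.id == *dep && o.status == TaskStatus::Done)
            })
    })
}

fn main() {
    let tasks = vec![
        Task { id: 1, description: "Set up schema".into(), depends_on: vec![], status: TaskStatus::Done },
        Task { id: 2, description: "Implement task queue".into(), depends_on: vec![1], status: TaskStatus::Pending },
        Task { id: 3, description: "Wire up agent".into(), depends_on: vec![2], status: TaskStatus::Pending },
    ];
    let next = next_pending(&tasks).expect("a task should be ready");
    println!("next task: {} ({})", next.id, next.description);
}
```

The nice property of modelling dependencies as edges is that "what should the agent do next" becomes a query rather than something the coordinator has to remember across context resets.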
-
codex.md is read by Codex and sent with every request as context. I am personally seeing some gains from using that context space wisely during my own development process.
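As one illustration of "using that context space wisely" (this is just an example, not an official template): since the file is sent with every request, it pays to keep it short and limited to durable rules rather than chat history or task notes. The conventions below are made up for the sake of the example.

```markdown
# codex.md — sent with every request, so keep it small and durable

## Conventions
- Rust 2021; run `cargo fmt` and `cargo clippy` before committing

## Test setup
- `cargo test --workspace`; integration tests live in `tests/`

## Anti-patterns
- No `unwrap()` outside tests
```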
-
The pattern most people I know use today is exactly what you're describing. I just filed #20138 proposing an in-app version of this for the desktop variant: a notes panel with a markdown surface that auto-injects into the agent's system context every turn from durable host state, so it survives compaction by construction. Two regions: shared (read/write by both the user and the agent; replaces the manual-paste loop) and private (UI-only, never sent to the model; your own scratchpad). Cross-linking in case it's relevant to anyone here.
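The two-region split above can be sketched in a few lines. This is only an illustration of the invariant (the private region never reaches the model, and the shared region is rebuilt from durable state every turn), not the actual #20138 design; `NotesPanel` and `system_context` are hypothetical names.

```rust
// Hypothetical two-region notes panel: only the shared region is ever
// injected into the model's context; the private region stays host-side.
struct NotesPanel {
    shared: String,  // read/write by user and agent; sent every turn
    private: String, // UI-only scratchpad; never sent to the model
}

impl NotesPanel {
    /// Build the system-context fragment injected each turn. Because it is
    /// rebuilt from durable host state, it survives compaction by construction.
    fn system_context(&self) -> String {
        format!("## Shared notes\n{}", self.shared)
    }
}

fn main() {
    let panel = NotesPanel {
        shared: "Current task: migrate storage layer".to_string(),
        private: "note to self: review the open PR later".to_string(),
    };
    let ctx = panel.system_context();
    assert!(ctx.contains(&panel.shared));
    assert!(!ctx.contains(&panel.private));
    println!("{}", ctx);
}
```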
-
The AGENTS.md / codex.md file (mentioned above) is the key layer here. Beyond progress notes, it is worth distinguishing two types of persistent content:

- **Type 1: Project rules (behavioral)** — conventions, anti-patterns, test setup. These do not change between sessions and should live permanently in AGENTS.md. They are "always-loaded" context that tells the AI how to behave in this project regardless of task.
- **Type 2: Task state (progress)** — current task, what was done, what is next. This is the planning.md use case: reset and rebuilt between sessions.

The confusion I see most often: teams put Type 2 content (session progress) in the same file as Type 1 content (rules), which degrades both. Rules get stale as they accumulate task-specific notes, and task state gets noisy as the rules grow.

What works: the codex.md approach mentioned above handles the rules layer. For what goes in that file: https://gist.github.com/oliviacraft — free per-stack starters showing which rules produce the most reliable output.
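For the Type 2 layer, a planning.md that gets reset between sessions might look like the following; the section names are just one possible layout, not a prescribed format:

```markdown
# planning.md — session state (reset and rebuilt between sessions)

## Current task
Migrate task tracking from flat files to the database layer.

## Done
- Defined the task schema
- Wrote the migration script

## Next
- Backfill existing tasks
- Delete the legacy files once verified
```

Because this file is disposable, the agent can rewrite it freely at each stopping point without polluting the permanent rules file.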
-
Because LLM models have a 200,000-token context window, developers often need to reset the context window during long development sessions. It is therefore convenient to have the AI document its own progress and later pick up from the last stopping point after the context window is cleared.
Is there some kind of ideal known strategy to optimize performance for this? Can we tune the internal prompts that codex uses to facilitate these operations?