feat(check-inbox): add watermark to prevent stale message re-delivery#171
feat(check-inbox): add watermark to prevent stale message re-delivery#171tayhiga-prog wants to merge 2 commits into
Conversation
fujibee
left a comment
There was a problem hiding this comment.
Nice change — aligning turn-mode check-inbox.sh with the watermark model watch.sh already uses is the right call, and the implementation is careful: integer validation on every read, the DB-shrink clamp, and the single LOOP_WM snapshot reused for the SELECT, the mark-read UPDATE, and the post-loop advance. One thing worth changing before merge, plus two notes.
1. Seed the watermark per-agent, not globally (can drop this agent's mail).
The initializer uses the global max:
WATERMARK="$(agmsg_sqlite "$DB" "SELECT COALESCE(MAX(id), 0) FROM messages;" ...)"but watch.sh scopes its watermark to the subscription (WHERE $WHERE_PAIRS), and the post-loop advance here already scopes to WHERE to_agent='$AGENT'. With the global MAX(id), if another agent has posted newer messages, this agent's first check seeds its watermark above its own unread, and those messages are then skipped forever (the mark-read UPDATE is also gated on id > LOOP_WM, so they never clear either). Scoping the seed to WHERE to_agent='$AGENT' matches both watch.sh and your own advance step.
2. Windows: the watermark silently no-ops under Git Bash.
agmsg_sqlite (the on-disk store wrapper) doesn't strip the trailing CR that sqlite3.exe emits on Windows, so MAX(id) comes back as "<n>\r", the *[!0-9]* guard rejects it, and the watermark resets to 0 → every check delivers everything. Not a crash, just inert on Windows. A tr -d '\r' on the integer reads (or routing them through a CR-safe read) would let it engage there too. Context: #180 added CR-stripping for the in-memory path but not this store wrapper.
3. Behaviour note (doc-only).
After this lands, the first turn-mode check seeds the watermark and delivers nothing — intended, and consistent with monitor mode, but it's a change from "first check drains the backlog." Worth a line in the PR body / docs so a user who sends a message and then starts an agent isn't surprised by the empty first turn.
Happy to re-review once (1) is in. Thanks for tightening this up! 🙏
|
@tayhiga-prog — CI's in now, and it surfaces exactly the seeding concern from point (1): the required
All three send a message and then expect the next This is the crux:
Either way, scoping to |
Address review feedback from fujibee on PR fujibee#171: 1. Seed the watermark from the latest *read* message for this agent (WHERE to_agent='$AGENT' AND read_at IS NOT NULL) instead of the global MAX(id). This ensures the first turn-mode check drains current unread messages, fixing the three CI test failures where a just-sent message was skipped. The safety-valve clamp is also scoped to to_agent. 2. Add `| tr -d '\r'` to agmsg_sqlite() (on-disk wrapper), mirroring agmsg_sqlite_mem(). On Windows, sqlite3.exe emits \r\n; without stripping, the watermark integer validation rejects the value and the watermark silently resets to 0 on every check. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
e216e2a to
796f79d
Compare
|
Superseded by #199, which rebases this onto current main with the commits and authorship preserved — the watermark review is happening there. Closing in favor of #199. Thanks @tayhiga-prog. |
Summary
Problem
check-inbox.sh filters unread messages with WHERE read_at IS NULL and no lower-bound on id. For agents using turn delivery mode (Antigravity / Gemini PostToolUse, Codex Stop hook), the hook fires only when the agent is active. Any messages sent while the agent was offline accumulate in the database with read_at IS NULL. When the agent next becomes active and the hook fires, all accumulated messages are delivered at once, no matter how old they are.
This means an agent can receive and act on instructions that were relevant to a session that ended hours or days ago, leading to unintended behaviour.
watch.sh (used in monitor mode) already avoids this by setting a watermark to MAX(id) at startup and only streaming messages with id > watermark. check-inbox.sh had no equivalent protection.
Solution
Add a per-agent watermark persisted at SKILL_DIR/run/check-inbox.agent.watermark:
MAX(id) WHERE to_agent='$AGENT' AND read_at IS NOT NULL— the latest already-read message for this agent. This ensures that current unread messages are delivered on the first check, unlike watch.sh which seeds atMAX(id)(monitor mode only cares about post-attach messages). This is an intentional divergence: turn-mode's first check legitimately needs to drain pending unread.MAX(id) WHERE to_agent='$AGENT'(DB wiped and recreated), reset to that agent-scoped max to prevent permanent message silence. Both the seed and the safety valve are scoped toto_agentto avoid being influenced by other agents' message ids.agmsg_sqlite()(the on-disk DB wrapper) now pipes throughtr -d '\r', mirroringagmsg_sqlite_mem(). On Windows,sqlite3.exeemits\r\n; without stripping, the watermark integer validation rejects the value and silently resets to 0, making the watermark inert.No other files are modified. Existing cooldown, actas-lock, and monitor-deferral logic is unchanged.
Behaviour change
The first turn-mode check now delivers current unread messages (seeded from the latest read id, not the global max). This differs from the original implementation which seeded at
MAX(id)and delivered nothing on the first check. The new behaviour matches what users and the existing CI tests expect: send a message, start the agent, first check delivers it.Testing
Basic replay prevention
Multi-team correctness
Safety valve (DB recreation)
Per-agent isolation