v0.2.5821: 7-issue bundle (#501, #502, #503, #504, #505, #506, #507, #508) #509
- npm/: codedeebee npm package (codedb's npx-friendly sibling)
  - postinstall downloads matching native binary from GH release
    and verifies against checksums.sha256
  - thin spawnSync launcher preserves cwd/stdio/args/env
  - published at [email protected] (closes #501)
- README.md:
  - new "Or via npm/npx" install section
  - updated "16 MCP tools" → "21 MCP tools" everywhere
  - rewrote tools table: dropped codedb_bundle (no longer registered),
    added codedb_callers / codedb_context / codedb_find / codedb_glob
    / codedb_ls / codedb_query; disambiguated codedb_find vs codedb_symbol
  - added `codedb read <path>` to CLI table
  - removed v0.2.579 hotfix section (long obsolete)
- install/install.sh merge_hook: when a competing legacy-tools hook
  (block-legacy-tools.sh / muonry / zigrep / zigread) is already
  registered for the same event/matcher, insert codedb's entry at
  index 0 instead of appending. Reshuffles existing installs too.
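The merge_hook priority rule can be sketched roughly as follows. This is an illustration in Python, not the shell code from install/install.sh; the hook shape (a list of `{"command": ...}` dicts for one event/matcher) and the name `codedb-hook` are assumptions made for the example.

```python
# Substrings that identify a competing legacy-tools hook (names from the commit message).
LEGACY_MARKERS = ("block-legacy-tools.sh", "muonry", "zigrep", "zigread")


def merge_hook(hooks, codedb_entry):
    """Return a new hook list for one event/matcher with codedb_entry present once.

    `hooks` is a list of {"command": str} dicts. This layout is hypothetical;
    the real settings file may be shaped differently.
    """
    cmd = codedb_entry["command"]
    others = [h for h in hooks if h["command"] != cmd]
    already_present = len(others) != len(hooks)
    has_legacy = any(m in h["command"] for h in others for m in LEGACY_MARKERS)

    if has_legacy:
        # A competing hook exists: codedb goes to index 0, which also
        # reshuffles an install where codedb had previously been appended.
        return [codedb_entry] + others
    if already_present:
        return list(hooks)  # nothing to do
    return list(hooks) + [codedb_entry]  # no competitor: plain append
```

Because the existing codedb entry is dropped before re-inserting, running the merge twice gives the same result as running it once.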
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
#503: `codedb mcp <path>` hung on loading_snapshot because
isCommand("mcp") matched at main.zig:150, the parser set root=".",
dropped /path, and entered deferred mode waiting for an MCP roots/list
that a shell user never sends.
#502: `codedb mcp --help` started the MCP server instead of printing
help: the same isCommand branch silently consumed --help as a
command-arg.
Both are fixed by factoring positional parsing into
pub fn parsePositional() with two new special cases when args[1]=="mcp":
- args[2] in {--help,-h,help} → cmd="--help"
- args[2] looks like a path (not -flag) → root=args[2], root_is_explicit=true
#507: after a snapshot rebuild, search returned 0 results for substrings
demonstrably present in files that `tree` and `read` both surfaced.
Root cause: Explorer.commitParsedFileOwnedOutline with full_index=false
(snapshot.zig outline-only fallback + watcher.zig incremental
indexFileOutline + WASM fast-path) registered files in `outlines` and
`contents` but NOT in trigram_index, word_index, OR skip_trigram_files.
Search then missed them at every tier:
• tier 1 (trigram candidates): file not in trigram_index
• tier 3 (skip_trigram_files scan): file not in this set
• tier 5 (full outline scan): short-circuited by trigram_ruled_out
Fix: in the !full_index branch, also do skip_trigram_files.put(), so
tier 3 substring-scans these files via searchInContent.
Tests in test_mcp.zig (all fail on parent commit):
issue-503: parsePositional treats `codedb mcp <path>` as path-as-root
issue-503: `codedb <path> mcp` still works (original order)
issue-503: `codedb mcp` alone keeps cwd-as-root deferred behavior
issue-502: `codedb mcp --help` rewrites to --help, does not start server
issue-502: `codedb mcp -h` rewrites to --help
parsePositional: existing commands still parse correctly (regression)
issue-507: indexFileOutlineOnly files remain searchable via tier 3
Full suite: 514/514 across all 7 test binaries.
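The two `mcp` special cases for #502/#503 can be illustrated with a small Python model. It is only a sketch of the described behaviour: the tuple return shape and the `"usage"` fallback are assumptions, and the real parsePositional in Zig handles many more commands.

```python
HELP_WORDS = {"--help", "-h", "help"}


def parse_positional(args):
    """Model of the `mcp` cases only. args[0] is the program name.

    Returns (cmd, root, root_is_explicit); this return shape is hypothetical.
    """
    if len(args) >= 2 and args[1] == "mcp":
        if len(args) >= 3 and args[2] in HELP_WORDS:
            return ("--help", ".", False)      # #502: print help, do not start the server
        if len(args) >= 3 and not args[2].startswith("-"):
            return ("mcp", args[2], True)      # #503: `codedb mcp <path>` uses path as root
        return ("mcp", ".", False)             # bare `codedb mcp`: deferred cwd-as-root
    if len(args) >= 3 and args[2] == "mcp":
        return ("mcp", args[1], True)          # original order: `codedb <path> mcp`
    return ("usage", ".", False)               # everything else is out of scope here
```

The help check runs before the path check so that the bare word `help`, which does not start with `-`, is not mistaken for a root path.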
Closes #502 (partial: reject-unknown-flags, git-root detection, scan-stuck recovery still open), closes #503, closes #507.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
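To make the #507 root cause concrete, here is a reduced Python model of the tiered search with only tiers 1 and 3. The data structures (`contents` as a dict, `trigram_index` as trigram → set of paths, `skip_trigram_files` as a set) and function names are assumptions for illustration; the real Explorer has more tiers and indexes.

```python
def search(query, contents, trigram_index, skip_trigram_files):
    """Return sorted paths whose text contains `query` (tiers 1 and 3 only)."""
    hits = set()

    # Tier 1: intersect trigram posting lists, then confirm with a substring check.
    trigrams = [query[i:i + 3] for i in range(len(query) - 2)]
    if trigrams:
        candidates = set.intersection(*[trigram_index.get(t, set()) for t in trigrams])
        hits |= {p for p in candidates if query in contents[p]}

    # Tier 3: files that were never trigram-indexed are substring-scanned directly.
    hits |= {p for p in skip_trigram_files if query in contents.get(p, "")}
    return sorted(hits)


def commit_outline_only(path, text, contents, skip_trigram_files):
    """Model of the !full_index branch after the fix."""
    contents[path] = text
    skip_trigram_files.add(path)  # the fix: make the file visible to tier 3
```

A file that is stored in `contents` but recorded in neither index is invisible to both tiers, which is the pre-fix behaviour the commit describes.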
Before, `codedb mcp --snapshot` silently swallowed the unknown flag
and started the MCP server with surprising state. mainImpl now
whitelists post-`mcp` flags via isValidMcpFlag and exits 1 with a
listed-valid-flags error message on the first unrecognised flag.
Edge cases:
• `--help`/`-h`/`help` anywhere after `mcp` short-circuits to
printUsage + exit 0 (parsePositional only catches them when
they sit immediately after `mcp`, so combos like
`mcp --no-telemetry --help` need their own bypass).
• `--config-file=<path>` is stripped before positional parsing
and never reaches this whitelist.
• `--no-telemetry` stays accepted; existing behaviour preserved.
Test in test_mcp.zig: issue-502: isValidMcpFlag whitelist rejects
unknown flags (fails on parent commit). Full test-mcp: 88/88 pass.
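The flag check described above can be sketched in Python as follows. The whitelist here contains only `--no-telemetry`, the one accepted flag the commit message names; the real isValidMcpFlag list may be longer. The `(exit_code, message)` return shape is an assumption, while the error text follows the "unknown flag for mcp" wording from the test plan.

```python
VALID_MCP_FLAGS = {"--no-telemetry"}   # assumed subset of the real whitelist
HELP_TOKENS = {"--help", "-h", "help"}


def check_mcp_args(rest):
    """Validate the arguments that follow `mcp`.

    Returns (exit_code, message); exit_code None means "start the server".
    """
    # --config-file=<path> is stripped before validation and never reaches the whitelist.
    rest = [a for a in rest if not a.startswith("--config-file=")]

    # Help anywhere after `mcp` wins, e.g. `mcp --no-telemetry --help`.
    if any(a in HELP_TOKENS for a in rest):
        return (0, "usage")

    for a in rest:
        if a.startswith("-") and a not in VALID_MCP_FLAGS:
            valid = ", ".join(sorted(VALID_MCP_FLAGS))
            return (1, f"unknown flag for mcp: {a} (valid flags: {valid})")
    return (None, "")
```

Only the first unrecognised flag is reported, matching the "exits 1 on the first unrecognised flag" behaviour.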
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
When `codedb mcp` is launched from a subdirectory of a git repo
(typical pattern for editor MCP clients spawning from the buffer's
directory), walk up from cwd to the nearest `.git` and use that as
the indexed root. Without this, opencode/Zed/etc were silently
indexing only the subdir the user happened to be in.
• Pin order: CODEDB_ROOT env var > positional `<path>` arg >
findGitRoot > cwd-deferred mode.
• Only triggers in deferred mode (root==".", !root_is_explicit)
— explicit paths and the `${workspaceFolder}` shim are left
untouched.
• `.git` may be a dir (normal repo) or a file (git worktree);
statFile covers both.
• Walks until the first `/<dir>` segment, then bails — does not
treat `/` itself as a project root.
Factored as findGitRoot(io, buf) + findGitRootFrom(io, buf, len)
so tests can hand in synthetic absolute paths without chdir'ing
the process.
Tests in test_mcp.zig: walks-up case + null case (both fail on
parent commit; both pass after).
Full test-mcp: 90/90 pass.
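The walk-up and the priority order can be modelled in Python like this. It is a sketch of the behaviour described above, not the Zig findGitRootFrom(io, buf, len); `resolve_root` and its parameters are hypothetical names, and `"."` stands in for cwd-deferred mode.

```python
import os


def find_git_root_from(start):
    """Walk up from `start` to the nearest directory containing `.git`.

    `.git` may be a directory (normal repo) or a file (git worktree).
    The filesystem root itself is never returned as a project root.
    """
    path = os.path.abspath(start)
    while True:
        parent = os.path.dirname(path)
        if parent == path:          # reached the filesystem root: bail
            return None
        if os.path.lexists(os.path.join(path, ".git")):
            return path
        path = parent


def resolve_root(env, positional, cwd):
    """Priority: CODEDB_ROOT env var > positional path > git root > deferred."""
    if env.get("CODEDB_ROOT"):
        return env["CODEDB_ROOT"]
    if positional:
        return positional
    return find_git_root_from(cwd) or "."
```

Taking the starting path as a parameter mirrors the commit's reason for splitting out findGitRootFrom: tests can pass synthetic paths without changing the process working directory.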
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
watcherDeferredLoop polled scan_done indefinitely. After the 3s
fallback fired, triggerDeferredScanWithFallback could still return
false (e.g. fallback_cwd failed root_policy.isIndexableRoot — `/`,
`/tmp`, or any other denied path). That left `triggered=false`,
`scan_done=false`, and the loop spinning forever, which the user
saw as a permanent scan=loading_snapshot with files=0.
Now after a give_up_after_ms (13s total) without a successful
trigger, the loop:
• logs a warn-level message pointing the user at CODEDB_ROOT
or the `codedb <path> mcp` invocation,
• flips scan_done so MCP tool calls stop replying with the
"still loading" hint and return empty results cleanly,
• skips the post-loop incrementalLoop call (resolved_root is
empty in the give-up path).
The 3s pre-fallback path is unchanged for the happy case where
fallback_cwd is indexable — only the previously-unreachable hang
is fixed.
Full test-mcp: 90/90.
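The give-up behaviour can be sketched with a bounded polling loop. This Python model is a simplification: it treats a successful trigger as the end of the loop, injects `try_trigger`, `sleep`, and `warn` as callables so it can be tested without real delays, and uses a hypothetical `poll_ms`; only the 13s total comes from the commit message.

```python
def watcher_deferred_loop(try_trigger, sleep, warn, give_up_after_ms=13_000, poll_ms=500):
    """Poll until a scan is triggered or the give-up deadline passes.

    Returns a dict with `scan_done` and `run_incremental`. The result shape
    and parameter names are assumptions made for this sketch.
    """
    waited_ms = 0
    while waited_ms < give_up_after_ms:
        if try_trigger():
            # Happy path: a root was accepted, so incremental watching can follow.
            return {"scan_done": True, "run_incremental": True}
        sleep(poll_ms)
        waited_ms += poll_ms

    # Give-up path: no indexable root was ever found.
    warn("no indexable root; set CODEDB_ROOT or run `codedb <path> mcp`")
    # scan_done is flipped so tool calls return empty results instead of
    # "still loading"; the incremental loop is skipped because there is no root.
    return {"scan_done": True, "run_incremental": False}
```

The key property is that the loop terminates in both cases, so a denied fallback root can no longer leave the server stuck in loading_snapshot.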
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
The remote tool previously surfaced raw HTTP status + Cloudflare
body to the caller — "api.wiki.codes HTTP 530 for X/Y — error
code: 1033" — with no remediation guidance. Agents treated this
as an opaque failure and either retried in tight loops or bailed.
appendRemoteErrorHint distinguishes the common upstream cases:
530 + "error code: 1033/1034" / "Argo Tunnel error"
→ "origin unreachable, service temporarily down, retry in a
few minutes or fall back to local `codedb_index`"
530 (plain)
→ "retry in a few minutes; if it persists, repo may not be
indexed"
404
→ "repo or path not indexed; verify slug or clone + index
locally"
429
→ "rate limited; wait and retry, or batch fewer requests"
500/502/503
→ "upstream server error, retry"
504
→ "gateway timeout; wiki may still be indexing this repo"
The hint is appended after the existing status line so consumers
that parsed the old format still work; the new line is purely
additive and human-/agent-readable.
Note: the underlying server outage (api.wiki.codes returning 530
for several public repos as of 2026-05-28) is server-side and
not fixable client-side. This change makes the failure mode
actionable but does not restore the service.
Tests in test_mcp.zig cover Cloudflare 530 vs plain 530, 404,
429, and 200 (no-op). Full test-mcp: 91/91.
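The status-to-hint mapping above can be written as a small lookup. This Python sketch reuses the hint wording from the commit message; the function names and the `hint: ` prefix on the appended line are assumptions, since the exact output format is not shown here.

```python
CLOUDFLARE_MARKERS = ("error code: 1033", "error code: 1034", "Argo Tunnel error")

HINTS = {
    404: "repo or path not indexed; verify slug or clone + index locally",
    429: "rate limited; wait and retry, or batch fewer requests",
    500: "upstream server error, retry",
    502: "upstream server error, retry",
    503: "upstream server error, retry",
    504: "gateway timeout; wiki may still be indexing this repo",
}


def remote_error_hint(status, body):
    """Return a remediation hint for an upstream failure, or "" if none applies."""
    if status == 530:
        if any(marker in body for marker in CLOUDFLARE_MARKERS):
            return ("origin unreachable, service temporarily down, retry in a "
                    "few minutes or fall back to local `codedb_index`")
        return "retry in a few minutes; if it persists, repo may not be indexed"
    return HINTS.get(status, "")


def append_remote_error_hint(status_line, status, body):
    """Append the hint on a new line so the original status line stays parseable."""
    hint = remote_error_hint(status, body)
    return status_line if not hint else f"{status_line}\nhint: {hint}"
```

Keeping the original status line as the first line is what makes the change additive for consumers that parsed the old format.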
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
codedb previously hardcoded a "2025-06-18" protocolVersion in the
initialize reply, regardless of what the client sent. Older Zed builds
and certain opencode versions reject a server reply whose
protocolVersion they don't recognize, which shows up as a startup
timeout (#506) or as "No MCP tools" (#505) when the client gives up on
the handshake.
handleInitialize now extracts params.protocolVersion and feeds it
through negotiateProtocolVersion:
• Echo the client's version if it's one we've verified against
(2024-11-05, 2025-03-26, 2025-06-18).
• If the client sent something newer than our latest known version,
reply with our latest (forward-compatibility hint to the client that
we're as new as we can be).
• If the client sent something ancient (lex-orders below our oldest
known), reply with our oldest known so older clients still get a
shape they recognise; client decides if it can proceed.
• Empty/missing → fall back to the default ("2025-06-18").
Tests in test_mcp.zig cover all four branches.
E2E:
client 2024-11-05 → server 2024-11-05 ✓
client 2025-03-26 → server 2025-03-26 ✓
client 2025-06-18 → server 2025-06-18 ✓
Full test-mcp: 95/95.
Closes #505 (opencode), closes #506 (Zed).
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
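The four negotiation branches can be sketched in Python. The version strings are the ones listed in the commit message; the handling of an unrecognised version that falls between two known ones is not described there, so the final fallback below is an assumption.

```python
KNOWN_VERSIONS = ("2024-11-05", "2025-03-26", "2025-06-18")  # ascending order
DEFAULT_VERSION = "2025-06-18"


def negotiate_protocol_version(client_version):
    """Pick the protocolVersion to send back in the initialize reply."""
    if not client_version:
        return DEFAULT_VERSION               # empty or missing
    if client_version in KNOWN_VERSIONS:
        return client_version                # echo a verified version
    # ISO dates compare correctly as plain strings (lexicographic order).
    if client_version > KNOWN_VERSIONS[-1]:
        return KNOWN_VERSIONS[-1]            # newer than we know: offer our latest
    if client_version < KNOWN_VERSIONS[0]:
        return KNOWN_VERSIONS[0]             # older than we know: offer our oldest
    return DEFAULT_VERSION                   # assumed: unknown in-between version
```

For example, a client that sends `2099-01-01` gets `2025-06-18` back, matching the test plan in the PR description.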
A macOS Intel x64 user reported a segfault on plain `codedb` with
no args. We can't reproduce on arm64, so the fix is to reduce
blast radius: handle the three most common no-real-work
invocations (no args, --version/-v/version) directly in `pub fn
main` using raw c_write to stdout/stderr, before any of the
heavier startup machinery runs (worker-thread spawn, io-Threaded
init, c_allocator + global state setup).
If the underlying bug lives somewhere in that machinery, this
short-circuit means users with broken environments can still:
• run `codedb` and get a usage message
• run `codedb --version` and confirm install
• run `codedb --help` (still goes through the full path
because help formatting needs styling; intentional)
Worst case for the fast paths is uncoloured output — kept minimal
on purpose. The fast path is a no-op for every other invocation
(returns false → continues to the existing thread trampoline).
E2E:
codedb → usage to stderr, exit 1 ✓
codedb --version → "codedb 0.2.5821" to stdout, exit 0 ✓
codedb tree, mcp, … → unchanged (full path)
Full test-mcp: 95/95.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Verified via Rosetta on Apple Silicon: the previous "segfault on
bare codedb" was conflated with a separate, universal bug — Out's
flush() runs from a deferred cleanup, but std.process.exit()
skips deferred cleanup. Every error / usage path that did
out.p("...message...", .{...});
std.process.exit(1);
was silently dropping the message and just returning exit 1. The
user's Intel Mac may also have a real segfault deeper in startup,
but the *user-visible* symptom ("nothing prints, just dies") is
this bug and it reproduces on arm64 too.
Add Out.exitWithFlush(code) noreturn that flushes then exits, and
convert the three early-exit sites that fire BEFORE the heavier
init (the only paths a freshly-installed binary will hit on
first invocation):
• parsePositional usage_exit — `codedb foo` now prints usage
• cannot-resolve-root — bad path now shows the error
• refusing-to-index — denied root now shows the error
The fast-path commit (22959b8) still catches bare codedb /
--version on the main thread before any of this runs, so even if
Rosetta or the user's environment has a problem in the worker
thread itself, those two commands keep working. With both fixes
in place, the user from #504 should now at least see a usage
message instead of a silent failure or segfault.
E2E (arm64 + x86_64 under Rosetta):
codedb → fast-path: usage to stderr, exit 1 ✓
codedb foo → mainImpl: usage to stdout, exit 1 ✓
codedb /bogus mcp → mainImpl: "cannot resolve root", exit 1 ✓
codedb --version → fast-path: "codedb 0.2.5820", exit 0 ✓
Full test suite: 635/635.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
x86_64-macos at startup
Reproduced via Rosetta on Apple Silicon (`arch -x86_64`) with the
ad-hoc-signed release-build binary. SIGSEGV (exit 139) before any
user code runs. Bisected against a minimal program:
pub fn main(init) void → works
pub fn main(init) !void → SIGSEGV under adhoc + Rosetta
The crash is in the Zig 0.16 runtime wrapper around an error-union
main, not in our code. The wrapper expands argv, allocates the error
return slot, and calls user main; something in that sequence trips a
startup-path bug specific to x86_64-macos when the binary is signed.
Same crash behaviour also fires for binaries that spawn a thread
before writing anything, but the underlying trigger is the same
wrapper.
Fix: make `pub fn main(init) void` infallible. Move the original
fallible body (thread spawn + join + error propagation) into
`mainTrampoline() !void` which the new main calls via catch. On the
catch arm we write a minimal "fatal startup error: <name>" message to
stderr via std.c.write so the user sees *something* even if the
trampoline itself crashes (though now it shouldn't, because the
runtime wrapper for `!void` is the actual broken thing).
Verified end-to-end with ad-hoc-signed x86_64-macos binary under
Rosetta:
codedb → usage to stderr, exit 1 ✓
codedb --version → "codedb 0.2.5820", exit 0 ✓
codedb foo → usage from mainImpl, exit 1 ✓
codedb tree → loaded snapshot, 284 files, real output ✓
codedb mcp → starts, "stdin closed, exiting" ✓
All previously: SIGSEGV / silent. arm64 native unchanged.
Full test suite: 635/635.
Note: the user's reported segfault was on a native macOS Intel Mac
with a Dev-cert-signed binary, not Rosetta with ad-hoc. The trigger
(`!void` runtime wrapper) is the same regardless of sign type, so
this should resolve the user's case too, but they need to reinstall +
retest to confirm.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Benchmark Regression Report
Thresholds: 10.00% and 50,000 ns absolute delta
Thanks for the #504 fix in this PR. I tested the published

Environment:

Repro:

    curl -fsSLO https://github.com/justrach/codedb/releases/download/v0.2.5821/codedb-darwin-x86_64
    chmod +x codedb-darwin-x86_64
    ./codedb-darwin-x86_64 --help
    echo $?
    # 139

This is also the same binary installed by

Crash report summary:

One useful data point: building current

    git clone https://github.com/justrach/codedb.git
    cd codedb
    zig build -Doptimize=ReleaseSafe
    ./zig-out/bin/codedb --help
    # exit 0

So the source-level fix appears to be present on

I also added this reproduction info to #504: #504 (comment)
Follow-up with a smaller repro and a likely direction for the fix. I was able to reduce this outside of

Environment:

Minimal repro:

    const std = @import("std");
    pub fn main() void {
        const msg = "hello\n";
        _ = std.c.write(1, msg.ptr, msg.len);
    }

Commands:

    zig build-exe hello.zig -O ReleaseFast -target x86_64-macos -lc -femit-bin=hello
    ./hello
    echo $?
    codesign -f -s - hello
    ./hello
    echo $?

Observed:

So I also tried these variants, and all signed Zig binaries still crashed:

    -fllvm
    -flibllvm
    -fno-lld
    -funwind-tables
    -fno-unwind-tables
    -target x86_64-macos.10.15
    -target x86_64-macos.11.0
    -target x86_64-macos.13.0
    -target x86_64-macos.26.2

A C control binary signed with the same

    #include <stdio.h>
    int main() { puts("hello-c"); return 0; }

    cc hello.c -o hello-c
    codesign -f -s - hello-c
    ./hello-c
    # hello-c
    # exit 0

This suggests the

I also built

    zig build -Doptimize=ReleaseFast -Dtarget=x86_64-macos -Dcodesign-identity=-
    codesign -dv --verbose=2 zig-out/bin/codedb
    # code object is not signed at all
    ./zig-out/bin/codedb --help
    # exit 0
    ./zig-out/bin/codedb --version
    # exit 0
    # codedb 0.2.5821

MCP smoke test also worked with that unsigned binary:

The workaround patch I tested was basically:

This is not ideal for macOS distribution, but it may be a practical short-term workaround for the npm/MCP path until the Zig/Mach-O signing issue is understood. I tried to file this minimal repro upstream at
Summary
Bundle of 7 fixes from the 2026-05-28 open-issue triage. All E2E verified, all 635/635 tests pass.
Issues fixed
- #501: `codedeebee` on npm; postinstall fetches matching native binary + sha256 verify; thin spawnSync launcher
- #502: `isValidMcpFlag` whitelist; `findGitRoot` walks up from cwd; deferred-loop gives up after 13s
- #503: `codedb mcp <path>` hangs; the `mcp <path>` arg order is now accepted
- #504: `!void` main runtime wrapper crashes on signed x86_64-macos. Fix: `pub fn main(...) void` + `mainTrampoline()`. Also adds `handleFastPath` for bare/`--version` invocations + `Out.exitWithFlush` so error messages survive early-exit paths
- #505, #506: protocolVersion is negotiated from the client's request instead of always replying `2025-06-18`
- #507: outline-only files are added to `skip_trigram_files` so tier 3 substring-scans them
- #508: `appendRemoteErrorHint` distinguishes Cloudflare 1033 / 404 / 429 / 5xx with actionable hints (server-side outage at api.wiki.codes is unchanged; UX-only)
Distribution / installer (bonus)
- `npm/codedeebee` package shipped (closes #501, "feat: add npm/npx distribution for codedb MCP usage")
- `install/install.sh`: hook-priority race fixed. When a competing legacy-tools hook is already registered for the same event/matcher, codedb's hook is inserted at index 0 instead of appended
- `README.md` refresh: "Or via npm/npx" install section, 16 → 21 MCP tools everywhere, dropped removed tools, added missing tools, removed obsolete v0.2.579 hotfix block, added `codedb read` to CLI table
Test plan
- `zig build test --summary all` → 635/635 pass across all 8 test binaries
- `codedb mcp /tmp/nonexistent` → "cannot resolve root", exit 1 (was: hang forever)
- `codedb mcp --help` → usage to stdout, exit 0 (was: started MCP server)
- `codedb mcp --bogus` → "unknown flag for mcp: --bogus", exit 1 (was: silently ignored)
- `codedb` bare on arm64 + x86_64 via Rosetta → usage to stderr, exit 1 (was: SIGSEGV on x86_64)
- `codedb --version` on arm64 + Rosetta → "codedb 0.2.5821", exit 0
- `initialize` handshake: client 2024-11-05 → server echoes 2024-11-05; 2025-03-26 → echoed; 2025-06-18 → echoed; 2099-01-01 → falls back to 2025-06-18
- `test issue-507` passes (outline-only files now search-visible)
- `test issue-508` passes (Cloudflare/404/429/5xx hints)