refactor!: rewrite ck-cli in TypeScript on Bun #15

Open
AnnatarHe wants to merge 1 commit into master from claude/quirky-edison-2JQU9

@AnnatarHe

Member

Summary

Full rewrite of ck-cli from Go to TypeScript on the Bun runtime.

The parser produces byte-identical output to the existing Go binary across every fixture in tests/fixtures/clippings_*.txt — verified by tests/fixtures.test.ts (en, zh, other, rare, ric, ~17K lines of oracle JSON).

Stack

| Concern | Choice |
| --- | --- |
| Runtime | Bun (1.3+) |
| Language | TypeScript strict (ES2022) |
| Arg parser | cac |
| UI / status / progress | ink (rendered to stderr only, stdout reserved for JSON) |
| TOML | smol-toml |
| GraphQL | native fetch |
| Tests | bun:test |
| Linter | oxlint |
| Formatter | oxfmt |
| Compile / bundle | bun build --compile --target=bun-<os>-<arch> |
| macOS signing | Anchore Quill (existing QUILL_* secrets reused) |
| Versioning | release-please (release-type switched from go to node) |

Two intentional behavior changes vs. the Go binary

  1. parse --output http now actually uploads to GraphQL. The Go syncToServer had a TODO and just printed success without invoking the (fully-implemented) HTTP client.
  2. parse with no --input while stdin is a TTY now prints help and exits 1 instead of hanging on a stdin read.

Everything else is preserved: command surface, flag names, config file format (~/.ck-cli.toml), GraphQL mutation shape, chunk size (20), concurrency cap (10), 30s request timeout, RFC3339 date format, X-CLI <token> auth header.
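
To illustrate the preserved upload limits, here is a minimal sketch of chunking plus a concurrency cap. Only the numbers 20 and 10 come from this description; the names `chunk` and `runLimited` are hypothetical and are not claimed to match src/http/client.ts or src/utils/semaphore.ts.

```typescript
// Hypothetical constants mirroring the limits stated above.
const CHUNK_SIZE = 20;
const MAX_CONCURRENCY = 10;

/** Split items into consecutive chunks of at most `size` elements. */
function chunk<T>(items: readonly T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

/**
 * Run `worker` over `inputs` with at most `limit` calls in flight.
 * Failures are collected and returned instead of aborting the run.
 */
async function runLimited<T>(
  inputs: readonly T[],
  limit: number,
  worker: (input: T) => Promise<void>,
): Promise<Error[]> {
  const errors: Error[] = [];
  let next = 0;

  // Each lane pulls the next unclaimed input until none remain.
  async function lane(): Promise<void> {
    while (next < inputs.length) {
      const input = inputs[next++] as T;
      try {
        await worker(input);
      } catch (e) {
        errors.push(e instanceof Error ? e : new Error(String(e)));
      }
    }
  }

  const lanes = Array.from({ length: Math.min(limit, inputs.length) }, () => lane());
  await Promise.all(lanes);
  return errors;
}
```

With these helpers, 45 clippings would be sent as three requests (20, 20, 5), for example via `runLimited(chunk(items, CHUNK_SIZE), MAX_CONCURRENCY, upload)` where `upload` is whatever function posts one chunk.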

Layout

src/
├── main.tsx              cac entrypoint, signal handling
├── version.ts            VERSION/COMMIT injected at build via --define
├── commands/{login,parse}.tsx
├── config/config.ts      TOML load/save (~/.ck-cli.toml)
├── http/client.ts        GraphQL POST + chunking + semaphore
├── models/clipping.ts    types + RFC3339 serialization
├── parser/parser.ts      detectLanguage, splitIntoGroups, parseGroup, date parsing
├── ui/{Status,SyncProgress}.tsx
└── utils/semaphore.ts    withConcurrency (no external dep)
tests/
├── parser.test.ts        ports of Go parser_test.go cases
├── fixtures.test.ts      byte-parity vs Go oracles
├── config.test.ts        TOML round-trip, Go-written file compat
├── client.test.ts        mocked fetch, chunking, error joining
└── fixtures/             5 *.txt inputs + 5 *.result.json oracles
scripts/build.ts          cross-compile orchestrator
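
As a rough illustration of the RFC3339 serialization mentioned for models/clipping.ts: Go's RFC3339 layout prints whole seconds with a `Z` suffix for UTC, while `Date.prototype.toISOString()` always includes milliseconds. The sketch below assumes timestamps are serialized in UTC at second precision; the function name `toRFC3339` is hypothetical and the real module may differ.

```typescript
/** Format a Date as RFC3339 in UTC with second precision, e.g. "2024-04-01T14:30:45Z". */
function toRFC3339(date: Date): string {
  // toISOString() yields "2024-04-01T14:30:45.000Z"; drop the millisecond part.
  return date.toISOString().replace(/\.\d{3}Z$/, "Z");
}

// Usage: serialize a clipping-like object for JSON output.
const example = { createdAt: toRFC3339(new Date(Date.UTC(2024, 3, 1, 14, 30, 45))) };
console.log(JSON.stringify(example));
```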

Verified locally

  • bun test — 29 tests pass (4 files, 52 assertions)
  • bun run lint — 0 errors
  • bun run format:check — clean
  • bun run typecheck — 0 errors
  • bun run build:all -- --archive — all 5 targets compile, archive, and produce a checksums.txt
  • Output of ./ck-cli parse --input tests/fixtures/clippings_*.txt is byte-identical to the Go binary's output on every fixture
  • ./ck-cli login round-trip writes a TOML in the exact same [http] / [http.headers] shape that the Go binary writes

Binary sizes (Bun embeds its runtime; expected to be ~20× larger than Go):

| Target | Size |
| --- | --- |
| linux-amd64 | 97 MB |
| linux-arm64 | 97 MB |
| darwin-amd64 | 66 MB |
| darwin-arm64 | 61 MB |
| windows-amd64 | 113 MB |

⚠️ Workflow files not included in this PR

The GitHub App used by Claude Code in this environment lacks the workflows permission, so .github/workflows/ci.yml and .github/workflows/release.yml could not be updated by this branch. They still reference the Go toolchain on master and will fail CI until updated manually.

Please apply the new workflow files below before merging (or grant the bot workflows permission and re-push).

Replacement for .github/workflows/ci.yml

name: CI

on:
  push:
    branches: [master, main]
  pull_request:
    branches: [master, main]

permissions:
  contents: read

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: oven-sh/setup-bun@v2
        with:
          bun-version: latest

      - name: Install dependencies
        run: bun install --frozen-lockfile

      - name: Lint
        run: bun run lint

      - name: Format check
        run: bun run format:check

      - name: Typecheck
        run: bun run typecheck

      - name: Test (with coverage)
        run: bun test --coverage --coverage-reporter=lcov --coverage-dir=coverage

      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v5
        with:
          token: ${{ secrets.CODECOV_CLI_TOKEN }}
          files: coverage/lcov.info
        if: env.CODECOV_CLI_TOKEN != ''
        env:
          CODECOV_CLI_TOKEN: ${{ secrets.CODECOV_CLI_TOKEN }}

Replacement for .github/workflows/release.yml

name: Release

on:
  push:
    branches:
      - master

permissions:
  contents: write
  pull-requests: write
  issues: write

jobs:
  release-please:
    runs-on: ubuntu-latest
    outputs:
      release_created: ${{ steps.release.outputs.release_created }}
      tag_name: ${{ steps.release.outputs.tag_name }}
    steps:
      - uses: googleapis/release-please-action@v4
        id: release
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          release-type: node

  build:
    needs: release-please
    if: needs.release-please.outputs.release_created == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Fetch tags
        run: git fetch --force --tags

      - uses: oven-sh/setup-bun@v2
        with:
          bun-version: latest

      - name: Install dependencies
        run: bun install --frozen-lockfile

      - name: Test
        run: bun test

      - name: Build all platforms
        env:
          CK_VERSION: ${{ needs.release-please.outputs.tag_name }}
          CK_COMMIT: ${{ github.sha }}
        run: bun run build:all -- --archive

      - name: Install Quill
        run: |
          curl -sSfL https://raw.githubusercontent.com/anchore/quill/main/install.sh \
            | sh -s -- -b /usr/local/bin

      - name: Sign + notarize macOS binaries
        env:
          QUILL_SIGN_P12: ${{ secrets.QUILL_SIGN_P12 }}
          QUILL_SIGN_PASSWORD: ${{ secrets.QUILL_SIGN_PASSWORD }}
          QUILL_NOTARY_KEY: ${{ secrets.QUILL_NOTARY_KEY }}
          QUILL_NOTARY_KEY_ID: ${{ secrets.QUILL_NOTARY_KEY_ID }}
          QUILL_NOTARY_ISSUER: ${{ secrets.QUILL_NOTARY_ISSUER }}
        run: |
          set -euo pipefail
          for bin in dist/ck-cli-darwin-amd64 dist/ck-cli-darwin-arm64; do
            echo "Signing $bin"
            quill sign-and-notarize "$bin"
          done
          tar -czf dist/ck-cli-darwin-amd64.tar.gz -C dist ck-cli-darwin-amd64 -C "$PWD" README.md LICENSE
          tar -czf dist/ck-cli-darwin-arm64.tar.gz -C dist ck-cli-darwin-arm64 -C "$PWD" README.md LICENSE
          ( cd dist && shasum -a 256 ck-cli-*.tar.gz ck-cli-*.zip > checksums.txt )

      - name: Upload release assets
        uses: softprops/action-gh-release@v2
        with:
          tag_name: ${{ needs.release-please.outputs.tag_name }}
          files: |
            dist/ck-cli-*.tar.gz
            dist/ck-cli-*.zip
            dist/checksums.txt

  docker:
    needs: [release-please, build]
    if: needs.release-please.outputs.release_created == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: docker/setup-buildx-action@v3

      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push image
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: |
            ghcr.io/${{ github.repository }}:${{ needs.release-please.outputs.tag_name }}
            ghcr.io/${{ github.repository }}:latest

Test plan

  • bun install
  • bun test — confirm 29 pass
  • bun run build — confirm local binary works (./ck-cli --version)
  • Parity spot-check: ./ck-cli parse --input tests/fixtures/clippings_en.txt | diff - tests/fixtures/clippings_en.result.json (expect no output)
  • Apply the two workflow files above and re-push (or grant Claude bot workflows permission)
  • Smoke-test Quill against a Bun-compiled darwin binary before the first release — Quill should sign Mach-O regardless of source language but it has not been previously exercised on Bun output

Generated by Claude Code

Full rewrite of the CLI from Go to TypeScript on the Bun runtime.
The parser is byte-identical against the original Go binary's output
across all fixtures (en, zh, other, rare, ric).

Stack:
- Bun + TypeScript strict mode (ES2022)
- cac for arg parsing, ink for stderr-rendered status/progress UI
- smol-toml for ~/.ck-cli.toml
- native fetch for GraphQL with chunked uploads (20/chunk, 10 concurrent)
- bun:test for tests, oxlint + oxfmt for lint/format
- bun build --compile produces self-contained binaries for
  linux/darwin/windows x amd64/arm64 (skipping windows-arm64 to match
  the prior goreleaser config)

Two intentional behavior corrections vs the Go binary:
- parse --output http now actually uploads to GraphQL (the Go
  syncToServer had a TODO and only printed success)
- parse with no --input + TTY stdin now exits with help instead of
  hanging on a stdin read

Release pipeline (workflow YAML pushed separately):
- release-please switched from release-type: go to release-type: node
- Quill (Anchore) is retained for macOS sign + notarize; reuses the
  existing QUILL_* secrets unchanged
- A Docker image is now built and pushed to GHCR on release

Co-Authored-By: Claude <[email protected]>

@claude

claude Bot commented May 23, 2026

Contributor

Claude finished @AnnatarHe's task

Code Review Complete ✅

I've conducted a comprehensive review of this Go → TypeScript rewrite. Here's my analysis:

Review Tasks

  • Set up development environment and validate build
  • Review architecture and code organization
  • Analyze parser logic and byte-compatibility claims
  • Examine test coverage and quality
  • Check security practices and dependencies
  • Assess performance implications
  • Review build system and deployment changes
  • Provide comprehensive feedback and recommendations

🎯 Overall Assessment: EXCELLENT

This is a high-quality rewrite that maintains backward compatibility while modernizing the entire stack. The code is well-structured, thoroughly tested, and follows TypeScript best practices.

✅ Strengths

Architecture & Code Quality

  • Clean separation of concerns: Parser, HTTP client, config, and UI components are well-isolated
  • Strong typing: Excellent use of TypeScript strict mode with noUncheckedIndexedAccess
  • Consistent error handling: Proper error propagation and user-friendly messages
  • Modular design: Easy to understand and maintain codebase structure

Parser Implementation

  • Robust language detection: Simple but effective English/Chinese detection
  • Careful date parsing: Handles both English and Chinese date formats correctly
  • BOM handling: Proper Unicode BOM stripping for Windows-generated files
  • Location parsing: Handles various Amazon Kindle location formats correctly
  • Title extraction: Smart parentheses handling to extract clean book titles

Test Coverage

  • Comprehensive fixture testing: 5 different fixture files covering various edge cases
  • Byte-identical validation: Ensures 100% compatibility with Go binary output
  • Unit tests: Good coverage of individual parser functions
  • Edge cases: Tests for BOM removal, malformed input, empty files

Security & Configuration

  • Safe config handling: Proper TOML parsing with sensible defaults
  • Secure token storage: Authorization header follows expected format (X-CLI token)
  • Input validation: Parser handles malformed data gracefully
  • No hardcoded secrets: Configuration properly externalized

⚠️ Issues & Recommendations

High Priority

  1. Workflow Permissions Issue (Critical)
    The GitHub App used by Claude Code in this environment lacks the `workflows` permission, so the CI and release workflows still reference the Go toolchain.
    
    • Impact: CI will fail until workflows are manually updated
    • Solution: Apply the provided workflow files from the PR description before merging

Medium Priority

  1. Error Handling in HTTP Client (src/http/client.ts:86-89)

    if (errors.length > 0) {
      const messages = errors.map((e) => e.message).join("; ");
      throw new Error(`upload failed with ${errors.length} errors: ${messages}`);
    }
    • Issue: Long error messages could be truncated/overwhelming
    • Recommendation: Consider logging individual errors and throwing a summary
  2. Date Parsing Edge Cases (src/parser/parser.ts:155-157)

    try {
      createdAt = language === "en" ? parseEnglishDate(dateSection) : parseChineseDate(dateSection);
    } catch {
      createdAt = new Date(0); // Unix epoch
    }
    • Issue: Silent fallback to epoch time could mask parsing issues
    • Recommendation: Log parsing failures for debugging
  3. Signal Handling (src/main.tsx:57-59)

    } finally {
      process.off("SIGINT", onSignal);
      process.off("SIGTERM", onSignal);
    }
    • Issue: The cleanup in the finally block sits after process.exit(), so it is never reached
    • Recommendation: Move cleanup before the exit call
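
One way to act on the first medium-priority item (logging individual upload errors and throwing a summary) is sketched below. The helper name `summarizeUploadErrors` and the cap of three listed messages are assumptions made for illustration; they are not taken from src/http/client.ts.

```typescript
// Hypothetical cap on how many messages appear in the thrown summary.
const MAX_LISTED = 3;

/**
 * Log every error individually, then build one bounded summary error.
 * Returns null when there is nothing to report.
 */
function summarizeUploadErrors(
  errors: readonly Error[],
  log: (line: string) => void = console.error,
): AggregateError | null {
  if (errors.length === 0) return null;

  // Full detail goes to the log (stderr by default), one line per failure.
  errors.forEach((e, i) => log(`chunk error ${i + 1}/${errors.length}: ${e.message}`));

  // The thrown message stays short no matter how many chunks failed.
  const shown = errors
    .slice(0, MAX_LISTED)
    .map((e) => e.message)
    .join("; ");
  const hidden = errors.length - Math.min(errors.length, MAX_LISTED);
  const suffix = hidden > 0 ? ` (and ${hidden} more)` : "";
  return new AggregateError(
    errors,
    `upload failed with ${errors.length} errors: ${shown}${suffix}`,
  );
}
```

A caller could then write `const err = summarizeUploadErrors(errors); if (err) throw err;`, and the original errors remain reachable through `err.errors`.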

Low Priority

  1. Build Script Hardcoded Path (scripts/build.ts:88-90)

    • The build script assumes shasum command availability
    • Consider adding fallback for systems without shasum
  2. Type Safety Enhancement (src/parser/parser.ts:83-86)

    if (group.length < 4) {
      return null;
    }
    • Could use more specific array length validation for better type safety
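
For the `shasum` availability concern in the low-priority items, a possible fallback is to compute the digests in-process with `node:crypto`, which Bun also provides. This is only a sketch: the helper names are hypothetical, and the two-space line format is assumed to match what `shasum -a 256` prints in text mode.

```typescript
import { createHash } from "node:crypto";

/** Hex-encoded SHA-256 digest of the given bytes or string. */
function sha256Hex(data: string | Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

/** One checksums.txt line: digest, two spaces, file name. */
function checksumLine(fileName: string, data: string | Uint8Array): string {
  return `${sha256Hex(data)}  ${fileName}`;
}

// Usage: in a build script, `data` would be the archive contents read from disk.
console.log(checksumLine("example.tar.gz", "abc"));
```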

📊 Performance Considerations

Positive Impacts

  • Bun runtime: Significantly faster startup time than Node.js
  • Built-in concurrency: Clean semaphore implementation for HTTP uploads
  • Efficient chunking: 20-item chunks with 10 concurrent uploads optimize throughput
  • Memory use: Input is read into memory in one call (Bun.stdin.text() or readFile) rather than streamed, which is adequate for typical Kindle clipping files

Trade-offs

  • Binary size: Expected 20× increase (60-110MB vs ~5MB Go binary)
    • This is acceptable given Bun embeds its runtime
    • Docker images will be larger but still reasonable
  • Runtime dependency: Requires Bun runtime embedded in binary

🧪 Test Strategy Validation

The fixture-based testing approach is excellent:

  • Guarantees byte-identical output preservation
  • Tests real-world Kindle export files
  • Covers multiple languages and edge cases
  • Provides regression protection

🔒 Security Review

No security concerns identified:

  • ✅ No hardcoded credentials or secrets
  • ✅ Proper input validation and sanitization
  • ✅ Safe TOML configuration handling
  • ✅ Appropriate HTTP timeout and error handling
  • ✅ No unsafe dynamic code execution

🚀 Deployment & Build System

Modern and robust approach:

  • Cross-compilation for 5 platforms via Bun
  • Proper version/commit injection at build time
  • macOS code signing integration ready
  • Docker multi-stage builds optimized
  • Release automation via release-please

📋 Final Recommendations

Before Merge:

  1. Apply the CI/release workflow updates from PR description
  2. Consider adding error logging for date parsing failures
  3. Consider fixing the signal cleanup order
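
The date-parsing recommendation could look roughly like the sketch below: keep the epoch fallback so output stays unchanged, but report the failure on stderr (stdout stays reserved for JSON). The wrapper name `parseDateOrEpoch` and its injected `parse` and `warn` parameters are hypothetical.

```typescript
type DateParser = (raw: string) => Date;

/**
 * Parse a date string, falling back to the Unix epoch on failure.
 * Unlike a silent catch, every fallback is reported through `warn`.
 */
function parseDateOrEpoch(
  raw: string,
  parse: DateParser,
  warn: (msg: string) => void = console.error,
): Date {
  try {
    const parsed = parse(raw);
    // Treat an invalid Date as a failure too, not only thrown errors.
    if (Number.isNaN(parsed.getTime())) throw new Error("invalid date");
    return parsed;
  } catch (e) {
    const reason = e instanceof Error ? e.message : String(e);
    warn(`date parse failed for ${JSON.stringify(raw)}: ${reason}; falling back to Unix epoch`);
    return new Date(0);
  }
}
```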

After Merge:

  1. Monitor binary sizes in production
  2. Validate Quill signing works with Bun-compiled macOS binaries
  3. Consider adding performance benchmarks comparing Go vs TypeScript versions

This rewrite successfully modernizes the codebase while maintaining full backward compatibility. The quality is production-ready. 🏆


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request migrates the ck-cli tool from Go to TypeScript using the Bun runtime, introducing a new project structure, build pipeline, and Ink-based UI. The core logic for parsing Kindle clippings and syncing them via GraphQL has been rewritten, with comprehensive tests added to ensure parity with the original implementation. Review feedback identifies potential issues with the brittleness of Chinese date parsing and the limitation of only extracting single-line highlights. Additionally, it is suggested to adopt a stream-based parsing approach to improve memory efficiency when handling large clipping files.

Comment thread src/parser/parser.ts
Comment on lines +188 to +195
  let s = dateStr;
  const ampm = s.includes("上午") ? "AM" : "PM";
  s = s.replace(cjkPattern, "-");
  s = s.replace(multiDashPattern, "");
  s = s.trim();
  s = `${s} ${ampm}`;
  return parseShortDateAmpm(s);
}

medium

The parseChineseDate logic might fail if there is no space between the weekday and the AM/PM indicator (or time). The current implementation relies on parseShortDateAmpm which splits by the first space. If the normalization process removes all spaces or if the input format varies slightly, spaceIdx will be -1 and an error will be thrown.

Additionally, the multiDashPattern replacement might be too aggressive. If the input is 2024年4月1日星期一 下午2:30:45, the cjkPattern replacement results in 2024-4-1--- --2:30:45. multiDashPattern (which is /-{2,}/g) will replace --- and -- with empty strings, resulting in 2024-4-1 2:30:45. This happens to work for the space-separated format, but it's brittle. A more robust approach would be to normalize to a standard format using a single regex or a more predictable replacement strategy.
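
A single-regex normalization along the lines suggested above might look like this sketch. It is not the PR's implementation: the name `parseChineseDateStrict` is hypothetical, and interpreting the fields as UTC is an assumption that would need checking against the Go oracles before adoption.

```typescript
// Captures year, month, day, meridiem and time; tolerates an optional weekday
// and optional whitespace between the date and the time.
const ZH_DATE = /(\d{4})年(\d{1,2})月(\d{1,2})日.*?(上午|下午)\s*(\d{1,2}):(\d{2}):(\d{2})/;

function parseChineseDateStrict(input: string): Date {
  const m = ZH_DATE.exec(input);
  if (m === null) {
    throw new Error(`unrecognized Chinese date: ${input}`);
  }
  const [, year, month, day, meridiem, hours, minutes, seconds] = m;

  // 12-hour to 24-hour: 上午 (AM) 12 becomes 0, 下午 (PM) 12 stays 12.
  let hour = Number(hours) % 12;
  if (meridiem === "下午") hour += 12;

  // Assumption: fields are interpreted as UTC.
  return new Date(
    Date.UTC(Number(year), Number(month) - 1, Number(day), hour, Number(minutes), Number(seconds)),
  );
}
```

Because the pattern anchors on the 年/月/日 markers and the meridiem word, it does not depend on where spaces fall between the weekday and the time.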

Comment thread src/parser/parser.ts
Comment on lines +83 to +111
function parseGroup(group: string[], language: Language): ClippingItem | null {
  if (group.length < 4) {
    return null;
  }

  const titleLine = (group[0] ?? "").replaceAll(BOM, "");
  const title = parseTitle(titleLine);
  if (title === "") {
    return null;
  }

  const infoLine = group[1] ?? "";
  const info = parseInfo(infoLine, language);
  if (info === null) {
    return null;
  }

  const content = (group[3] ?? "").trim();
  if (content === "") {
    return null;
  }

  return {
    title,
    content,
    pageAt: info.location,
    createdAt: info.createdAt,
  };
}

medium

The parseGroup function only extracts the first line of content (group[3]). Kindle clippings can contain multi-line highlights. While the PR description mentions byte-parity with the Go version, if the Go version also had this limitation, it might be worth documenting it as a known limitation or improving it here if parity allows for including all lines from index 3 onwards until the end of the group.

References
  1. Ensure code functionality handles edge cases like multi-line content in clippings.
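
If parity constraints allow it, collecting every content line from index 3 onward could be sketched as follows. The helper name `joinContentLines` is hypothetical, and the sketch assumes each group excludes the trailing separator line. Note that it would change output for multi-line clippings, so it would break byte parity if the Go oracles only contain the first line.

```typescript
/** Join all content lines of a clipping group (index 3 to the end) into one string. */
function joinContentLines(group: readonly string[]): string {
  return group
    .slice(3)
    .map((line) => line.trimEnd())
    .join("\n")
    .trim();
}
```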

Comment thread src/commands/parse.tsx
Comment on lines +24 to +34
async function readInput(inputPath: string): Promise<string> {
  if (inputPath === "") {
    if (process.stdin.isTTY) {
      throw new TTYError(
        "No --input file provided and stdin is a TTY. Pipe data in or pass --input PATH.",
      );
    }
    return await Bun.stdin.text();
  }
  return await readFile(inputPath, "utf8");
}

medium

Reading the entire input from Bun.stdin.text() or readFile into a string can lead to high memory usage for very large clipping files. Although Kindle clipping files are typically small, a more scalable approach would be to use a stream-based parser to process the file line by line or group by group.
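
A group-by-group approach could be sketched with a generator that consumes lines lazily and yields one clipping group at a time. The name `splitGroupsLazily` is hypothetical, and the `==========` separator reflects the usual Kindle clippings layout rather than anything confirmed about this parser; a real version would feed it from a line-oriented stream instead of an in-memory array.

```typescript
// Assumed group separator, as found in Kindle "My Clippings.txt" files.
const SEPARATOR = "==========";

/** Yield clipping groups one at a time from any iterable of lines. */
function* splitGroupsLazily(lines: Iterable<string>): Generator<string[]> {
  let current: string[] = [];
  for (const raw of lines) {
    // Tolerate CRLF input by dropping a trailing carriage return.
    const line = raw.replace(/\r$/, "");
    if (line.trim() === SEPARATOR) {
      if (current.length > 0) yield current;
      current = [];
    } else {
      current.push(line);
    }
  }
  // Emit a final group that is not followed by a separator.
  if (current.length > 0) yield current;
}
```

Only one group is held in memory at a time, so peak usage depends on the largest clipping rather than on the file size.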
