csdb: add RocksDB backend with runtime backend selection #73

Open
akaitrade wants to merge 1 commit into CREDITSCOM:master from akaitrade:dualrocksdb

@akaitrade
Contributor

Add RocksDB storage backend with runtime backend selection

Why

The node has run exclusively on BerkeleyDB (blockchain.db) since inception.
BDB works, but it has real operational ceilings:

  • Write throughput under load. BDB's B-tree plus its txn/checkpoint loop
    becomes a bottleneck during high-rate ingestion (initial sync, catch-up,
    migrations). On Windows in particular it wedges under sustained write pressure.
  • Compression. BDB stores blocks uncompressed. The CREDITS chain is mostly
    empty / low-tx blocks, so much of the disk footprint goes to blocks that
    carry little data.
  • Operational tooling. BDB's recovery and inspection story is thin compared
    to a modern LSM engine.

RocksDB is an LSM-tree engine built for high write throughput, with built-in
compression (LZ4 / LZ4HC), tunable caches, bulk-load mode, and a mature
operational surface. Rather than a hard cutover (risky on a live chain), this
change ships both backends in one binary and lets operators choose per node
at runtime, so RocksDB can be rolled out gradually with instant fallback.

What

RocksDB backend (database_rocksdb.cpp / .hpp) implementing csdb::Database,
behaviorally equivalent to the BerkeleyDB backend:

  • Three column families: blocks, seq_no, and contracts. The blocks CF is keyed
    by big-endian seq+1 so lexicographic order matches sequence order.
  • Per-CF tuning: shared LRU block cache, bloom filters, partitioned index on the
    seq_no CF, LZ4 throughout with LZ4HC at the bottommost level.
  • put_batch() coalescing block + index writes into a single WriteBatch / WAL append.
  • flush() (SyncWAL) for checkpoint-boundary durability; async writes by default
    with durability anchored at checkpoints (mirrors BDB's DB_TXN_NOSYNC model).
  • Bulk-load mode (set_bulk_load / compact_full) for one-shot high-rate import.
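
To illustrate the blocks key scheme described above, here is a minimal sketch of a big-endian seq+1 encoding. The helper names (encode_block_key, decode_block_key, bytewise_less) and the 64-bit key width are assumptions made for illustration; they are not taken from database_rocksdb.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: encode (seq + 1) as an 8-byte big-endian key.
// The 64-bit width is an assumption; the description only says "big-endian seq+1".
std::string encode_block_key(std::uint64_t seq) {
    assert(seq != UINT64_MAX);  // seq + 1 must not wrap around to 0
    const std::uint64_t v = seq + 1;
    std::string key(8, '\0');
    for (int i = 0; i < 8; ++i) {
        // Most significant byte first.
        key[i] = static_cast<char>((v >> (8 * (7 - i))) & 0xFF);
    }
    return key;
}

// Hypothetical inverse of encode_block_key.
std::uint64_t decode_block_key(const std::string& key) {
    assert(key.size() == 8);
    std::uint64_t v = 0;
    for (unsigned char c : key) {
        v = (v << 8) | c;
    }
    return v - 1;
}

// Bytewise (memcmp-style) ordering of two equal-length keys, which is how a
// default bytewise comparator would order them.
bool bytewise_less(const std::string& a, const std::string& b) {
    assert(a.size() == b.size());
    return std::memcmp(a.data(), b.data(), a.size()) < 0;
}
```

With this encoding, bytewise_less(encode_block_key(255), encode_block_key(256)) holds, whereas a little-endian encoding would order those two keys the other way round.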

Surrounding changes:

  • Dual-backend build. The binary always links both backends
    (CSDB_BACKEND=both); no compile-time backend switch.
  • Runtime selection. config.ini [storage] db_backend (default berkeleydb,
    set rocksdb to opt in), plus tuning knobs rocksdb_block_cache_mb,
    rocksdb_memtable_mb, and the storage write-pipeline knobs
    async_write_queue_size and write_batch_size.
  • Storage layer. Runtime backend selection, set_tuning plumbing,
    Storage::flush(), and an async write-queue that batches pools through put_batch().
  • Build deps. Added LZ4HC (lz4hc.c / .h) to the vendored lz4 (RocksDB's
    bottommost kLZ4HCCompression needs it); defined BOOST_ALL_NO_LIB to disable
    Boost's MSVC auto-link pragmas (Boost is linked via CMake targets).
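
The write-pipeline knobs above can be pictured with a rough sketch of a bounded queue that hands items to a writer in batches. This is not the Storage implementation: WriteQueue, PendingPool, and their methods are hypothetical names, and the mapping of capacity to async_write_queue_size and of batch_size to write_batch_size is an assumption about what those knobs control.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a serialized pool waiting to be written.
struct PendingPool {
    std::uint64_t sequence;
    std::string bytes;
};

// Hypothetical bounded queue: producers block when it is full, and the
// writer drains up to batch_size items at a time.
class WriteQueue {
public:
    WriteQueue(std::size_t capacity, std::size_t batch_size)
        : capacity_(capacity), batch_size_(batch_size) {
        assert(capacity > 0 && batch_size > 0);
    }

    // Producer side: waits while the queue is full (back-pressure).
    void push(PendingPool p) {
        std::unique_lock<std::mutex> lock(mu_);
        not_full_.wait(lock, [&] { return q_.size() < capacity_; });
        q_.push_back(std::move(p));
        not_empty_.notify_one();
    }

    // Writer side: waits for data, then returns at most batch_size items in
    // FIFO order. Returns an empty vector once closed and fully drained.
    std::vector<PendingPool> pop_batch() {
        std::unique_lock<std::mutex> lock(mu_);
        not_empty_.wait(lock, [&] { return !q_.empty() || closed_; });
        std::vector<PendingPool> batch;
        while (!q_.empty() && batch.size() < batch_size_) {
            batch.push_back(std::move(q_.front()));
            q_.pop_front();
        }
        not_full_.notify_all();
        return batch;
    }

    // Wakes a waiting writer so it can drain the remainder and exit.
    void close() {
        std::lock_guard<std::mutex> lock(mu_);
        closed_ = true;
        not_empty_.notify_all();
    }

private:
    std::size_t capacity_;
    std::size_t batch_size_;
    bool closed_ = false;
    std::deque<PendingPool> q_;
    std::mutex mu_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
};
```

In such a design, a writer thread would loop on pop_batch() and pass each non-empty batch to the backend's put_batch(), so that one batch becomes one WriteBatch / WAL append.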

How to use

In config.ini:

[storage]
db_backend = rocksdb
rocksdb_block_cache_mb = 1024
rocksdb_memtable_mb    = 256

Default config is unchanged (berkeleydb), so existing nodes behave identically
until explicitly switched.
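
As a sketch of how the db_backend value might be mapped to a backend at startup: the Backend enum, parse_backend, and backend_name below are hypothetical names, and rejecting unknown values is an assumption, since the description does not say how the node handles them. Only the default (berkeleydb) and the opt-in value (rocksdb) come from the text above.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical enum for the two backends linked into the binary.
enum class Backend { BerkeleyDB, RocksDB };

// Maps the raw [storage] db_backend value to a backend.
// - key missing or empty -> BerkeleyDB (the documented default)
// - "rocksdb"            -> RocksDB (explicit opt-in)
// - anything else        -> std::nullopt (assumed behavior, not from the PR)
std::optional<Backend> parse_backend(const std::optional<std::string>& raw) {
    if (!raw.has_value() || raw->empty() || *raw == "berkeleydb") {
        return Backend::BerkeleyDB;
    }
    if (*raw == "rocksdb") {
        return Backend::RocksDB;
    }
    return std::nullopt;
}

// Human-readable name, e.g. for a startup log line.
const char* backend_name(Backend b) {
    switch (b) {
        case Backend::BerkeleyDB: return "berkeleydb";
        case Backend::RocksDB:    return "rocksdb";
    }
    assert(false && "unreachable: unknown Backend value");
    return "";
}
```

Returning std::nullopt for an unrecognized value lets the caller decide whether to abort startup or fall back to the default, rather than silently picking a backend.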

Compatibility

  • Per-node, opt-in. No protocol / wire / block-format change. A RocksDB node
    and a BerkeleyDB node produce identical block hashes and interoperate normally.
  • Backends are not on-disk interchangeable. Switching db_backend on an
    existing node means re-syncing (or migrating) that node's chain DB; the new
    backend starts from a fresh DB directory.
  • Default remains BerkeleyDB; nothing changes for nodes that do not set db_backend.

Testing

  • Builds clean on Windows (MSVC2022) with CSDB_BACKEND=both.
  • WSL Ubuntu-20.04 / 22.04 build verification: pending.
