feat: progressive compact #3243
chenbt-hz wants to merge 13 commits into OpenAtomFoundation:unstable
Conversation
📝 Walkthrough
Adds a new "progressive-compact" compaction strategy with configuration, server scheduling, DB APIs, storage-layer APIs, and a Redis-backed progressive compaction implementation guarded by interval, file-count, time, rate, and file-age parameters.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Scheduler as Scheduler (cron/AutoCompactRange)
    participant PikaServer as PikaServer
    participant DB as DB
    participant Storage as Storage
    participant RedisInst as Redis Instance
    Scheduler->>PikaServer: trigger progressive-compact task
    PikaServer->>DB: ProgressiveCompact(type)
    DB->>Storage: ProgressiveCompact(type, max_files, max_time_ms, min_rate, min_file_age, sync=false)
    Storage->>RedisInst: ProgressiveCompact(column_family, params)
    RedisInst-->>Storage: compact result(s)
    Storage-->>DB: aggregated results
    DB-->>PikaServer: done
    PikaServer-->>Scheduler: acknowledge
```
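For orientation, here is a minimal, self-contained sketch of how the guard parameters named in the walkthrough (file count, time budget, compression rate, file age) typically compose. The struct and function names are illustrative assumptions, not the PR's actual types:

```cpp
#include <chrono>

// Illustrative only: the real logic lives in Redis::ProgressiveCompact.
struct ProgressiveCompactLimits {
  int max_files;     // SST files compacted per run
  int max_time_ms;   // wall-clock budget for a run
  int min_rate;      // stop once the output/input record ratio (%) reaches this
  int min_file_age;  // only consider SST files older than this many seconds
};

// True once either the file quota or the time budget is used up.
bool BudgetExhausted(const ProgressiveCompactLimits& limits, int processed,
                     std::chrono::steady_clock::time_point start) {
  auto elapsed_ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                        std::chrono::steady_clock::now() - start)
                        .count();
  return processed >= limits.max_files || elapsed_ms >= limits.max_time_ms;
}
```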
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (warning)
Pull request overview
This PR fixes an OBD compaction bug and introduces a new "incremental-compact" strategy intended to run compaction work progressively with tunable limits (files/time/rate/age), wired through server scheduling and configuration.
Changes:
- Fix OBD compaction to compact the correct RocksDB Column Family and improve failure logging.
- Add incremental compaction end-to-end: storage API + Redis implementation + server scheduling + CONFIG GET/SET + default config entries.
- Minor build/config hygiene updates (key encoding fix, CMake build_version source handling, .gitignore formatting, added .claude local settings file).
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/storage/src/storage.cc | Adds BG task dispatch + Storage wrapper for IncrementalCompact with argv parameter passing. |
| src/storage/src/redis.h | Exposes new Redis::IncrementalCompact API. |
| src/storage/src/redis.cc | Fixes OBD CF compaction call; implements incremental compaction via CompactFiles. |
| src/storage/include/storage/storage_define.h | Fixes pointer advancement bug in EncodeUserKey. |
| src/storage/include/storage/storage.h | Adds new background operation enum and Storage::IncrementalCompact API. |
| src/pika_server.cc | Schedules incremental compaction based on a configurable interval. |
| src/pika_db.cc | Wires DB-level call into storage_->IncrementalCompact with config knobs. |
| src/pika_conf.cc | Parses new incremental-compact-* configs and supports incremental-compact strategy. |
| src/pika_admin.cc | Adds CONFIG GET/SET support for incremental compact parameters. |
| include/pika_server.h | Adds TaskType::kIncrementalCompact and tracking timestamp for interval gating. |
| include/pika_db.h | Declares DB::IncrementalCompact. |
| include/pika_conf.h | Adds getters/setters + stored fields for incremental compact parameters and strategy enum. |
| conf/pika.conf | Documents and adds defaults for incremental-compact-* config keys. |
| CMakeLists.txt | Adjusts build-version source handling (removes src/build_version.cc from DIR_SRCS, simplifies generated source list). |
| .gitignore | Fixes formatting of tests/tmp entry. |
| .claude/settings.local.json | Adds a Claude local settings/permissions file to the repo. |
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/pika_conf.cc (1)
914-931: ⚠️ Potential issue | 🟠 Major
`CONFIG REWRITE` still drops the active compaction strategy.
`cs_` is never derived from `compaction_strategy_` before `SetConfStr("compaction-strategy", cs_)` runs, so rewrite persists an empty value. After restart, `Load()` falls back to `NONE`, which means the new `incremental-compact` mode is not actually durable.

Suggested fix

```diff
-  std::string cs_;
-  SetConfStr("compaction-strategy", cs_);
-  if (cs_ == "full-compact") {
-    compaction_strategy_ = FullCompact;
-  } else if (cs_ == "obd-compact") {
-    compaction_strategy_ = OldestOrBestDeleteRatioSstCompact;
-  } else if (cs_ == "incremental-compact") {
-    compaction_strategy_ = IncrementalCompact;
-  } else {
-    compaction_strategy_ = NONE;
-  }
+  std::string cs;
+  switch (compaction_strategy_) {
+    case FullCompact:
+      cs = "full-compact";
+      break;
+    case OldestOrBestDeleteRatioSstCompact:
+      cs = "obd-compact";
+      break;
+    case IncrementalCompact:
+      cs = "incremental-compact";
+      break;
+    default:
+      cs.clear();
+      break;
+  }
+  SetConfStr("compaction-strategy", cs);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/pika_conf.cc` around lines 914 - 931, The config-rewrite logic calls SetConfStr("compaction-strategy", cs_) but cs_ is uninitialized so the current compaction_strategy_ isn't persisted; change the flow in the Save/Rewrite path to derive cs_ from the existing compaction_strategy_ before calling SetConfStr (e.g., map compaction_strategy_ enum values FullCompact, OldestOrBestDeleteRatioSstCompact, IncrementalCompact, NONE into the string cs_), then call SetConfStr("compaction-strategy", cs_) and keep the subsequent SetConfInt calls for incremental_* so the active incremental-compact mode is durably written.
🧹 Nitpick comments (3)
include/pika_server.h (1)
554-554: Minor: initialization style differs from sibling fields.
`last_check_compact_time_` and `last_check_resume_time_` are initialized in the `PikaServer` constructor initializer list (see src/pika_server.cc L48-49), while `last_incremental_compact_time_` uses an inline default member initializer here. Functionally equivalent, but consider initializing it in the same place as the other timing fields for consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@include/pika_server.h` at line 554, Move the inline default initialization of the timeval member last_incremental_compact_time_ out of the class declaration and initialize it alongside the other timing fields in the PikaServer constructor initializer list (same place where last_check_compact_time_ and last_check_resume_time_ are initialized in PikaServer), ensuring the member declaration becomes just "struct timeval last_incremental_compact_time_;" and the constructor sets it to {0, 0}.

src/storage/src/redis.cc (2)
6-9: `<thread>` is not used in this translation unit.
`std::chrono` (`<chrono>`) and `std::time` (`<ctime>`) are used by `IncrementalCompact`, but I don't see any `std::thread`, `std::this_thread`, or `std::jthread` reference introduced. Consider dropping `#include <thread>` to avoid unnecessary header bloat.

♻️ Proposed change

```diff
 #include <chrono>
 #include <sstream>
 #include <ctime>
-#include <thread>
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/src/redis.cc` around lines 6 - 9, The translation unit includes <thread> which isn't used; remove the unnecessary `#include <thread>` from the top of the file to avoid header bloat and rebuild; check symbols like IncrementalCompact to ensure it still compiles (it uses <chrono> and <ctime> only) and no std::thread/std::this_thread references remain.
557-569: Minor: rate computation uses int division and ignores the unconditional `processed++` semantics. Two small things worth noting here:
- `job_info.stats.num_output_records` and `num_input_records` are `uint64_t`; multiplying by 100 first is safe, but assigning to `int rate` is surprising. Using `uint64_t rate = ... * 100 / ...;` or a double would be clearer and avoids any future surprise if the ratio ever legitimately exceeds `INT_MAX / 100`.
- `processed++` is skipped whenever `rate >= min_rate` because of the `break`, but it is incremented when `num_input_records == 0` (e.g., a file whose entries were all filtered out). The quota semantics are therefore "files that were compacted and still had low compression benefit" — document this or rename `max_files` → `max_low_benefit_files` to match the comparison with v4's strict `max_files` quota noted in the PR description.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/src/redis.cc` around lines 557 - 569, The rate calculation and processed-count semantics in the IncrementalCompact loop need tightening: change the local rate from int to an unsigned 64-bit (or double) to compute rate = job_info.stats.num_output_records * 100 / job_info.stats.num_input_records safely (use uint64_t or double for rate) and ensure processed is incremented for every examined file (move or add processed++ before the rate >= min_rate break) or explicitly rename/clarify the quota variable (e.g., max_low_benefit_files) to reflect that files which trigger the break are not counted as processed; update references to rate, job_info.stats.num_output_records, job_info.stats.num_input_records, processed, min_rate, and max_files accordingly.
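To make the suggested arithmetic concrete, here is a small hedged sketch of the wider-width rate computation (the function name is hypothetical):

```cpp
#include <cstdint>

// Computes the output/input record ratio as a percentage entirely in
// uint64_t, avoiding the narrowing assignment to int flagged above and
// keeping the zero-input case explicit.
uint64_t CompressionRatePercent(uint64_t num_output_records, uint64_t num_input_records) {
  if (num_input_records == 0) {
    return 0;  // fully-filtered file: treat as maximal compaction benefit
  }
  return num_output_records * 100 / num_input_records;
}
```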
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/pika_server.cc`:
- Around line 1240-1249: Clamp the interval returned by
g_pika_conf->incremental_compact_interval() to a sensible positive minimum
(e.g., >=1) before using it in the comparison with
last_incremental_compact_time_ so a zero/negative config cannot disable
throttling; then, detect when compact-interval (the AutoCompactRange path) is
also set and either log a warning that compact-interval takes precedence over
compaction_strategy == PikaConf::IncrementalCompact or explicitly
document/handle this precedence; update the block that calls
DoSameThingEveryDB(TaskType::kIncrementalCompact) and add the warning near where
AutoCompactRange early-returns so users aren’t silently ignored.
In `@src/storage/src/redis.cc`:
- Around line 492-499: The IncrementalCompact loop resets cf_start_time for each
column family (cf_start_time used inside IncrementalCompact loop with
handles_[idx]->GetName() and max_time_ms), so max_time_ms acts as a per-CF
budget and total wall time can be K * max_time_ms; either make the budget global
by moving/start a single start_time outside the CF loop and compare elapsed
against max_time_ms for the whole IncrementalCompact invocation, or rename the
config key and code/comment to incremental-compact-per-cf-max-time-ms (and
update documentation/conf/pika.conf) to make per-CF semantics explicit; pick one
approach and update the related logic and comments referencing cf_start_time,
max_time_ms, and IncrementalCompact accordingly.
- Around line 551-574: The code currently logs failures from
db_->CompactFiles(...) (the local variable s in the CompactFiles loop) but then
unconditionally pushes Status::OK() into compact_result_vec, hiding real errors;
change the logic so that when CompactFiles returns a non-OK Status (the s
checked in the if (!s.ok()) branch) you record that Status into
compact_result_vec (or record the first non-OK per column family) instead of
Status::OK(), and ensure any early break from the inner loop preserves and
pushes that failing Status for the corresponding CF; adjust the
push_back(Status::OK()) at the end to only push OK when no failures were
observed for that CF.
- Around line 466-472: in_compact_flag_ (the std::atomic<bool> used with
compare_exchange_weak in methods like the compact routine) is not explicitly
initialized which can yield indeterminate behavior; fix this by initializing
in_compact_flag_ to false either in its member declaration or in the class
constructor initializer list so compare_exchange_weak reads a well-defined value
and correctly prevents concurrent compactions when using the in_compact_flag_
check/DEFER pattern (see the atomic-flag sketch after this list).
- Around line 522-529: The code builds an absolute path for oldest_file by
concatenating file_meta.db_path + "/" + file_meta.name, but CompactFiles expects
relative file names (SstFileMetaData::name already contains the leading "/"
relative path); update the assignment that sets oldest_file to use
file_meta.name directly (preserve setting of oldest_number and oldest_level) so
CompactFiles receives the correct relative path format; ensure ExtractFileNumber
usage and comparisons remain unchanged.
In `@src/storage/src/storage.cc`:
- Around line 1793-1799: The background task is being enqueued with type ==
DataType::kAll which makes Storage::AddBGTask clear the whole queue; change the
task type passed to AddBGTask so it is not DataType::kAll (i.e., use a specific
non‑all data type or a dedicated incremental type) when creating the async
incremental compaction task — update the call that currently does
AddBGTask({type, kIncrementalCompact, {...}}) to pass an explicit non‑kAll
DataType (or a new DataType::kIncremental if available) so Storage::AddBGTask
will not drop existing queued work.
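As a companion to the `in_compact_flag_` item above, here is a minimal standalone illustration of the initialize-then-compare-exchange pattern; `TryStartCompaction` and `FinishCompaction` are hypothetical names, and the reset below plays the role Pika's DEFER macro would play (assumption):

```cpp
#include <atomic>

// Explicit initialization gives compare_exchange a well-defined starting value.
std::atomic<bool> in_compact_flag_{false};

bool TryStartCompaction() {
  bool expected = false;
  // compare_exchange_weak may fail spuriously, so retry until it either
  // wins the flag or observes that another compaction already holds it.
  while (!in_compact_flag_.compare_exchange_weak(expected, true)) {
    if (expected) {
      return false;  // another thread is compacting
    }
    expected = false;  // spurious failure: reset and retry
  }
  return true;
}

void FinishCompaction() { in_compact_flag_.store(false); }
```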
---
Outside diff comments:
In `@src/pika_conf.cc`:
- Around line 914-931: The config-rewrite logic calls
SetConfStr("compaction-strategy", cs_) but cs_ is uninitialized so the current
compaction_strategy_ isn't persisted; change the flow in the Save/Rewrite path
to derive cs_ from the existing compaction_strategy_ before calling SetConfStr
(e.g., map compaction_strategy_ enum values FullCompact,
OldestOrBestDeleteRatioSstCompact, IncrementalCompact, NONE into the string
cs_), then call SetConfStr("compaction-strategy", cs_) and keep the subsequent
SetConfInt calls for incremental_* so the active incremental-compact mode is
durably written.
---
Nitpick comments:
In `@include/pika_server.h`:
- Line 554: Move the inline default initialization of the timeval member
last_incremental_compact_time_ out of the class declaration and initialize it
alongside the other timing fields in the PikaServer constructor initializer list
(same place where last_check_compact_time_ and last_check_resume_time_ are
initialized in PikaServer), ensuring the member declaration becomes just "struct
timeval last_incremental_compact_time_;" and the constructor sets it to {0, 0}.
In `@src/storage/src/redis.cc`:
- Around line 6-9: The translation unit includes <thread> which isn't used;
remove the unnecessary `#include <thread>` from the top of the file to avoid
header bloat and rebuild; check symbols like IncrementalCompact to ensure it
still compiles (it uses <chrono> and <ctime> only) and no
std::thread/std::this_thread references remain.
- Around line 557-569: The rate calculation and processed-count semantics in the
IncrementalCompact loop need tightening: change the local rate from int to an
unsigned 64-bit (or double) to compute rate = job_info.stats.num_output_records
* 100 / job_info.stats.num_input_records safely (use uint64_t or double for
rate) and ensure processed is incremented for every examined file (move or add
processed++ before the rate >= min_rate break) or explicitly rename/clarify the
quota variable (e.g., max_low_benefit_files) to reflect that files which trigger
the break are not counted as processed; update references to rate,
job_info.stats.num_output_records, job_info.stats.num_input_records, processed,
min_rate, and max_files accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3ceb47fa-89e8-43f5-a921-f3280492d2b8
📒 Files selected for processing (15)
.gitignore, CMakeLists.txt, conf/pika.conf, include/pika_conf.h, include/pika_db.h, include/pika_server.h, src/pika_admin.cc, src/pika_conf.cc, src/pika_db.cc, src/pika_server.cc, src/storage/include/storage/storage.h, src/storage/include/storage/storage_define.h, src/storage/src/redis.cc, src/storage/src/redis.h, src/storage/src/storage.cc
```cpp
        uint64_t number = ExtractFileNumber(file_meta.name);
        if (number < oldest_number) {
          oldest_number = number;
          oldest_file = file_meta.db_path + "/" + file_meta.name;
          oldest_level = level_meta.level;
        }
      }
    }
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Verify the format of SstFileMetaData::name and db_path in the RocksDB version used.
fd -t f 'metadata.h' | xargs rg -n -C2 'SstFileMetaData|struct .*FileMetaData' | head -200
# Find any CompactFiles usage in this repo that already constructs the path correctly as reference.
rg -nP -C3 'CompactFiles\s*\('
```

Repository: OpenAtomFoundation/pikiwidb

Length of output: 553
🏁 Script executed:
```bash
#!/bin/bash
# Read the context around the flagged lines (522-529) to understand the full flow
sed -n '500,570p' src/storage/src/redis.cc | cat -n
```

Repository: OpenAtomFoundation/pikiwidb

Length of output: 2855
🏁 Script executed:
```bash
#!/bin/bash
# Find all occurrences of oldest_file in the file to understand how it's used
rg -n 'oldest_file' src/storage/src/redis.cc
```

Repository: OpenAtomFoundation/pikiwidb

Length of output: 518
🏁 Script executed:
```bash
#!/bin/bash
# Look for input_files construction to see if oldest_file or similar paths are added to it
rg -n -A5 -B5 'input_files\s*[.push|=]' src/storage/src/redis.cc | head -100
```

Repository: OpenAtomFoundation/pikiwidb

Length of output: 53
🏁 Script executed:
```bash
#!/bin/bash
# Search for RocksDB includes and version information
rg -n 'include.*rocksdb|rocksdb.*version' src/storage/src/redis.cc | head -20
```

Repository: OpenAtomFoundation/pikiwidb

Length of output: 325
🏁 Script executed:
```bash
#!/bin/bash
# Find if there are any other places in the codebase that construct paths from SstFileMetaData
rg -n 'file_meta\.(db_path|name)' src/ --type cc --type h
```

Repository: OpenAtomFoundation/pikiwidb

Length of output: 97
🏁 Script executed:
```bash
#!/bin/bash
# Check if there's a CMakeLists or conanfile that shows RocksDB version
find . -name 'CMakeLists.txt' -o -name 'conanfile.*' -o -name 'vcpkg.json' -o -name 'package.json' | xargs grep -l rocksdb 2>/dev/null | head -5
```

Repository: OpenAtomFoundation/pikiwidb

Length of output: 223
🏁 Script executed:
```bash
#!/bin/bash
# Check CMakeLists.txt for RocksDB version
cat CMakeLists.txt | grep -i rocksdb -A5 -B5
```

Repository: OpenAtomFoundation/pikiwidb

Length of output: 1533
🏁 Script executed:
```bash
#!/bin/bash
# Search for file_meta usage with different approach
find src -type f \( -name '*.cpp' -o -name '*.cc' -o -name '*.h' -o -name '*.hpp' \) -exec rg 'file_meta\.(db_path|name)' {} + 2>/dev/null
```

Repository: OpenAtomFoundation/pikiwidb

Length of output: 251
🏁 Script executed:
```bash
#!/bin/bash
# Also check what ExtractFileNumber function does - it might give clues about the format
rg -n -A10 'ExtractFileNumber' src/storage/src/redis.cc
```

Repository: OpenAtomFoundation/pikiwidb

Length of output: 805
🌐 Web query:
RocksDB v8.7.3 SstFileMetaData name db_path format structure
💡 Result:
In RocksDB v8.7.3 (part of the v8.x series), the SstFileMetaData structure describes metadata for SST (Sorted String Table) files. It inherits from FileStorageInfo and includes fields relevant to table files. Key fields related to name, db_path, and format:
- relative_filename: The name of the file within its directory (e.g., "123456.sst"). This is the basename without path.
- directory: The directory containing the file (e.g., the db_path or column family path), without trailing '/'.
- db_path: Deprecated; previously the full path or directory, now use directory instead.
- name: Deprecated; previously the relative filename with a leading slash (e.g., "/123456.sst").

SST files follow the naming convention <file_number>.sst, where file_number is a unique uint64_t identifier for the file within the DB. The full path is typically constructed as db_path + "/" + relative_filename (e.g., "/path/to/db/000123.sst").

Other notable fields in SstFileMetaData:
- file_number: uint64_t ID of the file (used in the filename).
- size: File size in bytes.
- smallestkey, largestkey: Smallest and largest user keys.
- smallest_seqno, largest_seqno: Sequence number bounds.
- being_compacted: bool, true if currently being compacted.
- num_reads_sampled: Number of reads.
- num_entries, num_deletions: Counts.
- temperature: File temperature (kHot, kWarm, kCold).
- oldest_blob_file_number, file_creation_time, epoch_number, file_checksum, etc.

This structure is returned via APIs like DB::GetColumnFamilyMetaData in LiveFileMetaData (which extends SstFileMetaData) or ColumnFamilyMetaData. No specific changes to this structure appear in the v8.7.3 release notes; the main source is the current header, which reflects v8.x stability.
Citations:
- 1: https://github.com/facebook/rocksdb/blob/main/include/rocksdb/metadata.h
- 2: https://github.com/facebook/rocksdb/blob/main/HISTORY.md
- 3: https://javadoc.io/static/org.rocksdb/rocksdbjni/6.16.4/org/rocksdb/SstFileMetaData.html
- 4: https://javadoc.io/static/org.rocksdb/rocksdbjni/7.0.3/org/rocksdb/SstFileMetaData.html
- 5: https://javadoc.io/static/org.rocksdb/rocksdbjni/10.4.2/org/rocksdb/SstFileMetaData.html
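To see the name/directory split described above in practice, here is a hedged sketch that walks live-file metadata (RocksDB 8.x field names; `DumpSstNames` is a hypothetical helper):

```cpp
#include <rocksdb/db.h>
#include <iostream>
#include <vector>

// Prints the deprecated db_path/name pair next to the newer
// relative_filename for each live SST file.
void DumpSstNames(rocksdb::DB* db) {
  std::vector<rocksdb::LiveFileMetaData> files;
  db->GetLiveFilesMetaData(&files);
  for (const auto& f : files) {
    std::cout << f.db_path << " | " << f.name   // e.g. "/data/db" | "/000123.sst"
              << " | " << f.relative_filename   // e.g. "000123.sst"
              << "\n";
  }
}
```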
🌐 Web query:
RocksDB SstFileMetaData file path format leading slash
💡 Result:
In RocksDB's SstFileMetaData (and derived LiveFileMetaData), the file path format uses two separate fields: fileName (or relative_filename in C++), which is the filename without path (e.g., "123456.sst"), and path (or directory in C++), which is the directory path without a trailing slash. The full path is constructed as directory + "/" + relative_filename. Regarding the leading slash specifically:
- In the C++ SstFileMetaData constructor: if the input _file_name starts with '/', it sets relative_filename = _file_name.substr(1) (strips the leading '/') and the deprecated name = _file_name (keeps the '/'). Otherwise, relative_filename = _file_name and name = "/" + _file_name (adds a leading '/'). This ensures the deprecated name field always has a leading slash (e.g., "/123456.sst"), while relative_filename does not.
- In the Java JNI bindings: path returns "the full path where the file locates" (absolute or relative to the DB root), and fileName returns "the name of the file".
- RocksDB.getLiveFiles returns relative paths that "begin with '/'", confirming the leading-slash convention for relative file listings (e.g., "/000123.sst" relative to dbname/).

SST files are stored relative to the database directory (or db_paths), so paths in metadata are typically relative, with a leading '/' convention for filenames in listings/metadata.
Citations:
- 1: https://github.com/facebook/rocksdb/blob/main/include/rocksdb/metadata.h
- 2: https://javadoc.io/static/org.rocksdb/rocksdbjni/10.4.2/org/rocksdb/SstFileMetaData.html
- 3: https://javadoc.io/static/org.rocksdb/rocksdbjni/6.16.4/org/rocksdb/SstFileMetaData.html
- 4: https://javadoc.io/static/org.rocksdb/rocksdbjni/7.0.3/org/rocksdb/SstFileMetaData.html
- 5: https://javadoc.io/static/org.rocksdb/rocksdbjni/10.4.2/org/rocksdb/RocksDB.LiveFiles.html
- 6: https://javadoc.io/static/org.rocksdb/rocksdbjni/6.20.3/org/rocksdb/RocksDB.LiveFiles.html
🌐 Web query:
RocksDB CompactFiles input_file_names path matching format requirements
💡 Result:
The input_file_names parameter for RocksDB's DB::CompactFiles (or ColumnFamilyHandle::CompactFiles) is a std::vector<std::string> containing the names of SST files to compact. These must be existing, live SST files in the specified column family (default if using DB). Non-SST files like /CURRENT or /MANIFEST-xxx will cause errors such as "Specified compaction input file /000000.sst does not exist in column family default" or similar validation failures.

File name format: relative paths starting with '/', e.g., "/000013.sst", as returned by DB::GetLiveFiles or GetSortedWalFiles. These are relative to the database directory (RocksDB.getName). Absolute paths or plain basenames like "000013.sst" without the '/' prefix are invalid and will fail validation, as RocksDB checks for exact matches against internal file representations.

To obtain valid names:
- Use DB::GetLiveFiles for a simple list (filter .sst files).
- Use DB::GetLiveFilesMetaData for structured LiveFileMetaData with a .name field (e.g., "/123.sst") and level info.

Example usage:

```cpp
CompactionOptions options;
std::vector<std::string> input_file_names = {"/000012.sst", "/000013.sst"};
int output_level = 1;
Status s = db->CompactFiles(options, input_file_names, output_level);
```

Only supported in non-LITE mode. Files must overlap appropriately per LSM rules for correctness.
Citations:
- 1: https://github.com/facebook/rocksdb/wiki/Manual-Compaction
- 2: Compactfiles not respecting files to compact facebook/rocksdb#11627
- 3: https://github.com/facebook/rocksdb/blob/main/include/rocksdb/metadata.h
- 4: https://javadoc.io/static/org.rocksdb/rocksdbjni/10.4.2/org/rocksdb/RocksDB.LiveFiles.html
- 5: https://javadoc.io/static/org.rocksdb/rocksdbjni/6.20.3/org/rocksdb/RocksDB.LiveFiles.html
Fix path construction: CompactFiles expects relative paths, not absolute paths with double slashes.
The issue goes beyond the double slash: CompactFiles requires relative file paths (e.g., "/000123.sst") relative to the database directory, not absolute paths. The current code file_meta.db_path + "/" + file_meta.name concatenates an absolute path (/var/db//000123.sst), which will fail validation when matched against RocksDB's internal file representations.
SstFileMetaData::name already provides the correct relative path with leading / (e.g., "/000123.sst"). Use it directly:
🛠️ Proposed fix
```diff
- oldest_file = file_meta.db_path + "/" + file_meta.name;
+ // SstFileMetaData::name is already the relative path with leading '/'
+ oldest_file = file_meta.name;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/storage/src/redis.cc` around lines 522 - 529, The code builds an absolute
path for oldest_file by concatenating file_meta.db_path + "/" + file_meta.name,
but CompactFiles expects relative file names (SstFileMetaData::name already
contains the leading "/" relative path); update the assignment that sets
oldest_file to use file_meta.name directly (preserve setting of oldest_number
and oldest_level) so CompactFiles receives the correct relative path format;
ensure ExtractFileNumber usage and comparisons remain unchanged.
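A hedged, self-contained sketch of selecting a CompactFiles input in the relative-name format discussed above (`OldestFileAsInput` is a hypothetical helper, not the PR's code):

```cpp
#include <rocksdb/db.h>
#include <string>
#include <vector>

// Returns the first live, not-currently-compacted SST file of a column
// family, using SstFileMetaData::name, which is already the relative
// path with a leading '/' (e.g. "/000123.sst") that CompactFiles expects.
std::vector<std::string> OldestFileAsInput(rocksdb::DB* db,
                                           rocksdb::ColumnFamilyHandle* cf) {
  rocksdb::ColumnFamilyMetaData cf_meta;
  db->GetColumnFamilyMetaData(cf, &cf_meta);
  std::vector<std::string> inputs;
  for (const auto& level : cf_meta.levels) {
    for (const auto& file : level.files) {
      if (!file.being_compacted) {
        inputs.push_back(file.name);  // not db_path + "/" + name
        return inputs;
      }
    }
  }
  return inputs;  // empty if everything is being compacted
}
```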
This reverts commit 7be1b42.
- Add compaction-strategy: incremental-compact
- Use CompactFiles to process a small batch of the oldest SST files each run
- Make the number of files per run, the execution time, and the compression-rate threshold configurable
- Kept separate from the dump hard-link approach so it can be tested independently
handles_[idx] was not passed through, so when a non-kMetaCF column family triggered a force compact, the CF that actually got compacted was kMetaCF.
Force-pushed from 47284f9 to 8043759
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/storage/src/storage.cc`:
- Around line 1778-1791: The loop currently overwrites s for each inst and only
returns the last instance Status, losing earlier failures; change
IncrementalCompact handling to record and return the first non-OK Status
encountered: declare a Status first_failure (initialized OK) before the for
(const auto& inst : insts_) loop, after calling inst->IncrementalCompact(...)
check s and set first_failure = s if first_failure.ok() and !s.ok(), then
iterate compact_result_vec and if any compact_result is not ok and
first_failure.ok() set first_failure = compact_result; keep logging as-is but at
the end return first_failure (or s if you prefer), ensuring IncrementalCompact
failures from earlier insts_ or individual compact_result_vec entries are
propagated instead of discarded.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 59b2c9d9-3603-4655-aa5c-93e70291f28e
📒 Files selected for processing (14)
.gitignore, CMakeLists.txt, conf/pika.conf, include/pika_conf.h, include/pika_db.h, include/pika_server.h, src/pika_admin.cc, src/pika_conf.cc, src/pika_db.cc, src/pika_server.cc, src/storage/include/storage/storage.h, src/storage/src/redis.cc, src/storage/src/redis.h, src/storage/src/storage.cc
✅ Files skipped from review due to trivial changes (2)
- .gitignore
- src/pika_server.cc
🚧 Files skipped from review as they are similar to previous changes (5)
- src/pika_db.cc
- src/pika_admin.cc
- conf/pika.conf
- src/pika_conf.cc
- src/storage/src/redis.cc
```cpp
if (sync) {
  Status s;
  for (const auto& inst : insts_) {
    std::vector<rocksdb::Status> compact_result_vec;
    s = inst->IncrementalCompact(type, &compact_result_vec, storage::kMetaAndData,
                                 max_files, max_time_ms, min_rate,
                                 min_file_age);
    for (auto compact_result : compact_result_vec) {
      if (!compact_result.ok()) {
        LOG(ERROR) << compact_result.ToString();
      }
    }
  }
  return s;
```
Propagate the first compaction failure instead of returning only the last instance status.
If an earlier instance fails and a later one succeeds, this method returns OK and the failure is reduced to a log line. That makes the new API report success for partial failures.
Suggested fix
```diff
 Status Storage::IncrementalCompact(const DataType &type, int max_files, int max_time_ms,
                                    int min_rate, int min_file_age, bool sync) {
   if (sync) {
-    Status s;
+    Status overall = Status::OK();
     for (const auto& inst : insts_) {
       std::vector<rocksdb::Status> compact_result_vec;
-      s = inst->IncrementalCompact(type, &compact_result_vec, storage::kMetaAndData,
-                                   max_files, max_time_ms, min_rate,
-                                   min_file_age);
+      Status s = inst->IncrementalCompact(type, &compact_result_vec, storage::kMetaAndData,
+                                          max_files, max_time_ms, min_rate,
+                                          min_file_age);
+      if (!s.ok() && overall.ok()) {
+        overall = s;
+      }
       for (auto compact_result : compact_result_vec) {
         if (!compact_result.ok()) {
           LOG(ERROR) << compact_result.ToString();
+          if (overall.ok()) {
+            overall = compact_result;
+          }
         }
       }
     }
-    return s;
+    return overall;
   } else {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/storage/src/storage.cc` around lines 1778 - 1791, The loop currently
overwrites s for each inst and only returns the last instance Status, losing
earlier failures; change IncrementalCompact handling to record and return the
first non-OK Status encountered: declare a Status first_failure (initialized OK)
before the for (const auto& inst : insts_) loop, after calling
inst->IncrementalCompact(...) check s and set first_failure = s if
first_failure.ok() and !s.ok(), then iterate compact_result_vec and if any
compact_result is not ok and first_failure.ok() set first_failure =
compact_result; keep logging as-is but at the end return first_failure (or s if
you prefer), ensuring IncrementalCompact failures from earlier insts_ or
individual compact_result_vec entries are propagated instead of discarded.
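The folding the fix performs can be isolated into a few lines; a minimal sketch (the helper name is hypothetical):

```cpp
#include <rocksdb/status.h>
#include <vector>

// Returns the earliest non-OK status so partial failures are not
// masked by later successes.
rocksdb::Status FirstFailure(const std::vector<rocksdb::Status>& results) {
  for (const auto& s : results) {
    if (!s.ok()) {
      return s;
    }
  }
  return rocksdb::Status::OK();
}
```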
Actionable comments posted: 2
♻️ Duplicate comments (4)
src/pika_server.cc (1)
1240-1251: ⚠️ Potential issue | 🟡 Minor
`compact-interval` silently disables `progressive-compact`.
Because Line 1182 returns early whenever `compact-interval` is set, this branch is unreachable in that configuration. That makes `compaction-strategy = progressive-compact` a no-op with no warning. Please either reject that combination or log the precedence explicitly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/pika_server.cc` around lines 1240 - 1251, The ProgressiveCompact branch is effectively disabled when the legacy "compact-interval" setting causes an early return, so detect this conflict and handle it explicitly: before using g_pika_conf->compaction_strategy() == PikaConf::ProgressiveCompact, check the legacy compact-interval setting (the same condition that causes the early return) and either (a) reject this configuration by logging an error and aborting startup, or (b) log a clear warning that compact-interval takes precedence and progressive-compact will be ignored; include the conflicting setting names in the message and ensure the log is emitted near the decision point that sets last_progressive_compact_time_ / calls DoSameThingEveryDB(TaskType::kProgressiveCompact) so users can see why that branch will not run.

src/storage/src/storage.cc (1)
1776-1791: ⚠️ Potential issue | 🟠 Major
Return the first compaction failure instead of the last instance status.
Lines 1780-1791 overwrite `s` on each iteration. If an early instance fails and a later one succeeds, this method returns `OK` even though progressive compaction only partially succeeded.

Suggested fix

```diff
 Status Storage::ProgressiveCompact(const DataType &type, int max_files, int max_time_ms,
                                    int min_rate, int min_file_age, bool sync) {
   if (sync) {
-    Status s;
+    Status first_failure = Status::OK();
     for (const auto& inst : insts_) {
       std::vector<rocksdb::Status> compact_result_vec;
-      s = inst->ProgressiveCompact(type, &compact_result_vec, storage::kMetaAndData,
-                                   max_files, max_time_ms, min_rate,
-                                   min_file_age);
+      Status s = inst->ProgressiveCompact(type, &compact_result_vec, storage::kMetaAndData,
+                                          max_files, max_time_ms, min_rate,
+                                          min_file_age);
+      if (!s.ok() && first_failure.ok()) {
+        first_failure = s;
+      }
       for (auto compact_result : compact_result_vec) {
         if (!compact_result.ok()) {
           LOG(ERROR) << compact_result.ToString();
+          if (first_failure.ok()) {
+            first_failure = compact_result;
+          }
         }
       }
     }
-    return s;
+    return first_failure;
   } else {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/src/storage.cc` around lines 1776 - 1791, Storage::ProgressiveCompact currently overwrites the Status variable s on each insts_ iteration and can return OK if a later instance succeeds; change the logic to capture and return the first non-OK Status encountered (from inst->ProgressiveCompact or from compact_result_vec) instead of overwriting s, e.g., keep an early_failure Status initialized to OK, set it when you see the first !ok() result from inst->ProgressiveCompact or any compact_result in compact_result_vec, still log subsequent errors but do not clear early_failure, and at the end return early_failure so the method reflects the first compaction failure.

src/storage/src/redis.cc (2)
519-519: ⚠️ Potential issue | 🔴 Critical
Use RocksDB metadata file name directly for `CompactFiles` input.
Line 519 builds an absolute path from `db_path + "/" + name`; this can cause `CompactFiles` input mismatch. Use `file_meta.name` (relative name) directly.

🛠️ Proposed fix

```diff
- oldest_file = file_meta.db_path + "/" + file_meta.name;
+ oldest_file = file_meta.name;
```

For RocksDB C++ `DB::CompactFiles`, does `input_file_names` need to use metadata names like `SstFileMetaData::name` (e.g. `/000123.sst`) instead of absolute paths built with `db_path`?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/src/redis.cc` at line 519, The code builds an absolute path into oldest_file using file_meta.db_path + "/" + file_meta.name which causes a mismatch when passed to RocksDB::DB::CompactFiles; change the logic to use the RocksDB metadata file name directly (use file_meta.name / SstFileMetaData::name) when populating input_file_names for CompactFiles so the entries match RocksDB's expected metadata names (ensure any variable named oldest_file or the vector passed to CompactFiles contains the relative metadata name rather than an absolute path).
567-569: ⚠️ Potential issue | 🟠 Major
Per-CF status reporting is currently coupled across CFs.
The condition at Line 567 checks the whole result vector's last status, so one CF failure can suppress status emission for subsequent CFs. Track status per CF and always push exactly one status per CF.

🛠️ Proposed fix

```diff
 for (auto idx : handleIdxVec) {
   auto cf_start_time = std::chrono::steady_clock::now();
   int processed = 0;
+  Status cf_status = Status::OK();
   while (processed < max_files) {
@@
     if (!s.ok()) {
       LOG(WARNING) << "ProgressiveCompact failed for file " << oldest_file
                    << ": " << s.ToString();
-      if (compact_result_vec) {
-        compact_result_vec->push_back(s);
-      }
+      cf_status = s;
       break;
     }
@@
   }
-  if (compact_result_vec && (compact_result_vec->empty() || compact_result_vec->back().ok())) {
-    compact_result_vec->push_back(Status::OK());
+  if (compact_result_vec) {
+    compact_result_vec->push_back(cf_status);
   }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/src/redis.cc` around lines 567 - 569, The current logic that conditionally pushes Status::OK() into compact_result_vec based on compact_result_vec->back().ok() couples statuses across column families and can suppress statuses for later CFs; change the code around compact_result_vec to track and emit one Status per CF by computing the CF-specific Status (e.g., a local variable per CF inside the loop that processes each column family) and unconditionally push that status into compact_result_vec for each CF instead of inspecting the vector's last element, ensuring exactly one push per column family.
🧹 Nitpick comments (2)
src/pika_admin.cc (1)
2043-2058: Centralize the `compaction-strategy` string mapping.
Lines 2043-2058 and Lines 2652-2659 are now two separate sources of truth for the same enum/string conversion. A small helper in `PikaConf` would keep `CONFIG GET`, `CONFIG SET`, and rewrite paths from drifting the next time a strategy is added.

Also applies to: 2652-2659
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/pika_admin.cc` around lines 2043 - 2058, The compaction-strategy string mapping is duplicated; add a single helper on PikaConf (e.g., a static method PikaConf::CompactionStrategyToString(PikaConf::CompactionStrategy) or an instance method ToCompactionStrategyString()) that returns the corresponding string for each enum value and a sensible default, declare it in PikaConf's header, implement it in the conf source, and replace the local switch blocks in src/pika_admin.cc (the current switch that reads g_pika_conf->compaction_strategy()) and the other duplicated spot (lines referenced in the review) to call this new helper instead.
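A sketch of the single mapping helper this nitpick asks for; the name `CompactionStrategyToString` and its placement are assumptions:

```cpp
// One source of truth for CONFIG GET, CONFIG SET, and CONFIG REWRITE.
static const char* CompactionStrategyToString(PikaConf::CompactionStrategy strategy) {
  switch (strategy) {
    case PikaConf::FullCompact:
      return "full-compact";
    case PikaConf::OldestOrBestDeleteRatioSstCompact:
      return "obd-compact";
    case PikaConf::ProgressiveCompact:
      return "progressive-compact";
    default:
      return "";  // NONE / unset
  }
}
```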
src/storage/src/storage.cc (1)
1732-1745: Don't let malformed argv silently become zero.
Lines 1739-1742 use `std::atoi`, so a bad value overwrites the defaults with `0` and can turn the task into a no-op or immediate timeout. Parse these the same way the admin path does and keep the default on failure.

Suggested hardening

```diff
 } else if (task.operation == kProgressiveCompact) {
   // Parse parameters from argv, use defaults if not provided
   int max_files = 1;
   int max_time_ms = 1000;
   int min_rate = 70;
   int min_file_age = 60;
-  if (task.argv.size() > 0) max_files = std::atoi(task.argv[0].c_str());
-  if (task.argv.size() > 1) max_time_ms = std::atoi(task.argv[1].c_str());
-  if (task.argv.size() > 2) min_rate = std::atoi(task.argv[2].c_str());
-  if (task.argv.size() > 3) min_file_age = std::atoi(task.argv[3].c_str());
+  auto parse_int_arg = [](const std::vector<std::string>& argv, size_t idx, long* out) {
+    return idx < argv.size() && pstd::string2int(argv[idx].data(), argv[idx].size(), out) != 0;
+  };
+  long parsed = 0;
+  if (parse_int_arg(task.argv, 0, &parsed) && parsed > 0) {
+    max_files = static_cast<int>(parsed);
+  }
+  if (parse_int_arg(task.argv, 1, &parsed) && parsed > 0) {
+    max_time_ms = static_cast<int>(parsed);
+  }
+  if (parse_int_arg(task.argv, 2, &parsed) && parsed > 0 && parsed <= 100) {
+    min_rate = static_cast<int>(parsed);
+  }
+  if (parse_int_arg(task.argv, 3, &parsed) && parsed >= 0) {
+    min_file_age = static_cast<int>(parsed);
+  }
   ProgressiveCompact(task.type, max_files, max_time_ms, min_rate, min_file_age, true);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/src/storage.cc` around lines 1732 - 1745, The argv parsing for the kProgressiveCompact branch is unsafe: current use of std::atoi will turn malformed strings into 0 and overwrite defaults; update the parsing logic around task.argv[0..3] to mirror the admin path parsing (validate each value before assigning) — e.g., use a safe parse (strtol with endptr or std::stoi inside try/catch) for each parameter (max_files, max_time_ms, min_rate, min_file_age) and only replace the default when parsing succeeds and the value is in a valid range; then pass the validated values into ProgressiveCompact(task.type, max_files, max_time_ms, min_rate, min_file_age, true).
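The same hardening can be expressed with the `strtol`-plus-`endptr` idiom the prompt mentions; a self-contained sketch (`ParseIntStrict` is a hypothetical helper):

```cpp
#include <cstdlib>
#include <string>

// Parses a whole decimal string into *out, keeping the caller's default
// when the input is malformed or outside [min, max].
bool ParseIntStrict(const std::string& in, long min, long max, int* out) {
  char* end = nullptr;
  long v = std::strtol(in.c_str(), &end, 10);
  if (end == in.c_str() || *end != '\0' || v < min || v > max) {
    return false;  // *out (the default) is left untouched
  }
  *out = static_cast<int>(v);
  return true;
}
```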
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/pika_server.cc`:
- Around line 1240-1251: This branch for progressive-compact must perform the
same free-disk ratio guard as AutoCompactRange before scheduling work: locate
the free/total disk space check used in AutoCompactRange (the calculation using
free_size/total_size and the configured threshold from g_pika_conf) and
copy/reuse that logic right before the call to
DoSameThingEveryDB(TaskType::kProgressiveCompact) in the progressive-compact
branch (the block that checks progressive_compact_interval() and
last_progressive_compact_time_); if free/total is below the configured
threshold, skip calling DoSameThingEveryDB.
In `@src/storage/src/redis.cc`:
- Around line 385-390: The current loop overwrites compact_result on each call
to db_->CompactRange(handles_[idx], &start_key, &stop_key), which can mask an
earlier failure if a later compaction succeeds; update the logic in the block
that calls db_->CompactRange (use compact_result, handles_[idx], file_path, idx)
to preserve the first non-OK status: either break out of the loop immediately on
a non-OK compact_result, or keep a separate first_error variable and only set it
when compact_result is not OK (and avoid overwriting it on subsequent OK
results), and continue to log using that preserved error instead of the last
compact_result.
---
Duplicate comments:
In `@src/pika_server.cc`:
- Around line 1240-1251: The ProgressiveCompact branch is effectively disabled
when the legacy "compact-interval" setting causes an early return, so detect
this conflict and handle it explicitly: before using
g_pika_conf->compaction_strategy() == PikaConf::ProgressiveCompact, check the
legacy compact-interval setting (the same condition that causes the early
return) and either (a) reject this configuration by logging an error and
aborting startup, or (b) log a clear warning that compact-interval takes
precedence and progressive-compact will be ignored; include the conflicting
setting names in the message and ensure the log is emitted near the decision
point that sets last_progressive_compact_time_ / calls
DoSameThingEveryDB(TaskType::kProgressiveCompact) so users can see why that
branch will not run.
In `@src/storage/src/redis.cc`:
- Line 519: The code builds an absolute path into oldest_file using
file_meta.db_path + "/" + file_meta.name which causes a mismatch when passed to
RocksDB::DB::CompactFiles; change the logic to use the RocksDB metadata file
name directly (use file_meta.name / SstFileMetaData::name) when populating
input_file_names for CompactFiles so the entries match RocksDB's expected
metadata names (ensure any variable named oldest_file or the vector passed to
CompactFiles contains the relative metadata name rather than an absolute path).
- Around line 567-569: The current logic that conditionally pushes Status::OK()
into compact_result_vec based on compact_result_vec->back().ok() couples
statuses across column families and can suppress statuses for later CFs; change
the code around compact_result_vec to track and emit one Status per CF by
computing the CF-specific Status (e.g., a local variable per CF inside the loop
that processes each column family) and unconditionally push that status into
compact_result_vec for each CF instead of inspecting the vector's last element,
ensuring exactly one push per column family.
In `@src/storage/src/storage.cc`:
- Around line 1776-1791: Storage::ProgressiveCompact currently overwrites the
Status variable s on each insts_ iteration and can return OK if a later instance
succeeds; change the logic to capture and return the first non-OK Status
encountered (from inst->ProgressiveCompact or from compact_result_vec) instead
of overwriting s, e.g., keep an early_failure Status initialized to OK, set it
when you see the first !ok() result from inst->ProgressiveCompact or any
compact_result in compact_result_vec, still log subsequent errors but do not
clear early_failure, and at the end return early_failure so the method reflects
the first compaction failure.
---
Nitpick comments:
In `@src/pika_admin.cc`:
- Around line 2043-2058: The compaction-strategy string mapping is duplicated;
add a single helper on PikaConf (e.g., a static method
PikaConf::CompactionStrategyToString(PikaConf::CompactionStrategy) or an
instance method ToCompactionStrategyString()) that returns the corresponding
string for each enum value and a sensible default, declare it in PikaConf's
header, implement it in the conf source, and replace the local switch blocks in
src/pika_admin.cc (the current switch that reads
g_pika_conf->compaction_strategy()) and the other duplicated spot (lines
referenced in the review) to call this new helper instead.
In `@src/storage/src/storage.cc`:
- Around line 1732-1745: The argv parsing for the kProgressiveCompact branch is
unsafe: current use of std::atoi will turn malformed strings into 0 and
overwrite defaults; update the parsing logic around task.argv[0..3] to mirror
the admin path parsing (validate each value before assigning) — e.g., use a safe
parse (strtol with endptr or std::stoi inside try/catch) for each parameter
(max_files, max_time_ms, min_rate, min_file_age) and only replace the default
when parsing succeeds and the value is in a valid range; then pass the validated
values into ProgressiveCompact(task.type, max_files, max_time_ms, min_rate,
min_file_age, true).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3c825f50-2ad3-47ae-a75c-b8b89074eaf1
📒 Files selected for processing (12)
conf/pika.conf, include/pika_conf.h, include/pika_db.h, include/pika_server.h, src/pika_admin.cc, src/pika_conf.cc, src/pika_db.cc, src/pika_server.cc, src/storage/include/storage/storage.h, src/storage/src/redis.cc, src/storage/src/redis.h, src/storage/src/storage.cc
🚧 Files skipped from review as they are similar to previous changes (1)
- conf/pika.conf
```cpp
} else if (g_pika_conf->compaction_strategy() == PikaConf::ProgressiveCompact) {
  struct timeval now;
  gettimeofday(&now, nullptr);
  int interval = g_pika_conf->progressive_compact_interval();
  if (interval <= 0) {
    return;
  }
  if (last_progressive_compact_time_.tv_sec == 0 ||
      now.tv_sec - last_progressive_compact_time_.tv_sec >= interval) {
    gettimeofday(&last_progressive_compact_time_, nullptr);
    DoSameThingEveryDB(TaskType::kProgressiveCompact);
  }
```
Gate progressive compaction on free disk space.
This new path bypasses the only disk-space safety check in AutoCompactRange(). When compaction-strategy = progressive-compact, it can still schedule manual compaction even though the compact-interval and compact-cron paths would skip compaction under low free space. Please reuse the same free_size / total_size guard before calling DoSameThingEveryDB(TaskType::kProgressiveCompact).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/pika_server.cc` around lines 1240 - 1251, This branch for
progressive-compact must perform the same free-disk ratio guard as
AutoCompactRange before scheduling work: locate the free/total disk space check
used in AutoCompactRange (the calculation using free_size/total_size and the
configured threshold from g_pika_conf) and copy/reuse that logic right before
the call to DoSameThingEveryDB(TaskType::kProgressiveCompact) in the
progressive-compact branch (the block that checks progressive_compact_interval()
and last_progressive_compact_time_); if free/total is below the configured
threshold, skip calling DoSameThingEveryDB.
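A hedged sketch of the free-disk guard being asked for; the real check in `AutoCompactRange` uses Pika's configured sizes, so treat the `statvfs`-based helper below as illustrative only:

```cpp
#include <sys/statvfs.h>

// Returns true when the filesystem holding `path` still has at least
// `min_free_ratio` (0.0 - 1.0) of its capacity available.
bool EnoughFreeDisk(const char* path, double min_free_ratio) {
  struct statvfs vfs {};
  if (statvfs(path, &vfs) != 0) {
    return false;  // be conservative if the query itself fails
  }
  double total = static_cast<double>(vfs.f_blocks) * vfs.f_frsize;
  double avail = static_cast<double>(vfs.f_bavail) * vfs.f_frsize;
  return total > 0 && (avail / total) >= min_free_ratio;
}
```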
```cpp
compact_result = db_->CompactRange(default_compact_range_options_, handles_[idx], &start_key, &stop_key);
if (!compact_result.ok()) {
  LOG(WARNING) << handles_[idx]->GetName()
               << " force CompactRange failed: " << compact_result.ToString()
               << " file=" << file_path;
}
```
Forced CompactRange failures may be masked by later success in the same CF.
On failure you only log and continue scanning; a later successful compaction can overwrite compact_result and report OK. Preserve first error (or break immediately) to avoid false-success reporting.
🛠️ Proposed fix
```diff
 compact_result = db_->CompactRange(default_compact_range_options_, handles_[idx], &start_key, &stop_key);
 if (!compact_result.ok()) {
   LOG(WARNING) << handles_[idx]->GetName()
                << " force CompactRange failed: " << compact_result.ToString()
                << " file=" << file_path;
+  break;
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/storage/src/redis.cc` around lines 385 - 390, The current loop overwrites
compact_result on each call to db_->CompactRange(handles_[idx], &start_key,
&stop_key), which can mask an earlier failure if a later compaction succeeds;
update the logic in the block that calls db_->CompactRange (use compact_result,
handles_[idx], file_path, idx) to preserve the first non-OK status: either break
out of the loop immediately on a non-OK compact_result, or keep a separate
first_error variable and only set it when compact_result is not OK (and avoid
overwriting it on subsequent OK results), and continue to log using that
preserved error instead of the last compact_result.
1. Fix the obd-compact bug
2. Add an experimental progressive compact
fix: #3230

about: #3226
about: #3240
Summary by CodeRabbit
- New Features
- Configuration
- Admin
- Storage
- Chores