feat: progressive compact #3243
chenbt-hz wants to merge 11 commits into OpenAtomFoundation:unstable
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pull request overview
This PR fixes an OBD compaction bug and introduces a new “incremental-compact” strategy intended to run compact work progressively with tunable limits (files/time/rate/age), wired through server scheduling and configuration.
Changes:
- Fix OBD compaction to compact the correct RocksDB Column Family and improve failure logging.
- Add incremental compaction end-to-end: storage API + Redis implementation + server scheduling + CONFIG GET/SET + default config entries.
- Minor build/config hygiene updates (key encoding fix, CMake build_version source handling, .gitignore formatting, added .claude local settings file).
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/storage/src/storage.cc | Adds BG task dispatch + Storage wrapper for IncrementalCompact with argv parameter passing. |
| src/storage/src/redis.h | Exposes new Redis::IncrementalCompact API. |
| src/storage/src/redis.cc | Fixes OBD CF compaction call; implements incremental compaction via CompactFiles. |
| src/storage/include/storage/storage_define.h | Fixes pointer advancement bug in EncodeUserKey. |
| src/storage/include/storage/storage.h | Adds new background operation enum and Storage::IncrementalCompact API. |
| src/pika_server.cc | Schedules incremental compaction based on a configurable interval. |
| src/pika_db.cc | Wires DB-level call into storage_->IncrementalCompact with config knobs. |
| src/pika_conf.cc | Parses new incremental-compact-* configs and supports incremental-compact strategy. |
| src/pika_admin.cc | Adds CONFIG GET/SET support for incremental compact parameters. |
| include/pika_server.h | Adds TaskType::kIncrementalCompact and tracking timestamp for interval gating. |
| include/pika_db.h | Declares DB::IncrementalCompact. |
| include/pika_conf.h | Adds getters/setters + stored fields for incremental compact parameters and strategy enum. |
| conf/pika.conf | Documents and adds defaults for incremental-compact-* config keys. |
| CMakeLists.txt | Adjusts build-version source handling (removes src/build_version.cc from DIR_SRCS, simplifies generated source list). |
| .gitignore | Fixes formatting of tests/tmp entry. |
| .claude/settings.local.json | Adds a Claude local settings/permissions file to the repo. |
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/pika_conf.cc (1)
914-931: ⚠️ Potential issue | 🟠 Major
CONFIG REWRITE still drops the active compaction strategy.
`cs_` is never derived from `compaction_strategy_` before `SetConfStr("compaction-strategy", cs_)` runs, so rewrite persists an empty value. After restart, `Load()` falls back to `NONE`, which means the new `incremental-compact` mode is not actually durable.

Suggested fix

```diff
-  std::string cs_;
-  SetConfStr("compaction-strategy", cs_);
-  if (cs_ == "full-compact") {
-    compaction_strategy_ = FullCompact;
-  } else if (cs_ == "obd-compact") {
-    compaction_strategy_ = OldestOrBestDeleteRatioSstCompact;
-  } else if (cs_ == "incremental-compact") {
-    compaction_strategy_ = IncrementalCompact;
-  } else {
-    compaction_strategy_ = NONE;
-  }
+  std::string cs;
+  switch (compaction_strategy_) {
+    case FullCompact:
+      cs = "full-compact";
+      break;
+    case OldestOrBestDeleteRatioSstCompact:
+      cs = "obd-compact";
+      break;
+    case IncrementalCompact:
+      cs = "incremental-compact";
+      break;
+    default:
+      cs.clear();
+      break;
+  }
+  SetConfStr("compaction-strategy", cs);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/pika_conf.cc` around lines 914 - 931, The config-rewrite logic calls SetConfStr("compaction-strategy", cs_) but cs_ is uninitialized so the current compaction_strategy_ isn't persisted; change the flow in the Save/Rewrite path to derive cs_ from the existing compaction_strategy_ before calling SetConfStr (e.g., map compaction_strategy_ enum values FullCompact, OldestOrBestDeleteRatioSstCompact, IncrementalCompact, NONE into the string cs_), then call SetConfStr("compaction-strategy", cs_) and keep the subsequent SetConfInt calls for incremental_* so the active incremental-compact mode is durably written.
🧹 Nitpick comments (3)
include/pika_server.h (1)
554-554: Minor: initialization style differs from sibling fields.
`last_check_compact_time_` and `last_check_resume_time_` are initialized in the `PikaServer` constructor initializer list (see `src/pika_server.cc` L48-49), while `last_incremental_compact_time_` uses an inline default member initializer here. Functionally equivalent, but consider initializing it in the same place as the other timing fields for consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@include/pika_server.h` at line 554, Move the inline default initialization of the timeval member last_incremental_compact_time_ out of the class declaration and initialize it alongside the other timing fields in the PikaServer constructor initializer list (same place where last_check_compact_time_ and last_check_resume_time_ are initialized in PikaServer), ensuring the member declaration becomes just "struct timeval last_incremental_compact_time_;" and the constructor sets it to {0, 0}.

src/storage/src/redis.cc (2)
6-9: `<thread>` is not used in this translation unit.
`std::chrono` (`<chrono>`) and `std::time` (`<ctime>`) are used by `IncrementalCompact`, but I don't see any `std::thread`, `std::this_thread`, or `std::jthread` reference introduced. Consider dropping `#include <thread>` to avoid unnecessary header bloat.

♻️ Proposed change

```diff
 #include <chrono>
 #include <sstream>
 #include <ctime>
-#include <thread>
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/src/redis.cc` around lines 6 - 9, The translation unit includes <thread> which isn't used; remove the unnecessary `#include` <thread> from the top of the file to avoid header bloat and rebuild; check symbols like IncrementalCompact to ensure it still compiles (it uses <chrono> and <ctime> only) and no std::thread/std::this_thread references remain.
557-569: Minor: rate computation uses int division and ignores the unconditional `processed++` semantics.
Two small things worth noting here:
- `job_info.stats.num_output_records` and `num_input_records` are `uint64_t`; multiplying by 100 first is safe, but assigning to `int rate` is surprising. Using `uint64_t rate = ... * 100 / ...;` or a double would be clearer and avoids any future surprise if the ratio ever legitimately exceeds `INT_MAX / 100`.
- `processed++` is skipped whenever `rate >= min_rate` because of the `break`, but it is incremented when `num_input_records == 0` (e.g., a file whose entries were all filtered out). The quota semantics are therefore "files that were compacted and still had low compression benefit"; document this or rename `max_files` → `max_low_benefit_files` to match the comparison with v4's strict `max_files` quota noted in the PR description.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/src/redis.cc` around lines 557 - 569, The rate calculation and processed-count semantics in the IncrementalCompact loop need tightening: change the local rate from int to an unsigned 64-bit (or double) to compute rate = job_info.stats.num_output_records * 100 / job_info.stats.num_input_records safely (use uint64_t or double for rate) and ensure processed is incremented for every examined file (move or add processed++ before the rate >= min_rate break) or explicitly rename/clarify the quota variable (e.g., max_low_benefit_files) to reflect that files which trigger the break are not counted as processed; update references to rate, job_info.stats.num_output_records, job_info.stats.num_input_records, processed, min_rate, and max_files accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/pika_server.cc`:
- Around line 1240-1249: Clamp the interval returned by
g_pika_conf->incremental_compact_interval() to a sensible positive minimum
(e.g., >=1) before using it in the comparison with
last_incremental_compact_time_ so a zero/negative config cannot disable
throttling; then, detect when compact-interval (the AutoCompactRange path) is
also set and either log a warning that compact-interval takes precedence over
compaction_strategy == PikaConf::IncrementalCompact or explicitly
document/handle this precedence; update the block that calls
DoSameThingEveryDB(TaskType::kIncrementalCompact) and add the warning near where
AutoCompactRange early-returns so users aren’t silently ignored.
In `@src/storage/src/redis.cc`:
- Around line 492-499: The IncrementalCompact loop resets cf_start_time for each
column family (cf_start_time used inside IncrementalCompact loop with
handles_[idx]->GetName() and max_time_ms), so max_time_ms acts as a per-CF
budget and total wall time can be K * max_time_ms; either make the budget global
by moving/start a single start_time outside the CF loop and compare elapsed
against max_time_ms for the whole IncrementalCompact invocation, or rename the
config key and code/comment to incremental-compact-per-cf-max-time-ms (and
update documentation/conf/pika.conf) to make per-CF semantics explicit; pick one
approach and update the related logic and comments referencing cf_start_time,
max_time_ms, and IncrementalCompact accordingly.
- Around line 551-574: The code currently logs failures from
db_->CompactFiles(...) (the local variable s in the CompactFiles loop) but then
unconditionally pushes Status::OK() into compact_result_vec, hiding real errors;
change the logic so that when CompactFiles returns a non-OK Status (the s
checked in the if (!s.ok()) branch) you record that Status into
compact_result_vec (or record the first non-OK per column family) instead of
Status::OK(), and ensure any early break from the inner loop preserves and
pushes that failing Status for the corresponding CF; adjust the
push_back(Status::OK()) at the end to only push OK when no failures were
observed for that CF.
- Around line 466-472: in_compact_flag_ (the std::atomic<bool> used with
compare_exchange_weak in methods like the compact routine) is not explicitly
initialized which can yield indeterminate behavior; fix this by initializing
in_compact_flag_ to false either in its member declaration or in the class
constructor initializer list so compare_exchange_weak reads a well-defined value
and correctly prevents concurrent compactions when using the in_compact_flag_
check/DEFER pattern.
- Around line 522-529: The code builds an absolute path for oldest_file by
concatenating file_meta.db_path + "/" + file_meta.name, but CompactFiles expects
relative file names (SstFileMetaData::name already contains the leading "/"
relative path); update the assignment that sets oldest_file to use
file_meta.name directly (preserve setting of oldest_number and oldest_level) so
CompactFiles receives the correct relative path format; ensure ExtractFileNumber
usage and comparisons remain unchanged.
In `@src/storage/src/storage.cc`:
- Around line 1793-1799: The background task is being enqueued with type ==
DataType::kAll which makes Storage::AddBGTask clear the whole queue; change the
task type passed to AddBGTask so it is not DataType::kAll (i.e., use a specific
non‑all data type or a dedicated incremental type) when creating the async
incremental compaction task — update the call that currently does
AddBGTask({type, kIncrementalCompact, {...}}) to pass an explicit non‑kAll
DataType (or a new DataType::kIncremental if available) so Storage::AddBGTask
will not drop existing queued work.
---
Outside diff comments:
In `@src/pika_conf.cc`:
- Around line 914-931: The config-rewrite logic calls
SetConfStr("compaction-strategy", cs_) but cs_ is uninitialized so the current
compaction_strategy_ isn't persisted; change the flow in the Save/Rewrite path
to derive cs_ from the existing compaction_strategy_ before calling SetConfStr
(e.g., map compaction_strategy_ enum values FullCompact,
OldestOrBestDeleteRatioSstCompact, IncrementalCompact, NONE into the string
cs_), then call SetConfStr("compaction-strategy", cs_) and keep the subsequent
SetConfInt calls for incremental_* so the active incremental-compact mode is
durably written.
---
Nitpick comments:
In `@include/pika_server.h`:
- Line 554: Move the inline default initialization of the timeval member
last_incremental_compact_time_ out of the class declaration and initialize it
alongside the other timing fields in the PikaServer constructor initializer list
(same place where last_check_compact_time_ and last_check_resume_time_ are
initialized in PikaServer), ensuring the member declaration becomes just "struct
timeval last_incremental_compact_time_;" and the constructor sets it to {0, 0}.
In `@src/storage/src/redis.cc`:
- Around line 6-9: The translation unit includes <thread> which isn't used;
remove the unnecessary `#include` <thread> from the top of the file to avoid
header bloat and rebuild; check symbols like IncrementalCompact to ensure it
still compiles (it uses <chrono> and <ctime> only) and no
std::thread/std::this_thread references remain.
- Around line 557-569: The rate calculation and processed-count semantics in the
IncrementalCompact loop need tightening: change the local rate from int to an
unsigned 64-bit (or double) to compute rate = job_info.stats.num_output_records
* 100 / job_info.stats.num_input_records safely (use uint64_t or double for
rate) and ensure processed is incremented for every examined file (move or add
processed++ before the rate >= min_rate break) or explicitly rename/clarify the
quota variable (e.g., max_low_benefit_files) to reflect that files which trigger
the break are not counted as processed; update references to rate,
job_info.stats.num_output_records, job_info.stats.num_input_records, processed,
min_rate, and max_files accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3ceb47fa-89e8-43f5-a921-f3280492d2b8
📒 Files selected for processing (15)
.gitignore, CMakeLists.txt, conf/pika.conf, include/pika_conf.h, include/pika_db.h, include/pika_server.h, src/pika_admin.cc, src/pika_conf.cc, src/pika_db.cc, src/pika_server.cc, src/storage/include/storage/storage.h, src/storage/include/storage/storage_define.h, src/storage/src/redis.cc, src/storage/src/redis.h, src/storage/src/storage.cc
```cpp
        uint64_t number = ExtractFileNumber(file_meta.name);
        if (number < oldest_number) {
          oldest_number = number;
          oldest_file = file_meta.db_path + "/" + file_meta.name;
          oldest_level = level_meta.level;
        }
      }
    }
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Verify the format of SstFileMetaData::name and db_path in the RocksDB version used.
fd -t f 'metadata.h' | xargs rg -n -C2 'SstFileMetaData|struct .*FileMetaData' | head -200
# Find any CompactFiles usage in this repo that already constructs the path correctly as reference.
rg -nP -C3 'CompactFiles\s*\('
```

Repository: OpenAtomFoundation/pikiwidb
Length of output: 553
🏁 Script executed:
```shell
#!/bin/bash
# Read the context around the flagged lines (522-529) to understand the full flow
sed -n '500,570p' src/storage/src/redis.cc | cat -n
```

Repository: OpenAtomFoundation/pikiwidb
Length of output: 2855
🏁 Script executed:
```shell
#!/bin/bash
# Find all occurrences of oldest_file in the file to understand how it's used
rg -n 'oldest_file' src/storage/src/redis.cc
```

Repository: OpenAtomFoundation/pikiwidb
Length of output: 518
🏁 Script executed:
```shell
#!/bin/bash
# Look for input_files construction to see if oldest_file or similar paths are added to it
rg -n -A5 -B5 'input_files\s*[.push|=]' src/storage/src/redis.cc | head -100
```

Repository: OpenAtomFoundation/pikiwidb
Length of output: 53
🏁 Script executed:
```shell
#!/bin/bash
# Search for RocksDB includes and version information
rg -n 'include.*rocksdb|rocksdb.*version' src/storage/src/redis.cc | head -20
```

Repository: OpenAtomFoundation/pikiwidb
Length of output: 325
🏁 Script executed:
```shell
#!/bin/bash
# Find if there are any other places in the codebase that construct paths from SstFileMetaData
rg -n 'file_meta\.(db_path|name)' src/ --type cc --type h
```

Repository: OpenAtomFoundation/pikiwidb
Length of output: 97
🏁 Script executed:
```shell
#!/bin/bash
# Check if there's a CMakeLists or conanfile that shows RocksDB version
find . -name 'CMakeLists.txt' -o -name 'conanfile.*' -o -name 'vcpkg.json' -o -name 'package.json' | xargs grep -l rocksdb 2>/dev/null | head -5
```

Repository: OpenAtomFoundation/pikiwidb
Length of output: 223
🏁 Script executed:
```shell
#!/bin/bash
# Check CMakeLists.txt for RocksDB version
cat CMakeLists.txt | grep -i rocksdb -A5 -B5
```

Repository: OpenAtomFoundation/pikiwidb
Length of output: 1533
🏁 Script executed:
```shell
#!/bin/bash
# Search for file_meta usage with different approach
find src -type f \( -name '*.cpp' -o -name '*.cc' -o -name '*.h' -o -name '*.hpp' \) -exec rg 'file_meta\.(db_path|name)' {} + 2>/dev/null
```

Repository: OpenAtomFoundation/pikiwidb
Length of output: 251
🏁 Script executed:
```shell
#!/bin/bash
# Also check what ExtractFileNumber function does - it might give clues about the format
rg -n -A10 'ExtractFileNumber' src/storage/src/redis.cc
```

Repository: OpenAtomFoundation/pikiwidb
Length of output: 805
🌐 Web query:
RocksDB v8.7.3 SstFileMetaData name db_path format structure
💡 Result:
In RocksDB v8.7.3 (part of the v8.x series), the SstFileMetaData structure describes metadata for SST (Sorted String Table) files. It inherits from FileStorageInfo and includes fields relevant to table files. Key fields related to name, db_path, and format:
- relative_filename: The name of the file within its directory (e.g., "123456.sst"). This is the basename without path.
- directory: The directory containing the file (e.g., the db_path or column family path), without trailing '/'.
- db_path: Deprecated; previously the full path or directory, now use directory instead.
- name: Deprecated; previously the relative filename with leading slash (e.g., "/123456.sst").

SST files follow the naming convention <file_number>.sst, where file_number is a unique uint64_t identifier for the file within the DB. The full path is typically constructed as db_path + "/" + relative_filename (e.g., "/path/to/db/000123.sst"). Other notable fields in SstFileMetaData:
- file_number: uint64_t ID of the file (used in filename).
- size: File size in bytes.
- smallestkey, largestkey: Smallest and largest user keys.
- smallest_seqno, largest_seqno: Sequence number bounds.
- being_compacted: bool if currently being compacted.
- num_reads_sampled: Number of reads.
- num_entries, num_deletions: Counts.
- temperature: File temperature (kHot, kWarm, kCold).
- oldest_blob_file_number, file_creation_time, epoch_number, file_checksum, etc.

This structure is returned via APIs like DB::GetColumnFamilyMetaData in LiveFileMetaData (which extends SstFileMetaData) or ColumnFamilyMetaData. No specific changes to this structure in the v8.7.3 release notes; the main source is the current header, which reflects v8.x stability.
Citations:
- 1: https://github.com/facebook/rocksdb/blob/main/include/rocksdb/metadata.h
- 2: https://github.com/facebook/rocksdb/blob/main/HISTORY.md
- 3: https://javadoc.io/static/org.rocksdb/rocksdbjni/6.16.4/org/rocksdb/SstFileMetaData.html
- 4: https://javadoc.io/static/org.rocksdb/rocksdbjni/7.0.3/org/rocksdb/SstFileMetaData.html
- 5: https://javadoc.io/static/org.rocksdb/rocksdbjni/10.4.2/org/rocksdb/SstFileMetaData.html
🌐 Web query:
RocksDB SstFileMetaData file path format leading slash
💡 Result:
In RocksDB's SstFileMetaData (and derived LiveFileMetaData), the file path format uses two separate fields: fileName (or relative_filename in C++), which is the filename without path (e.g., "123456.sst"), and path (or directory in C++), which is the directory path without trailing slash. The full path is constructed as directory + "/" + relative_filename. Regarding the leading slash specifically:
- In the C++ SstFileMetaData constructor: if the input _file_name starts with '/', it sets relative_filename = _file_name.substr(1) (strips the leading '/') and the deprecated name = _file_name (keeps the '/'). Otherwise, relative_filename = _file_name and name = "/" + _file_name (adds a leading '/'). This ensures the deprecated name field always has a leading slash (e.g., "/123456.sst"), while relative_filename does not.
- In the Java JNI bindings: path returns "the full path where the file locates" (absolute or relative to DB root), and fileName returns "the name of the file".
- In RocksDB.getLiveFiles: returns relative paths that "begin with '/'", confirming the convention of a leading slash for relative file listings (e.g., "/000123.sst" relative to dbname/).

SST files are stored relative to the database directory (or db_paths), so paths in metadata are typically relative, with the leading '/' convention for filenames in listings/metadata.
Citations:
- 1: https://github.com/facebook/rocksdb/blob/main/include/rocksdb/metadata.h
- 2: https://javadoc.io/static/org.rocksdb/rocksdbjni/10.4.2/org/rocksdb/SstFileMetaData.html
- 3: https://javadoc.io/static/org.rocksdb/rocksdbjni/6.16.4/org/rocksdb/SstFileMetaData.html
- 4: https://javadoc.io/static/org.rocksdb/rocksdbjni/7.0.3/org/rocksdb/SstFileMetaData.html
- 5: https://javadoc.io/static/org.rocksdb/rocksdbjni/10.4.2/org/rocksdb/RocksDB.LiveFiles.html
- 6: https://javadoc.io/static/org.rocksdb/rocksdbjni/6.20.3/org/rocksdb/RocksDB.LiveFiles.html
🌐 Web query:
RocksDB CompactFiles input_file_names path matching format requirements
💡 Result:
The input_file_names parameter for RocksDB's DB::CompactFiles (or ColumnFamilyHandle::CompactFiles) is a std::vector<std::string> containing the names of SST files to compact. These must be existing, live SST files in the specified column family (default if using DB). Non-SST files like /CURRENT or /MANIFEST-xxx will cause errors such as "Specified compaction input file /000000.sst does not exist in column family default" or similar validation failures.

File name format: relative paths starting with '/', e.g., "/000013.sst", as returned by DB::GetLiveFiles or GetSortedWalFiles. These are relative to the database directory (RocksDB.getName). Absolute paths or plain basenames like "000013.sst" without the '/' prefix are invalid and will fail validation, as RocksDB checks for exact matches against internal file representations. To obtain valid names:
- Use DB::GetLiveFiles for a simple list (filter .sst files).
- Use DB::GetLiveFilesMetaData for structured LiveFileMetaData with a .name field (e.g., "/123.sst") and level info.

Example usage:

```cpp
CompactionOptions options;
std::vector<std::string> input_file_names = {"/000012.sst", "/000013.sst"};
int output_level = 1;
Status s = db->CompactFiles(options, input_file_names, output_level);
```

Only supported in non-LITE mode. Files must overlap appropriately per LSM rules for correctness.
Citations:
- 1: https://github.com/facebook/rocksdb/wiki/Manual-Compaction
- 2: Compactfiles not respecting files to compact facebook/rocksdb#11627
- 3: https://github.com/facebook/rocksdb/blob/main/include/rocksdb/metadata.h
- 4: https://javadoc.io/static/org.rocksdb/rocksdbjni/10.4.2/org/rocksdb/RocksDB.LiveFiles.html
- 5: https://javadoc.io/static/org.rocksdb/rocksdbjni/6.20.3/org/rocksdb/RocksDB.LiveFiles.html
Fix path construction: CompactFiles expects relative paths, not absolute paths with double slashes.
The issue goes beyond the double slash: CompactFiles requires relative file paths (e.g., "/000123.sst") relative to the database directory, not absolute paths. The current code file_meta.db_path + "/" + file_meta.name concatenates an absolute path (/var/db//000123.sst), which will fail validation when matched against RocksDB's internal file representations.
SstFileMetaData::name already provides the correct relative path with leading / (e.g., "/000123.sst"). Use it directly:
🛠️ Proposed fix

```diff
- oldest_file = file_meta.db_path + "/" + file_meta.name;
+ // SstFileMetaData::name is already the relative path with leading '/'
+ oldest_file = file_meta.name;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/storage/src/redis.cc` around lines 522 - 529, The code builds an absolute
path for oldest_file by concatenating file_meta.db_path + "/" + file_meta.name,
but CompactFiles expects relative file names (SstFileMetaData::name already
contains the leading "/" relative path); update the assignment that sets
oldest_file to use file_meta.name directly (preserve setting of oldest_number
and oldest_level) so CompactFiles receives the correct relative path format;
ensure ExtractFileNumber usage and comparisons remain unchanged.
This reverts commit 7be1b42.
- Add compaction-strategy: incremental-compact
- Use CompactFiles to process a small batch of the oldest SST files per run
- Configurable per-run limits: file count, execution time, and compression-rate threshold
- Kept separate from the dump hard-link approach so it can be tested independently
handles_[idx] was not passed through, so when a force compact was triggered for a CF other than kMetaCF, the CF actually compacted was kMetaCF.
Force-pushed from 47284f9 to 8043759
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/storage/src/storage.cc`:
- Around line 1778-1791: The loop currently overwrites s for each inst and only
returns the last instance Status, losing earlier failures; change
IncrementalCompact handling to record and return the first non-OK Status
encountered: declare a Status first_failure (initialized OK) before the for
(const auto& inst : insts_) loop, after calling inst->IncrementalCompact(...)
check s and set first_failure = s if first_failure.ok() and !s.ok(), then
iterate compact_result_vec and if any compact_result is not ok and
first_failure.ok() set first_failure = compact_result; keep logging as-is but at
the end return first_failure (or s if you prefer), ensuring IncrementalCompact
failures from earlier insts_ or individual compact_result_vec entries are
propagated instead of discarded.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 59b2c9d9-3603-4655-aa5c-93e70291f28e
📒 Files selected for processing (14)
.gitignore, CMakeLists.txt, conf/pika.conf, include/pika_conf.h, include/pika_db.h, include/pika_server.h, src/pika_admin.cc, src/pika_conf.cc, src/pika_db.cc, src/pika_server.cc, src/storage/include/storage/storage.h, src/storage/src/redis.cc, src/storage/src/redis.h, src/storage/src/storage.cc
✅ Files skipped from review due to trivial changes (2)
- .gitignore
- src/pika_server.cc
🚧 Files skipped from review as they are similar to previous changes (5)
- src/pika_db.cc
- src/pika_admin.cc
- conf/pika.conf
- src/pika_conf.cc
- src/storage/src/redis.cc
```cpp
  if (sync) {
    Status s;
    for (const auto& inst : insts_) {
      std::vector<rocksdb::Status> compact_result_vec;
      s = inst->IncrementalCompact(type, &compact_result_vec, storage::kMetaAndData,
                                   max_files, max_time_ms, min_rate,
                                   min_file_age);
      for (auto compact_result : compact_result_vec) {
        if (!compact_result.ok()) {
          LOG(ERROR) << compact_result.ToString();
        }
      }
    }
    return s;
```
Propagate the first compaction failure instead of returning only the last instance status.
If an earlier instance fails and a later one succeeds, this method returns OK and the failure is reduced to a log line. That makes the new API report success for partial failures.
Suggested fix

```diff
 Status Storage::IncrementalCompact(const DataType &type, int max_files, int max_time_ms,
                                    int min_rate, int min_file_age, bool sync) {
   if (sync) {
-    Status s;
+    Status overall = Status::OK();
     for (const auto& inst : insts_) {
       std::vector<rocksdb::Status> compact_result_vec;
-      s = inst->IncrementalCompact(type, &compact_result_vec, storage::kMetaAndData,
-                                   max_files, max_time_ms, min_rate,
-                                   min_file_age);
+      Status s = inst->IncrementalCompact(type, &compact_result_vec, storage::kMetaAndData,
+                                          max_files, max_time_ms, min_rate,
+                                          min_file_age);
+      if (!s.ok() && overall.ok()) {
+        overall = s;
+      }
       for (auto compact_result : compact_result_vec) {
         if (!compact_result.ok()) {
           LOG(ERROR) << compact_result.ToString();
+          if (overall.ok()) {
+            overall = compact_result;
+          }
         }
       }
     }
-    return s;
+    return overall;
   } else {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/storage/src/storage.cc` around lines 1778 - 1791, The loop currently
overwrites s for each inst and only returns the last instance Status, losing
earlier failures; change IncrementalCompact handling to record and return the
first non-OK Status encountered: declare a Status first_failure (initialized OK)
before the for (const auto& inst : insts_) loop, after calling
inst->IncrementalCompact(...) check s and set first_failure = s if
first_failure.ok() and !s.ok(), then iterate compact_result_vec and if any
compact_result is not ok and first_failure.ok() set first_failure =
compact_result; keep logging as-is but at the end return first_failure (or s if
you prefer), ensuring IncrementalCompact failures from earlier insts_ or
individual compact_result_vec entries are propagated instead of discarded.
1. Fix the obd-compact bug
2. Add an experimental progressive compact
fix: #3230

about: #3226
about: #3240
Summary by CodeRabbit
Release Notes
New Features
- `incremental-compact` as a new compaction strategy option alongside the existing `full-compact` and `obd-compact` modes
- New configuration options: `incremental-compact-interval`, `incremental-compact-max-files`, `incremental-compact-max-time-ms`, `incremental-compact-min-rate`, and `incremental-compact-min-file-age`
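For reference, the new keys in conf/pika.conf might be set like this. The values below are purely illustrative placeholders, not the PR's defaults; consult conf/pika.conf in the PR for the real ones:

```conf
# compaction strategy: full-compact / obd-compact / incremental-compact
compaction-strategy : incremental-compact
# Illustrative values only; see conf/pika.conf in the PR for actual defaults.
incremental-compact-interval : 60
incremental-compact-max-files : 10
incremental-compact-max-time-ms : 2000
incremental-compact-min-rate : 80
incremental-compact-min-file-age : 600
```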