Merged
1 change: 1 addition & 0 deletions tools/CMakeLists.txt
@@ -1,5 +1,6 @@
add_subdirectory(./aof_to_pika)
add_subdirectory(./benchmark_client)
add_subdirectory(./bigkey_analyzer)
add_subdirectory(./binlog_sender)
add_subdirectory(./manifest_generator)
add_subdirectory(./rdb_to_pika)
30 changes: 30 additions & 0 deletions tools/bigkey_analyzer/CMakeLists.txt
@@ -0,0 +1,30 @@
set(WARNING_FLAGS "-W -Wextra -Wall -Wsign-compare \
-Wno-unused-parameter -Wno-redundant-decls -Wwrite-strings \
-Wpointer-arith -Wreorder -Wswitch -Wsign-promo \
-Woverloaded-virtual -Wnon-virtual-dtor -Wno-missing-field-initializers")

set(CXXFLAGS "${WARNING_FLAGS} -std=c++17 -g")

set(SRC_DIR .)
aux_source_directory(${SRC_DIR} BASE_OBJS)

add_executable(bigkey_analyzer ${BASE_OBJS})

target_include_directories(bigkey_analyzer
    PRIVATE
    ${PROJECT_SOURCE_DIR}
    ${PROJECT_SOURCE_DIR}/src
    ${PROJECT_SOURCE_DIR}/src/storage/include
)

target_link_libraries(bigkey_analyzer
    storage
    rocksdb
    pthread
)
Comment on lines +20 to +23

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check RocksDB linking in storage library and other tools

echo "=== Check storage library's RocksDB dependency linking ==="
find . -path ./tools -prune -o -name "CMakeLists.txt" -type f -print | grep -E "src/storage|storage" | while read f; do
  if [ -f "$f" ]; then
    echo "File: $f"
    cat -n "$f" | grep -A 10 -B 2 "target_link_libraries"
  fi
done

echo -e "\n=== Check CMakeLists.txt files in tools directory ==="
find tools -name "CMakeLists.txt" -type f | while read f; do
  echo -e "\n--- $f ---"
  cat -n "$f" | grep -A 8 -B 1 "target_link_libraries"
done

Repository: OpenAtomFoundation/pikiwidb

Length of output: 10408


Add explicit ${ROCKSDB_LIBRARY} link for consistency with other tools.

Although storage provides ${ROCKSDB_LIBRARY} as a PUBLIC dependency, all similar tools in the project—including txt_to_pika, pika_to_txt, benchmark_client, and pika_port—explicitly link ${ROCKSDB_LIBRARY} alongside storage. Update to:

target_link_libraries(bigkey_analyzer 
    storage
    ${ROCKSDB_LIBRARY}
    pthread
)
🤖 Prompt for AI Agents
In tools/bigkey_analyzer/CMakeLists.txt around lines 20 to 23,
target_link_libraries only lists storage and pthread but other tools explicitly
link ${ROCKSDB_LIBRARY}; add ${ROCKSDB_LIBRARY} to the target_link_libraries
call so it becomes: target_link_libraries(bigkey_analyzer storage
${ROCKSDB_LIBRARY} pthread) to match project consistency.


set_target_properties(bigkey_analyzer PROPERTIES
    RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}
    CMAKE_COMPILER_IS_GNUCXX TRUE
    COMPILE_FLAGS ${CXXFLAGS}
)
Comment on lines +25 to +29

🛠️ Refactor suggestion | 🟠 Major

Replace deprecated COMPILE_FLAGS and remove incorrect CMAKE_COMPILER_IS_GNUCXX property.

Two issues:

  1. CMAKE_COMPILER_IS_GNUCXX is a CMake-detected variable, not a settable target property. Setting it here has no effect.
  2. COMPILE_FLAGS property has been deprecated since CMake 3.0 in favor of target_compile_options().
🔎 Recommended fix
+target_compile_options(bigkey_analyzer PRIVATE ${WARNING_FLAGS})
+target_compile_features(bigkey_analyzer PRIVATE cxx_std_17)
+
 set_target_properties(bigkey_analyzer PROPERTIES
     RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}
-    CMAKE_COMPILER_IS_GNUCXX TRUE
-    COMPILE_FLAGS ${CXXFLAGS}
 )

Also remove or simplify the CXXFLAGS variable at line 6 since flags are now applied via target_compile_options and C++17 via target_compile_features.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In tools/bigkey_analyzer/CMakeLists.txt around lines 25-29, the target
properties block incorrectly sets CMAKE_COMPILER_IS_GNUCXX (a CMake-detected
variable that should not be set) and uses the deprecated COMPILE_FLAGS property;
remove the CMAKE_COMPILER_IS_GNUCXX line entirely, drop the COMPILE_FLAGS
property, and instead apply the warning flags with
target_compile_options(bigkey_analyzer PRIVATE ${WARNING_FLAGS}) and declare the required
C++ standard via target_compile_features(bigkey_analyzer PRIVATE cxx_std_17);
also remove or simplify the CXXFLAGS variable at line 6 since flags are now
applied via target_compile_options and C++17 via target_compile_features.
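
One detail matters when applying this suggestion: WARNING_FLAGS in the submitted file is a single space-separated string, and target_compile_options() expects a list with one option per element, so passing the string unquoted would likely hand the compiler one malformed argument. A minimal sketch of the whole change under that assumption is below. `BIGKEY_WARNING_FLAGS` is a hypothetical variable name, and the sketch is not the version that was merged.

```cmake
# Sketch only; BIGKEY_WARNING_FLAGS is a hypothetical variable name.
# A CMake list (one flag per element) instead of one space-separated string,
# so each flag reaches the compiler as a separate argument.
set(BIGKEY_WARNING_FLAGS
    -W -Wextra -Wall -Wsign-compare
    -Wno-unused-parameter -Wno-redundant-decls -Wwrite-strings
    -Wpointer-arith -Wreorder -Wswitch -Wsign-promo
    -Woverloaded-virtual -Wnon-virtual-dtor -Wno-missing-field-initializers)

aux_source_directory(. BASE_OBJS)
add_executable(bigkey_analyzer ${BASE_OBJS})

# Replaces the COMPILE_FLAGS property; -g is carried over from the original CXXFLAGS.
target_compile_options(bigkey_analyzer PRIVATE ${BIGKEY_WARNING_FLAGS} -g)

# Replaces -std=c++17; the cxx_std_17 feature needs CMake 3.8 or newer.
target_compile_features(bigkey_analyzer PRIVATE cxx_std_17)

set_target_properties(bigkey_analyzer PROPERTIES
    RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR})
```

With compiler extensions left at CMake's default, cxx_std_17 may produce -std=gnu++17 rather than -std=c++17. Setting the CXX_EXTENSIONS target property to OFF should restore the strict flag if that distinction matters.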

120 changes: 120 additions & 0 deletions tools/bigkey_analyzer/README.md
@@ -0,0 +1,120 @@
# Big Key Analyzer

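A tool for analyzing big keys in a PikiwiDB instance. It targets the new storage layout on the unstable branch and supports both single-instance and multi-DB layouts (db/0, db/1, db/2...).

## Features

- Analyzes big keys of every data type (strings, hashes, lists, sets, zsets)
- Filters keys by size
- Limits the number of results (top N)
- Aggregates statistics by key prefix
- Reports each key's type, size, and expiration time (TTL)
- Can write results to a file

## Build

Run the following from the PikiwiDB root directory: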

```bash
mkdir -p build
cd build
cmake ..
make bigkey_analyzer
```

After the build completes, the executable is generated in the build directory.

## Usage

```
Usage: bigkey_analyzer [OPTIONS] <db_path>
Options:
  --min-size=SIZE       Only show keys larger than SIZE bytes
  --top=N               Only show top N largest keys
  --prefix-stat         Show statistics by key prefix
  --prefix-delimiter=C  Character used to delimit prefix (default: ':')
  --type=TYPE           Only analyze specific type (strings|hashes|lists|sets|zsets|all)
  --output=FILE         Write output to file instead of stdout
  --help                Display this help message
```
Comment on lines +33 to +43

⚠️ Potential issue | 🟡 Minor

Add language identifier to fenced code block.

The usage text block should specify a language identifier for proper syntax highlighting and better readability.

🔎 Proposed fix
-```
+```text
 Usage: bigkey_analyzer [OPTIONS] <db_path>
 Options:
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

29-29: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In tools/bigkey_analyzer/README.md around lines 29 to 39, the fenced code block
with the usage text lacks a language identifier; update the opening fence from
``` to ```text so it becomes ```text and leave the block content unchanged,
ensuring the usage snippet is syntax-highlighted as plain text in renders.


## Examples

1. Analyze all big keys:

```bash
# Single instance
./bigkey_analyzer /path/to/pikiwidb/data

# Multi-DB instance (db/0, db/1, db/2...)
./bigkey_analyzer /path/to/pikiwidb
```

2. Analyze only keys larger than 1 MB:

```bash
./bigkey_analyzer --min-size=1048576 /path/to/pikiwidb/data
```

3. Show only the 10 largest keys:

```bash
./bigkey_analyzer --top=10 /path/to/pikiwidb/data
```

4. Analyze only hash keys:

```bash
./bigkey_analyzer --type=hashes /path/to/pikiwidb/data
```

5. Analyze and aggregate statistics by prefix:

```bash
./bigkey_analyzer --prefix-stat /path/to/pikiwidb/data
```

6. Write the results to a file:

```bash
./bigkey_analyzer --output=result.txt /path/to/pikiwidb/data
```

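## Output format

The tool's output has three parts:

1. Big key list, sorted by size in descending order
2. Statistics by prefix (when the --prefix-stat option is used)
3. Summary statistics

Example output: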

```
===== Big Key Analysis =====
Type    Size     Key                TTL
hash    1048576  user:profile:1001  -1
zset    524288   ranking:global     3600
string  262144   config:settings    -1
...

===== Key Prefix Statistics =====
Prefix   Count  Total Size  Avg Size
user     100    10485760    104857.6
ranking  50     2621440     52428.8
config   10     524288      52428.8
...

===== Summary =====
Total keys analyzed: 160
Keys by type:
  hash: 50 keys, 25.0 MB total, 524288.0 bytes avg
  zset: 30 keys, 15.0 MB total, 524288.0 bytes avg
  string: 80 keys, 10.0 MB total, 131072.0 bytes avg
```
Comment on lines +97 to +118

⚠️ Potential issue | 🟡 Minor

Add language identifier to fenced code block.

The example output block should specify a language identifier for proper syntax highlighting and better readability.

🔎 Proposed fix
-```
+```text
 ===== Big Key Analysis =====
 Type    Size    Key     TTL
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

93-93: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In tools/bigkey_analyzer/README.md around lines 93 to 114, the fenced code block
lacks a language identifier so it doesn’t get proper syntax highlighting; update
the opening fence to include an appropriate language token (for example "text")
— i.e. change ``` to ```text — and ensure the closing fence remains unchanged so
the block renders with the specified language.


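## Notes

- The tool only reads the database and performs no writes
- A big key's size is the combined size of its key and value
- Expired keys are excluded from the analysis results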