feat: Enable batch import of SST files with S3/MinIO in Pika #3175
seea11 wants to merge 1370 commits into OpenAtomFoundation:3.5
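Summary of changes

Compared with the 3.5 branch, the current branch pika-ingest-v3.5 adds batch import of SST files (pika_batch_ingest), which supports fast import of large-scale data through object storage services such as S3/MinIO. The feature lets users import a large number of SST files into a Pika database in bulk through a manifest file, and provides a complete pipeline for batch data generation and import.

Main changes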
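New core function modules

A. Manifest Ingest
* manifestingest command: downloads a manifest file from S3/MinIO and imports the corresponding SST files
* ManifestIngestCmd class: responsible for downloading and importing external SST files
* The SST extended-ingest logic is implemented in the RedisStrings::SstExtendIngest method

B. S3 service module
* src/ingest directory: contains complete S3 service support
* S3Service class: wraps the AWS S3 client and transfer manager
* SstDownloader class: downloads the manifest and SST files

Configuration file updates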
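A. Main configuration file
* conf/pika.conf gains an ingest-conf-path option, which specifies the path of the ingest configuration file
* s3-conf-path option: used for the S3 access configuration

B. S3 configuration file
* New conf/s3.conf: defines the S3/MinIO connection parameters
  * endpoint: S3 or MinIO service address
  * region: region setting
  * bucket: bucket name
  * access_key / secret_key: access credentials
  * transfer_threads, max_inflight, and others

C. Ingest configuration file
* conf/ingest.conf: defines the RocksDB parameters used during ingest

New toolchain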
A. pika_batch_ingest tool

B. Shell script collection
* run.sh, mock.sh, exchange.sh, s3put.sh, iagent.sh, and others

Code architecture changes
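A. PikaServer updates
* PikaServer gains an S3Service member variable
* InitS3() and StopS3() methods manage the S3 service

B. Storage layer extension
* The storage module gains an SstExtendIngest interface

C. New proto definition
* manifest.proto defines the manifest file format

Performance optimization features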
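A. Aggressive import parameters

B. Concurrent processing

Integration tests
* string_ingest.tcl test file
* Correctness and data integrity of the manifestingest command

Main code update details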
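A. pika.cc

B. pika_server.cc
* StopS3() call, ensuring the S3 service shuts down correctly
* InitS3() and StopS3() methods

C. pika_command.cc and pika_command.h
* manifestingest command

D. pika_conf.cc and pika_conf.h
* s3-conf-path and ingest-conf-path parameters

E. Global variable declaration update
* extern std::unique_ptr<PikaServer> g_pika_server; changed to extern PikaServer* g_pika_server;

Key technical implementation points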
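* manifest file: once parsed, the corresponding SST files are downloaded in batch
* checksums: options such as blocking flush are supported to ensure data integrity
* The manifestingest command is registered in the command table so that it can be used over the Redis protocol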
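
The flow in the key points above (parse the manifest, download the listed SST files, verify checksums, then ingest) can be sketched roughly as follows. This is a standalone illustration, not the code from this PR. SstEntry, Fetch, Ingest, IngestFromManifest, and the FNV-1a checksum are all hypothetical stand-ins: the real manifest layout lives in manifest.proto, downloads go through S3Service/SstDownloader, and the description does not say which checksum is used. Downloads are sequential here for brevity, although the description mentions transfer_threads and max_inflight. In RocksDB terms the final step would presumably map to DB::IngestExternalFile with IngestExternalFileOptions (which has verify_checksums_before_ingest and allow_blocking_flush fields), but that call is not shown in the description, so it is stubbed as a callback.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical manifest entry; the real format is defined in manifest.proto.
struct SstEntry {
  std::string key;         // object key of the SST file in the bucket
  std::uint64_t checksum;  // expected checksum of the file body
};

// Placeholder checksum (64-bit FNV-1a). Assumption: the actual checksum
// algorithm used by the PR is not stated in the description.
std::uint64_t Fnv1a(const std::string& data) {
  std::uint64_t hash = 14695981039346656037ULL;
  for (unsigned char c : data) {
    hash ^= c;
    hash *= 1099511628211ULL;
  }
  return hash;
}

// Stand-in for the S3/MinIO download: fills *body and returns true on success.
using Fetch = std::function<bool(const std::string& key, std::string* body)>;
// Stand-in for the storage-layer ingest of the verified files.
using Ingest = std::function<bool(const std::vector<std::string>& keys)>;

// Downloads every file listed in the manifest and verifies its checksum.
// Ingest is called once, and only if every file was fetched and verified,
// so a failed download or a corrupt file never leads to a partial import.
bool IngestFromManifest(const std::vector<SstEntry>& manifest,
                        const Fetch& fetch, const Ingest& ingest) {
  assert(fetch && ingest);  // both callbacks must be set
  std::vector<std::string> verified;
  verified.reserve(manifest.size());
  for (const SstEntry& entry : manifest) {
    std::string body;
    if (!fetch(entry.key, &body)) return false;        // download failed
    if (Fnv1a(body) != entry.checksum) return false;   // checksum mismatch
    verified.push_back(entry.key);
  }
  if (verified.empty()) return true;  // empty manifest: nothing to ingest
  return ingest(verified);
}
```

Deferring the ingest call until all files have been verified is a design choice of this sketch. Whether the PR ingests all at once or file by file is not stated in the description.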