fix: blockcache compute problem#3203
Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes a block cache metrics double-counting issue that occurred when share-block-cache is enabled. When multiple Column Families share the same RocksDB block cache, the code was previously aggregating metrics from each CF separately, causing the same cache usage to be counted multiple times (2x for hash/list/set/streams, 3x for zset).
Key Changes
- Added
share_block_cache_member variable to track shared cache configuration - Modified
GetProperty()methods to return early after querying the first CF when shared cache is enabled for block cache metrics - Updated
GetRocksDBInfo()to query only the first CF handle for block cache properties when cache is shared - Added integration test to verify block cache metrics are reported correctly
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/storage/src/redis.h | Added share_block_cache_ boolean member variable to track shared cache configuration |
| src/storage/src/redis.cc | Modified GetRocksDBInfo() to avoid double-counting by checking first CF only when cache is shared |
| src/storage/src/redis_hashes.cc | Set share_block_cache_ in Open() and added early return in GetProperty() for shared cache properties |
| src/storage/src/redis_lists.cc | Set share_block_cache_ in Open() and added early return in GetProperty() for shared cache properties |
| src/storage/src/redis_sets.cc | Set share_block_cache_ in Open() and added early return in GetProperty() for shared cache properties |
| src/storage/src/redis_zsets.cc | Set share_block_cache_ in Open() and added early return in GetProperty() for shared cache properties (3 CFs) |
| src/storage/src/redis_streams.cc | Set share_block_cache_ in Open() and added early return in GetProperty() for shared cache properties |
| tests/integration/server_test.go | Added integration test to verify block cache capacity metrics are present in INFO output |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // fix: verify that shared block cache does not cause double counting of block cache metrics | ||
| It("should report correct block cache capacity when share-block-cache is yes", func() { | ||
| // Verify share-block-cache is enabled | ||
| shareBlockCacheConfig := client.ConfigGet(ctx, "share-block-cache") | ||
| Expect(shareBlockCacheConfig.Err()).NotTo(HaveOccurred()) | ||
| Expect(shareBlockCacheConfig.Val()).To(Equal(map[string]string{"share-block-cache": "yes"})) | ||
|
|
||
| info := client.Info(ctx, "rocksdb") | ||
| Expect(info.Err()).NotTo(HaveOccurred()) | ||
| infoStr := info.Val() |
There was a problem hiding this comment.
Inconsistent indentation: This test uses spaces for indentation (lines 795-805), while the rest of the file uses tabs. This creates mixed indentation that can cause issues in Go code. Please use tabs consistently with the rest of the file.
| // fix: verify that shared block cache does not cause double counting of block cache metrics | |
| It("should report correct block cache capacity when share-block-cache is yes", func() { | |
| // Verify share-block-cache is enabled | |
| shareBlockCacheConfig := client.ConfigGet(ctx, "share-block-cache") | |
| Expect(shareBlockCacheConfig.Err()).NotTo(HaveOccurred()) | |
| Expect(shareBlockCacheConfig.Val()).To(Equal(map[string]string{"share-block-cache": "yes"})) | |
| info := client.Info(ctx, "rocksdb") | |
| Expect(info.Err()).NotTo(HaveOccurred()) | |
| infoStr := info.Val() | |
| // fix: verify that shared block cache does not cause double counting of block cache metrics | |
| It("should report correct block cache capacity when share-block-cache is yes", func() { | |
| // Verify share-block-cache is enabled | |
| shareBlockCacheConfig := client.ConfigGet(ctx, "share-block-cache") | |
| Expect(shareBlockCacheConfig.Err()).NotTo(HaveOccurred()) | |
| Expect(shareBlockCacheConfig.Val()).To(Equal(map[string]string{"share-block-cache": "yes"})) | |
| info := client.Info(ctx, "rocksdb") | |
| Expect(info.Err()).NotTo(HaveOccurred()) | |
| infoStr := info.Val() |
|
|
||
| Expect(infoStr).To(ContainSubstring("strings_block_cache_capacity:")) | ||
| Expect(infoStr).To(ContainSubstring("hashes_block_cache_capacity:")) | ||
| Expect(infoStr).To(ContainSubstring("lists_block_cache_capacity:")) |
There was a problem hiding this comment.
Trailing whitespace detected at the end of this line. Please remove it.
| Expect(infoStr).To(ContainSubstring("lists_block_cache_capacity:")) | |
| Expect(infoStr).To(ContainSubstring("lists_block_cache_capacity:")) |
| Expect(infoStr).To(ContainSubstring("strings_block_cache_capacity:")) | ||
| Expect(infoStr).To(ContainSubstring("hashes_block_cache_capacity:")) | ||
| Expect(infoStr).To(ContainSubstring("lists_block_cache_capacity:")) | ||
| Expect(infoStr).To(ContainSubstring("sets_block_cache_capacity:")) |
There was a problem hiding this comment.
Trailing whitespace detected at the end of this line. Please remove it.
| Expect(infoStr).To(ContainSubstring("sets_block_cache_capacity:")) | |
| Expect(infoStr).To(ContainSubstring("sets_block_cache_capacity:")) |
| It("should report correct block cache capacity when share-block-cache is yes", func() { | ||
| // Verify share-block-cache is enabled | ||
| shareBlockCacheConfig := client.ConfigGet(ctx, "share-block-cache") | ||
| Expect(shareBlockCacheConfig.Err()).NotTo(HaveOccurred()) | ||
| Expect(shareBlockCacheConfig.Val()).To(Equal(map[string]string{"share-block-cache": "yes"})) | ||
|
|
||
| info := client.Info(ctx, "rocksdb") | ||
| Expect(info.Err()).NotTo(HaveOccurred()) | ||
| infoStr := info.Val() | ||
|
|
||
| Expect(infoStr).To(ContainSubstring("strings_block_cache_capacity:")) | ||
| Expect(infoStr).To(ContainSubstring("hashes_block_cache_capacity:")) | ||
| Expect(infoStr).To(ContainSubstring("lists_block_cache_capacity:")) | ||
| Expect(infoStr).To(ContainSubstring("sets_block_cache_capacity:")) | ||
| Expect(infoStr).To(ContainSubstring("zsets_block_cache_capacity:")) | ||
|
|
||
| // Simple string-based verification that values are present | ||
| Expect(infoStr).To(MatchRegexp(`strings_block_cache_capacity:\d+`)) | ||
| Expect(infoStr).To(MatchRegexp(`hashes_block_cache_capacity:\d+`)) | ||
| Expect(infoStr).To(MatchRegexp(`lists_block_cache_capacity:\d+`)) | ||
| Expect(infoStr).To(MatchRegexp(`sets_block_cache_capacity:\d+`)) | ||
| Expect(infoStr).To(MatchRegexp(`zsets_block_cache_capacity:\d+`)) | ||
| }) |
There was a problem hiding this comment.
The test only verifies that block cache metrics are present in the output, but doesn't actually verify that the double-counting bug has been fixed. According to the PR description, the fix prevents metrics from being counted multiple times (2x for hash/list/set, 3x for zset).
Consider enhancing the test to:
- Parse the actual capacity values from the INFO output for each data type
- Verify that they all report the same capacity value (since they share the cache)
- Optionally, verify that this shared value matches the configured block cache size from the config
Example approach:
// Parse capacity values
stringsCapacity := parseCapacity(infoStr, "strings_block_cache_capacity")
hashesCapacity := parseCapacity(infoStr, "hashes_block_cache_capacity")
// ... parse others
// Verify all report the same value
Expect(stringsCapacity).To(Equal(hashesCapacity))
Expect(stringsCapacity).To(Equal(listsCapacity))
// ... compare othersThis would provide stronger validation that the fix actually prevents double-counting.
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Mixficsol
left a comment
There was a problem hiding this comment.
修复 CI 的部分去掉吧,一个 PR 做一件事
1f3d2e9 to
c27e4b4
Compare
Done. |
Done. |
c27e4b4 to
81733b9
Compare
d27e1c3 to
e284374
Compare
e284374 to
557d8c9
Compare
|
share_block_cache如果为true的话,应该不止是Columnfamily共用同一个blockcache。strings,hashes等所有rocksdb实例的block_cache都用的是同一个。所以你可以再考虑下不同rocksdb实例之间重复计算的问题,如果不太好改,可以现在代码里加个todo。 |
If share_block_cache is true, more than just Columnfamily should share the same blockcache. The block_cache of all rocksdb instances such as strings, hashes, etc. use the same one. So you can consider the problem of repeated calculations between different rocksdb instances. If it is not easy to change, you can add a todo to the code now. |
#3168



share-block-cache 为yes时,多个 Column Family 共享同一个 block cache,但在统计 block cache usage 时,代码仍然对每个 CF 分别查询然后累加,导致同一个 block cache 的使用量被计算了多次。
修复前,hash,list,set 的计算均为两倍,zset为三倍:
修复后这些数据结构的不会再被计算多次:
添加测试: