-
Notifications
You must be signed in to change notification settings - Fork 196
Clamp max_iter in OOC KMeans test #2256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
1098a78
a84501d
c3372d0
1a42257
0f495f4
0d47620
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -381,6 +381,9 @@ class KmeansFitBatchedTest : public ::testing::TestWithParam<KmeansBatchedInputs | |
| params.tol = testparams.tol; | ||
| params.rng_state.seed = 1; | ||
| params.oversampling_factor = 0; | ||
| // Limit the number of iterations to ensure same number of iterations for reference and batched | ||
| // code paths. | ||
| params.max_iter = 3; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This fix doesn't seem particularly robust, and appears to me like it'll reduce the coverage for the tests, right? Default max_iters for k-means is much higher. How do we test in a way where we can validate its ability to stop early without having to reduce the max-iters to the point where we are essentially not testing the early stopping at all?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the point of the test when we are bit matching centroids is to ensure that the accumulation of partial sums in the OOC setting in a single iteration matches the in-memory setting. The test fails when there is an edge case -- the variation due to floating point precision in the arithmetic is just enough to push the difference in centroids between consecutive iterations to fall below the convergence criteria -- in-memory converges one iteration sooner in than OOC. |
||
|
|
||
| auto stream = raft::resource::get_cuda_stream(handle); | ||
|
|
||
|
|
@@ -402,8 +405,7 @@ class KmeansFitBatchedTest : public ::testing::TestWithParam<KmeansBatchedInputs | |
|
|
||
| auto d_sw = d_sw_view(); | ||
|
|
||
| params.init = cuvs::cluster::kmeans::params::Array; | ||
| params.max_iter = 20; | ||
| params.init = cuvs::cluster::kmeans::params::Array; | ||
|
|
||
| T ref_inertia = 0; | ||
| int ref_n_iter = 0; | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.