Fix SSF issues on large grids#197
Open
vmitq wants to merge 1 commit intowavefunction91:masterfrom
Open
Conversation
Large molecules (e.g. ubiquitin) could fail during SSF grid reweighting. Changing lddist type to size_t seems to fix it.
There was a problem hiding this comment.
Pull request overview
This PR widens the CUDA SSF distance-stride (lddist) from int32_t to size_t so the 1D SSF weight and derivative kernels can index the per-point distance scratch buffer without 32-bit overflow on large workloads. In the GauXC CUDA Scheme1 path, this change is aimed at the SSF reweighting logic used after grid-to-center distances are computed.
Changes:
- Update the CUDA SSF 1D kernel declarations/definitions to take
size_t lddist. - Propagate the widened stride type through the CUDA Scheme1 weight and first-derivative wrappers.
- Keep the CUDA Scheme1 call chain consistent with the existing
size_t-basedget_ldatoms()device stride.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/xc_integrator/local_work_driver/device/cuda/kernels/cuda_ssf_1d.hpp |
Widened SSF 1D function declarations to use size_t for the distance stride. |
src/xc_integrator/local_work_driver/device/cuda/kernels/cuda_ssf_1d.cu |
Widened the SSF 1D kernel/function implementations for weight and derivative paths. |
src/xc_integrator/local_work_driver/device/cuda/cuda_aos_scheme1_weights.hpp |
Updated CUDA Scheme1 wrapper declarations to pass a size_t stride. |
src/xc_integrator/local_work_driver/device/cuda/cuda_aos_scheme1_weights.cu |
Propagated the widened stride through the CUDA Scheme1 wrapper implementations. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const double* coords, | ||
| const double* dist_scratch, | ||
| int32_t lddist, | ||
| size_t lddist, |
| const double* points_z, | ||
| const double* dist_scratch, | ||
| int32_t lddist, | ||
| size_t lddist, |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Large molecules (e.g. ubiquitin) could fail during SSF grid reweighting step. Changing
lddisttype tosize_tseems to fix it.