Enable seminumerical exact exchange calculation with CUDA #183
Open
vmitq wants to merge 2 commits into wavefunction91:master
Co-authored-by: Lukas Gergs <[email protected]>
Pull request overview
This PR adds support for computing the seminumerical exact-exchange (EXX) contribution to the nuclear gradient on CUDA device backends, and exposes it through the public XCIntegrator / replicated-integrator APIs.
Changes:
- Add `eval_exx_grad` API plumbing across `XCIntegrator` and replicated integrator layers.
- Add device storage + local-work driver support for EXX-gradient intermediates and reduction.
- Extend the standalone driver to load/write/print/compare `EXX_GRAD` from HDF5 references.
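To make the reference comparison in the last bullet concrete, here is a minimal host-side sketch. It assumes the gradient is stored as a flat `std::vector<double>` and that agreement is judged by the norm of the difference; the helper names `grad_norm` and `exx_grad_matches` and the tolerance are hypothetical and are not taken from the standalone driver.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Euclidean norm of a flat gradient buffer.
inline double grad_norm(const std::vector<double>& g) {
  double sum = 0.0;
  for (double x : g) sum += x * x;
  return std::sqrt(sum);
}

// Hypothetical helper (not the driver's API): returns true when the computed
// gradient agrees with the reference to within `tol`, measured as the norm
// of the element-wise difference.
inline bool exx_grad_matches(const std::vector<double>& computed,
                             const std::vector<double>& reference,
                             double tol = 1e-10) {
  assert(tol >= 0.0);
  // Guard: an empty reference means there is nothing to compare against,
  // so no element is ever read.
  if (reference.empty()) return true;
  if (computed.size() != reference.size()) return false;

  std::vector<double> diff(computed.size());
  for (std::size_t i = 0; i < computed.size(); ++i)
    diff[i] = computed[i] - reference[i];
  return grad_norm(diff) <= tol;
}
```

Checking sizes before indexing keeps the comparison safe when one of the vectors is empty or the two differ in length.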
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| tests/standalone_driver.cxx | Loads/writes/prints EXX_GRAD reference data and compares norms. |
| src/xc_integrator/xc_data/device/xc_device_stack_data.hpp | Adds device pointers and interface hooks for EXX-gradient buffers. |
| src/xc_integrator/xc_data/device/xc_device_stack_data.cxx | Allocates/zeros/retrieves EXX-gradient intermediates on device. |
| src/xc_integrator/xc_data/device/xc_device_data.hpp | Adds exx_grad term tracking and device-data virtual interface for EXX gradient. |
| src/xc_integrator/shell_batched/shell_batched_replicated_xc_integrator.hpp | Declares eval_exx_grad_ in shell-batched replicated integrator. |
| src/xc_integrator/shell_batched/shell_batched_replicated_xc_integrator_exx_grad.hpp | Adds NYI stub for shell-batched EXX gradient. |
| src/xc_integrator/replicated/replicated_xc_integrator_impl.cxx | Adds replicated pimpl forwarding function eval_exx_grad. |
| src/xc_integrator/replicated/host/shell_batched_replicated_xc_host_integrator.cxx | Wires in the shell-batched EXX-gradient stub header. |
| src/xc_integrator/replicated/host/reference_replicated_xc_host_integrator.hpp | Declares eval_exx_grad_ for reference host integrator. |
| src/xc_integrator/replicated/host/reference_replicated_xc_host_integrator.cxx | Wires in the reference EXX-gradient stub header. |
| src/xc_integrator/replicated/host/reference_replicated_xc_host_integrator_exx_grad.hpp | Adds NYI stub for reference host EXX gradient. |
| src/xc_integrator/replicated/device/shell_batched_replicated_xc_device_integrator.cxx | Wires in the shell-batched EXX-gradient stub header on device build. |
| src/xc_integrator/replicated/device/incore_replicated_xc_device_integrator.hpp | Declares EXX-gradient evaluation and local-work helpers. |
| src/xc_integrator/replicated/device/incore_replicated_xc_device_integrator.cxx | Includes the new EXX-gradient device implementation header. |
| src/xc_integrator/replicated/device/incore_replicated_xc_device_integrator_exx_grad.hpp | Implements EXX gradient evaluation workflow on device. |
| src/xc_integrator/local_work_driver/device/scheme1_magma_base.hpp | Adds EXX-gradient driver API surface for MAGMA scheme (NYI). |
| src/xc_integrator/local_work_driver/device/scheme1_magma_base.cxx | Adds NYI throws for EXX-gradient operations under MAGMA. |
| src/xc_integrator/local_work_driver/device/scheme1_base.hpp | Extends scheme1 base interface with EXX-gradient ops. |
| src/xc_integrator/local_work_driver/device/scheme1_base.cxx | Implements EXX K-derivative accumulation and contraction to basis-function gradients. |
| src/xc_integrator/local_work_driver/device/local_device_work_driver.hpp | Exposes EXX-gradient operations on the local device work driver. |
| src/xc_integrator/local_work_driver/device/local_device_work_driver.cxx | Forwards EXX-gradient calls to the device-driver PIMPL. |
| src/xc_integrator/local_work_driver/device/local_device_work_driver_pimpl.hpp | Adds pure-virtual EXX-gradient hooks for device-driver implementations. |
| src/xc_integrator/local_work_driver/device/cuda/kernels/cublas_extensions.cu | Adds CUDA kernels for matrix row/column reductions used in EXX gradient assembly. |
| src/xc_integrator/local_work_driver/device/common/device_blas.hpp | Declares matrix_reduce_rows/cols device BLAS helpers. |
| src/runtime_environment/device/device_runtime_environment.cxx | Adds DeviceRuntimeEnvironment constructor taking an explicit byte count. |
| src/runtime_environment/device/device_runtime_environment_impl.hpp | Implements explicit-size device-buffer allocation constructor. |
| include/gauxc/xc_integrator/xc_integrator_impl.hpp | Adds eval_exx_grad to the virtual integrator interface and public wrapper. |
| include/gauxc/xc_integrator/replicated/replicated_xc_integrator_impl.hpp | Adds low-level replicated EXX-gradient virtual + public entrypoint. |
| include/gauxc/xc_integrator/replicated/impl.hpp | Implements ReplicatedXCIntegrator::eval_exx_grad_ wrapper returning a vector. |
| include/gauxc/xc_integrator/replicated_xc_integrator.hpp | Plumbs eval_exx_grad_ through replicated integrator type. |
| include/gauxc/xc_integrator/impl.hpp | Adds XCIntegrator::eval_exx_grad public wrapper. |
| include/gauxc/xc_integrator.hpp | Adds public eval_exx_grad API type and method declaration. |
| include/gauxc/runtime_environment/decl.hpp | Declares the new explicit-size DeviceRuntimeEnvironment constructor. |
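For readers unfamiliar with the row/column reductions listed for `cublas_extensions.cu` and `device_blas.hpp`, the following host-side sketch shows what such reductions compute. It is an assumption-laden reference only: the names `row_sums` and `col_sums`, the column-major layout, and the convention that a "row reduction" yields one sum per row are all illustrative and may not match the signatures or semantics of `matrix_reduce_rows`/`matrix_reduce_cols` in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major m x n matrix: element (i, j) is stored at a[i + j * m].

// Sum along each row, producing one value per row (length m).
inline std::vector<double> row_sums(const std::vector<double>& a,
                                    std::size_t m, std::size_t n) {
  assert(a.size() == m * n);
  std::vector<double> out(m, 0.0);
  for (std::size_t j = 0; j < n; ++j)
    for (std::size_t i = 0; i < m; ++i) out[i] += a[i + j * m];
  return out;
}

// Sum along each column, producing one value per column (length n).
inline std::vector<double> col_sums(const std::vector<double>& a,
                                    std::size_t m, std::size_t n) {
  assert(a.size() == m * n);
  std::vector<double> out(n, 0.0);
  for (std::size_t j = 0; j < n; ++j)
    for (std::size_t i = 0; i < m; ++i) out[j] += a[i + j * m];
  return out;
}
```

A host reference like this can serve as a check for a device kernel on small inputs.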
- Fix allreduce length from nbf to 3*nbf
- Guard EXX_GRAD access in standalone_driver for UKS/GKS where the vector is empty
- Add missing settings to util::unused
- Fix typos
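The allreduce fix can be illustrated without MPI. The sketch below is a serial stand-in, assuming each rank holds a flat buffer of 3*nbf doubles (three Cartesian components per basis function); `sum_reduce` is a hypothetical helper, not a GauXC or MPI call. It shows that passing `nbf` as the count combines only the first third of the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial stand-in for a sum-allreduce over `count` elements.
// Entries at index >= count are never combined across ranks; they are left
// at zero here so that the omission is easy to see.
inline std::vector<double> sum_reduce(
    const std::vector<std::vector<double>>& per_rank, std::size_t count) {
  assert(!per_rank.empty());
  std::vector<double> out(per_rank.front().size(), 0.0);
  for (const auto& buf : per_rank) {
    assert(buf.size() == out.size());
    assert(count <= buf.size());
    for (std::size_t i = 0; i < count; ++i) out[i] += buf[i];
  }
  return out;
}
```

With nbf = 2 and two ranks holding six values each, `sum_reduce(ranks, 3 * nbf)` combines all six entries, while `sum_reduce(ranks, nbf)` leaves the last four untouched, which is the kind of silent error the commit corrects.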
This PR implements a new feature: calculation of the seminumerical exact-exchange (EXX) contribution to the nuclear gradient. Currently, the EXX gradient is supported only for CUDA+cuBLAS.