security: branch-free modular primitives on decrypt hot path#1177
Open
Replace the final sign-correcting ternary in ModMulFastConstEq with an
arithmetic-shift bitmask select, and add an inline-asm compiler barrier
on GCC/Clang to prevent aggressive optimization (notably -Ofast) from
re-folding the sequence back into a conditional branch.
ModMulFastConstEq is reached during decryption through
DCRTPolyImpl::ScaleAndRound, operating on secret-key-derived values.
A data-dependent branch at this point exposes a timing side channel on
targets without reliable branch predictors, or to an attacker who can
observe execution (a co-resident SMT sibling, external power analysis). The
replacement computes
    signmask  = yprime >> (word_bits - 1)      // arithmetic shift
    corrected = yprime + (signmask & modulus)
which is the canonical branch-free sign-correction idiom. The result is
identical to the original ternary for every valid yprime in the original range.
The compiler barrier follows the methodology demonstrated in the
PermNet-RM Reed-Muller encoder (Issaei, "PermNet-RM: The Unmasked
Butterfly"), which shows that without an opaque barrier, GCC -Ofast can
reintroduce the branch via value-range propagation.
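The sign-mask select and the barrier described above can be sketched as follows. This is a minimal illustration, not the PR's code: the namespace `ct_sketch`, the helper name `Opaque`, the fixed 64-bit word size, and the assumed input range `-m <= x < m` are all assumptions made for the example. The empty asm statement only hides the mask's value from the optimizer; whether the emitted code is actually branch-free still has to be checked in the disassembly.

```cpp
#include <cstdint>

namespace ct_sketch {  // hypothetical namespace, chosen for this example only

// Compiler barrier (hypothetical helper): an empty asm statement that claims
// to read and write `v`, so the optimizer must discard what it knows about
// the value held in `v`.
inline void Opaque(std::int64_t& v) {
#if defined(__GNUC__) || defined(__clang__)
    __asm__ __volatile__("" : "+r"(v));
#else
    (void)v;  // no barrier in this sketch on other compilers
#endif
}

// Returns x + m if x < 0, else x. Assumes m > 0 and -m <= x < m.
inline std::int64_t AddIfNeg(std::int64_t x, std::int64_t m) {
    // Arithmetic right shift copies the sign bit into every bit position:
    // all ones when x < 0, zero otherwise. Signed right shift is arithmetic
    // on GCC and Clang, and is specified that way since C++20.
    std::int64_t mask = x >> 63;
    Opaque(mask);
    // Adds m when the mask is all ones, adds 0 otherwise.
    return x + (mask & m);
}

}  // namespace ct_sketch
```

For example, `ct_sketch::AddIfNeg(-3, 7)` takes the "add" path through the mask, while `ct_sketch::AddIfNeg(5, 7)` leaves its input unchanged; both execute the same instruction sequence.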
Verified:
- Existing core_tests (UTBinInt, UTBinVect::ModMulTest) pass.
- Existing pke_tests (UTBFVRNS_DECRYPT 41 cases, UTCKKSRNS, etc.,
58 tests across 5 suites) pass.
- Arithmetic results are unchanged over the full signed-int input
range.
Follow-up work (separate PR) will add a CI disassembly-grep harness
covering -O0..-Ofast and a ctgrind-based decrypt timing test.
Add utils/constanttime.h providing four branch-free helpers:
- AddIfNeg(x, m) : x + m if x < 0, else x (signed)
- SubIfGE(x, m) : x - m if x >= m, else x (unsigned)
- ModSubFast(a, b, m) : (a - b) mod m, operands in [0, m)
- SubIfAboveHalf(x, m, halfQ) : centered-lift below halfQ
Each helper is a small inline template: sign-mask/borrow-bit select
plus an inline-asm compiler barrier on GCC/Clang. Methodology follows
the PermNet-RM constant-time encoder (Issaei 2026), which demonstrates
that without an explicit barrier, aggressive optimization (notably
-Ofast) can reconstruct a data-dependent branch via value-range
propagation on constructs like `x >= 0 ? x : x + m`.
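A borrow-bit select for the two unsigned helpers might look like the sketch below. It is an illustration under stated assumptions rather than the contents of utils/constanttime.h: the namespace `ct_sketch` and the helper `Opaque` are hypothetical, the word size is fixed at 64 bits, and all operands are assumed to be below 2^63 so that the top bit of the wrapped difference is exactly the borrow.

```cpp
#include <cstdint>

namespace ct_sketch {  // hypothetical namespace, chosen for this example only

// Compiler barrier (hypothetical helper): hides the value of `v` from the
// optimizer on GCC/Clang so it cannot turn the mask back into a comparison.
inline void Opaque(std::uint64_t& v) {
#if defined(__GNUC__) || defined(__clang__)
    __asm__ __volatile__("" : "+r"(v));
#else
    (void)v;  // no barrier in this sketch on other compilers
#endif
}

// If the wrapped difference d has its top bit set, the subtraction borrowed,
// so add m back; otherwise return d unchanged.
inline std::uint64_t AddBackIfBorrowed(std::uint64_t d, std::uint64_t m) {
    std::uint64_t mask = std::uint64_t{0} - (d >> 63);  // all ones on borrow
    Opaque(mask);
    return d + (mask & m);
}

// Returns x - m if x >= m, else x. Assumes x < 2^63 and m < 2^63.
inline std::uint64_t SubIfGE(std::uint64_t x, std::uint64_t m) {
    return AddBackIfBorrowed(x - m, m);  // x - m wraps when x < m
}

// Returns (a - b) mod m. Assumes a and b are in [0, m) and m < 2^63.
inline std::uint64_t ModSubFast(std::uint64_t a, std::uint64_t b,
                                std::uint64_t m) {
    return AddBackIfBorrowed(a - b, m);  // a - b wraps when a < b
}

}  // namespace ct_sketch
```

Both helpers always perform the subtraction and then undo it through the mask, so the instruction sequence does not depend on which case applies. A modulus that uses the full 64-bit word would need a different borrow computation than the top-bit test used here.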
Refactor call sites in the decrypt hot path to use these helpers:
ubintnat.h:
- ModMulFastConst / ModMulFastConstEq (sign correction)
- ModAddFast / ModAddFastEq (Barrett tail)
- ModSubFast / ModSubFastEq (borrow-guarded subtract)
transformnat-impl.h (Cooley-Tukey & Gentleman-Sande butterflies):
- ForwardTransformIterative
- ForwardTransformToBitReverseInPlace (mu-based and precon-based)
- ForwardTransformToBitReverse (mu-based and precon-based)
- InverseTransformFromBitReverseInPlace (mu-based and precon-based,
including peeled first and final stages)
The NTT butterflies were previously split into `#if defined(__GNUC__) &&
!defined(__clang__)` and `#else` branches, each carrying its own
conditionals. This consolidation replaces all of them with the
same branchless primitives, removing ~140 lines of duplicated logic
while making the constant-time guarantee uniform across compilers.
A data-dependent `if (omegaFactor != zero)` skip in two NTT variants
is also removed: the skip was an optimization that branched on a
secret-derived coefficient, and eliminating it is required for
constant-time behavior.
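To illustrate what a butterfly looks like without the zero skip, here is a minimal sketch of the add/subtract tail of a Cooley-Tukey butterfly. It is not taken from transformnat-impl.h: the namespace `ntt_sketch`, the `ButterflyOut` struct, and the function names are hypothetical, the twiddle product `t` is assumed to have been reduced into [0, q) by a separate constant-time multiply, and q is assumed to be below 2^62.

```cpp
#include <cstdint>

namespace ntt_sketch {  // hypothetical namespace, chosen for this example only

// If the wrapped value d has its top bit set (the subtraction borrowed),
// add q back; otherwise return d unchanged. The empty asm statement keeps
// GCC/Clang from reasoning about the mask and re-deriving a branch.
inline std::uint64_t AddBackIfBorrowed(std::uint64_t d, std::uint64_t q) {
    std::uint64_t mask = std::uint64_t{0} - (d >> 63);
#if defined(__GNUC__) || defined(__clang__)
    __asm__ __volatile__("" : "+r"(mask));
#endif
    return d + (mask & q);
}

struct ButterflyOut {
    std::uint64_t plus;   // (lo + t) mod q
    std::uint64_t minus;  // (lo - t) mod q
};

// Add/subtract tail of a butterfly. `lo` is the low element and `t` is the
// already-reduced product of the high element and the twiddle factor.
// Assumes lo and t are in [0, q) and q < 2^62.
inline ButterflyOut Butterfly(std::uint64_t lo, std::uint64_t t,
                              std::uint64_t q) {
    // Deliberately no `if (t != 0)` shortcut: the same operations run
    // whether or not the secret-derived product happens to be zero.
    return {AddBackIfBorrowed(lo + t - q, q), AddBackIfBorrowed(lo - t, q)};
}

}  // namespace ntt_sketch
```

With q = 17, lo = 5 and t = 0, both outputs equal 5, which is the case the removed skip used to short-circuit; here it costs the same work as any other input.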
Verified on this branch:
- core_tests : 158/158 pass (17 suites, including
UTBinInt, UTBinVect::ModMulTest, UTNTT,
UTTrapdoor, Utilities)
- pke_tests (decrypt-filtered) : 59/59 pass (6 suites, including
UTBFVRNS_DECRYPT 41 cases, CKKSRNS,
BGVRNS decrypt paths)
Follow-up (separate PR) will add a CI harness that disassembles the
decrypt object files across -O0..-Ofast and fails on any conditional-
jump mnemonic inside whitelisted functions, mirroring the PermNet-RM
CI approach.
Summary
This PR removes data-dependent branches from modular arithmetic used during `Decrypt`. Branches whose direction depends on secret values can leak information through timing, so this change replaces them with branchless bitmask-based equivalents.

The new helpers also include a small inline-asm barrier for GCC and Clang to make it harder for the optimizer to rewrite the branchless form back into conditional branches at higher optimization levels.
What changes
New header: src/core/include/utils/constanttime.h

Adds four small inline helpers in namespace lbcrypto::ct. Each helper implements the same operation as the original conditional logic, but uses a mask derived from the sign bit or borrow bit instead of a branch.

- AddIfNeg(x, m) : x < 0 ? x + m : x
- SubIfGE(x, m) : x >= m ? x - m : x
- ModSubFast(a, b, m) : a - b mod m
- SubIfAboveHalf(x, m, halfQ) : x > halfQ

Call sites: src/core/include/math/hal/intnat/ubintnat.h

The existing public functions are preserved; only their internals change.

- ModAddFast / ModAddFastEq now call ct::SubIfGE
- ModSubFast / ModSubFastEq now call ct::ModSubFast
- ModMulFastConst / ModMulFastConstEq now call ct::AddIfNeg

NTT butterflies: src/core/include/math/hal/intnat/transformnat-impl.h

The NTT butterflies previously had separate GCC and non-GCC implementations, each with its own conditional fixups such as: