Vectorize promo logit computation by mcognetta · Pull Request #2 · CSSLab/maia3

mcognetta · 2026-05-25T10:15:54Z

This replaces the nested loop in the promotion logit computation with a vectorized version that gives an ~18% speedup for the entire forward pass on a small local benchmark on my machine.
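A minimal sketch of the equivalence, not the PR's actual code. It assumes `scores_base` has shape (B, 64, 64), `promo_biases` has shape (B, 8, 4), and the original loops nested as from-file, then to-file, then piece. The function names and the loop order are assumptions for illustration.

```python
import torch


def promo_logits_loop(scores_base, promo_biases, rank7, rank8):
    """Loop version: one (B, 1) column per (from_file, to_file, piece)."""
    columns = []
    for from_file in range(8):
        for to_file in range(8):
            for piece_idx in range(4):
                from_sq = int(rank7[from_file])
                to_sq = int(rank8[to_file])
                base_score = scores_base[:, from_sq, to_sq]  # (B,)
                bias = promo_biases[:, to_file, piece_idx]   # (B,)
                columns.append((base_score + bias).unsqueeze(1))
    return torch.cat(columns, dim=1)  # (B, 256)


def promo_logits_vectorized(scores_base, promo_biases, rank7, rank8):
    """Vectorized version: broadcast the bias over the from-file axis."""
    base = scores_base[:, rank7][:, :, rank8]                # (B, 8, 8)
    logits = base.unsqueeze(-1) + promo_biases.unsqueeze(1)  # (B, 8, 8, 4)
    # Row-major flattening matches the loop order assumed above.
    return logits.reshape(logits.shape[0], -1)               # (B, 256)
```

If the real loops nest in a different order, the final reshape would need a matching permute first.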

It also moves rank7_indices and rank8_indices into the model definition, since they are static and don't need to be rebuilt on every forward pass.
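One common way to hold static index tensors on a module is `register_buffer`, sketched below. Whether the PR uses buffers or plain attributes is not stated here, so treat this as an assumption. The class name `PromoHead` is hypothetical, and the square numbering (rank 7 as squares 48..55, rank 8 as 56..63) is assumed.

```python
import torch
from torch import nn


class PromoHead(nn.Module):
    """Hypothetical module showing index tensors built once in __init__."""

    def __init__(self):
        super().__init__()
        # Non-persistent buffers follow .to(device) but stay out of state_dict.
        self.register_buffer("rank7_indices", torch.arange(48, 56), persistent=False)
        self.register_buffer("rank8_indices", torch.arange(56, 64), persistent=False)

    def forward(self, scores_base):
        # scores_base: (B, 64, 64) -> (B, 8, 8)
        return scores_base[:, self.rank7_indices][:, :, self.rank8_indices]
```

Registering the tensors as buffers means they move with the model across devices, which a tensor created inside `forward` would otherwise have to handle on each call.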

mcognetta · 2026-05-25T23:12:53Z

-                    bias = promo_biases[:, to_file, piece_idx]  # (B,)
-                    promotion_logits.append((base_score + bias).unsqueeze(1))
-        promotion_logits = torch.cat(promotion_logits, dim=1)  # (B, 256)
+        base = scores_base[:, self.rank7_indices][:, :, self.rank8_indices]  # (B, 8, 8)


The rank7_indices/rank8_indices lookups could also be replaced directly with a slice like:

base = scores_base[:, 48:56][:, :, 56:64]

This avoids a new allocation, since slicing a contiguous index range returns a view rather than a copy. It doesn't give much speedup on my machine though, so it may not be worth the loss in clarity.
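A small sketch of the difference between the two selections, assuming `scores_base` has shape (B, 64, 64). The helper names are hypothetical. Indexing with an index tensor copies the selected values, while basic slicing returns a strided view into the same storage.

```python
import torch


def select_indexed(scores_base, rank7, rank8):
    # Advanced indexing with index tensors: allocates a new (B, 8, 8) tensor.
    return scores_base[:, rank7][:, :, rank8]


def select_sliced(scores_base):
    # Basic slicing: a (B, 8, 8) view sharing memory with scores_base.
    return scores_base[:, 48:56, 56:64]


scores_base = torch.zeros(2, 64, 64)
rank7 = torch.arange(48, 56)
rank8 = torch.arange(56, 64)

indexed = select_indexed(scores_base, rank7, rank8)
sliced = select_sliced(scores_base)

# Writing to the source is visible through the view but not through the copy.
scores_base[0, 48, 56] = 123.0
```

Note that the sliced result is a view but is not laid out contiguously in memory, so a later op that requires contiguous input may still copy it.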

mcognetta added 3 commits May 25, 2026 10:12

vectorize promo logit computation

e93fe08

add shape information

ed4a0ad

remove trailing newline

f24d553

mcognetta wants to merge 3 commits into CSSLab:main from mcognetta:vectorized_promo