fix(estimators): negate RenyiNeg and FisherRao to match uncertainty orientation#459
Open
Yedson54 wants to merge 1 commit into
…core convention RenyiNeg and FisherRao compute a divergence/distance to the uniform token distribution. The raw quantity increases when token distributions are sharper, so it behaves as a certainty score. Most LM-Polygraph uncertainty estimators use the opposite convention, where higher scores indicate higher uncertainty. Negate the returned scores and clarify the convention in the estimator docstrings. BREAKING CHANGE: RenyiNeg and FisherRao scores change sign. Existing results using these estimators should be reinterpreted accordingly or recomputed.
Summary
`RenyiNeg` and `FisherRao` currently compute a token-level divergence/distance between the predictive distribution $p_t$ and the uniform token distribution $u$.

For `RenyiNeg`:

$$U_{\mathrm{RenyiNeg}}(x) = \frac{1}{T} \sum_{t=1}^{T} D_\alpha(p_t \Vert u)$$ (missing negative sign)

For `FisherRao`:

$$U_{\mathrm{FisherRao}}(x) = \frac{1}{T} \sum_{t=1}^{T} \mathrm{FR}(p_t, u)$$ (missing negative sign)

Both quantities increase when $p_t$ becomes more peaked and decrease as $p_t$ approaches the uniform distribution. The returned values therefore behave as certainty/confidence scores rather than uncertainty scores.
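The orientation issue can be checked numerically. The following is a minimal sketch, not the LM-Polygraph implementation: the function names, the default `alpha=0.5`, and the unscaled `2 * arccos(...)` form of the Fisher-Rao distance are assumptions made for illustration, and the library may normalize or parameterize these quantities differently.

```python
import math


def renyi_divergence_to_uniform(p, alpha=0.5):
    """Renyi divergence D_alpha(p || u), with u uniform over len(p) outcomes.

    Assumes alpha > 0 and alpha != 1 (hypothetical default, not the library's).
    """
    u = 1.0 / len(p)
    s = sum((pi ** alpha) * (u ** (1.0 - alpha)) for pi in p)
    return math.log(s) / (alpha - 1.0)


def fisher_rao_to_uniform(p):
    """Fisher-Rao distance 2 * arccos(sum_i sqrt(p_i * u_i)) to the uniform u."""
    n = len(p)
    bc = sum(math.sqrt(pi / n) for pi in p)
    bc = min(1.0, max(0.0, bc))  # guard against rounding just outside [0, 1]
    return 2.0 * math.acos(bc)


def mean_score(token_dists, distance, negate):
    """Average the per-token distance over the sequence; optionally negate."""
    avg = sum(distance(p) for p in token_dists) / len(token_dists)
    return -avg if negate else avg


# Toy sequences of three tokens over a four-word vocabulary.
peaked = [[0.97, 0.01, 0.01, 0.01]] * 3  # confident model
flat = [[0.25, 0.25, 0.25, 0.25]] * 3    # maximally uncertain model

for dist in (renyi_divergence_to_uniform, fisher_rao_to_uniform):
    raw_peaked = mean_score(peaked, dist, negate=False)
    raw_flat = mean_score(flat, dist, negate=False)
    neg_peaked = mean_score(peaked, dist, negate=True)
    neg_flat = mean_score(flat, dist, negate=True)
    print(dist.__name__, raw_peaked > raw_flat, neg_peaked < neg_flat)
```

Without negation the confident sequence receives the larger score; with negation the flat sequence does, which matches the "higher score means higher uncertainty" orientation.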
From my understanding, most other uncertainty estimators and evaluation protocols in LM-Polygraph seem to follow the convention:$$\text{higher score} \Rightarrow \text{higher uncertainty}.$$
To keep the score orientation consistent with the rest of the library, I negate the returned values of `RenyiNeg` and `FisherRao`. The underlying divergence/distance computations are unchanged; only the returned score orientation is modified. Docstrings are updated accordingly.

Breaking change [semantic; downstream usage; not functional]
Returned scores change sign.
Existing benchmark results, stored predictions, or downstream analyses using these estimators may therefore need to be reinterpreted with the flipped sign or recomputed.
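For stored scores, reinterpretation amounts to flipping the sign before any ranking-based evaluation. The sketch below uses made-up scores and labels, and a plain pairwise ROC-AUC as a stand-in metric; `flip_sign` and `roc_auc` are hypothetical helpers, not LM-Polygraph functions, and the library's own evaluation metrics may differ.

```python
def flip_sign(scores):
    """Convert scores saved before this change to the new orientation."""
    return [-s for s in scores]


def roc_auc(scores, labels):
    """Pairwise ROC-AUC: chance that an erroneous output (label 1) gets a
    higher score than a correct one (label 0); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in pos
        for b in neg
    )
    return wins / (len(pos) * len(neg))


# Illustrative data only: raw divergences (high = confident) and error labels.
old_scores = [2.0, 1.5, 0.3, 0.1]
is_error = [0, 0, 1, 1]

auc_old = roc_auc(old_scores, is_error)             # 0.0: ranks errors last
auc_new = roc_auc(flip_sign(old_scores), is_error)  # 1.0: ranks errors first
print(auc_old, auc_new)
```

Because negation reverses every pairwise comparison, this pairwise AUC of the flipped scores equals one minus the AUC of the originals, so previously reported values of such a metric can be converted without rerunning the model.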
Test plan
`flake8 src/lm_polygraph/estimators/renyi_neg.py src/lm_polygraph/estimators/fisher_rao.py`
`pytest --ignore=test/local`