Use Flex processing for the OpenAI judge model #152

Merged
JosephMarinier merged 1 commit into main from joseph/use-flex-tier-for-openai-judge-model
Jun 18, 2026

@JosephMarinier

Collaborator

Use Flex processing for the OpenAI judge model, which will halve its cost in exchange for slower response times and occasional resource unavailability. This doesn't affect benchmarking an OpenAI model. More details on the flex tier [here](https://developers.openai.com/api/docs/guides/flex-processing).
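
The trade-off described above (lower cost, but occasional resource unavailability) can be sketched as a small retry wrapper. This is only an illustration under stated assumptions: `send`, `ResourceUnavailableError`, and `call_with_flex_fallback` are hypothetical names, not part of this PR or of any SDK, and falling back to the default tier after repeated failures is one possible policy rather than what EVA does. The only detail taken from the OpenAI API is the `service_tier` request parameter set to `"flex"`.

```python
import time


class ResourceUnavailableError(Exception):
    """Hypothetical stand-in for a 'resource unavailable' response from the provider."""


def call_with_flex_fallback(send, params, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Try the flex tier with exponential backoff, then fall back to the default tier.

    `send` is any callable that takes a dict of request parameters and returns
    a response (an assumption for this sketch; it stands in for the real client call).
    """
    # Request parameters without any tier selection, used for the fallback call.
    standard_params = {k: v for k, v in params.items() if k != "service_tier"}
    flex_params = {**standard_params, "service_tier": "flex"}

    for attempt in range(max_retries):
        try:
            return send(flex_params)
        except ResourceUnavailableError:
            # Wait 1x, 2x, 4x, ... the base delay before the next flex attempt.
            sleep(base_delay * 2 ** attempt)

    # Flex capacity never became available: pay the standard price instead.
    return send(standard_params)
```

Injecting `sleep` keeps the helper easy to exercise in tests without real waiting. Flex requests are also slower, so a real client would likely need a longer request timeout than the default.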
@JosephMarinier JosephMarinier self-assigned this Jun 16, 2026
)
category = "accuracy"
default_model = "us.anthropic.claude-opus-4-6-v1"
default_params = {"max_tokens": 100000} # Drop the OpenAI-only flex tier inherited from TextJudgeMetric.

Collaborator

Are we sure 100k is enough? Did you check the prompt size after a 10 min convo?

@JosephMarinier JosephMarinier Jun 17, 2026

Collaborator Author

I did not check that myself, but it has been 100k since the beginning of EVA (inherited from src/eva/metrics/base.py). Is that OK?

Collaborator

ah right, I guess so then.
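
For readers following the thread, the override under review can be illustrated with a minimal sketch. It assumes that `TextJudgeMetric` declares `default_params` as a class attribute that includes the flex tier, and that a subclass using a non-OpenAI judge redeclares the attribute. The class bodies, the subclass name `NonOpenAIJudgeMetric`, and the helper `resolve_params` are hypothetical and are not taken from the EVA code base.

```python
class TextJudgeMetric:
    # Assumed shape of the base defaults after this PR: 100k output tokens
    # plus the OpenAI-only flex tier.
    default_params = {"max_tokens": 100000, "service_tier": "flex"}


class NonOpenAIJudgeMetric(TextJudgeMetric):  # hypothetical subclass name
    category = "accuracy"
    default_model = "us.anthropic.claude-opus-4-6-v1"
    # Redeclaring the attribute replaces the inherited dict entirely,
    # so "service_tier" is dropped rather than merged in.
    default_params = {"max_tokens": 100000}


def resolve_params(metric_cls, overrides=None):
    """Hypothetical helper: combine class defaults with per-run overrides."""
    return {**metric_cls.default_params, **(overrides or {})}
```

Because a class attribute assignment shadows the parent's dict instead of merging with it, the subclass has to restate `max_tokens` explicitly, which is why the 100k value reappears in the diff.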

@JosephMarinier JosephMarinier added this pull request to the merge queue Jun 18, 2026
Merged via the queue into main with commit b444624 Jun 18, 2026
2 checks passed
@JosephMarinier JosephMarinier deleted the joseph/use-flex-tier-for-openai-judge-model branch June 18, 2026 16:45

2 participants