}
````

#### Compatibility with `autoevals`

We maintain compatibility with the [autoevals package](https://github.com/braintrustdata/autoevals) from Braintrust. To use it you'll typically need to use the `partial` helper provided on the scorers. For example, with the `ClosedQA` scorer:

```javascript
import { describeEval } from "vitest-evals";
import { ClosedQA } from "autoevals";

describeEval("my evals", {
  data: async () => {
    // The scenarios you wish to evaluate
    return [
      {
        input: "What is the capital of France?",
        expected: "Paris",
      }
    ];
  },
  task: answerQuestion,
  scorers: [ClosedQA.partial({
    criteria: "Does the submission indicate that the question is out of scope?",
  })],
  threshold: 0.6,
})
```

### Separating Evals

An alternative to `skipIf` for controlling whether evals run is creating a separate `vitest` configuration for them. This has a number of advantages, particularly allowing you to maintain two completely separate test suites. A good pattern you can enable with this is a filename-based test selector:
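As a minimal sketch (the `vitest.config.evals.js` file name and the `.eval.ts` suffix are assumptions for illustration, not conventions this package prescribes), a dedicated config can restrict the run to files that follow your eval naming pattern:

```javascript
// vitest.config.evals.js — hypothetical file name; adjust the glob to your own convention
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    // Only collect files that use the eval suffix
    include: ["**/*.eval.ts"],
    // Evals call real models, so allow longer-running tests than the unit suite
    testTimeout: 60_000,
  },
});
```

You could then run the eval suite on its own, for example with `vitest run --config vitest.config.evals.js`, while your default `vitest` configuration keeps those files out of the ordinary unit-test run.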