Commit a3347ad

nmetulev and Copilot committed
Fix outdated GenAI APIs and add Windows ML guidance
- Update InferStreaming in get-started-models-genai.md to use current OnnxRuntimeGenAI 0.6+ APIs:
  - SetInputSequences -> generator.AppendTokenSequences (moved to Generator)
  - Remove ComputeLogits (handled by GenerateNextToken internally)

  These APIs were removed in the continuous decoding change: microsoft/onnxruntime-genai#1142
- Add tip recommending .WinML package for automatic EP selection in the GenAI walkthrough
- Add tip in ONNX WinUI walkthrough pointing to Windows ML as the recommended path for new apps (shared runtime + auto EP management)

Co-authored-by: Copilot <[email protected]>
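The API migration described in the commit message can be sketched roughly as follows. This is a minimal illustration rather than the code from the commit: the class name `GenAiMigrationSketch` and the method `Generate` are hypothetical, and it assumes OnnxRuntimeGenAI 0.6 or later with an already-loaded `Model` and `Tokenizer`. The removed calls appear only as comments.

```csharp
using System.Collections.Generic;
using Microsoft.ML.OnnxRuntimeGenAI;

// Hypothetical helper class, for illustration only.
public static class GenAiMigrationSketch
{
    // Streams decoded text pieces for a prompt using the 0.6+ call sequence.
    // Assumes the caller has already created the model and tokenizer.
    public static IEnumerable<string> Generate(Model model, Tokenizer tokenizer, string prompt)
    {
        using var generatorParams = new GeneratorParams(model);
        generatorParams.SetSearchOption("max_length", 2048);

        using var sequences = tokenizer.Encode(prompt);
        using var tokenizerStream = tokenizer.CreateStream();
        using var generator = new Generator(model, generatorParams);

        // Before 0.6: generatorParams.SetInputSequences(sequences);
        // Now the input tokens are appended on the Generator itself.
        generator.AppendTokenSequences(sequences);

        while (!generator.IsDone())
        {
            // Before 0.6: generator.ComputeLogits();
            // GenerateNextToken now handles that step internally.
            generator.GenerateNextToken();

            // Decode only the most recently generated token.
            string part = tokenizerStream.Decode(generator.GetSequence(0)[^1]);
            yield return part;
        }
    }
}
```

Unlike the walkthrough's `InferStreaming`, this sketch is synchronous and omits the stop-token checks, so it only shows where the two removed calls went.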
1 parent f89f68c commit a3347ad

2 files changed

Lines changed: 8 additions & 3 deletions

docs/models/get-started-models-genai.md

Lines changed: 5 additions & 3 deletions
@@ -29,6 +29,9 @@ In Visual Studio, create a new project. In the **Create a new project** dialog,
 
 In **Solution Explorer**, right-click **Dependencies** and select **Manage NuGet packages...**. In the NuGet package manager, select the **Browse** tab. Search for "Microsoft.ML.OnnxRuntimeGenAI.DirectML", select the latest stable version in the **Version** drop-down and then click **Install**.
 
+> [!TIP]
+> For new Windows apps, consider using the [Microsoft.ML.OnnxRuntimeGenAI.WinML](https://www.nuget.org/packages/Microsoft.ML.OnnxRuntimeGenAI.WinML/) package instead. It uses Windows ML to automatically select the best hardware (NPU → GPU → CPU) without you needing to choose a specific execution provider package. See [Run GenAI models with Windows ML](../new-windows-ml/run-genai-onnx-models.md) for more info.
+
 
 ## Add a model and vocabulary file to your project
 
@@ -133,7 +136,7 @@ private async void MainWindow_Activated(object sender, WindowActivatedEventArgs
 
 Create a helper method that submits the prompt to the model and then asynchronously returns the results to the caller with an [IAsyncEnumerable](/dotnet/api/system.collections.generic.iasyncenumerable-1).
 
-In this method, the [Generator](https://onnxruntime.ai/docs/genai/api/csharp.html#generator-class) class is used in a loop, calling **GenerateNextToken** in each pass to retrieve what the model predicts the next few characters, called a token, should be based on the input prompt. The loop runs until the generator **IsDone** method returns true or until any of the tokens "<|end|>", "<|system|>", or "<|user|>" are received, which signals that we can stop generating tokens.
+In this method, the [Generator](https://onnxruntime.ai/docs/genai/api/csharp.html#generator-class) class is used in a loop, calling **GenerateNextToken** in each pass to retrieve what the model predicts the next few characters, called a token, should be based on the input prompt. Input token sequences are provided to the generator via **AppendTokenSequences**. The loop runs until the generator **IsDone** method returns true or until any of the tokens "<|end|>", "<|system|>", or "<|user|>" are received, which signals that we can stop generating tokens.
 
 ```csharp
 public async IAsyncEnumerable<string> InferStreaming(string prompt)
@@ -148,19 +151,18 @@ public async IAsyncEnumerable<string> InferStreaming(string prompt)
     var sequences = tokenizer.Encode(prompt);
 
     generatorParams.SetSearchOption("max_length", 2048);
-    generatorParams.SetInputSequences(sequences);
     generatorParams.TryGraphCaptureWithMaxBatchSize(1);
 
     using var tokenizerStream = tokenizer.CreateStream();
     using var generator = new Generator(model, generatorParams);
+    generator.AppendTokenSequences(sequences);
     StringBuilder stringBuilder = new();
     while (!generator.IsDone())
     {
         string part;
         try
         {
             await Task.Delay(10).ConfigureAwait(false);
-            generator.ComputeLogits();
             generator.GenerateNextToken();
             part = tokenizerStream.Decode(generator.GetSequence(0)[^1]);
             stringBuilder.Append(part);

docs/models/get-started-onnx-winui.md

Lines changed: 3 additions & 0 deletions
@@ -10,6 +10,9 @@ no-loc: [ONNX Runtime, ONNX Runtime Generative AI, scikit-learn, DirectML Execut
 
 This article walks you through creating a WinUI app that uses an ONNX model to classify objects in an image and display the confidence of each classification. For more information on using AI and machine learning models in your windows app, see [Get started with AI on Windows](../overview.md).
 
+> [!TIP]
+> For new Windows apps, consider using [Windows ML](../new-windows-ml/overview.md) instead of bundling the ONNX Runtime directly. Windows ML provides a shared system-wide ONNX Runtime and automatically downloads the best execution providers for the user's hardware (GPU, NPU, CPU). See [Get started with Windows ML](../new-windows-ml/get-started.md) for setup instructions. This walkthrough demonstrates the standalone ONNX Runtime approach, which gives you full control over the runtime version and execution providers.
+
 When utilizing AI features, we recommend that you review: [Developing Responsible Generative AI Applications and Features on Windows](../rai.md).
 
 ## What is the ONNX runtime
