
Commit e7bdd38

Merge pull request #262736 from aahill/docs-editor/use-your-data-1704760862

updating system limits

Parents: eb8b050 + 99d37b5

1 file changed: 7 additions & 3 deletions

articles/ai-services/openai/concepts/use-your-data.md
@@ -8,7 +8,7 @@ ms.service: azure-ai-openai
 ms.topic: quickstart
 author: aahill
 ms.author: aahi
-ms.date: 11/14/2023
+ms.date: 01/09/2023
 recommendations: false
 ---
 
@@ -337,7 +337,7 @@ Use the following sections to help you configure Azure OpenAI on your data for o
 
 ### System message
 
-Give the model instructions about how it should behave and any context it should reference when generating a response. You can describe the assistant's personality, what it should and shouldn't answer, and how to format responses. There's no token limit for the system message, but will be included with every API call and counted against the overall token limit. The system message will be truncated if it's greater than 400 tokens.
+Give the model instructions about how it should behave and any context it should reference when generating a response. You can describe the assistant's personality, what it should and shouldn't answer, and how to format responses. There are token limits that apply to the system message, used with every API call, and counted against the overall token limit. The system message will be truncated if it exceeds the token limits listed in the [token estimation](#token-usage-estimation-for-azure-openai-on-your-data) section.
 
 For example, if you're creating a chatbot where the data consists of transcriptions of quarterly financial earnings calls, you might use the following system message:
 
@@ -358,6 +358,7 @@ Set a limit on the number of tokens per model response. The upper limit for Azur
 This option encourages the model to respond using your data only, and is selected by default. If you unselect this option, the model might more readily apply its internal knowledge to respond. Determine the correct selection based on your use case and scenario.
 
 
+
 ### Interacting with the model
 
 Use the following practices for best results when chatting with the model.
@@ -387,7 +388,6 @@ Avoid asking long questions and break them down into multiple questions if possi
 
 * *"**You are an AI assistant designed to help users extract information from retrieved Japanese documents. Please scrutinize the Japanese documents carefully before formulating a response. The user's query will be in Japanese, and you must response also in Japanese."*
 
-
 * If you have documents in multiple languages, we recommend building a new index for each language and connecting them separately to Azure OpenAI.
 
 ### Deploying the model
@@ -466,6 +466,8 @@ After you upload your data through Azure OpenAI studio, you can make a call agai
 
 
 
+
+
 |Parameter |Recommendation |
 |---------|---------|
 |`fieldsMapping` | Explicitly set the title and content fields of your index. This impacts the search retrieval quality of Azure AI Search, which impacts the overall response and citation quality. |
@@ -537,6 +539,7 @@ When you chat with a model, providing a history of the chat will help the model
 ## Token usage estimation for Azure OpenAI on your data
 
 
+
 | Model | Total tokens available | Max tokens for system message | Max tokens for model response |
 |-------------------------|------------------------|------------------------------------|------------------------------------|
 | ChatGPT Turbo (0301) 8k | 8000 | 400 | 1500 |
@@ -547,6 +550,7 @@ When you chat with a model, providing a history of the chat will help the model
 The table above shows the total number of tokens available for each model type. It also determines the maximum number of tokens that can be used for the [system message](#system-message) and the model response. Additionally, the following also consume tokens:
 
 
+
 * The meta prompt (MP): if you limit responses from the model to the grounding data content (`inScope=True` in the API), the maximum number of tokens is 4036 tokens. Otherwise (for example if `inScope=False`) the maximum is 3444 tokens. This number is variable depending on the token length of the user question and conversation history. This estimate includes the base prompt as well as the query rewriting prompts for retrieval.
 * User question and history: Variable but capped at 2000 tokens.
 * Retrieved documents (chunks): The number of tokens used by the retrieved document chunks depends on multiple factors. The upper bound for this is the number of retrieved document chunks multiplied by the chunk size. It will, however, be truncated based on the tokens available tokens for the specific model being used after counting the rest of fields.
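As a rough illustration of the token accounting described in the changed section, the per-call budget can be sketched as follows. This is a minimal worst-case sketch, not part of any Azure SDK; the function and dictionary names are hypothetical, and the limits are taken from the table and bullet list in the diff above (system message 400, response 1500, meta prompt 4036/3444 depending on `inScope`, user question and history capped at 2000).

```python
# Hypothetical sketch of the worst-case token budget for Azure OpenAI on
# your data. Names are illustrative only; limits come from the docs diff.

MODEL_LIMITS = {
    # model: (total tokens, max system message, max model response)
    "ChatGPT Turbo (0301) 8k": (8000, 400, 1500),
}

META_PROMPT_MAX = {True: 4036, False: 3444}  # keyed by inScope
USER_AND_HISTORY_MAX = 2000                  # capped per the bullet list


def tokens_left_for_chunks(model: str, in_scope: bool) -> int:
    """Worst-case upper bound on tokens left for retrieved document chunks,
    assuming every other component uses its documented maximum."""
    total, system_max, response_max = MODEL_LIMITS[model]
    used = (system_max + response_max
            + META_PROMPT_MAX[in_scope] + USER_AND_HISTORY_MAX)
    return total - used


print(tokens_left_for_chunks("ChatGPT Turbo (0301) 8k", in_scope=True))
```

In practice the meta prompt and conversation history vary in length, so more tokens are usually available for retrieved chunks than this floor suggests; the service truncates chunks to whatever remains.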
