articles/ai-services/openai/concepts/use-your-data.md
7 additions & 3 deletions
@@ -8,7 +8,7 @@ ms.service: azure-ai-openai
 ms.topic: quickstart
 author: aahill
 ms.author: aahi
-ms.date: 11/14/2023
+ms.date: 01/09/2023
 recommendations: false
 ---
@@ -337,7 +337,7 @@ Use the following sections to help you configure Azure OpenAI on your data for o
 
 ### System message
 
-Give the model instructions about how it should behave and any context it should reference when generating a response. You can describe the assistant's personality, what it should and shouldn't answer, and how to format responses. There's no token limit for the system message, but will be included with every API call and counted against the overall token limit. The system message will be truncated if it's greater than 400 tokens.
+Give the model instructions about how it should behave and any context it should reference when generating a response. You can describe the assistant's personality, what it should and shouldn't answer, and how to format responses. Token limits apply to the system message, which is used with every API call and counted against the overall token limit. The system message will be truncated if it exceeds the token limits listed in the [token estimation](#token-usage-estimation-for-azure-openai-on-your-data) section.
 
 For example, if you're creating a chatbot where the data consists of transcriptions of quarterly financial earnings calls, you might use the following system message:
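The truncation behavior described in the new wording above can be sketched as follows. This is an illustrative sketch only: the 400-token default (from the old wording) and the 4-characters-per-token heuristic are assumptions for illustration, not values defined by the service, whose actual per-model limits appear in the token estimation table later in the article.

```python
# Illustrative sketch: if the system message exceeds the per-model token
# budget, the service keeps only the part that fits. The 400-token default
# and the 4-chars-per-token heuristic are assumptions, not API-defined values.

def truncate_system_message(message: str, max_tokens: int = 400) -> str:
    max_chars = max_tokens * 4          # crude chars-to-tokens estimate
    if len(message) <= max_chars:
        return message                  # within budget: unchanged
    return message[:max_chars]          # over budget: truncated

short = "You are an AI assistant that answers from earnings call transcripts."
print(truncate_system_message(short) == short)  # True
```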
@@ -358,6 +358,7 @@ Set a limit on the number of tokens per model response. The upper limit for Azur
 This option encourages the model to respond using your data only, and is selected by default. If you unselect this option, the model might more readily apply its internal knowledge to respond. Determine the correct selection based on your use case and scenario.
 
 
+
 ### Interacting with the model
 
 Use the following practices for best results when chatting with the model.
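In the API, the "limit responses to your data" toggle described above corresponds to an `inScope` flag on the data source. A minimal sketch of building that data-source block follows; the payload shape (key names, `AzureCognitiveSearch` type string) is an assumption based on the on-your-data request format rather than something stated in this article.

```python
# Hedged sketch of the extra data-source block for an on-your-data chat
# request. Key names and the type string are assumptions for illustration.

def azure_search_data_source(endpoint: str, index_name: str,
                             in_scope: bool = True) -> dict:
    """Build a data-source block carrying the inScope toggle."""
    return {
        "type": "AzureCognitiveSearch",
        "parameters": {
            "endpoint": endpoint,
            "indexName": index_name,
            "inScope": in_scope,  # True (default): answer from retrieved data only
        },
    }

source = azure_search_data_source("https://example.search.windows.net", "my-index")
print(source["parameters"]["inScope"])  # True
```

Setting `in_scope=False` here corresponds to unselecting the option in the studio, letting the model fall back on its internal knowledge.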
@@ -387,7 +388,6 @@ Avoid asking long questions and break them down into multiple questions if possi
 
 *"You are an AI assistant designed to help users extract information from retrieved Japanese documents. Please scrutinize the Japanese documents carefully before formulating a response. The user's query will be in Japanese, and you must also respond in Japanese."*
 
-
 * If you have documents in multiple languages, we recommend building a new index for each language and connecting them separately to Azure OpenAI.
 
 ### Deploying the model
@@ -466,6 +466,8 @@ After you upload your data through Azure OpenAI studio, you can make a call agai
 
 
+
+
 |Parameter |Recommendation |
 |---------|---------|
 |`fieldsMapping`| Explicitly set the title and content fields of your index. This affects the search retrieval quality of Azure AI Search, which in turn affects the overall response and citation quality. |
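An explicit `fieldsMapping`, per the recommendation in the row above, might look like the sketch below. The index field names on the right (`doc_title`, `doc_text`, `doc_url`) are placeholders for whatever your Azure AI Search index actually defines, and the key names follow the on-your-data request shape as an assumption, not something specified in this article.

```python
# Hedged sketch of an explicit fieldsMapping block; field names on the
# right are placeholders, key names are assumed from the request shape.

fields_mapping = {
    "titleField": "doc_title",        # used for citation titles
    "urlField": "doc_url",            # link attached to citations
    "contentFields": ["doc_text"],    # text the model grounds responses on
    "contentFieldsSeparator": "\n",   # joins multiple content fields
}

print(sorted(fields_mapping))
# ['contentFields', 'contentFieldsSeparator', 'titleField', 'urlField']
```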
@@ -537,6 +539,7 @@ When you chat with a model, providing a history of the chat will help the model
 
 ## Token usage estimation for Azure OpenAI on your data
 
+
 | Model | Total tokens available | Max tokens for system message | Max tokens for model response |
@@ -547,6 +550,7 @@ When you chat with a model, providing a history of the chat will help the model
 The table above shows the total number of tokens available for each model type, as well as the maximum number of tokens that can be used for the [system message](#system-message) and the model response. Additionally, the following also consume tokens:
 
 
+
 * The meta prompt (MP): if you limit responses from the model to the grounding data content (`inScope=True` in the API), the maximum number of tokens is 4036. Otherwise (for example, if `inScope=False`), the maximum is 3444 tokens. This number varies depending on the token length of the user question and conversation history. This estimate includes the base prompt as well as the query rewriting prompts for retrieval.
 * User question and history: variable, but capped at 2000 tokens.
 * Retrieved documents (chunks): the number of tokens used by the retrieved document chunks depends on multiple factors. The upper bound is the number of retrieved document chunks multiplied by the chunk size. The chunks will, however, be truncated based on the tokens available for the specific model being used after counting the rest of the fields.
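The budget arithmetic in the bullets above can be sketched as a back-of-the-envelope calculation: tokens left for retrieved chunks equal the total token budget minus the system message, the model response, the meta prompt, and the user question/history cap. The 16,000-token total in the example below is an illustrative figure, not a value taken from the table.

```python
# Back-of-the-envelope sketch of the token budget described above. The
# meta-prompt (4036, inScope=True) and history cap (2000) defaults come from
# the bullets; the 16000 total in the example call is an assumption.

def tokens_left_for_chunks(total_tokens: int, system_message: int,
                           max_response: int, meta_prompt: int = 4036,
                           history_cap: int = 2000) -> int:
    remaining = total_tokens - system_message - max_response - meta_prompt - history_cap
    return max(remaining, 0)  # truncation floor: never negative

print(tokens_left_for_chunks(16000, 400, 1500))  # 8064
```

When the remainder is small or zero, the service would have to truncate the retrieved chunks, which is the behavior the last bullet describes.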