
Commit 249a50d

Author: Sherry Yang
Commit message: Update.
1 parent 8a4b32e · commit 249a50d

1 file changed

Lines changed: 1 addition & 1 deletion

File tree

learn-pr/wwl-data-ai/get-started-with-generative-ai-and-agents/includes/2-generative-ai-models.md

@@ -103,7 +103,7 @@ Deployment parameters that you can customize in Foundry include:
 > [!NOTE]
 > A **token** is the smallest unit of text or data that a generative AI model can process. Models break input into tokens—such as words, subwords, characters, or punctuation—so they can understand and generate language efficiently.
 
-When you deploy a model, you can assign it a *Tokens Per Minute* (TPM) allocation. TPM determines the speed and scale the model can process inputs and the rate‑limit boundaries such as requests per minute (RPM).
+When you deploy a model, you can assign it a *Tokens Per Minute* (TPM) allocation. TPM determines the speed and scale at which the model can process inputs and the rate‑limit boundaries such as requests per minute (RPM). When you assign a higher TPM allocation to a model deployment, you're increasing its capacity to handle token traffic per minute. A lower TPM reduces how fast your deployment is allowed to consume tokens across requests.
 
 Limits differ by model family, for example:
 - High‑end reasoning models (for example: DeepSeek R1, Grok, large Llama versions) may have high TPM ceilings.
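The note in the diff above defines a token as the smallest unit of text a model processes. As an illustrative sketch only (the module doesn't prescribe a tokenizer; the open-source `tiktoken` library and the `cl100k_base` encoding are assumptions here), you can see how a sentence splits into tokens and what a request "costs" in token terms:

```python
import tiktoken  # assumption: not named in the module; a commonly used open-source tokenizer

# cl100k_base is one common encoding; the actual tokenizer depends on the model.
encoding = tiktoken.get_encoding("cl100k_base")

text = "Generative AI models break input into tokens."
token_ids = encoding.encode(text)

print(len(token_ids))                                 # how many tokens this sentence costs
print([encoding.decode([tid]) for tid in token_ids])  # the individual token strings
```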

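The sentence added in this commit says that a higher TPM allocation increases the token traffic a deployment can handle per minute, while a lower TPM slows how fast tokens can be consumed across requests. As a rough conceptual sketch (a fixed-window budget written purely for illustration, not how Foundry enforces its limits, and with a hypothetical `TpmBudget` class name), the effect of a TPM allocation looks like this:

```python
import time

class TpmBudget:
    """Toy per-minute token budget, illustrating how a TPM allocation
    caps how many tokens a deployment may consume across requests."""

    def __init__(self, tokens_per_minute: int):
        self.tpm = tokens_per_minute
        self.window_start = time.monotonic()
        self.used = 0

    def try_consume(self, tokens: int) -> bool:
        now = time.monotonic()
        # Reset the budget at the start of each one-minute window.
        if now - self.window_start >= 60:
            self.window_start = now
            self.used = 0
        if self.used + tokens > self.tpm:
            return False  # request would exceed the TPM allocation in this window
        self.used += tokens
        return True

# A deployment with a 10,000 TPM allocation absorbs more token traffic per
# minute than one with 1,000 TPM before requests start being turned away.
budget = TpmBudget(tokens_per_minute=10_000)
print(budget.try_consume(4_000))  # True
print(budget.try_consume(7_000))  # False: 4,000 + 7,000 exceeds 10,000 this window
```

With a larger `tokens_per_minute` value, more (or larger) requests fit into each window before calls start being rejected, which is the behavior the added sentence describes.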