learn-pr/wwl/manage-testing-ai-powered-business-solutions/5-design-test-scenarios-ai-solutions-multiple-dynamics-365-apps.yml
learn-pr/wwl/manage-testing-ai-powered-business-solutions/includes/2-recommend-process-metrics-test-agents.md
24 additions, 24 deletions
@@ -2,11 +2,11 @@
This unit teaches solution architects how to design and implement a structured and repeatable process for testing AI agents before production deployment. Testing ensures that agents operate reliably, meet business requirements, and behave predictably across diverse scenarios. You'll define key performance metrics, establish standardized test plans, and recommend measurement strategies to validate agent quality, usability, and compliance.
-## 1. Testing Framework for AI Agents
+## 1. Testing framework for AI agents
It's important to create a testing framework for all AI agents.
-### 1.1 Establish the Testing Objective
+### 1.1 Establish the testing objective
#### Before testing begins, define the purpose of the test:
@@ -18,7 +18,7 @@ It's important to create a testing framework for all AI Agents.
- Detect issues early and establish a baseline for future performance tuning.
-### 1.2 Develop a Structured Test Plan
+### 1.2 Develop a structured test plan
#### A complete agent testing plan should include:
@@ -30,11 +30,11 @@ It's important to create a testing framework for all AI Agents.
-**Success Criteria** - measurable thresholds for accuracy, speed, safety, and usability.
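A structured test plan with measurable success criteria can be sketched as a simple data structure. This is a minimal illustration, not part of the module; all field names and threshold values below are hypothetical assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class AgentTestPlan:
    """Illustrative agent test plan (all fields are hypothetical)."""
    objective: str
    scenarios: list = field(default_factory=list)
    # Success criteria: measurable thresholds for accuracy, speed, and safety.
    success_criteria: dict = field(default_factory=dict)

plan = AgentTestPlan(
    objective="Validate order-status agent before production",
    scenarios=["order lookup", "refund request", "escalation to human"],
    success_criteria={"accuracy": 0.90, "p95_latency_s": 3.0, "guardrail_violations": 0},
)
print(len(plan.scenarios))  # 3
```

Keeping the plan as data makes it easy to reuse one blueprint across multiple agent implementations.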
-## 2. Recommended Testing Process
+## 2. Recommended testing process
Several types of testing can be performed against AI agents, either manually or through an automated testing process.
-### 2.1 Scenario-Based Testing
+### 2.1 Scenario-based testing
- Use real business workflows that reflect how employees will interact with the agent.
@@ -44,7 +44,7 @@ There are several types of testing which can occur against AI agents. They can b
- Ensure agent output matches expected outcomes for each scenario.
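Scenario-based checks like the one above can be automated by pairing each prompt with its expected outcome. A minimal sketch, assuming a stand-in agent function; the scenarios, prompts, and `fake_agent` helper are all hypothetical:

```python
# Hypothetical scenarios reflecting real business workflows.
scenarios = [
    {"prompt": "Where is order 1001?", "expected": "shipped"},
    {"prompt": "Cancel order 1002", "expected": "cancelled"},
]

def fake_agent(prompt: str) -> str:
    # Stand-in for a real agent call; assumed to return a status keyword.
    return "shipped" if "Where" in prompt else "cancelled"

# Ensure agent output matches the expected outcome for each scenario.
results = [fake_agent(s["prompt"]) == s["expected"] for s in scenarios]
print(all(results))  # True
```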
-### 2.2 Performance and Reliability Testing
+### 2.2 Performance and reliability testing
#### Evaluate how the agent performs under different conditions:
@@ -56,7 +56,7 @@ There are several types of testing which can occur against AI agents. They can b
- Concurrent sessions.
-### 2.3 Safety and Compliance Testing
+### 2.3 Safety and compliance testing
#### Confirm the agent respects enterprise constraints:
@@ -68,7 +68,7 @@ There are several types of testing which can occur against AI agents. They can b
- Rejection of disallowed instructions.
-### 2.4 Usability Testing
+### 2.4 Usability testing
#### Assess agent clarity, helpfulness, and ease of use:
@@ -78,91 +78,91 @@ There are several types of testing which can occur against AI agents. They can b
- Do users understand how to prompt the agent effectively?
-## 3. Metrics to Validate Agent Performance
+## 3. Metrics to validate agent performance
When measuring an AI agent's performance, consider the following metrics.
-### 3.1 Core Quantitative Metrics
+### 3.1 Core quantitative metrics
Use measurable indicators to determine whether the agent is performing optimally.
-#### Accuracy and Relevance
+#### Accuracy and relevance
- Percentage of responses that correctly answer the user's intent.
- Alignment with the expected business process.
-#### Response Time
+#### Response time
- How quickly the agent generates useful answers.
- Variability of response time across different tasks.
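Response-time speed and variability can be summarized with standard statistics. A minimal sketch; the latency values are hypothetical sample data:

```python
import statistics

# Hypothetical response latencies (seconds) collected across test runs.
latencies = [0.8, 1.1, 0.9, 2.4, 1.0, 0.7, 3.2, 1.2]

mean_latency = statistics.mean(latencies)
# Variability across tasks: standard deviation plus a rough 95th percentile.
stdev_latency = statistics.stdev(latencies)
p95 = sorted(latencies)[int(0.95 * (len(latencies) - 1))]
print(round(mean_latency, 2), p95)  # 1.41 2.4
```

Tracking a high percentile alongside the mean surfaces slow outliers that an average alone would hide.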
-#### Success Rate
+#### Success rate
- Percentage of tasks fully completed without human intervention.
-#### Failure Rate
+#### Failure rate
- Incorrect, incomplete, or unusable answers.
- Frequency of unexpected errors or guardrail triggers.
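Success and failure rates are simple ratios over test outcomes. A minimal sketch using hypothetical outcome labels:

```python
# Hypothetical per-task outcomes from one test run.
outcomes = ["success", "success", "failure", "success", "guardrail", "success"]

total = len(outcomes)
# Success: tasks fully completed without human intervention.
success_rate = outcomes.count("success") / total
# Failure: incorrect, incomplete, or guardrail-blocked results.
failure_rate = 1 - success_rate
print(round(success_rate, 2), round(failure_rate, 2))  # 0.67 0.33
```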
-#### Token Efficiency (for generative agents)
+#### Token efficiency (for generative agents)
- Amount of content generated relative to cost.
- Signs of overly verbose or inefficient prompting.
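Token efficiency can be approximated as tokens consumed per useful answer, which flags verbose or inefficient prompting. A minimal sketch; the usage records are hypothetical:

```python
# Hypothetical usage records: tokens consumed per answered task.
runs = [
    {"tokens": 350, "useful": True},
    {"tokens": 1200, "useful": True},
    {"tokens": 900, "useful": False},  # verbose answer that missed the intent
]

total_tokens = sum(r["tokens"] for r in runs)
useful_answers = sum(r["useful"] for r in runs)
# Lower is better: content generated relative to cost.
tokens_per_useful_answer = total_tokens / useful_answers
print(tokens_per_useful_answer)  # 1225.0
```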
-### 3.2 Behavioral and Quality Metrics
+### 3.2 Behavioral and quality metrics
-#### User Satisfaction
+#### User satisfaction
- Survey or rating-based signals.
- Number of escalations or repeated attempts.
-#### Conversation Quality
+#### Conversation quality
- Coherence.
- Step-by-step reasoning quality.
- Ability to interpret follow-up questions.
-#### Knowledge Coverage
+#### Knowledge coverage
- Depth and breadth of domain knowledge.
- Completeness of grounding sources.
- Gaps where the agent fails to retrieve necessary information.
-### 3.3 Observability and Operational Metrics
+### 3.3 Observability and operational metrics
#### Stability
- Sessions completed without interruption.
- Error spikes or instability patterns.
-#### Load Handling
+#### Load handling
- Agent behavior under heavy usage.
- Throughput capacity.
-#### Guardrail Compliance
+#### Guardrail compliance
- Count of prevented actions.
- Instances where the agent approached restricted content.
-## 4. Agent Testing Lifecycle
+## 4. Agent testing lifecycle
:::image type="content" source="../media/agent-testing-lifecycle.png" alt-text="Diagram showing the agent testing lifecycle: Test Planning, Scenario Design, Execution, Measurement, Analysis, Tuning, Re-Test, Approval, and Deployment." border="false":::
-## 5. Recommendations for Solution Architects
+## 5. Recommendations for solution architects
- Create a unified **testing blueprint** used across all agent implementations.
learn-pr/wwl/manage-testing-ai-powered-business-solutions/includes/3-create-validation-criteria-custom-ai-models.md
15 additions, 15 deletions
@@ -14,11 +14,11 @@ Validation criteria help architects consistently confirm that a model is:
- Evaluated consistently before, during, and after deployment.
-## 1. Foundations of Model Validation
+## 1. Foundations of model validation
Model validation establishes whether a custom AI model performs as expected and maintains consistent quality in production.
-### Core Questions for Validation
+### Core questions for validation
-_Does the model generate correct, relevant, grounded outputs?_
@@ -28,7 +28,7 @@ Model validation establishes whether a custom AI model performs as expected and
-_Is model behavior aligned with established business intent and expected outcomes?_
-### Key Validation Dimensions
+### Key validation dimensions
-**Performance metrics**
@@ -40,11 +40,11 @@ Model validation establishes whether a custom AI model performs as expected and
-**User-centric metrics**
-## 2. Define Quantitative Validation Criteria
+## 2. Define quantitative validation criteria
Quantitative criteria ensure measurable and repeatable evaluation during tuning or deployment.
-### Primary Metrics
+### Primary metrics
The following key metrics must be included when evaluating custom AI models.
@@ -55,22 +55,22 @@ Below are key metrics that must be included in the evaluation of custom AI Model
-**Token Efficiency**<br>Amount of model usage cost relative to output quality.
-**Drift Indicators**<br>Changes in output quality due to evolving data or shifting patterns.
-**Relevance and Completeness**<br>Does the model respond with the right level of detail, in context, without hallucinations?
+**Relevance and Completeness**<br>Does the model respond with the right level of detail, in context, without incorrect information?
-**Consistency of Reasoning**<br>Does the model follow logical steps aligned with enterprise workflows?
-**Grounding Integrity**<br>Does the model use approved organizational knowledge?
-**User Experience Quality**<br>Clarity, coherence, readability, and instructional usefulness.
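Quantitative criteria like these are only repeatable if each metric is checked against an explicit threshold. A minimal gating sketch; the metric names, measured values, and thresholds below are all hypothetical assumptions:

```python
# Hypothetical measured metrics for a custom model, checked against thresholds.
measured = {"accuracy": 0.93, "p95_latency_s": 2.1, "grounded_rate": 0.97}
thresholds = {"accuracy": 0.90, "p95_latency_s": 3.0, "grounded_rate": 0.95}

def passes(measured: dict, thresholds: dict) -> bool:
    # Latency must stay below its threshold; other metrics must meet or exceed theirs.
    for name, limit in thresholds.items():
        value = measured[name]
        ok = value <= limit if name.endswith("latency_s") else value >= limit
        if not ok:
            return False
    return True

print(passes(measured, thresholds))  # True
```

Running the same gate before, during, and after deployment gives the consistent evaluation the unit calls for.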
-## 4. Establish Safety and Compliance Validation
+## 4. Establish safety and compliance validation
Before production, custom models must satisfy enterprise governance requirements. Organizations may impose additional requirements; use the following as a neutral baseline.
-### Key Safety Criteria
+### Key safety criteria
- Enforces role-based access to restricted content.
@@ -80,30 +80,30 @@ Before production, custom models must satisfy enterprise governance requirements
- Maintains auditability and traceability of actions.
-### Risk-Mitigation Requirements
+### Risk-mitigation requirements
- Human-in-the-loop review for sensitive workflows.
- Guardrail testing for disallowed instructions.
- Verified grounding exclusively in authorized knowledge sources.
-## 5. Operational Validation Criteria
+## 5. Operational validation criteria
Operational validation ensures the model can be trusted in real systems.
-### Areas to Validate
+### Areas to validate
-**Scalability** - Stable behavior under varying compute and workload patterns.<br>**Resilience** - Recovery from errors, timeouts, or dependency interruptions.<br>**Integration Reliability** - Works consistently with APIs, connectors, or orchestration components.<br>**Monitoring Support** - Telemetry produced is adequate for observability and triage.
-## 6. Example Validation Metrics for Custom Models
+## 6. Example validation metrics for custom models