Commit f69584c

Merge pull request #53712 from MicrosoftDocs/NEW-module9-manage2
Release branch to Main (module 9)
2 parents ad554ec + 39c7480 commit f69584c

21 files changed

Lines changed: 949 additions & 0 deletions
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
### YamlMime:ModuleUnit
uid: learn.wwl.manage-testing-ai-powered-business-solutions.introduction
title: "Introduction"
metadata:
  title: "Introduction"
  description: "Learn essential practices to validate and maintain the quality of AI-powered business solutions with structured testing and governance."
  ms.date: 02/13/2026
  author: msdavidram
  ms.author: taeldin
  ms.topic: unit
durationInMinutes: 3
content: |
  [!include[](includes/1-introduction.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
### YamlMime:ModuleUnit
uid: learn.wwl.manage-testing-ai-powered-business-solutions.recommend-process-metrics-test-agents
title: "Recommend process metrics for testing AI agents"
metadata:
  title: "Recommend Process Metrics for Testing AI Agents"
  description: "Learn how to recommend process metrics for testing AI agents to ensure reliability, usability, and compliance before production deployment."
  ms.date: 02/13/2026
  author: msdavidram
  ms.author: taeldin
  ms.topic: unit
durationInMinutes: 5
content: |
  [!include[](includes/2-recommend-process-metrics-test-agents.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
### YamlMime:ModuleUnit
uid: learn.wwl.manage-testing-ai-powered-business-solutions.create-validation-criteria-custom-ai-models
title: "Create validation criteria for custom AI models"
metadata:
  title: "Create Validation Criteria for Custom AI Models"
  description: "Learn how to define robust validation criteria for custom AI models to ensure accuracy, reliability, safety, and scalability in enterprise environments."
  ms.date: 02/13/2026
  author: msdavidram
  ms.author: taeldin
  ms.topic: unit
durationInMinutes: 6
content: |
  [!include[](includes/3-create-validation-criteria-custom-ai-models.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
### YamlMime:ModuleUnit
uid: learn.wwl.manage-testing-ai-powered-business-solutions.validate-effective-copilot-prompt-best-practices
title: "Validate effective Copilot prompt best practices"
metadata:
  title: "Validate Effective Copilot Prompt Best Practices"
  description: "Learn how to validate effective Copilot prompt best practices to ensure clarity, safety, and high-quality AI output for enterprise workflows."
  ms.date: 02/13/2026
  author: msdavidram
  ms.author: taeldin
  ms.topic: unit
durationInMinutes: 5
content: |
  [!include[](includes/4-validate-effective-copilot-prompt-best-practices.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
### YamlMime:ModuleUnit
uid: learn.wwl.manage-testing-ai-powered-business-solutions.design-test-scenarios-ai-solutions-multiple-dynamics-365-apps
title: "Design end-to-end test scenarios for AI solutions using multiple Dynamics 365 apps"
metadata:
  title: "Design End-to-End Test Scenarios for AI Solutions"
  description: "Learn how to design end-to-end test scenarios for AI solutions that integrate multiple Dynamics 365 apps, ensuring seamless workflows and accurate outputs."
  ms.date: 02/13/2026
  author: msdavidram
  ms.author: taeldin
  ms.topic: unit
durationInMinutes: 5
content: |
  [!include[](includes/5-design-test-scenarios-ai-solutions-multiple-dynamics-365-apps.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
### YamlMime:ModuleUnit
uid: learn.wwl.manage-testing-ai-powered-business-solutions.build-strategy-creating-test-cases-using-copilot
title: "Build a strategy for creating test cases using Copilot"
metadata:
  title: "Build a Strategy for Creating Test Cases Using Copilot"
  description: "Learn to build a scalable strategy for creating high-quality test cases using Copilot. Strengthen reliability, coverage, and consistency in AI solutions."
  ms.date: 02/13/2026
  author: msdavidram
  ms.author: taeldin
  ms.topic: unit
durationInMinutes: 5
content: |
  [!include[](includes/6-build-strategy-creating-test-cases-using-copilot.md)]
Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
### YamlMime:ModuleUnit
uid: learn.wwl.manage-testing-ai-powered-business-solutions.knowledge-check
title: "Module assessment"
metadata:
  title: "Knowledge check"
  description: "Knowledge check"
  ms.date: 02/13/2026
  author: msdavidram
  ms.author: taeldin
  ms.topic: unit
  module_assessment: false
durationInMinutes: 3
content: "Choose the best response for each of the following questions."
quiz:
  questions:
  - content: "Which statement best explains why traditional software testing isn't sufficient for AI-powered business solutions?"
    choices:
    - content: "AI solutions require more UI automation than traditional apps."
      isCorrect: false
      explanation: "Incorrect. While UI automation may be part of testing, it doesn't address the unique challenges of AI systems."
    - content: "AI outputs are probabilistic and can vary based on context, data, and input phrasing."
      isCorrect: true
      explanation: "Correct. AI models produce probabilistic, context-dependent outputs. Two similar inputs can generate different results depending on data grounding, prompt variations, or system state. This variability requires new testing approaches that validate behavior patterns, safety, guardrails, grounding integrity, and consistency—areas traditional deterministic software testing doesn't fully cover."
    - content: "AI systems don't require validation of compliance or safety."
      isCorrect: false
      explanation: "Incorrect. AI systems often require rigorous validation for compliance and safety, especially in regulated industries."
    - content: "AI solutions only need to be tested once before deployment."
      isCorrect: false
      explanation: "Incorrect. AI solutions require continuous testing and monitoring to ensure consistent performance and alignment with business goals."
  - content: "Which metric is most important when validating whether an AI solution is producing outputs aligned with business outcomes?"
    choices:
    - content: "Length of the conversation"
      isCorrect: false
      explanation: "Incorrect. Conversation length doesn't measure the quality or alignment of AI outputs with business outcomes."
    - content: "Number of users interacting with the AI"
      isCorrect: false
      explanation: "Incorrect. While user engagement is important, it doesn't directly validate the accuracy or relevance of AI outputs."
    - content: "Accuracy and relevance of the AI's responses"
      isCorrect: true
      explanation: "Correct. To ensure an AI solution is delivering value, you must confirm that the outputs are accurate, relevant, and aligned with business intent. This directly validates whether the AI is producing correct insights or actions that support real business workflows."
    - content: "Frequency of UI updates"
      isCorrect: false
      explanation: "Incorrect. UI updates are unrelated to the validation of AI output quality or business alignment."
  - content: "Why must end-to-end test scenarios for AI solutions validate cross-app data flow across multiple Dynamics 365 applications?"
    choices:
    - content: "Because each Dynamics 365 app requires a separate AI model."
      isCorrect: false
      explanation: "Incorrect. AI solutions can share models across apps. The need for cross-app testing is driven by data dependencies, not separate models."
    - content: "Because AI output quality depends on consistent, trusted, and well-timed input data from across integrated systems."
      isCorrect: true
      explanation: "Correct. AI decisions depend on data flowing across multiple apps. Workflow orchestration can break when one app changes, data may be duplicated or transformed, and AI output quality relies on consistent, trusted, well-timed input data. Testing must validate the entire business process to ensure accurate outputs."
    - content: "Because testing individual apps isn't possible with AI solutions."
      isCorrect: false
      explanation: "Incorrect. Individual app testing is still possible, but it doesn't validate the end-to-end behavior of AI solutions that span multiple systems."
    - content: "Because Dynamics 365 apps can't share data without manual intervention."
      isCorrect: false
      explanation: "Incorrect. Dynamics 365 apps integrate through connectors, data sync, and automations. The challenge is validating that these integrations work correctly for AI-driven workflows."
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
### YamlMime:ModuleUnit
uid: learn.wwl.manage-testing-ai-powered-business-solutions.summary
title: "Summary"
metadata:
  title: "Summary"
  description: "Discover key practices for testing AI solutions, ensuring reliability, safety, and performance across dynamic business environments."
  ms.date: 02/13/2026
  author: msdavidram
  ms.author: taeldin
  ms.topic: unit
durationInMinutes: 3
content: |
  [!include[](includes/8-summary.md)]
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
This module introduces solution architects to the essential practices required to validate and maintain the quality of AI-powered business solutions across the enterprise. Because AI systems generate probabilistic outputs and rely on dynamic data sources, traditional testing methods aren't sufficient. This module equips learners with the frameworks, metrics, and governance needed to ensure AI solutions behave reliably, safely, and in alignment with business goals.

Learners will explore how to design structured testing processes for agents, custom AI models, prompts, and end-to-end multi-application scenarios. Each unit provides practical guidance on defining objectives, creating measurable validation criteria, evaluating safety and compliance, and understanding how data flows across integrated business applications affect AI behavior.

The module emphasizes measurable performance indicators such as accuracy, latency, stability, guardrail adherence, and user experience quality. Learners also gain strategies for validating prompt design, assessing grounding integrity, and ensuring predictable AI reasoning across varied scenarios and user types.

Finally, the module introduces scalable testing strategies using Copilot to accelerate test case creation while maintaining consistency, coverage, and governance. By the end, solution architects will be able to design repeatable testing frameworks that ensure AI solutions remain trustworthy, resilient, and aligned to enterprise requirements throughout their lifecycle.
Lines changed: 179 additions & 0 deletions
@@ -0,0 +1,179 @@
## Overview

This unit teaches solution architects how to design and implement a structured, repeatable process for testing AI agents before production deployment. Testing ensures that agents operate reliably, meet business requirements, and behave predictably across diverse scenarios. You'll define key performance metrics, establish standardized test plans, and recommend measurement strategies to validate agent quality, usability, and compliance.

## 1. Testing framework for AI agents

Establish a consistent testing framework for all AI agents.

### 1.1 Establish the testing objective

Before testing begins, define the purpose of the test:

- Validate the agent's ability to meet the intended business outcome.
- Ensure accuracy and consistency across scenarios.
- Verify that guardrails, data boundaries, and compliance policies operate correctly.
- Detect issues early and establish a baseline for future performance tuning.
### 1.2 Develop a structured test plan

A complete agent testing plan should include:

- **Test scope** - features, workflows, channels, and scenarios.
- **Test data** - representative prompts, business cases, and realistic contextual inputs.
- **Test roles** - who executes tests, who validates behavioral output, who documents findings.
- **Success criteria** - measurable thresholds for accuracy, speed, safety, and usability.
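The four plan elements above can be captured as a typed structure so every agent test run starts from the same template. The following is a minimal sketch; all field names and threshold values are illustrative assumptions, not part of any Microsoft product API:

```python
from dataclasses import dataclass, field

@dataclass
class SuccessCriteria:
    # Illustrative thresholds the agent must meet before approval.
    min_accuracy: float = 0.90        # fraction of responses matching user intent
    max_response_seconds: float = 5.0
    max_guardrail_violations: int = 0

@dataclass
class AgentTestPlan:
    # Hypothetical container for the four plan elements: scope, data,
    # roles, and success criteria.
    name: str
    scope: list[str] = field(default_factory=list)       # features, workflows, channels
    test_data: list[str] = field(default_factory=list)   # representative prompts
    roles: dict[str, str] = field(default_factory=dict)  # activity -> owner
    criteria: SuccessCriteria = field(default_factory=SuccessCriteria)

# Example plan for a hypothetical order-status agent.
plan = AgentTestPlan(
    name="Order-status agent v1",
    scope=["order lookup", "Teams channel"],
    test_data=["Where is order 1234?", "Cancel my last order"],
    roles={"executor": "QA", "validator": "Solution architect"},
)
```

A shared template like this makes success criteria explicit up front and comparable across agent implementations.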
## 2. Recommended testing process

Several types of testing apply to AI agents, and each can be run manually or through an automated testing process.

### 2.1 Scenario-based testing

- Use real business workflows that reflect how employees will interact with the agent.
- Include ambiguous, incomplete, and varied user inputs.
- Validate multi-turn reasoning, memory handling, and follow-up behavior.
- Ensure agent output matches expected outcomes for each scenario.
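A scenario suite like the one described above can be expressed as data plus a small harness. The sketch below uses a hypothetical `stub_agent` stand-in; in practice you would call your deployed agent's API instead:

```python
# Stub standing in for a real agent endpoint (assumption for illustration).
def stub_agent(prompt: str) -> str:
    if "order" in prompt.lower():
        return "Your order 1234 shipped on Tuesday."
    return "I can help with order status questions."

# Each scenario pairs a realistic input (including ambiguous or incomplete
# phrasing) with terms the response must contain to count as on-topic.
SCENARIOS = [
    ("Where's my order?", ["order"]),
    ("status pls", ["order"]),             # incomplete input
    ("WHERE IS ORDER 1234???", ["1234"]),  # varied phrasing
]

def run_scenarios(agent) -> list[tuple[str, bool]]:
    """Run every scenario and record whether the reply covered the
    expected terms."""
    results = []
    for prompt, expected_terms in SCENARIOS:
        reply = agent(prompt).lower()
        passed = all(term.lower() in reply for term in expected_terms)
        results.append((prompt, passed))
    return results
```

Keeping scenarios as data makes it easy to grow the suite as new workflows, channels, and edge cases are identified.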
### 2.2 Performance and reliability testing

Evaluate how the agent performs under different conditions:

- High request volume.
- Long interactions.
- Complex multi-step tasks.
- Concurrent sessions.
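A lightweight way to probe high volume and concurrency is to replay prompts in parallel and record latency percentiles. This sketch simulates the agent call with a fixed sleep; swap in a real request to your agent to measure actual behavior:

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def call_agent(prompt: str) -> float:
    """Stub call returning elapsed seconds; replace the sleep with a
    real agent request (the 10 ms latency is a simulation assumption)."""
    start = time.perf_counter()
    time.sleep(0.01)  # simulated model latency
    return time.perf_counter() - start

def load_test(prompts, concurrency=8):
    """Send prompts concurrently and summarize latency distribution."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(call_agent, prompts))
    latencies.sort()
    return {
        "p50": statistics.median(latencies),
        "p95": latencies[int(0.95 * (len(latencies) - 1))],
        "max": latencies[-1],
    }
```

Tracking p95 and max rather than the average surfaces the tail latencies that long interactions and concurrent sessions tend to expose.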
### 2.3 Safety and compliance testing

Confirm the agent respects enterprise constraints:

- Sensitive data protection.
- Role-based access rules.
- Policy triggers (such as restricted actions or DLP rules).
- Rejection of disallowed instructions.

### 2.4 Usability testing

Assess agent clarity, helpfulness, and ease of use:

- Are answers concise, accurate, and understandable?
- Does the agent require excessive refinement?
- Do users understand how to prompt the agent effectively?
## 3. Metrics to validate agent performance

When measuring an AI agent's performance, consider the following metrics.

### 3.1 Core quantitative metrics

Use measurable indicators to determine whether the agent is performing optimally.

#### Accuracy and relevance

- Percentage of responses that correctly answer the user's intent.
- Alignment with the expected business process.

#### Response time

- How quickly the agent generates useful answers.
- Variability of response time across different tasks.

#### Success rate

- Percentage of tasks fully completed without human intervention.

#### Failure rate

- Incorrect, incomplete, or unusable answers.
- Frequency of unexpected errors or guardrail triggers.

#### Token efficiency (for generative agents)

- Amount of content generated relative to cost.
- Signs of overly verbose or inefficient prompting.
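Several of these quantitative metrics can be derived from a single list of per-test results. The aggregator below is illustrative; the result field names are assumptions, not a standard telemetry schema:

```python
def summarize(results):
    """Compute core quantitative metrics from per-test results.

    Each result is a dict with illustrative keys:
      'correct' (bool)             - response matched the user's intent
      'completed' (bool)           - task finished without human intervention
      'latency' (float, seconds)   - time to a useful answer
      'guardrail_triggered' (bool) - a guardrail fired during the test
    """
    n = len(results)
    return {
        "accuracy": sum(r["correct"] for r in results) / n,
        "success_rate": sum(r["completed"] for r in results) / n,
        "failure_rate": sum(not r["correct"] for r in results) / n,
        "avg_latency": sum(r["latency"] for r in results) / n,
        "guardrail_triggers": sum(r["guardrail_triggered"] for r in results),
    }
```

Computing all metrics from one result log keeps accuracy, success rate, and latency comparable across test runs and releases.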
### 3.2 Behavioral and quality metrics

#### User satisfaction

- Survey or rating-based signals.
- Number of escalations or repeated attempts.

#### Conversation quality

- Coherence.
- Step-by-step reasoning quality.
- Ability to interpret follow-up questions.

#### Knowledge coverage

- Depth and breadth of domain knowledge.
- Completeness of grounding sources.
- Gaps where the agent fails to retrieve necessary information.
### 3.3 Observability and operational metrics

#### Stability

- Sessions completed without interruption.
- Error spikes or instability patterns.

#### Load handling

- Agent behavior under heavy usage.
- Throughput capacity.

#### Guardrail compliance

- Count of prevented actions.
- Instances where the agent approached restricted content.
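Stability and guardrail metrics like these typically come from tallying runtime telemetry. The event names below are assumptions for illustration, not a real agent-runtime schema:

```python
from collections import Counter

# Hypothetical telemetry events emitted during a test run.
EVENTS = [
    {"type": "session_completed"},
    {"type": "guardrail_blocked", "rule": "dlp"},
    {"type": "error"},
    {"type": "session_completed"},
    {"type": "guardrail_blocked", "rule": "restricted_action"},
]

def operational_summary(events):
    """Tally event types so stability and guardrail-compliance counts
    are visible per run (event names are illustrative assumptions)."""
    counts = Counter(e["type"] for e in events)
    return {
        "completed_sessions": counts["session_completed"],
        "errors": counts["error"],
        "guardrail_blocks": counts["guardrail_blocked"],
    }
```

Comparing these counts across runs makes error spikes and unusual guardrail activity easy to spot before deployment.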
## 4. Agent testing lifecycle

:::image type="content" source="../media/agent-testing-lifecycle.png" alt-text="Diagram showing the agent testing lifecycle: Test Planning, Scenario Design, Execution, Measurement, Analysis, Tuning, Re-Test, Approval, and Deployment." border="false":::

## 5. Recommendations for solution architects

- Create a unified **testing blueprint** used across all agent implementations.
- Maintain a **centralized log** of test results for comparison across releases.
- Incorporate **automation** where possible, including repeatable scripts for standard interactions.
- Establish governance checkpoints before each deployment.
- Pair telemetry insights with qualitative feedback to drive continuous improvement.
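The centralized-log recommendation can be as simple as an append-only JSON Lines file plus a release-over-release comparison. This is a hedged sketch; the file format, metric names, and the 0.02 regression threshold are illustrative choices:

```python
import json

def log_run(path: str, release: str, metrics: dict) -> None:
    """Append one release's test metrics to a JSON Lines log
    (hypothetical schema) so results stay comparable across releases."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"release": release, **metrics}) + "\n")

def regressed(path: str, metric: str, threshold: float = 0.02) -> bool:
    """Return True if the newest run dropped `metric` by more than
    `threshold` relative to the previous run."""
    with open(path, encoding="utf-8") as f:
        runs = [json.loads(line) for line in f]
    if len(runs) < 2:
        return False
    return runs[-2][metric] - runs[-1][metric] > threshold
```

A simple gate like `regressed(log, "accuracy")` can serve as one of the governance checkpoints before each deployment.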
## References

[Conversational agents performance testing](/microsoft-copilot-studio/guidance/conversational-agents-performance-testing)
