
Commit 81b1030

Merge pull request #314887 from leijgao/release-microsoft-discovery
Add Responsible AI concept doc for Microsoft Discovery
2 parents 9ac6baa + 3e63487 commit 81b1030

2 files changed

Lines changed: 195 additions & 0 deletions

articles/microsoft-discovery/concept-responsible-ai.md

Lines changed: 190 additions & 0 deletions
@@ -0,0 +1,190 @@
---
title: Responsible AI in Microsoft Discovery
description: Learn about responsible AI principles, safety components, limitations, and best practices for using Microsoft Discovery in scientific research.
author: leijgao
ms.author: leijiagao
ms.service: azure
ms.topic: concept-article
ms.date: 04/17/2026

#CustomerIntent: As a researcher or deployer, I want to understand responsible AI practices in Microsoft Discovery so that I can use the platform safely and effectively.
---

# Responsible AI in Microsoft Discovery

Microsoft Discovery is an enterprise agentic AI platform for scientific research and development. It uses large language models (LLMs), multi-agent orchestration, and high-performance computing (HPC) to help researchers reason through complex problems, generate hypotheses, and analyze results.

Like all AI systems, Discovery has limitations and potential risks. This article describes the responsible AI principles, safety components, known limitations, and best practices that help you use the platform effectively and responsibly.

Microsoft's approach to responsible AI follows the [Microsoft Responsible AI Standard](https://aka.ms/RAI). This standard organizes risk management into three stages: **Discover** potential risks, **Protect** against them, and **Govern** the system in production.

## Intended uses

Microsoft Discovery is designed for R&D organizations in life sciences, materials science, semiconductors, energy, manufacturing, and advanced engineering. The platform supports scenarios such as:

- **Hypothesis generation and testing**—Researchers can generate, refine, and evaluate scientific hypotheses using AI-coordinated workflows.
- **Literature synthesis**—Agents summarize and compare findings across large volumes of scientific papers.
- **Experiment design**—The platform helps design experiments with appropriate controls, parameters, and validation steps.
- **Simulation orchestration**—Discovery coordinates compute-intensive simulations across HPC clusters.
- **Data analysis**—Agents perform statistical analysis, identify patterns, and produce structured results from research data.

Discovery isn't designed for simple, one-off queries. It's optimized for complex, multi-step research challenges that require autonomous agent orchestration over extended periods.

## Safety components

Discovery incorporates multiple layers of safety controls to protect users and research integrity.

### Content filtering

Discovery uses [Foundry Guardrails](/azure/foundry/responsible-use-of-ai-overview), Microsoft's content safety and filtering capability. These guardrails scan content at defined intervention points to detect and block unsafe or inappropriate content before it reaches the model. The platform applies guardrails by default to all models created within Discovery.

You can customize guardrails in the Foundry portal. Configuration options include the following (a prototyping sketch follows the list):

- Defining risk categories to detect
- Adjusting severity thresholds for content filters
- Assigning guardrails to evaluate and monitor agent behavior
- Assessing model performance across different use cases

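Guardrails themselves are configured in the portal, but if you want to prototype how category and severity thresholds behave before adjusting them, the standalone Azure AI Content Safety SDK exposes the same category and severity concepts. The following is a minimal sketch, assuming that SDK; the endpoint, key, and threshold value are placeholders, and this code isn't Discovery's own filtering pipeline.

```python
# A minimal sketch using the standalone Azure AI Content Safety SDK
# (pip install azure-ai-contentsafety). Endpoint, key, and threshold are
# placeholders; within Discovery, Foundry Guardrails apply automatically.
from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions
from azure.core.credentials import AzureKeyCredential

client = ContentSafetyClient(
    "https://<your-resource>.cognitiveservices.azure.com",
    AzureKeyCredential("<your-key>"),
)

SEVERITY_THRESHOLD = 2  # treat anything at or above this severity as blocked

result = client.analyze_text(AnalyzeTextOptions(text="Draft agent output to screen"))
for item in result.categories_analysis:
    # Each item reports a category (Hate, SelfHarm, Sexual, Violence) and a severity.
    if item.severity is not None and item.severity >= SEVERITY_THRESHOLD:
        print(f"Would block: {item.category} at severity {item.severity}")
```

Experimenting this way helps you pick thresholds that match your risk tolerance before you change the defaults in the portal.
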
### Grounding and citations

Discovery surfaces citations and grounding sources where applicable. When responses are grounded in knowledge base documents or web sources, the platform provides hyperlinked citations. These citations help you trace the origin of generated content and assess its reliability.

Output quality depends on the structure, completeness, and relevance of your underlying knowledge base data. Sparse or poorly structured knowledge bases can lead to incomplete or insufficiently grounded responses.

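One practical way to use citations is to flag answer sentences that carry no citation marker and verify those manually. The following is a toy sketch; the response shape is a hypothetical stand-in, not Discovery's actual response schema.

```python
# A toy grounding check. The `response` dict is a hypothetical stand-in for
# whatever structure your workflow receives; only the idea matters: flag
# sentences that lack a [n]-style citation marker for manual review.
import re

response = {
    "answer": "Compound A improves yield [1]. It is also nontoxic.",
    "citations": [{"id": 1, "url": "https://example.org/paper"}],
}

sentences = re.split(r"(?<=[.!?])\s+", response["answer"].strip())
uncited = [s for s in sentences if not re.search(r"\[\d+\]", s)]

for sentence in uncited:
    print(f"Verify against trusted sources: {sentence}")
```
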
### System-level safeguards

Discovery uses input classifiers, output filters, and system-level instructions aligned with [Microsoft AI principles](https://www.microsoft.com/ai/responsible-ai). System messages guide agent behavior, enforcing boundaries such as refusing to generate content that could cause physical, emotional, or financial harm.

## Known limitations

Understanding Discovery's limitations helps you use the platform within safe and effective boundaries.

### Automation bias and loop drift

Discovery supports semi-autonomous agentic workflows. This creates a risk of automation bias, where users over-trust AI-generated outputs without sufficient validation. In iterative workflows, small inaccuracies can compound over time (loop drift). Human-in-the-loop oversight is essential to keep research aligned with scientific intent.

### Representation bias

Discovery's outputs can reflect imbalances in the scientific literature or training data. The platform might overrepresent dominant research perspectives or underrepresent emerging areas. Apply domain expertise when working in novel or interdisciplinary fields where training data representation is limited.

### Temporal relevance

Scientific knowledge evolves rapidly. If Discovery relies on static or outdated datasets, it can surface obsolete findings or miss recent developments. Regularly assess the currency of your data sources, especially in fast-moving fields like synthetic biology or AI-driven drug discovery.

### Inaccurate or ungrounded outputs

When querying knowledge bases, response quality depends on how agents are configured and the quality of the underlying data. In scenarios where the knowledge base is sparse or weakly scoped, the system can produce responses that are incomplete or insufficiently grounded. Always validate AI-generated claims against trusted sources.

### Conversation saturation

As conversations become very long, answer quality can gradually decline. When the conversation approaches model context limits, earlier parts might be summarized. During this process, some details can be simplified or lost, causing answers to drift in accuracy. Starting a new conversation can restore clarity when you notice degradation.

### Tooling signal dilution

As the system gains access to a larger pool of agents and tools, its ability to choose the best ones for a given task can degrade. Actively curate the list of agents available to your project to keep them relevant to your scenarios.

### Toolchain compatibility

Discovery integrates with computational tools and models. Mismatches in tool assumptions (input formats, parameter ranges, or versioning) can lead to execution failures or misleading results. Test and validate tool orchestration workflows carefully, especially when integrating custom or third-party components.

## Prohibited uses

The following uses of Discovery are strictly prohibited:

- **Weapons development**—Discovery must not be used to support the design, development, or deployment of chemical, biological, radiological, nuclear, or other weapons intended to cause mass harm.
- **Harmful applications**—Discovery must not be used in ways that could cause physical, psychological, environmental, or financial harm to individuals, organizations, or society.
- **Violation of laws or regulations**—Discovery must not be used in any manner that violates applicable laws, regulations, or industry-specific compliance requirements.
- **Bypassing safety systems**—Attempts to circumvent, disable, or interfere with Discovery's built-in safety mechanisms, classifiers, or content filters are strictly prohibited.

## Evaluations

Microsoft evaluates Discovery using manual, custom evaluations grounded in the Responsible AI (RAI) evaluation framework from Microsoft AI Foundry. Evaluations target two dimensions: safety and groundedness.

### Safety evaluation

Safety evaluation assesses whether Discovery appropriately refuses, deflects, or responds safely when prompts attempt to elicit disallowed content or bypass system safeguards. Metrics include:

- Policy compliance score
- Risk detection and mitigation effectiveness
- Content safety classification accuracy
- Direct and indirect jailbreak resistance rate

### Groundedness evaluation

Groundedness evaluation assesses whether Discovery stays faithful to provided context and retrieved sources. Metrics include the following (a toy defect-rate computation follows the list):

- Ungrounded attributes defect rate
- Groundedness defect rate

Each release is benchmarked against Foundry-provided baseline model behavior and historical results from prior releases. This comparative approach detects regressions and validates stability over time.

## Best practices for end users

Follow these best practices to get reliable, well-grounded results from Discovery.

### Write clear, specific prompts

Effective prompts include the following elements (an illustrative example follows the list):

- **Objective**—What you're trying to learn or decide
- **Context**—Domain assumptions, constraints, and success criteria
- **Sources**—Which knowledge base, documents, or tools to reference
- **Output format**—Table, ranked list, experiment plan, or other structure
- **Grounding request**—Ask for citations and traceability when available

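For example, a prompt covering all five elements might look like the following. The project names, dataset, and thresholds are placeholders for illustration only.

```text
Objective: Identify the three most promising solvent candidates for reaction X
that minimize toxicity while keeping yield above 80%.
Context: We work under REACH constraints; candidates must be commercially
available at kilogram scale.
Sources: Use the "Green Solvents" knowledge base and the 2024 screening dataset.
Output format: A ranked table with columns for solvent, predicted yield,
toxicity class, and rationale.
Grounding request: Cite the source document for every claim.
```
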
For detailed guidance, see [Write effective prompts for agents](how-to-prompt-engineering.md).

### Monitor for performance drift

Output quality can change as your knowledge base, tools, or workflows evolve. To detect drift (a minimal checking sketch follows this list):

- Rerun a small set of benchmark prompts periodically and compare outputs for consistency and grounding.
- Watch for warning signs: fewer or weaker citations, increased uncertainty, contradictory conclusions, or missing constraints.
- Treat knowledge base updates (adding or removing documents) as version changes and recheck critical workflows.

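A minimal sketch of such a benchmark rerun follows. The `run_prompt` function and the benchmark prompts are hypothetical stand-ins; replace them with however you invoke your Discovery agents.

```python
# A minimal drift-check sketch. `run_prompt` is a hypothetical stand-in for
# your agent invocation, not a real platform API.
import json
import re
from datetime import date

BENCHMARK_PROMPTS = [
    "Summarize the known failure modes of electrolyte E-204, with citations.",
    "Rank our three candidate alloys by corrosion resistance, with citations.",
]

def run_prompt(prompt: str) -> str:
    raise NotImplementedError("Replace with your agent invocation.")

def citation_count(answer: str) -> int:
    # Count [n]-style citation markers as a rough grounding signal.
    return len(re.findall(r"\[\d+\]", answer))

results = []
for prompt in BENCHMARK_PROMPTS:
    answer = run_prompt(prompt)
    results.append({"prompt": prompt, "citations": citation_count(answer)})

# Persist a dated snapshot; compare citation counts across runs to spot drift.
with open(f"benchmark-{date.today()}.json", "w") as f:
    json.dump(results, f, indent=2)
```
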
### Exercise human oversight

AI outputs can be inaccurate, incomplete, or misaligned with your goals. Review Discovery's responses and verify they match your expectations. Don't accept outputs without validation, especially for consequential decisions.

### Avoid overreliance

Overreliance occurs when users accept incorrect AI outputs because mistakes are hard to detect. This risk increases in long iterative workflows and in scenarios with limited knowledge base coverage. Treat outputs as decision support, not decision replacement.

## Best practices for deployers

Deployers have additional responsibilities for safe and effective use of Discovery.

### Use the reference sample agent

Discovery provides a reference sample agent in its GitHub repository. This sample demonstrates recommended patterns for querying knowledge bases, enforcing grounding constraints, and handling cases where evidence is missing. Use it as a starting point when you build custom agents and workflows.

### Implement least-privilege access

Discovery uses Microsoft Entra ID authentication and role-based access control (RBAC). Grant only the minimum roles needed for each user or workload. Avoid broad "Owner" or subscription-wide permissions. Use user-assigned managed identities for service-to-service access and periodically review role assignments.

### Maintain private-by-default network posture

Discovery enables Azure Private Link by default and disables public network access for data-plane APIs. Maintain this posture by restricting access to trusted virtual networks (VNets). Use private endpoints, private DNS zones, and network security groups (NSGs) to reduce your attack surface.

### Keep safety controls enabled

Discovery applies Foundry Guardrails by default. If you change thresholds or customize filters, test those changes with representative prompts before scaling access. Disabling safety mechanisms is prohibited except for managed customers who have received explicit approval.

### Test before scaling access

Create a small set of end-to-end test scenarios that reflect your intended use cases, including your own tools, models, and knowledge base content. Rerun these tests whenever you change models, tools, agents, or datasets.

### Enable logging and monitoring

Enable diagnostic logs and route them to your logging solutions. Set up alerts for repeated failures or abnormal error patterns. Periodically review security findings and configuration drift as part of ongoing platform security assessments. For more information, see [Configure network security](how-to-configure-network-security.md).

## Shared responsibility

Security in Discovery follows the Azure shared responsibility model. Microsoft secures the underlying platform, managed services, and control plane. You're responsible for securing your subscriptions, VNets, role assignments, and data access policies.

## Related content

- [What is Microsoft Discovery?](overview-what-is-microsoft-discovery.md)
- [Responsible use of AI overview for Microsoft Foundry](/azure/foundry/responsible-use-of-ai-overview)
- [Microsoft AI principles](https://www.microsoft.com/ai/responsible-ai)
- [Microsoft responsible AI resources](https://www.microsoft.com/ai/tools-practices)

articles/microsoft-discovery/toc.yml

Lines changed: 5 additions & 0 deletions
@@ -120,6 +120,11 @@ items:
   - name: Observability overview
     href: concept-observability.md
     displayName: Microsoft Discovery, observability, monitoring, logs, diagnostics
+  - name: Responsible AI
+    items:
+    - name: Responsible AI in Microsoft Discovery
+      href: concept-responsible-ai.md
+      displayName: Microsoft Discovery, responsible AI, safety, limitations, best practices, evaluations
   - name: How-tos
     items:
     - name: Migration
