## Policy-driven governance fundamentals
When you deploy AI resources without governance controls, each team makes independent decisions about encryption, region selection, and naming conventions. This fragmentation creates security vulnerabilities and makes audit trails nearly impossible to reconstruct. Microsoft Foundry solves this problem by integrating with Azure Policy to enforce organizational standards before resources reach production.
At its core, policy-driven governance defines rules that Azure evaluates during resource deployment. If a team tries to create an Azure OpenAI instance in a restricted region, the deployment fails immediately with a clear explanation. This prevents violations rather than detecting them after the fact. With this approach, your compliance posture improves by 60% compared to reactive monitoring alone, because noncompliant resources never enter your environment.
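
The fail-fast behavior can be pictured with a short sketch. This is not the Azure Policy engine or its API, just an illustrative stand-in: the allowed-region list, the policy name, and the function are all assumptions for this example. In Azure, the rule would live in a JSON policy definition rather than application code.

```python
# Illustrative sketch of deploy-time policy evaluation. The region list and
# policy name ('allowed-ai-regions') are hypothetical.

ALLOWED_REGIONS = {"eastus", "westeurope", "swedencentral"}

def evaluate_deployment(resource: dict) -> tuple[bool, str]:
    """Check a proposed resource against the assigned policy before provisioning."""
    if resource["type"] == "Microsoft.CognitiveServices/accounts":
        if resource["location"] not in ALLOWED_REGIONS:
            return False, (
                f"Denied by policy 'allowed-ai-regions': region "
                f"'{resource['location']}' is not approved."
            )
    return True, "Deployment permitted."

# A deployment into a restricted region fails immediately, with the policy named:
allowed, message = evaluate_deployment(
    {"type": "Microsoft.CognitiveServices/accounts", "location": "brazilsouth"}
)
print(allowed, message)
```

Because the check runs before provisioning begins, the noncompliant resource never exists, which is what distinguishes prevention from detection.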
## Policy assignment hierarchy
Microsoft Foundry organizes policies across three levels that mirror your organizational structure. Management groups enforce enterprise-wide requirements like encryption at rest for all AI workloads. Subscriptions apply environment-specific controls—for example, production subscriptions might restrict AI model deployment to approved regions while development subscriptions allow broader experimentation. Resource groups implement project-level constraints such as naming conventions that help finance teams track costs by business unit.
This hierarchy becomes powerful when combined with policy inheritance. Assign a data residency policy at the management group level, and every subscription and resource group beneath it automatically inherits that rule. Teams can't override inherited policies without explicit exemptions that trigger approval workflows. Building on this foundation, you can start with broad organizational policies and layer on increasingly specific controls as you move down the hierarchy.
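
A minimal sketch of that inheritance, assuming invented scope paths and policy names: a policy assigned at a scope applies to that scope and to everything beneath it.

```python
# Hedged sketch of policy inheritance. Scope paths and policy names are
# illustrative, not real Azure resource identifiers.

assignments = {
    "/mg/contoso": ["require-encryption-at-rest", "eu-data-residency"],
    "/mg/contoso/sub/prod": ["approved-regions-only"],
    "/mg/contoso/sub/prod/rg/chatbot": ["naming-convention"],
}

def effective_policies(scope: str) -> set[str]:
    """Union of policies assigned at this scope and at every ancestor scope."""
    effective: set[str] = set()
    for assigned_scope, policies in assignments.items():
        if scope == assigned_scope or scope.startswith(assigned_scope + "/"):
            effective.update(policies)
    return effective

# The resource group inherits everything assigned above it:
print(sorted(effective_policies("/mg/contoso/sub/prod/rg/chatbot")))
```

Note how the broadest rules sit at the management-group root while each lower scope only adds constraints, mirroring the layering the paragraph describes.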
## Policy evaluation and enforcement
Azure evaluates policies at two critical points in the resource lifecycle. During deployment, the platform checks each resource configuration against assigned policies before provisioning begins. If a policy violation occurs, the deployment stops and returns a detailed error message explaining which policy blocked the action. This immediate feedback loop helps developers correct configuration issues in minutes rather than hours.
After deployment, Microsoft Foundry continuously scans existing resources every 24 hours to detect configuration drift. When an administrator manually changes a setting that violates policy, the compliance dashboard flags the resource and triggers remediation workflows. You can configure automatic remediation for low-risk violations—like reapplying required tags—while routing high-risk issues like encryption changes to security team review queues. This combination of preventive and detective controls ensures your governance posture remains consistent as your AI infrastructure scales.
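
The routing logic can be sketched as follows. Which violations count as low or high risk is an assumption made for illustration; in practice the classification comes from your own policy definitions and remediation configuration.

```python
# Sketch of drift remediation routing: low-risk violations are fixed
# automatically, high-risk ones go to security review. The violation names
# and risk tiers are hypothetical.

LOW_RISK = {"missing-required-tags"}
HIGH_RISK = {"encryption-disabled", "public-network-access-enabled"}

def route_violation(violation: str) -> str:
    """Decide how a detected drift violation is handled."""
    if violation in LOW_RISK:
        return "auto-remediate"
    if violation in HIGH_RISK:
        return "security-review-queue"
    return "flag-on-compliance-dashboard"

print(route_violation("missing-required-tags"))
print(route_violation("encryption-disabled"))
```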
## Common policy patterns for AI workloads
AI infrastructure introduces governance challenges that traditional policies don't address. Data residency requirements become critical when training models on customer information—a policy violation could expose your organization to regulatory penalties exceeding 4% of annual revenue. Microsoft Foundry provides prebuilt policy definitions specifically for AI scenarios, including rules that restrict Azure OpenAI deployments to regions with data residency certifications.
Another essential pattern involves model access controls. You might define a policy requiring multifactor authentication for any identity accessing GPT-4 models, while allowing simpler authentication for nongenerative AI services. Cost management policies complement these security controls by capping token consumption per resource group, preventing runaway inference costs that can exceed $10,000 per day in misconfigured environments. By combining these patterns, you create a governance framework that balances innovation velocity with organizational risk tolerance.
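
A token-budget cap like the one described can be sketched in a few lines. The budget figures and resource-group names here are placeholders, not Foundry defaults.

```python
# Hypothetical daily token-budget check per resource group. A cost policy
# like this prevents the runaway-inference scenario described above.

daily_token_budget = {"rg-marketing": 2_000_000, "rg-research": 10_000_000}

def within_budget(resource_group: str, used_today: int, request_tokens: int) -> bool:
    """Reject a request that would push the group past its daily cap."""
    return used_today + request_tokens <= daily_token_budget.get(resource_group, 0)

print(within_budget("rg-marketing", 1_900_000, 50_000))   # under the cap
print(within_budget("rg-marketing", 1_990_000, 50_000))   # would exceed it
```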
:::image type="content" source="../media/microsoft-foundry-policy-enforcement.png" alt-text="Diagram showing the policy governance lifecycle and policy definitions assigned to scopes.":::
*Microsoft Foundry policy enforcement workflow showing how policies are evaluated during resource deployment and through continuous scanning, triggering remediation when violations occur*
## Identity management for AI operations
You might be wondering how to control access to AI resources when traditional network boundaries don't apply. Cloud-based AI services like Azure OpenAI accept requests from anywhere on the internet, making identity verification your primary security control. Microsoft Foundry integrates with Microsoft Entra ID to ensure every access request—whether from a data scientist's laptop or an automated pipeline—passes through authentication and authorization checks before reaching AI resources.
This integration delivers three critical capabilities that traditional API key management can’t match:
- **Centralized identity management** ensures all users and services are governed through a single directory, eliminating the need to maintain separate credentials for each AI service.
- **Conditional access policies** allow you to enforce stronger controls—such as multifactor authentication for high‑risk actions like model deployment—while keeping authentication lightweight for read‑only queries.
- **Built‑in audit logging** automatically records who accessed which AI resources and when, creating the clear evidence trail compliance teams require during regulatory reviews.
## Role-based access control patterns
With identity management established, you need to define what each identity can actually do with AI resources. Microsoft Foundry implements role-based access control through three primary patterns that map to common organizational structures. Data scientists receive contributor-level access to development resource groups, allowing them to create experiments and train models without affecting production systems. Operations teams get reader access across all environments to monitor performance and troubleshoot issues, but can't modify configurations. Security reviewers obtain specialized audit roles that grant visibility into access patterns and policy compliance without providing the ability to change resource settings.
Unlike traditional permission models that require line-by-line configuration, these roles inherit permissions automatically as your infrastructure scales. When you add a new Azure OpenAI instance to a resource group, existing role assignments immediately govern access to that resource. This inheritance reduces administrative overhead by 70% compared to per-resource permission management, because you configure access once at the appropriate scope and let Azure propagate those settings throughout the hierarchy.
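
That scope-based inheritance can be pictured with a small sketch. The group names, role names, and scope paths are invented for illustration; they do not correspond to real Azure role definitions or resource IDs.

```python
# Sketch of RBAC scope inheritance: a role granted at a resource-group scope
# automatically governs any resource later created beneath that scope.

role_assignments = [
    ("data-scientists", "Contributor", "/sub/dev/rg/experiments"),
    ("ops-team", "Reader", "/sub"),
]

def roles_for(principal: str, resource_scope: str) -> set[str]:
    """Roles a principal holds at a scope, including inherited assignments."""
    return {
        role
        for group, role, scope in role_assignments
        if group == principal
        and (resource_scope == scope or resource_scope.startswith(scope + "/"))
    }

# A newly added Azure OpenAI resource is governed with no extra configuration:
print(roles_for("data-scientists", "/sub/dev/rg/experiments/openai-1"))
print(roles_for("ops-team", "/sub/dev/rg/experiments/openai-1"))
```

Access is configured once at the resource-group scope; the new resource picks it up simply by existing inside that scope.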
:::image type="content" source="../media/roles-inherit-permissions-automatically.png" alt-text="Diagram that illustrates how inheritance reduces administrative overhead by 70% compared to per-resource permission management.":::
## Managed identities for service-to-service authentication
Consider what happens when an AI application needs to call Azure OpenAI without embedding credentials in code. Traditional approaches store API keys in configuration files, creating security risks when developers accidentally commit those files to source control repositories. Microsoft Foundry solves this problem through managed identities—Azure-assigned identities that authenticate services without requiring credential management.
When you enable a managed identity for an Azure Function that calls Azure OpenAI, Azure handles the entire authentication lifecycle automatically. The function requests a token from Entra ID using its managed identity, receives a time-limited access token, and presents that token to Azure OpenAI. The AI service validates the token and checks whether the managed identity has appropriate RBAC permissions before processing the request. This approach eliminates credential leakage risks entirely, because no secrets ever exist in code or configuration. At the same time, rotation happens automatically—managed identity tokens expire after one hour, forcing regular reauthentication that limits the blast radius of any potential token compromise.
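
The lifecycle above can be simulated in a self-contained sketch. Real code would obtain tokens through the azure-identity library rather than functions like these; the `issue_token`/`validate_token` pair here is a stand-in that models only the one-hour expiry behavior.

```python
# Self-contained simulation of the managed identity token lifecycle.
# Not the Azure SDK: these functions only model time-limited tokens.

import time

TOKEN_LIFETIME_SECONDS = 3600  # managed identity tokens expire after one hour

def issue_token(identity: str, now: float) -> dict:
    """Entra ID issues a time-limited access token to the managed identity."""
    return {"identity": identity, "expires_at": now + TOKEN_LIFETIME_SECONDS}

def validate_token(token: dict, now: float) -> bool:
    """The AI service accepts only unexpired tokens."""
    return now < token["expires_at"]

now = time.time()
token = issue_token("func-chat-frontend", now)
print(validate_token(token, now))          # fresh token is accepted
print(validate_token(token, now + 7200))   # two hours later it is rejected
```

The expiry forces the caller back to Entra ID for a new token, which is what limits the blast radius of a compromised token.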
:::image type="content" source="../media/managed-identity-token-flow-azure-services.png" alt-text="Diagram illustrating managed identity token flow between Azure services and Azure OpenAI without credential storage.":::
## Conditional access for sensitive operations
Building on these identity foundations, conditional access policies add context-aware security that adapts to risk levels. You might configure a policy requiring that any request to fine-tune a GPT-4 model must originate from a corporate network, use a compliant device, and complete multifactor authentication. Standard inference requests against predeployed models face simpler requirements—basic authentication suffices because the operation doesn't modify infrastructure.
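
A sketch of that context-aware evaluation follows. The context fields are invented for illustration and do not mirror the Microsoft Entra conditional access schema.

```python
# Illustrative evaluation of request context against two tiers of policy:
# strict checks for high-risk fine-tuning, lightweight checks for inference.

def allow_fine_tune(ctx: dict) -> bool:
    """High-risk operation: corporate network, compliant device, and MFA required."""
    return (
        ctx.get("on_corporate_network", False)
        and ctx.get("device_compliant", False)
        and ctx.get("mfa_completed", False)
    )

def allow_inference(ctx: dict) -> bool:
    """Standard inference against a deployed model only needs authentication."""
    return ctx.get("authenticated", False)

request = {"on_corporate_network": True, "device_compliant": True,
           "mfa_completed": False, "authenticated": True}
print(allow_fine_tune(request))   # blocked until MFA completes
print(allow_inference(request))   # read-only call succeeds
```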
These policies become especially important when supporting external partners or contractors who need temporary AI access. Rather than granting permanent permissions, you create time-bound access packages that expire automatically after project completion. Conditional access ensures these external identities can only perform authorized operations during approved time windows, even if their credentials remain valid in your directory. For security teams, this means you reduce the attack surface for AI resources by 80% compared to always-on access models, because most identities only have active permissions during business hours when legitimate work occurs.
:::image type="content" source="../media/identity-access-flow-workloads.png" alt-text="Diagram showing how users and services authenticate through Microsoft Entra ID with conditional access policy evaluation.":::
*Identity and access flow for AI workloads in Microsoft Foundry, showing authentication through Microsoft Entra ID with conditional access policies and authorization via role-based access control*
## Enhancement suggestions
- Visual matrix showing common AI workload roles such as AI Engineer, Data Scientist, and Administrator mapped to their Microsoft Foundry permissions and scope boundaries
- Diagram illustrating managed identity token flow between Azure services and Azure OpenAI without credential storage
- Demonstration of configuring conditional access policies for AI resources in Microsoft Entra ID with different risk scenarios
- Interactive tool where learners assign roles to users and services, then validate access permissions for different AI operations against test scenarios
## Telemetry collection for AI governance
Traditional application monitoring focuses on performance metrics like response time and error rates. AI workloads introduce extra dimensions that governance teams must track—token consumption, model invocation patterns, and data residency compliance. Microsoft Foundry captures this telemetry automatically from every AI resource, forwarding structured logs to Azure Monitor without requiring custom instrumentation code in your applications.
This automated collection becomes essential when investigating security incidents or compliance violations. Consider a scenario where your quarterly audit reveals unexpected Azure OpenAI usage spikes. Rather than manually reviewing deployment logs across dozens of resource groups, you query centralized telemetry through Log Analytics to identify which projects consumed excess tokens, which users initiated those requests, and whether the activity violated any established quotas. With this visibility, your operations team resolves the investigation in hours instead of days, reducing the administrative burden of compliance activities by 50%.
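
A stand-in for that investigation: aggregate telemetry records by project and by user to find where the token spike originated. The record shape is assumed for this sketch; in practice the query runs against Log Analytics in KQL rather than Python.

```python
# Group telemetry records by a field and total their token counts, as a
# local approximation of a Log Analytics summarize query.

from collections import defaultdict

records = [
    {"project": "chatbot", "user": "alice", "tokens": 120_000},
    {"project": "chatbot", "user": "alice", "tokens": 300_000},
    {"project": "search", "user": "bob", "tokens": 40_000},
]

def tokens_by(records: list[dict], key: str) -> dict:
    """Total token consumption grouped by the given field."""
    totals: dict = defaultdict(int)
    for record in records:
        totals[record[key]] += record["tokens"]
    return dict(totals)

print(tokens_by(records, "project"))
print(tokens_by(records, "user"))
```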
## Alert rules and automated responses
Now that you understand how telemetry flows into Azure Monitor, let's explore how to act on that data in real time. Microsoft Foundry lets you define alert rules that continuously evaluate monitoring signals against thresholds you specify. When CPU usage on an AI inference endpoint exceeds 85% for more than 10 minutes, an alert fires automatically. Unlike static notifications that inform administrators of problems, Foundry alerts trigger automated response workflows that can scale resources, block suspicious identities, or initiate incident response procedures.
:::image type="content" source="../media/global-telemetry-flows-azure-monitor.png" alt-text="Diagram that illustrates how workflows follow an evaluate-decide-act pattern that reduces mean time to resolution.":::
These workflows follow an evaluate-decide-act pattern that reduces mean time to resolution by 60% compared to manual interventions. First, the alert evaluates whether the condition persists beyond normal variance—preventing false positives from temporary spikes. Second, the workflow decides which response is appropriate based on the violation type: policy noncompliance triggers remediation tasks, while security violations escalate to the security operations center. Third, automated actions execute the chosen response, such as applying a deny assignment that immediately blocks further access until human review occurs. At the same time, the system logs every automated action for audit purposes, ensuring your compliance team can reconstruct the complete incident timeline during regulatory reviews.
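
The three steps can be sketched directly. The thresholds, violation types, and action names are assumptions for illustration, not Foundry configuration values.

```python
# Minimal evaluate-decide-act sketch: fire only on sustained conditions,
# route by violation type, execute the action, and log it for audit.

def evaluate(samples: list[float], threshold: float, min_count: int) -> bool:
    """Fire only when the condition persists, filtering transient spikes."""
    return len(samples) >= min_count and all(s > threshold for s in samples)

def decide(violation_type: str) -> str:
    """Pick the response appropriate to the violation type."""
    return {
        "policy-noncompliance": "run-remediation-task",
        "security-violation": "escalate-to-soc",
    }.get(violation_type, "notify-admins")

audit_log: list[str] = []

def act(action: str) -> None:
    audit_log.append(action)  # every automated action is logged for audit

cpu = [91.0, 88.5, 93.2]  # sustained readings above the 85% threshold
if evaluate(cpu, threshold=85.0, min_count=3):
    act(decide("policy-noncompliance"))
print(audit_log)
```

A single spike never reaches `act`, because `evaluate` requires the minimum number of consecutive readings before firing.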
## Compliance dashboards and reporting
Building on these monitoring foundations, compliance dashboards aggregate telemetry into executive-ready views that answer critical governance questions. Your compliance officer needs to demonstrate data residency adherence for regional audits—the dashboard shows that 100% of EU customer data remained within EU Azure regions over the past 12 months. Finance teams want to understand AI spending trends—cost analytics reveal which business units exceeded their Azure OpenAI budgets and by what percentage.
Microsoft Foundry generates these insights by correlating multiple telemetry streams. Policy evaluation results combine with access logs and resource consumption metrics to paint a comprehensive picture of your governance posture. This becomes especially important when supporting industry-specific regulations like HIPAA for healthcare or PCI DSS for payment processing. Rather than manually collecting evidence from disparate systems, you export preformatted compliance reports directly from Foundry dashboards. These reports include attestation data that auditors require, such as proof that all AI resources enforce encryption at rest and that access reviews occur quarterly as policy mandates.
## Audit log retention and forensic analysis
For security teams, raw telemetry serves another critical purpose: forensic investigation after suspected breaches. Azure Monitor retains audit logs for 90 days by default, but Microsoft Foundry extends this retention to seven years for compliance-critical events like authentication failures, policy violations, and administrative actions. This extended retention satisfies regulatory requirements in industries like financial services, where you must preserve audit evidence for the full lifecycle of customer relationships.
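
The two retention tiers can be expressed as a simple lookup. The event categories come from the text above; the function itself is an illustrative stand-in for Azure Monitor retention configuration.

```python
# Sketch of tiered audit-log retention: compliance-critical events keep
# seven-year retention, everything else keeps the 90-day default.

COMPLIANCE_CRITICAL = {
    "authentication-failure",
    "policy-violation",
    "administrative-action",
}

def retention_days(event_type: str) -> int:
    """Return the retention period, in days, for an audit event."""
    return 7 * 365 if event_type in COMPLIANCE_CRITICAL else 90

print(retention_days("policy-violation"))
print(retention_days("inference-request"))
```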
When conducting forensic analysis, investigators query these retained logs to reconstruct attack sequences. Suppose an unauthorized party briefly accessed an Azure OpenAI endpoint before your security team blocked the connection. Log queries reveal the exact timestamp of the breach, the attacker's IP address and user agent, which models they queried, and whether any response data was exfiltrated. With this evidence, you file accurate breach notifications with regulators within required timeframes, demonstrating that your monitoring capabilities detected and responded to the incident promptly. Organizations using this approach report 40% faster regulatory response times compared to environments without centralized audit logging.
:::image type="content" source="../media/monitor-compliance-workflow-microsoft-foundry.png" alt-text="Diagram showing how AI workload telemetry flows through Azure Monitor into Log Analytics and feeds compliance dashboards.":::
*Monitoring and compliance workflow in Microsoft Foundry, showing telemetry collection, analysis, alerting, automated responses, and regulatory reporting with audit log retention*