
Commit ba3a286

Merge pull request #54076 from ivorb/agents-bugfix

improve deployment info

2 parents 6263568 + 8e3d678, commit ba3a286

2 files changed: 81 additions & 179 deletions

File tree

learn-pr/wwl-data-ai/develop-ai-agents-azure-vs-code/8-test-deploy-integrate.yml

Lines changed: 1 addition & 1 deletion
@@ -9,6 +9,6 @@ metadata:
 ms.author: berryivor
 ms.topic: unit
 ai-usage: ai-generated
-durationInMinutes: 10
+durationInMinutes: 9
 content: |
 [!include[](includes/8-test-deploy-integrate.md)]
Lines changed: 80 additions & 178 deletions
@@ -1,234 +1,136 @@
-Testing, deploying, and integrating agents are critical steps in moving from development to production. Microsoft Foundry provides comprehensive capabilities for validating agent behavior, deploying to production environments, and connecting agents to your applications. These final steps transform prototypes into reliable automation that delivers business value.
+Testing, deploying, and publishing agents are critical steps in moving from development to production. Microsoft Foundry provides comprehensive capabilities for validating agent behavior, deploying to your Foundry project, and publishing agents as callable endpoints that external consumers and applications can use.
 
 ## Testing strategies for agents
 
-Thorough testing ensures your agents behave reliably across diverse scenarios before reaching users. Testing should cover expected interactions, edge cases, and error conditions.
-
-### Testing with integrated playgrounds
-
-Both the Foundry portal and Visual Studio Code extension provide playgrounds for interactive testing. These environments simulate real user interactions while providing visibility into agent decision-making.
+Thorough testing ensures your agents behave reliably across diverse scenarios before reaching users. Both the Foundry portal and Visual Studio Code extension provide playgrounds for interactive testing.
 
 **Using the playground effectively:**
 
-Start with **happy path testing** - Verify the agent handles common, expected requests correctly. Test typical user questions and workflows to confirm basic functionality works as intended.
-
-Move to **edge case testing** - Try ambiguous inputs, incomplete information, and unusual requests. Edge cases reveal how agents handle uncertainty and unexpected situations.
-
-Perform **boundary testing** - Test the limits of what your agent should and shouldn't do. Confirm the agent respects boundaries defined in its instructions.
-
-Conduct **multi-turn conversation testing** - Verify the agent maintains context across multiple exchanges. Test whether the agent remembers prior information and builds on previous responses appropriately.
-
-Execute **tool invocation testing** - When agents use tools, verify they call the right tools at the right times and incorporate results correctly.
-
-### Testing scenarios to validate
-
-For a customer service agent, test these scenarios:
-
-**Expected requests:**
-- "I need to schedule an appointment"
-- "What are your hours?"
-- "Can I reschedule my appointment?"
-
-**Out-of-scope requests:**
-- "What medication should I take?" (should decline and suggest consulting a provider)
-- "Can you access my medical records?" (should explain privacy boundaries)
-
-**Ambiguous inputs:**
-- "I need help" (should ask clarifying questions)
-- "appointment" (should gather more context)
-
-**Error conditions:**
-- Tool failures or timeouts
-- Requests requiring unavailable information
-- System errors during processing
-
-Recording test results helps you track improvements over time and ensures regressions don't reintroduce old issues.
-
-## Working with conversations
-
-Understanding how the Responses API manages conversations helps you design better agent experiences and troubleshoot issues effectively.
-
-### Conversation lifecycle
-
-**Conversation creation** - A new conversation starts when a user interacts with your agent. Each conversation maintains its own message history, separate from other users' interactions.
-
-**Message exchange** - As users send messages, the Responses API processes them with your agent's configuration and generates responses based on conversation context.
-
-**Context preservation** - Conversations preserve the full message history, enabling agents to reference earlier exchanges and maintain continuity.
+- **Happy path testing** - Verify the agent handles common, expected requests correctly.
+- **Edge case testing** - Try ambiguous inputs, incomplete information, and unusual requests to reveal how agents handle uncertainty.
+- **Boundary testing** - Confirm the agent respects boundaries defined in its instructions by testing out-of-scope requests.
+- **Multi-turn conversation testing** - Verify the agent maintains context across multiple exchanges and builds on previous responses.
+- **Tool invocation testing** - Verify agents call the right tools at the right times and incorporate results correctly.
 
-**Conversation completion** - Conversations can be explicitly ended or allowed to expire based on inactivity. Completed conversations preserve their history for review.
+Record test results to track improvements and catch regressions.
 
-### Managing conversations in production
+## Deploying agents to your project
 
-When deploying agents, consider conversation management strategies:
-
-**Session boundaries** - Decide when new conversations should start. Customer service agents might create new conversations for each support case, while productivity assistants might maintain longer conversations.
-
-**Context limits** - Conversations can grow large over extended interactions. Monitor conversation length and implement strategies for summarizing or archiving old context when needed.
-
-**Privacy and retention** - Define retention policies for conversation data. Determine how long message histories should be preserved and when they should be deleted.
-
-You can view and manage conversations through the Foundry portal or programmatically through the Responses API, providing visibility into how users interact with your deployed agents.
-
-## Deployment approaches
-
-Microsoft Foundry supports multiple deployment approaches to match different operational needs and team workflows.
+Microsoft Foundry supports deploying agents from the portal or Visual Studio Code. Deploying saves your agent configuration to your Foundry project so you can test and iterate.
 
 ### Deploying from the Foundry portal
 
-Portal deployment provides a visual, guided experience:
-
 1. Navigate to your agent in the Foundry portal
 1. Verify configuration and test results are satisfactory
-1. Select **Deploy** from the agent's page
-1. Confirm deployment settings
-1. Wait for deployment to complete
-
-Deployed agents appear in your project's resource list with active status indicators.
+1. Select **Save** from the agent's page
+1. Confirm version and deployment settings
 
 ### Deploying from Visual Studio Code
 
-VS Code deployment integrates with your development workflow:
-
-1. Open your agent in the Agent Designer
-1. Select **Update on Microsoft Foundry** to push your configuration changes
-1. For hosted agents, use the **Deploy Hosted Agents** option in the Tools section
-1. Wait for deployment confirmation
-1. Refresh the Resources view to see the updated agent
+1. Open your agent in the AI Toolkit
+1. Select **Save to Foundry** to push configuration changes
+1. For hosted agents, open the **+Build** menu in the developer tools and select **Deploy to Microsoft Foundry**
+1. Select your container configuration and confirm
 
-This streamlined process keeps you in your development environment, eliminating context switches during deployment.
+Both approaches keep your agent within your project workspace where team members can access and test it.
 
-### Deployment considerations
+## Publishing agents to an endpoint
 
-When deploying agents, consider:
+Publishing moves an agent from your project workspace into a managed Azure resource called an **Agent Application**. This step is what makes your agent externally callable through a stable endpoint.
 
-**Model availability** - Ensure your selected model deployment has sufficient capacity for expected load. Monitor usage and scale as needed.
+### What publishing creates
 
-**Tool dependencies** - Verify all tools your agent uses are properly configured. File Search requires vector stores with uploaded documents, API tools need valid credentials.
+When you publish an agent version, Foundry creates:
 
-**Instruction clarity** - Double-check instructions before deployment. Changes after deployment require redeployment and may affect user experiences.
+- **Agent Application** - An Azure resource with its own invocation URL, authentication policy, and Entra agent identity.
+- **Deployment** - A running instance of a specific agent version inside the application, with start/stop lifecycle management.
 
-**Testing validation** - Confirm comprehensive testing is complete. Deploying untested changes risks production issues.
+The key difference between deploying and publishing is scope. Deploying keeps the agent within your project. Publishing creates a dedicated endpoint that external consumers can call without needing access to your Foundry project.
 
-## Generating integration code
-
-Once deployed, agents need to connect to your applications. The Microsoft Foundry extension generates sample integration code that accelerates this process.
-
-### Code generation process
+### Publishing from the Foundry portal
 
-To generate integration code:
+1. In the portal, select the agent version you want to publish
+1. Select **Publish** to create the Agent Application and deployment
 
-1. Select your deployed agent in the Azure Resources view (VS Code)
-1. Select **Open Code File** from the available actions
-1. The extension presents structured options:
-   - **Choose your preferred SDK** - Select the SDK framework for your integration
-   - **Choose your language** - Select your programming language (Python, JavaScript, C#, etc.)
-   - **Choose your authentication method** - Select how your application authenticates (managed identity, service principal, interactive, etc.)
-1. The extension generates sample code showing how to:
-   - Authenticate with Microsoft Foundry
-   - Connect to your specific agent
-   - Send messages using the Responses API
-   - Process agent responses
+### Publishing from Visual Studio Code
 
-## Production integration patterns
+1. Open the Command Palette (**Ctrl+Shift+P**) and run **Microsoft Foundry: Deploy Hosted Agent** for hosted agents
+1. Select the target workspace and container configuration
+1. Confirm and deploy
 
-Different applications require different integration approaches. Common patterns include:
+After publishing, the agent appears in the **Hosted Agents (Preview)** section of the AI Toolkit extension tree view.
 
-### Web application integration
+### The Agent Application endpoint
 
-Integrate agents into web applications to provide AI-powered features:
-- Start conversations when users interact with your agent
-- Send user messages to the agent through the Responses API
-- Display agent responses in your UI
-- Maintain conversation context across user sessions
+Published agents expose a stable endpoint using the Responses API protocol:
 
-### API-driven workflows
+`https://<foundry-resource-name>.services.ai.azure.com/api/projects/<project-name>/applications/<app-name>/protocols/openai/responses`
 
-Use agents in backend workflows triggered by events or schedules:
-- Send structured data as messages using the Responses API
-- Process agent responses programmatically
-- Use agent outputs to drive next steps in workflows
+This URL stays the same even as you roll out new agent versions, so downstream consumers aren't disrupted by updates.
 
-### Chatbot implementations
+### Authentication and identity
 
-Build conversational interfaces powered by agents:
-- Map user sessions to agent conversations
-- Handle real-time message exchange through the Responses API
-- Implement typing indicators while agents process requests
-- Support rich media in responses
+Agent Applications use Microsoft Entra ID for authentication. Callers must have the **Azure AI User** role on the Agent Application resource. API key authentication isn't supported for Agent Applications.
 
-### Background automation
+> [!IMPORTANT]
+> When you publish an agent, it receives its own dedicated Entra identity, separate from the project's shared identity. Permissions don't transfer automatically. You must reassign RBAC roles to the new agent identity for any resources the agent accesses. If you skip this step, tool calls that work during development fail with authorization errors once the agent is published.
 
-Deploy agents for automated tasks running without user interaction:
-- Schedule agent executions for regular tasks
-- Feed data from systems into agents using the Responses API
-- Process agent outputs to update business systems
-- Monitor agent performance and results
-
-## Production considerations
+### Verifying the endpoint
 
-Successfully running agents in production requires attention to operational aspects:
+After publishing, verify the endpoint works:
 
-### Monitoring and observability
+1. Get an access token:
 
-**Track key metrics:**
-- Response times and latency
-- Tool invocation success rates
-- Error rates and failure patterns
-- Conversation volume and message counts
-- Model token consumption
+```azurecli
+az account get-access-token --resource https://ai.azure.com
+```
 
-These metrics help you identify performance issues and optimize agent behavior.
+1. Call the Agent Application endpoint:
 
-### Security and compliance
+```bash
+curl -X POST \
+  "https://<foundry-resource-name>.services.ai.azure.com/api/projects/<project-name>/applications/<app-name>/protocols/openai/responses?api-version=2025-11-15-preview" \
+  -H "Authorization: Bearer <access-token>" \
+  -H "Content-Type: application/json" \
+  -d '{"input":"Say hello"}'
+```
 
-**Implement security best practices:**
-- Use managed identities or service principals for authentication
-- Apply least-privilege access controls
-- Encrypt sensitive data in transit and at rest
-- Audit agent actions and conversations
-- Implement data retention policies compliant with regulations
+If you receive `403 Forbidden`, confirm the caller has the **Azure AI User** role on the Agent Application resource.
 
-### Cost management
+### Updating published agents
 
-**Monitor and optimize costs:**
-- Track token usage across agents and conversations
-- Set response length limits to control costs
-- Choose appropriate models balancing capability and cost
-- Implement rate limiting to prevent unexpected usage spikes
-- Manage conversation history retention to reduce storage costs
+To roll out a new agent version:
 
-### Performance optimization
+1. Make changes in your development environment and test thoroughly
+1. In the Foundry portal, select **Publish Updates** from the Agent playground
+1. The Agent Application routes 100% of traffic to the new version automatically
 
-**Optimize agent performance:**
-- Cache frequently requested information
-- Optimize instructions for clarity and conciseness
-- Remove unnecessary tools that add latency
-- Monitor model selection, as some models are faster than others
-- Implement timeout handling for long-running operations
+The endpoint URL remains unchanged, so existing integrations continue working.
 
-## Error handling and resilience
-
-Robust agent implementations handle errors gracefully:
-
-**Network failures** - Implement retry logic with exponential backoff when API calls fail due to transient network issues.
+## Generating integration code
 
-**Tool failures** - When tools timeout or error, ensure agents provide helpful fallback responses rather than failing silently.
+The Microsoft Foundry VS Code extension generates sample integration code to connect your application to a published agent:
 
-**Rate limiting** - Handle rate limit responses from Azure by implementing backoff strategies and queueing mechanisms.
+1. Select your deployed agent in the My Resources view
+1. Select **View Code**
+1. Choose your folder
+1. The extension generates code for authenticating, connecting, sending messages, and processing responses
 
-**Invalid inputs** - Validate user inputs before sending to agents, filtering malicious content or formatting issues.
+## Integration patterns
 
-## Updating production agents
+Common patterns for integrating published agents include:
 
-As requirements evolve, you'll need to update deployed agents:
+- **Web applications** - Send user messages to the Responses API endpoint and display responses in your UI. Store conversation history client-side for multi-turn interactions.
+- **API-driven workflows** - Call the agent endpoint from backend services triggered by events or schedules. Process responses programmatically to drive downstream actions.
+- **Chatbot interfaces** - Map user sessions to conversations. Handle real-time message exchange through the endpoint.
+- **Background automation** - Schedule agent calls for recurring tasks. Feed system data into agents and process outputs to update business systems.
 
-1. Make changes in your development environment
-1. Test thoroughly before deploying updates
-1. Deploy updates during low-traffic periods when possible
-1. Monitor for issues after deployment
-1. Have rollback plans if updates cause problems
+## Production considerations
 
-The agent ID remains constant across updates, so existing integrations continue working with updated behavior.
+Running agents in production requires attention to several operational areas:
 
-Testing, deploying, and integrating agents transforms development efforts into production value. By following systematic testing approaches, leveraging integrated deployment tools, and implementing robust integration patterns, you can confidently deliver AI agents that enhance your applications and automate workflows while maintaining enterprise-grade reliability and security.
+- **Monitoring** - Track response times, tool invocation success rates, error patterns, and token consumption using Application Insights integration.
+- **Security** - Use managed identities for authentication, apply least-privilege access, and define data retention policies.
+- **Cost management** - Monitor token usage, set response length limits, and implement rate limiting to prevent unexpected spikes.
+- **Error handling** - Implement retry logic with exponential backoff for transient failures. Handle rate limiting with backoff strategies. Validate inputs before sending to agents.
+- **Conversation management** - Agent Application endpoints currently support only the stateless Responses API. Store conversation history in your client for multi-turn experiences.
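The verification flow in the added content (fetch a token with `az account get-access-token`, then POST to the Agent Application endpoint) can be sketched in Python using only the standard library. This is an illustrative sketch: the resource, project, and application names are hypothetical placeholders, and only the URL pattern and `api-version` value come from the curl example in the commit.

```python
import json
import urllib.request

# Hypothetical placeholders -- substitute your Foundry resource, project, and app names.
RESOURCE = "my-foundry-resource"
PROJECT = "my-project"
APP = "my-agent-app"
API_VERSION = "2025-11-15-preview"  # api-version shown in the commit's curl example


def endpoint_url(resource: str, project: str, app: str, api_version: str) -> str:
    """Build the Agent Application Responses endpoint from the documented URL pattern."""
    return (
        f"https://{resource}.services.ai.azure.com"
        f"/api/projects/{project}/applications/{app}"
        f"/protocols/openai/responses?api-version={api_version}"
    )


def call_agent(token: str, text: str) -> dict:
    """POST a single input to the published agent and return the parsed JSON reply.

    `token` is the bearer token from `az account get-access-token`.
    A 403 response usually means the caller lacks the Azure AI User role.
    """
    request = urllib.request.Request(
        endpoint_url(RESOURCE, PROJECT, APP, API_VERSION),
        data=json.dumps({"input": text}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Because the endpoint is stateless, a multi-turn client would accumulate prior turns into the `input` it sends, as the conversation-management bullet suggests.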

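The error-handling guidance in the new production-considerations list (retry with exponential backoff for transient failures, including rate limiting) could look like the following. This is a hedged sketch: `call_once`, the retryable status codes, and the delay parameters are assumptions for illustration, not anything specified by the commit.

```python
import time
from typing import Callable

# Assumed transient HTTP statuses, including 429 rate limiting.
RETRYABLE = {429, 500, 502, 503, 504}


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential backoff schedule: base * 2^n per attempt, capped at `cap` seconds."""
    return [min(base * (2 ** n), cap) for n in range(attempts)]


def call_with_retry(call_once: Callable[[], int], attempts: int = 4) -> int:
    """Invoke call_once (returning an HTTP status); retry transient statuses with backoff."""
    status = call_once()
    for delay in backoff_delays(attempts - 1):
        if status not in RETRYABLE:
            return status
        time.sleep(delay)  # back off before retrying
        status = call_once()
    return status
```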