Commit f7893bc

Merge pull request #309103 from craigshoemaker/sre/memory-system
[SRE Agent] New: Memory system
2 parents 7633a8b + 511d5a8 commit f7893bc

4 files changed

Lines changed: 357 additions & 0 deletions

Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,45 @@
1+
---
2+
title: Documentation Connector in Azure SRE Agent Preview
3+
description: Discover how the Azure SRE Agent documentation connector enables automated crawling, semantic search, and wide file format support for Azure DevOps repositories.
4+
author: craigshoemaker
5+
ms.author: cshoe
6+
ms.reviewer: cshoe
7+
ms.date: 12/18/2025
8+
ms.topic: article
9+
ms.service: azure-sre-agent
10+
---
11+
12+
# Documentation connector in Azure SRE Agent preview
13+
14+
The Azure SRE Agent documentation connector automatically crawls Azure DevOps repositories to index troubleshooting guides, runbooks, and documentation for agent retrieval.
15+
16+
### Key features
17+
18+
- **Automated crawling**: Runs every 24 hours without manual intervention
19+
20+
- **Wide file format support**: Indexes `.md`, `.txt`, `.rst`, `.adoc`, `.html`, `.json`, `.yaml`, `.yml`, `.xml`, `.csv`, and more
21+
22+
- **Azure DevOps integration**: Connects to Git repositories using managed identity
23+
24+
- **Semantic search**: Documents are chunked, embedded, and indexed for AI-powered retrieval
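As a rough illustration of the file-format filter described above, the following Python sketch selects repository paths by extension. It is not the connector's implementation: `INDEXABLE_EXTENSIONS`, `is_indexable`, and `select_indexable` are hypothetical names, and the set covers only the extensions the list names (the source says "and more").

```python
from pathlib import PurePosixPath

# Extensions named in the feature list. The source says "and more",
# so this set is illustrative rather than complete.
INDEXABLE_EXTENSIONS = {
    ".md", ".txt", ".rst", ".adoc", ".html",
    ".json", ".yaml", ".yml", ".xml", ".csv",
}


def is_indexable(path: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    # Assumption: extension matching is case-insensitive.
    return PurePosixPath(path).suffix.lower() in INDEXABLE_EXTENSIONS


def select_indexable(paths: list[str]) -> list[str]:
    """Keep only the repository paths that pass the extension filter."""
    return [p for p in paths if is_indexable(p)]


if __name__ == "__main__":
    repo_files = ["docs/runbook.md", "src/app.py", "ops/alerts.YAML", "README"]
    print(select_indexable(repo_files))
```

Keeping the filter as a plain set lookup makes it easy to extend if more formats turn out to be supported.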
25+
26+
### Prerequisites
27+
28+
Before setting up a documentation connector:
29+
30+
- Azure DevOps repository containing documentation
31+
- Managed identity configured for the agent (User-Assigned or System-Assigned)
32+
- Repository read access granted to the managed identity
33+
34+
### Setup
35+
36+
1. In the portal, go to **Settings** > **Basics** and note the managed identity name.
37+
1. In Azure DevOps, add the managed identity as a user with **Basic** access level.
38+
1. Grant **Read** permission on the target repository.
39+
1. Go to **Settings** > **Connectors** and select **Add connector**.
40+
1. Select **Documentation connector**, enter the repository URL, and select the managed identity.
41+
1. The connector starts indexing right away.
42+
43+
## Related content
44+
45+
- [Memory system](./memory-system.md)
Lines changed: 308 additions & 0 deletions
@@ -0,0 +1,308 @@
1+
---
2+
title: Memory System in SRE Agent Preview
3+
description: Use the SRE Agent memory system to build team knowledge that agents retrieve during incident handling, enabling context-aware responses that improve over time.
4+
author: craigshoemaker
5+
ms.author: cshoe
6+
ms.reviewer: cshoe
7+
ms.date: 12/18/2025
8+
ms.topic: article
9+
ms.service: azure-sre-agent
10+
ms.collection: ce-skilling-ai-copilot
11+
#customer intent: As an SRE team member, I want to understand how the memory system works so I can add knowledge that helps agents provide better responses during incident handling.
12+
---
13+
14+
# Memory system in SRE Agent preview
15+
16+
The SRE Agent memory system gives agents the knowledge they need to troubleshoot effectively. By adding runbooks, team standards, and service-specific context, you help agents provide better answers during incidents. The system learns from each session to improve over time.
17+
18+
## Memory components
19+
20+
The memory system consists of four complementary components:
21+
22+
| Component | Purpose | Setup | Best for |
23+
|-----------|---------|-------|----------|
24+
| **User memories** | Quick chat commands for team knowledge | Instant (chat commands) | Team standards, service configurations, workflow patterns |
25+
| **Knowledge base** | Direct document uploads for runbooks | Quick (file upload) | Static runbooks, troubleshooting guides, internal documentation |
26+
| **Documentation connector** | Automated Azure DevOps synchronization | Configuration required | Living documentation, frequently updated guides |
27+
| **Session insights** | Agent-generated memories from sessions | Automatic | Learned troubleshooting patterns, past incident resolutions |
28+
29+
### How agents retrieve memory
30+
31+
During conversations, agents retrieve information from memory sources through configured tools.
32+
33+
:::image type="content" source="media/memory-system/azure-sre-agent-memory-system-loop.png" alt-text="Diagram of the Azure SRE Agent memory system loop.":::
34+
35+
<!--
36+
```mermaid
37+
flowchart TD
38+
subgraph Trigger
39+
A[User Question / Incident / Scheduled Task]
40+
end
41+
42+
subgraph Memory Sources
43+
B[User Memories<br/>chat commands]
44+
C[Knowledge Base<br/>documents]
45+
D[Documentation Connector<br/>ADO repos]
46+
E[Session Insights<br/>auto-generated]
47+
end
48+
49+
subgraph Retrieval
50+
F[SearchMemory Tool]
51+
end
52+
53+
A -- > B & C & D & E
54+
B & C & D & E -- > F
55+
F -- > G[Agent Reasoning]
56+
G -- > H[Relevant Context Retrieved]
57+
H -- > I[Agent Response]
58+
```
59+
-->
60+
61+
### Tool configuration
62+
63+
The `SearchMemory` tool retrieves content from all memory components. A single call searches user memories, the knowledge base, session insights, and the documentation connector simultaneously.
64+
65+
- SRE Agent (default): `SearchMemory` is built in
66+
- Custom subagents: Add `SearchMemory` tool to your configuration
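To make the "one tool, four sources" idea concrete, here is a minimal Python sketch of a search that queries several sources and merges the results by score. Everything in it is an assumption for illustration: `Hit`, `search_memory`, and `keyword_source` are hypothetical names, the keyword-overlap scoring stands in for whatever ranking the real tool uses, and the sources are queried one after another rather than simultaneously.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Hit:
    source: str   # which memory component produced the hit
    text: str
    score: float  # higher means more relevant (hypothetical scale)


# A source is any callable that maps a query to scored hits.
Source = Callable[[str], list[Hit]]


def keyword_source(name: str, documents: list[str]) -> Source:
    """Build a toy source that scores documents by shared words."""
    def search(query: str) -> list[Hit]:
        words = set(query.lower().split())
        hits = []
        for text in documents:
            overlap = len(words & set(text.lower().split()))
            if overlap:
                hits.append(Hit(name, text, float(overlap)))
        return hits
    return search


def search_memory(query: str, sources: list[Source], limit: int = 5) -> list[Hit]:
    """Query every source, then return the best hits across all of them."""
    hits = [hit for source in sources for hit in source(query)]
    return sorted(hits, key=lambda h: h.score, reverse=True)[:limit]


if __name__ == "__main__":
    sources = [
        keyword_source("user_memories", ["check redis cache health first"]),
        keyword_source("knowledge_base", ["redis failover runbook", "sql backup guide"]),
    ]
    for hit in search_memory("redis cache latency", sources):
        print(hit.source, "|", hit.text)
```

Merging before ranking means the caller gets one ordered list and doesn't need to know which component a fact came from.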
67+
68+
> [!IMPORTANT]
69+
> Don't store secrets, credentials, API keys, or sensitive data in any memory component. Memories are shared across your team and indexed for search.
70+
71+
## Quick start
72+
73+
Begin by establishing foundational knowledge with user memories, and then expand to document storage and automated synchronization as your needs grow.
74+
75+
### 1. Start with user memories
76+
77+
Use chat commands to save immediate team knowledge:
78+
79+
```text
80+
#remember Team owns services: app-service-prod, redis-cache-prod, and sql-db-prod
81+
82+
#remember For latency issues, check Redis cache health first
83+
84+
#remember Production deployments happen Tuesdays at 2 PM PST
85+
```
86+
87+
These facts are now available across all conversations.
88+
89+
### 2. Upload key documents
90+
91+
Add critical runbooks and guides to the knowledge base:
92+
93+
1. Open your SRE Agent in the Azure portal.
94+
95+
1. Go to **Settings** > **Knowledge base**.
96+
97+
1. Select **Add file** or drag and drop files into the upload area.
98+
99+
1. Upload `.md` or `.txt` files (up to 16 MB each).
100+
101+
1. The system indexes files and makes them available for retrieval through `SearchMemory`.
102+
103+
### 3. Review session insights
104+
105+
After troubleshooting sessions, check **Settings** > **Session insights** to see what went well and where the agent needs more context. Use the insights to identify knowledge gaps and add targeted memories or documentation.
106+
107+
### 4. Connect repositories (optional)
108+
109+
For teams with existing documentation in Azure DevOps:
110+
111+
1. Go to **Settings** > **Connectors**.
112+
113+
1. Select **Add connector** and select **Documentation connector**.
114+
115+
1. Enter your Azure DevOps repository URL and select a managed identity.
116+
117+
The connector starts indexing automatically.
118+
119+
## User memories
120+
121+
User memories let you save team facts, standards, and context that agents remember across all conversations. By using simple chat commands (`#remember`, `#forget`, `#retrieve`), you can build a persistent knowledge base that automatically enhances agent responses.
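The three commands share the shape `#command [content]`. The Python sketch below shows one way a chat front end might recognize them. It is an illustration only: `parse_memory_command` and `MemoryCommand` are hypothetical names, and the choices to require lowercase command names and a non-empty argument are assumptions, not documented behavior.

```python
import re
from typing import NamedTuple, Optional

# Assumption: a command is "#name", whitespace, then non-empty content.
COMMAND_PATTERN = re.compile(r"^#(remember|forget|retrieve)\s+(.+)$", re.DOTALL)


class MemoryCommand(NamedTuple):
    action: str    # "remember", "forget", or "retrieve"
    argument: str  # the content to save, forget, or search for


def parse_memory_command(message: str) -> Optional[MemoryCommand]:
    """Return the parsed command, or None for an ordinary chat message."""
    match = COMMAND_PATTERN.match(message.strip())
    if match is None:
        return None
    return MemoryCommand(match.group(1), match.group(2).strip())


if __name__ == "__main__":
    print(parse_memory_command("#remember Team owns app-service-prod in East US region"))
    print(parse_memory_command("What is the status of app-service-prod?"))
```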
122+
123+
### Chat commands
124+
125+
#### Save information by using `#remember`
126+
127+
Save facts, standards, or context for future conversations.
128+
129+
**Syntax:**
130+
131+
```text
132+
#remember [content to save]
133+
```
134+
135+
**Examples:**
136+
137+
```text
138+
#remember Team owns app-service-prod in East US region
139+
#remember For app-service-prod latency issues, check Redis cache health first
140+
#remember Team uses Kusto for logs. Workspace is "myteam-prod-logs"
141+
```
142+
143+
The system embeds the content by using OpenAI, stores it in Azure AI Search, and makes it available for automatic retrieval across all conversations. You see a confirmation: `✅ Agent Memory saved.`
144+
145+
#### Remove memories by using `#forget`
146+
147+
Delete previously saved memories by searching for them.
148+
149+
**Syntax:**
150+
151+
```text
152+
#forget [description of what to forget]
153+
```
154+
155+
**Examples:**
156+
157+
```text
158+
#forget NSG rules information
159+
#forget production environment location
160+
```
161+
162+
The system searches your memories semantically for the best match, shows you the content, and deletes it. You see a confirmation: `✅ Agent Memory forgotten: [deleted content]`
163+
164+
#### Query memories by using `#retrieve`
165+
166+
Explicitly search and display saved memories without triggering agent reasoning.
167+
168+
**Syntax:**
169+
170+
```text
171+
#retrieve [search query]
172+
```
173+
174+
**Examples:**
175+
176+
```text
177+
#retrieve production environment
178+
#retrieve deployment process
179+
```
180+
181+
The system searches memories semantically, and then uses the top five matches to synthesize a response. It displays both the individual memories and the synthesized answer.
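The ranking step behind `#retrieve` can be sketched as a top-k similarity search. The Python example below is a toy under stated assumptions: it swaps the real embedding model and search index for a bag-of-words vector and cosine similarity, and `embed`, `cosine`, and `top_matches` are hypothetical names. Only the limit of five matches comes from the description above.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_matches(query: str, memories: list[str], k: int = 5) -> list[str]:
    """Return up to k memories, most similar first, dropping non-matches."""
    q = embed(query)
    scored = [(cosine(q, embed(m)), m) for m in memories]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [memory for _, memory in ranked[:k]]


if __name__ == "__main__":
    saved = [
        "production environment is in East US",
        "deployments happen Tuesdays",
        "team uses Kusto for logs",
    ]
    print(top_matches("production environment", saved))
```

A real semantic search would also match paraphrases that share no words with the query, which this word-overlap stand-in cannot do.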
182+
183+
### Scope and storage
184+
185+
- **Shared across the team**: All users of the SRE Agent can access saved memories.
186+
187+
- **Persistent across all conversations**: Save a memory once, and it stays available until you remove it by using `#forget`.
188+
189+
- **Automatically retrieved when relevant**: Agents search memories semantically during reasoning.
190+
191+
## Knowledge base
192+
193+
The knowledge base provides direct document upload capabilities for runbooks, troubleshooting guides, and internal documentation that agents can retrieve during conversations.
194+
195+
### Supported file types and limits
196+
197+
- **Formats**: `.md` (markdown, recommended), `.txt` (plain text)
198+
- **Per file**: 16 MB maximum (Azure AI Search limit)
199+
- **Per request**: 100 MB total for all files in a single upload
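The limits above can be checked before an upload is attempted. The following Python sketch is a hypothetical client-side pre-check, not the portal's validation logic. `validate_upload` is an illustrative name, and treating 1 MB as 1,048,576 bytes is an assumption.

```python
from pathlib import PurePosixPath

ALLOWED_EXTENSIONS = {".md", ".txt"}
MAX_FILE_BYTES = 16 * 1024 * 1024      # 16 MB per file (assumes 1 MB = 1,048,576 bytes)
MAX_REQUEST_BYTES = 100 * 1024 * 1024  # 100 MB across one upload request


def validate_upload(files: dict[str, int]) -> list[str]:
    """Check a batch of files (name -> size in bytes) and list any problems."""
    problems = []
    for name, size in files.items():
        if PurePosixPath(name).suffix.lower() not in ALLOWED_EXTENSIONS:
            problems.append(f"{name}: unsupported format")
        if size > MAX_FILE_BYTES:
            problems.append(f"{name}: exceeds 16 MB per-file limit")
    if sum(files.values()) > MAX_REQUEST_BYTES:
        problems.append("request exceeds 100 MB total limit")
    return problems


if __name__ == "__main__":
    batch = {"runbook.md": 2_000_000, "diagram.png": 50_000}
    print(validate_upload(batch))
```

An empty list means the batch passes all three checks. Note that seven 15 MB files pass the per-file check but fail the per-request total.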
200+
201+
### Upload documents
202+
203+
1. Go to **Settings** > **Knowledge Base**.
204+
1. Select **Add file** or drag and drop files into the upload area.
205+
206+
The portal automatically validates, uploads, and indexes files.
207+
208+
### Manage documents
209+
210+
- **View**: Go to **Settings** > **Knowledge Base** to see all uploaded documents.
211+
212+
- **Update**: To overwrite the previous version, upload a file with the same name.
213+
214+
- **Delete**: Select documents and use the delete action. Changes take effect immediately.
215+
216+
## Session insights
217+
218+
As the agent handles your incidents, it learns. Session insights capture what worked, what didn't, and key learnings from each session. The agent automatically applies that knowledge to help with similar issues in the future.
219+
220+
### Automatic improvement
221+
222+
The agent learns from every session without any manual effort:
223+
224+
- The agent handles an issue autonomously or works with you directly.
225+
- The agent captures symptoms, resolution steps, root cause, and pitfalls.
226+
- These insights become searchable memories.
227+
- Future sessions automatically retrieve relevant past insights.
228+
229+
The result: the agent gets better over time, suggesting proven resolutions and avoiding known pitfalls.
230+
231+
### Discover opportunities
232+
233+
While session insights work automatically, reviewing them can surface valuable patterns you might want to act on.
234+
235+
| Pattern you might discover | Potential action |
236+
|---------------------------|------------------|
237+
| Same issue keeps recurring | Fix the underlying code or configuration |
238+
| Agent lacks context about your service | Create a custom subagent with domain knowledge |
239+
| Troubleshooting steps aren't documented | Update or create a runbook |
240+
| Telemetry gaps made diagnosis harder | Improve logging or add metrics |
241+
| Alert triggered but wasn't actionable | Tune the alert or add runbook links |
242+
243+
Think of session insights as a window into what the agent learns. You might find something worth acting on, or you might just let the agent handle any surfaced issues.
244+
245+
### How it works
246+
247+
Session insights create a continuous improvement loop: the agent captures symptoms, steps, root cause, and pitfalls from each session, then retrieves relevant past insights when similar issues arise. This automatic cycle helps the agent resolve problems faster over time.
248+
249+
<!--
250+
```mermaid
251+
flowchart TD
252+
subgraph Loop["Automatic Learning Loop"]
253+
A["Issue arises<br/>Incident, alert, or question"] -- > B["Agent captures insight<br/>symptoms, steps, root cause,<br/>pitfalls, learnings"]
254+
B -- > C["Insight indexed<br/>Becomes searchable memory"]
255+
C -- > D["Future sessions benefit<br/>Agent retrieves relevant insights"]
256+
D -.- >|Similar issue arises| A
257+
end
258+
259+
Loop -- > E["Automatic: Agent improves over time"]
260+
Loop -- > F["Optional: Review insights for<br/>code/telemetry/runbook opportunities"]
261+
```
262+
-->
263+
264+
:::image type="content" source="media/memory-system/azure-sre-agent-memory-system-loop.png" alt-text="Diagram of the Azure SRE Agent memory system loop.":::
265+
266+
### What the agent captures
267+
268+
The agent captures a series of data points from each session to improve future troubleshooting.
269+
270+
| Captured | How the agent uses it |
271+
|----------|----------------------|
272+
| **Symptoms observed** | Recognizes similar patterns in future problems |
273+
| **Steps that worked** | Suggests proven resolution paths |
274+
| **Root cause found** | Jumps to likely causes faster |
275+
| **Pitfalls encountered** | Avoids repeating mistakes |
276+
| **Context you provided** | Remembers facts about your environment |
277+
| **Resources involved** | Connects past problems on same resources |
278+
279+
### When insights are generated
280+
281+
The system generates insights automatically after conversations finish, or you can request them on demand.
282+
283+
- **Automatically**: After conversations finish (runs periodically, approximately every 30 minutes)
284+
- **On-demand**: Select **Generate Session insights** in the chat footer for immediate results (about 30 seconds)
285+
286+
### Browse insights
287+
288+
Go to **Settings** > **Session insights** to see what the agent learned:
289+
290+
- **Total count** in the header
291+
- **List of insights** with session title and timestamp
292+
- **Detail view** with expandable Timeline and Agent Performance sections
293+
- **Go to Thread** to revisit the original conversation
294+
295+
> [!NOTE]
296+
> While periodic manual browsing of insights can surface recurring patterns worth addressing, the agent benefits from these insights whether you review them or not.
297+
298+
### Insight structure
299+
300+
Each insight includes:
301+
302+
- **Timeline**: Chronological milestones of the troubleshooting session (up to eight)
303+
- **Agent Performance**: What went well, areas for improvement, and key learnings
304+
- **Investigation quality score**: 1-5 rating for investigation completeness
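One way to picture the insight structure is as a small validated record. The Python sketch below models the three parts listed above. The class and field names (`SessionInsight`, `AgentPerformance`, `investigation_quality_score`) are hypothetical, and the article doesn't describe the stored schema. Only the eight-milestone cap and the 1-5 score range come from the list.

```python
from dataclasses import dataclass, field

MAX_TIMELINE_MILESTONES = 8  # "up to eight" milestones per insight


@dataclass
class AgentPerformance:
    went_well: list[str] = field(default_factory=list)
    areas_for_improvement: list[str] = field(default_factory=list)
    key_learnings: list[str] = field(default_factory=list)


@dataclass
class SessionInsight:
    timeline: list[str]               # chronological milestones
    performance: AgentPerformance
    investigation_quality_score: int  # 1 (lowest) to 5 (highest)

    def __post_init__(self) -> None:
        # Enforce the two numeric constraints named in the list.
        if len(self.timeline) > MAX_TIMELINE_MILESTONES:
            raise ValueError("timeline holds at most eight milestones")
        if not 1 <= self.investigation_quality_score <= 5:
            raise ValueError("investigation quality score must be between 1 and 5")


if __name__ == "__main__":
    insight = SessionInsight(
        timeline=["Alert fired", "Redis latency confirmed", "Cache restarted"],
        performance=AgentPerformance(key_learnings=["Check Redis health first"]),
        investigation_quality_score=4,
    )
    print(insight.investigation_quality_score, len(insight.timeline))
```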
305+
306+
## Related content
307+
308+
- [Documentation connector](./documentation-connector.md)

articles/sre-agent/toc.yml

Lines changed: 4 additions & 0 deletions
@@ -19,6 +19,8 @@ items:
1919
items:
2020
- name: Connectors overview
2121
href: connectors.md
22+
- name: Documentation connector
23+
href: documentation-connector.md
2224
- name: Connect to custom MCP server
2325
href: custom-mcp-server.md
2426
- name: Build custom subagents
@@ -33,6 +35,8 @@ items:
3335
href: code-repository-connect.md
3436
- name: Scheduled tasks
3537
href: scheduled-tasks.md
38+
- name: Knowledge retention
39+
href: memory-system.md
3640
- name: Incident management
3741
items:
3842
- name: Overview
