Commit 7a2211a

Merge pull request #60 from cephalin/copilot/sub-pr-59

Address documentation feedback: rename scenario files, remove .gitignore change, add content

2 parents: b0a1a1f + 15063b3

10 files changed: 250 additions & 82 deletions

.gitignore

Lines changed: 0 additions & 1 deletion
@@ -158,4 +158,3 @@ articles/planetary-computer/geocatalog_configs/umbra-sar-ships/stac.json
 articles/planetary-computer/geocatalog_configs/umbra-sar-ships/tile-settings.json
 articles/planetary-computer/helper-content/collection-config-scraper.py
 articles/planetary-computer/helper-content/generate_collection_docs.py
-.beads/

articles/app-service/overview-ai-integration.md

Lines changed: 3 additions & 3 deletions
@@ -31,11 +31,11 @@ Explore these scenarios to find the right approach for your AI-powered applicati

 | Scenario | Description | Learn more |
 |----------|-------------|------------|
-| **Chatbots and RAG applications** | Build intelligent chatbots and RAG-powered web apps with Azure OpenAI (and optional Azure AI Search) for grounded, context-aware responses directly in App Service. | [Get started](scenario-ai-chatbot-rag.md) |
+| **Chatbots and RAG applications** | Build intelligent chatbots and RAG-powered web apps with Azure OpenAI (and optional Azure AI Search) for grounded, context-aware responses directly in App Service. | [Get started](scenario-ai-chatbot-retrieval-augmented-generation.md) |
 | **Agentic web applications** | Add autonomous, reasoning AI agents to your existing CRUD web apps using frameworks like Semantic Kernel, LangGraph, or Foundry Agent Service to enable planning, multi-step actions, and natural language interactions. | [Get started](scenario-ai-agentic-web-apps.md) |
 | **OpenAPI tools for Foundry agents** | Expose your App Service REST APIs as secure, callable tools via OpenAPI specs so Foundry Agent Service agents can discover and invoke them for real-world actions and data access. | [Get started](scenario-ai-openapi-tool.md) |
-| **Model Context Protocol servers** | Host your App Service app as an MCP server to extend AI coding assistants like GitHub Copilot Chat, Cursor, and Windsurf with your custom business logic, APIs, and data context via the Model Context Protocol. | [Get started](scenario-ai-mcp-server.md) |
-| **Local small language models** | Run small language models (e.g., Phi-3/Phi-4) entirely locally as sidecar containers in App Service for full data privacy, zero-latency inference, offline capability, and no external API calls or dependencies. | [Get started](scenario-ai-local-slm.md) |
+| **Model Context Protocol servers** | Host your App Service app as an MCP server to extend AI coding assistants like GitHub Copilot Chat, Cursor, and Windsurf with your custom business logic, APIs, and data context via the Model Context Protocol. | [Get started](scenario-ai-model-context-protocol-server.md) |
+| **Local small language models** | Run small language models (e.g., Phi-3/Phi-4) entirely locally as sidecar containers in App Service for full data privacy, zero-latency inference, offline capability, and no external API calls or dependencies. | [Get started](scenario-ai-local-small-language-model.md) |
 | **Secure AI applications** | Secure your OpenAPI tools, MCP servers, and AI endpoints in App Service using Microsoft Entra ID authentication, managed identities, and built-in authorization to ensure only authorized users and agents can access them. | [Get started](scenario-ai-authentication.md) |

 ## More resources

articles/app-service/scenario-ai-agentic-web-apps.md

Lines changed: 38 additions & 0 deletions
@@ -16,6 +16,44 @@ ms.update-cycle: 180-days

 Transform your traditional CRUD web applications for the AI era by adding agentic capabilities with frameworks like Microsoft Semantic Kernel, LangGraph, or Foundry Agent Service. Instead of users navigating forms, textboxes, and dropdowns, you can offer a conversational interface that lets users "talk to an agent" that intelligently performs the same operations your app provides. This approach enables your web app to reason, plan, and take actions on behalf of users.

+## Overview
+
+Agentic web applications represent a paradigm shift from traditional web interfaces. Rather than requiring users to understand and navigate your application's structure, agentic apps use AI to understand user intent, plan multi-step actions, and execute complex workflows autonomously.
+
+Key characteristics of agentic web applications include:
+
+- **Conversational interfaces**: Users express goals in natural language rather than clicking through forms
+- **Autonomous reasoning**: Agents break down complex requests into executable steps
+- **Multi-step planning**: Agents chain multiple operations to accomplish sophisticated tasks
+- **Tool usage**: Agents call your existing APIs and functions as needed to complete user requests
+- **Context awareness**: Agents maintain conversation history and application state across interactions
+- **Error handling**: Agents recover from failures and adapt their approach based on results
+
+Frameworks like Semantic Kernel, LangGraph, and Foundry Agent Service provide the orchestration layer that connects large language models with your application's business logic, enabling these agentic capabilities in your App Service applications.
+
+## When to build agentic web applications
+
+Consider adding agentic capabilities when:
+
+- **Complex workflows are common**: Users frequently need to perform multi-step operations that could be simplified through conversation
+- **Domain expertise is required**: Your application requires specialized knowledge that an AI agent can learn and apply
+- **User experience matters**: You want to reduce training time and make your application more intuitive
+- **Data exploration is important**: Users need to query, analyze, and visualize data in flexible ways
+- **Task automation is valuable**: Repetitive or time-consuming workflows could benefit from AI-assisted completion
+
+Agentic patterns work especially well for enterprise applications, data analysis tools, content management systems, and administrative interfaces where the combination of natural language understanding and automated workflows provides significant productivity gains.
+
+## Choosing an agent framework
+
+App Service supports any agent framework that runs on your chosen language stack. You have complete flexibility to use the tools and frameworks that best fit your needs. Popular options include:
+
+- **Semantic Kernel**: Microsoft's cross-platform SDK for .NET, Python, and Java, ideal for building custom agents with full control
+- **LangGraph**: Python and JavaScript framework for building stateful, multi-agent systems with complex workflows
+- **Foundry Agent Service**: Managed service for hosting production-ready agents with built-in monitoring and scalability
+- **Custom frameworks**: Any other agentic framework supported by your language (e.g., AutoGen, CrewAI, or proprietary solutions)
+
+## Get started with tutorials
+
 ## [.NET](#tab/dotnet)
 - [Tutorial: Build an agentic web app in Azure App Service with Microsoft Semantic Kernel or Foundry Agent Service (.NET)](tutorial-ai-agent-web-app-semantic-kernel-foundry-dotnet.md)
 - Blog series: Build long-running AI agents with Microsoft Agent Framework
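The added section describes agents that plan multi-step actions and call your existing CRUD operations as tools. A minimal, framework-agnostic sketch of that tool-calling loop is below; the keyword-matching planner stands in for an LLM-backed framework such as Semantic Kernel or LangGraph, and every function and tool name is hypothetical, not an API from those libraries:

```python
# Sketch of the agentic pattern: existing CRUD operations become "tools",
# and a planner maps a natural-language request to a sequence of tool calls.
# The planner here is a toy keyword matcher standing in for an LLM.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ToolRegistry:
    tools: dict[str, Callable[..., str]] = field(default_factory=dict)

    def register(self, name: str):
        def wrap(fn):
            self.tools[name] = fn
            return fn
        return wrap

registry = ToolRegistry()
orders: dict[int, str] = {1001: "shipped"}  # hypothetical app state

@registry.register("get_order_status")
def get_order_status(order_id: int) -> str:
    """Existing CRUD read operation, now callable by the agent."""
    return orders.get(order_id, "unknown")

@registry.register("cancel_order")
def cancel_order(order_id: int) -> str:
    """Existing CRUD update operation."""
    orders[order_id] = "cancelled"
    return "cancelled"

def plan(user_request: str) -> list[tuple[str, dict]]:
    """Toy planner: a real agent framework would ask an LLM to produce
    this list of (tool, arguments) steps from the user's goal."""
    steps = []
    if "status" in user_request:
        steps.append(("get_order_status", {"order_id": 1001}))
    if "cancel" in user_request:
        steps.append(("cancel_order", {"order_id": 1001}))
    return steps

def run_agent(user_request: str) -> list[str]:
    """Execute each planned step and collect the tool results."""
    return [registry.tools[name](**args) for name, args in plan(user_request)]

print(run_agent("What's the status of my order? Please cancel it."))
# → ['shipped', 'cancelled']
```

The point of the sketch is the separation of concerns the section's bullet list implies: tools wrap your existing business logic unchanged, while the planner (an LLM in a real framework) decides which tools to chain for a given request.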

articles/app-service/scenario-ai-chatbot-rag.md renamed to articles/app-service/scenario-ai-chatbot-retrieval-augmented-generation.md

Lines changed: 26 additions & 0 deletions
@@ -16,6 +16,32 @@ ms.update-cycle: 180-days

 Build intelligent web apps that use Azure OpenAI for chat or retrieval augmented generation (RAG). These tutorials show you how to integrate Azure OpenAI and (optionally) Azure AI Search to create chatbots and RAG solutions in your preferred language, using managed identities for secure authentication.

+## Overview
+
+Chatbots and Retrieval Augmented Generation (RAG) applications represent two of the most popular AI integration patterns in modern web applications. A chatbot uses a large language model to engage in natural conversations with users, while RAG enhances these conversations by grounding responses in your own data sources, reducing hallucinations and providing more accurate, contextual answers.
+
+Azure App Service provides a comprehensive platform for hosting these intelligent applications with built-in support for:
+
+- **Azure OpenAI integration**: Seamless connection to the latest Azure OpenAI models using managed identities
+- **Azure AI Search connectivity**: Optional integration with Azure AI Search for vector search and document retrieval
+- **Secure authentication**: Built-in support for managed identities eliminates the need for API keys
+- **Scalability**: Automatic scaling to handle varying workloads
+- **Multiple language support**: Deploy chatbots in .NET, Java, Node.js, or Python
+
+## When to use chatbots and RAG
+
+Consider building a chatbot or RAG application when you want to:
+
+- **Provide conversational interfaces**: Replace traditional form-based UIs with natural language interactions
+- **Enable intelligent document search**: Allow users to query large document repositories using natural language
+- **Create customer support assistants**: Build AI-powered help desks that understand context and provide accurate responses
+- **Develop knowledge bases**: Transform static documentation into interactive Q&A systems
+- **Build internal tools**: Create employee-facing assistants that can access and explain company data
+
+RAG is particularly valuable when you need to ensure your AI responses are grounded in specific, up-to-date information from your organization's data sources, such as product catalogs, documentation, policies, or customer records.
+
+## Get started with tutorials
+
 ## [.NET](#tab/dotnet)
 - [Build a chatbot with Azure OpenAI (Blazor)](tutorial-ai-openai-chatbot-dotnet.md)
 - [Build a RAG application with Azure OpenAI and Azure AI Search (.NET)](tutorial-ai-openai-search-dotnet.md)
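The RAG pattern the added section describes is retrieve-then-prompt: fetch the documents most relevant to the user's question, then ground the model's prompt in them. A minimal sketch follows; the keyword-overlap scorer is a toy stand-in for a real retriever such as Azure AI Search vector search, and all document contents and names are hypothetical:

```python
# Sketch of RAG grounding: rank documents against the query, then build a
# prompt that instructs the model to answer only from the retrieved context.
# The keyword-overlap scorer stands in for a real vector/semantic retriever.
import re

DOCUMENTS = {  # hypothetical knowledge base
    "returns-policy": "Items can be returned within 30 days with a receipt.",
    "shipping-policy": "Standard shipping takes 3 to 5 business days.",
    "warranty": "Hardware is covered by a one-year limited warranty.",
}

def tokens(text: str) -> set[str]:
    """Lowercase word set, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(query: str, text: str) -> int:
    """Toy relevance score: count of shared words."""
    return len(tokens(query) & tokens(text))

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the top-k document bodies by overlap with the query."""
    ranked = sorted(DOCUMENTS.values(), key=lambda t: score(query, t), reverse=True)
    return ranked[:k]

def build_grounded_prompt(query: str) -> str:
    """Assemble the prompt a chat model would receive: retrieved context
    first, then the question, with an instruction to stay grounded."""
    context = "\n".join(retrieve(query))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

print(build_grounded_prompt("How long does standard shipping take?"))
```

Because the model is told to answer only from the retrieved context, its response stays grounded in your data, which is the hallucination-reduction property the section highlights.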

articles/app-service/scenario-ai-local-slm.md

Lines changed: 0 additions & 34 deletions
This file was deleted.

articles/app-service/scenario-ai-local-small-language-model.md

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+---
+title: Use local small language models (SLMs) in Azure App Service
+description: Deploy a web app with a local small language model (SLM) as a sidecar container to run AI models entirely within your App Service environment. No outbound calls or external AI service dependencies required.
+author: cephalin
+ms.author: cephalin
+ms.service: azure-app-service
+ms.topic: how-to
+ms.date: 01/29/2026
+ms.custom:
+  - build-2025
+ms.collection: ce-skilling-ai-copilot
+ms.update-cycle: 180-days
+---
+
+# Use a local SLM (sidecar container)
+
+Deploy a web app with a local small language model (SLM) as a sidecar container to run AI models entirely within your App Service environment. No outbound calls or external AI service dependencies required. This approach is ideal if you have strict data privacy or compliance requirements, as all AI processing and data remain local to your app. App Service offers high-performance, memory-optimized pricing tiers needed for running SLMs in sidecars.
+
+## Overview
+
+Small Language Models (SLMs) are compact AI models, such as Microsoft's Phi-3 and Phi-4 series, that can run efficiently with fewer computational resources compared to large language models. By deploying an SLM as a sidecar container alongside your web application in App Service, you can process AI requests entirely locally without making external API calls.
+
+This architecture provides several advantages:
+
+- **Complete data privacy**: All data and AI processing stays within your App Service environment
+- **Zero external dependencies**: No reliance on external AI services or internet connectivity
+- **Predictable latency**: Responses are consistently fast with no network overhead
+- **Cost control**: Pay only for App Service compute resources, with no per-token charges
+- **Regulatory compliance**: Meet strict data residency and privacy requirements
+
+## When to use local SLMs
+
+Local SLMs are ideal for scenarios where:
+
+- **Data privacy is critical**: Healthcare, finance, legal, or government applications that cannot send data to external services
+- **Offline capability is required**: Applications that must function without internet connectivity
+- **Predictable costs are important**: Fixed infrastructure costs instead of variable per-request pricing
+- **Low latency is essential**: Sub-100ms response times without network calls
+- **Moderate AI capabilities suffice**: Tasks like classification, summarization, entity extraction, or simple Q&A that don't require the most powerful models
+
+While SLMs may not match the capabilities of large models like GPT-4, they excel in focused, domain-specific tasks where smaller models can be fine-tuned for excellent performance with complete control over your data.
+
+## Technical approach
+
+App Service supports running SLMs through sidecar containers that deploy alongside your main application. The sidecar runs the model inference engine (such as ONNX Runtime or llama.cpp) and exposes a local endpoint your app can call. This keeps all processing on the same App Service instance and maintains isolation between the containers while they share the same compute resources.
+
+## Get started with tutorials
+
+## [.NET](#tab/dotnet)
+- [Run a chatbot with a local SLM sidecar extension](tutorial-ai-slm-dotnet.md)
+
+## [Java](#tab/java)
+- [Run a chatbot with a local SLM (Spring Boot)](tutorial-ai-slm-spring-boot.md)
+
+## [Node.js](#tab/nodejs)
+- [Run a chatbot with a local SLM (Express.js)](tutorial-ai-slm-expressjs.md)
+
+## [Python](#tab/python)
+- [Run a chatbot with a local SLM (FastAPI)](tutorial-ai-slm-fastapi.md)
+-----
+
+## Related content
+
+- [Integrate AI into your Azure App Service applications](overview-ai-integration.md)
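The "Technical approach" section above says the sidecar exposes a local endpoint the main app calls. A sketch of what that call might look like from the app container, assuming the sidecar serves an OpenAI-compatible chat endpoint; the URL, port, path, and model name are all assumptions, so check your sidecar image's documentation for the real values:

```python
# Sketch of the main app calling an SLM sidecar over localhost. The
# endpoint shape (OpenAI-compatible chat completions) and the port,
# path, and model name are assumptions, not App Service guarantees.
import json
import urllib.request

SIDECAR_URL = "http://localhost:11434/v1/chat/completions"  # hypothetical

def build_payload(user_message: str, model: str = "phi-4") -> dict:
    """Build an OpenAI-style chat request for the local sidecar."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": user_message},
        ],
        "stream": False,
    }

def ask_sidecar(user_message: str) -> str:
    """POST to the sidecar and return the reply text. Because the target
    is localhost, no data leaves the App Service environment."""
    req = urllib.request.Request(
        SIDECAR_URL,
        data=json.dumps(build_payload(user_message)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # Actually calling ask_sidecar() requires a running sidecar; here we
    # only show the request payload it would send.
    print(json.dumps(build_payload("Classify this ticket: login fails"), indent=2))
```

The localhost call is what delivers the privacy and latency properties the section lists: the request never traverses the public network, and there is no per-token billing, only the App Service compute the two containers share.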

articles/app-service/scenario-ai-mcp-server.md

Lines changed: 0 additions & 37 deletions
This file was deleted.
