---
title: 'Tutorial: Spring Boot chatbot with SLM extension'
description: Learn how to deploy a Spring Boot application integrated with a Phi-4 sidecar extension on Azure App Service.
author: cephalin
ms.author: cephalin
ms.date: 11/18/2025
ms.topic: tutorial
ms.custom:
ms.collection: ce-skilling-ai-copilot
ms.update-cycle: 180-days
ms.service: azure-app-service
---
This tutorial guides you through deploying a Spring Boot-based chatbot application integrated with the Phi-4 sidecar extension on Azure App Service. By following the steps, you'll learn how to set up a scalable web app, add an AI-powered sidecar for enhanced conversational capabilities, and test the chatbot's functionality.
[!INCLUDE advantages]
## Prerequisites

- An Azure account with an active subscription.
- A GitHub account.
## Deploy the sample application

1. In the browser, navigate to the sample application repository.

1. Start a new Codespace from the repository.

1. Log in with your Azure account:

    ```azurecli
    az login
    ```

1. Open the terminal in the Codespace and run the following commands:

    ```azurecli
    cd use_sidecar_extension/springapp
    ./mvnw clean package
    az webapp up --sku P3MV3 --runtime "JAVA:21-java21" --os-type linux
    ```
[!INCLUDE phi-4-extension-create-test]
## How the sample application works

The sample application demonstrates how to integrate a Java service with the SLM sidecar extension. The `ReactiveSLMService` class encapsulates the logic for sending requests to the SLM API and processing the streamed responses. This integration enables the application to generate conversational responses dynamically.

Looking in `use_sidecar_extension/springapp/src/main/java/com/example/springapp/service/ReactiveSLMService.java`, you see that:
- The service reads the API URL from `fashion.assistant.api.url`, which is set in `application.properties` and has the value of `http://localhost:11434/v1/chat/completions`.

    ```java
    public ReactiveSLMService(@Value("${fashion.assistant.api.url}") String apiUrl) {
        this.webClient = WebClient.builder()
                .baseUrl(apiUrl)
                .build();
    }
    ```
- The POST payload includes the system message and the prompt that's built from the selected product and the user query.

    ```java
    JSONObject requestJson = new JSONObject();
    JSONArray messages = new JSONArray();
    JSONObject systemMessage = new JSONObject();
    systemMessage.put("role", "system");
    systemMessage.put("content", "You are a helpful assistant.");
    messages.put(systemMessage);
    JSONObject userMessage = new JSONObject();
    userMessage.put("role", "user");
    userMessage.put("content", prompt);
    messages.put(userMessage);
    requestJson.put("messages", messages);
    requestJson.put("stream", true);
    requestJson.put("cache_prompt", false);
    requestJson.put("n_predict", 2048);
    String requestBody = requestJson.toString();
    ```
- The reactive POST request streams the response line by line. Each line is parsed to extract the generated content (or token).

    ```java
    return webClient.post()
            .contentType(MediaType.APPLICATION_JSON)
            .body(BodyInserters.fromValue(requestBody))
            .accept(MediaType.TEXT_EVENT_STREAM)
            .retrieve()
            .bodyToFlux(String.class)
            .filter(line -> !line.equals("[DONE]"))
            .map(this::extractContentFromResponse)
            .filter(content -> content != null && !content.isEmpty())
            .map(content -> content.replace(" ", "\u00A0"));
    ```
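The `extractContentFromResponse` helper in the pipeline above parses each streamed chunk to pull out the generated token. As a rough, self-contained illustration of what that parsing involves, here's a minimal sketch that extracts the `content` field from an OpenAI-style streaming chunk using plain string handling. The `TokenExtractor` class name and the string-based approach are assumptions for illustration only; the sample's actual implementation may use a JSON parser instead.

```java
public class TokenExtractor {
    // Extracts the "content" value from one OpenAI-style streaming chunk,
    // for example: {"choices":[{"delta":{"content":"Hello"}}]}
    // Simplified sketch: doesn't handle escaped quotes inside the content.
    public static String extractContent(String line) {
        String marker = "\"content\":\"";
        int start = line.indexOf(marker);
        if (start < 0) {
            return ""; // no content field in this chunk (e.g., "[DONE]")
        }
        start += marker.length();
        int end = line.indexOf('"', start);
        return end < 0 ? "" : line.substring(start, end);
    }

    public static void main(String[] args) {
        String chunk = "{\"choices\":[{\"delta\":{\"content\":\"Hello\"}}]}";
        System.out.println(extractContent(chunk)); // prints: Hello
    }
}
```

Downstream, the `filter(content -> content != null && !content.isEmpty())` step in the sample's pipeline discards chunks for which no content could be extracted.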
[!INCLUDE faq]
## Next step

Tutorial: Configure a sidecar container for a Linux app in Azure App Service