title: Understand when to fine-tune a language model
metadata:
  title: Understand when to fine-tune a language model
  description: Learn how to optimize a language model and explore fine-tuning techniques including supervised fine-tuning, reinforcement fine-tuning, and direct preference optimization.

title: Prepare your data to fine-tune a chat completion model
metadata:
  title: Prepare your data to fine-tune a chat completion model
  description: Learn how to prepare training data from real conversations or generate synthetic data using Microsoft Foundry's data generation tools for fine-tuning chat completion models.
learn-pr/wwl-data-ai/finetune-model-copilot-ai-studio/includes/2-understand-finetune.md
Fine-tuning a language model gives you greater control over how your model behaves, helping you achieve consistent responses in a specific style, format, and tone. Here you learn when to use fine-tuning and explore five key fine-tuning techniques.

Starting with the most basic approach:

- Supervised fine-tuning for training with labeled examples

And four more advanced techniques:

- Function calling fine-tuning for structured output and API integration
- Vision fine-tuning for image understanding tasks
- Reinforcement fine-tuning for reward-based learning
- Direct preference optimization for alignment using preference pairs

Let's start by comparing fine-tuning to other optimization techniques for models and agents.

## Understand when to use fine-tuning
When you want to develop a chat application with Microsoft Foundry, you can use prompt flow to create a chat application that is integrated with a language model to generate responses. To improve the quality of the responses the model generates, you can try various strategies. The easiest strategy is to apply **prompt engineering**. You can change the way you format your question, but you can also update the **system message** that is sent along with the prompt to the language model.
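
As an illustration of how the system message shapes behavior without any retraining, here is a minimal sketch of a chat-completions-style request payload. The deployment name and message contents are hypothetical, not part of any specific API:

```python
import json

# A chat-completions-style request: the system message steers tone and scope,
# while the user message carries the actual question. The deployment name
# "my-gpt-4o" is a placeholder.
request = {
    "model": "my-gpt-4o",  # hypothetical deployment name
    "messages": [
        {"role": "system", "content": "You are a concise assistant that answers in formal British English."},
        {"role": "user", "content": "How do I reset my password?"},
    ],
    "temperature": 0.2,  # lower temperature -> more deterministic responses
}

# Updating the system message changes behavior without touching model weights.
request["messages"][0]["content"] = "You are a friendly assistant that answers with a short bulleted list."

print(json.dumps(request["messages"][0], indent=2))
```

Prompt engineering of this kind is cheap to iterate on, which is why it's worth exhausting before reaching for fine-tuning.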

To maximize the **consistency of the model's behavior**, you can **fine-tune a base model** with your own training data.

## Explore fine-tuning techniques

Microsoft Foundry supports multiple fine-tuning techniques, each designed for different use cases and model capabilities:

### Apply supervised fine-tuning
**Supervised fine-tuning** is the most basic and common approach, in which you train a base model on labeled example data. You provide the model with example conversations that demonstrate the desired behavior, including system messages, user prompts, and assistant responses. This technique is ideal for teaching the model specific formats, styles, tones, or domain-specific behaviors.

Supervised fine-tuning is supported for models like GPT-4, GPT-4o, GPT-3.5-Turbo, and many other foundation models in the model catalog. This is the recommended starting point for most fine-tuning scenarios.
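
As a sketch of what labeled example data can look like, the snippet below serializes two example conversations into JSON Lines (JSONL), one training example per line. The conversations themselves are invented, and the `{"messages": [...]}` shape is the chat fine-tuning format described in the next unit:

```python
import json

# Two labeled examples that demonstrate the desired support-agent tone.
# Format assumption: one {"messages": [...]} object per JSONL line.
examples = [
    {"messages": [
        {"role": "system", "content": "You are a courteous support agent."},
        {"role": "user", "content": "My order is late."},
        {"role": "assistant", "content": "I'm sorry to hear that. Could you share your order number so I can check its status?"},
    ]},
    {"messages": [
        {"role": "system", "content": "You are a courteous support agent."},
        {"role": "user", "content": "Cancel my subscription."},
        {"role": "assistant", "content": "Certainly, I can help with that. Can you confirm the email address on the account?"},
    ]},
]

# Serialize to JSONL: one training example per line.
jsonl = "\n".join(json.dumps(example) for example in examples)
```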

### Implement reinforcement fine-tuning

**Reinforcement fine-tuning (RFT)** is an advanced technique that improves reasoning models by training them through a reward-based process, rather than relying only on labeled data. Instead of providing example responses, you provide prompts and a grader that scores the quality of the model's outputs. The model learns to generate better responses by maximizing the reward signal.

RFT is especially useful for:

- Complex reasoning and problem-solving tasks
- Scenarios where labeled examples are limited
- Cases where you want the model to develop sophisticated reasoning strategies

RFT is supported for advanced reasoning models like o4-mini and gpt-5. When using RFT, you need to define a grader (such as text comparison, model-based, or custom code graders) that evaluates the model's outputs during training.
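
Conceptually, a grader maps a model output to a score that serves as the reward signal. The toy string-check grader below only illustrates that idea; actual RFT graders are configured as part of the training job (for example, text comparison or model-based graders), not written as free-form scripts:

```python
def string_check_grader(model_output: str, reference_answer: str) -> float:
    """Toy grader: 1.0 if the reference answer appears in the output, else 0.0.

    Real graders can be text-comparison, model-based, or custom code; this
    only illustrates the reward signal that RFT maximizes during training.
    """
    return 1.0 if reference_answer.lower() in model_output.lower() else 0.0

# During RFT, outputs that score higher are reinforced over time.
scores = [
    string_check_grader("The answer is 42 because 6 * 7 = 42.", "42"),
    string_check_grader("I am not sure.", "42"),
]
```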

### Align with direct preference optimization

**Direct preference optimization (DPO)** is an advanced alignment technique that adjusts model weights based on human preferences. Instead of providing single example responses, you provide pairs of responses for each prompt: one preferred and one nonpreferred. The model learns to generate outputs more similar to the preferred examples.

DPO is especially useful when:

- There's no clear-cut correct answer
- Subjective elements like tone, style, or content preferences are important
- You have preference data from user logs, A/B tests, or manual annotations
- You want a computationally lighter alternative to reinforcement learning from human feedback (RLHF)

DPO is supported for models like gpt-4o, gpt-4.1, and gpt-4.1-mini. You can use DPO with base models or with models already fine-tuned using supervised fine-tuning.
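
A DPO training record pairs one prompt with a preferred and a nonpreferred completion. As a hedged sketch, one common JSONL layout looks like the following; the field names here are an assumption, so verify them against the service's documented schema before use:

```python
import json

# A single DPO training record: the same prompt with a preferred and a
# nonpreferred response. Field names follow one common layout
# ("input" / "preferred_output" / "non_preferred_output"); confirm the exact
# schema against the service documentation.
record = {
    "input": {"messages": [
        {"role": "user", "content": "Summarize our refund policy."},
    ]},
    "preferred_output": [
        {"role": "assistant", "content": "Refunds are available within 30 days of purchase with proof of payment."},
    ],
    "non_preferred_output": [
        {"role": "assistant", "content": "Refunds? Maybe. It depends. Ask someone else."},
    ],
}

line = json.dumps(record)  # one record per JSONL line
```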

### Fine-tune for function calling

**Function calling fine-tuning** is an advanced technique that trains models to reliably call external functions or APIs with structured arguments. You provide training examples that demonstrate how the model should respond to user requests by calling specific functions with the correct parameters. This technique teaches the model when and how to use tools, improving its ability to generate structured output and integrate with external systems.

Function calling fine-tuning is especially useful for:

- Building agents that need to interact with APIs or databases
- Ensuring consistent JSON schema adherence
- Teaching domain-specific function usage patterns
- Reducing errors in parameter extraction and function selection

Function calling fine-tuning is supported for models like GPT-4o and GPT-4o-mini. You can combine this with supervised fine-tuning to teach both conversational behavior and function calling capabilities.
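
A function calling training example declares the available tools and shows the assistant making the expected call. The sketch below follows the chat-completions tool-calling shape; the `get_order_status` function and its schema are hypothetical:

```python
import json

# One function-calling training example: the "tools" list declares the API
# surface, and the assistant turn demonstrates the expected call with
# correctly extracted parameters. The function and schema are invented.
example = {
    "messages": [
        {"role": "system", "content": "You are an order-tracking assistant."},
        {"role": "user", "content": "Where is order 1137?"},
        {"role": "assistant", "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {
                "name": "get_order_status",
                "arguments": json.dumps({"order_id": "1137"}),
            },
        }]},
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_order_status",
            "description": "Look up the shipping status of an order.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    }],
}

# The demonstrated call must match the declared schema.
args = json.loads(example["messages"][2]["tool_calls"][0]["function"]["arguments"])
```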

### Fine-tune for vision tasks

**Vision fine-tuning** is an advanced technique that enhances a model's ability to understand and reason about images. You provide training examples that pair images with text prompts and expected responses, teaching the model to recognize specific visual patterns, objects, or concepts relevant to your domain. This technique is ideal for specialized computer vision applications where general-purpose vision models need domain-specific understanding.

Vision fine-tuning is especially useful for:

- Medical imaging analysis with specialized terminology
- Industrial quality control and defect detection
- Document understanding with custom layouts or formats
- Domain-specific image classification and captioning

Vision fine-tuning is supported for multimodal models like GPT-4o. You can fine-tune both the vision and language understanding capabilities together to create models tailored to your specific visual tasks.
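
Vision training examples pair image content with text. Here is a hedged sketch of what a single example could look like, assuming the multimodal chat format in which a user turn carries a list of content parts; the URL is a placeholder, and images can often also be embedded as base64 data URLs:

```python
# One vision fine-tuning example for a hypothetical defect-inspection domain.
# The user turn mixes a text part with an image part; verify the exact
# content-part shape against the service's schema.
example = {
    "messages": [
        {"role": "system", "content": "You classify product photos for a defect-inspection workflow."},
        {"role": "user", "content": [
            {"type": "text", "text": "Is this casting defective?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/casting-042.jpg"}},
        ]},
        {"role": "assistant", "content": "Defective: a hairline crack is visible near the upper mounting hole."},
    ]
}

part_types = [part["type"] for part in example["messages"][1]["content"]]
```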

Now that you understand the options, let's explore what to consider when preparing your data for fine-tuning workloads.
learn-pr/wwl-data-ai/finetune-model-copilot-ai-studio/includes/3-prepare-data.md
Fine-tuning involves combining a suitable *foundation* model to use as a base with a set of *training data* that includes example prompts and responses the model can learn from. This unit covers understanding the training data format, preparing datasets from real data, and generating synthetic data for fine-tuning.

## Understand the training data format
When you decide you want to fine-tune a language model, you need to identify the dataset you can use to fine-tune your language model.

Similar to any machine learning model, the quality of the dataset has a large effect on the quality of your model. Though you need less data than you would to train a language model from scratch, you still need enough data to maximize the consistency of your desired model's behavior. How much data you need depends on your use case.

The three variables come together in a JSON Lines or JSONL file. For example:

```json
{"messages": [{"role": "system", "content": "You are an Xbox customer support agent whose primary goal is to help users with issues they are experiencing with their Xbox devices. You are friendly and concise. You only provide factual answers to queries, and do not provide answers that are not related to Xbox."}, {"role": "user", "content": "Is Xbox better than PlayStation?"}, {"role": "assistant", "content": "I apologize, but I cannot provide personal opinions. My primary job is to assist you with any issues related to your Xbox device. Do you have any Xbox-related issues that need addressing?"}]}
```
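
Before uploading a dataset, it helps to confirm that every line parses and has the expected shape. Below is a minimal validator sketch; the checks shown are assumptions about a reasonable baseline, not the service's full validation rules:

```python
import json

def validate_chat_jsonl(text: str) -> list[str]:
    """Return a list of problems found in a chat-format JSONL string.

    Minimal checks only: each line must parse as JSON, contain a non-empty
    'messages' list, use known roles, and end with an assistant response.
    """
    problems = []
    for i, line in enumerate(text.splitlines(), start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {i}: not valid JSON")
            continue
        messages = record.get("messages")
        if not isinstance(messages, list) or not messages:
            problems.append(f"line {i}: missing 'messages' list")
            continue
        if any(m.get("role") not in {"system", "user", "assistant"} for m in messages):
            problems.append(f"line {i}: unknown role")
        if messages[-1].get("role") != "assistant":
            problems.append(f"line {i}: does not end with an assistant message")
    return problems

good = '{"messages": [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello!"}]}'
bad = '{"messages": [{"role": "user", "content": "Hi"}]}'
report = validate_chat_jsonl(good + "\n" + bad)
```

Running a check like this locally catches formatting mistakes before they surface as failed fine-tuning jobs.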

## Prepare datasets from real data

The dataset should show the model's ideal behavior. You can create this dataset based on the chat history of a chat application you have. A few things to keep in mind when you use real data:

- Remove any personal or sensitive information.

An example of a multi-turn chat file format with weights:

```json
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already.", "weight": 1}]}
```
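
In this format, an assistant message with `"weight": 0` is kept as conversation context but excluded from training. A small sketch that extracts the turns that do train, assuming a missing weight defaults to 1:

```python
import json

# The weighted multi-turn example from above, as a single JSONL line.
line = '{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What\'s the capital of France?"}, {"role": "assistant", "content": "Paris", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Paris, as if everyone doesn\'t know that already.", "weight": 1}]}'

messages = json.loads(line)["messages"]

# Assistant turns with weight 0 stay as context but are not trained on;
# assumption: a missing weight defaults to 1 (trained).
trained = [m["content"] for m in messages
           if m["role"] == "assistant" and m.get("weight", 1) == 1]
```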
## Generate synthetic data for fine-tuning
Creating high-quality training datasets manually can be time-consuming and expensive. Microsoft Foundry provides **synthetic data generation** capabilities that can help you create training data from reference documents or API specifications.
41
+
42
+
Synthetic data generation is especially useful when:

- Real data is scarce or difficult to collect
- You need to preserve privacy while retaining useful structure
- You want to generate domain-specific data from existing documents or code
- You want to reduce the cost compared to manual data collection
### Choose synthetic data generators
Microsoft Foundry offers two types of synthetic data generators:

**Simple Q&A Generator**: Converts domain documents (PDF, Markdown, or plain text up to 20 MB) into fine-tuning question-answer pairs. You can configure the question type:

- **Long answer**: Generates questions that require analytical reasoning in the model response
- **Short answer**: Generates questions focused on factual brevity

**Tool use Generator**: Creates multi-turn conversations with tool calls based on your API surface. It requires a valid OpenAPI 3.0.x or 3.1.x specification (JSON format, up to 20 MB) that describes the APIs you want the model to learn to call as tools.

### Generate synthetic data in the portal

To generate synthetic data in Microsoft Foundry:

1. Navigate to **Data > Synthetic Data Generation** in the portal.
1. Select **Generate data**.
1. Choose a task type (Simple Q&A or Tool use).
1. Configure generation parameters if applicable.
1. Upload your reference file.
1. Specify the number of samples to generate (between 50 and 1,000).
1. Select the model to use for data generation.
1. Optionally, enable an 80/20 train-validation split.
1. Submit the job.

The generated dataset is automatically formatted as JSONL files compatible with Microsoft Foundry fine-tuning workflows. You can download the dataset for review or use it directly in fine-tuning.
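
The optional 80/20 train-validation split from the steps above is also easy to reproduce locally when you assemble your own JSONL files. A minimal sketch, assuming one example per line:

```python
import random

def split_jsonl(lines: list[str], train_fraction: float = 0.8, seed: int = 42):
    """Shuffle JSONL examples and split them into train and validation sets."""
    shuffled = lines[:]
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Placeholder examples standing in for real JSONL lines.
examples = [f'{{"messages": [...example {i}...]}}' for i in range(100)]
train, validation = split_jsonl(examples)
```

Holding out a validation set lets you watch for overfitting during the fine-tuning run rather than discovering it afterward.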

### Apply best practices for synthetic data

To get the best results when using synthetic data:

- **Use high-quality reference files**: The quality of your reference file directly impacts the generated data. Use relevant, well-structured documents with clear formatting, and avoid excessive noise or irrelevant information.
- **Start small and iterate**: Begin with a smaller sample size to evaluate quality, then scale up after reviewing and refining your approach.
- **Combine with real data**: When possible, mix synthetic data with real-world examples to improve model performance and generalization.
- **Experiment with hyperparameters**: When fine-tuning on synthetic data, you might need to adjust learning rates and other parameters differently than with real data to avoid overfitting.
- **Monitor performance**: Regularly evaluate your fine-tuned model on real-world tasks to ensure it meets your requirements.

When preparing your dataset to fine-tune a language model, you should understand your desired model behaviors, create a dataset in JSONL format (manually or through synthetic generation), and ensure the examples you include are high quality and diverse. By preparing your dataset carefully, you increase the chance that the fine-tuned model improves your chat application's performance.