The most accessible way to optimize a model's performance is through **prompt engineering**. Prompt engineering is the process of designing and refining prompts to improve the quality, accuracy, and relevance of the responses a language model generates. It requires no additional infrastructure or training data, and you can start experimenting immediately.

## Understand prompt components

When you interact with a language model, the quality of your question directly influences the quality of the response. A well-constructed prompt helps the model understand what you need and generate a more useful answer.

Prompts for chat completion models typically include the following components:

- **System message**: Instructions that define the model's behavior, role, and constraints.
- **User message**: The question or input from the user.
- **Assistant message**: Previous model responses, used in multi-turn conversations.
- **Examples**: Sample input/output pairs that demonstrate the expected response format.

How you structure and combine these components determines how effectively the model responds.
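These components map directly onto the message roles used by most chat completion APIs. The sketch below assembles them into a single request payload; the OpenAI-style `messages` structure is an assumption about your SDK, and the travel examples are invented for illustration:

```python
# Assemble a chat completion request from the prompt components.
# The "messages" structure follows the common OpenAI-style convention;
# adjust the field names if your SDK differs.

system_message = "You are a friendly travel advisor for Margie's Travel."

# Examples are encoded as prior user/assistant turns (few-shot style).
example_turns = [
    {"role": "user", "content": "Recommend a hotel in Paris."},
    {"role": "assistant", "content": "- Hotel Lumière, Paris, $$$"},
]

user_message = "Recommend a family-friendly hotel in Rome."

messages = [{"role": "system", "content": system_message}]
messages += example_turns
messages.append({"role": "user", "content": user_message})

for m in messages:
    print(m["role"])
```

The system message leads, examples sit in the middle as ordinary turns, and the live user message comes last.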

## Design effective system messages

A **system message** is a set of instructions you provide to the model to guide its responses. System messages typically appear first in the conversation and act as the highest-level set of instructions. You use them to:

- Define the assistant's role and boundaries.
- Set the tone and communication style.
- Specify output formats, such as JSON or bullet points.
- Add safety and quality constraints for your scenario.

A system message can be as simple as:

```text
You are a helpful AI assistant.
```

Or it can include detailed rules and formatting requirements. For example, the travel agency's chat application could use:

```text
You are a friendly travel advisor for Margie's Travel.
Answer only questions related to travel, hotels, and trip planning.
Use a warm, conversational tone.
If you don't have enough information to answer, ask a clarifying question.
Format hotel recommendations as a bulleted list with the hotel name, location, and price range.
```

> [!IMPORTANT]
> A system message influences the model but doesn't guarantee compliance. You should test and iterate on your system messages, and layer them with other mitigations like content filtering and evaluation.

When designing a system message, follow this checklist:

1. **Start with the assistant's role**: State the role and the expected outcome for a typical request.
1. **Define boundaries**: List the topics, actions, and content types the assistant should avoid.
1. **Specify the output format**: If you need a specific format, state it plainly and keep it consistent.
1. **Add a "when unsure" policy**: Tell the model what to do when the user's request is ambiguous, out of scope, or when the model lacks information.
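The checklist above can be sketched as a small helper that assembles a system message from the four pieces. The function and its sentence templates are hypothetical, just one way to keep the rules consistent across prompts:

```python
def build_system_message(role, boundaries, output_format, when_unsure):
    """Compose a system message following the four-step checklist.
    All arguments are plain strings supplied by the application."""
    return "\n".join([
        role,                                                       # 1. assistant's role
        f"Only answer questions about: {boundaries}.",              # 2. boundaries
        f"Format responses as: {output_format}.",                   # 3. output format
        f"If you are unsure or lack information, {when_unsure}.",   # 4. "when unsure" policy
    ])

msg = build_system_message(
    role="You are a friendly travel advisor for Margie's Travel.",
    boundaries="travel, hotels, and trip planning",
    output_format="a bulleted list with hotel name, location, and price range",
    when_unsure="ask a clarifying question",
)
print(msg)
```

Keeping the message in code like this makes it easy to test and iterate on each rule independently.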
| 50 | + |
| 51 | +## Apply prompt patterns |
| 52 | + |
| 53 | +Effective prompts use patterns that help the model produce better responses. Here are some common patterns you can use: |
| 54 | + |
| 55 | +### Persona pattern |
| 56 | + |
| 57 | +Instruct the model to take on a specific perspective or role. For example, asking the model to respond as a seasoned marketing professional produces different results than using no persona at all. |
| 58 | + |
| 59 | +| | No persona | With persona | |
| 60 | +|---|---|---| |
| 61 | +| **System message** | *None* | You're a seasoned marketing professional writing for technical customers. | |
| 62 | +| **User prompt** | Write a one-sentence description of a CRM product. | Write a one-sentence description of a CRM product. | |
| 63 | +| **Response** | A CRM product is a software tool designed to manage a company's interactions with customers. | Experience seamless customer relationship management with our CRM, designed to streamline operations and drive sales growth with robust analytics. | |
| 64 | + |
| 65 | +### Format template pattern |
| 66 | + |
| 67 | +Provide a template or structure in your prompt to get output in a specific format. For example, if you need a structured response about a hotel: |
| 68 | + |
| 69 | +```text |
| 70 | +Format the result to show: |
| 71 | +- Hotel name |
| 72 | +- Location |
| 73 | +- Star rating |
| 74 | +- Price range per night |
| 75 | +``` |
| 76 | + |
| 77 | +This pattern ensures consistent, organized responses that are easy to parse in your application. |
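As a rough sketch of that parsing step, the snippet below pulls the fields out of a response that follows the template. The sample response text is invented, and a real application would need to handle responses that deviate from the template:

```python
import re

# A response that follows the format template above (invented sample text).
response = """\
- Hotel name: The Grand Palazzo
- Location: Rome, Italy
- Star rating: 4
- Price range per night: $150-$220"""

# Because the template is fixed, each field can be pulled out with a
# simple "key: value" match on the bulleted lines.
fields = dict(
    re.findall(r"- (Hotel name|Location|Star rating|Price range per night): (.+)", response)
)
print(fields["Hotel name"])
```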

### Chain-of-thought pattern

Ask the model to explain its reasoning step by step. This technique, called **chain of thought**, reduces the chance of inaccurate results and makes it easier to verify the model's logic.

For example, instead of asking "Which hotel is best for a family of four?", you can prompt:

```text
Which hotel is best for a family of four? Take a step-by-step approach:
consider room size, amenities for children, location, and price.
```

A related technique is to **break the task down** into explicit sub-steps *before* the model responds, rather than asking it to reason through everything at once. For example, you might first ask the model to extract key facts from a passage, and then in a follow-up prompt ask it to answer a question based on those facts. Decomposing the work this way reduces errors on complex, multi-part tasks.
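A minimal sketch of that two-step decomposition, assuming a hypothetical `call_model` helper that stands in for your actual chat completion call:

```python
# Sketch of task decomposition: two prompts instead of one.
def call_model(prompt: str) -> str:
    # Placeholder: in a real application this would invoke the model API.
    return f"[model response to: {prompt[:40]}...]"

passage = "Hotel Aurora has two-bedroom suites, a kids' club, and a city-center location."

# Step 1: extract the key facts from the source text.
facts = call_model(f"List the key facts in this passage:\n---\n{passage}")

# Step 2: answer the question using only the extracted facts.
answer = call_model(
    f"Using only these facts, which hotel suits a family of four?\n---\n{facts}"
)
print(answer)
```

Each step is a smaller, more verifiable task, and the intermediate facts can be logged or inspected between calls.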

> [!NOTE]
> Chain-of-thought prompting is a technique for non-reasoning models. Reasoning models like o-series models handle step-by-step logic internally.

### Few-shot learning pattern

Provide one or more examples of the desired input and output to help the model identify the pattern you want. This technique is called **few-shot learning** (or **one-shot** for a single example). When no examples are provided, it's called **zero-shot** learning.

For example, to classify customer inquiries:

```text
Classify the following customer messages:

Message: "I need to change my flight to Rome"
Category: Booking change

Message: "What's the weather like in Bali in March?"
Category: Travel information

Message: "Can I get a refund for my cancelled tour?"
Category:
```

The model learns the classification pattern from the examples and correctly completes the last entry.
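In application code, a few-shot prompt like this is often assembled from labeled examples at runtime. A minimal sketch, reusing the classification examples above:

```python
# Build a few-shot classification prompt from labeled examples.
examples = [
    ("I need to change my flight to Rome", "Booking change"),
    ("What's the weather like in Bali in March?", "Travel information"),
]

def few_shot_prompt(examples, query):
    lines = ["Classify the following customer messages:", ""]
    for message, category in examples:
        lines += [f'Message: "{message}"', f"Category: {category}", ""]
    # The final entry is left unlabeled for the model to complete.
    lines += [f'Message: "{query}"', "Category:"]
    return "\n".join(lines)

prompt = few_shot_prompt(examples, "Can I get a refund for my cancelled tour?")
print(prompt)
```

Keeping the examples in a list makes it easy to add, swap, or A/B test them without touching the prompt scaffolding.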

### Use clear syntax and delimiters

When your prompt includes multiple sections, such as instructions, source text, and examples, use delimiters like `---`, Markdown headings, or XML tags to separate them. Clear boundaries help the model distinguish instructions from content and reduce the chance of misinterpretation.
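A short sketch of composing a delimited prompt; the heading names and the `---` separator are just one workable convention:

```python
# Separate instructions from source content with explicit delimiters so the
# model is less likely to mistake quoted text for instructions.
instructions = "Summarize the review below in one sentence."
source_text = "The hotel was spotless, and the staff arranged a last-minute tour."

prompt = "\n".join([
    "## Instructions",
    instructions,
    "---",
    "## Review",
    source_text,
])
print(prompt)
```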

> [!TIP]
> Models can be susceptible to **recency bias**, meaning text near the end of a prompt can have more influence than text at the beginning. If the model isn't following your instructions consistently, try repeating the key instruction at the end of the prompt.
## Configure model parameters

Beyond the text of your prompts, you can adjust model parameters that control how the model generates responses:

- **Temperature**: Controls the randomness of the output. A higher value (for example, 0.7) produces more creative and varied responses, while a lower value (for example, 0.2) produces more focused and deterministic responses. Use lower values for factual tasks and higher values for creative ones.
- **Top_p**: Also controls randomness, but in a different way. It limits the model to the smallest set of most probable next tokens whose cumulative probability reaches the threshold. For example, a `top_p` of 0.9 means the model samples only from the tokens that together account for the top 90% of probability mass.

> [!TIP]
> The general recommendation is to adjust either temperature or top_p, not both at the same time.

For the travel agency scenario, you might use a low temperature (0.2) when answering factual questions about hotel amenities, but a higher temperature (0.7) when generating creative travel itinerary suggestions.
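One way to implement that split is a small lookup keyed by task type. The task names and values below are illustrative defaults, not prescriptive settings:

```python
# Choose sampling parameters per task type for the travel agency scenario.
# The task names and temperature values are illustrative assumptions.
PARAMS = {
    "factual": {"temperature": 0.2},   # hotel amenities, policies
    "creative": {"temperature": 0.7},  # itinerary suggestions
}

def params_for(task_type: str) -> dict:
    # Fall back to the conservative setting for unknown task types.
    return PARAMS.get(task_type, PARAMS["factual"])

print(params_for("creative")["temperature"])
```

The returned dictionary can be passed straight through as keyword arguments to most chat completion SDKs.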

## When prompt engineering is enough

Prompt engineering is the right starting point for any model optimization effort. It's effective when you need to:

- Guide the model's tone, format, and behavior.
- Provide specific instructions for a task.
- Quickly iterate on results without infrastructure changes.
- Keep costs low, as no additional training or data storage is required.

However, prompt engineering has limits. If the model doesn't have access to the information it needs (like your company's hotel catalog), or if it consistently fails to maintain a specific behavior despite detailed instructions, you need to consider additional strategies.