### YamlMime:ModuleUnit
uid: learn.wwl.implement-vector-search-azure-cosmos-db.knowledge-check
title: Module assessment
metadata:
  title: Module Assessment
  description: Module assessment
  ms.date: 02/05/2026
  author: jeffkoms
  ms.author: jeffko
  ms.topic: unit
durationInMinutes: 5
content: "Choose the best response for each of the following questions."
quiz:
  questions:
  - content: "A developer is creating a container to store knowledge base documents with embeddings generated by the text-embedding-ada-002 model. Which vector embedding policy configuration is correct for the embeddings this model produces?"
    choices:
    - content: "Set dataType to float32, dimensions to 1536, and distanceFunction to cosine"
      isCorrect: true
      explanation: "Azure OpenAI's text-embedding-ada-002 model produces 1,536-dimensional vectors that are normalized. Using float32 provides full precision, 1536 matches the model's output dimensions, and cosine distance is recommended for normalized embeddings because it measures the angle between vectors regardless of magnitude."
    - content: "Set dataType to int8, dimensions to 1024, and distanceFunction to euclidean"
      isCorrect: false
      explanation: "The text-embedding-ada-002 model produces 1,536 dimensions, not 1,024. Additionally, int8 is for quantized embeddings, not the floating-point values this model produces. The dimensions must exactly match your embedding model's output."
    - content: "Set dataType to float16, dimensions to 3072, and distanceFunction to dotproduct"
      isCorrect: false
      explanation: "The text-embedding-ada-002 model produces 1,536 dimensions, not 3,072. While float16 and dotproduct are valid options for some scenarios, the dimensions must match your specific embedding model. The 3,072 value is for larger models like text-embedding-3-large."
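  # A minimal sketch of the policy from the correct answer above, using the azure-cosmos
  # Python SDK. The /embedding path, database, container, and partition key names are
  # assumptions for illustration only; they aren't defined by this module.
  #
  #   from azure.cosmos import CosmosClient, PartitionKey
  #
  #   client = CosmosClient(account_endpoint, credential=account_key)
  #   database = client.get_database_client("knowledge-base")
  #
  #   vector_embedding_policy = {
  #       "vectorEmbeddings": [
  #           {
  #               "path": "/embedding",          # property that stores the vector
  #               "dataType": "float32",         # full-precision floats
  #               "dimensions": 1536,            # matches text-embedding-ada-002 output
  #               "distanceFunction": "cosine",  # recommended for normalized embeddings
  #           }
  #       ]
  #   }
  #
  #   container = database.create_container(
  #       id="documents",
  #       partition_key=PartitionKey(path="/category"),
  #       vector_embedding_policy=vector_embedding_policy,
  #   )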
  - content: "An AI application executes vector searches that return 100 results to display to users, but query performance is slow and RU consumption is high. Which change would most effectively improve performance while maintaining search quality?"
    choices:
    - content: "Force brute-force search by setting the third parameter of VectorDistance to true"
      isCorrect: false
      explanation: "Brute-force search performs exact matching but is significantly slower and more expensive than indexed search. It compares the query vector against every document, which increases both latency and RU consumption. This would make performance worse, not better."
    - content: "Remove the ORDER BY clause from the query"
      isCorrect: false
      explanation: "Removing the ORDER BY clause breaks the vector search functionality. The ORDER BY VectorDistance clause is essential for ranking results by similarity. Without it, results would be returned in arbitrary order rather than by relevance."
    - content: "Reduce the TOP N clause to return only 10-20 results"
      isCorrect: true
      explanation: "Requesting fewer results significantly improves query performance by reducing the amount of data processed and returned. For user-facing search, 10-20 results with pagination typically provides a good user experience. Users rarely need 100 results displayed at once."
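  # A minimal sketch of the tuned query from the correct answer above, again using the
  # azure-cosmos Python SDK. The container, property names, and query_embedding variable
  # are assumptions for illustration.
  #
  #   results = container.query_items(
  #       query=(
  #           "SELECT TOP 10 c.id, c.title, "
  #           "VectorDistance(c.embedding, @queryVector) AS score "
  #           "FROM c ORDER BY VectorDistance(c.embedding, @queryVector)"
  #       ),
  #       parameters=[{"name": "@queryVector", "value": query_embedding}],
  #       enable_cross_partition_query=True,
  #   )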
  - content: "A support knowledge base application needs to find documents similar to a user's query but only within a specific product category. Which query structure efficiently combines vector search with metadata filtering?"
    choices:
    - content: "Execute a cross-partition vector search first, then filter results by category in application code"
      isCorrect: false
      explanation: "Filtering in application code after a cross-partition search is inefficient. You pay RU costs for retrieving documents from all partitions, then discard most results. Pre-filtering at the database level reduces the search space and improves efficiency."
    - content: "Include the category filter in the WHERE clause and specify the partition_key parameter when the category is the partition key"
      isCorrect: true
      explanation: "Combining the filter in the WHERE clause with the partition_key parameter enables single-partition routing. This significantly reduces RU consumption by targeting only the relevant partition instead of scanning all partitions, while the WHERE clause restricts results to the specified category."
    - content: "Create a separate container for each product category and execute vector searches against the appropriate container"
      isCorrect: false
      explanation: "Creating separate containers for each category adds management complexity and doesn't leverage Cosmos DB's partitioning capabilities. Using partition keys with filtered queries provides the same performance benefits while maintaining a simpler architecture."
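  # A minimal sketch of the filtered, single-partition query from the correct answer above,
  # assuming /category is the container's partition key and "printers" is a sample category;
  # all names are illustrative.
  #
  #   results = container.query_items(
  #       query=(
  #           "SELECT TOP 10 c.id, c.title, "
  #           "VectorDistance(c.embedding, @queryVector) AS score "
  #           "FROM c WHERE c.category = @category "
  #           "ORDER BY VectorDistance(c.embedding, @queryVector)"
  #       ),
  #       parameters=[
  #           {"name": "@queryVector", "value": query_embedding},
  #           {"name": "@category", "value": "printers"},
  #       ],
  #       partition_key="printers",  # routes the query to a single partition
  #   )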
  - content: "A development team wants to implement search that combines semantic understanding with exact keyword matching for technical terms and error codes. Which approach enables this hybrid search capability?"
    choices:
    - content: "Use ORDER BY RANK RRF with VectorDistance and FullTextScore functions"
      isCorrect: true
      explanation: "The RRF (Reciprocal Rank Fusion) function merges rankings from VectorDistance (semantic similarity) and FullTextScore (keyword matching) into a unified result set. Documents that rank highly in both approaches appear at the top, providing results that match both semantic meaning and specific keywords."
    - content: "Execute two separate queries and merge results in application code"
      isCorrect: false
      explanation: "While possible, merging results in application code requires multiple round trips to the database and custom ranking logic. The RRF function handles this efficiently within a single query, reducing complexity and latency."
    - content: "Add keyword terms to the query vector before executing vector search"
      isCorrect: false
      explanation: "Embedding models don't work by appending keywords. The embedding API converts text to a vector representation of semantic meaning. To combine semantic and keyword search, you need both VectorDistance for similarity and FullTextScore for exact matches, merged using RRF."
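  # A minimal sketch of the hybrid query from the correct answer above. It assumes a
  # full-text policy and index exist on c.content; property names and the keyword are
  # illustrative, and the exact FullTextScore argument form can differ between preview
  # API versions.
  #
  #   results = container.query_items(
  #       query=(
  #           "SELECT TOP 10 c.id, c.title FROM c "
  #           "ORDER BY RANK RRF("
  #           "VectorDistance(c.embedding, @queryVector), "
  #           "FullTextScore(c.content, 'timeout'))"
  #       ),
  #       parameters=[{"name": "@queryVector", "value": query_embedding}],
  #       enable_cross_partition_query=True,
  #   )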
  - content: "An AI application needs to keep vector embeddings synchronized with document content as documents are updated. Which approach provides reliable, automatic embedding refresh without polling?"
    choices:
    - content: "Schedule a batch job to periodically scan all documents and regenerate embeddings"
      isCorrect: false
      explanation: "Periodic batch scanning is inefficient and introduces delays between document changes and embedding updates. It requires comparing all documents to detect changes and doesn't provide real-time synchronization. The change feed provides immediate notification of changes."
    - content: "Store a timestamp with each document and query for recently modified documents"
      isCorrect: false
      explanation: "Timestamp-based queries require polling and don't guarantee you catch all changes, especially during high-volume periods. The change feed provides a reliable, ordered log of all changes without the complexity of timestamp-based tracking."
    - content: "Use an Azure Functions Cosmos DB trigger to detect changes and regenerate embeddings"
      isCorrect: true
      explanation: "Azure Functions with a Cosmos DB trigger provides push-based change feed processing. Changes are automatically delivered to your function, which can regenerate embeddings for modified documents. The trigger handles partition management and checkpointing automatically."
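  # A minimal sketch of the change feed approach from the correct answer above, using the
  # Azure Functions Python v2 programming model with the 4.x Cosmos DB trigger binding.
  # The connection setting, database, container, and lease names are assumptions, and the
  # embedding call itself is omitted.
  #
  #   import azure.functions as func
  #
  #   app = func.FunctionApp()
  #
  #   @app.cosmos_db_trigger(
  #       arg_name="documents",
  #       database_name="knowledge-base",
  #       container_name="documents",
  #       connection="CosmosDBConnection",
  #       lease_container_name="leases",
  #       create_lease_container_if_not_exists=True,
  #   )
  #   def refresh_embeddings(documents: func.DocumentList):
  #       for doc in documents:
  #           # Regenerate the embedding for the changed document here and
  #           # upsert the document with its refreshed vector.
  #           ...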