Commit ae8d32d

Merge pull request #53294 from MicrosoftDocs/NEW-optimize-search-azure-database-postgresql
New module optimize search azure database postgresql from release branch
2 parents ba53d29 + 0bcf49a commit ae8d32d

19 files changed

Lines changed: 849 additions & 0 deletions
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.optimize-vector-search-azure-database-postgresql.introduction
+title: Introduction
+metadata:
+  title: Introduction
+  description: Introduction
+  ms.date: 01/26/2026
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 3
+content: |
+  [!include[](includes/1-introduction.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.optimize-vector-search-azure-database-postgresql.tune-postgresql-pgvector
+title: Tune PostgreSQL for pgvector
+metadata:
+  title: Tune PostgreSQL for pgvector
+  description: Tune PostgreSQL for pgvector
+  ms.date: 01/26/2026
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 15
+content: |
+  [!include[](includes/2-tune-postgresql-pgvector.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.optimize-vector-search-azure-database-postgresql.choose-configure-vector-indexes
+title: Choose and configure vector indexes
+metadata:
+  title: Choose and Configure Vector Indexes
+  description: Choose and configure vector indexes
+  ms.date: 01/26/2026
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 15
+content: |
+  [!include[](includes/3-choose-configure-vector-indexes.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.optimize-vector-search-azure-database-postgresql.optimize-data-layout
+title: Optimize data layout
+metadata:
+  title: Optimize Data Layout
+  description: Optimize data layout
+  ms.date: 01/26/2026
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 14
+content: |
+  [!include[](includes/4-optimize-data-layout.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.optimize-vector-search-azure-database-postgresql.scale-high-volume-workloads
+title: Scale for high-volume workloads
+metadata:
+  title: Scale for High-Volume Workloads
+  description: Scale for high-volume workloads
+  ms.date: 01/26/2026
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 12
+content: |
+  [!include[](includes/5-scale-high-volume-workloads.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.optimize-vector-search-azure-database-postgresql.connection-optimization
+title: Connection optimization
+metadata:
+  title: Connection Optimization
+  description: Connection optimization
+  ms.date: 01/26/2026
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 15
+content: |
+  [!include[](includes/6-connection-optimization.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.optimize-vector-search-azure-database-postgresql.exercise-optimize-vector-search
+title: Exercise - Optimize vector search performance in Azure Database for PostgreSQL
+metadata:
+  title: Exercise - Optimize Vector Search Performance in Azure Database for PostgreSQL
+  description: Exercise - Optimize vector search performance in Azure Database for PostgreSQL
+  ms.date: 01/26/2026
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 30
+content: |
+  [!include[](includes/7-exercise-optimize-vector-search.md)]
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.optimize-vector-search-azure-database-postgresql.knowledge-check
+title: Module assessment
+metadata:
+  title: Module Assessment
+  description: Module assessment
+  ms.date: 01/26/2026
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+azureSandbox: false
+durationInMinutes: 5
+content: "Choose the best response for each of the following questions."
+quiz:
+  questions:
+  - content: "You're tuning PostgreSQL for a vector search workload with 2 million 1536-dimensional embeddings. Queries are slow and you observe a cache hit ratio of 85%. Which configuration change should you prioritize?"
+    choices:
+    - content: "Increase `shared_buffers` to keep more data in the PostgreSQL cache"
+      isCorrect: true
+      explanation: "Correct. A cache hit ratio of 85% indicates that 15% of data page reads require disk access. Increasing `shared_buffers` allows PostgreSQL to cache more vector data and indexes in memory, reducing disk I/O and improving query latency for vector workloads."
+    - content: "Decrease `random_page_cost` to encourage more index scans"
+      isCorrect: false
+      explanation: "Incorrect. While lowering `random_page_cost` encourages index usage, it doesn't address the root cause of the problem. The 85% cache hit ratio indicates insufficient memory caching, not incorrect query planning. The planner settings won't help if the data isn't in cache."
+    - content: "Increase `ivfflat.probes` to search more index partitions"
+      isCorrect: false
+      explanation: "Incorrect. Increasing `ivfflat.probes` improves recall by searching more partitions, but it doesn't address the caching issue. The low cache hit ratio suggests data is being read from disk repeatedly, which is the primary performance bottleneck."
+  - content: "You need to create a vector index for a dataset of 5 million product embeddings that receives frequent batch updates (daily full refresh). Build time must be under 30 minutes. Which index configuration should you choose?"
+    choices:
+    - content: "IVFFlat with lists set to sqrt(rows)"
+      isCorrect: true
+      explanation: "Correct. IVFFlat indexes build faster than HNSW indexes, making them practical for datasets that require frequent rebuilding. For 5 million rows, setting lists to approximately 2,200-2,500 provides good partitioning while keeping build time manageable."
+    - content: "HNSW with m=16 and ef_construction=64"
+      isCorrect: false
+      explanation: "Incorrect. HNSW indexes provide excellent query performance but have long build times that increase more than linearly with data size. For 5 million vectors, an HNSW index typically takes 2-6 hours to build, far exceeding the 30-minute requirement."
+    - content: "HNSW with m=8 and ef_construction=32"
+      isCorrect: false
+      explanation: "Incorrect. Even with reduced parameters, HNSW build times for 5 million vectors would likely exceed the 30-minute target. Lower parameter values also reduce graph quality, potentially defeating the purpose of using HNSW over IVFFlat."
+  - content: "Your filtered vector search query filters by `category_id` and then orders by vector similarity. The query plan shows a sequential scan on the products table. What should you check first?"
+    choices:
+    - content: "Verify that a B-tree index exists on the `category_id` column"
+      isCorrect: true
+      explanation: "Correct. Without a B-tree index on `category_id`, PostgreSQL can't efficiently filter rows before applying vector similarity. Creating the index allows the planner to reduce the candidate set before vector operations, significantly improving performance for filtered queries."
+    - content: "Verify that the vector index uses the same operator class as the query"
+      isCorrect: false
+      explanation: "Incorrect. While operator class matching is important for vector index usage, the sequential scan in this case is likely due to the metadata filter. If there's no efficient way to filter by `category_id`, PostgreSQL might scan the entire table regardless of vector index configuration."
+    - content: "Increase `hnsw.ef_search` to expand the search space"
+      isCorrect: false
+      explanation: "Incorrect. The `hnsw.ef_search` parameter affects recall and speed for HNSW index searches, but it doesn't address why a sequential scan is being used. The problem is that metadata filtering isn't being handled efficiently, which requires a B-tree index on the filter column."
+  - content: "You're implementing connection management for an AI application that makes 500 vector queries per second during peak traffic. Your Azure Database for PostgreSQL instance supports 1,719 max connections. Which approach should you use?"
+    choices:
+    - content: "Enable PgBouncer in transaction mode with a pool size appropriate for your application instances"
+      isCorrect: true
+      explanation: "Correct. PgBouncer in transaction mode multiplexes many client connections across fewer database connections. Clients hold server connections only during transactions, allowing hundreds of concurrent requests to share a smaller pool. This prevents hitting connection limits while maintaining throughput."
+    - content: "Create a new database connection for each query request"
+      isCorrect: false
+      explanation: "Incorrect. Creating connections per request incurs 50-200 ms overhead for TCP handshake, TLS negotiation, and authentication on each query. At 500 queries per second, this approach would also quickly exhaust the 1,719 connection limit, causing connection failures."
+    - content: "Enable PgBouncer in session mode to maintain persistent connections"
+      isCorrect: false
+      explanation: "Incorrect. Session mode holds database connections for the entire client session, providing minimal connection reduction. For high-throughput workloads with many concurrent requests, session mode doesn't provide the connection multiplexing benefits needed to stay within limits."
+  - content: "You're scaling a recommendation engine that currently runs on a General Purpose 8 vCore instance. CPU utilization averages 75% and P95 query latency is 150 ms, but you need to achieve sub-50 ms latency. What scaling approach should you try first?"
+    choices:
+    - content: "Upgrade to a Memory Optimized tier with more vCores"
+      isCorrect: true
+      explanation: "Correct. High CPU utilization (75%) combined with slow queries suggests compute constraints. Memory Optimized tiers provide more memory per vCore, which helps keep vector indexes cached. More vCores enable parallel query execution. Vertical scaling is simpler to implement and should be tried before adding architectural complexity."
+    - content: "Add read replicas to distribute query load"
+      isCorrect: false
+      explanation: "Incorrect. Read replicas help when total query volume exceeds what one server can handle, but they don't improve single-query latency. Since the problem is that individual queries take 150 ms, adding replicas won't reduce that time. You first need to make single queries faster."
+    - content: "Implement application-level caching with Azure Cache for Redis"
+      isCorrect: false
+      explanation: "Incorrect. While caching can reduce database load for repeated queries, recommendation queries typically have high cardinality (many unique query vectors), making cache hit rates low. The primary issue is single-query performance, which requires database-level optimization before considering caching."
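
The arithmetic behind two of the assessment explanations above (the IVFFlat `lists` value for 5 million rows and the 85% cache hit ratio) can be sketched in Python. This is an illustration only and not part of the commit. The function names `suggested_ivfflat_lists` and `cache_hit_ratio` are hypothetical, the `rows / 1000` branch for tables of up to one million rows is an assumption taken from general pgvector sizing guidance rather than from this module, and the block counts are made-up inputs.

```python
import math


def suggested_ivfflat_lists(rows: int) -> int:
    """Return a starting value for the IVFFlat `lists` parameter.

    Assumption (not from the commit): rows / 1000 for tables of up to
    one million rows, sqrt(rows) for larger tables.
    """
    if rows <= 0:
        raise ValueError("rows must be positive")
    if rows <= 1_000_000:
        return max(1, rows // 1000)
    return math.isqrt(rows)  # integer floor of the square root


def cache_hit_ratio(blocks_hit: int, blocks_read: int) -> float:
    """Fraction of block requests served from cache rather than disk."""
    total = blocks_hit + blocks_read
    if total == 0:
        return 0.0  # no activity yet, so there is nothing to measure
    return blocks_hit / total


if __name__ == "__main__":
    # 5 million rows, as in the index-selection question.
    print(suggested_ivfflat_lists(5_000_000))
    # Hypothetical counters that produce the 85% ratio from the first question.
    print(cache_hit_ratio(850, 150))
```

For 5 million rows the helper returns 2236, which falls inside the 2,200-2,500 range the explanation cites. In a real database, the two counters would typically come from the `blks_hit` and `blks_read` columns of `pg_stat_database`.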
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.optimize-vector-search-azure-database-postgresql.summary
+title: Summary
+metadata:
+  title: Summary
+  description: Summary
+  ms.date: 01/26/2026
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 2
+content: |
+  [!include[](includes/9-summary.md)]
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+AI applications require fast, reliable vector search to power features like semantic retrieval, recommendation engines, and RAG pipelines. Poorly tuned databases create latency bottlenecks that degrade user experience and limit throughput. This module guides you through optimizing Azure Database for PostgreSQL and pgvector to achieve the performance your AI solutions demand.
+
+Imagine you're a developer building a product recommendation engine for an e-commerce platform. The system uses vector embeddings to find similar products based on user behavior, product descriptions, and visual features. When users browse the site, recommendations must appear in under 100 milliseconds to avoid disrupting the shopping experience. During flash sales and holiday peaks, the platform handles tens of thousands of concurrent users requesting personalized recommendations.
+
+Your initial deployment performs well with a catalog of 50,000 products, but as the inventory grows to two million items and traffic spikes during promotions, query latency climbs from 30 milliseconds to over one second. Conversion rates drop as users abandon slow-loading pages. You need to tune the database, select the right vector index, and scale the infrastructure to deliver fast recommendations without overspending on compute resources.
+
+This scenario represents challenges common across AI applications: vector search performance degrades as data grows, concurrent users strain connection limits, and the trade-off between accuracy and speed becomes critical. The techniques you learn in this module apply whether you're building recommendation systems, semantic search, RAG pipelines, or other vector-powered features.
+
+After completing this module, you'll be able to:
+
+- Tune PostgreSQL and pgvector configuration parameters to optimize query latency and memory usage for AI workloads
+- Select and configure the appropriate vector index type based on dataset size, query patterns, and accuracy requirements
+- Design data layouts that optimize vector storage and metadata filtering performance
+- Scale Azure Database for PostgreSQL to handle high-volume vector workloads
+- Implement connection pooling and session management strategies for AI applications