Add collation tests #225

Open
danielfrankcom wants to merge 4 commits into documentdb:main from danielfrankcom:pr/collation

Conversation

@danielfrankcom
Collaborator

This change adds tests for collation across commands/features. The intent is to cover semantic behavior of collation in general, so these tests cover a number of different features that could potentially be affected by collation.

While working on this change I found some gaps in the collation testing for the aggregate command, which I will address in #221. The split is that the command tests cover the syntax and accepted options of the collation configuration, whereas the dedicated collation tests focus on behavior.
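
As a rough illustration of that split, here is a sketch only: the class names, fields, and sample documents are hypothetical and not taken from this PR's framework, and the expected values follow MongoDB's documented collation semantics (`locale` is mandatory, strength 2 ignores case, strength 3 is the default).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntaxCase:
    """Command-test style: is this collation document accepted at all?"""
    collation: dict
    accepted: bool


@dataclass(frozen=True)
class BehaviorCase:
    """Collation-test style: which documents match under this collation?"""
    collation: dict
    query: dict
    expected_ids: tuple


# Hypothetical sample data the behavior cases would run against.
DOCS = ({"_id": 1, "name": "alice"}, {"_id": 2, "name": "ALICE"})

SYNTAX_CASES = [
    SyntaxCase({"locale": "en"}, accepted=True),
    SyntaxCase({"strength": 2}, accepted=False),  # locale is mandatory
]

BEHAVIOR_CASES = [
    # Strength 2 ignores case differences, so both documents should match.
    BehaviorCase({"locale": "en", "strength": 2}, {"name": "alice"}, (1, 2)),
    # Strength 3 (the default) distinguishes case.
    BehaviorCase({"locale": "en", "strength": 3}, {"name": "alice"}, (1,)),
]
```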

There is some overlap here with #191 as some of the framework changes are needed by both. I've included them here so they should merge cleanly.

Signed-off-by: Daniel Frankcom <[email protected]>
@danielfrankcom danielfrankcom requested a review from a team as a code owner May 21, 2026 23:51
@documentdb-triage-tool documentdb-triage-tool Bot added the compatibility test and enhancement labels May 21, 2026
@documentdb-triage-tool

🤖 Auto-triaged by documentdb-triage-tool.

Applied: compatibility test, enhancement
Project fields suggested: Component test-coverage · Priority P2 · Effort XL · Status Needs Review
Confidence: 0.80 (mixed)

Reasoning

component from path globs (test-coverage, test-framework); effort from diff stats (14234+21 LOC, 62 files); LLM: Adds new collation behavior tests across multiple commands/features, touching test coverage and some framework changes across multiple files.

If a label is wrong, remove it manually and ping @patty-chow so the rules can be tuned. The bot will not re-label items that already have component labels.

@@ -0,0 +1,356 @@
"""Tests for collation effects on $top, $bottom, $topN, $bottomN, $minN, $maxN accumulators."""

To make it clearer what each test does and to ensure test coverage, can we consider a high-level folder structure change?
For example:

collation/
  ├── utils/
  │   ├── __init__.py
  │   └── collation_view_mismatch.py
  │
  ├── options/
  │   ├── test_collation_locale.py              
  │   ├── ...e.g. strength, caseLevel, caseFirst, numericOrdering, alternate, maxVariable, backwards, normalization
  │   └── test_collation_error_cases.py         # invalid collation docs: missing locale, bad types, unknown fields
  │       # NOTE: each option test uses $match or find to verify behavior
  │
  ├── command_level/
  │   │   # 1-2 tests per operation/stage proving command-level collation works
  │   │   # operations
  │   ├── test_collation_find.py                
  │   ├── test_collation_find_and_modify.py
  │   ├── test_collation_count.py
  │   ...
  │   │   # stages
  │   ├── test_collation_aggregate_match.py     
  │   ├── test_collation_aggregate_sort.py      
  │   ...
  │   └── test_collation_aggregate_graphlookup.py 
  │
  ├── collection_level/
  │   ├── test_collation_collection_default.py  
  │   ├── test_collation_views.py              
  │   ...
  │   # Note: Collation on view should not impact the underlying collection
  │   └── test_collation_view_from_view.py         
  │
  ├── index_level/
  │   ├── test_collation_index.py               # create index with collation, query uses it when collation matches
  │   ...
  │   └── test_collation_index_not_used.py      # query with different collation → COLLSCAN
  │
  └── resolution/
      └── test_collation_resolution.py          # three-way precedence: command-level vs. collection-level vs. index-level collation
          # NOTE: using a single operation as the example in each test should be enough
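
The precedence that `resolution/` would exercise can be sketched as a small pure-Python model. This is an assumption-laden illustration, not code from the PR: the function names are hypothetical, the rules follow MongoDB's documented behavior (a command-level collation overrides the collection default, and an index serves string comparisons only when its collation matches the effective one), and collation documents are compared as plain dicts without the default expansion a server performs.

```python
from typing import Optional

SIMPLE = {"locale": "simple"}


def effective_collation(command: Optional[dict],
                        collection_default: Optional[dict]) -> dict:
    """Return the collation an operation runs with.

    Command-level collation wins, then the collection default, then simple.
    """
    if command is not None:
        return command
    if collection_default is not None:
        return collection_default
    return SIMPLE


def index_usable_for_strings(index_collation: Optional[dict],
                             effective: dict) -> bool:
    """Report whether an index can serve string comparisons.

    index_collation is the collation stored on the index; None means simple.
    (An index created without a collation inherits the collection default,
    so the caller is expected to pass that inherited value.)
    """
    return (index_collation or SIMPLE) == effective


en_ci = {"locale": "en", "strength": 2}
fr = {"locale": "fr"}

# Collection default is used when the command specifies nothing.
assert effective_collation(None, fr) == fr
# A command-level collation overrides the collection default,
# so an index built with the default no longer matches.
assert not index_usable_for_strings(fr, effective_collation(en_ci, fr))
```

One test per row of this model (command set, default only, neither; index matching or not) would cover the three-way precedence with a single operation.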

Collaborator Author

Discussed offline. I have pushed a change that adopts a structure based on this suggestion.

@danielfrankcom danielfrankcom requested a review from yshanhu May 26, 2026 22:18
# created with a collation other than simple; creating one on a collection
# with a non-simple default collation requires specifying
# collation {locale: "simple"} on the index.
COLLATION_TEXT_INDEX_TESTS: list[CommandTestCase] = [
Collaborator

This should be moved to /index_level/
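
For context, the constraint described in the comment above the test list can be sketched as the raw `createIndexes` command such a test would send. The helper name, collection name, and field name are hypothetical; only the command shape and the `{locale: "simple"}` requirement come from MongoDB's documented behavior.

```python
def text_index_command(collection: str, field: str) -> dict:
    """Build a createIndexes command for a text index on `field`.

    The explicit simple collation is what allows the index to be created on
    a collection whose default collation is not simple.
    """
    return {
        "createIndexes": collection,
        "indexes": [
            {
                "key": {field: "text"},
                "name": f"{field}_text",
                "collation": {"locale": "simple"},
            }
        ],
    }


# Hypothetical collection and field names, for illustration only.
cmd = text_index_command("articles", "body")
```

With pymongo, a document like this could be passed to `Database.command`; the sketch stops at building it so that it runs without a server.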

# Property [Capped Collection Collation]: a capped collection can be created
# with a default collation, and collation affects filter matching and sort
# ordering on capped collections the same as regular collections.
COLLATION_CAPPED_TESTS: list[CommandTestCase] = [
Collaborator

Move to /collection_level
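
A minimal sketch of the collection-level setup this property relies on, assuming the standard `create` command shape. The helper name and the sample values are hypothetical, not taken from the PR.

```python
def capped_create_command(collection: str, size_bytes: int,
                          collation: dict) -> dict:
    """Build a create command for a capped collection with a default collation."""
    return {
        "create": collection,
        "capped": True,
        "size": size_bytes,  # capped collections require a maximum size
        "collation": collation,
    }


# Hypothetical values: a 4 KiB capped collection with case-insensitive English.
capped_cmd = capped_create_command("events", 4096, {"locale": "en", "strength": 2})
```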

@@ -0,0 +1,542 @@
"""Tests for collation behavior with indexes."""
Collaborator

Tests for 2dsphere and 2d indexes are missing.

@@ -0,0 +1,332 @@
"""Tests for collation constraints on views and cross-view stage behavior."""

Collaborator

Time series and clustered collections are not covered.
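
As a starting point, the setup for those two collection types might look like the following. This is a hedged sketch using MongoDB's `create` command options (`timeseries`, `clusteredIndex`); the helper names and field names are hypothetical, and whether the target engine accepts these commands is exactly what such tests would establish.

```python
def timeseries_create_command(collection: str, collation: dict) -> dict:
    """Build a create command for a time series collection with a default collation."""
    return {
        "create": collection,
        "timeseries": {"timeField": "ts", "metaField": "meta"},
        "collation": collation,
    }


def clustered_create_command(collection: str, collation: dict) -> dict:
    """Build a create command for a clustered collection with a default collation."""
    return {
        "create": collection,
        "clusteredIndex": {"key": {"_id": 1}, "unique": True},
        "collation": collation,
    }


# Hypothetical collation used for both collection types.
ci = {"locale": "en", "strength": 2}
ts_cmd = timeseries_create_command("readings", ci)
clustered_cmd = clustered_create_command("orders", ci)
```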

@@ -0,0 +1,113 @@
"""Tests for collation effects on positional $ and $elemMatch projection."""
Collaborator

Why not put this under operations/test_operations_projection_ops.py?

from documentdb_tests.framework.parametrize import pytest_params

# Property [Dotted Path Filter Matching]: collation affects equality and
# comparison operators on dotted field paths in find and aggregate $match,
Collaborator

I don't think we need this file. Collation affects every string comparison, so deep field paths are implied. One test is enough.

@@ -0,0 +1,166 @@
"""Tests for collation interaction with command-level let variables."""
Collaborator

$let is an expression operator, so this belongs in test_stages_expression.py, and the update usage is not needed. We only need to test one level down: $let + collation. If we test both find and update, we are testing $let + collation + command.
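
A single-command case along those lines could look like the sketch below. It uses the command-level `let` option that the file's docstring refers to, on `find` only. The helper name, the `name` field, and the `target` variable are hypothetical. Under a case-insensitive collation, the `$expr` comparison would be expected to match regardless of case, though that expectation is for the test itself to confirm.

```python
def find_with_let(collection: str, target: str, collation: dict) -> dict:
    """Build a find command that compares a field against a let variable."""
    return {
        "find": collection,
        # $$target refers to the variable defined in the command-level let.
        "filter": {"$expr": {"$eq": ["$name", "$$target"]}},
        "let": {"target": target},
        "collation": collation,
    }


# Hypothetical collection name and values, for illustration only.
let_cmd = find_with_let("people", "ALICE", {"locale": "en", "strength": 2})
```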

Labels

compatibility test, enhancement

3 participants