Skip to content

fix: treat Compression.NONE sentinel as uncompressed in COPY INTO formats#1

Merged
simozzy merged 1 commit into
mainfrom
fix/compression-none-sentinel
Jun 19, 2026
Merged

fix: treat Compression.NONE sentinel as uncompressed in COPY INTO formats#1
simozzy merged 1 commit into
mainfrom
fix/compression-none-sentinel

Conversation

@simozzy

@simozzy simozzy commented Jun 18, 2026

Copy link
Copy Markdown
Member

Problem

Compression.NONE is a truthy Enum member, so the if compression: guard
in CSVFormat, TSVFormat, NDJSONFormat, and ParquetFormat treats the
no-compression sentinel as "a codec was specified":

if compression:                                   # Compression.NONE is truthy!
    if compression not in [Compression.ZSTD, Compression.SNAPPY]:
        raise TypeError('Compression should be None, ZStd, or Snappy.')
    self.options["COMPRESSION"] = compression.value
  • ParquetFormat(compression=Compression.NONE) raises TypeError: Compression should be None, ZStd, or Snappy. — even though NONE is the "no compression" case.
  • The CSV/TSV/NDJSON classes don't raise, but emit a spurious COMPRESSION = NONE option for the sentinel.

Fix

Skip the block for both None and the Compression.NONE sentinel, so the
uncompressed case renders no COMPRESSION option:

if compression and compression is not Compression.NONE:
    ...

Applied to all four format classes. Explicit codecs are unchanged, and
ParquetFormat still rejects codecs its writer can't use (gzip/bz2/zip).

Tests

tests/test_copy_format.py — constructor-level (no DB):

  • None and Compression.NONE emit no COMPRESSION option, across all four classes.
  • Explicit codecs are still emitted where valid.
  • ParquetFormat still raises TypeError for unsupported codecs.

Context / downstream

Found in PlaidCloud's data-export path: every Parquet export with the default
(no) compression failed with this TypeError. Tracked there as:

  • PlaidCloud/plaid#6305 (story sc-22680) — currently carries a call-site
    workaround; this is the root-cause fix.
  • Discovered while building the AI export-step sub-agent (story sc-22516).

Note: this fork's main is at 0.5.1, while PlaidCloud pins PyPI
databend-sqlalchemy==0.5.5. To consume this fix, plaid will either re-pin at
the fork or this should be upstreamed to databendlabs/databend-sqlalchemy.

🤖 Generated with Claude Code

@rad-pat rad-pat left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok apart from the test which does not follow the Sqlalchemy testing suite pattern

@simozzy simozzy marked this pull request as draft June 18, 2026 21:15
@simozzy simozzy marked this pull request as ready for review June 18, 2026 21:16
@simozzy simozzy force-pushed the fix/compression-none-sentinel branch from 3f5695b to ada0c50 Compare June 18, 2026 21:48
@simozzy

simozzy commented Jun 18, 2026

Copy link
Copy Markdown
Member Author

@rad-pat Reworked the test to follow the SQLAlchemy testing-suite pattern — tests/test_copy_format.py is now a fixtures.TestBase + AssertsCompiledSQL class (__only_on__ = "databend") asserting the rendered FILE_FORMAT = (...) via self.assert_compile(...), mirroring test_copy_into.py, with assert_raises_message for the ParquetFormat rejection. The old direct .options-dict pytest checks are gone. Verified the compiled output offline against the databend dialect.

…mats

Compression.NONE is a truthy Enum member, so the `if compression:` guard
in CSVFormat/TSVFormat/NDJSONFormat/ParquetFormat treated the
no-compression sentinel as 'a codec was specified'. For ParquetFormat this
raised TypeError('Compression should be None, ZStd, or Snappy.') outright;
for the others it emitted a spurious COMPRESSION = NONE option.

Guard now skips both None and Compression.NONE, so the uncompressed case
renders no COMPRESSION option. Explicit codecs are unchanged, and
ParquetFormat still rejects codecs its writer can't use (gzip/bz2/zip).

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
@simozzy simozzy force-pushed the fix/compression-none-sentinel branch from ada0c50 to 50e7206 Compare June 18, 2026 21:56
@simozzy

simozzy commented Jun 18, 2026

Copy link
Copy Markdown
Member Author

@rad-pat ticket logged with databend/sqlaclhemy databendlabs#75

@simozzy simozzy merged commit 7aa3e96 into main Jun 19, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants