fix: treat Compression.NONE sentinel as uncompressed in COPY INTO formats #75
simozzy wants to merge 1 commit into
…mats
Compression.NONE is a truthy Enum member, so the `if compression:` guard
in CSVFormat/TSVFormat/NDJSONFormat/ParquetFormat treated the
no-compression sentinel as 'a codec was specified'. For ParquetFormat this
raised TypeError('Compression should be None, ZStd, or Snappy.') outright;
for the others it emitted a spurious COMPRESSION = NONE option.
Guard now skips both None and Compression.NONE, so the uncompressed case
renders no COMPRESSION option. Explicit codecs are unchanged, and
ParquetFormat still rejects codecs its writer can't use (gzip/bz2/zip).
Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
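The truthiness pitfall described in the commit message can be reproduced with a minimal sketch. The `Compression` enum below is a hypothetical stand-in (its member values are assumed), and `render_options_old` / `render_options_new` are illustrative helpers, not functions from the repository.

```python
from enum import Enum


class Compression(Enum):
    # Hypothetical stand-in for the library's enum; member values are assumed.
    NONE = "NONE"
    ZSTD = "ZSTD"
    SNAPPY = "SNAPPY"


def render_options_old(compression=None):
    """Old guard: every plain Enum member is truthy, so NONE passes the test."""
    options = {}
    if compression:
        options["COMPRESSION"] = compression.value
    return options


def render_options_new(compression=None):
    """New guard: skip both None and the Compression.NONE sentinel."""
    options = {}
    if compression and compression is not Compression.NONE:
        options["COMPRESSION"] = compression.value
    return options


if __name__ == "__main__":
    print(bool(Compression.NONE))
    print(render_options_old(Compression.NONE))
    print(render_options_new(Compression.NONE))
```

Members of a plain `Enum` (one without a mixed-in type or a custom `__bool__`) always evaluate as true, which is why an identity check against the sentinel is needed in addition to the truthiness test.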
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e9239985e4
  )
  self.options["MISSING_FIELD_AS"] = f"{missing_field_as}"
- if compression:
+ if compression and compression is not Compression.NONE:
Reject NONE for Parquet instead of using defaults
When callers pass ParquetFormat(compression=Compression.NONE), this condition now skips validation and emits no COMPRESSION option. For Parquet, Databend only supports ZSTD/SNAPPY for this option and the omitted-option default is ZSTD, so a user asking for no compression silently gets ZSTD-compressed output instead of the previous clear rejection. Keep Compression.NONE rejected for Parquet rather than treating it like None.
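One way to read the reviewer's suggestion is sketched below: only a missing argument (`None`) omits the option, while `Compression.NONE` goes through validation like any other codec and is rejected for Parquet. This is an illustration under assumptions, not code from the PR; `Compression` and `parquet_options` are hypothetical stand-ins.

```python
from enum import Enum


class Compression(Enum):
    # Hypothetical stand-in for the library's enum; member values are assumed.
    NONE = "NONE"
    GZIP = "GZIP"
    ZSTD = "ZSTD"
    SNAPPY = "SNAPPY"


# Codecs the review comment says Databend accepts for Parquet.
PARQUET_CODECS = (Compression.ZSTD, Compression.SNAPPY)


def parquet_options(compression=None):
    """Hypothetical helper: omit the option only when nothing was passed."""
    options = {}
    if compression is None:
        # No argument given, so the server-side default applies.
        return options
    if compression not in PARQUET_CODECS:
        # Compression.NONE lands here and is rejected explicitly.
        raise TypeError("Compression should be None, ZStd, or Snappy.")
    options["COMPRESSION"] = compression.value
    return options
```

Under this variant a caller asking for uncompressed Parquet gets an error rather than output compressed with the default codec.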
Problem
`Compression.NONE` is a truthy `Enum` member, so the `if compression:` guard in `CSVFormat`, `TSVFormat`, `NDJSONFormat`, and `ParquetFormat` treats the no-compression sentinel as "a codec was specified":
- `ParquetFormat(compression=Compression.NONE)` raises `TypeError: Compression should be None, ZStd, or Snappy.`, even though `NONE` is the "no compression" case.
- `CSVFormat`, `TSVFormat`, and `NDJSONFormat` emit a spurious `COMPRESSION = NONE` option for the sentinel.

Fix
Skip the block for both `None` and the `Compression.NONE` sentinel, so the uncompressed case renders no `COMPRESSION` option. The guard becomes `if compression and compression is not Compression.NONE:`.
Applied to all four format classes. Explicit codecs are unchanged, and `ParquetFormat` still rejects codecs its writer can't use (gzip/bz2/zip).
The four format classes also now set `inherit_cache = False` (matching `CopyIntoTable`/`CopyIntoLocation`), which silences SQLAlchemy's "will not make use of SQL compilation caching" warning when a format is compiled. This is needed for the new compile tests to pass under the suite's warning-as-error config.

Tests
`tests/test_copy_format.py`: a `fixtures.TestBase` + `AssertsCompiledSQL` class mirroring `tests/test_copy_into.py`, asserting the rendered `FILE_FORMAT = (...)` via `assert_compile`:
- `None` and `Compression.NONE` emit no `COMPRESSION` option, across all four classes.
- `ParquetFormat` still raises `TypeError` for unsupported codecs.
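The shape of those test cases can be sketched with the standard library alone. Everything here is an assumption made for illustration: `render_file_format` is a hypothetical stand-in for compiling a format object, the `FILE_FORMAT = (TYPE = ...)` text it produces is an assumed rendering, and the real tests use SQLAlchemy's `assert_compile` rather than `unittest` string checks.

```python
import unittest
from enum import Enum


class Compression(Enum):
    # Hypothetical stand-in; the real enum lives in the dialect package.
    NONE = "NONE"
    GZIP = "GZIP"
    ZSTD = "ZSTD"
    SNAPPY = "SNAPPY"


def render_file_format(fmt_type, compression=None):
    """Hypothetical stand-in for compiling a format object to SQL text."""
    options = [f"TYPE = {fmt_type}"]
    if compression and compression is not Compression.NONE:
        if fmt_type == "PARQUET" and compression not in (
            Compression.ZSTD,
            Compression.SNAPPY,
        ):
            raise TypeError("Compression should be None, ZStd, or Snappy.")
        options.append(f"COMPRESSION = {compression.value}")
    return "FILE_FORMAT = (" + ", ".join(options) + ")"


class UncompressedFormatTest(unittest.TestCase):
    def test_no_compression_option(self):
        # None and the NONE sentinel should behave identically in every class.
        for fmt_type in ("CSV", "TSV", "NDJSON", "PARQUET"):
            for compression in (None, Compression.NONE):
                with self.subTest(fmt_type=fmt_type, compression=compression):
                    sql = render_file_format(fmt_type, compression)
                    self.assertNotIn("COMPRESSION", sql)

    def test_parquet_rejects_unsupported_codec(self):
        with self.assertRaises(TypeError):
            render_file_format("PARQUET", Compression.GZIP)


if __name__ == "__main__":
    unittest.main()
```

Looping with `subTest` keeps one failing class/sentinel combination from hiding the others, which suits a matrix of four classes by two "uncompressed" inputs.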