PackageArchiveToCsv

This driver writes ZIP archive details to CSV for each .nupkg (NuGet package) on NuGet.org. It records both archive-level information and ZIP entry (file) level information.


| | |
| --- | --- |
| `CatalogScanDriverType` enum value | `PackageArchiveToCsv` |
| Driver implementation | `PackageArchiveToCsvDriver` |
| Processing mode | process latest catalog leaf per package ID and version |
| Cursor dependencies | `PackageFileToCsv`: provides .nupkg hash in table storage<br />(transitive) `LoadPackageArchive`: needed by `PackageFileToCsv` |
| Components using driver output | Kusto ingestion via `KustoIngestionMessageProcessor`, since this driver produces CSV data |
| Temporary storage config | Table Storage:<br />`CsvRecordTableName` (name prefix): holds CSV records before they are added to a CSV blob<br />`TaskStateTableName` (name prefix): tracks completion of CSV blob aggregation |
| Persistent storage config | Blob Storage:<br />`PackageArchiveContainerName`: contains CSVs for the `PackageArchives` table<br />`PackageArchiveEntryContainerName`: contains CSVs for the `PackageArchiveEntries` table |
| Output CSV tables | `PackageArchiveEntries`<br />`PackageArchives` |

Algorithm

For each catalog leaf passed to the driver, the ZIP central directory, size, and HTTP response headers are fetched from Azure Table Storage. These are populated by the `LoadPackageArchive` driver. Hashes of the whole ZIP are also read from Table Storage. These are populated by the `PackageFileToCsv` driver (because it needs the full ZIP content).

The ZIP central directory is enumerated. A single CSV record is produced for each .nupkg and one or more CSV records are created for each entry in the ZIP file.
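
The record shape described above can be illustrated with a minimal Python sketch. It is not the driver's implementation: the real driver reads the central directory from Table Storage, whereas this sketch parses in-memory ZIP bytes with the standard `zipfile` module as a stand-in. The function names and column names (`Id`, `SequenceNumber`, `Crc32`, and so on) are hypothetical, and the sketch emits exactly one record per entry for simplicity.

```python
import csv
import io
import zipfile


def archive_to_csv_rows(package_id, version, zip_bytes):
    """Return (archive_row, entry_rows) for one package's ZIP bytes."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # infolist() is built from the ZIP central directory.
        infos = archive.infolist()

    # One archive-level record per .nupkg (hypothetical columns).
    archive_row = {
        "Id": package_id,
        "Version": version,
        "Size": len(zip_bytes),
        "EntryCount": len(infos),
    }

    # One entry-level record per central directory entry (hypothetical columns).
    entry_rows = [
        {
            "Id": package_id,
            "Version": version,
            "SequenceNumber": index,
            "Path": info.filename,
            "UncompressedSize": info.file_size,
            "CompressedSize": info.compress_size,
            "Crc32": format(info.CRC, "08x"),
        }
        for index, info in enumerate(infos)
    ]
    return archive_row, entry_rows


def to_csv(rows):
    """Serialize a list of dict rows to CSV text with a header line."""
    if not rows:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def build_sample_nupkg():
    """Build a tiny ZIP in memory that stands in for a real .nupkg."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("Example.nuspec", "<package />")
        archive.writestr("lib/net8.0/Example.dll", b"\x00" * 64)
    return buffer.getvalue()
```

Calling `archive_to_csv_rows("Example", "1.0.0", build_sample_nupkg())` yields one archive row and two entry rows. Passing each result through `to_csv` gives CSV text comparable in spirit to the `PackageArchives` and `PackageArchiveEntries` outputs, though the real tables carry more columns.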

Detailed ZIP information is included in the produced CSV records to aid in the debugging of esoteric ZIP archive issues.
