This driver writes ZIP archive details about each .nupkg (NuGet package) on NuGet.org to CSV. It records archive-level information as well as ZIP entry (file) level information.

| | |
| --- | --- |
| CatalogScanDriverType enum value | PackageArchiveToCsv |
| Driver implementation | PackageArchiveToCsvDriver |
| Processing mode | process latest catalog leaf per package ID and version |
| Cursor dependencies | PackageFileToCsv: provides .nupkg hash in table storage<br />(transitive) LoadPackageArchive: needed by PackageFileToCsv |
| Components using driver output | Kusto ingestion via KustoIngestionMessageProcessor, since this driver produces CSV data |
| Temporary storage config | Table Storage:<br />CsvRecordTableName (name prefix): holds CSV records before they are added to a CSV blob<br />TaskStateTableName (name prefix): tracks completion of CSV blob aggregation |
| Persistent storage config | Blob Storage:<br />PackageArchiveContainerName: contains CSVs for the PackageArchives table<br />PackageArchiveEntryContainerName: contains CSVs for the PackageArchiveEntries table |
| Output CSV tables | PackageArchiveEntries<br />PackageArchives |

For each catalog leaf passed to the driver, the ZIP central directory, size, and HTTP response headers are fetched from Azure Table Storage. These are populated by the LoadPackageArchive driver. Hashes of the whole ZIP are also read from table storage. These are populated by the PackageFileToCsv driver (because it needs the full ZIP content).
The driver then enumerates the ZIP central directory. It produces a single CSV record for each .nupkg (the PackageArchives table) and one or more CSV records for each entry in the ZIP file (the PackageArchiveEntries table).
Detailed ZIP information is included in the produced CSV records to aid in the debugging of esoteric ZIP archive issues.
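
The enumeration step described above can be illustrated with a minimal Python sketch. It is not the driver's implementation: the driver reads a central directory already stored in Azure Table Storage, whereas this sketch opens an in-memory ZIP with the standard `zipfile` module. The function names (`archive_to_records`, `to_csv`, `build_sample_nupkg`) and the column names are hypothetical and chosen only for illustration. The real CSV schema may differ.

```python
import csv
import io
import zipfile


def archive_to_records(package_id, version, nupkg_bytes):
    """Return (archive_record, entry_records) for one .nupkg.

    Column names here are illustrative assumptions, not the driver's schema.
    """
    with zipfile.ZipFile(io.BytesIO(nupkg_bytes)) as archive:
        # infolist() returns one ZipInfo per central directory entry, in order.
        entries = archive.infolist()

    # One archive-level record per package.
    archive_record = {
        "Id": package_id,
        "Version": version,
        "Size": len(nupkg_bytes),
        "EntryCount": len(entries),
    }

    # One entry-level record per ZIP entry, keeping low-level ZIP details
    # that help when debugging unusual archives.
    entry_records = [
        {
            "Id": package_id,
            "Version": version,
            "SequenceNumber": index,
            "Path": info.filename,
            "UncompressedSize": info.file_size,
            "CompressedSize": info.compress_size,
            "CompressionMethod": info.compress_type,
            "Crc32": format(info.CRC, "08x"),
            "LocalHeaderOffset": info.header_offset,
        }
        for index, info in enumerate(entries)
    ]
    return archive_record, entry_records


def to_csv(records):
    """Serialize a list of same-shaped dicts to CSV text."""
    if not records:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


def build_sample_nupkg():
    """Build a tiny stand-in for a .nupkg so the sketch is self-contained."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("Example.nuspec", "<package />")
        archive.writestr("lib/net8.0/Example.dll", b"\x00" * 64)
    return buffer.getvalue()


if __name__ == "__main__":
    archive_record, entry_records = archive_to_records(
        "Example", "1.0.0", build_sample_nupkg()
    )
    print(to_csv([archive_record]))
    print(to_csv(entry_records))
```

Keeping the archive-level and entry-level records as separate lists mirrors the two output tables, so each list can be written to its own CSV.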