This repository was archived by the owner on Mar 31, 2026. It is now read-only.

PackageReadmeToCsv

This driver reads package README markdown and writes the full content to CSV.

CatalogScanDriverType enum value: PackageReadmeToCsv
Driver implementation: PackageReadmeToCsvDriver
Processing mode: process latest catalog leaf per package ID and version
Cursor dependencies: LoadPackageReadme (reads README bytes from table storage)
Components using driver output: Kusto ingestion via KustoIngestionMessageProcessor, since this driver produces CSV data
Temporary storage config: Table Storage
    CsvRecordTableName (name prefix): holds CSV records before they are added to a CSV blob
    TaskStateTableName (name prefix): tracks completion of CSV blob aggregation
Persistent storage config: Blob Storage
    PackageReadmeContainerName: contains CSVs for the PackageReadmes table
Output CSV tables: PackageReadmes

Algorithm

For each catalog leaf passed to the driver, the package README bytes are fetched from Azure Table Storage (as stored by LoadPackageReadme). Not all packages have READMEs. This driver has the same caveats as the LoadPackageReadme driver in that legacy (non-embedded) README content may not match the latest content on NuGet.org.

The full README string is decoded from bytes using .NET's StreamReader (which attempts to detect the text encoding) and is added to the CSV as an unrendered markdown string.
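
The decode-then-write step can be illustrated with a minimal Python sketch. This is not the driver's code (the driver is .NET and relies on StreamReader). It assumes the encoding detection amounts to checking for a byte order mark and falling back to UTF-8, and the helper names and the three-column row layout (package ID, version, content) are hypothetical, chosen only for illustration.

```python
import codecs
import csv
import io
from typing import Optional

# Byte order marks to check. UTF-32 LE comes before UTF-16 LE because the
# UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_readme(data: bytes) -> str:
    """Decode README bytes, using a BOM if present, else UTF-8 (assumed fallback)."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            # Strip the BOM so it does not leak into the CSV field.
            return data[len(bom):].decode(encoding, errors="replace")
    return data.decode("utf-8", errors="replace")


def to_csv_row(package_id: str, version: str, readme_bytes: Optional[bytes]) -> str:
    """Build one CSV line; the column layout here is hypothetical."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Not all packages have READMEs; this sketch emits an empty field for them.
    content = decode_readme(readme_bytes) if readme_bytes is not None else ""
    # The markdown is written unrendered; csv handles quoting of newlines and quotes.
    writer.writerow([package_id, version, content])
    return buffer.getvalue()
```

Because README markdown routinely contains newlines, commas, and double quotes, the sketch leaves escaping to the CSV writer rather than concatenating strings by hand.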