This repository was archived by the owner on Mar 31, 2026. It is now read-only.

LoadSymbolPackageArchive

This driver uses MiniZip to read the ZIP directory information for each .snupkg. This information is stored in Azure Table Storage for other drivers to use. This is an optimization that reduces the amount of data downloaded from NuGet.org's APIs in later steps.

| | |
| --- | --- |
| CatalogScanDriverType enum value | LoadSymbolPackageArchive |
| Driver implementation | LoadSymbolPackageArchiveDriver |
| Processing mode | process latest catalog leaf per package ID and version |
| Cursor dependencies | V3 package content: blocks on this cursor to align with other drivers |
| Components using driver output | SymbolPackageFileToCsv: needs to know if the .snupkg exists |
| Temporary storage config | none |
| Persistent storage config | Table Storage: SymbolPackageArchiveTableName: ZIP directory bytes stored using MessagePack and WideEntityStorageService |
| Output CSV tables | none |

Algorithm

A batch of catalog leaf items is passed to the driver. For each catalog leaf, MiniZip is used to fetch just the ZIP directory data of the symbol package (.snupkg). This is done via minimal HTTP HEAD and GET Range requests so that the whole package is not downloaded. This is similar to the LoadPackageArchive driver but for the .snupkg instead of the .nupkg. No signature file is expected in the .snupkg, so none is fetched.
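The range-request idea above can be illustrated with a minimal Python sketch. This is not MiniZip's implementation: the function names and the default tail size are assumptions made for illustration, and ZIP64 archives are ignored. It shows how a Range header covering only the end of the file can be built from the length reported by a HEAD request, and how the end of central directory record in that tail reveals where the ZIP directory lives.

```python
import struct

EOCD_SIGNATURE = b"PK\x05\x06"  # end of central directory record marker
EOCD_MIN_SIZE = 22              # fixed-size part of the record, no comment


def tail_range_header(content_length, tail_size=65557):
    """Build a Range header value covering the last tail_size bytes.

    content_length would come from a HEAD request. The default tail size
    (22 bytes plus the maximum 65535-byte comment) is an assumption.
    """
    start = max(0, content_length - tail_size)
    return f"bytes={start}-{content_length - 1}"


def locate_central_directory(tail):
    """Return (offset, size, entry_count) of the ZIP central directory.

    tail is the final chunk of the archive. A comment that happens to
    contain the signature is not handled in this sketch.
    """
    pos = tail.rfind(EOCD_SIGNATURE)
    if pos < 0 or len(tail) - pos < EOCD_MIN_SIZE:
        raise ValueError("end of central directory record not found")
    # signature, disk number, directory disk, entries on disk,
    # total entries, directory size, directory offset, comment length
    fields = struct.unpack_from("<4sHHHHIIH", tail, pos)
    total_entries, cd_size, cd_offset = fields[4], fields[5], fields[6]
    return cd_offset, cd_size, total_entries
```

A second Range request for `cd_offset` through `cd_offset + cd_size - 1` would then return the directory bytes themselves without downloading any file contents.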

The ZIP directory is serialized and compressed using MessagePack and then written into Azure Table Storage as "wide entities".
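To give a rough idea of why "wide entities" are needed, the sketch below splits a compressed payload into chunks that each fit within the 64 KiB limit Azure Table Storage places on a single binary property. It is only an illustration under stated assumptions: `zlib` stands in for the MessagePack-based serialization and compression, the function names are hypothetical, and the actual layout used by WideEntityStorageService is not described here.

```python
import zlib

# Azure Table Storage allows at most 64 KiB in one binary property.
MAX_CHUNK_BYTES = 64 * 1024


def to_chunks(payload, chunk_size=MAX_CHUNK_BYTES):
    """Compress payload and split it into property-sized chunks.

    zlib is a stand-in for the real serializer and compressor.
    """
    compressed = zlib.compress(payload)
    return [
        compressed[i:i + chunk_size]
        for i in range(0, len(compressed), chunk_size)
    ]


def from_chunks(chunks):
    """Reassemble the chunks in order and decompress the payload."""
    return zlib.decompress(b"".join(chunks))
```

Each chunk would be stored as a separate property (and, for large payloads, across several entities), then concatenated in order when read back.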

The SymbolPackagesContainerBaseUrl is used to define where to fetch .snupkg files from. This cannot be automatically fetched from the V3 package content resource because there is no defined egress flow for .snupkg files on NuGet.org. The way that symbol data is served from NuGet.org is via the documented symbol server. This serves individual PDB files, not .snupkg archives.
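As an illustration of how a configured base URL might be combined with a package identity, here is a small Python sketch. The URL layout shown (lowercased ID and version joined by a dot, with a `.snupkg` extension) is an assumption and is not stated in this document; the function name and the example base URL are hypothetical.

```python
def symbol_package_url(base_url, package_id, version):
    """Build a candidate .snupkg URL under the configured base URL.

    Assumes the version string is already normalized and that the
    container uses a lowercase "{id}.{version}.snupkg" naming scheme.
    """
    lower_id = package_id.lower()
    lower_version = version.lower()
    return f"{base_url.rstrip('/')}/{lower_id}.{lower_version}.snupkg"
```

For example, `symbol_package_url("https://example.test/symbols/", "Newtonsoft.Json", "13.0.1")` yields `https://example.test/symbols/newtonsoft.json.13.0.1.snupkg`.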

This driver attempts to fetch the .snupkg from the symbol package container base URL whenever a catalog leaf is processed. This is not a perfect solution. NuGet.org supports updating symbol packages at any time (and multiple times) after the package is published. These symbol package updates don't result in catalog leaf items being produced, which means that a symbol package available on NuGet.org may not be seen by NuGet Insights. The only ways to resolve this are for NuGet.org to somehow signal a symbol package update or for NuGet Insights to periodically check the symbol package location. The latter idea is tracked by NuGet/Insights#67.