Commit a16d1ec

ZideChen0 authored and Peter Zijlstra committed
perf/x86/intel/uncore: Fix die ID init and look up bugs
In snbep_pci2phy_map_init(), in the nr_node_ids > 8 path, uncore_device_to_die() may return -1 when all CPUs associated with the UBOX device are offline.

Remove the WARN_ON_ONCE(die_id == -1) check for two reasons:

- The current code breaks out of the loop. This is incorrect because pci_get_device() does not guarantee iteration in domain or bus order, so additional UBOX devices may be skipped during the scan.

- Returning -EINVAL is incorrect, since marking offline buses with die_id == -1 is expected and should not be treated as an error.

Separately, when NUMA is disabled on a NUMA-capable platform, pcibus_to_node() returns NUMA_NO_NODE, causing uncore_device_to_die() to return -1 for all PCI devices. As a result, spr_update_device_location(), used on Intel SPR and EMR, ignores the corresponding PMON units and does not add them to the RB tree.

Fix this by using uncore_pcibus_to_dieid(), which retrieves topology from the UBOX GIDNIDMAP register and works regardless of whether NUMA is enabled in Linux. This requires snbep_pci2phy_map_init() to be added in spr_uncore_pci_init().

Keep uncore_device_to_die() only for the nr_node_ids > 8 case, where NUMA is expected to be enabled.

Fixes: 9a7832c ("perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA info")
Fixes: 65248a9 ("perf/x86/uncore: Add a quirk for UPI on SPR")
Signed-off-by: Zide Chen <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Dapeng Mi <[email protected]>
Tested-by: Steve Wahl <[email protected]>
Link: https://patch.msgid.link/[email protected]
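The first reason given above (breaking out of the scan on die_id == -1 skips later UBOX devices) can be illustrated with a small userspace model. This is only a sketch and not kernel code: `toy_ubox`, `toy_scan_break()`, `toy_scan_continue()` and the other `toy_*` names are hypothetical, and the device array stands in for whatever order pci_get_device() happens to return.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NBUS 8            /* hypothetical size of the toy bus table */
#define TOY_DIE_UNKNOWN (-1)  /* same meaning as die_id == -1 in the commit */

/* Hypothetical stand-in for one UBOX device returned by the PCI scan. */
struct toy_ubox {
	int bus; /* bus number the device sits on */
	int die; /* die for that bus, or -1 if all of its CPUs are offline */
};

/* Mark every bus as unmapped before a scan. */
static void toy_map_reset(int *bus_to_die)
{
	for (int i = 0; i < TOY_NBUS; i++)
		bus_to_die[i] = TOY_DIE_UNKNOWN;
}

/* Old behaviour: treat -1 as an error and stop at the first one. */
static int toy_scan_break(const struct toy_ubox *devs, size_t n, int *bus_to_die)
{
	for (size_t i = 0; i < n; i++) {
		assert(devs[i].bus >= 0 && devs[i].bus < TOY_NBUS);
		bus_to_die[devs[i].bus] = devs[i].die;
		if (devs[i].die == TOY_DIE_UNKNOWN)
			return -1; /* devices after this one are never visited */
	}
	return 0;
}

/* New behaviour: record -1 for the offline bus and keep scanning. */
static int toy_scan_continue(const struct toy_ubox *devs, size_t n, int *bus_to_die)
{
	for (size_t i = 0; i < n; i++) {
		assert(devs[i].bus >= 0 && devs[i].bus < TOY_NBUS);
		bus_to_die[devs[i].bus] = devs[i].die;
	}
	return 0;
}

/* Number of buses that ended up with a known die. */
static int toy_mapped_count(const int *bus_to_die)
{
	int count = 0;

	for (int i = 0; i < TOY_NBUS; i++)
		if (bus_to_die[i] != TOY_DIE_UNKNOWN)
			count++;
	return count;
}
```

With three devices visited in the order bus 0 (die 0), bus 3 (offline, die -1), bus 5 (die 1), `toy_scan_break()` returns an error with only bus 0 mapped, while `toy_scan_continue()` succeeds and maps both bus 0 and bus 5.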
1 parent 7b568e9 commit a16d1ec

2 files changed

Lines changed: 7 additions & 7 deletions

arch/x86/events/intel/uncore.c

Lines changed: 1 addition & 0 deletions
@@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
 	return bus ? pci_domain_nr(bus) : -EINVAL;
 }
 
+/* Note: This API can only be used when NUMA information is available. */
 int uncore_device_to_die(struct pci_dev *dev)
 {
 	int node = pcibus_to_node(dev->bus);

arch/x86/events/intel/uncore_snbep.c

Lines changed: 6 additions & 7 deletions
@@ -1459,13 +1459,7 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
 			}
 
 			map->pbus_to_dieid[bus] = die_id = uncore_device_to_die(ubox_dev);
-
 			raw_spin_unlock(&pci2phy_map_lock);
-
-			if (WARN_ON_ONCE(die_id == -1)) {
-				err = -EINVAL;
-				break;
-			}
 		}
 	}
 

@@ -6420,7 +6414,7 @@ static void spr_update_device_location(int type_id)
 
 	while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
 
-		die = uncore_device_to_die(dev);
+		die = uncore_pcibus_to_dieid(dev->bus);
 		if (die < 0)
 			continue;
 

@@ -6444,6 +6438,11 @@ static void spr_update_device_location(int type_id)
 
 int spr_uncore_pci_init(void)
 {
+	int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
+
+	if (ret)
+		return ret;
+
 	/*
 	 * The discovery table of UPI on some SPR variant is broken,
 	 * which impacts the detection of both UPI and M3UPI uncore PMON.
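
The spr_update_device_location() and spr_uncore_pci_init() changes depend on each other: the bus-to-die lookup only works once the map has been populated. The following userspace sketch models that dependency under simplifying assumptions. All `toy_*` names are hypothetical, the NUMA node is treated as equal to the die ID (the real uncore_device_to_die() derives the die from the node), and `hw_die` stands in for values the kernel reads from the UBOX registers.

```c
#include <assert.h>

#define TOY_NBUS 4
#define TOY_NO_NODE (-1) /* plays the role of NUMA_NO_NODE */

/* Hypothetical model of the two sources of die information. */
struct toy_topology {
	int numa_enabled;          /* 0 models NUMA being disabled in Linux */
	int bus_node[TOY_NBUS];    /* NUMA node per bus, usable only with NUMA on */
	int map_initialized;       /* set once the register-based map is filled */
	int bus_die_map[TOY_NBUS]; /* die per bus, filled from register values */
};

/* Models pcibus_to_node(): no node information when NUMA is off. */
static int toy_bus_to_node(const struct toy_topology *t, int bus)
{
	assert(bus >= 0 && bus < TOY_NBUS);
	return t->numa_enabled ? t->bus_node[bus] : TOY_NO_NODE;
}

/* NUMA-based lookup: returns -1 for every bus when NUMA is off.
 * Assumption for this sketch only: node number == die ID. */
static int toy_device_to_die(const struct toy_topology *t, int bus)
{
	int node = toy_bus_to_node(t, bus);

	return node < 0 ? -1 : node;
}

/* Models the init step: fill the map from register-derived values. */
static void toy_map_init(struct toy_topology *t, const int *hw_die)
{
	for (int i = 0; i < TOY_NBUS; i++)
		t->bus_die_map[i] = hw_die[i];
	t->map_initialized = 1;
}

/* Map-based lookup: independent of NUMA, but needs toy_map_init() first. */
static int toy_bus_to_die(const struct toy_topology *t, int bus)
{
	assert(bus >= 0 && bus < TOY_NBUS);
	return t->map_initialized ? t->bus_die_map[bus] : -1;
}
```

In this model, with NUMA disabled, `toy_device_to_die()` fails for every bus, and `toy_bus_to_die()` also fails until `toy_map_init()` has run, which mirrors why the commit adds the map initialization to spr_uncore_pci_init() alongside the lookup change.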
