Commit 429b763

bwicaksononv authored and willdeacon committed
perf: add NVIDIA Tegra410 CPU Memory Latency PMU
Adds CPU Memory (CMEM) Latency PMU support in Tegra410 SOC. The PMU is used to
measure latency between the edge of the Unified Coherence Fabric to the local
system DRAM.

Reviewed-by: Ilkka Koskinen <[email protected]>
Signed-off-by: Besar Wicaksono <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
1 parent 3dd7302 commit 429b763

4 files changed

Lines changed: 769 additions & 0 deletions

Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst

Lines changed: 25 additions & 0 deletions
@@ -8,6 +8,7 @@ metrics like memory bandwidth, latency, and utilization:
 * Unified Coherence Fabric (UCF)
 * PCIE
 * PCIE-TGT
+* CPU Memory (CMEM) Latency
 
 PMU Driver
 ----------
@@ -344,3 +345,27 @@ Example usage:
 0x10000 to 0x100FF on socket 0's PCIE RC-1::
 
   perf stat -a -e nvidia_pcie_tgt_pmu_0_rc_1/event=0x1,dst_addr_base=0x10000,dst_addr_mask=0xFFF00,dst_addr_en=0x1/
+
+CPU Memory (CMEM) Latency PMU
+-----------------------------
+
+This PMU monitors latency events of memory read requests from the edge of the
+Unified Coherence Fabric (UCF) to local CPU DRAM:
+
+* RD_REQ counters: count read requests (32B per request).
+* RD_CUM_OUTS counters: accumulated outstanding request counter, which track
+  how many cycles the read requests are in flight.
+* CYCLES counter: counts the number of elapsed cycles.
+
+The average latency is calculated as::
+
+  FREQ_IN_GHZ = CYCLES / ELAPSED_TIME_IN_NS
+  AVG_LATENCY_IN_CYCLES = RD_CUM_OUTS / RD_REQ
+  AVERAGE_LATENCY_IN_NS = AVG_LATENCY_IN_CYCLES / FREQ_IN_GHZ
+
+The events and configuration options of this PMU device are described in sysfs,
+see /sys/bus/event_source/devices/nvidia_cmem_latency_pmu_<socket-id>.
+
+Example usage::
+
+  perf stat -a -e '{nvidia_cmem_latency_pmu_0/rd_req/,nvidia_cmem_latency_pmu_0/rd_cum_outs/,nvidia_cmem_latency_pmu_0/cycles/}'
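Note on the latency formulas in the documentation hunk: as a rough illustration, the three steps can be applied to raw counter values as in the Python sketch below. The function name `average_read_latency_ns` and the sample numbers are hypothetical (they are not part of the patch or taken from real hardware); only the arithmetic follows the documented formulas.

```python
def average_read_latency_ns(rd_req, rd_cum_outs, cycles, elapsed_ns):
    """Apply the documented CMEM latency formulas to raw counter values.

    rd_req      -- RD_REQ count (read requests)
    rd_cum_outs -- RD_CUM_OUTS count (accumulated outstanding requests)
    cycles      -- CYCLES count over the measurement window
    elapsed_ns  -- wall-clock length of the window in nanoseconds
    """
    if rd_req <= 0 or cycles <= 0 or elapsed_ns <= 0:
        raise ValueError("counter values and elapsed time must be positive")
    freq_in_ghz = cycles / elapsed_ns              # cycles per nanosecond
    avg_latency_in_cycles = rd_cum_outs / rd_req   # cycles in flight per request
    return avg_latency_in_cycles / freq_in_ghz


# Made-up sample values for a one-second window, not real measurements.
latency = average_read_latency_ns(
    rd_req=1_000_000,
    rd_cum_outs=200_000_000,
    cycles=2_000_000_000,
    elapsed_ns=1_000_000_000,
)
print(f"{latency:.1f} ns")  # prints "100.0 ns"
```

With these sample inputs the frequency works out to 2 GHz and the average latency to 200 cycles, which is 100 ns. In practice the three counter values would come from the event group in the example `perf stat` command, read over the same window.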

drivers/perf/Kconfig

Lines changed: 7 additions & 0 deletions
@@ -311,4 +311,11 @@ config MARVELL_PEM_PMU
 	  Enable support for PCIe Interface performance monitoring
 	  on Marvell platform.
 
+config NVIDIA_TEGRA410_CMEM_LATENCY_PMU
+	tristate "NVIDIA Tegra410 CPU Memory Latency PMU"
+	depends on ARM64 && ACPI
+	help
+	  Enable perf support for CPU memory latency counters monitoring on
+	  NVIDIA Tegra410 SoC.
+
 endmenu
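Note on the new Kconfig symbol: because it is a `tristate`, a kernel configuration can set it to `y` (built in), `m` (module), or leave it unset. A minimal Python sketch for checking which of these a given `.config` text selects is shown below. The helper name `tristate_value` and the sample configuration text are hypothetical; the sketch assumes the usual `.config` line format (`CONFIG_FOO=y`, `CONFIG_FOO=m`, or `# CONFIG_FOO is not set`).

```python
import re

OPTION = "CONFIG_NVIDIA_TEGRA410_CMEM_LATENCY_PMU"


def tristate_value(config_text, option=OPTION):
    """Return 'y', 'm', or 'n' for a tristate option in kernel .config text."""
    # An enabled option appears as "CONFIG_FOO=y" or "CONFIG_FOO=m" on its own line.
    match = re.search(rf"^{re.escape(option)}=([ym])$", config_text, re.MULTILINE)
    # A missing line or "# CONFIG_FOO is not set" both mean the option is off.
    return match.group(1) if match else "n"


# Hypothetical sample configuration text, for illustration only.
sample = (
    "CONFIG_ARM64=y\n"
    "CONFIG_ACPI=y\n"
    "CONFIG_NVIDIA_TEGRA410_CMEM_LATENCY_PMU=m\n"
)
print(tristate_value(sample))  # prints "m"
```

The symbol depends on `ARM64 && ACPI`, so it can only be enabled when both of those are set in the same configuration.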

drivers/perf/Makefile

Lines changed: 1 addition & 0 deletions
@@ -35,3 +35,4 @@ obj-$(CONFIG_DWC_PCIE_PMU) += dwc_pcie_pmu.o
 obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += arm_cspmu/
 obj-$(CONFIG_MESON_DDR_PMU) += amlogic/
 obj-$(CONFIG_CXL_PMU) += cxl_pmu.o
+obj-$(CONFIG_NVIDIA_TEGRA410_CMEM_LATENCY_PMU) += nvidia_t410_cmem_latency_pmu.o
