ITS: staggering #15188

Merged
shahor02 merged 74 commits into AliceO2Group:dev from f3sch:its/trk/stag
Apr 13, 2026

@f3sch
Collaborator

@f3sch f3sch commented Mar 18, 2026

This is to run the CI and possibly to run tests at P2.

f3sch and others added 25 commits March 17, 2026 17:37
Signed-off-by: Felix Schlepper <[email protected]>
Signed-off-by: Felix Schlepper <[email protected]>
Signed-off-by: Felix Schlepper <[email protected]>
Signed-off-by: Felix Schlepper <[email protected]>
Signed-off-by: Felix Schlepper <[email protected]>
Signed-off-by: Felix Schlepper <[email protected]>
Signed-off-by: Felix Schlepper <[email protected]>
Adapt ITS/MFT CTF machinery to staggered data
Fix compilation of ALICE3 tracking with staggering
Signed-off-by: Felix Schlepper <[email protected]>
Signed-off-by: Felix Schlepper <[email protected]>
Signed-off-by: Felix Schlepper <[email protected]>
Signed-off-by: Felix Schlepper <[email protected]>
Signed-off-by: Felix Schlepper <[email protected]>
Signed-off-by: Felix Schlepper <[email protected]>
Signed-off-by: Felix Schlepper <[email protected]>
@github-actions
Contributor

REQUEST FOR PRODUCTION RELEASES:
To request your PR to be included in production software, please add the corresponding labels called "async-" to your PR. Add the labels directly (if you have the permissions) or add a comment of the form (note that labels are separated by a ",")

+async-label <label1>, <label2>, !<label3> ...

This will add <label1> and <label2> and remove <label3>.

The following labels are available
async-2023-pbpb-apass4
async-2023-pp-apass4
async-2024-pp-apass1
async-2022-pp-apass7
async-2024-pp-cpass0
async-2024-PbPb-apass1
async-2024-ppRef-apass1
async-2024-PbPb-apass2
async-2023-PbPb-apass5

Signed-off-by: Felix Schlepper <[email protected]>
@f3sch f3sch marked this pull request as ready for review March 18, 2026 11:26
@f3sch f3sch requested review from sawenzel and shahor02 as code owners March 18, 2026 11:26
@alibuild
Collaborator

alibuild commented Mar 27, 2026

Error while checking build/O2/fullCI_slc9 for 507223a at 2026-04-01 07:39:

## sw/BUILD/O2-full-system-test-latest/log
command /sw/slc9_x86-64/O2/slc9_x86-64-slc9_x86-64-local5/prodtests/full-system-test/dpl-workflow.sh had nonzero exit code 1
[8734:internal-dpl-ccdb-backend]: [07:23:37][ERROR] Failed to open file alien:///alice/data/CCDB/CTP/Calib/OrbitReset/01/35179/b07c53e0-b4c0-11ec-b66d-90ce809b250c?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:23:39][ERROR] Failed to open file alien:///alice/data/CCDB/EMC/Config/RecoParam/02/14620/d791f4c0-3ffb-11ed-a67e-2a010e0a0b16?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:23:41][ERROR] Failed to open file alien:///alice/data/CCDB/GLO/Config/GRPECS/04/02718/cf968f3f-0210-11ed-8000-200114580202?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:23:44][ERROR] Failed to open file alien:///alice/data/CCDB/ITS/Config/ClustererParam/06/44918/b0c41285-f1ae-11ec-b9c0-2a010e0a0b16?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:23:46][ERROR] Failed to open file alien:///alice/data/CCDB/MFT/Config/ClustererParam/06/44388/8c661b10-f480-11ec-841e-2a010e0a0b16?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:23:48][ERROR] Failed to open file alien:///alice/data/CCDB/MFT/Calib/ClusterDictionary/13/20237/c9a0bd70-9fe3-11ec-975c-200114580202?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:23:51][ERROR] Failed to open file alien:///alice/data/CCDB/CTP/Config/TriggerOffsets/05/51872/907486a2-ecaa-11ec-bea3-2a010e0a0b16?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:23:55][ERROR] Failed to open file alien:///alice/data/CCDB/CPV/Calib/Gains/13/50097/aa7cef8c-caea-11ec-8000-0aa202a21b9a?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:23:57][ERROR] Failed to open file alien:///alice/data/CCDB/CPV/Calib/Pedestals/06/01515/aa7d7bad-caea-11ec-8000-0aa202a21b9a?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:23:59][ERROR] Failed to open file alien:///alice/data/CCDB/CPV/Calib/BadChannelMap/12/29504/aa7c4363-caea-11ec-8000-0aa202a21b9a?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:24:01][ERROR] Failed to open file alien:///alice/data/CCDB/ZDC/Config/Module/12/06629/706c8778-60e8-11ed-9dc3-2a010e0a0b16?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:24:06][ERROR] Failed to open file alien:///alice/data/CCDB/TPC/Config/FEEPad/07/32310/3e279f1b-5e92-11ed-9dc3-200114580202?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:24:09][ERROR] Failed to open file alien:///alice/data/CCDB/TPC/Calib/PadGainFull/10/00989/b929213c-a46e-11ec-982e-0aa2043e1b9a?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:24:14][ERROR] Failed to open file alien:///alice/data/CCDB/ZDC/Calib/TDCCorr/01/30762/0d5c2790-cca6-11ec-b790-200114580d00?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:24:16][ERROR] Failed to open file alien:///alice/data/CCDB/ZDC/Calib/BaselineCalib/12/18437/28ea63c9-f95f-11ed-9692-200114580202?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:24:19][ERROR] Failed to open file alien:///alice/data/CCDB/CPV/Config/CPVSimParams/02/18202/5ce9988b-d60c-11ec-8000-0aa202a21b9a?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:24:21][ERROR] Failed to open file alien:///alice/data/CCDB/PHS/Calib/L1phase/13/59044/633640d8-4b07-11ed-832f-0aa202a21b9a?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:24:24][ERROR] Failed to open file alien:///alice/data/CCDB/PHS/Calib/BadMap/10/40794/45929b61-dd0a-11ec-8000-0aa202a21b9a?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:24:29][ERROR] Failed to open file alien:///alice/data/CCDB/TRD/Calib/LocalGainFactor/15/61343/100756ad-4ec7-11ed-bee3-200114580202?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:24:35][ERROR] Failed to open file alien:///alice/data/CCDB/GLO/Calib/MeanVertex/10/06884/9c31089f-2c77-11ed-ac0b-2a010e0a0b16?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:24:39][ERROR] Failed to open file alien:///alice/data/CCDB/GLO/Config/SVertexerParam/02/38730/8172363d-dfc5-11ee-82dc-200114580202?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:24:42][ERROR] Failed to open file alien:///alice/data/CCDB/EMC/Calib/BadChannelMap/13/27208/bd8faa80-3f56-11ed-ac0b-2a010e0a0b16?filetype=raw
[8734:internal-dpl-ccdb-backend]: [07:24:46][ERROR] Failed to open file alien:///alice/data/CCDB/EMC/Calib/TimeCalibParams/08/53931/ea66f883-3f56-11ed-ac0b-2a010e0a0b16?filetype=raw
[8757:BadMapCalibSpec]: [07:25:58][ERROR] Insufficient statistics: 0 entries in lowE histo, do nothing
[9156:qc-task-ITS-ITSTrackTask]: [07:39:39][ERROR] Exception while running: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 260. Rethrowing.
[9156:qc-task-ITS-ITSTrackTask]: [07:39:39][FATAL] Unhandled o2::framework::runtime_error reached the top of main of o2-qc, device shutting down. Reason: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 260
[ERROR] Workflow crashed - PID 9156 (qc-task-ITS-ITSTrackTask) did not exit correctly however it's not clear why. Exit code forced to 128.
[ERROR]  - Device qc-task-ITS-ITSTrackTask: pid 9156 (exit 128)
[INFO]    - First error: [07:39:39][FATAL] Unhandled o2::framework::runtime_error reached the top of main of o2-qc, device shutting down. Reason: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 260
[ERROR] SEVERE: Device qc-task-ITS-ITSTrackTask (9156) had at least one message above severity 7: Unhandled o2::framework::runtime_error reached the top of main of o2-qc, device shutting down. Reason: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 260


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
grep: error-log.txt: binary file matches
++ GRERR=1
++ [[ 1 == 0 ]]
++ mkdir -p /sw/INSTALLROOT/d0478a093af42beb2d89a6c601018984d03310e7/slc9_x86-64/o2checkcode/1.0-local37/etc/modulefiles
++ cat
--

Full log here.
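
The fatal error in the full-system-test log above reports a payload of 260 bytes for a type of 48 bytes. Since 260 = 5 × 48 + 20, the payload cannot be read as a contiguous array of that type, which would be consistent with the QC task and the producer disagreeing on the vertex type (an inference from the log, not something stated in this thread). The check can be sketched as follows; `payloadMatchesType` is a hypothetical helper, not the DPL implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not the DPL code): a payload can be viewed as a
// contiguous array of T only if its byte size is a whole multiple of sizeof(T).
constexpr bool payloadMatchesType(std::size_t payloadSize, std::size_t typeSize)
{
  return typeSize != 0 && payloadSize % typeSize == 0;
}

// Numbers from the log: 260 = 5 * 48 + 20, so the check fails.
static_assert(!payloadMatchesType(260, 48), "260 is not a multiple of 48");
// A payload holding exactly five 48-byte objects would pass.
static_assert(payloadMatchesType(240, 48), "240 is a multiple of 48");
```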

@f3sch f3sch marked this pull request as draft April 2, 2026 07:07
@f3sch f3sch marked this pull request as ready for review April 2, 2026 19:58
@f3sch f3sch requested a review from fcolamar as a code owner April 2, 2026 19:58
@f3sch f3sch changed the title ITS: staggering [DO NOT MERGE] ITS: staggering Apr 2, 2026
@alibuild
Collaborator

alibuild commented Apr 3, 2026

Error while checking build/O2/fullCI_slc9 for 92cd379 at 2026-04-06 00:20:

## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
grep: error-log.txt: binary file matches
/sw/SOURCES/O2/slc9_x86-64-slc9_x86-64/0/Common/DCAFitter/GPU/cuda/GPUInterface.cu:55:15: error: use '= default' to define a trivial destructor [modernize-use-equals-default]
++ [[ 0 == 0 ]]
++ exit 1
--

Full log here.

@f3sch
Collaborator Author

f3sch commented Apr 3, 2026

@shahor02 from my side this is ready to merge. @mpuccio checked that everything should work online; we only have to arrange with Barth when to do it.

@shahor02
Collaborator

shahor02 commented Apr 4, 2026

@f3sch fine for me.
Pinging @Barthelemy to ask whether it is OK for him to merge/tag AliceO2Group/QualityControl#2653 together with this PR (since our usual 2-step PR merging will not work in this case).
But I'll be absent until April 12, so I will not be able to follow this in case of problems.

@alibuild
Collaborator

alibuild commented Apr 6, 2026

Error while checking build/O2/fullCI_slc9 for 9995505 at 2026-04-13 00:18:

## sw/BUILD/O2-full-system-test-latest/log
command /sw/slc9_x86-64/O2/slc9_x86-64-slc9_x86-64-local11/prodtests/full-system-test/dpl-workflow.sh had nonzero exit code 1
[9116:BadMapCalibSpec]: [00:18:25][ERROR] Insufficient statistics: 0 entries in lowE histo, do nothing
[9563:qc-task-ITS-ITSTrackTask]: [00:18:40][ERROR] Exception while running: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 260. Rethrowing.
[9563:qc-task-ITS-ITSTrackTask]: [00:18:40][FATAL] Unhandled o2::framework::runtime_error reached the top of main of o2-qc, device shutting down. Reason: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 260
[ERROR] Workflow crashed - PID 9563 (qc-task-ITS-ITSTrackTask) did not exit correctly however it's not clear why. Exit code forced to 128.
[9389:o2-eve-export]: [00:18:43][ERROR] filesystem problem during DirectoryLoader::canCreateNextFile: filesystem error: directory iterator cannot open directory: No such file or directory [jsons]
[ERROR]  - Device qc-task-ITS-ITSTrackTask: pid 9563 (exit 128)
[INFO]    - First error: [00:18:40][FATAL] Unhandled o2::framework::runtime_error reached the top of main of o2-qc, device shutting down. Reason: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 260
[ERROR] SEVERE: Device qc-task-ITS-ITSTrackTask (9563) had at least one message above severity 7: Unhandled o2::framework::runtime_error reached the top of main of o2-qc, device shutting down. Reason: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 260


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
grep: error-log.txt: binary file matches
++ GRERR=1
++ [[ 1 == 0 ]]
++ mkdir -p /sw/INSTALLROOT/2b3180fa2846829affb8ee2f6a9aa1cc517029e0/slc9_x86-64/o2checkcode/1.0-local102/etc/modulefiles
++ cat
--

Full log here.

@shahor02
Collaborator

Merging after fixing a naming conflict. The CI checks passed except for the conflict with QC, whose corresponding PR will be merged after this one (it was tested locally by @mpuccio).

@shahor02 shahor02 merged commit c9acd57 into AliceO2Group:dev Apr 13, 2026
7 of 9 checks passed
@f3sch f3sch deleted the its/trk/stag branch April 13, 2026 13:15
knopers8 pushed a commit to AliceO2Group/QualityControl that referenced this pull request Apr 13, 2026
ehellbar pushed a commit that referenced this pull request Apr 15, 2026
* ITS: staggered tracking

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: various fixes also for GPU

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: fix vertexer and move new types

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: format

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: account for layer ROF bias in tracker

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: sort tracks in time by lower edge

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: ensure mc labels are nullptr

Signed-off-by: Felix Schlepper <[email protected]>

* ITSMFT: account for possible delay of received ROFs

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: staggered STF decoder

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: fix track time-assignment

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: output vertices

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: add macro to check staggering in data

Signed-off-by: Felix Schlepper <[email protected]>

* Adapt ITS/MFT CTF machinery to staggered data

* Fix compilation of ALICE3 tracking with staggering

* ITS: modify staggering macro

Signed-off-by: Felix Schlepper <[email protected]>

* ITSMFT: runtime staggering option

Signed-off-by: Felix Schlepper <[email protected]>

* ITSMFT: fix instantiation in namespace

Signed-off-by: Felix Schlepper <[email protected]>

* ITS3: fix compilation

Signed-off-by: Felix Schlepper <[email protected]>

* Raw,CTF: add option to specify base cache dir for remote files

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: tracking same as dev

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: add back datastreams

Signed-off-by: Felix Schlepper <[email protected]>

* ITSMFT: improve logging

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: add rofs for vertices back

Signed-off-by: Felix Schlepper <[email protected]>

* add copyright to macro

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: hide print functions for device code

Signed-off-by: Felix Schlepper <[email protected]>

* ITSMFT: add shim file for alpide param

Signed-off-by: Felix Schlepper <[email protected]>

* try to fix macro compilation

Signed-off-by: Felix Schlepper <[email protected]>

* Avoid wildcarded subspecs in Digit/ClusterWriter

* ITS: fix rof lut to work properly with added errors

Signed-off-by: Felix Schlepper <[email protected]>

* Fix/add some staggering options

* Add ITS/MFT staggering options to dpl-workflow.sh

To activate ITS or MFT staggering in the topology generation, export ITSSTAGGERED=1
or MFTSTAGGERED=1 respectively

* ITS: try fix for QC

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: fix ROFLookpTables warning

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: fix tracklet formatting

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: set BCData properly for ROFs

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: remove deprecated settings

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: fix cluster label access for non-staggered

Signed-off-by: Felix Schlepper <[email protected]>

* ITSMFT: fix staggering wfx option for digit-writer-workflow

Signed-off-by: Felix Schlepper <[email protected]>

* Fix loop condition for ITS tracking layers

* Fix/add some staggering options

* Add ITS/MFT staggering options to dpl-workflow.sh

To activate ITS or MFT staggering in the topology generation, export ITSSTAGGERED=1
or MFTSTAGGERED=1 respectively

* ITSMFT: fix staggering wfx option for digit-writer-workflow

Signed-off-by: Felix Schlepper <[email protected]>

* Fix loop condition for ITS tracking layers

* Make ITS vertex messageable

* remove unused variable

* Add/fix staggering options to all workflows reading ITS,MFT clusters

To pass the sim-challenge test. Without this option, even <workflow> -h leads to a crash.
Strictly speaking, DPLAlpideParamInitializer::isITSStaggeringEnabled
and DPLAlpideParamInitializer::isMFTStaggeringEnabled could first test
ic.options().hasOption(stagITSOpt) and ic.options().hasOption(stagMFTOpt) before reading
the option itself. But it is better to detect a missing staggering option explicitly.
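
The trade-off described in this commit message can be illustrated with a minimal sketch. The `Options` struct and both helper functions below are hypothetical stand-ins, not the DPL or `DPLAlpideParamInitializer` API; the option key used in the usage note is also made up.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a workflow's option registry (not the DPL API).
struct Options {
  std::map<std::string, bool> values;

  bool hasOption(const std::string& key) const { return values.count(key) != 0; }

  // Reading an option that was never declared is treated as an error.
  bool get(const std::string& key) const
  {
    auto it = values.find(key);
    if (it == values.end()) {
      throw std::runtime_error("option not declared: " + key);
    }
    return it->second;
  }
};

// Lenient variant: a workflow that forgot to declare the option silently
// runs with staggering disabled.
inline bool isStaggeringEnabledLenient(const Options& opts, const std::string& key)
{
  return opts.hasOption(key) && opts.get(key);
}

// Strict variant: a missing declaration surfaces immediately as an error,
// which is the behaviour the commit message argues for.
inline bool isStaggeringEnabledStrict(const Options& opts, const std::string& key)
{
  return opts.get(key);
}
```

With an empty `Options`, the lenient call for a key such as `"its-staggered"` returns `false`, while the strict call throws, so a workflow missing the option is found at start-up rather than by its results.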

* ITSMFT: fix digit reader

Signed-off-by: Felix Schlepper <[email protected]>

* Remove leftover NROFs configurable from dpl-workflow.sh

* ITS: fix time assignments

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: fix degenerate LSE for matrix solving

Comparing the output of dev and this PR, I saw plenty of cases where
the system of equations was fully degenerate and, due to different
floating-point instructions and compiler optimizations, produced slightly
different results. The solution is to discard the vertex candidate if the
LSE becomes degenerate, so as not to produce nonsensical solutions.
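
A minimal sketch of the idea, assuming a 3×3 system solved with Cramer's rule: the candidate is dropped when the determinant is negligible relative to the matrix scale. The function names, the relative tolerance, and the choice of solver are assumptions for illustration; the vertexer's actual criterion is not shown in this thread.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <optional>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

// Determinant of a 3x3 matrix by cofactor expansion along the first row.
inline double det3(const Mat3& m)
{
  return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1]) -
         m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0]) +
         m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

// Solve A x = b with Cramer's rule. Returns std::nullopt when A is
// (numerically) singular, so the caller discards the candidate instead of
// keeping a platform-dependent result. relTol is an arbitrary assumption.
inline std::optional<Vec3> solveOrDiscard(const Mat3& a, const Vec3& b, double relTol = 1e-9)
{
  double scale = 0.0; // largest absolute entry; det scales like scale^3
  for (const auto& row : a) {
    for (double v : row) {
      scale = std::max(scale, std::fabs(v));
    }
  }
  const double det = det3(a);
  if (scale == 0.0 || std::fabs(det) <= relTol * scale * scale * scale) {
    return std::nullopt; // degenerate system: no meaningful solution
  }
  Vec3 x{};
  for (std::size_t k = 0; k < 3; ++k) {
    Mat3 ak = a; // copy of A with column k replaced by b
    for (std::size_t r = 0; r < 3; ++r) {
      ak[r][k] = b[r];
    }
    x[k] = det3(ak) / det;
  }
  return x;
}
```

Returning `std::optional` forces the caller to handle the degenerate case explicitly rather than propagate an arbitrary solution.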

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: fix macro

Signed-off-by: Felix Schlepper <[email protected]>

* MFT: fix track writer

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: fix gpu compile due change in vertexer types

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: move lookup table creation to proper place

Signed-off-by: Felix Schlepper <[email protected]>

* Move FastMultEstimation to ITS tracking library

* ITS: add containedIn to TS

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: fix vertexer

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: improve STFDecoder&Clusterer error messages and account for delay longer than ROF

Signed-off-by: Felix Schlepper <[email protected]>

* Implement new kind of multiplicity mask

* Adapt GPU code to the new mult mask

* ITS: finalize tracking code

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: remove deltaRof for vertexer

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: report current timeslice

Signed-off-by: Felix Schlepper <[email protected]>

* Vertex: also print time error

Signed-off-by: Felix Schlepper <[email protected]>

* ITS: speedup vertexer

Signed-off-by: Felix Schlepper <[email protected]>

---------

Signed-off-by: Felix Schlepper <[email protected]>
Co-authored-by: shahoian <[email protected]>
Co-authored-by: Maximiliano Puccio <[email protected]>
std::array<int, 2> mTracklets = constants::helpers::initArray<int, 2, constants::UnusedIndex>();
std::array<int, nLayers> mClusters = constants::helpers::initArray<int, nLayers, constants::UnusedIndex>();
std::array<int, NLayers> mClusters = constants::helpers::initArray<int, NLayers, constants::UnusedIndex>();
TimeEstBC mTime;
Collaborator

mTime is not initialised in the default ctor.
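
One common way to address this kind of remark is a default member initializer, sketched below. `TimeEstSketch` and `SeedSketch` are hypothetical stand-ins; the real layout of `TimeEstBC` and the enclosing class are not shown in this thread, and if the real type has a user-provided default constructor that leaves members unset, empty braces alone would not help.

```cpp
#include <cstdint>

// Hypothetical stand-in for TimeEstBC: a plain aggregate with no constructors.
struct TimeEstSketch {
  std::uint32_t timeStamp;
  std::uint16_t timeStampError;
};

// Hypothetical stand-in for the class holding mTime.
struct SeedSketch {
  // Empty braces value-initialise the member, so a default-constructed
  // SeedSketch carries a zeroed time rather than indeterminate values.
  TimeEstSketch mTime{};
};
```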

vert1.getTimeStamp().setTimeStampError(594);
vertices.push_back(vert1);
table.update(vertices.data(), vertices.size());
const auto view = table.getView();
Collaborator

view is unused.

static constexpr float DEFROFLengthTrig()
{
// length of RO frame in ns for triggered mode
return N == o2::detectors::DetID::ITS ? 6000. : 6000.;
Collaborator

Why are both values same?

Collaborator

Because, although these are different detectors, they have the same defaults.


std::vector<PairCandidate> k0Cands;
for (int iPos{0}; iPos < (int)posPool.size(); ++iPos) {
const auto pos = posPool[iPos];
Collaborator

pos shadows an outer variable.
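
A sketch of the usual fix: give the loop-local a distinct name so that the outer `pos` remains visible in the loop body. What the outer `pos` is in the real code is not shown in this thread, so `TrackSketch` and `countOthers` below are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type; the real type stored in posPool is not shown here.
struct TrackSketch {
  int id;
};

// Counts pool entries whose id differs from that of the outer `pos`.
inline int countOthers(const TrackSketch& pos, const std::vector<TrackSketch>& posPool)
{
  int n = 0;
  for (std::size_t iPos = 0; iPos < posPool.size(); ++iPos) {
    // Distinct name (and a reference instead of a copy): the outer `pos`
    // is no longer hidden inside the loop.
    const auto& posCand = posPool[iPos];
    if (posCand.id != pos.id) {
      ++n;
    }
  }
  return n;
}
```

Using `std::size_t` for the index also removes the `(int)` cast on `posPool.size()` from the original loop.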

5 participants