OpenMP parallelization of cbm::algo::tof::Hitfind.
Applied OpenMP parallelization to the steering layer of the TOF hitfinder, i.e. the class cbm::algo::tof::Hitfind. To achieve this, the same steps as in !1047 (closed) were retraced.
The observed speedup is similar to that in the other MR: we gain roughly a factor of 2 on 4 cores and saturate at about a factor of 2.5 on 8 cores. When running on a single core, the runtime is roughly 10% larger than for the serial version, due to parallelization overhead.
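As a rough consistency check on these numbers (a back-of-the-envelope estimate, not a measurement), they can be compared with Amdahl's law. This assumes a fixed parallel fraction $p$ and ignores the single-core overhead mentioned above as well as memory-bandwidth effects:

```latex
S(n) = \frac{1}{(1 - p) + \dfrac{p}{n}}
\qquad
S(4) = 2 \;\Rightarrow\; 1 - \tfrac{3}{4}p = \tfrac{1}{2} \;\Rightarrow\; p = \tfrac{2}{3}

S(8) = \frac{1}{\tfrac{1}{3} + \tfrac{1}{12}} = \frac{12}{5} = 2.4
\qquad
\lim_{n \to \infty} S(n) = \frac{1}{1 - p} = 3
```

Under these assumptions, a factor of 2 on 4 cores corresponds to about two thirds of the runtime being parallelized, which predicts roughly 2.4 on 8 cores (close to the observed 2.5) and a ceiling of about 3 regardless of core count.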
We lose a bit of time in std::vector::resize() operations, which initialize the newly created elements only for them to be overwritten immediately afterwards (i.e. the usual problem in parallel code). Not sure what the status is regarding a vector class that avoids this.
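For illustration, here is a minimal sketch of the resize-then-overwrite pattern and one common way around it. It is not the code in this MR: `Hit`, `FillParallel` and `DefaultInitAllocator` are hypothetical names. The allocator adaptor is a widely used idiom that makes `resize()` default-initialize its elements instead of value-initializing them, so trivial types are not zeroed before the parallel loop overwrites them.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical allocator adaptor (not part of the code base): construct()
// without arguments default-initializes, so resize() skips the zero-fill
// for trivially constructible element types.
template <typename T, typename A = std::allocator<T>>
class DefaultInitAllocator : public A {
  using Traits = std::allocator_traits<A>;

public:
  template <typename U>
  struct rebind {
    using other = DefaultInitAllocator<U, typename Traits::template rebind_alloc<U>>;
  };

  using A::A;

  // Called by resize(): default-initialization, no zeroing for trivial U.
  template <typename U>
  void construct(U* ptr) noexcept(std::is_nothrow_default_constructible<U>::value) {
    ::new (static_cast<void*>(ptr)) U;
  }

  // All other constructions are forwarded to the wrapped allocator.
  template <typename U, typename... Args>
  void construct(U* ptr, Args&&... args) {
    Traits::construct(static_cast<A&>(*this), ptr, std::forward<Args>(args)...);
  }
};

// Hypothetical stand-in for a hit; trivially default-constructible.
struct Hit {
  double x;
  double t;
};

// Pre-size the output, then let each iteration write its own slot.
// With std::allocator, resize() zero-fills all n elements first;
// with DefaultInitAllocator, that initial pass is skipped.
template <typename Vec>
Vec FillParallel(std::size_t n) {
  Vec out;
  out.resize(n);
  const std::ptrdiff_t count = static_cast<std::ptrdiff_t>(n);
#pragma omp parallel for schedule(static)
  for (std::ptrdiff_t i = 0; i < count; ++i) {
    out[i] = Hit{0.5 * i, 2.0 * i};  // every element is overwritten here
  }
  return out;
}
```

Calling `FillParallel<std::vector<Hit>>(n)` and `FillParallel<std::vector<Hit, DefaultInitAllocator<Hit>>>(n)` should produce the same contents; only the second avoids the redundant initialization. The sketch compiles with or without `-fopenmp`, since the pragma is ignored when OpenMP is disabled. The downside is that the allocator becomes part of the vector type, which would propagate through the interfaces.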
We should probably go for this version in the next DC, as we will surely be running in multi-threaded mode. The question is whether we should keep the original version in some form. Not sure how important this is. Single-threaded performance might not be so relevant in practice.
Requesting a review from @fweig. Possibly also of interest to @j.decuveland.