algo: Optimize host code in GPU Hitfinder.
Introduces three optimizations to the host code of the STS Hitfinder:
- Allocate all buffers up front when running on GPU, because allocating pinned memory for the GPU is very slow; keep dynamic allocation per TS on CPU
- Speed up sorting Digis by module: use a lookup table to map the sensor address to an index, and parallelize the copy
- Copy hits to pinned memory to speed up copy from the GPU
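
A minimal sketch of the lookup-table idea from the second point, not the code in this change: the `Digi` struct, `BuildAddressLut`, and `SortByModule` are hypothetical names, and a `std::unordered_map` stands in for whatever table the Hitfinder actually uses. The address is resolved to a dense module index once per digi, then a counting pass, prefix sum, and scatter group the digis by module.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an STS digi; only the address matters here.
struct Digi {
  uint32_t address;
  uint32_t time;
};

// Build the lookup table once: sensor address -> dense module index.
std::unordered_map<uint32_t, std::size_t> BuildAddressLut(const std::vector<uint32_t>& moduleAddresses)
{
  std::unordered_map<uint32_t, std::size_t> lut;
  lut.reserve(moduleAddresses.size());
  for (std::size_t i = 0; i < moduleAddresses.size(); ++i) {
    lut.emplace(moduleAddresses[i], i);
  }
  return lut;
}

// Group digis by module. On return, offsets has nModules + 1 entries and
// module m occupies the range [offsets[m], offsets[m + 1]) of the result.
std::vector<Digi> SortByModule(const std::vector<Digi>& digis,
                               const std::unordered_map<uint32_t, std::size_t>& lut,
                               std::size_t nModules, std::vector<std::size_t>& offsets)
{
  // Pass 1: count digis per module (shifted by one for the prefix sum).
  offsets.assign(nModules + 1, 0);
  for (const Digi& d : digis) {
    ++offsets[lut.at(d.address) + 1];
  }

  // Prefix sum turns the counts into start offsets.
  for (std::size_t m = 0; m < nModules; ++m) {
    offsets[m + 1] += offsets[m];
  }

  // Pass 2: scatter each digi into its module's range, keeping input order.
  std::vector<std::size_t> cursor(offsets.begin(), offsets.end() - 1);
  std::vector<Digi> sorted(digis.size());
  for (const Digi& d : digis) {
    sorted[cursor[lut.at(d.address)]++] = d;
  }
  return sorted;
}
```

The scatter is written serially here for clarity. Because the module ranges are disjoint, the copy can be split across threads once the offsets are known, for example with one thread per module.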
Edited by Felix Weiglhofer