training/inference time as a function of number of scans used #25
Hi there, so it's been a while since I looked at this, but the obvious reason that comes to mind would be the auto-regressive nature of DR-SPAAM. Line 112 of the same file actually updates a state, which can't be parallelized in a trivial fashion. The feature extraction itself should be parallelizable though, given that these ops don't have a state. I guess you could easily flatten the batch and number-of-scan dimensions, extract the features, and then reshape back prior to running only line 112 in a loop. I guess you can even try that without re-training, given that it should in essence be exactly the same during inference.

Do pay attention though that during training this would behave slightly differently (no idea if it's good or bad). Right now batchnorm is performed on batches of a single scan, whereas doing what I proposed above would run batchnorm collectively on all B*N scans. This might even be better, but I guess it's hard to tell without trying.
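To make the batchnorm point concrete, here is a toy comparison with a plain `nn.BatchNorm1d` and made-up shapes (not the actual DR-SPAAM blocks): in eval mode the running statistics are used, so looping over scans and flattening give identical outputs, while in train mode the batch statistics are computed over B versus B*N samples, so the normalization differs slightly.

```python
import torch
import torch.nn as nn

B, N, C, L = 4, 5, 32, 48          # batch, scans, channels, cutout length (illustrative)
bn = nn.BatchNorm1d(C)
x = torch.randn(B, N, C, L)

# eval mode: running statistics -> per-scan loop == flattened batch
bn.eval()
per_scan = torch.stack([bn(x[:, i]) for i in range(N)], dim=1)
flattened = bn(x.view(B * N, C, L)).view(B, N, C, L)
print(torch.allclose(per_scan, flattened))   # True

# train mode: statistics over B vs. B*N samples -> slightly different normalization
bn.train()
per_scan = torch.stack([bn(x[:, i]) for i in range(N)], dim=1)
flattened = bn(x.view(B * N, C, L)).view(B, N, C, L)
print(torch.allclose(per_scan, flattened))   # False
```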
Thank you @Pandoro. By "I guess you can even try that without re-training given that it should in essence be exactly the same during inference", do you mean the existing weights would have to be changed in some way?
No, you don't need to change the weights. My pytorch syntax is a bit rusty and I can't test this right now, but I'm suggesting something along the following lines:

```python
B, CT, N, L = x.shape

# extract features from all scans at once
out = x.view(B * CT * N, 1, L)  # not sure if that works, but I wouldn't see why not; you could also use einops to be more explicit
out = self._conv_and_pool(out, self.conv_block_1)  # /2  <-- cutouts are processed as usual
out = self._conv_and_pool(out, self.conv_block_2)  # /4
features_all = out.view(B, CT, N, out.shape[-2], out.shape[-1])  # again, this might need some testing

for i in range(n_scan):
    features_i = features_all[:, :, i, :, :]  # (B, CT, C, L)
    # combine current feature with memory
    out, sim = self.gate(features_i)  # (B, CT, C, L)
```

Each cutout from each scan, from each batch entry is processed independently; the only place they interact is in the batchnorm in the `_conv_and_pool` blocks. That's what I mentioned before. I don't think this is a huge issue though, and post training it should give you the same results, up to GPU non-determinism and the like. Take all of this with a grain of salt though and test it for sure. I might be overlooking something stupid here and I'm only 95% sure this will work.
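Since I mentioned einops: the flatten/unflatten round-trip could also be written roughly like this (a sketch with a placeholder feature extractor standing in for the `_conv_and_pool` calls, and made-up sizes):

```python
import torch
import torch.nn as nn
from einops import rearrange

B, CT, N, L = 2, 450, 5, 48                       # illustrative sizes
x = torch.randn(B, CT, N, L)

# placeholder for conv_block_1 / conv_block_2 plus pooling
feature_extractor = nn.Sequential(
    nn.Conv1d(1, 64, kernel_size=3, padding=1),
    nn.MaxPool1d(2),
)

flat = rearrange(x, "b ct n l -> (b ct n) 1 l")   # merge batch, cutout and scan dims
feats = feature_extractor(flat)                   # (B*CT*N, C, L/2)
features_all = rearrange(feats, "(b ct n) c l -> b ct n c l", b=B, ct=CT, n=N)
print(features_all.shape)                         # torch.Size([2, 450, 5, 64, 24])
```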
Hello,
I have noticed a substantial increase in both training time and inference time when going from 1 scan at a time to 5 scans at a time. For example, an epoch takes around 10 hours to finish when using 1 scan versus around 40 hours when using 5 scans (I am training on CPU for the moment, hence the large absolute training times). Similarly, when using the test_inference script I get around 1.1 seconds versus 3.5 seconds on my M1 Mac laptop.

I have looked into the code, and a great chunk of the time increase comes from the forward method of the DRSPAAM object in https://github.com/VisualComputingInstitute/2D_lidar_person_detection/blob/master/dr_spaam/dr_spaam/model/dr_spaam.py, at the for loop at line 102. Is there a reason this is done sequentially rather than parallelized/vectorized using torch's capabilities in this respect?
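For reference, the comparison can be reproduced on random inputs with something along these lines (a rough sketch: `model` stands in for a constructed DR-SPAAM network whose forward accepts a `(B, CT, N, L)` tensor, and the sizes are made up):

```python
import time
import torch

def time_forward(model, n_scan, B=1, CT=450, L=48, repeats=10):
    x = torch.randn(B, CT, n_scan, L)
    model.eval()
    with torch.no_grad():
        model(x)                          # warm-up pass
        t0 = time.perf_counter()
        for _ in range(repeats):
            model(x)
        elapsed = time.perf_counter() - t0
    return elapsed / repeats              # average seconds per forward pass

# e.g. print(time_forward(model, 1), time_forward(model, 5))
```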
Thank you