In reply to @morganalyssa "so i profiled my": are you using the new `convolve` API (cc @ezquerra)? https://github.com/mratsim/Arraymancer/pull/617

Using SIMD for convolution requires refactoring, but simply adding `omp simd` to that inner loop may already help: https://github.com/AngelEzquerra/Arraymancer/blob/4a22c333e24063709c4e798831d46935dd98e1df/src/arraymancer/tensor/math_functions.nim#L142

Otherwise, for the very adventurous, I've detailed how to do fast convolution / cross-correlation here: https://github.com/mratsim/laser/wiki/Convolution-optimisation-resources

There is also a high-level implementation that can be modified for SIMD here: https://github.com/SciNim/impulse/blob/master/benchmarks/image_filters/filter2d_separable.nim#L124-L149

Another approach is to have `convolve` use matrix multiplication, like the deep learning convolution here: https://github.com/mratsim/Arraymancer/blob/master/src/arraymancer/nn_primitives/fallback/conv.nim#L81-L106
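
To make the `omp simd` suggestion concrete, here is a rough C sketch of what the pragma looks like on the inner loop of a direct 1-D convolution. This is not the linked Arraymancer code: `convolve_valid` and its parameters are hypothetical names for illustration, and it assumes a "valid" convolution (no padding) on `float` data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not Arraymancer's implementation.
   "Valid" 1-D convolution: out[i] = sum over k of signal[i + k] * kernel[nk - 1 - k].
   `out` must have room for ns - nk + 1 elements. */
void convolve_valid(const float *signal, size_t ns,
                    const float *kernel, size_t nk,
                    float *out)
{
    assert(nk >= 1 && ns >= nk);
    const size_t n_out = ns - nk + 1;

    for (size_t i = 0; i < n_out; ++i) {
        float acc = 0.0f;
        /* Ask the compiler to vectorize this dot product. The reduction
           clause lets it reorder the float additions, which it would
           otherwise not do without fast-math style flags. */
        #pragma omp simd reduction(+:acc)
        for (size_t k = 0; k < nk; ++k)
            acc += signal[i + k] * kernel[nk - 1 - k];
        out[i] = acc;
    }
}
```

With GCC or Clang the pragma only takes effect when compiled with `-fopenmp-simd` (or `-fopenmp`); without it the loop still runs correctly, just without the vectorization hint. Whether it actually helps depends on the kernel length, so it's worth re-profiling after the change.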