Gaussian blur is one of the basic building blocks of computer vision algorithms. This repo is a step-by-step tutorial that shows how it can be accelerated. The same ideas can also be applied to other convolution operations.
To follow the optimizations as they build on each other, work through the code in this order:
- memory_management
- simd
- parallel_processing
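
As a point of reference for what these steps optimize, here is a minimal, unoptimized separable Gaussian blur. It is an illustrative sketch, not code from this repo: the names `makeGaussianKernel`, `clampIndex` and `gaussianBlur`, the single-channel `float` image layout, and the clamp-to-edge border handling are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: normalized 1-D Gaussian kernel of size 2*radius + 1.
std::vector<float> makeGaussianKernel(int radius, float sigma) {
    assert(radius >= 0 && sigma > 0.0f);
    std::vector<float> kernel(2 * radius + 1);
    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i) {
        const float w = std::exp(-(i * i) / (2.0f * sigma * sigma));
        kernel[i + radius] = w;
        sum += w;
    }
    for (float& w : kernel) w /= sum;  // weights now sum to 1
    return kernel;
}

// Clamp an index to [0, n - 1] (assumed border policy: repeat the edge pixel).
static int clampIndex(int i, int n) {
    return i < 0 ? 0 : (i >= n ? n - 1 : i);
}

// Hypothetical baseline: blur a row-major, single-channel float image.
// A 2-D Gaussian is separable, so one horizontal pass followed by one
// vertical pass gives the same result as a full 2-D convolution.
std::vector<float> gaussianBlur(const std::vector<float>& src,
                                int width, int height,
                                int radius, float sigma) {
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    const std::vector<float> kernel = makeGaussianKernel(radius, sigma);
    std::vector<float> tmp(src.size());
    std::vector<float> dst(src.size());

    // Horizontal pass: src -> tmp
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float acc = 0.0f;
            for (int k = -radius; k <= radius; ++k) {
                acc += kernel[k + radius] * src[y * width + clampIndex(x + k, width)];
            }
            tmp[y * width + x] = acc;
        }
    }

    // Vertical pass: tmp -> dst
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float acc = 0.0f;
            for (int k = -radius; k <= radius; ++k) {
                acc += kernel[k + radius] * tmp[clampIndex(y + k, height) * width + x];
            }
            dst[y * width + x] = acc;
        }
    }
    return dst;
}
```

In a loop nest like this, the vertical pass reads memory with a stride of `width`, the inner multiply-accumulate handles one pixel at a time, and the rows are processed on a single core. Those are the kinds of costs that memory layout, SIMD and multithreading optimizations usually target.
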
cd memory_management   # or simd, or parallel_processing
mkdir build
cd build
cmake ..
make
./fastGaussianBlur
See the runtimes for different image sizes on a Raspberry Pi 3B+ (32-bit OS, GCC 6.3).
For more details, read my posts on memory management, SIMD, and parallel processing.