Build a faster OpenCV for Raspberry Pi3
So you want to build a faster OpenCV for Raspberry Pi3, but want to be sure: are you using the right build flags? How can you prove it is faster? You can never be 100% sure in advance, but here is a methodical way to get there.
Build a faster OpenCV deb package for Raspberry Pi
Let’s cut to the chase and start with the conclusion. Later I’ll show you how to prove this is true and add a short discussion on how to make sure this is true for your use case.
From my personal experience and testing, this is how to build the best-performing OpenCV 3.1.0. The commands below assume that:
- The source is extracted in ~/Downloads/opencv-3.1.0 (note: for big builds I usually prefer to keep the source and build directories on storage mounted externally to the Pi).
- You have already followed the post Intel TBB on Raspberry Pi. (TBB is optional, for reasons I’ll detail later on. The commands below build with TBB; if you don’t want it, remove the TBB-related configuration parameters.)
cd ~/Downloads/opencv-3.1.0
mkdir build_rpi3_release_fp_tbb
cd build_rpi3_release_fp_tbb
cmake -DCMAKE_CXX_FLAGS="-DTBB_USE_GCC_BUILTINS=1 -D__TBB_64BIT_ATOMICS=0" -DENABLE_VFPV3=ON -DENABLE_NEON=ON -DBUILD_TESTS=OFF -DWITH_TBB=ON -DCMAKE_BUILD_TYPE=Release ..
make -j 4
Please note that building in parallel uses a lot of RAM, and the Pi doesn’t have much of it. If make fails with “out of memory” or slows down because of swap usage, reduce the parallelism and/or close unneeded processes (for example, disable the GUI and connect to the Pi over ssh instead).
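One way to choose the parallelism level is to derive it from the memory currently available. The Python sketch below is only an illustration: the 400 MB-per-job budget is an assumed figure that was not measured for this post, and `read_mem_available_mb` and `pick_make_jobs` are hypothetical helper names.

```python
import os


def read_mem_available_mb(meminfo_path="/proc/meminfo"):
    """Return MemAvailable from /proc/meminfo in MB, or None if it cannot be read."""
    try:
        with open(meminfo_path) as f:
            for line in f:
                if line.startswith("MemAvailable:"):
                    # The line looks like: "MemAvailable:   612345 kB"
                    return int(line.split()[1]) // 1024
    except OSError:
        pass
    return None


def pick_make_jobs(cpus, mem_available_mb, mb_per_job=400):
    """Cap the -j value by CPU count and by an assumed per-job memory budget."""
    by_memory = mem_available_mb // mb_per_job
    return max(1, min(cpus, by_memory))


if __name__ == "__main__":
    mem_mb = read_mem_available_mb()
    cpus = os.cpu_count() or 1
    # Be conservative when the available memory cannot be determined.
    jobs = 1 if mem_mb is None else pick_make_jobs(cpus, mem_mb)
    print("make -j", jobs)
```

If the build still runs out of memory with the suggested value, lower it by hand or raise the assumed per-job budget.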
Prepare a package
This will create a package named opencv_3.1.0-1_armhf.deb in the current build directory.
sudo apt-get install checkinstall
echo "opencv 3.1.0 build_rpi3_release_fp_tbb" > description-pak
echo | sudo checkinstall -D --install=no --pkgname=opencv --pkgversion=3.1.0 --provides=opencv --nodoc --backup=no --exclude=$HOME
sudo dpkg -i opencv_3.1.0-1_armhf.deb
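After installing, it can be worth checking that the flags actually made it into the build. The sketch below assumes the package includes OpenCV's Python bindings, which depends on your build environment. It uses `cv2.getBuildInformation()`, which returns the build configuration summary as text, and prints the lines that mention a few keywords. The keyword list and the `matching_lines` helper are my own choices for illustration, not an official check.

```python
def matching_lines(build_info, keywords):
    """Return the stripped lines of build_info that mention any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [
        line.strip()
        for line in build_info.splitlines()
        if any(k in line.lower() for k in lowered)
    ]


# Assumed keywords of interest; adjust them to what your own output contains.
KEYWORDS = ["neon", "vfp", "tbb"]

if __name__ == "__main__":
    try:
        import cv2  # only present if the Python bindings were built and installed
    except ImportError:
        print("cv2 module not found; the Python bindings are not installed")
    else:
        for line in matching_lines(cv2.getBuildInformation(), KEYWORDS):
            print(line)
```

Inspect the printed lines yourself rather than trusting a simple match, since a keyword such as TBB also appears on a line when the feature is disabled.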
How much faster is it?
In short – about 30% faster.
How did I come up with this number?
Faster by definition is a relative term, so we need to determine what use case we’re comparing and with which build configuration.
I’m comparing this build with a simple ‘Release’ configuration, also built on a Raspberry Pi3. There are too many possible use cases to pick a single one, so this is how I came up with an average:
- Build different configurations to compare in different build directories.
- Execute the performance tests which come with OpenCV 3.1.0 (run.py python scripts supplied with OpenCV).
- Compare the different builds against the ‘Release’ build (summary.py python script supplied with OpenCV). The output gives an ‘x-factor’ per test, which is how much slower or faster that test was relative to the base ‘Release’ build.
- Calculate an average for this ‘x-factor’ (small perl script I wrote to parse the html output of summary.py).
The resulting average x-factor was 1.2975, which is about 30% faster.
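The averaging step can be sketched as follows. This is not the Perl script used for the numbers above. It is a minimal Python illustration that assumes the x-factors have already been extracted from the summary.py output into a list of floats, and the sample values are invented purely to show the arithmetic.

```python
from statistics import mean


def average_x_factor(x_factors):
    """Arithmetic mean of per-test x-factors (1.0 means same speed as the baseline)."""
    if not x_factors:
        raise ValueError("no x-factors given")
    return mean(x_factors)


def percent_faster(x_factor):
    """Convert an x-factor into a percentage gain over the baseline."""
    return (x_factor - 1.0) * 100.0


if __name__ == "__main__":
    # Invented sample values, not measurements from this post.
    sample = [1.10, 1.45, 0.98, 1.62]
    avg = average_x_factor(sample)
    print(f"average x-factor: {avg:.4f} ({percent_faster(avg):.1f}% faster)")
    # The average reported in this post.
    print(f"{percent_faster(1.2975):.2f}% faster")
```

An arithmetic mean is used here because the post speaks of an average. A geometric mean is a common alternative when averaging ratios, and it would give a somewhat different figure.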
Was this performance gain due to TBB?
No. 28% should be credited to building with the floating-point optimization flags, as I verified in my tests. Still, squeezing out an extra 2% won’t do any harm. I use TBB anyway in many of my projects for its great parallelism tools (mainly the pipeline), but if you don’t want it, leave it out.
Show me the numbers
See the comparisons subdirectory of this GitHub project (download the html files and open them locally in a browser).
But this is an average. How can I make sure for my use case?
The bottom line is this: if you don’t have performance issues, then don’t optimize. If you do have performance issues and you’re sure there is nothing left to improve in your own code, then go ahead and investigate several build configurations. Try linking your application against each one and measure the benefit.
The simple automated “build – test – compare” system I used to perform the above tests is available in this GitHub project, and you can use it as you wish. See the project’s page for simple usage instructions.
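To measure your own use case, one simple approach is to time the same workload against each installed build and compare the medians. The sketch below is generic Python and is not part of the GitHub project mentioned above. `time_workload` and `speedup` are hypothetical helper names, and the workload shown is a stand-in for your own OpenCV code.

```python
import time
from statistics import median


def time_workload(workload, repeats=5):
    """Run workload() several times and return the median wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return median(samples)


def speedup(baseline_seconds, candidate_seconds):
    """A ratio above 1.0 means the candidate build is faster than the baseline."""
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    # Stand-in workload; replace it with the OpenCV calls your application makes.
    def workload():
        sum(i * i for i in range(100_000))

    print(f"median: {time_workload(workload):.4f} s")
```

Run the script once per build configuration and pass the two medians to `speedup` to get a ratio comparable to the x-factor above.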
- Don’t optimize unless you need to.
- Optimize your own code first.
- You can build a 30% faster OpenCV package for Raspberry Pi3.
- Building packages is preferable since they are easier to maintain and to deliver for installation.
- Using TBB is optional since OpenCV has an alternative parallelism mechanism.
- Independently you can build a simple and fast pipeline with TBB for your own usage with OpenCV (I’ll share example code in a future post).
- UPDATE: See more on using OpenCV with TBB at https://www.theimpossiblecode.com/blog/faster-opencv-smiles-tbb.
- You can freely use the build/pkg/test/compare system I used for OpenCV 3.1.0 from here.
See you soon in an upcoming post.