Build a faster OpenCV for Raspberry Pi3


So you want to build a faster OpenCV for Raspberry Pi3, but you want to be sure: are you using the right build flags? How can you prove the result is faster? You can never be 100% sure in advance, but here is a methodical way to get there.

Build a faster OpenCV deb package for Raspberry Pi

Let’s cut to the chase and start with the conclusion. Later I’ll show you how to prove this is true and add a short discussion on how to make sure this is true for your use case.

From my personal experience and testing, this is how you would build the best performing OpenCV 3.1.0:

Assumptions

  1. The source is extracted in ~/Downloads/opencv-3.1.0 (note: for big builds I usually prefer to keep the source and build directories on storage mounted externally to the Pi).
  2. You already followed the post "Intel TBB on Raspberry Pi". This is optional, for reasons I’ll detail later on, but the commands below build with TBB. If you don’t want it, remove the TBB related configuration parameters (-DWITH_TBB=ON and the TBB defines in -DCMAKE_CXX_FLAGS).

Configure

cd ~/Downloads/opencv-3.1.0
mkdir build_rpi3_release_fp_tbb
cd build_rpi3_release_fp_tbb
cmake -DCMAKE_CXX_FLAGS="-DTBB_USE_GCC_BUILTINS=1 -D__TBB_64BIT_ATOMICS=0" -DENABLE_VFPV3=ON -DENABLE_NEON=ON -DBUILD_TESTS=OFF -DWITH_TBB=ON -DCMAKE_BUILD_TYPE=Release ..
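
Before starting a long build, it can be worth checking that the options really ended up in the CMake cache. The following is a minimal sketch, not part of the original procedure: cache_flag_on is a hypothetical helper name, and it assumes the cache entries have the usual NAME:TYPE=VALUE form that CMake writes to CMakeCache.txt.

```shell
# Hypothetical helper: succeed if option NAME is set to ON in a CMakeCache.txt file.
# Assumes cache entries look like NAME:TYPE=VALUE (for example ENABLE_NEON:BOOL=ON).
cache_flag_on() {
    cache_file=$1
    name=$2
    grep -Eq "^${name}:[A-Z]+=ON\$" "$cache_file"
}

# Example, run inside the build directory after cmake finishes:
# for flag in ENABLE_VFPV3 ENABLE_NEON WITH_TBB; do
#     if cache_flag_on CMakeCache.txt "$flag"; then
#         echo "$flag is ON"
#     else
#         echo "$flag is NOT ON"
#     fi
# done
```

If one of the flags is reported as not ON, re-run cmake in a clean build directory before spending time on make.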

Build

make -j 4

Please note that building in parallel needs a lot of RAM, and the Pi doesn’t have much of it. If make fails with “out of memory”, or becomes slow because of swap usage, reduce the parallelism and/or close unneeded processes (for example, disable the GUI and connect to the Pi over ssh instead).
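
One way to choose the parallelism is to derive it from the memory that is currently available. This is only a sketch under an assumption I have not measured: pick_jobs is a hypothetical helper, and the budget of roughly 400 MB per compile job is a guess that you should tune for your own build.

```shell
# Hypothetical helper: pick a value for make -j from available memory.
# Usage: pick_jobs AVAILABLE_KB CORES [PER_JOB_KB]
# Assumption (not measured): each compile job needs roughly 400 MB (409600 kB).
pick_jobs() {
    mem_kb=$1
    cores=$2
    per_job_kb=${3:-409600}
    jobs=$(( mem_kb / per_job_kb ))
    # Always run at least one job, and never more jobs than cores.
    if [ "$jobs" -lt 1 ]; then jobs=1; fi
    if [ "$jobs" -gt "$cores" ]; then jobs=$cores; fi
    echo "$jobs"
}

# Example on the Pi (reads MemAvailable from /proc/meminfo):
# make -j "$(pick_jobs "$(awk '/MemAvailable/ {print $2}' /proc/meminfo)" "$(nproc)")"
```

If the build still runs out of memory with the suggested value, fall back to make -j 1.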

Prepare a package

This will create a package named opencv_3.1.0-1_armhf.deb in the current build directory.

sudo apt-get install checkinstall
echo "opencv 3.1.0 build_rpi3_release_fp_tbb" > description-pak
echo | sudo checkinstall -D --install=no --pkgname=opencv --pkgversion=3.1.0 --provides=opencv --nodoc --backup=no --exclude=$HOME

Install

sudo dpkg -i opencv_3.1.0-1_armhf.deb

How much faster is it?

In short – about 30% faster.

How did I come up with this number?

Faster by definition is a relative term, so we need to determine what use case we’re comparing and with which build configuration.

I’m comparing this build with a plain ‘Release’ configuration, built on Raspberry Pi3. As for the use case, there are too many possible use cases to pick a single one, so this is how I came up with an average:

  1. Build different configurations to compare in different build directories.
  2. Execute the performance tests which come with OpenCV 3.1.0 (run.py python scripts supplied with OpenCV).
  3. Compare the different builds against the ‘Release’ build (summary.py python script supplied with OpenCV). The output gives an ‘x-factor’ per test, which is how much slower or faster that test was relative to the base ‘Release’ build.
  4. Calculate an average for this ‘x-factor’ (small perl script I wrote to parse the html output of summary.py).

The resulting average x-factor was 1.2975, which is roughly 30% faster.
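
The averaging step itself is simple arithmetic. The sketch below is not the Perl script I used. It assumes the per-test x-factors have already been extracted from the summary.py output into plain text, one number per line; mean_factor and factor_to_percent are hypothetical helper names.

```shell
# Hypothetical helper: arithmetic mean of the x-factors read from stdin,
# one number per line; blank lines are skipped.
mean_factor() {
    awk 'NF { sum += $1; n++ } END { if (n) printf "%.4f\n", sum / n }'
}

# Hypothetical helper: express an x-factor as "percent faster",
# for example an x-factor of 1.25 means 25.00 percent faster.
factor_to_percent() {
    awk -v f="$1" 'BEGIN { printf "%.2f\n", (f - 1) * 100 }'
}

# Example with made-up values (not my measurements):
# printf '1.50\n1.00\n' | mean_factor      # prints 1.2500
# factor_to_percent 1.25                   # prints 25.00
```

Keep in mind that a plain average weighs every test equally, so a few tests with a large x-factor can dominate the result.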

Was this performance gain due to TBB?

No. 28% should be credited to building with the floating point optimization flags (VFPv3 and NEON), as I verified in my tests. Nevertheless, squeezing out an extra 2% won’t do any harm. I’m using TBB anyway in many of my projects for its great parallelism tools (mainly the pipeline), but if you don’t want it, leave it out.

Show me the numbers

See the comparisons subdirectory of this GitHub project (download the HTML files and open them locally in a browser).

But this is an average, how can I make sure for my use case?

The bottom line is this – if you don’t have performance issues then don’t optimize. If you do have performance issues and you’re sure there is nothing you can improve in your own code, then go ahead and investigate several build configurations. Try linking with each one and see your benefits.
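
For your own application, a rough comparison could look like the sketch below. It is an illustration only: time_cmd and speedup are hypothetical helpers, the library paths and ./my_app in the example are placeholders, and the timing relies on GNU date supporting %N (nanoseconds). Run each measurement several times, because a single run on the Pi can be noisy.

```shell
# Hypothetical helper: run a command and print its wall-clock time in seconds.
# Assumes GNU date, which supports %N for nanoseconds.
time_cmd() {
    start=$(date +%s.%N)
    "$@" > /dev/null 2>&1
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
}

# Hypothetical helper: how many times faster the NEW timing is than the BASE timing.
# Usage: speedup BASE_SECONDS NEW_SECONDS
speedup() {
    awk -v base="$1" -v new="$2" 'BEGIN { if (new > 0) printf "%.2f\n", base / new }'
}

# Example (placeholder paths and application name):
# t_base=$(time_cmd env LD_LIBRARY_PATH=/path/to/release/lib ./my_app)
# t_new=$(time_cmd env LD_LIBRARY_PATH=/path/to/fp_tbb/lib ./my_app)
# speedup "$t_base" "$t_new"
```

A result above 1.00 means the second build was faster for your workload, which is the number that matters more than any average.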

The simple automated “build – test – compare” system I used to perform the above tests is available in this GitHub project, and you can use it as you wish. See the project’s page for simple usage instructions.

Summary

  • Don’t optimize unless you need to.
  • Optimize your own code first.
  • You can build a 30% faster OpenCV package for Raspberry Pi3.
    • Building packages is always better since they are easier to maintain and deliver for installations.
  • Using TBB is optional since OpenCV has an alternative parallelism mechanism.
  • You can freely use the build/pkg/test/compare system I used for OpenCV 3.1.0 from here.

See you soon in an upcoming post.

Sagi Zeevi

A software developer brain surgeon ... if software only had a brain. An electronics hobbyist heart surgeon ... if electronics only had a heart.


5 Responses

  1. tal says:

    hi,
    Thank you for your great and well explained posts!
    I have installed your tbb package and compiled openCV according to your instructions, the only difference being my openCV is in different directory and that I use openCV_contrib.
    I encountered a problem while trying to run the dmp example code in the openCV_contrib, it says that TBB is not defined.
    Maybe you have any advice? I think that maybe I’m using a wrong command line when I compile the code, can you write the compilation command?

    Thanks,
    Tal

  2. Sagi Zeevi says:

    Hi Tal,
    Can you copy paste your compilation command and error?

  3. James says:

    Thanks for this interesting article. I had not heard of TBB, but my initial look at it makes it appear very useful. For now I have just been using Boost for basic threading. Would be interesting to see an article with some pointers and tips of how you make use of TBB. Cheers!
