Here’s a simple example of using C++17 (parallel) execution policies for summation.
#include <vector>
#include <numeric>    // std::reduce
#include <execution>
We have to load the Threading Building Blocks (TBB) library, which does the actual parallelization under the hood:
#pragma cling load("libtbb.so.2")
const std::vector<double> v(10'000'007, 0.1);
%%timeit
std::reduce(std::execution::seq, v.cbegin(), v.cend());
Output
128 ms ± 23.8 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
std::reduce(std::execution::par, v.cbegin(), v.cend());
Output
32.6 ms ± 5.13 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
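The `%%timeit` magic exists only inside the notebook. In a standalone program, a rough equivalent can be sketched with `std::chrono`. The helper names `mean_ms` and `time_reduce` and the `runs` parameter are hypothetical, not part of any library, and the sketch calls `std::reduce` without an execution policy so that it builds without TBB.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: calls f() `runs` times and returns the mean
// wall-clock time per call in milliseconds.
template <class F>
double mean_ms(F&& f, std::size_t runs = 10) {
    assert(runs > 0);  // avoid dividing by zero below
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (std::size_t i = 0; i < runs; ++i) {
        f();
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / static_cast<double>(runs);
}

// Hypothetical helper: times std::reduce (no execution policy) over v.
double time_reduce(const std::vector<double>& v, std::size_t runs = 10) {
    volatile double sink = 0.0;  // keeps the compiler from discarding the sum
    return mean_ms([&] { sink = std::reduce(v.cbegin(), v.cend()); }, runs);
}
```

To time the policy-based calls from the cells above, pass a lambda that calls `std::reduce` with `std::execution::seq` or `std::execution::par` to `mean_ms`. Unlike `%%timeit`, this sketch reports only a mean, not a standard deviation.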
auto s = std::reduce(std::execution::par, v.cbegin(), v.cend());
s
Output
1000000.7
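The printed value agrees with the exact product 10,000,007 × 0.1 = 1,000,000.7 at the displayed precision. Because `std::reduce` is allowed to regroup and reorder the additions, its floating-point result may not be bit-identical to a left-to-right `std::accumulate`, so it is safer to compare sums with a tolerance. Below is a minimal sketch of such a check. The helper names `close_enough` and `sums_agree` and the default tolerance are illustrative assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical helper: true when a and b differ by at most rel_tol,
// relative to the larger of the two magnitudes.
bool close_enough(double a, double b, double rel_tol = 1e-6) {
    assert(rel_tol >= 0.0);
    return std::fabs(a - b) <= rel_tol * std::fmax(std::fabs(a), std::fabs(b));
}

// Compares a strictly left-to-right sum with std::reduce, which may
// group the additions differently.
bool sums_agree(const std::vector<double>& v) {
    const double ordered = std::accumulate(v.cbegin(), v.cend(), 0.0);
    const double unordered = std::reduce(v.cbegin(), v.cend());
    return close_enough(ordered, unordered);
}
```

The same comparison applies to the `std::execution::par` result `s` from the cell above, for example `close_enough(s, 1'000'000.7)`.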