充分利用多核CPU,计算很大的List中所有整数和:多线程不一定快

有个阿里面试题,是这样描述的:有一个很大的整数 List,需要求这个 List 中所有整数的和,写一个可以充分利用多核 CPU 的代码,来计算结果。

这道题考察的知识点很简单。将这个很大的 List 切割,然后利用多线程分别求解,等所有子线程都返回结果时再合并,得到最终的结果。

直接看代码:

#include <iostream>#include <vector>#include <algorithm>#include <numeric>#include <chrono>#include <thread>#include <mutex>class Timer {private: std::chrono::time_point<std::chrono::steady_clock> start, end; std::chrono::duration<float> duration;public: Timer() { start = std::chrono::high_resolution_clock::now(); } ~Timer() { end = std::chrono::high_resolution_clock::now(); duration = end - start; float s = duration.count() * 1000.0f; std::cout << 'Timer took ' << s << 'ms' << std::endl; }};static std::vector<unsigned int> v(1e4, 0);void CreateNums(){ for (unsigned int i = 0; i < v.size(); i++) { v[i] = i; }}static long long sum = 0;unsigned int step = 1e3;std::mutex mtx;void SubSum(int index){ long long tmp = std::accumulate(v.begin() + index * step, v.begin() + (index + 1) * step, 0); mtx.lock(); sum += tmp; mtx.unlock();}int main(){ CreateNums(); Timer multiThread_cost; std::cout << 'multi thread test: ' << std::endl; int threadNum = 10; for (int i = 0; i < threadNum; i++) { std::thread tmp(SubSum, i); tmp.join(); } std::cout << 'sum = ' << sum << std::endl; multiThread_cost.~Timer(); sum = 0; Timer singleThread_cost; std::cout << 'single thread test: ' << std::endl; sum = std::accumulate(v.begin(), v.end(), 0); std::cout << 'sum = ' << sum << std::endl; singleThread_cost.~Timer(); std::cin.get();}

上述代码中,对 10000 个整数求和,我们在 main 主线程中起了 10 个子线程,每个子线程分别计算 1000 个数的和,再将 10 个子线程的结果累加。

多线程、单线程求和时间消耗对比

从程序运行的结果来看,多线程的时间开销反而比单线程更多,这是为什么呢?主要原因是整数规模还不够大,整数求和的开销相比创建线程对象、加锁的开销太小了,从而导致在整个时间消耗中后者占主要部分。

我们再来看下面一段代码:

#include <iostream>#include <vector>#include <algorithm>#include <numeric>#include <chrono>#include <thread>#include <mutex>class Timer {private:    std::chrono::time_point<std::chrono::steady_clock> start, end;    std::chrono::duration<float> duration;public:    Timer()    {        start = std::chrono::high_resolution_clock::now();    }    ~Timer()    {        end = std::chrono::high_resolution_clock::now();        duration = end - start;        float s = duration.count() * 1000.0f;        std::cout << 'Timer took ' << s << 'ms' << std::endl;    }};void DoTask(){    }void SubSum(int index){    for (unsigned long long i = 0; i < 1e7; i++) {        DoTask();    }}int main(){    Timer multiThread_cost;    std::cout << 'multi thread test: ' << std::endl;    int threadNum = 10;    for (int i = 0; i < threadNum; i++) {        std::thread tmp(SubSum, i);        tmp.join();    }    multiThread_cost.~Timer();        Timer singleThread_cost;    std::cout << 'single thread test: ' << std::endl;    for (unsigned long long i = 0; i < 1e8; i++) {        DoTask();    }    singleThread_cost.~Timer();    std::cin.get();}

单线程和多线程任务时间消耗对比

从上述结果来看,多线程时间开销比单线程少了。这里的任务用空函数 DoTask 代替了,而且任务规模很大(1亿次),所以多线程的优势就体现出来了。

(0)

相关推荐