充分利用多核CPU,计算很大的List中所有整数和:多线程不一定快
有个阿里面试题,是这样描述的:有一个很大的整数 List,需要求这个 List 中所有整数的和,写一个可以充分利用多核 CPU 的代码,来计算结果。
这道题考察的知识点很简单。将这个很大的 List 切割,然后利用多线程分别求解,等所有子线程都返回结果时再合并,得到最终的结果。
直接看代码:
#include <iostream>#include <vector>#include <algorithm>#include <numeric>#include <chrono>#include <thread>#include <mutex>class Timer {private: std::chrono::time_point<std::chrono::steady_clock> start, end; std::chrono::duration<float> duration;public: Timer() { start = std::chrono::high_resolution_clock::now(); } ~Timer() { end = std::chrono::high_resolution_clock::now(); duration = end - start; float s = duration.count() * 1000.0f; std::cout << 'Timer took ' << s << 'ms' << std::endl; }};static std::vector<unsigned int> v(1e4, 0);void CreateNums(){ for (unsigned int i = 0; i < v.size(); i++) { v[i] = i; }}static long long sum = 0;unsigned int step = 1e3;std::mutex mtx;void SubSum(int index){ long long tmp = std::accumulate(v.begin() + index * step, v.begin() + (index + 1) * step, 0); mtx.lock(); sum += tmp; mtx.unlock();}int main(){ CreateNums(); Timer multiThread_cost; std::cout << 'multi thread test: ' << std::endl; int threadNum = 10; for (int i = 0; i < threadNum; i++) { std::thread tmp(SubSum, i); tmp.join(); } std::cout << 'sum = ' << sum << std::endl; multiThread_cost.~Timer(); sum = 0; Timer singleThread_cost; std::cout << 'single thread test: ' << std::endl; sum = std::accumulate(v.begin(), v.end(), 0); std::cout << 'sum = ' << sum << std::endl; singleThread_cost.~Timer(); std::cin.get();}
上述代码中,对 10000 个整数求和,我们在 main 主线程中起了 10 个子线程,每个子线程分别计算 1000 个数的和,再将 10 个子线程的结果累加。
多线程、单线程求和时间消耗对比
从程序运行的结果来看,多线程的时间开销反而比单线程更多,这是为什么呢?主要原因是整数规模还不够大,整数求和的开销相比创建线程对象、加锁的开销太小了,从而导致在整个时间消耗中后者占主要部分。
我们再来看下面一段代码:
#include <iostream>#include <vector>#include <algorithm>#include <numeric>#include <chrono>#include <thread>#include <mutex>class Timer {private: std::chrono::time_point<std::chrono::steady_clock> start, end; std::chrono::duration<float> duration;public: Timer() { start = std::chrono::high_resolution_clock::now(); } ~Timer() { end = std::chrono::high_resolution_clock::now(); duration = end - start; float s = duration.count() * 1000.0f; std::cout << 'Timer took ' << s << 'ms' << std::endl; }};void DoTask(){ }void SubSum(int index){ for (unsigned long long i = 0; i < 1e7; i++) { DoTask(); }}int main(){ Timer multiThread_cost; std::cout << 'multi thread test: ' << std::endl; int threadNum = 10; for (int i = 0; i < threadNum; i++) { std::thread tmp(SubSum, i); tmp.join(); } multiThread_cost.~Timer(); Timer singleThread_cost; std::cout << 'single thread test: ' << std::endl; for (unsigned long long i = 0; i < 1e8; i++) { DoTask(); } singleThread_cost.~Timer(); std::cin.get();}
单线程和多线程任务时间消耗对比
从上述结果来看,多线程时间开销比单线程少了。这里的任务用空函数 DoTask 代替了,而且任务规模很大(1亿次),所以多线程的优势就体现出来了。
赞 (0)