Lambda vs. Bind: A Performance Comparison
template <typename Function>
void do_test_loop(Function func, const uint64_t upper_limit = 1000000000ULL)
{
    for (uint64_t i = 0; i < upper_limit; ++i)
        func(i);
}
1. Use std::bind to create a polymorphic std::function<void (uint64_t)>.
void test_accumulate_bind_function(uint64_t& x, uint64_t i)
{
    x += i;
}

uint64_t test_accumulate_bind()
{
    namespace arg = std::placeholders;

    uint64_t x = 0;
    std::function<void (uint64_t)> accumulator = std::bind(&test_accumulate_bind_function, std::ref(x), arg::_1);
    do_test_loop(accumulator);
    return x;
}
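The std::ref in the bind call above matters, because std::bind stores copies of its bound arguments unless they are wrapped in a reference wrapper. The sketch below contrasts the two cases on a much smaller loop. The names add_to, accumulate_by_copy and accumulate_by_ref are made up for this illustration and are not part of the benchmark.

```cpp
#include <cstdint>
#include <functional>

// Same shape as test_accumulate_bind_function in the article.
void add_to(uint64_t& x, uint64_t i) { x += i; }

// x is bound by value: std::bind keeps its own copy of x,
// so the additions never reach the caller's variable.
uint64_t accumulate_by_copy(uint64_t n)
{
    namespace arg = std::placeholders;

    uint64_t x = 0;
    std::function<void (uint64_t)> accumulator = std::bind(&add_to, x, arg::_1);
    for (uint64_t i = 0; i < n; ++i)
        accumulator(i);
    return x;
}

// x is bound through std::ref: the bound object refers to the caller's x.
uint64_t accumulate_by_ref(uint64_t n)
{
    namespace arg = std::placeholders;

    uint64_t x = 0;
    std::function<void (uint64_t)> accumulator = std::bind(&add_to, std::ref(x), arg::_1);
    for (uint64_t i = 0; i < n; ++i)
        accumulator(i);
    return x;
}
```

Calling accumulate_by_copy(10) returns 0, while accumulate_by_ref(10) returns 45 (the sum of 0 through 9).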
uint64_t test_accumulate_lambda()
{
    uint64_t x = 0;
    auto accumulator = [&x] (uint64_t i) { x += i; };
    do_test_loop(accumulator);
    return x;
}
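The lambda expression involves no run-time indirection. Of course, we also lose the polymorphism that std::function provides. A lambda has an unnamed type that the compiler resolves statically, which is why the variable has to be declared with the auto keyword. The variable accumulator holds the result of the lambda expression, and no other lambda expression produces a value of the same type. Even two expressions with identical bodies do not have the same type. If do_test_loop were an ordinary function implemented in a .cpp file, we would have no way to name the type of the lambda passed to it.

Fortunately, some clever people anticipated this problem: assigning a lambda expression to a std::function is not only possible, it is extremely easy: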
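The claim that two textually identical lambdas still have different types can be checked with std::is_same. This is a small sketch and not part of the benchmark. The helper same_type and the two test functions are names invented for this illustration.

```cpp
#include <cstdint>
#include <type_traits>

// Returns true when both arguments have exactly the same static type.
template <typename A, typename B>
constexpr bool same_type(const A&, const B&)
{
    return std::is_same<A, B>::value;
}

// Two lambda expressions with identical text: each one gets its own closure type.
bool identical_lambdas_share_a_type()
{
    uint64_t x = 0;
    auto first  = [&x] (uint64_t i) { x += i; };
    auto second = [&x] (uint64_t i) { x += i; };
    return same_type(first, second);
}

// A copy of a closure object, by contrast, has the same type as the original.
bool copied_lambda_shares_a_type()
{
    uint64_t x = 0;
    auto first = [&x] (uint64_t i) { x += i; };
    auto copy  = first;
    return same_type(first, copy);
}
```

identical_lambdas_share_a_type() returns false and copied_lambda_shares_a_type() returns true, which is why a template parameter (or auto) is needed to accept a lambda without erasing its type.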
uint64_t test_accumulate_bound_lambda()
{
    uint64_t x = 0;
    std::function<void (uint64_t)> accumulator = [&x] (uint64_t i) { x += i; };
    do_test_loop(accumulator);
    return x;
}
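By using lambda syntax instead of std::bind, we get all the power of std::function's polymorphism together with the convenience and performance of C++ lambda expressions. It sounds like a win-win.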
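To make the polymorphism concrete, here is a sketch in which a std::bind result and a lambda are stored side by side in one container, something their own distinct types would not allow. The names add_to, run_all and mixed_callbacks_total are hypothetical and exist only for this example.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

void add_to(uint64_t& x, uint64_t i) { x += i; }

// Calls every stored callback with each value in [0, n).
// All callbacks share one static type, whatever they were built from.
void run_all(const std::vector<std::function<void (uint64_t)>>& callbacks, uint64_t n)
{
    for (uint64_t i = 0; i < n; ++i)
        for (const auto& callback : callbacks)
            callback(i);
}

uint64_t mixed_callbacks_total(uint64_t n)
{
    namespace arg = std::placeholders;

    uint64_t from_bind = 0;
    uint64_t from_lambda = 0;

    std::vector<std::function<void (uint64_t)>> callbacks;
    callbacks.push_back(std::bind(&add_to, std::ref(from_bind), arg::_1));
    callbacks.push_back([&from_lambda] (uint64_t i) { from_lambda += i; });

    run_all(callbacks, n);
    return from_bind + from_lambda;
}
```

With n = 10 each accumulator reaches 45, so mixed_callbacks_total(10) returns 90.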
template <typename Function>
void run_test(const std::string& name, Function func)
{
    std::cout << name;
    timer t;
    volatile_write(func());
    timer::duration duration = t.elapsed();
    std::cout << ' ' << duration.count() << std::endl;
}

int main()
{
    run_test("Accumulate (lambda) ", &test_accumulate_lambda);
    run_test("Accumulate (bind) ", &test_accumulate_bind);
    run_test("Accumulate (bound lambda)", &test_accumulate_bound_lambda);
}
Without further ado, here are the results when compiled with g++ 4.6.2 -O3 and run on an Intel Core i7 Q740:
Accumulate (lambda) 7
Accumulate (bind) 4401849
Accumulate (bound lambda) 4379315
(gdb) disassemble test_accumulate_lambda
Dump of assembler code for function _Z22test_accumulate_lambdav:
0x0000000000400e70 <+0>: movabs $0x6f05b59b5e49b00,%rax
0x0000000000400e75 <+5>: retq
End of assembler dump.
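The immediate value loaded into %rax is not arbitrary: 0x6f05b59b5e49b00 is the closed-form sum 0 + 1 + ... + 999999999, so the whole loop has been folded into a constant. The sketch below checks this at compile time. The function names sum_below and sum_by_loop are made up for this illustration.

```cpp
#include <cstdint>

// Closed form for 0 + 1 + ... + (n - 1).
// For n = 1000000000 the product n * (n - 1) still fits in 64 bits.
constexpr uint64_t sum_below(uint64_t n)
{
    return n * (n - 1) / 2;
}

// The constant from the disassembly is exactly the sum the loop would compute.
static_assert(sum_below(1000000000ULL) == 0x6f05b59b5e49b00ULL,
              "the movabs immediate equals the sum of 0..999999999");

// The same sum computed the slow way, for comparison on small inputs.
uint64_t sum_by_loop(uint64_t n)
{
    uint64_t x = 0;
    for (uint64_t i = 0; i < n; ++i)
        x += i;
    return x;
}
```

In decimal the constant is 499999999500000000.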
After inlining do_test_loop and the lambda, the function is equivalent to:

uint64_t test_accumulate_lambda()
{
    uint64_t x = 0;
    // do_test_loop:
    for (uint64_t i = 0; i < 1000000000; ++i)
        x += i;
    return x;
}
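Any decent compiler will optimize this away. I think the most important thing to take from this simple example is that the compiler knows the lambda statically, so you can use lambdas freely without worrying about their performance.

So what happens when we go through std::function? Here its polymorphism makes the call hard to see through. When do_test_loop is instantiated with std::function<void (uint64_t)>, the compiler does not know what func does; it could do anything, since all the compiler sees is the entry point into std::function. The difference between std::bind and the lambda expression is tiny. If you run the test several times, on my machine the lambda always comes out slightly faster than std::bind, but the numbers are not statistically significant. The result may well differ on other machines; if I had to guess, I would attribute the gap to std::reference_wrapper. Let's look at the call stacks of the two functions.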
#0 test_accumulate_bind_function (x=@0x7fffffffe5d0, i=0) at lambda_vs_bind.cpp:106
#1 0x0000000000401111 in operator() (__args#0=0, this=<optimized out>) at /usr/local/include/gcc-4.6.2/functional:2161
#2 do_test_loop<std::function<void(long unsigned int)> > (func=<optimized out>, upper_limit=<optimized out>) at lambda_vs_bind.cpp:93
#3 test_accumulate_bind () at lambda_vs_bind.cpp:115
#4 0x0000000000401304 in run_test<unsigned long (*)()> (name=<optimized out>, func=0x401080 <test_accumulate_bind()>) at lambda_vs_bind.cpp:84
#5 0x0000000000401411 in main () at lambda_vs_bind.cpp:136
Lambda Expression
#0 std::_Function_handler<void(long unsigned int), test_accumulate_bound_lambda()::<lambda(uint64_t)> >::_M_invoke(const std::_Any_data &, unsigned long) (__functor=..., __args#0=0) at /usr/local/include/gcc-4.6.2/functional:1778
#1 0x0000000000400fa9 in operator() (__args#0=0, this=<optimized out>) at /usr/local/include/gcc-4.6.2/functional:2161
#2 do_test_loop<std::function<void(long unsigned int)> > (func=<optimized out>, upper_limit=<optimized out>) at lambda_vs_bind.cpp:93
#3 test_accumulate_bound_lambda () at lambda_vs_bind.cpp:126
#4 0x0000000000401304 in run_test<unsigned long (*)()> (name=<optimized out>, func=0x400f20 <test_accumulate_bound_lambda()>) at lambda_vs_bind.cpp:84
#5 0x000000000040143e in main () at lambda_vs_bind.cpp:140
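The only difference between them is in the call made from std::function's operator(). To see what is really happening, let's take a quick look at how std::function is implemented in g++ 4.6.2: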
template<typename _Res, typename... _ArgTypes>
class function<_Res(_ArgTypes...)>
    : public _Maybe_unary_or_binary_function<_Res, _ArgTypes...>,
      private _Function_base
{
    // a whole bunch of implementation details

private:
    typedef _Res (*_Invoker_type)(const _Any_data&, _ArgTypes...);
    _Invoker_type _M_invoker;
};
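The _M_invoker member is a plain function pointer, and every call through std::function goes through it. The following is a heavily simplified sketch of that idea, not libstdc++'s actual code: tiny_function_ref is a hypothetical class that only refers to a callable (it does not own or copy it, and it has no _Any_data storage), and sum_through_invoker is a made-up driver.

```cpp
#include <cstdint>

// Hypothetical, minimal stand-in for std::function<void (uint64_t)>.
// It keeps a type-erased pointer to the callable plus a function pointer
// that knows how to cast the pointer back and make the call.
class tiny_function_ref
{
public:
    template <typename Functor>
    tiny_function_ref(Functor& f)
        : _object(&f), _invoker(&invoke<Functor>)
    { }

    // Every call is an indirect call through _invoker, so the optimizer
    // generally cannot see which functor body will run.
    void operator()(uint64_t arg) const { _invoker(_object, arg); }

private:
    typedef void (*invoker_type)(void*, uint64_t);

    // One instantiation per functor type; this is where the type is recovered.
    template <typename Functor>
    static void invoke(void* object, uint64_t arg)
    {
        (*static_cast<Functor*>(object))(arg);
    }

    void*        _object;
    invoker_type _invoker;
};

uint64_t sum_through_invoker(uint64_t n)
{
    uint64_t x = 0;
    auto accumulator = [&x] (uint64_t i) { x += i; };
    tiny_function_ref erased(accumulator);
    for (uint64_t i = 0; i < n; ++i)
        erased(i);
    return x;
}
```

sum_through_invoker(10) returns 45. The call erased(i) corresponds to the operator() frame in the backtraces, and invoke<Functor> plays the role of _Function_handler::_M_invoke.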
Accumulate (boost bind) 3223174
Accumulate (boost bound lambda) 4255098
#0 test_accumulate_bind_function (x=@0x7fffffffe600, i=0) at lambda_vs_bind.cpp:114
#1 0x00000000004018a3 in operator() (a0=0, this=<optimized out>) at /usr/local/include/boost/function/function_template.hpp:1013
#2 do_test_loop<boost::function<void(long unsigned int)> > (upper_limit=<optimized out>, func=<optimized out>) at lambda_vs_bind.cpp:101
#3 test_accumulate_boost_bind () at lambda_vs_bind.cpp:144
#4 0x0000000000401f44 in run_test<unsigned long (*)()> (name=<optimized out>, func=0x401800 <test_accumulate_boost_bind()>) at lambda_vs_bind.cpp:92
#5 0x000000000040207e in main () at lambda_vs_bind.cpp:161
(I could probably write an entire article about why boost::bind is faster than std::bind...)
functional

template<typename _Functor, typename... _ArgTypes>
inline typename _Bind_helper<_Functor, _ArgTypes...>::type
bind(_Functor&& __f, _ArgTypes&&... __args)
{
    typedef _Bind_helper<_Functor, _ArgTypes...> __helper_type;
    typedef typename __helper_type::__maybe_type __maybe_type;
    typedef typename __helper_type::type __result_type;
    return __result_type(__maybe_type::__do_wrap(std::forward<_Functor>(__f)),
                         std::forward<_ArgTypes>(__args)...);
}

boost/bind/bind.hpp (with the macros expanded)

template<class F, class A1, class A2>
_bi::bind_t<_bi::unspecified, F, typename _bi::list_av_2<A1, A2>::type>
bind(F f, A1 a1, A2 a2)
{
    typedef typename _bi::list_av_2<A1, A2>::type list_type;
    return _bi::bind_t<_bi::unspecified, F, list_type> (f, list_type(a1, a2));
}
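1. Source code
You can get the source code for the program here. It compiles and runs with g++ 4.6.2, and any compiler with better C++11 support should work as well. I used Boost 1.47, but older and newer versions should work just as well, since the boost::bind syntax has not changed in a while (though that may not always hold). If you want to build and run without Boost, change the value of USE_BOOST to 0.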
2. volatile_write
template <typename T>
void volatile_write(const T& x)
{
    volatile T* p = new T;
    *p = x;
    delete p;
}
/**
 * Copyright 2011 Travis Gockel
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
**/

// Turn building and testing boost::bind on or off with this macro
#define USE_BOOST 1

// workaround for varieties of g++-4.6 with --std=gnu++0x
#ifndef _GLIBCXX_USE_NANOSLEEP
#   define _GLIBCXX_USE_NANOSLEEP
#endif

#include <cstdint>
#include <chrono>
#include <functional>   // std::function, std::bind, std::ref
#include <iostream>
#include <string>
#include <thread>

#if USE_BOOST
#include <boost/function.hpp>
#include <boost/bind.hpp>
#endif

// Simple stopwatch built on std::chrono::high_resolution_clock
class timer
{
public:
    typedef std::chrono::high_resolution_clock clock;
    typedef clock::time_point time_point;
    typedef clock::duration duration;

public:
    timer()
    {
        reset();
    }

    void reset()
    {
        _starttime = clock::now();
    }

    duration elapsed() const
    {
        return clock::now() - _starttime;
    }

protected:
    time_point _starttime;
};

bool test_timer()
{
    using std::chrono::milliseconds;
    typedef timer::duration duration;

    const milliseconds sleep_time(500);

    timer t;
    std::this_thread::sleep_for(sleep_time);
    duration recorded = t.elapsed();

    // make sure the clock and this_thread::sleep_for is precise within one millisecond (or at least in agreement as to
    // how inaccurate they are)
    return (recorded - milliseconds(1) < sleep_time)
        && (recorded + milliseconds(1) > sleep_time);
}

// Store x through a volatile pointer so the computation of x cannot be discarded
template <typename T>
void volatile_write(const T& x)
{
    volatile T* p = new T;
    *p = x;
    delete p;
}

// Print the test name followed by the elapsed time in clock ticks
template <typename Function>
void run_test(const std::string& name, Function func)
{
    std::cout << name;
    timer t;
    volatile_write(func());
    timer::duration duration = t.elapsed();
    std::cout << ' ' << duration.count() << std::endl;
}

template <typename Function>
void do_test_loop(Function func, const uint64_t upper_limit = 1000000000ULL)
{
    for (uint64_t i = 0; i < upper_limit; ++i)
        func(i);
}

uint64_t test_accumulate_lambda()
{
    uint64_t x = 0;
    auto accumulator = [&x] (uint64_t i) { x += i; };
    do_test_loop(accumulator);
    return x;
}

void test_accumulate_bind_function(uint64_t& x, uint64_t i)
{
    x += i;
}

uint64_t test_accumulate_bind()
{
    namespace arg = std::placeholders;

    uint64_t x = 0;
    std::function<void (uint64_t)> accumulator = std::bind(&test_accumulate_bind_function, std::ref(x), arg::_1);
    do_test_loop(accumulator);
    return x;
}

uint64_t test_accumulate_bound_lambda()
{
    uint64_t x = 0;
    std::function<void (uint64_t)> accumulator = [&x] (uint64_t i) { x += i; };
    do_test_loop(accumulator);
    return x;
}

#if USE_BOOST
uint64_t test_accumulate_boost_bind()
{
    uint64_t x = 0;

    boost::function<void (uint64_t)> accumulator = boost::bind(&test_accumulate_bind_function, boost::ref(x), _1);
    do_test_loop(accumulator);
    return x;
}

uint64_t test_accumulate_boost_bound_lambda()
{
    uint64_t x = 0;
    boost::function<void (uint64_t)> accumulator = [&x] (uint64_t i) { x += i; };
    do_test_loop(accumulator);
    return x;
}
#endif

int main()
{
    if (!test_timer())
    {
        std::cout << "Failed timer test." << std::endl;
        return -1;
    }

    run_test("Accumulate (lambda) ", &test_accumulate_lambda);
    run_test("Accumulate (bind) ", &test_accumulate_bind);
    run_test("Accumulate (bound lambda) ", &test_accumulate_bound_lambda);

    #if USE_BOOST
    run_test("Accumulate (boost bind) ", &test_accumulate_boost_bind);
    // run the boost::function variant of the lambda test, not the std::function one
    run_test("Accumulate (boost bound lambda)", &test_accumulate_boost_bound_lambda);
    #endif
}