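Background
Mutex:
- A thread that fails to acquire the lock is blocked, and a blocked thread consumes no CPU time
- Causes mode switches: locking a mutex may enter kernel mode (typically when the lock is contended), blocking triggers a reschedule, and the thread returns to user mode when it runs again
Spin lock:
- Busy-waits: a thread that fails to acquire the lock keeps retrying and consumes CPU time
- Implemented with machine instructions; involves no mode switch and triggers no scheduling
When to use which:
- If the lock granularity is fine enough and the hold time short enough, prefer a spin lock; otherwise use a mutex
- If the critical section contains IO, use a mutex (IO inside a critical section is discouraged; if it is unavoidable, use a mutex)
- With many threads and heavy lock contention, prefer a mutex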
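The guidelines above can be illustrated with a rough sketch. It assumes a hypothetical minimal `TinySpinLock` (a stand-in, not the class developed later in this post) and two hypothetical helpers: `count_with_spin_lock` guards a single increment, and `log_with_mutex` guards a stream write that stands in for real IO.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical minimal spin lock, for illustration only.
class TinySpinLock {
public:
    void lock() { while (flag_.test_and_set(std::memory_order_acquire)) {} }
    void unlock() { flag_.clear(std::memory_order_release); }
private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Very short critical section (one increment): a spin lock is a reasonable fit.
long count_with_spin_lock(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    TinySpinLock spin;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&spin, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<TinySpinLock> guard(spin);
                ++counter;
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}

// Critical section containing IO (a stream write stands in for it here):
// a mutex lets waiting threads sleep instead of burning CPU.
std::size_t log_with_mutex(int threads, int lines_per_thread) {
    assert(threads > 0 && lines_per_thread >= 0);
    std::mutex m;
    std::ostringstream log;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&m, &log, t, lines_per_thread] {
            for (int i = 0; i < lines_per_thread; ++i) {
                std::lock_guard<std::mutex> guard(m);
                log << "thread " << t << " line " << i << '\n';
            }
        });
    }
    for (auto& th : pool) th.join();
    const std::string text = log.str();
    // Number of complete lines written by all threads.
    return static_cast<std::size_t>(std::count(text.begin(), text.end(), '\n'));
}
```

Both helpers produce the same results whichever lock is used; the difference is only in how waiting threads spend their time, so the choice is best confirmed by measuring the real workload.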
Code
This spin lock is implemented with the lock-free std::atomic_flag
#ifndef _SPINLOCK_H_
#define _SPINLOCK_H_
#include <atomic>
class SpinLock final{
public:
void lock();
void unlock();
SpinLock() = default;
~SpinLock() = default;
SpinLock(const SpinLock& rhs) = delete;
SpinLock(SpinLock&& rhs) = delete;
SpinLock& operator=(const SpinLock& rhs) = delete;
SpinLock& operator=(SpinLock&& rhs) = delete;
private:
std::atomic_flag m_lock = ATOMIC_FLAG_INIT;
};
#endif // !_SPINLOCK_H_
#include "SpinLock.h"
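void SpinLock::lock(){
// Spin until test_and_set returns false, i.e. the flag was clear and is now set by us
while(m_lock.test_and_set(std::memory_order_acquire));
}
void SpinLock::unlock(){
// Clear the flag so that a spinning thread can acquire the lock
m_lock.clear(std::memory_order_release);
}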
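- To run faster, the compiler may reorder instructions (without changing single-threaded semantics) and the CPU may execute them out of order, which causes synchronization problems between threads in multithreaded code. The memory-order arguments address this: test_and_set in lock() uses std::memory_order_acquire, and clear in unlock() uses std::memory_order_release.
In short, this combination of memory orders bounds how a thread's reads and writes may be reordered and executed: neither the reordering nor the execution may cross the boundary. The boundary is one-way: only the reads and writes between the acquire and the release are prevented from moving outside that range.
- SpinLock meets the BasicLockable requirements (it implements lock() and unlock()), so it can be used with std::lock_guard<> and std::unique_lock<> for RAII-style locking, which releases the lock automatically and is exception-safe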
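As a rough illustration of the two points above, the sketch below repeats the SpinLock in inline form so that it compiles on its own, and adds a hypothetical helper `count_with_failures`. Each thread increments a plain int under std::lock_guard and throws on even iterations. The final count is exact only because the acquire/release pair keeps the increments inside the critical section and because the guard releases the lock during stack unwinding.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

// Same shape as the SpinLock above, written inline so the sketch is self-contained.
class SpinLock final {
public:
    void lock() { while (m_lock.test_and_set(std::memory_order_acquire)) {} }
    void unlock() { m_lock.clear(std::memory_order_release); }
private:
    std::atomic_flag m_lock = ATOMIC_FLAG_INIT;
};

// Hypothetical helper: every thread bumps a plain int under the lock and
// throws on even iterations to exercise the exception path.
int count_with_failures(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    SpinLock lock;
    int counter = 0;  // plain int: protected only by the lock
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                try {
                    std::lock_guard<SpinLock> guard(lock);  // calls lock()
                    ++counter;
                    if (i % 2 == 0) throw std::runtime_error("simulated failure");
                } catch (const std::runtime_error&) {
                    // guard's destructor already called unlock() while unwinding
                }
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

If the lock were not released on the exception path, the next iteration would spin forever instead of returning a count.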
Optimization
- Added the x86 pause instruction to improve the performance of the wait loop (taken from boost)
Improves the performance of spin-wait loops. When executing a "spin-wait loop," a Pentium 4 or Intel Xeon processor suffers a severe performance penalty when exiting the loop because it detects a possible memory order violation. The PAUSE instruction provides a hint to the processor that the code sequence is a spin-wait loop. The processor uses this hint to avoid the memory order violation in most situations, which greatly improves processor performance. For this reason, it is recommended that a PAUSE instruction be placed in all spin-wait loops.
An additional function of the PAUSE instruction is to reduce the power consumed by a Pentium 4 processor while executing a spin loop. The Pentium 4 processor can execute a spinwait loop extremely quickly, causing the processor to consume a lot of power while it waits for the resource it is spinning on to become available. Inserting a pause instruction in a spinwait loop greatly reduces the processor's power consumption.
This instruction was introduced in the Pentium 4 processors, but is backward compatible with all IA-32 processors. In earlier IA-32 processors, the PAUSE instruction operates like a NOP instruction. The Pentium 4 and Intel Xeon processors implement the PAUSE instruction as a pre-defined delay. The delay is finite and can be zero for some processors. This instruction does not change the architectural state of the processor (that is, it performs essentially a delaying noop operation).
Source: http://c9x.me/x86/html/file_module_x86_id_232.html
- Added try_lock() so that SpinLock meets the Lockable requirements
#ifndef _SPINLOCK_H_
#define _SPINLOCK_H_
#include <atomic>
#if defined(_MSC_VER) && _MSC_VER >= 1310 && ( defined(_M_IX86) || defined(_M_X64) ) && !defined(__c2__)
// _mm_pause is declared in <emmintrin.h>
#include <emmintrin.h>
#define BOOST_SMT_PAUSE _mm_pause();
#elif defined(__GNUC__) && ( defined(__i386__) || defined(__x86_64__) )
#define BOOST_SMT_PAUSE __asm__ __volatile__( "rep; nop" : : : "memory" );
#else
// No pause hint on this platform: expand to nothing so that lock() still compiles
#define BOOST_SMT_PAUSE
#endif
class SpinLock final{
public:
void lock();
bool try_lock();
void unlock();
SpinLock() = default;
~SpinLock() = default;
SpinLock(const SpinLock& rhs) = delete;
SpinLock(SpinLock&& rhs) = delete;
SpinLock& operator=(const SpinLock& rhs) = delete;
SpinLock& operator=(SpinLock&& rhs) = delete;
private:
std::atomic_flag m_lock = ATOMIC_FLAG_INIT;
};
#endif // !_SPINLOCK_H_
#include "SpinLock.h"
void SpinLock::lock(){
while(m_lock.test_and_set(std::memory_order_acquire)){
BOOST_SMT_PAUSE
}
}
bool SpinLock::try_lock(){
return !m_lock.test_and_set(std::memory_order_acquire);
}
void SpinLock::unlock(){
m_lock.clear(std::memory_order_release);
}
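With try_lock() in place, the lock can also be used with the standard facilities that need the Lockable requirements. The sketch below shows two of them: std::unique_lock with std::try_to_lock, and std::lock for taking two locks without deadlock. The SpinLock is repeated inline, without the pause hint, so the example stays portable; `try_do_work`, `Account` and `transfer` are hypothetical names used only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Inline copy of the SpinLock above, minus the pause hint.
class SpinLock final {
public:
    void lock() { while (m_lock.test_and_set(std::memory_order_acquire)) {} }
    bool try_lock() { return !m_lock.test_and_set(std::memory_order_acquire); }
    void unlock() { m_lock.clear(std::memory_order_release); }
private:
    std::atomic_flag m_lock = ATOMIC_FLAG_INIT;
};

// Returns true if the lock was free and the work ran, false if it was busy.
bool try_do_work(SpinLock& lock, int& value) {
    std::unique_lock<SpinLock> guard(lock, std::try_to_lock);  // calls try_lock()
    if (!guard.owns_lock()) return false;
    ++value;
    return true;
}

// Hypothetical pair of independently locked objects.
struct Account {
    SpinLock lock;
    int balance = 0;
};

void transfer(Account& from, Account& to, int amount) {
    assert(&from != &to);  // locking the same SpinLock twice would never return
    // std::lock uses lock(), try_lock() and unlock() to avoid deadlock.
    std::lock(from.lock, to.lock);
    std::lock_guard<SpinLock> hold_from(from.lock, std::adopt_lock);
    std::lock_guard<SpinLock> hold_to(to.lock, std::adopt_lock);
    from.balance -= amount;
    to.balance += amount;
}
```

A caller that gets false from `try_do_work` can do other useful work and retry later instead of spinning.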
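Further reading on memory ordering
A chat about atomic variables, locks, and memory barriers (聊聊原子变量、锁、内存屏障那点事)
Concurrency study: the CPU cache coherence protocol, MESI (并发研究之CPU缓存一致性协议(MESI))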