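Core Techniques of Java Concurrent Programming: synchronized, volatile, and CAS
Fundamentals of Synchronization Primitives
Object Memory Layout and the Monitor Mechanism
In the HotSpot JVM, every object instance consists of three parts: the object header, the instance data, and alignment padding. (This layout is a HotSpot implementation detail; the Java Virtual Machine Specification does not prescribe it.) The Mark Word in the object header is the key storage area for the synchronization mechanism, and its layout changes with the object's state:
Mark Word layout on a 32-bit JVM: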
|-------------------------------------------------------|--------------------|
| Mark Word (32 bits) | State |
|-------------------------------------------------------|--------------------|
| identity_hashcode:25 | age:4 | biased_lock:0 | lock:01| Normal |
|-------------------------------------------------------|--------------------|
| thread:23 | epoch:2 | age:4 | biased_lock:1 | lock:01| Biased |
|-------------------------------------------------------|--------------------|
| ptr_to_lock_record:30 | lock:00| Lightweight Locked |
|-------------------------------------------------------|--------------------|
| ptr_to_heavyweight_monitor:30 | lock:10| Heavyweight Locked |
|-------------------------------------------------------|--------------------|
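The monitor lock is built on this structure:
- When a thread enters a synchronized block, the JVM first tries a CAS on the Mark Word: a biased lock stores the current thread's ID, while a lightweight lock stores a pointer to a Lock Record in the current thread's stack
- If the lightweight-lock CAS fails because of contention, the lock inflates to a heavyweight lock, and blocking a thread then involves operating-system mutex operations
- Contending threads are managed with two queues: cxq (the contention queue) and EntryList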
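At the source level, all of this machinery sits behind a plain synchronized block. The following is a minimal sketch (the class name `MonitorCounter` and its methods are made up for illustration) in which several threads increment a shared counter under one monitor; which lock state the JVM actually uses at run time is not visible from Java code.

```java
class MonitorCounter {
    private final Object lock = new Object();
    private int count = 0;

    // Entering the block acquires the monitor of `lock`; leaving it releases the monitor.
    void increment() {
        synchronized (lock) {
            count++;
        }
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }

    // Starts `threads` workers that each call increment() `perThread` times,
    // waits for all of them, and returns the final count.
    static int run(int threads, int perThread) {
        MonitorCounter counter = new MonitorCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 100_000));
    }
}
```

Because every increment runs inside the monitor, no update is lost, so the result equals threads × perThread.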
The Nature of Memory Visibility
The multi-level cache hierarchy of a modern CPU:
+---------------+    +---------------+    +---------------+
|  CPU Core 1   |    |  CPU Core 2   |    |  CPU Core N   |
|   L1 Cache    |    |   L1 Cache    |    |   L1 Cache    |
|    (32KB)     |    |    (32KB)     |    |    (32KB)     |
+-------+-------+    +-------+-------+    +-------+-------+
        |                    |                    |
        +--------------------+--------------------+
                             |
                   +-------------------+
                   |     L2 Cache      |
                   |      (256KB)      |
                   +-------------------+
                             |
                   +-------------------+
                   |     L3 Cache      |
                   |     (8-32MB)      |
                   +-------------------+
                             |
                   +-------------------+
                   |    Main Memory    |
                   |      (DRAM)       |
                   +-------------------+
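The four states of the MESI cache-coherence protocol:
- Modified (M): the cache line has been modified and differs from main memory
- Exclusive (E): the cache line is held by this core only and matches main memory
- Shared (S): the cache line may be held by several cores and matches main memory
- Invalid (I): the cache line is invalid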
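The way a line moves between these four states can be sketched as a small state machine. This is a simplified textbook model, not a description of any particular CPU: the names `MesiSketch`, `State`, `Event` and `next` are made up for illustration, and the `othersHoldLine` flag stands in for the snoop response a real cache would receive.

```java
class MesiSketch {
    enum State { MODIFIED, EXCLUSIVE, SHARED, INVALID }

    enum Event { LOCAL_READ, LOCAL_WRITE, REMOTE_READ, REMOTE_WRITE }

    // Returns the next state of one cache line in one core's cache.
    // `othersHoldLine` only matters for a local read that misses (INVALID state).
    static State next(State current, Event event, boolean othersHoldLine) {
        switch (event) {
            case LOCAL_READ:
                if (current == State.INVALID) {
                    return othersHoldLine ? State.SHARED : State.EXCLUSIVE;
                }
                return current; // a read hit keeps the current state
            case LOCAL_WRITE:
                return State.MODIFIED; // copies in other caches are invalidated first
            case REMOTE_READ:
                // a MODIFIED line is written back, then every valid copy becomes SHARED
                return current == State.INVALID ? State.INVALID : State.SHARED;
            case REMOTE_WRITE:
                return State.INVALID; // another core takes ownership of the line
            default:
                throw new IllegalArgumentException("unknown event: " + event);
        }
    }
}
```

For example, a line that one core has written (MODIFIED) drops to SHARED when another core reads it, and to INVALID when another core writes it.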
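Sample code demonstrating the visibility problem:
public class VisibilityDemo {
    // Without volatile the reader thread may never see ready == true and may spin forever;
    // uncomment volatile to compare the behavior
    private static /*volatile*/ boolean ready = false;
    private static int number = 0;
    public static void main(String[] args) {
        new Thread(() -> {
            while (!ready) {
                // Empty loop; add Thread.yield() here to see how the behavior changes
            }
            System.out.println("Number: " + number);
        }).start();
        number = 42;
        ready = true;
    }
}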
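For comparison, here is a sketch of the corrected version with `ready` declared volatile. The class name `VisibilityFixed` and the method `awaitNumber` are made up for illustration; the method returns what the reader thread saw so that the result can be checked.

```java
class VisibilityFixed {
    private static volatile boolean ready = false;
    private static int number = 0;

    // Starts a reader thread, publishes `number` through the volatile flag,
    // and returns the value the reader observed.
    static int awaitNumber() {
        ready = false;
        number = 0;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                // busy-wait; the volatile read forces a fresh look at `ready` each time
            }
            seen[0] = number; // ordered after the volatile read that saw ready == true
        });
        reader.start();
        number = 42;  // plain write
        ready = true; // volatile write: publishes the write to `number` as well
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("Number: " + awaitNumber());
    }
}
```

The write to `number` happens-before the volatile write to `ready`, and that write happens-before any read that sees `ready == true`, so the reader is guaranteed to observe 42.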
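synchronized in Depth
The Lock Escalation Process in Detail
- Biased locking
  - When a thread locks the object for the first time, a CAS sets the thread ID in the Mark Word to the current thread
  - Bulk rebias: once biased-lock revocations for a class of objects exceed a threshold (20 by default), the JVM re-biases the objects of that class
  - Biased locking was deprecated and disabled by default in JDK 15 (JEP 374)
- Lightweight locking
  - A Lock Record is created in the thread's stack frame and stores a copy of the object's Mark Word
  - A CAS makes the object header point to the Lock Record
  - Spinning optimization: adaptive spinning adjusts the spin count according to the historical success rate
- Heavyweight locking
  - The monitor structure associated with the object: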
ObjectMonitor() {
    _header = NULL;
    _count = 0; // reference count
    _waiters = 0, // number of waiting threads
    _recursions = 0; // re-entry (recursion) count
    _object = NULL;
    _owner = NULL; // owning thread
    _WaitSet = NULL; // threads that called wait()
    _WaitSetLock = 0 ;
    _Responsible = NULL ;
    _succ = NULL ;
    _cxq = NULL ; // contention queue
    FreeNext = NULL ;
    _EntryList = NULL ; // threads blocked waiting for the lock
    _SpinFreq = 0 ;
    _SpinClock = 0 ;
    OwnerIsThread = 0 ;
}
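From Java code, the wait set is reached through wait() and notifyAll(). The sketch below is a one-slot mailbox (the class `Mailbox` and its methods are hypothetical): a thread that calls wait() releases the monitor and joins the wait set, and after a notification it has to compete for the monitor again before it continues.

```java
import java.util.ArrayList;
import java.util.List;

class Mailbox {
    private String message; // guarded by the monitor of `this`

    // Blocks while the slot is full. wait() releases the monitor while the caller waits.
    synchronized void put(String m) throws InterruptedException {
        while (message != null) {
            wait();
        }
        message = m;
        notifyAll(); // woken threads must re-acquire the monitor before they continue
    }

    // Blocks while the slot is empty.
    synchronized String take() throws InterruptedException {
        while (message == null) {
            wait();
        }
        String m = message;
        message = null;
        notifyAll();
        return m;
    }

    // Sends `n` messages from a producer thread and returns them in the order received.
    static List<String> transfer(int n) {
        Mailbox box = new Mailbox();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < n; i++) {
                    box.put("msg-" + i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        List<String> received = new ArrayList<>();
        try {
            for (int i = 0; i < n; i++) {
                received.add(box.take());
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return received;
    }
}
```

The condition is checked in a `while` loop rather than an `if` because a thread can wake up without the condition being true, either spuriously or because another thread got to the slot first.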
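Lock Optimization Techniques
- Lock elimination (lock elision): the JIT compiler uses escape analysis to prove that a lock object never escapes the current thread, and then removes the synchronization on it
public String concat(String s1, String s2) {
    StringBuffer sb = new StringBuffer();
    sb.append(s1);
    sb.append(s2);
    return sb.toString(); // sb never escapes this method, so the locks taken inside append() can be eliminated
}
- Lock coarsening: consecutive synchronized blocks on the same lock are merged into a single block:
// Before optimization
for (int i = 0; i < 100; i++) {
    synchronized(lock) {
        // work
    }
}
// After optimization
synchronized(lock) {
    for (int i = 0; i < 100; i++) {
        // work
    }
}
- Adaptive spinning
  - Success-rate prediction: the spin time is adjusted according to how often recent spins on the same lock succeeded
  - Spin limit: since JDK 6 the JVM chooses the spin count adaptively; the fixed default of 10 iterations (-XX:PreBlockSpin) belongs to the older, non-adaptive spinning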
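The idea behind spinning before blocking can be sketched in user code. This is not how HotSpot implements it: the class `SpinThenBackoffLock` and the fixed `MAX_SPINS` limit are assumptions made for illustration, and a fixed limit is exactly what adaptive spinning replaces with a value learned from history. `Thread.onSpinWait()` requires Java 9 or later.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

class SpinThenBackoffLock {
    private static final int MAX_SPINS = 10; // hypothetical fixed limit, not the JVM's adaptive value
    private final AtomicBoolean locked = new AtomicBoolean(false);
    private int shared; // guarded by this lock

    void lock() {
        int spins = 0;
        while (true) {
            // Test-and-test-and-set: only attempt the CAS when the lock looks free.
            if (!locked.get() && locked.compareAndSet(false, true)) {
                return;
            }
            if (spins < MAX_SPINS) {
                spins++;
                Thread.onSpinWait(); // spin hint to the CPU
            } else {
                LockSupport.parkNanos(1_000L); // stop burning CPU and back off briefly
            }
        }
    }

    void unlock() {
        locked.set(false);
    }

    // Runs `threads` workers that each increment the shared counter `perThread` times.
    static int run(int threads, int perThread) {
        SpinThenBackoffLock l = new SpinThenBackoffLock();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    l.lock();
                    try {
                        l.shared++;
                    } finally {
                        l.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return l.shared;
    }
}
```

Spinning pays off when the lock is held only briefly, because the waiter avoids the cost of being descheduled; once the spin budget is used up, backing off keeps a long wait from wasting a core.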
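Implementing volatile Memory Semantics
Memory Barrier Implementation Details
How the barriers map to instructions on x86:
- StoreStore barrier: no instruction needed (x86 preserves store order)
- LoadLoad barrier: no instruction needed
- LoadStore barrier: no instruction needed
- StoreLoad barrier: implemented with the instruction lock; addl $0,0(%%rsp) (shown as written in HotSpot's inline assembly, where %% is the escaped form of %)
Barriers the JVM inserts around volatile accesses:
// Before a volatile write
StoreStore barrier
// volatile write
// After a volatile write
StoreLoad barrier
// volatile read
// After a volatile read
LoadLoad barrier
LoadStore barrier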
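Since Java 9 the barrier names have rough counterparts in the static fence methods of `java.lang.invoke.VarHandle`. The sketch below places them by hand around plain fields, purely to show where each barrier sits; the class `FenceSketch` is made up for illustration. In real code, declaring the flag volatile is the simpler choice, and a loop that polls a plain field is not guaranteed to ever see the update.

```java
import java.lang.invoke.VarHandle;

class FenceSketch {
    private int data;     // plain field
    private boolean flag; // plain field, ordered only by the explicit fences below

    void writer() {
        data = 42;
        VarHandle.storeStoreFence(); // StoreStore: the store to data is ordered before the store to flag
        flag = true;
        VarHandle.fullFence();       // covers StoreLoad: the flag store is ordered before later loads
    }

    // Returns data if the flag was seen, otherwise -1.
    int reader() {
        boolean seen = flag;
        VarHandle.acquireFence();    // LoadLoad + LoadStore: the flag load is ordered before later accesses
        return seen ? data : -1;
    }
}
```

A quick single-threaded check: `reader()` returns -1 on a fresh instance and 42 after `writer()` has run.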
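Visibility Guarantee Example
How the barriers order a plain write around a volatile flag:
class VolatileExample {
    int x = 0;
    volatile boolean v = false;
    public void writer() {
        x = 42;
        // A StoreStore barrier goes here, before the volatile write
        v = true; // a StoreLoad barrier follows the volatile write
    }
    public void reader() {
        if (v) { // LoadLoad and LoadStore barriers follow the volatile read
            System.out.println(x); // prints 42 whenever v was read as true
        }
    }
}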
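CAS: Principles and Implementation
Hardware-Level Mechanism
Execution sequence of the x86 LOCK CMPXCHG instruction:
- Lock the bus or, for cacheable memory on modern CPUs, lock the cache line through the cache-coherence protocol
- Compare the value at the target address with the expected value
- If they are equal, store the new value
- Release the lock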
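In Java this instruction is reached through `compareAndSet` on the atomic classes, and it is normally wrapped in a retry loop. A minimal sketch, assuming a made-up helper `CasMax.updateMax` that raises a shared maximum:

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasMax {
    // Atomically raises `target` to at least `candidate` and returns the value seen after the call.
    static int updateMax(AtomicInteger target, int candidate) {
        while (true) {
            int current = target.get();
            if (candidate <= current) {
                return current; // nothing to do, the stored value is already larger
            }
            if (target.compareAndSet(current, candidate)) {
                return candidate; // the CAS succeeded: nobody changed the value in between
            }
            // The CAS failed: another thread changed the value; re-read and retry.
        }
    }

    public static void main(String[] args) {
        AtomicInteger max = new AtomicInteger(5);
        System.out.println(updateMax(max, 3)); // 5
        System.out.println(updateMax(max, 9)); // 9
    }
}
```

The loop never blocks: a failed CAS only means that some other thread made progress, which is the property that makes this style lock-free.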
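Cache-line alignment optimization:
// False sharing example
class FalseSharing {
    volatile long x; // a cache line is typically 64 bytes
    volatile long y; // likely to share a cache line with x
}
// Padding sketch: the hot field sits between two runs of padding fields.
// AtomicLong keeps its value in a private field of its own, so padding declared in a
// subclass cannot surround it; a standalone class is used here instead.
// The JVM is free to reorder fields, so manual padding is best-effort only.
class PaddedLong {
    public volatile long p1, p2, p3, p4, p5, p6; // leading padding
    public volatile long value;
    public volatile long q1, q2, q3, q4, q5, q6; // trailing padding
}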
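Solving the ABA Problem
An atomic reference that carries a version stamp:
AtomicStampedReference<String> ref = new AtomicStampedReference<>("A", 0);
// Thread 1 reads the value and the stamp, then is preempted before its CAS
int[] stampHolder = new int[1];
String current = ref.get(stampHolder); // current is "A", stampHolder[0] is 0
// Thread 2 changes A->B->A in the meantime
ref.compareAndSet("A", "B", 0, 1);
ref.compareAndSet("B", "A", 1, 2);
// Thread 1 resumes: the reference is "A" again, but its CAS fails because the stamp has changed
ref.compareAndSet(current, "C", stampHolder[0], stampHolder[0] + 1); // expected stamp 0, actual stamp 2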
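The difference from a plain `AtomicReference` can be seen by replaying the same interleaving in a single thread. In this sketch the class `AbaDemo` and its two methods are made up for illustration; the calls of the "other thread" are simply written inline between the read and the final CAS.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

class AbaDemo {
    // Plain reference: the stale CAS succeeds even though the value went A -> B -> A.
    static boolean plainCasAfterAba() {
        String a = "A";
        String b = "B";
        AtomicReference<String> ref = new AtomicReference<>(a);
        String observed = ref.get();         // first thread reads A
        ref.compareAndSet(a, b);             // second thread: A -> B
        ref.compareAndSet(b, a);             // second thread: B -> A
        return ref.compareAndSet(observed, "C");
    }

    // Stamped reference: the same stale CAS fails because the stamp moved from 0 to 2.
    static boolean stampedCasAfterAba() {
        String a = "A";
        String b = "B";
        AtomicStampedReference<String> ref = new AtomicStampedReference<>(a, 0);
        int[] stamp = new int[1];
        String observed = ref.get(stamp);    // first thread reads A with stamp 0
        ref.compareAndSet(a, b, 0, 1);       // second thread: A -> B
        ref.compareAndSet(b, a, 1, 2);       // second thread: B -> A
        return ref.compareAndSet(observed, "C", stamp[0], stamp[0] + 1);
    }

    public static void main(String[] args) {
        System.out.println(plainCasAfterAba());   // true
        System.out.println(stampedCasAfterAba()); // false
    }
}
```

Both classes compare references with `==`, not `equals`, which is why the sketch reuses the same `String` objects instead of creating new ones.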
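Comparing Concurrency-Control Strategies
Performance Comparison Data
JMH benchmark results (nanoseconds per operation):
| Implementation | 1 thread | 4 contending threads | 16 contending threads |
|----------------|----------|----------------------|-----------------------|
| synchronized   | 15       | 1200                 | 8500                  |
| ReentrantLock  | 20       | 950                  | 6200                  |
| AtomicInteger  | 8        | 45                   | 300                   |
| LongAdder      | 12       | 25                   | 50                    |
(Test environment: Intel i9-9900K, 32GB DDR4, JDK 11)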
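The LongAdder row scales better because LongAdder spreads updates over several internal cells and only adds them up when `sum()` is called, whereas every AtomicInteger update contends on a single variable. The following is a usage sketch, not a benchmark; the class `LongAdderCount` and the method `count` are made up for illustration.

```java
import java.util.concurrent.atomic.LongAdder;

class LongAdderCount {
    // Runs `threads` workers that each call increment() `perThread` times and returns the total.
    static long count(int threads, int perThread) {
        LongAdder adder = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    adder.increment(); // contended updates land in different cells
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return adder.sum(); // exact here because all writers have finished
    }
}
```

The trade-off is that `sum()` is not an atomic snapshot while updates are still in flight, so LongAdder suits counters and statistics rather than values that drive a compare-and-set decision.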
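Decision Tree for Choosing a Mechanism
graph TD
A[Need atomic operations?] -->|Yes| B{Operation complexity}
B -->|Simple| C[Consider CAS]
B -->|Complex| D[Use a lock]
A -->|No| E{Need a visibility guarantee?}
E -->|Yes| F[Declare the variable volatile]
E -->|No| G[Plain variable]
C --> H{How heavy is the contention?}
H -->|Low| I[Stay with CAS]
H -->|High| J[Switch to LongAdder]
D --> K{Need flexible control?}
K -->|Yes| L[Use ReentrantLock]
K -->|No| M[synchronized]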
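"Flexible control" on the lock branch usually means features that synchronized does not offer, such as a timed or interruptible acquisition. A minimal sketch using `ReentrantLock.tryLock` with a timeout follows; the class `TimedLockSketch` and its methods are made up for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TimedLockSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private int value; // guarded by lock

    // Returns true if the update ran, false if the lock could not be acquired in time.
    boolean tryUpdate(int newValue, long timeoutMillis) {
        try {
            if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false; // give up instead of blocking indefinitely
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        try {
            value = newValue;
            return true;
        } finally {
            lock.unlock(); // always release in finally
        }
    }

    int get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }
}
```

A caller that gets `false` back can retry, log, or fall back to another path, which a thread blocked on a synchronized block cannot do.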
Advanced Optimization Techniques
Lock Splitting
class ImprovedServerStatus {
    private final Object usersLock = new Object();
    private final Object queriesLock = new Object();
    private int activeUsers;
    private int runningQueries;
    public void updateUsers() {
        synchronized(usersLock) {
            activeUsers++;
        }
    }
    public void updateQueries() {
        synchronized(queriesLock) {
            runningQueries++;
        }
    }
}
Lock-Free Data Structure Example
A Michael-Scott non-blocking queue implementation (written with AtomicReference rather than Unsafe so that it compiles as shown):
import java.util.concurrent.atomic.AtomicReference;

public class NonBlockingQueue<T> {
    private static class Node<T> {
        final T item;
        final AtomicReference<Node<T>> next = new AtomicReference<>();
        Node(T item) {
            this.item = item;
        }
    }
    // head always points at a dummy node; the first real element is head.next
    private final AtomicReference<Node<T>> head;
    private final AtomicReference<Node<T>> tail;
    public NonBlockingQueue() {
        Node<T> dummy = new Node<>(null);
        head = new AtomicReference<>(dummy);
        tail = new AtomicReference<>(dummy);
    }
    public void enq(T item) {
        Node<T> node = new Node<>(item);
        while (true) {
            Node<T> last = tail.get();
            Node<T> next = last.next.get();
            if (last == tail.get()) { // is the snapshot still consistent?
                if (next == null) {
                    if (last.next.compareAndSet(null, node)) { // link the new node
                        tail.compareAndSet(last, node); // swing tail; a failure means another thread helped
                        return;
                    }
                } else {
                    tail.compareAndSet(last, next); // tail is lagging: help advance it
                }
            }
        }
    }
    public T deq() {
        while (true) {
            Node<T> first = head.get();
            Node<T> last = tail.get();
            Node<T> next = first.next.get();
            if (first == head.get()) {
                if (first == last) {
                    if (next == null) return null; // the queue is empty
                    tail.compareAndSet(last, next); // tail is lagging: help advance it
                } else {
                    T item = next.item;
                    if (head.compareAndSet(first, next)) return item;
                }
            }
        }
    }
}
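In production code the JDK's `java.util.concurrent.ConcurrentLinkedQueue` is the usual choice; its documentation describes it as based on the Michael and Scott algorithm. The sketch below (the class `QueueUsage` and the method `produceAndDrain` are made up for illustration) has several producers enqueue concurrently and then drains the queue.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

class QueueUsage {
    // Lets `producers` threads each offer `perProducer` items, then counts what can be polled.
    static int produceAndDrain(int producers, int perProducer) {
        ConcurrentLinkedQueue<Integer> queue = new ConcurrentLinkedQueue<>();
        Thread[] workers = new Thread[producers];
        for (int i = 0; i < producers; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perProducer; j++) {
                    queue.offer(id * perProducer + j); // never blocks
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        int drained = 0;
        while (queue.poll() != null) { // poll() returns null once the queue is empty
            drained++;
        }
        return drained;
    }
}
```

As in the hand-written queue, `poll()` signals an empty queue by returning null, which is why null elements are not allowed.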
Troubleshooting Common Problems
Deadlock Detection
Detecting a deadlock with jstack:
jstack <pid> | grep -A 10 deadlock
# Typical output
Found one Java-level deadlock:
=============================
"Thread-1":
waiting to lock monitor 0x00007f88d8004818 (object 0x000000076eaf8a40, a java.lang.Object),
which is held by "Thread-0"
"Thread-0":
waiting to lock monitor 0x00007f88d8006208 (object 0x000000076eaf8a50, a java.lang.Object),
which is held by "Thread-1"
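The same check can be made from inside the JVM with `ThreadMXBean.findDeadlockedThreads()`. The sketch below deliberately creates a lock-order deadlock between two daemon threads and then polls for it; the class `DeadlockDetector`, its method names, and the five-second polling window are assumptions made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockDetector {
    // Creates two threads that lock two objects in opposite order and
    // returns how many deadlocked threads the JVM reports (0 if none within 5 seconds).
    static int createAndDetect() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        startDaemon(a, b, bothHoldFirst);
        startDaemon(b, a, bothHoldFirst);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (System.nanoTime() < deadline) {
            long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is found
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    private static void startDaemon(Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await(); // wait until both threads hold their first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // never reached: the other thread holds `second` and waits for `first`
                }
            }
        });
        t.setDaemon(true); // daemon threads do not keep the JVM alive after the check
        t.start();
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + createAndDetect());
    }
}
```

Acquiring the two locks in the same order in both threads removes the cycle, which is the usual fix for this kind of deadlock.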
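Diagnosing Performance Problems
Monitoring lock contention with JFR (Java Flight Recorder) through the jdk.jfr API:
import java.time.Duration;
import jdk.jfr.Recording;

Recording recording = new Recording();
// jdk.JavaMonitorEnter records threads blocked while entering a monitor;
// jdk.JavaMonitorWait records time spent in Object.wait()
recording.enable("jdk.JavaMonitorEnter").withThreshold(Duration.ofMillis(10));
recording.enable("jdk.JavaMonitorWait").withThreshold(Duration.ofMillis(10));
recording.start();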