Core Java Concurrency Techniques: synchronized, volatile, and CAS

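Fundamentals of Synchronization Primitives

Object Memory Layout and the Monitor Mechanism

In the HotSpot JVM, every object instance consists of three parts: an object header, instance data, and alignment padding. (This layout is a HotSpot implementation detail; the Java Virtual Machine Specification does not mandate it.) The Mark Word in the object header is the key storage area for the synchronization mechanism, and its layout changes with the object's state:

Mark Word layout on a 32-bit JVM: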

|-------------------------------------------------------|--------------------|
|                  Mark Word (32 bits)                  |       State        |
|-------------------------------------------------------|--------------------|
| identity_hashcode:25 | age:4 | biased_lock:0 | lock:01|       Normal       |
|-------------------------------------------------------|--------------------|
|  thread:23 | epoch:2 | age:4 | biased_lock:1 | lock:01|       Biased       |
|-------------------------------------------------------|--------------------|
|               ptr_to_lock_record:30             | lock:00| Lightweight Locked |
|-------------------------------------------------------|--------------------|
|               ptr_to_heavyweight_monitor:30     | lock:10| Heavyweight Locked |
|-------------------------------------------------------|--------------------|

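The monitor lock implementation relies on this structure:

  1. When a thread enters a synchronized block, the JVM first tries a CAS on the Mark Word so that it points to a Lock Record in the current thread's stack (with biased locking, it records the current thread's ID instead)
  2. If the CAS keeps failing because of contention, the lock is inflated to a heavyweight lock, which involves operating-system mutex operations
  3. Contending threads are managed with a two-queue structure: cxq (the contention queue) and EntryList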
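To make the table above concrete, the following sketch decodes the state bits of a 32-bit Mark Word value. It assumes exactly the bit layout shown in the table (lock in the two lowest bits, biased_lock in the third, age in the next four). The class MarkWordDecoder and its methods are hypothetical names for illustration; a real JVM does not expose the Mark Word through such an API.

```java
// Hypothetical helper: interprets an int as a 32-bit Mark Word laid out as in the table.
final class MarkWordDecoder {
    private MarkWordDecoder() {}

    /** Returns the state name encoded in the low bits of the given Mark Word. */
    static String state(int markWord) {
        int lockBits = markWord & 0b11;           // two lowest bits: lock
        int biasedBit = (markWord >>> 2) & 0b1;   // third bit: biased_lock
        switch (lockBits) {
            case 0b01: return biasedBit == 1 ? "Biased" : "Normal";
            case 0b00: return "Lightweight Locked";
            case 0b10: return "Heavyweight Locked";
            default:   return "Marked for GC";    // 0b11, a state the table does not list
        }
    }

    /** Extracts the 4-bit GC age (meaningful in the Normal and Biased states). */
    static int age(int markWord) {
        return (markWord >>> 3) & 0b1111;
    }
}
```

For example, a value whose three lowest bits are 101 decodes as Biased, while any value ending in 00 is a pointer to a Lock Record.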

The Nature of Memory Visibility

The multi-level cache hierarchy of a modern CPU:

+---------------+    +---------------+    +---------------+
|   CPU Core 1  |    |   CPU Core 2  |    |   CPU Core N  |
|  L1 Cache     |    |  L1 Cache     |    |  L1 Cache     |
|  (32KB)       |    |  (32KB)       |    |  (32KB)       |
+-------+-------+    +-------+-------+    +-------+-------+
        |                    |                    |
        +--------------------+--------------------+
                             |
                   +-------------------+
                   |    L2 Cache       |
                   |    (256KB)        |
                   +-------------------+
                             |
                   +-------------------+
                   |    L3 Cache       |
                   |    (8-32MB)       |
                   +-------------------+
                             |
                   +-------------------+
                   |   Main Memory     |
                   |   (DRAM)          |
                   +-------------------+

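States of the MESI cache coherence protocol:

  • Modified (M): the cache line has been modified and differs from main memory
  • Exclusive (E): the cache line is held by this core only and matches main memory
  • Shared (S): the cache line may be held by several cores
  • Invalid (I): the cache line is invalid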
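The four states above can be read as a small state machine. The sketch below models one cache line in one core's cache and is deliberately simplified: it ignores bus messages and write-back timing, and the enum and method names are illustrative assumptions, not taken from any hardware manual.

```java
// Simplified MESI model for a single cache line as seen by one core.
enum MesiState {
    MODIFIED, EXCLUSIVE, SHARED, INVALID;

    /** This core reads the line; othersHoldCopy only matters when loading from INVALID. */
    MesiState onLocalRead(boolean othersHoldCopy) {
        if (this != INVALID) {
            return this;                              // cache hit: state unchanged
        }
        return othersHoldCopy ? SHARED : EXCLUSIVE;   // cache miss: line is loaded
    }

    /** This core writes the line; copies in other caches are invalidated first. */
    MesiState onLocalWrite() {
        return MODIFIED;
    }

    /** Another core reads the line; a MODIFIED line is written back before being shared. */
    MesiState onRemoteRead() {
        return this == INVALID ? INVALID : SHARED;
    }

    /** Another core writes the line, so this copy becomes stale. */
    MesiState onRemoteWrite() {
        return INVALID;
    }
}
```

The onRemoteWrite transition is the one that matters for visibility: a write on one core invalidates the copy held by every other core, forcing them to reload the line.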

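Example code demonstrating the visibility problem:

public class VisibilityDemo {
    // Toggle volatile (commented out here) to compare the behavior;
    // without it, the reader thread may never see ready == true
    private static /*volatile*/ boolean ready = false;
    private static int number = 0;

    public static void main(String[] args) {
        new Thread(() -> {
            while (!ready) {
                // Busy-wait; try adding Thread.yield() here to see how the outcome changes
            }
            System.out.println("Number: " + number);
        }).start();

        number = 42;
        ready = true;
    }
}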

synchronized in Depth

The Lock Escalation Process in Detail

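  1. Biased locking
  • The first time a thread locks the object, a CAS writes the current thread's ID into the Mark Word
  • Bulk rebias: once biased-lock revocations for objects of one class exceed a threshold (20 by default), the JVM rebiases the objects of that class
  • Biased locking has been deprecated and disabled by default since JDK 15
  2. Lightweight locking
  • A Lock Record is created in the thread's stack frame and stores a copy of the object's Mark Word
  • A CAS makes the object header point to that Lock Record
  • Spin optimization: adaptive spinning adjusts the number of spins according to how often spinning has succeeded in the past
  3. Heavyweight locking
  • The monitor structure associated with the object (the constructor of HotSpot's ObjectMonitor):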
ObjectMonitor() {
    _header       = NULL;
    _count        = 0;  // reference count, roughly |_WaitSet| + |_EntryList|
    _waiters      = 0,
    _recursions   = 0;  // reentrancy count
    _object       = NULL;
    _owner        = NULL;  // thread that owns the monitor
    _WaitSet      = NULL;  // threads that called wait()
    _WaitSetLock  = 0 ;
    _Responsible  = NULL ;
    _succ         = NULL ;
    _cxq          = NULL ;  // contention queue
    FreeNext      = NULL ;
    _EntryList    = NULL ;  // threads blocked waiting for the lock
    _SpinFreq     = 0 ;
    _SpinClock    = 0 ;
    OwnerIsThread = 0 ;
}
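At the Java level, the fields above show up as familiar behavior. The following sketch is a one-slot mailbox: a thread calling wait() parks in what the structure above calls _WaitSet, and a synchronized method calling another synchronized method on the same object re-enters the monitor, which is roughly what _recursions counts. The class Mailbox and its methods are hypothetical names chosen for illustration.

```java
// Hypothetical one-slot mailbox built on an intrinsic lock with wait/notifyAll.
final class Mailbox {
    private String message;   // guarded by this object's monitor

    /** Stores a message and wakes up every thread parked in wait(). */
    synchronized void put(String m) {
        message = m;
        notifyAll();          // waiters leave the wait set and compete for the lock again
    }

    /** Blocks until a message is available, then removes and returns it. */
    synchronized String take() {
        boolean interrupted = false;
        while (message == null) {         // loop guards against spurious wakeups
            try {
                wait();                   // releases the monitor while waiting
            } catch (InterruptedException e) {
                interrupted = true;       // remember the interrupt and keep waiting
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
        return removeMessage();           // reentrant: the monitor is already held
    }

    private synchronized String removeMessage() {
        String m = message;
        message = null;
        return m;
    }
}
```

A consumer can call take() before or after the producer calls put(); in both orders it returns the message exactly once.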

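Lock Optimization Techniques

  1. Lock elimination (lock elision): the JIT compiler uses escape analysis to prove that a lock object never escapes the current thread, in which case its synchronization can be removed
public String concat(String s1, String s2) {
    StringBuffer sb = new StringBuffer();
    sb.append(s1);
    sb.append(s2);
    return sb.toString();  // sb never escapes this method, so the locks taken by its synchronized methods can be elided
}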
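  2. Lock coarsening: several consecutive synchronized blocks on the same lock are merged into a single block:
// Before optimization
for (int i = 0; i < 100; i++) {
    synchronized(lock) {
        // work
    }
}

// After optimization
synchronized(lock) {
    for (int i = 0; i < 100; i++) {
        // work
    }
}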
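  3. Adaptive spinning
  • Success-rate prediction: the spin time is adjusted according to how often recent spins on the same lock succeeded in acquiring it
  • Spin limit: the old fixed default was 10 spins; since JDK 6 adaptive spinning is on by default and the JVM chooses the limit itself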
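The idea of a spin budget can be sketched in plain Java. The lock below spins on a CAS a fixed number of times and then yields, which mirrors the fixed-limit strategy that adaptive spinning replaced. It is an illustration only: HotSpot's spinning lives inside the JVM and is not implemented this way, and SpinLock, SPIN_LIMIT, and SpinLockDemo are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative spin-then-yield lock with a fixed spin budget.
final class SpinLock {
    private static final int SPIN_LIMIT = 10;   // assumed budget before backing off
    private final AtomicBoolean locked = new AtomicBoolean(false);

    void lock() {
        int spins = 0;
        while (!locked.compareAndSet(false, true)) {
            if (++spins >= SPIN_LIMIT) {
                Thread.yield();   // budget used up: let other threads run
                spins = 0;
            }
        }
    }

    void unlock() {
        locked.set(false);
    }
}

final class SpinLockDemo {
    /** Increments a shared counter from several threads and returns the final value. */
    static int run(int threads, int perThread) {
        final SpinLock lock = new SpinLock();
        final int[] counter = {0};               // guarded by lock
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }
}
```

An adaptive version would replace the constant with a value that grows when spinning recently succeeded and shrinks when it did not.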

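How volatile Memory Semantics Are Implemented

Memory Barrier Implementation Details

Memory barriers on the x86 architecture:

  • StoreStore barrier: no instruction needed (x86 preserves store order)
  • LoadLoad barrier: no instruction needed
  • LoadStore barrier: no instruction needed
  • StoreLoad barrier: implemented as lock; addl $0,0(%rsp)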

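The JVM's conservative barrier insertion strategy:

// Volatile write
StoreStore barrier   // before the write
volatile store
StoreLoad barrier    // after the write

// Volatile read
volatile load
LoadLoad barrier     // after the read
LoadStore barrier    // after the read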
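Java 9 and later expose explicit fences as static methods on java.lang.invoke.VarHandle, which makes it possible to spell out where the barriers sit. The sketch below uses plain fields plus fences to approximate a volatile write followed by a volatile read. It is meant to show barrier placement only; the class FencedFlag is a hypothetical name, and production code should declare the field volatile rather than hand-place fences.

```java
import java.lang.invoke.VarHandle;

// Sketch: plain fields with explicit fences, approximating volatile semantics.
final class FencedFlag {
    private int data;        // plain field carrying the payload
    private boolean ready;   // plain field; the fences supply the ordering

    void publish(int value) {
        data = value;
        VarHandle.storeStoreFence();   // stores above are not reordered with the store below
        ready = true;
        VarHandle.fullFence();         // plays the role of the StoreLoad barrier
    }

    /** Returns the published value, or -1 if nothing has been published yet. */
    int tryRead() {
        boolean seen = ready;
        VarHandle.acquireFence();      // LoadLoad + LoadStore: later accesses stay below this load
        return seen ? data : -1;
    }
}
```

On x86 only the fullFence() call is expected to produce an actual instruction, in line with the list of x86 barriers earlier in this section.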

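Visibility Guarantee Example

How the barriers around a volatile access (together with cache invalidation) make an earlier plain write visible:

class VolatileExample {
    int x = 0;
    volatile boolean v = false;

    public void writer() {
        x = 42;
        v = true;  // a StoreStore barrier before this volatile write keeps x = 42 ahead of it
    }

    public void reader() {
        if (v) {   // a LoadLoad barrier after this volatile read keeps the read of x behind it
            System.out.println(x);  // prints 42 whenever v was read as true
        }
    }
}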
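A runnable variant of this pattern, with one thread writing and another spinning on the flag, can look like the sketch below. The class VolatilePublication and its demo method are hypothetical names; Thread.onSpinWait() requires Java 9 or later.

```java
// Runnable sketch of safe publication through a volatile flag.
final class VolatilePublication {
    private int x = 0;
    private volatile boolean v = false;

    void writer() {
        x = 42;      // plain write
        v = true;    // volatile write publishes everything written before it
    }

    /** Spins until the flag becomes visible, then returns x. */
    int awaitAndRead() {
        while (!v) {
            Thread.onSpinWait();   // hint to the CPU that this is a busy-wait loop
        }
        return x;    // happens-before guarantees this read sees 42
    }

    /** Runs writer() on a second thread and returns what the reader observes. */
    static int demo() {
        VolatilePublication p = new VolatilePublication();
        Thread w = new Thread(p::writer);
        w.start();
        return p.awaitAndRead();
    }
}
```

If v were not volatile, the loop could spin forever or the reader could legally observe x as 0.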

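CAS: Principles and Implementation

Hardware-Level Implementation

Sequence of the LOCK CMPXCHG instruction on x86:

  1. Lock the bus or, on modern CPUs when the operand sits in a single cache line, lock that cache line through the cache coherence protocol
  2. Compare the value at the target address with the expected value
  3. If they are equal, store the new value
  4. Release the lock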
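In Java this instruction is reached through the atomic classes, and most uses wrap it in a retry loop. The sketch below performs a conditional add with AtomicInteger.compareAndSet; the class CasCounter and the method addIfBelow are hypothetical names for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the classic read / compute / compareAndSet retry loop.
final class CasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    /** Adds delta only if the result stays at or below max; returns true if applied. */
    boolean addIfBelow(int delta, int max) {
        while (true) {
            int current = value.get();           // step 1: read the current value
            int next = current + delta;          // step 2: compute the new value
            if (next > max) {
                return false;                    // the update would exceed the cap
            }
            if (value.compareAndSet(current, next)) {
                return true;                     // step 3: CAS succeeded
            }
            // CAS failed: another thread changed the value in between, so retry
        }
    }

    int get() {
        return value.get();
    }
}
```

The loop never blocks; under contention a thread simply recomputes from the freshly read value.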

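Cache-line alignment optimization:

// False sharing example
class FalseSharing {
    volatile long x;  // a cache line is typically 64 bytes
    volatile long y;  // likely to share a cache line with x
}

// Padding fix. The JVM may reorder fields within one class, but HotSpot lays out
// superclass fields before subclass fields, so the padding is split across a class hierarchy.
class LeftPadding { protected long p1, p2, p3, p4, p5, p6, p7; }   // padding before the value
class PaddedValue extends LeftPadding { protected volatile long value; }
class PaddedLong extends PaddedValue { protected long q1, q2, q3, q4, q5, q6, q7; }  // padding after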

Solving the ABA Problem

Using a stamped atomic reference:

AtomicStampedReference<String> ref = new AtomicStampedReference<>("A", 0);

// Thread 1 reads the current value and stamp: ("A", 0)
int[] stampHolder = new int[1];
String current = ref.get(stampHolder);

// Meanwhile, thread 2 changes A -> B -> A, bumping the stamp each time
ref.compareAndSet("A", "B", 0, 1);
ref.compareAndSet("B", "A", 1, 2);

// Thread 1's CAS now fails: the value is "A" again, but it expects stamp 0 and the actual stamp is 2
ref.compareAndSet(current, "C", stampHolder[0], stampHolder[0] + 1);  // returns false
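To see why the stamp matters, it helps to run the same A -> B -> A sequence against a plain AtomicReference. The sketch below does both in a single thread so the interleaving is deterministic; the class AbaDemo and its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

// Contrasts a plain CAS with a stamped CAS after an A -> B -> A change.
final class AbaDemo {
    /** Plain reference: the intermediate change goes unnoticed, so the stale CAS succeeds. */
    static boolean plainCasAfterAba() {
        AtomicReference<String> ref = new AtomicReference<>("A");
        String seen = ref.get();            // "thread 1" reads A
        ref.compareAndSet("A", "B");        // "thread 2" changes A -> B
        ref.compareAndSet("B", "A");        // "thread 2" changes B -> A
        return ref.compareAndSet(seen, "C");
    }

    /** Stamped reference: the stamp moved from 0 to 2, so the stale CAS is rejected. */
    static boolean stampedCasAfterAba() {
        AtomicStampedReference<String> ref = new AtomicStampedReference<>("A", 0);
        int[] stampHolder = new int[1];
        String seen = ref.get(stampHolder); // "thread 1" reads (A, 0)
        ref.compareAndSet("A", "B", 0, 1);
        ref.compareAndSet("B", "A", 1, 2);
        return ref.compareAndSet(seen, "C", stampHolder[0], stampHolder[0] + 1);
    }
}
```

Both references are compared by identity; the string literals here are interned, which is why "A" matches in both cases.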

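Comparing Concurrency Control Strategies

Performance Comparison Data

JMH benchmark results (nanoseconds per operation):

| Implementation | Single thread | 4 contending threads | 16 contending threads |
|----------------|---------------|----------------------|-----------------------|
| synchronized   | 15            | 1200                 | 8500                  |
| ReentrantLock  | 20            | 950                  | 6200                  |
| AtomicInteger  | 8             | 45                   | 300                   |
| LongAdder      | 12            | 25                   | 50                    |

(Test environment: Intel i9-9900K, 32GB DDR4, JDK 11)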
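LongAdder scales well under contention because it spreads updates over several internal cells and only combines them when sum() is called. The sketch below shows the usage pattern rather than a benchmark; the class LongAdderDemo is a hypothetical name, and no timing claims are attached to it.

```java
import java.util.concurrent.atomic.LongAdder;

// Usage sketch: many threads increment one LongAdder, then the total is read once.
final class LongAdderDemo {
    static long run(int threads, int perThread) {
        final LongAdder adder = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    adder.increment();   // contended updates land in different cells
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return adder.sum();              // exact here because all writers have finished
    }
}
```

The trade-off is that sum() is not an atomic snapshot while updates are still in flight, so LongAdder suits statistics counters better than sequence generators.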

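Decision Tree for Choosing an Approach

graph TD
    A[Need atomic operations?] -->|Yes| B{Operation complexity}
    B -->|Simple| C[Consider CAS]
    B -->|Complex| D[Use a lock]
    A -->|No| E{Need visibility guarantees?}
    E -->|Yes| F[Declare the field volatile]
    E -->|No| G[Plain variable]
    C --> H{How heavy is contention?}
    H -->|Low| I[Stay with CAS]
    H -->|High| J[Switch to LongAdder]
    D --> K{Need flexible control?}
    K -->|Yes| L[Use ReentrantLock]
    K -->|No| M[synchronized]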

Advanced Optimization Techniques

Lock Splitting

class ImprovedServerStatus {
    private final Object usersLock = new Object();
    private final Object queriesLock = new Object();
    private int activeUsers;
    private int runningQueries;

    public void updateUsers() {
        synchronized(usersLock) {
            activeUsers++;
        }
    }

    public void updateQueries() {
        synchronized(queriesLock) {
            runningQueries++;
        }
    }
}

Lock-Free Data Structure Example

An implementation of the Michael-Scott non-blocking queue:

import java.util.concurrent.atomic.AtomicReference;

// CAS is done through AtomicReference (rather than Unsafe field offsets)
// so that the class compiles and runs as written.
public class NonBlockingQueue<T> {
    private static class Node<T> {
        final T item;
        final AtomicReference<Node<T>> next = new AtomicReference<>(null);

        Node(T item) {
            this.item = item;
        }
    }

    private final AtomicReference<Node<T>> head;
    private final AtomicReference<Node<T>> tail;

    public NonBlockingQueue() {
        Node<T> dummy = new Node<>(null);  // sentinel node
        head = new AtomicReference<>(dummy);
        tail = new AtomicReference<>(dummy);
    }

    public void enq(T item) {
        Node<T> node = new Node<>(item);
        while (true) {
            Node<T> last = tail.get();
            Node<T> next = last.next.get();
            if (last == tail.get()) {  // is the snapshot still consistent?
                if (next == null) {
                    // Link the new node behind the current last node
                    if (last.next.compareAndSet(null, node)) {
                        // Swing tail forward; a failure means another thread already helped
                        tail.compareAndSet(last, node);
                        return;
                    }
                } else {
                    // Tail is lagging behind; help advance it
                    tail.compareAndSet(last, next);
                }
            }
        }
    }

    public T deq() {
        while (true) {
            Node<T> first = head.get();
            Node<T> last = tail.get();
            Node<T> next = first.next.get();
            if (first == head.get()) {
                if (first == last) {
                    if (next == null) return null;   // queue is empty
                    tail.compareAndSet(last, next);  // tail is lagging; help advance it
                } else {
                    T item = next.item;
                    // Move head to the next node, which becomes the new sentinel
                    if (head.compareAndSet(first, next)) return item;
                }
            }
        }
    }
}
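In application code there is rarely a need to hand-write this queue: according to its Javadoc, java.util.concurrent.ConcurrentLinkedQueue is based on the Michael and Scott algorithm. The sketch below exercises it from several producer threads; the class QueueDemo is a hypothetical name used for illustration.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Usage sketch: concurrent producers offer elements without any locking.
final class QueueDemo {
    /** Fills a queue from several threads, then drains it and returns the element count. */
    static int run(int producers, int perProducer) {
        final ConcurrentLinkedQueue<Integer> queue = new ConcurrentLinkedQueue<>();
        Thread[] workers = new Thread[producers];
        for (int i = 0; i < producers; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perProducer; j++) {
                    queue.offer(j);          // lock-free enqueue
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        int drained = 0;
        while (queue.poll() != null) {       // poll() returns null once the queue is empty
            drained++;
        }
        return drained;
    }
}
```

No element is lost or duplicated even though the producers never block each other.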

Troubleshooting Common Problems

Deadlock Detection

Detecting deadlocks with jstack:

jstack <pid> | grep -A 10 deadlock

# Typical output
Found one Java-level deadlock:
=============================
"Thread-1":
  waiting to lock monitor 0x00007f88d8004818 (object 0x000000076eaf8a40, a java.lang.Object),
  which is held by "Thread-0"
"Thread-0":
  waiting to lock monitor 0x00007f88d8006208 (object 0x000000076eaf8a50, a java.lang.Object),
  which is held by "Thread-1"
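The same check can be made from inside the process with the standard java.lang.management API. The sketch below calls ThreadMXBean.findDeadlockedThreads() and, for demonstration, deliberately deadlocks two daemon threads by taking two locks in opposite order. DeadlockProbe and its helper methods are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Sketch: in-process deadlock detection with ThreadMXBean.
final class DeadlockProbe {
    /** Returns the IDs of deadlocked threads, or an empty array if there are none. */
    static long[] findDeadlocks() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids = bean.findDeadlockedThreads();   // null when no deadlock exists
        return ids == null ? new long[0] : ids;
    }

    /** Starts two daemon threads that deadlock on purpose (demo only). */
    static void startDeadlockedPair() {
        final Object a = new Object();
        final Object b = new Object();
        final CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread t0 = new Thread(() -> lockInOrder(a, b, bothHoldFirst), "probe-0");
        Thread t1 = new Thread(() -> lockInOrder(b, a, bothHoldFirst), "probe-1");
        t0.setDaemon(true);   // daemon threads do not keep the JVM alive
        t1.setDaemon(true);
        t0.start();
        t1.start();
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch latch) {
        synchronized (first) {
            latch.countDown();
            try {
                latch.await();            // wait until both threads hold their first lock
            } catch (InterruptedException e) {
                return;
            }
            synchronized (second) {
                // never reached: the other thread holds this lock
            }
        }
    }

    /** Polls until a deadlock is reported or the timeout expires. */
    static long[] awaitDeadlock(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        long[] ids = findDeadlocks();
        while (ids.length == 0 && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            ids = findDeadlocks();
        }
        return ids;
    }
}
```

A monitoring thread could call findDeadlocks() periodically and log ThreadMXBean.getThreadInfo(ids) for the reported IDs.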

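Diagnosing Performance Problems

Monitoring lock contention with JFR (Java Flight Recorder):

import java.time.Duration;
import jdk.jfr.Recording;

// jdk.jfr API (JDK 11+). jdk.JavaMonitorEnter records time spent blocked on a monitor;
// jdk.JavaMonitorWait records time spent in Object.wait().
Recording recording = new Recording();
recording.enable("jdk.JavaMonitorEnter").withThreshold(Duration.ofMillis(10));
recording.enable("jdk.JavaMonitorWait").withThreshold(Duration.ofMillis(10));
recording.start();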