Race condition description
During a hot reload of the one-nio server configuration with TLS enabled, the SSL context
is reconfigured unconditionally, even when the TLS configuration has not changed at all.
The reconfiguration races with in-flight TLS handshakes and can crash the JVM.
Test that reproduces the issue: #124
Test log with crash
Step 1: Server.reconfigure()
Step 2: acceptor.reconfigure() — calls configure on SSL_CTX
// Server.java:97
acceptor.reconfigure(config.acceptors)
→ DefaultAcceptorGroup.reconfigure(ac)
→ AcceptorThread.reconfigure(ac)
→ AcceptorSupport.reconfigureSocket(serverSocket, config)
→ sslContext.configure(config.ssl) // ← CALLED UNCONDITIONALLY, even if SSL config hasn't changed
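A minimal Java sketch of the kind of guard this call chain is missing: remember the last applied settings and skip the native call when nothing changed. `CipherSettings` and `GuardedContext` are hypothetical stand-ins, not one-nio classes, and a counter replaces the real native `setCiphers` / `setCiphersuites` calls. Such a guard would only help for reloads that leave TLS untouched; a genuine cipher change would still modify the live SSL_CTX.

```java
import java.util.Objects;

class SslReconfigureGuardSketch {
    /** Hypothetical immutable snapshot of the cipher settings (not one-nio's SslConfig). */
    static final class CipherSettings {
        final String ciphers;
        final String ciphersuites;

        CipherSettings(String ciphers, String ciphersuites) {
            this.ciphers = ciphers;
            this.ciphersuites = ciphersuites;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof CipherSettings)) return false;
            CipherSettings other = (CipherSettings) o;
            return Objects.equals(ciphers, other.ciphers)
                    && Objects.equals(ciphersuites, other.ciphersuites);
        }

        @Override
        public int hashCode() {
            return Objects.hash(ciphers, ciphersuites);
        }
    }

    /** Hypothetical stand-in for the native-backed context; counts calls instead of touching SSL_CTX. */
    static final class GuardedContext {
        private CipherSettings applied;
        int nativeCalls;

        /** Returns true if the settings were applied, false if the reload was a no-op. */
        synchronized boolean configure(CipherSettings next) {
            if (next.equals(applied)) {
                return false; // unchanged: leave the live context alone
            }
            nativeCalls++;    // assumption: this is where the native setters would run
            applied = next;
            return true;
        }
    }

    public static void main(String[] args) {
        GuardedContext ctx = new GuardedContext();
        System.out.println(ctx.configure(new CipherSettings("ECDHE+AESGCM", null)));
        System.out.println(ctx.configure(new CipherSettings("ECDHE+AESGCM", null)));
        System.out.println("native calls: " + ctx.nativeCalls);
    }
}
```

The comparison is by value, so a reload that re-parses the same configuration into a new object is still treated as unchanged.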
Step 3: SslContext.configure() — unconditional modification of SSL_CTX that is currently used by active connections
// SslContext.java:85-86
setCiphers(config.ciphers != null ? config.ciphers : SslConfig.DEFAULT_CIPHERS);
setCiphersuites(config.ciphersuites != null ? config.ciphersuites : SslConfig.DEFAULT_CIPHERSUITES);
What setCiphers does in native code:
// ssl.c:707-718
SSL_CTX* ctx = (SSL_CTX*)(intptr_t)(*env)->GetLongField(env, self, f_ctx);
const char* value = (*env)->GetStringUTFChars(env, ciphers, NULL);
int result = SSL_CTX_set_cipher_list(ctx, value); // ← hot-modifies a live SSL_CTX
Race condition in OpenSSL cipher list
SSL_CTX_set_cipher_list() internally calls ssl_create_cipher_list():
// Simplified OpenSSL internals
int ssl_create_cipher_list(SSL_CTX *ctx, STACK_OF(SSL_CIPHER) **cipher_list, ...) {
    // ...
    sk_SSL_CIPHER_free(*cipher_list);  // Frees the old stack
    // *cipher_list now holds a dangling pointer
    // ... builds new cipherstack ...
    *cipher_list = cipherstack;        // Assigns the new stack
}
Between freeing the old stack and assigning the new one, ctx->cipher_list contains a dangling pointer.
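For contrast, here is a hedged Java sketch of an ordering that has no such window: the replacement is built completely and then published in a single atomic step, so a reader sees either the old list or the new one. This is an illustration only, not what OpenSSL or one-nio does; `PublishThenRetireSketch` and `CipherHolder` are hypothetical names. In Java the garbage collector keeps the retired list alive for readers that still hold it; native code would need reference counting or a deferred free to get the same guarantee.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class PublishThenRetireSketch {
    /** Hypothetical holder for a cipher list that readers access concurrently. */
    static final class CipherHolder {
        private final AtomicReference<List<String>> current;

        CipherHolder(List<String> initial) {
            current = new AtomicReference<>(copy(initial));
        }

        /** Builds the replacement fully, then swaps it in; returns the retired list. */
        List<String> replace(List<String> next) {
            List<String> built = copy(next);   // not visible to readers yet
            return current.getAndSet(built);   // single atomic publication step
        }

        /** Readers take one snapshot and keep using it for the whole operation. */
        List<String> snapshot() {
            return current.get();
        }

        private static List<String> copy(List<String> source) {
            return Collections.unmodifiableList(new ArrayList<>(source));
        }
    }

    public static void main(String[] args) {
        CipherHolder holder = new CipherHolder(Collections.singletonList("OLD_CIPHER"));
        List<String> inFlight = holder.snapshot();   // e.g. a handshake in progress
        holder.replace(Collections.singletonList("NEW_CIPHER"));
        System.out.println("in-flight reader still sees: " + inFlight);
        System.out.println("new readers see: " + holder.snapshot());
    }
}
```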
Meanwhile on selector threads:
For connections in the TLS handshake phase:
SelectorThread.run()
→ Session.read()
→ NativeSslSocket.read()
→ SSL_read(ssl, buf, count)
→ SSL_do_handshake()
→ ssl3_choose_cipher()
→ SSL_get_ciphers(ssl)
The crash
Reconfigure thread:                        NIO Selector thread:
───────────────────                        ──────────────────────────
SSL_CTX_set_cipher_list(ctx, ciphers)
  → ssl_create_cipher_list()
    → sk_SSL_CIPHER_free(ctx->cipher_list)
                                           SSL_read() → handshake
                                             → ssl3_choose_cipher()
                                               → SSL_get_ciphers(ssl)
                                                 → return ctx->cipher_list  // Dangling pointer
                                               → sk_SSL_CIPHER_num(list)
                                               → sk_SSL_CIPHER_value(list, 0)
                                                 → OPENSSL_sk_value(st, 0)
                                                   → st->data[0]  // SIGSEGV, JVM crash
    → ctx->cipher_list = new_stack
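The interleaving above can be replayed deterministically in a small Java model. This is only an illustration: `Stack`, `Ctx` and `replay` are hypothetical names, a `freed` flag stands in for memory handed back to the allocator, and an exception stands in for the SIGSEGV. The steps run on a single thread, in the order shown in the timeline.

```java
class CipherListRaceModel {
    /** Hypothetical model of an OpenSSL stack; "freed" stands in for released memory. */
    static final class Stack {
        private final String[] data;
        private boolean freed;

        Stack(String... data) {
            this.data = data;
        }

        void free() {
            freed = true;
        }

        String value(int index) {
            if (freed) {
                // In native code this read is undefined behaviour, for example a SIGSEGV.
                throw new IllegalStateException("use after free");
            }
            return data[index];
        }
    }

    /** Hypothetical model of the shared context that holds the cipher list pointer. */
    static final class Ctx {
        Stack cipherList;
    }

    /** Replays the timeline step by step and reports what the reader observed. */
    static String replay() {
        Ctx ctx = new Ctx();
        ctx.cipherList = new Stack("OLD_CIPHER");

        // Reconfigure thread: frees the old stack, the pointer is not updated yet.
        ctx.cipherList.free();

        // Selector thread: picks up the stale pointer and dereferences it.
        Stack seen = ctx.cipherList;
        String observed;
        try {
            observed = seen.value(0);
        } catch (IllegalStateException e) {
            observed = e.getMessage();
        }

        // Reconfigure thread: only now installs the new stack.
        ctx.cipherList = new Stack("NEW_CIPHER");
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("reader observed: " + replay());
    }
}
```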
Crash log from production:
Stack: [0x00007fea419c0000,0x00007fea41a00000], sp=0x00007fea419ee3a8, free space=184k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libcrypto.so.3+0x3745b4] OPENSSL_sk_value+0x14
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 12099 one.nio.net.NativeSslSocket.read([BIII)I (0 bytes) @ 0x00007ff054976394 [0x00007ff054976340+0x0000000000000054]
J 14764 c2 one.nio.net.Session.read([BII)I (35 bytes) @ 0x00007ff054c8a944 [0x00007ff054c8a8e0+0x0000000000000064]
J 23530 c1 one.nio.http.HttpSession.processRead([B)V (139 bytes) @ 0x00007ff04d1d9094 [0x00007ff04d1d8f20+0x0000000000000174]
J 18365% c2 one.nio.server.SelectorThread.run()V (142 bytes) @ 0x00007ff055484a24 [0x00007ff055484740+0x00000000000002e4]
v ~StubRoutines::call_stub
siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL), si_addr: 0x0000000000000000