Commit 4acd56d

q2ven authored and kawasaki committed
net: Introduce lock_sock_try().
syzbot has reported 100+ possible deadlock splats involving NBD, typically following this pattern:

  lock_sock(sk)
  -> GFP_KERNEL memory allocation
  -> fs reclaim
  -> lock_sock(sk) at NBD

Before calling sock_sendmsg() or sock_recvmsg(), NBD sets sk->sk_allocation to GFP_NOIO to prevent fs reclaim from being triggered during memory allocation for the backend socket.

However, even after a socket is passed to NBD, it remains exposed to userspace and thus can exercise various slow paths under lock_sock(), where GFP_KERNEL is used directly instead of sk->sk_allocation, leading to the deadlock.

Some of those paths do not currently have a reference to struct sock, and plumbing the sk pointer through the call chain just to fix the allocation flags would be extremely cumbersome. Even with that, lockdep would not be happy because such a path could be exercised before passing the socket to NBD, and then lockdep would learn that the path could trigger fs reclaim.

Additionally, since the socket is exposed to userspace, we cannot change the lockdep key (even for sk->sk_lock.dep_map, due to lock_sock_fast()).

We could spread memalloc_noio_{save,restore} over the networking code, but we want to avoid that and solve it in the NBD layer, which requires the trylock variant of lock_sock().

Let's introduce lock_sock_try() for that purpose.

Signed-off-by: Kuniyuki Iwashima <[email protected]>
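For context, the caller-side pattern this enables in NBD might look like the sketch below. This is illustrative only and not part of this commit: the function and the -EAGAIN requeue convention are hypothetical. The point is that when the trylock fails, the driver backs off instead of sleeping, so it can never block behind a lock owner whose GFP_KERNEL allocation entered fs reclaim.

```c
/* Hypothetical sketch, not from this commit: an NBD-style caller
 * using lock_sock_try() instead of lock_sock(). */
static int nbd_send_cmd_sketch(struct sock *sk)
{
	if (!lock_sock_try(sk)) {
		/*
		 * The current lock owner may be a userspace path doing
		 * a GFP_KERNEL allocation that recursed into fs reclaim;
		 * sleeping here could deadlock, so back off and let the
		 * caller requeue the request and retry later.
		 */
		return -EAGAIN;
	}

	/* ... sock_sendmsg() with sk->sk_allocation == GFP_NOIO ... */

	release_sock(sk);
	return 0;
}
```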
1 parent ebe44b3 commit 4acd56d

1 file changed

Lines changed: 31 additions & 0 deletions

File tree

include/net/sock.h

@@ -1710,6 +1710,37 @@ static inline void lock_sock(struct sock *sk)
 }
 
 void __lock_sock(struct sock *sk);
+
+/**
+ * lock_sock_try - trylock version of lock_sock
+ * @sk: socket
+ *
+ * Use of this function is strongly discouraged.
+ *
+ * It is primarily intended for NBD, where the driver must avoid
+ * deadlock during fs reclaim caused by the backend socket remaining
+ * exposed to userspace even after being handed over to NBD,
+ * which _is_ bad but too late to change.
+ *
+ * Return: true if the lock was acquired, false otherwise.
+ */
+static inline bool lock_sock_try(struct sock *sk)
+{
+	if (!spin_trylock_bh(&sk->sk_lock.slock))
+		return false;
+
+	if (sk->sk_lock.owned) {
+		spin_unlock_bh(&sk->sk_lock.slock);
+		return false;
+	}
+
+	sk->sk_lock.owned = 1;
+	spin_unlock_bh(&sk->sk_lock.slock);
+
+	mutex_acquire(&sk->sk_lock.dep_map, 0, 1, _RET_IP_);
+	return true;
+}
+
 void __release_sock(struct sock *sk);
 void release_sock(struct sock *sk);
