Date: Sun, 23 Feb 2020 08:40:37 +0000
From: "Taylor R Campbell"
Subject: CVS commit: src/sys/ufs/lfs
To: source-changes@NetBSD.org
Message-Id: <20200223084037.5EF0BFBF4@cvs.NetBSD.org>
Reply-To: source-changes-d@NetBSD.org
List-Id: source-changes.NetBSD.org

Module Name:    src
Committed By:   riastradh
Date:           Sun Feb 23 08:40:37 UTC 2020

Modified Files:
        src/sys/ufs/lfs: lfs_extern.h lfs_segment.c lfs_subr.c

Log Message:
Break deadlock in PR kern/52301.

The lock order is lfs_writer -> lfs_seglock.  The problem in 52301 is
that lfs_segwrite violates this lock order by sometimes doing
lfs_seglock -> lfs_writer, either (a) when doing a checkpoint or (b),
opportunistically, when there are no dirops pending.
Both cases can deadlock, because dirops sometimes take the seglock
(lfs_truncate, lfs_valloc, lfs_vfree):

(a) There may be dirops pending, and they may be waiting for the
    seglock, so we can't wait for them to complete while holding the
    seglock.

(b) The test for fs->lfs_dirops == 0 happens unlocked, and the state
    may change by the time lfs_writer_enter acquires lfs_lock.

To resolve this in each case:

(a) Do lfs_writer_enter before lfs_seglock, since we will need it
    unconditionally anyway.  The worst performance impact of this
    should be that some dirops get delayed a little bit.

(b) Create a new lfs_writer_tryenter to use at this point so that
    the test for fs->lfs_dirops == 0 and the acquisition of
    lfs_writer happen atomically under lfs_lock.


To generate a diff of this commit:
cvs rdiff -u -r1.115 -r1.116 src/sys/ufs/lfs/lfs_extern.h
cvs rdiff -u -r1.284 -r1.285 src/sys/ufs/lfs/lfs_segment.c
cvs rdiff -u -r1.98 -r1.99 src/sys/ufs/lfs/lfs_subr.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
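
Illustration of resolution (b) in the log message: the following is a minimal user-space sketch, using POSIX threads, of a "try enter" primitive that tests the dirop count and claims the writer slot under one mutex, so the state cannot change between the test and the acquisition. It is not the code from this commit. The struct fs_model type, its fields, and every function name here are hypothetical stand-ins chosen to echo the names in the log message; the real lfs_writer_tryenter may differ in signature and detail.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of the state involved; NOT NetBSD's struct lfs. */
struct fs_model {
	pthread_mutex_t lock;   /* stands in for lfs_lock */
	pthread_cond_t  cv;     /* signalled when dirops or writer drop */
	unsigned        dirops; /* directory operations in flight */
	unsigned        writer; /* nonzero while a writer holds the slot */
};

static void
fs_model_init(struct fs_model *fs)
{
	pthread_mutex_init(&fs->lock, NULL);
	pthread_cond_init(&fs->cv, NULL);
	fs->dirops = 0;
	fs->writer = 0;
}

/*
 * Try to become a writer without sleeping.  The check of dirops and the
 * increment of writer happen under the same mutex, so no dirop can
 * start in between.  Returns true on success, false if dirops are
 * pending.
 */
static bool
writer_tryenter(struct fs_model *fs)
{
	bool entered;

	pthread_mutex_lock(&fs->lock);
	entered = (fs->dirops == 0);
	if (entered)
		fs->writer++;
	pthread_mutex_unlock(&fs->lock);
	return entered;
}

/* Release the writer slot and wake any dirops waiting for it. */
static void
writer_leave(struct fs_model *fs)
{
	pthread_mutex_lock(&fs->lock);
	assert(fs->writer > 0);
	if (--fs->writer == 0)
		pthread_cond_broadcast(&fs->cv);
	pthread_mutex_unlock(&fs->lock);
}

/* Start a directory operation, waiting until no writer is active. */
static void
dirop_enter(struct fs_model *fs)
{
	pthread_mutex_lock(&fs->lock);
	while (fs->writer != 0)
		pthread_cond_wait(&fs->cv, &fs->lock);
	fs->dirops++;
	pthread_mutex_unlock(&fs->lock);
}

/* Finish a directory operation. */
static void
dirop_leave(struct fs_model *fs)
{
	pthread_mutex_lock(&fs->lock);
	assert(fs->dirops > 0);
	if (--fs->dirops == 0)
		pthread_cond_broadcast(&fs->cv);
	pthread_mutex_unlock(&fs->lock);
}
```

In this model, a caller that already holds some other lock (the analogue of the seglock) can call writer_tryenter and skip the opportunistic work when it returns false, rather than sleeping for dirops that may themselves be waiting on that other lock. Build with a POSIX threads flag such as -pthread.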