Link [ NetBSD | NetBSD OpenGrok source search | PR fulltext-search | Summary of daily snapshot builds | history of daily build result | pkgsrc commit viewer ]


   
        usage: [branch:branch] [user:user] [path@revision] keyword [... [-excludekeyword [...]]] (e.g. branch:MAIN sys/arch/arm, if_wm.c@1.234 )




switch to index mode

recent branches: MAIN (1h)  netbsd-10 (8d)  netbsd-9 (8d)  netbsd-8 (12d) 

2024-05-24 15:20:57 UTC Now

2022-05-21 15:03:08 UTC MAIN commitmail json YAML

doc/CHANGES: Expand notes on vether(4) and tap(4).

(riastradh)

2022-05-21 14:58:20 UTC MAIN commitmail json YAML

doc/CHANGES: Nix trailing whitespace.

(riastradh)

2022-05-21 13:31:29 UTC MAIN commitmail json YAML

opencrypto/h_ioctl: Print error messages with warn.

Might help track down what's going on when this fails.

(riastradh)

2022-05-19 20:51:59 UTC MAIN commitmail json YAML

opencrypto: Assert !cpu_intr_p() on dispatch and invoke.

These should only ever have been potentially called from hard
interrupt context by CRYPTO_F_CBIMM callbacks (CBIMM = call back
immediately).  CRYPTO_F_CBIMM is no more, so there is no more need to
allow this case of call from hard interrupt context.

(riastradh)

2022-05-19 20:51:46 UTC MAIN commitmail json YAML

opencrypto: Nix CRYPTO_F_USER, CRYPTO_F_CBIMM, CRYPTO_F_CBIFSYNC.

CRYPTO_F_USER is no longer needed.  It was introduced in 2008 by
darran@ in crypto.c 1.30, cryptodev.c 1.45 in an attempt to avoid
double-free between the issuing thread and asynchronous callback.
But the `fix' didn't work.  In 2017, knakahara@ fixed it properly in
cryptodev.c 1.87 by distinguishing `the crypto operation has
completed' (CRYPTO_F_DONE) from `the callback is done touching the
crp object' (CRYPTO_F_DQRETQ, now renamed to CRYPTODEV_F_RET).

CRYPTO_F_CBIMM formerly served to invoke the callback synchronously
from the driver's interrupt completion routine, to reduce contention
on what was once a single cryptoret thread.  Now, there is a per-CPU
queue and softint for much cheaper processing, so there is less
motivation for this in the first place.  So let's remove the
complicated logic.  This means the callbacks never run in hard
interrupt context, which means we don't need to worry about recursion
into crypto_dispatch in hard interrupt context.

(riastradh)

2022-05-18 20:03:58 UTC MAIN commitmail json YAML

crypto(4): Simplify error test in cryptodev_op.

No functional change intended.

(riastradh)

2022-05-18 20:03:45 UTC MAIN commitmail json YAML

crypto(4): Narrow scope of cryptodev_mtx to cover wait.

No functional change intended -- this only removes an unnecessary
lock/unlock cycle in the error case.

(riastradh)

2022-05-18 20:03:32 UTC MAIN commitmail json YAML

crypto(4): Nix long-dead code and comments.

(riastradh)

2022-05-18 20:02:49 UTC MAIN commitmail json YAML

crypto(4): Use IPL_NONE, not IPL_NET, for /dev/crypto pools.

These are used (pool_get/put) only from thread context, never from
interrupt or even soft interrupt context.

(riastradh)

2022-05-18 12:48:50 UTC MAIN commitmail json YAML

ubsec(4): Nix dead code.

No functional change intended.

(riastradh)

2022-05-17 15:00:05 UTC MAIN commitmail json YAML

cprng(9): Note ipl must be at most IPL_SOFTSERIAL now.

(riastradh)

2022-05-17 10:32:58 UTC MAIN commitmail json YAML

opencrypto: Factor setting CRYPTO_F_DONE out of branches.

This had been done in 1.30 when the locking was different.  No need
any more.  No functional change intended.

(riastradh)

2022-05-17 10:28:08 UTC MAIN commitmail json YAML

pfil(9): Assert pfil lists are not run in interrupt context.

All the paths leading to this should have been dispensed with by now.
The network stack runs in thread or softint context these days; hard
interrupt context is used only to put packets on queues deferred to
softint.

(riastradh)

2022-05-17 10:27:37 UTC MAIN commitmail json YAML

pfil(9): Assert sleepable when editing pfil lists.

These might sleep to wait for users to drain.

(riastradh)

2022-05-17 09:53:09 UTC MAIN commitmail json YAML

opencrypto(9): Omit needless casts around callbacks.

Just declare the right types to begin with.  No functional change
intended.

(riastradh)

2022-05-17 01:39:57 UTC MAIN commitmail json YAML

rnd(9): Note that rndsource callbacks are never run in hardint.

But they may be run in softint at IPL_SOFTSERIAL.

(riastradh)

2022-05-15 16:58:28 UTC MAIN commitmail json YAML

sun8icrypto(4): Switch off polling when ready for interrupts.

When I introduced logic to do polling and then interrupts, I
accidentally made it switch polling from on to...still on, which had
the effect of breaking the logic after sun8i_crypto_attach because
only sun8i_crypto_attach actually did polling.

(riastradh)

2022-05-15 16:20:10 UTC MAIN commitmail json YAML

adjtime(2): Handle negative tv_sec and tv_usec.

Previously I clamped these to avoid dangerous arithmetic overflow.
But I assumed sensible values should be nonnegative.

For tv_sec, this assumption was just wrong -- the adjustment may be
negative.

For tv_usec, this assumption is...not wrong, but also not right.
tv_usec is not _supposed_ to be negative (by POSIX, the type need
only represent values in [-1, 1000000]; semantically the member is
supposed to be a nonnegative number of microseconds below 1000000),
but ntp abuses it to hold negative values, for reasons unclear -- the
same effect could be had by subtracting one from tv_sec, and adding
1000000 to the negative tv_usec.  However, let's not break existing
ntp userlands...

(riastradh)

2022-05-15 12:45:33 UTC MAIN commitmail json YAML

x86: Use atomic_store_release/atomic_load_consume for nmi_handlers.

Simplifies things a bit.  No functional change intended.

(riastradh)

2022-05-14 19:44:37 UTC MAIN commitmail json YAML

xhci(4): Handle race between software abort and hardware stall.

(riastradh)

2022-05-14 19:44:26 UTC MAIN commitmail json YAML

xhci(4): Fix edge case in simultaneous xfer abort and failure.

On successful usbd_xfer_trycomplete, caller must set ux_status and
call usb_transfer_complete before releasing the pipe (bus) lock.
Failing to call usb_transfer_complete is a mistake.  Presumably this
was intended to claim the xfer to complete it only on the last
packet.

I previously introduced the violation of this rule when the code
looked like

xfer->ux_status = err;
if (trb stuff)
usb_transfer_complete(xfer);

I mostly mechanically changed all the assignments of xfer->ux_status
to do usbd_xfer_trycomplete first and then usb_transfer_complete.

In the original, the extra assignment of xfer->ux_status in the event
we _don't_ immediately call usb_transfer_complete was likely
redundant and (except insofar as the abort protocol was broken)
harmless.  But now it is a problem because of the contract between
usbd_xfer_trycomplete and usb_transfer_complete under the pipe (bus)
lock.  In retrospect, the original probably should have been

if (trb stuff) {
xfer->ux_status = err;
usb_transfer_complete(xfer);
}

and my mechanical transformation should have worked, but also in
retrospect I should have put more thought into the change and done it
a little less mechanically.

(riastradh)

2022-05-14 15:29:08 UTC MAIN commitmail json YAML

uvideo(4): Fix missing line breaks in debug messages.

(riastradh)

2022-05-14 15:28:59 UTC MAIN commitmail json YAML

uvideo(4): Fix lengths of various frame descriptors.

This driver doesn't use the frame interval members, which are either
fixed (if continuous) or flexible (if discrete) and so can't be
encoded in C types correctly.  If we did use them, it would be
necessary to use pointer arithmetic on char pointers in the enclosing
descriptor buffer.  But we don't, so this is simpler, and fixes the
sizeof checks to avoid running off the end of invalid descriptors.

Should fix failure to parse legitimate descriptors (without
regressing to choking on malicious ones):

-uvideo: found format (index 1) type 9 size 1280x720 size 1843200 stride 2560 interval 333333
- ^ picking this one
-uvideo: found format (index 2) type 9 size 640x480 size 614400 stride 1280 interval 333333
+uvideo: truncated CS subtype-0x7 descriptor, length 30 < 38uvideo: unimplemented VS CS descriptor len=30 type=0x24 subtype=0x07
+uvideo: unimplemented VS CS descriptor len=30 type=0x24 subtype=0x07

(riastradh)

2022-05-14 15:28:50 UTC MAIN commitmail json YAML

uvideo(4): Avoid exposing streams with invalid descriptors.

(riastradh)

2022-05-13 09:49:44 UTC MAIN commitmail json YAML

rkv1crypto(4): Fix units in RNG repeated-output health test.

This code was intended to check whether the two 4-word halves of an
8-word, 32-byte, 256-bit sample were repeated.

Instead, it accidentally checked whether the first 4 _bytes_ of the
two halves were repeated.

The effect was a false alarm rate of 1/2^32, instead of a false alarm
rate of 1/2^128, with no change on the true alarm rate in the event
of an RNG wedged producing all-zero or all-one bits.  1/2^128 is an
acceptable false alarm rate; 1/2^32, not so much.

(The false alarm right might be higher if the samples are not
perfectly uniformly distributed, which they most likey aren't,
although the documentation doesn't give any details other than
suggesting it's a ring oscillator under the hood, which provides
entropy from jitter induced by thermal noise.  This driver records
half a bit of entropy per bit of sample to be reasonably
conservative.)

(riastradh)

2022-05-13 09:40:25 UTC MAIN commitmail json YAML

cprng(9): Fix accidental 4x seed size.

With SHA-256, NIST Hash_DRBG takes an preferred 440-bit/55-byte seed.
It's a weird number, and I'm not sure where it comes from (a quick
skim of SP800-90A doesn't turn anything up), but it's certainly
sufficient (256-bit/32-byte seed is almost certainly enough) so it's
not a problem to use something larger; Hash_DRBG can absorb seeds of
arbitrary lengths and larger seeds can't really hurt security (with
minor caveats like HMAC RO quirks that don't apply here).

Except -- owing to a typo, we actually used a 1760-bit/220-byte seed,
because I wrote `uint32_t seed[...]' instead of `uint8_t seed[...]'.
Again: not a problem to use a seed larger than needed.  But let's
draw no more than we need out of the entropy pool!

Verified with CTASSERT(sizeof(seed) == 55).  (Assertion omitted from
this commit because we might swap out Hash_DRBG for something else
with a different seed size like 32 bytes.)

(riastradh)

2022-05-13 09:40:02 UTC MAIN commitmail json YAML

entropy(9): Update comment about where entropy_extract is allowed.

As of last month, it is forbidden in all hard interrupt context.

(riastradh)

2022-05-13 09:39:52 UTC MAIN commitmail json YAML

entropy(9): Note rules about how to use entropy_extract output.

(riastradh)

2022-05-13 09:39:40 UTC MAIN commitmail json YAML

x86/pmap: Feed entropy_extract output through nist_hash_drbg.

The entropy pool algorithm is NOT designed to provide backtracking
resistance on its own -- it MUST be combined with a PRNG/DRBG that
provides that.

The only reason we use entropy_extract here is that cprng(9) is not
available yet (which in turn is because kmem and other basic kernel
facilities aren't available yet), but nist_hash_drbg doesn't have any
initialization order requirements, so we'll just use it directly.

(riastradh)

2022-04-21 12:06:32 UTC MAIN commitmail json YAML

mips/cavium: Take advantage of Octeon's guaranteed r/rw ordering.

(riastradh)

2022-04-21 12:05:13 UTC MAIN commitmail json YAML

futex(9): Convert membar_enter/exit to membar_acquire/release.

No functional change -- this is just in an illustrative comment!

(riastradh)

2022-04-19 22:53:34 UTC MAIN commitmail json YAML

Revert "mmap(2): If we fail with a hint, try again without it."

This doesn't work, because uvm_mmap releases the uobj when it fails.
Should factor this more coherently, but let's just revert for now.

Reported-by: syzbot+d347c8951821b236117a@syzkaller.appspotmail.com
Reported-by: syzbot+7643d1b769fdfa18c3b2@syzkaller.appspotmail.com
Reported-by: syzbot+44f4b39671dd580cba5c@syzkaller.appspotmail.com
Reported-by: syzbot+b5a422299ca4ffe8570c@syzkaller.appspotmail.com
Reported-by: syzbot+22681822db67b6e90cfb@syzkaller.appspotmail.com
Reported-by: syzbot+e59f493ceef72b925a17@syzkaller.appspotmail.com
Reported-by: syzbot+666f3fe8364f47e8641b@syzkaller.appspotmail.com
Reported-by: syzbot+511d4572f52f1fd9b5cc@syzkaller.appspotmail.com

(riastradh)

2022-04-19 09:19:53 UTC MAIN commitmail json YAML

audio(4): Wait for opens to drain in detach.

Otherwise detach may barge ahead and start freeing things before open
has finished and is about to use them after free.

Reported-by: syzbot+31d2619e72c2c8436cc9@syzkaller.appspotmail.com

(riastradh)

2022-04-19 01:35:28 UTC MAIN commitmail json YAML

umcs(4): Avoid using uninitialized data if register read fails.

Reported-by: syzbot+511b32f415150b469250@syzkaller.appspotmail.com

(riastradh)

2022-04-19 01:34:52 UTC MAIN commitmail json YAML

mmap(2): If we fail with a hint, try again without it.

`Hint' here means nonzero addr, but no MAP_FIXED or MAP_TRYFIXED.

This is suboptimal -- we could teach uvm_mmap to do a fancier search
using the address as a hint.  But this should do for now.

Candidate fix for PR kern/55533.

ok chs@

(riastradh)

2022-04-17 13:17:56 UTC MAIN commitmail json YAML

umb(4): Use memcpy, not potentially unaligned/aliased casts.

(riastradh)

2022-04-17 13:17:40 UTC MAIN commitmail json YAML

umb(4): Validate descriptor lengths.

(riastradh)

2022-04-17 13:17:30 UTC MAIN commitmail json YAML

uvideo(4): Parse descriptors more robustly.

Validate lengths and types before barging ahead.

Not sure exactly which missing validation syzbot tripped on here, but
I'm pretty sure I caught all the cases.

Reported-by: syzbot+60f0a25c077b67547f57@syzkaller.appspotmail.com

(riastradh)

2022-04-17 13:17:19 UTC MAIN commitmail json YAML

uvideo(4): Convert conditional to KASSERT in uvideo_attach.

usb_desc_iter_next_interface no longer returns truncated interface
descriptors, so we no longer have to check for that here.

(riastradh)

2022-04-17 13:17:06 UTC MAIN commitmail json YAML

uvideo(4): Convert conditional to KASSERT in uvideo_unit_alloc.

The one caller guarantees the condition already.

(riastradh)

2022-04-17 13:16:52 UTC MAIN commitmail json YAML

usbdi(9): Make usb_desc_iter_next_interface reject truncated descs.

Currently callers have to check whether the descriptor is truncated,
which is willy.

(riastradh)

2022-04-17 13:16:43 UTC MAIN commitmail json YAML

usbdi(9): Minor KNF and tidying in descriptor iteration.

No functional change intended.

(riastradh)

2022-04-17 13:16:11 UTC MAIN commitmail json YAML

uvideo(4): Use sizeof(*p), not sizeof(T), for kmem_alloc/free.

No functional change intended.

(riastradh)

2022-04-17 13:15:48 UTC MAIN commitmail json YAML

uvideo(4): KNF comment style.

No functional change intended.

(riastradh)

2022-04-17 13:15:37 UTC MAIN commitmail json YAML

usbdi(9): Restore usb_descriptor_t to its correct definition.

Descriptors in the USB spec all start with bLength and
bDescriptorType.  bDescriptorSubtype is only for certain
class-specific descriptors.  Many descriptors, such as
usb_device_descriptor_t, _do not_ have bDescriptorSubtype, so using a
structure that has bDescriptorSubtype for such descrpitors is wrong.

There is some history here:

- Back in 1998, when augustss@ introduced the USB stack, the type
  usb_descriptor_t was erroneously defined with a bDescriptorSubtype
  member.

- In 2007, drochner@ removed this member to accurately reflect the
  USB spec.

- In 2018, khorben@ appeared to have accidentally reintroduced it
  while importing the umb(4) driver from OpenBSD, which still has the
  erroneous bDescriptorSubtype member in usb_descriptor_t.

The umb(4) driver has since been adjusted to correctly use
usb_cdc_descriptor_t instead of usb_descriptor_t (and I have now
restored umidi_cs_descriptor_t which I had removed last month before
I realized this history of usb_descriptor_t), so this member is no
longer necessary.

(riastradh)

2022-04-17 13:15:27 UTC MAIN commitmail json YAML

umb(4): Use usb_cdc_descriptor_t for bDescriptorSubtype.

Note: This needs more length validation!  TBD in a separate commit.

(riastradh)

2022-04-17 13:15:15 UTC MAIN commitmail json YAML

umidi(4): Restore umidi_cs_descriptor_t type.

Had been previously deleted under the misapprehension that
usb_descriptor_t is appropriate here, but it's not -- it should not
have the bDescriptorSubtype member.

(riastradh)

2022-04-17 13:15:05 UTC MAIN commitmail json YAML

uvideo(4): Avoid printing off the end of truncated descriptors.

(riastradh)

2022-04-17 09:25:24 UTC MAIN commitmail json YAML

ucom(4): Make sure rndsource is attached before use and detach.

Reported-by: syzbot+04fb6786e0cf873905e8@syzkaller.appspotmail.com

(riastradh)

2022-04-17 09:09:13 UTC MAIN commitmail json YAML

panic(9): Serialize panicstr access and printing `panic:' message.

This isn't riskier than before -- previously we took kprintf_lock
inside each separate printf/vprintf call here.  Now we just take it
once around access to panicstr and printing the message.

With any luck, this should help avoid interleaving panic messages
with each other and with other output -- and maybe cut down on the
number of syzkaller duplicates.

(riastradh)

2022-04-16 11:13:10 UTC MAIN commitmail json YAML

sequencer(4): VOP_CLOSE requires vnode lock.

Reported-by: syzbot+877c50d819fea9403247@syzkaller.appspotmail.com

(riastradh)

2022-04-16 11:13:01 UTC MAIN commitmail json YAML

sequencer(4): Convert some #if DIAGNOSTIC to assertions.

(riastradh)

2022-04-16 11:12:21 UTC MAIN commitmail json YAML

sequencer(4): Sprinkle KNF.

- Sort includes.
- Nix trailing whitespace.
- No parens for return.
- Blank line between declarations and statements.
- Note MP bug with unit allocation.

No functional change intended.

(riastradh)

2022-04-14 19:47:14 UTC MAIN commitmail json YAML

Revert "viornd(4): Process host entropy in softint context."

Apparently this has the effect of sometimes making the network hang
on Google Compute Engine as used by syzbot, which has held up all the
syzkaller testing for weeks now.  Let's revert this for now, and
separately try to figure out what's wrong with it.

(riastradh)

2022-04-10 11:36:32 UTC MAIN commitmail json YAML

2022-04-10 10:38:33 UTC MAIN commitmail json YAML

2022-04-10 09:22:35 UTC MAIN commitmail json YAML

makefs(8): Fix tool build -- no libutil DPADD if HOSTPROG.

(riastradh)

2022-04-09 23:52:23 UTC MAIN commitmail json YAML

unix(4): Convert membar_exit to membar_release.

Use atomic_load_consume or atomic_load_relaxed where necessary.

Comment on why unlocked nonatomic access is valid where it is done.

(riastradh)

2022-04-09 23:52:05 UTC MAIN commitmail json YAML

select(9): Use membar_acquire/release and atomic_store_release.

No store-before-load ordering here -- this was obviously always
intended to be load-before-load/store all along.

(riastradh)

2022-04-09 23:51:57 UTC MAIN commitmail json YAML

thmap(9): Convert membar_exit to membar_release.

(riastradh)

2022-04-09 23:51:48 UTC MAIN commitmail json YAML

pool(9): Convert membar_exit to membar_release.

(riastradh)

2022-04-09 23:51:22 UTC MAIN commitmail json YAML

ipi(9): Convert membar_exit/enter to membar_release/acquire.

No store-before-load ordering needed here, just need to ensure that
in

A, ipi_send, ipi receive, B,

memory operations in A happen-before memory operations in B.

(riastradh)

2022-04-09 23:51:09 UTC MAIN commitmail json YAML

ucas(9): Convert membar_exit to membar_release.

(riastradh)

2022-04-09 23:46:19 UTC MAIN commitmail json YAML

rwlock(9): Convert to membar_acquire/release.

Leave an XXX comment where I suspect there might be a missing membar
-- to be audited when I have more time to think!

(riastradh)

2022-04-09 23:46:10 UTC MAIN commitmail json YAML

mutex(9): Convert to membar_acquire/release.

Except for setting the waiters bit -- not sure if this is actually
required to be store-before-load/store.  Seems unlikely -- surely
we'd have seen some serious bug by now if not, because membar_enter
has failed to guarantee that on x86! -- but I'm leaving it for now
until I have time to think enough about it to be sure one way or
another.

(riastradh)

2022-04-09 23:45:45 UTC MAIN commitmail json YAML

vfs(9): Add XXX comment about unclear membar_enter.

(riastradh)

2022-04-09 23:45:37 UTC MAIN commitmail json YAML

kern: Handle l_mutex with atomic_store_release, atomic_load_consume.

- Where the lock is held and known to be correct, no atomic.
- In loops to acquire the lock, use atomic_load_relaxed before we
  restart with atomic_load_consume.

Nix membar_exit.

(Who knows, using atomic_load_consume here might fix bugs on Alpha!)

(riastradh)

2022-04-09 23:45:23 UTC MAIN commitmail json YAML

rumpkern/sleepq: Convert membar_exit/store to atomic_store_release.

(riastradh)

2022-04-09 23:45:14 UTC MAIN commitmail json YAML

rumpkern/scheduler: Use membar_release.

...but add an XXX comment asking for clarity on what it pairs with.

(riastradh)

2022-04-09 23:45:02 UTC MAIN commitmail json YAML

if_shmem(4): Use membar_acquire/release for lock acquire/release.

(riastradh)

2022-04-09 23:44:54 UTC MAIN commitmail json YAML

ena: Convert not-right membar_enter/exit to membar_acquire/release.

Only used on non-x86 and non-aarch64, which probably means this
branch is never used.  (This should really use bus_space_barrier or
bus_dmamap_sync.)

(riastradh)

2022-04-09 23:44:45 UTC MAIN commitmail json YAML

2022-04-09 23:44:25 UTC MAIN commitmail json YAML

linux/ratelimit: Convert to membar_acquire and atomic_store_release.

Simplify while here: atomic_swap is enough, no need for atomic_cas.
(Maybe drm'll run faster on sparcv8 this way...!)

(riastradh)

2022-04-09 23:43:55 UTC MAIN commitmail json YAML

linux/llist: Use membar_release and membar_datadep_consumer.

No need for membar_acquire here!  Loads are all data-dependent.

(riastradh)

2022-04-09 23:43:39 UTC MAIN commitmail json YAML

linux/kref: Fix memory barriers and use membar_release/acquire.

(riastradh)

2022-04-09 23:43:31 UTC MAIN commitmail json YAML

2022-04-09 23:43:20 UTC MAIN commitmail json YAML

mips: Convert lock.h to membar_release/acquire.

(riastradh)

2022-04-09 23:43:12 UTC MAIN commitmail json YAML

hppa: Convert ipiuncs.c to membar_release/acquire.

(riastradh)

2022-04-09 23:42:56 UTC MAIN commitmail json YAML

alpha: Convert ipifuncs.c to membar_release/acquire.

No semantic change is possible because all of these membars are just
mb on alpha -- change just makes the intent clearer.  (Only
membar_producer is weaker, wmb.)

(riastradh)

2022-04-09 23:39:18 UTC MAIN commitmail json YAML

alpha: Convert cpu_iccb_send from membar_exit to membar_release.

XXX Maybe this should really use alpha_mb, since it's not writing to
normal MI-type memory so technically the membr_* semantics doesn't
apply?

(riastradh)

2022-04-09 23:39:07 UTC MAIN commitmail json YAML

rtld: Convert membar_exit/enter to membar_release/acquire.

These are basic CAS-based locking primitives needing release and
acquire semantics, nothing fancy here -- except the membar_sync parts
which are questionable but not relevant to the present audit.

(riastradh)

2022-04-09 23:38:57 UTC MAIN commitmail json YAML

libc/atomic: Fix membars in __atomic_load/store_* stubs.

- membar_enter/exit ordering was backwards.
- membar_enter doesn't make any sense for load anyway.
- Switch to membar_release for store and membar_acquire for load.

The only sensible orderings for a simple load or store are acquire or
release, respectively, or sequential consistency.  This never
provided correct sequential consistency before -- we should really
make it conditional on memmodel but I don't know offhand what the
values of memmodel might be and this is at least better than before.

(riastradh)

2022-04-09 23:38:33 UTC MAIN commitmail json YAML

2022-04-09 23:36:22 UTC MAIN commitmail json YAML

alpha: Omit needless membar in pmap_reference.

If the pmap is published enough for us to obtain a reference to it
then there's no membar needed.  If it's not then something else is
wrong and we can't use pmap_reference here anyway.  Membars are
needed only on the destruction side to make sure all use, by any
thread, happens-before all freeing in the last user thread.

(riastradh)

2022-04-09 23:35:58 UTC MAIN commitmail json YAML

audio(4): Use membar_acquire, not membar_enter.

Cheaper and adequate to make an atomic_swap into a load-acquire.

(riastradh)

2022-04-09 23:34:50 UTC MAIN commitmail json YAML

mips/rmixl: Insert appropriate membars around IPIs.

(riastradh)

2022-04-09 23:34:40 UTC MAIN commitmail json YAML

mips/cavium: Insert appropriate membars around IPIs.

(riastradh)

2022-04-09 23:34:30 UTC MAIN commitmail json YAML

atomic_loadstore(9): Use membar_acquire/release.

(riastradh)

2022-04-09 23:32:53 UTC MAIN commitmail json YAML

Introduce membar_acquire/release.  Deprecate membar_enter/exit.

The names membar_enter/exit were unclear, and the documentation of
membar_enter has disagreed with the implementations on sparc,
powerpc, and even x86(!) for the entire time it has been in NetBSD.

The terms `acquire' and `release' are ubiquitous in the literature
today, and have been adopted in the C and C++ standards to mean
load-before-load/store and load/store-before-store, respectively,
which are exactly the orderings required by acquiring and releasing a
mutex, as well as other useful applications like decrementing a
reference count and then freeing the underlying object if it went to
zero.

Originally I proposed changing one word in the documentation for
membar_enter to make it load-before-load/store instead of
store-before-load/store, i.e., to make it an acquire barrier.  I
proposed this on the grounds that

(a) all implementations guarantee load-before-load/store,
(b) some implementations fail to guarantee store-before-load/store,
and
(c) all uses in-tree assume load-before-load/store.

I verified parts (a) and (b) (except, for (a), powerpc didn't even
guarantee load-before-load/store -- isync isn't necessarily enough;
need lwsync in general -- but it _almost_ did, and it certainly didn't
guarantee store-before-load/store).

Part (c) might not be correct, however: under the mistaken assumption
that atomic-r/m/w then membar-w/rw is equivalent to atomic-r/m/w then
membar-r/rw, I only audited the cases of membar_enter that _aren't_
immediately after an atomic-r/m/w.  All of those cases assume
load-before-load/store.  But my assumption was wrong -- there are
cases of atomic-r/m/w then membar-w/rw that would be broken by
changing to atomic-r/m/w then membar-r/rw:

https://mail-index.netbsd.org/tech-kern/2022/03/29/msg028044.html

Furthermore, the name membar_enter has been adopted in other places
like OpenBSD where it actually does follow the documentation and
guarantee store-before-load/store, even if that order is not useful.
So the name membar_enter currently lives in a bad place where it
means either of two things -- r/rw or w/rw.

With this change, we deprecate membar_enter/exit, introduce
membar_acquire/release as better names for the useful pair (r/rw and
rw/w), and make sure the implementation of membar_enter guarantees
both what was documented _and_ what was implemented, making it an
alias for membar_sync.

While here, rework all of the membar_* definitions and aliases.  The
new logic follows a rule to make it easier to audit:

membar_X is defined as an alias for membar_Y iff membar_X is
guaranteed by membar_Y.

The `no stronger than' relation is (the transitive closure of):

- membar_consumer (r/r) is guaranteed by membar_acquire (r/rw)
- membar_producer (w/w) is guaranteed by membar_release (rw/w)
- membar_acquire (r/rw) is guaranteed by membar_sync (rw/rw)
- membar_release (rw/w) is guaranteed by membar_sync (rw/rw)

And, for the deprecated membars:

- membar_enter (whether r/rw, w/rw, or rw/rw) is guaranteed by
  membar_sync (rw/rw)
- membar_exit (rw/w) is guaranteed by membar_release (rw/w)

(membar_exit is identical to membar_release, but the name is
deprecated.)

Finally, while here, annotate some of the instructions with their
semantics.  For powerpc, leave an essay with citations on the
unfortunate but -- as far as I can tell -- necessary decision to use
lwsync, not isync, for membar_acquire and membar_consumer.

Also add membar(3) and atomic(3) man page links.

(riastradh)

2022-04-09 22:53:53 UTC MAIN commitmail json YAML

riscv/membar_ops: Upgrade membar_enter from W/RW to RW/RW.

This will be deprecated soon but let's avoid leaving rakes to trip on
with it arising from disagreement over the documentation (W/RW) and
implementation and usage (R/RW).

(riastradh)

2022-04-09 22:53:45 UTC MAIN commitmail json YAML

x86_64/membar_ops: Upgrade membar_enter from R/RW to RW/RW.

This will be deprecated soon but let's avoid leaving rakes to trip on
with it arising from disagreement over the documentation (W/RW) and
implementation and usage (R/RW).

(riastradh)

2022-04-09 22:53:36 UTC MAIN commitmail json YAML

i386/membar_ops: Upgrade membar_enter from R/RW to RW/RW.

This will be deprecated soon but let's avoid leaving rakes to trip on
with it arising from disagreement over the documentation (W/RW) and
implementation and usage (R/RW).

(riastradh)

2022-04-09 22:53:25 UTC MAIN commitmail json YAML

sparc64/membar_ops: Upgrade membar_enter from R/RW to RW/RW.

This will be deprecated soon but let's avoid leaving rakes to trip on
with it arising from disagreement over the documentation (W/RW) and
implementation and usage (R/RW).

(riastradh)

2022-04-09 22:53:17 UTC MAIN commitmail json YAML

sparc/membar_ops: Upgrade membar_enter from R/RW to RW/RW.

This will be deprecated soon but let's avoid leaving rakes to trip on
with it arising from disagreement over the documentation (W/RW) and
implementation and usage (R/RW).

(riastradh)

2022-04-09 19:59:08 UTC MAIN commitmail json YAML

nouveau: Omit needless local patch.

This code probably once called ioread32/iowrite32 or something, but
no longer.

(riastradh)

2022-04-09 14:38:47 UTC MAIN commitmail json YAML

makefs(8): Needs -lm for ceil in udf as a host tool too.

(riastradh)

2022-04-09 14:09:32 UTC MAIN commitmail json YAML

mips/rmi: Hack to get XLSATX64.MP kernel building again.

Using <mips/asm.h> in a .c file is kinda grody but CALLFRAME_SIZ
doesn't seem to be defined anywhere else.  Not sure how this was ever
supposed to work...

(riastradh)

2022-04-09 13:38:15 UTC MAIN commitmail json YAML

sys/lwp.h: Nix trailing whitespace.

(riastradh)

2022-04-09 12:49:37 UTC MAIN commitmail json YAML

ena: Fix rmb/wmb/mb to match Linux on x86 and aarch64.

This is not right but it's not worse than it was before.

(riastradh)

2022-04-09 12:07:37 UTC MAIN commitmail json YAML

aarch64/membar_ops: Fix wrong symbol end.

(riastradh)

2022-04-09 12:07:29 UTC MAIN commitmail json YAML

2022-04-09 12:07:18 UTC MAIN commitmail json YAML

x86: Omit needless store in membar_producer/exit.

On x86, every store is a store-release, so there is no need for any
barrier.  But this wasn't a barrier anyway; it was just a store,
which was redundant with the store of the return address to the stack
implied by CALL even if issuing a store made a difference.

(riastradh)

2022-04-09 12:07:01 UTC MAIN commitmail json YAML

x86: Every load is a load-acquire, so membar_consumer is a noop.

lfence is only needed for MD logic, such as operations on I/O memory
rather than normal cacheable memory, or special instructions like
RDTSC -- never for MI synchronization between threads/CPUs.  No need
for hot-patching to do lfence here.

(The x86_lfence function might reasonably be patched on i386 to do
lfence for MD logic, but it isn't now and this doesn't change that.)

(riastradh)

2022-04-09 12:06:48 UTC MAIN commitmail json YAML

sparc64: Fix membar_sync by issuing membar #StoreLoad.

In TSO this is the only memory barrier ever needed, and somehow we
got this wrong and instead issued an unnecessary membar #LoadLoad --
not needed even in PSO let alone in TSO.

XXX Apparently we may run userland programs with PSO or RMO, in which
case all of these membars need fixing:

                        PSO                    RMO
membar_consumer        nop                    membar #LoadLoad
membar_producer        membar #StoreStore      membar #StoreStore
membar_enter            nop                    membar #LoadLoad|LoadStore
membar_exit            membar #StoreStore      membar #LoadStore|StoreStore
membar_sync            membar #StoreLoad|StoreStore
                                                membar #...everything...

But at least this fixes the TSO case in which we run the kernel.
Also I'm not sure there's any non-TSO hardware out there in practice.

(riastradh)

2022-04-09 12:06:39 UTC MAIN commitmail json YAML

sparc: Fix membar_sync with LDSTUB.

membar_sync is required to be a full sequential consistency barrier,
equivalent to MEMBAR #StoreStore|LoadStore|StoreLoad|LoadLoad on
sparcv9.  LDSTUB and SWAP are the only pre-v9 instructions that do
this and SWAP doesn't exist on all v7 hardware, so use LDSTUB.

Note: I'm having a hard time nailing down a reference for the
ordering implied by LDSTUB and SWAP.  I'm _pretty sure_ SWAP has to
imply store-load ordering since the SPARCv8 manual recommends it for
Dekker's algorithm (which notoriously requires store-load ordering),
and the formal memory model treats LDSTUB and SWAP the same for
ordering.  But the v8 and v9 manuals aren't clear.

GCC issues STBAR and LDSTUB, but (a) I don't see why STBAR is
necessary here, (b) STBAR doesn't exist on v7 so it'd be a pain to
use, and (c) from what I've heard (although again it's hard to nail
down authoritative references here) all actual SPARC hardware is TSO
or SC anyway so STBAR is a noop in all the silicon anyway.

Either way, certainly this is better than what we had before, which
was nothing implying ordering at all, just a store!

(riastradh)

2022-04-09 10:05:35 UTC MAIN commitmail json YAML

2022-04-09 09:59:16 UTC MAIN commitmail json YAML

fsck_udf(8): Nix trailing whitespace.

No functional change intended.

(riastradh)

2022-04-09 09:58:12 UTC MAIN commitmail json YAML

2022-04-08 23:48:05 UTC MAIN commitmail json YAML

fsck_udf(8): Mark vat_length as ignored.

(riastradh)

2022-04-08 23:47:19 UTC MAIN commitmail json YAML

fsck_udf(8): Sprinkle __printflike and omit self-assignment.

(riastradh)

2022-04-08 23:35:52 UTC MAIN commitmail json YAML

membar_ops(3): Add some automatic tests.

These tests run two threads for five seconds each to try to trigger
races in the event of broken memory barriers.  They run only on
machines with at least two CPUs; on uniprocessor systems there's no
point -- the membars can correctly just be (instruction barrier)
no-ops.

(riastradh)

2022-04-08 23:14:21 UTC MAIN commitmail json YAML

rk_v1crypto(4): Fix missing `error =' assignment.

This is not likely to fail, but let's avoid suppressing an unlikely
error.

(riastradh)

2022-04-08 23:14:11 UTC MAIN commitmail json YAML

etc: Sort NetBSD.dist.tests.

(riastradh)

2022-04-07 21:47:02 UTC MAIN commitmail json YAML

ucom(4): Use tty_unit -- save a couple lines of code.

(riastradh)

2022-04-07 21:46:52 UTC MAIN commitmail json YAML

tty(9): New function tty_unit for struct cdevsw::d_devtounit.

(riastradh)

2022-04-07 17:35:31 UTC MAIN commitmail json YAML

ucom(4): Fix unit numbering for devsw/autoconf cross-wiring.

Should introduce a tty_unit function to use here but this'll do for
now to fix the bug I introduced in ucom(4).

(riastradh)

2022-04-06 22:48:22 UTC MAIN commitmail json YAML

sparc64/pmap: Nix trailing whitespace.

(riastradh)

2022-04-06 22:48:09 UTC MAIN commitmail json YAML

mvxpsec(4): Nix trailing whitespace.

(riastradh)

2022-04-06 22:47:58 UTC MAIN commitmail json YAML

Nix trailing whitespace in files of membars, atomics, and lock stubs.

Will be touching many of these files soon for functional changes.

No functional change intended.

(riastradh)

2022-04-02 09:53:20 UTC MAIN commitmail json YAML

cgd(4): Omit technically-correct-but-broken adiantum dependency again.

It is true that cgd_crypto.c depends on sys/crypto/adiantum now, and
transitively on sys/crypto/aes.

However, there's a problem with the cgd module having a formal
(transitive) module dependency on the aes module.

Yesterday I thought the problem with this was that fpu_kern_enter was
artificially restricted while cold -- to detect, and noisily crash
on, reentrance, it raises the IPL to IPL_VM, asserts that the IPL is
not _higher_ (so it can't be re-entered by an IPL_SCHED or IPL_HIGH
interrupt), and asserts that it's not currently in use on the current
CPU.

Early at boot, the IPL is at IPL_HIGH, and no interrupts are possible
anyway, so the assertions tripped for artificial reasons, which I
fixed in:

https://mail-index.netbsd.org/source-changes/2022/04/01/msg137840.html

However, I had forgotten that there's a deeper problem for the cgd
module dependency on aes.  The ordering of events is:

1. Initialize builtin MODULE_CLASS_DRIVER modules -- including cgd.

2. Run configure -- including detecting CPUs, which on aarch64 is
  where the decision of which AES (and ChaCha) implementation to use
  based on supported CPU features.

3. Initialize builtin MODULE_CLASS_MISC modules -- including aes,
  _if_ there are no driver-class modules that depend on it.

There's a tangle of ordering dependencies here:

- MODULE_CLASS_DRIVER modules providing _autoconf_ drivers generally
  have to be initialized _before_ configure, because you need the
  driver to be initialized before configure can attach its devices.

- configure must run _before_ aes is initialized because the decision
  of which AES implementation to choose depends on CPU features
  detected in configure, and the prospect of dynamically changing the
  AES implementation is too painful to contemplate (it may change the
  key schedule, so it would invalidate any existing key schedules
  precomputed by callers like uvm_swap or configured cgd devices,
  which raises a host of painful concurrency issues to invalidate
  these cached key schedules on all CPUs in all subsystems using
  them).

- cgd doesn't figure into the configure stage of autoconf, but it
  nevertheless has to be MODULE_CLASS_DRIVER because specfs autoloads
  MODULE_CLASS_DRIVER modules in case they provide _devsw_ drivers
  (i.e., /dev nodes), as cgd does.  And we don't have a mechanism for
  identifying `autoconf driver modules' separately from `devsw driver
  modules' because some modules provide both and each module can have
  only one class.

For now, this is breaking boot on several tier I architectures so
let's nix the cgd->adiantum->aes module dependency as a stop-gap
measure.

(riastradh)

2022-04-01 19:57:22 UTC MAIN commitmail json YAML

x86, arm: Allow fpu_kern_enter/leave while cold.

Normally these are forbidden above IPL_VM, so that FPU usage doesn't
block IPL_SCHED or IPL_HIGH interrupts.  But while cold, e.g. during
builtin module initialization at boot, all interrupts are blocked
anyway so it's a moot point.

Also initialize x86 cpu_info_primary.ci_kfpu_spl to -1 so we don't
trip over an assertion about it while cold -- the assertion is meant
to detect reentrance into fpu_kern_enter/leave, which is prohibited.

Also initialize cpu0's ci_kfpu_spl.

(riastradh)

2022-04-01 00:21:19 UTC MAIN commitmail json YAML

cgd(4): Remove recently added dependency on adiantum.

While this dependency is technically correct, it triggers a problem
with module initialization for builtin modules that manifests as an
instant crash on boot for x86 (and likely arm) kernels, and that
problem is not trivial to solve immediately.  See the top of
sys/crypto/aes/aes_impl.c for a summary of the problem and why it's
tricky.

So as a stop-gap measure, we'll remove this dependency for now; we
can reinstate it later once the underlying problem with module
initialization order is resolved.

Reported-by: syzbot+e9b3550af985b6557414@syzkaller.appspotmail.com
(actually first reported, to my knowledge, by pgoyette@, but this
line tells syzkaller that we fixed the problem)

(riastradh)

2022-04-01 00:16:40 UTC MAIN commitmail json YAML

thmap(9): Handle memory allocation failure in root_try_put.

Reported-by: syzbot+8ded6e17a394e39d6291@syzkaller.appspotmail.com

(riastradh)

2022-03-30 17:02:02 UTC MAIN commitmail json YAML

kern: Assert softint does not net acquire kernel locks.

This redoes previous change where I mistakenly used the CPU's biglock
count, which is not necessarily stable -- the softint lwp may sleep
on a mutex, and the lwp it interrupted may start up again and release
the kernel lock, so by the time the softint lwp wakes up again and
the softint function returns, the CPU may not be holding any kernel
locks.  But the softint lwp should never hold kernel locks except
when it's in a (usually, non-MPSAFE) softint function.

Same with callout.

(riastradh)

2022-03-30 14:54:29 UTC MAIN commitmail json YAML

Revert "kern: Sprinkle biglock-slippage assertions."

Got the diagnostic information I needed from this, and it's holding
up releng tests of everything else, so let's back this out until I
need more diagnostics or track down the original source of the
problem.

(riastradh)

2022-03-30 10:34:14 UTC MAIN commitmail json YAML

kern: Sprinkle biglock-slippage assertions.

We seem to have a poltergeist that occasionally messes with the
biglock depth, but it's very hard to reproduce and only manifests as
some other CPU spinning out on the kernel lock which is no good for
diagnostics.

(riastradh)

2022-03-29 09:19:56 UTC MAIN commitmail json YAML

sequencer(4): Don't use mutex_spin_exit on an IPL_NONE lock.

(riastradh)

2022-03-29 09:16:24 UTC MAIN commitmail json YAML

cs4281(4): Fix lock ordering in suspend.

No idea if this code works -- obviously this path has never been
tested in the >decade it's been here!

(riastradh)

2022-03-29 09:08:44 UTC MAIN commitmail json YAML

emdtv(4): Fix issues in detach.

- Use config_detach_children, and do it up front, and handle failure
  (not relevant for yanking usb but relevant for drvctl which doesn't
  pass DETACH_FORCE).

- Fix teardown order: stop interrupts so we stop issuing new work,
  and _then_ wait for pending work to drain and destroy the
  workqueue.

- Omit needless empty  mutex_enter(lock); mutex_exit(lock)  dance
  which probably only appeared necessary because of the wrong
  teardown order.

(riastradh)

2022-03-29 06:59:19 UTC MAIN commitmail json YAML

uhid(4): Make sure error is initialized in uhidkqfilter.

(riastradh)

2022-03-29 06:56:51 UTC MAIN commitmail json YAML

x68k/ite(4): Include <sys/device_impl.h> to abuse autoconf internals.

(riastradh)

2022-03-28 19:59:36 UTC MAIN commitmail json YAML

arm/cortex: Use container_of, not bespoke offsetof arithmetic.

(riastradh)

2022-03-28 19:59:26 UTC MAIN commitmail json YAML

arm/apple: Use container_of, not bespoke offsetof arithmetic.

Better type-safety this way.

(riastradh)

2022-03-28 12:48:35 UTC MAIN commitmail json YAML

sys/dev/ccd.c: Restore historic RCS id.

This got munged accidentally by `git cvsexportcommit -k' -- taking
that option out of my commitbomb script!

(riastradh)

2022-03-28 12:45:04 UTC MAIN commitmail json YAML

uatp(4): Use usbd_get/set_report for Geyser 3/4 reset.

(riastradh)

2022-03-28 12:44:55 UTC MAIN commitmail json YAML

uatp(4): Fix detach logic.

Let wsmouse child decide whether it's in use or close if mandatory.
If config_detach_children succeeds, this must no longer be open and
we can commit to freeing everything.

(riastradh)

2022-03-28 12:44:45 UTC MAIN commitmail json YAML

uhidev(9): Assert uhidev is open when writing.

(Maybe we could have uhidevs that are output-only, in which case a
driver could, in principle, want to issue writes without getting any
input report interrupts.  But we can cross that bridge when we come
to it.)

(riastradh)

2022-03-28 12:44:37 UTC MAIN commitmail json YAML

uhidev(9): Define UHIDEV_MAXREPID = 255.

Report ids are limited by the HID spec to a single byte.

- Clamp max report id in report descriptor to this.

- Prune dead refcnt-overflow error branches.  Assert instead.

(riastradh)

2022-03-28 12:44:28 UTC MAIN commitmail json YAML

uhidev(9): Omit needless sc_dying.

(riastradh)

2022-03-28 12:44:17 UTC MAIN commitmail json YAML

uhidev(9): Make uhidev state opaque.

This makes the API simpler and clearer and gives us more latitude to
fix bugs in the state management without breaking the ABI.

XXX kernel ABI change to signature of uhidev_get_report_desc and
uhidev_open, and to struct uhidev_attach_arg, requires bump for
uhidev driver modules

(riastradh)

2022-03-28 12:44:06 UTC MAIN commitmail json YAML

uhidev(9): Fix race between uhidev_close and uhidev_intr.

uhidev_intr currently relies on the kernel lock to serialize access
to struct uhidev::sc_state, but if the driver's sc_intr function
sleeps (which it may do, even though this runs is softint context --
it may sleep on an adaptive lock), uhidev_close might return while
the driver's sc_intr function is still in flight.

This avoids that race, and makes progress toward MP-safe uhidev.

(riastradh)

2022-03-28 12:43:58 UTC MAIN commitmail json YAML

uhidev(9): Refactor error branch to use one label.

No functional change intended.

(riastradh)

2022-03-28 12:43:48 UTC MAIN commitmail json YAML

uhidev(9): Make some private functions static and fix comment.

No functional change.

(riastradh)

2022-03-28 12:43:39 UTC MAIN commitmail json YAML

uhidev(9): Make uhidev_stop work reliably.

(riastradh)

2022-03-28 12:43:30 UTC MAIN commitmail json YAML

ucycom(4): Defer uhidev_write_async to taskq.

Can't submit USB transfers while holding tty_lock, a spin lock.

(riastradh)

2022-03-28 12:43:22 UTC MAIN commitmail json YAML

uhidev(9): Move struct uhidev_softc into uhidev.c.

No longer part of any ABI for uhidev modules.

(riastradh)

2022-03-28 12:43:12 UTC MAIN commitmail json YAML

uhidev(9): Get the device and interface through attach args.

This way uhidev drivers don't need access to uhidev_softc itself for
it.

(riastradh)

2022-03-28 12:43:03 UTC MAIN commitmail json YAML

uhidev(9): New uhidev_write_async.

Like uhidev_write but issues the transfer asynchronously with a
callback.

Use it in ucycom(4).

Also, clear endpoint stalls asynchronously -- can't do them
synchronously in xfer callbacks which run at softint and therefore
can't wait in cv_wait as usbd_do_request does.

(riastradh)

2022-03-28 12:42:54 UTC MAIN commitmail json YAML

uhidev(9): Partially fix uhidev_write aborting.

In my previous change, I intended to make uhidev_stop abort any
pending write -- but I forgot to initialize sc->sc_writereportid, so
it never did anything.

This changes the API and ABI of uhidev_write so it takes the struct
uhidev pointer, rather than the struct uhidev_softc pointer; this way
uhidev_write knows what the report id of the client is, so it can
arrange to have uhidev_stop abort only this one.

XXX Except it still doesn't actually work because we do this
unlocked, ugh, so the write might complete before we abort anything.
To be fixed some more in a later change.

XXX kernel ABI change to uhidev_write signature, used by uhidev
driver modules, requires bump

(riastradh)

2022-03-28 12:42:45 UTC MAIN commitmail json YAML

uhid(4): Use d_cfdriver/devtounit/cancel to avoid open/detach races.

- Split uhidclose into separate uhidcancel and uhidclose parts.
  uhidcancel interrupts pending I/O operations (open, read, write,
  ioctl, &c.); uhidclose doesn't run until all I/O operations are
  done.

- Handle case where, owing to revoke(2), uhidcancel/uhidclose run
  concurrently with a uhidopen that hasn't yet noticed that there
  isn't actually a device.

- Handle case where, owing to revoke(2), uhidread might be cancelled
  by mere revoke, not by detach, so it has to wake up when the device
  is closing, not (just) when dying (but dying will lead to closing
  so no need to check for dying).

- Omit needless reference-counting goo.  vdevgone takes care of this
  for us by cancelling all I/O operations with uhidcancel, waiting
  for I/O operations to drain, closing the device, and waiting until
  it is closed if that is already happening concurrently.

- Name the closed/changing/open states rather than using 0/1/2.

- Omit needless sc_dying.

(riastradh)

2022-03-28 12:42:37 UTC MAIN commitmail json YAML

ucom(4): Rework open/close/attach/detach logic.

- Defer sleep after hangup until open.

  No need to make close hang; we just need to make sure some time has
  passed before we next try to open.

  This changes the wchan for the sleep.  Oh well.

- Use .d_cfdriver/devtounit/cancel to resolve races between attach,
  detach, open, close, and revoke.

- Use a separate .sc_closing flag instead of a UCOM_CLOSING state.
  ucomcancel/ucomclose owns this flag, and it may be set in any state
  (except UCOM_DEAD).  UCOM_OPENING remains owned by ucomopen, which
  might be interrupted by cancel/close.

- Rework error branches in ucomopen.  Much simpler this way.

- Nix unnecessary reference counting.

(riastradh)

2022-03-28 12:41:17 UTC MAIN commitmail json YAML

subr_devsw.c: KNF and style nits.

No functional change intended.

(riastradh)

2022-03-28 12:39:57 UTC MAIN commitmail json YAML

audio(4): Use d_cfdriver/devtounit to avoid open/detach races.

(riastradh)

2022-03-28 12:39:47 UTC MAIN commitmail json YAML

sd(4): Use d_cfdriver/devtounit to avoid open/detach races.

(riastradh)

2022-03-28 12:39:37 UTC MAIN commitmail json YAML

wd(4): Use d_cfdriver/devtounit to avoid open/detach races.

(riastradh)

2022-03-28 12:39:29 UTC MAIN commitmail json YAML

tty(9): New ttycancel function.

This causes any current and future ttyopens to fail until ttyclose.

This is necessary for revoke to work reliably for device detach like
ucom(4) removable USB devices.  A tty driver for a removable device
needs some way to interrupt a pending .d_open so it returns promptly.
But ttyclose only interrupts ttyopen if it's already sleeping; it
won't cause a concurrent .d_open call which _will call_ but _hasn't
yet called_ ttyopen to avoid sleeping.  Using ttycancel in the tty
driver's .d_cancel makes this work.

(riastradh)

2022-03-28 12:39:18 UTC MAIN commitmail json YAML

driver(9): New types dev_*_t for device driver devsw operations.

These will serve to replace the archaic and kludgey dev_type_* macros
which should've been typedefs all along.

(riastradh)

2022-03-28 12:39:10 UTC MAIN commitmail json YAML

driver(9): New devsw d_cancel op to interrupt I/O before close.

If specified, when revoking a device node or closing its last open
node, specfs will:

1. Call d_cancel, which should return promptly without blocking.
2. Wait for all concurrent d_read/write/ioctl/&c. to drain.
3. Call d_close.

Otherwise, specfs will:

1. Call d_close.
2. Wait for all concurrent d_read/write/ioctl/&c. to drain.

This fallback is problematic because often parts of d_close rely on
concurrent devsw operations to have completed already, so it is up to
each driver to have its own mechanism for waiting, and the extra step
in (2) is almost redundant.  But it is still important to ensure that
devsw operations are not active by the time a module tries to invoke
devsw_detach, because only d_open is protected against that.

The signature of d_cancel matches d_close, mostly so we don't raise
questions about `why is this different?'; the lwp argument is not
useful but we should remove it from open/cancel/close all at the same
time.

The only way d_cancel should fail, if it does at all, is with ENODEV,
meaning the driver doesn't support cancelling outstanding I/O, and
will take responsibility for that in d_close.  I would make it return
void and only have bdev_cancel and cdev_cancel possibly return ENODEV
so specfs can detect whether a driver supports it, but this would
break the pattern around devsw operation types.

Drivers are allowed to omit it from struct bdevsw, struct cdevsw --
if so, it is as if they used a function that just returns ENODEV.

XXX kernel ABI change to struct bdevsw/cdevsw requires bump

(riastradh)

2022-03-28 12:38:59 UTC MAIN commitmail json YAML

2022-03-28 12:38:34 UTC MAIN commitmail json YAML

driver(9): Make vdevgone call config_detach_commit if appropriate.

Make sure to do this before spec_node_lookup_by_dev -- that might wait
for a concurrent revoke to complete, which in turn might wait for a
concurrent open to complete, which in turn might be waiting for the
device to commit to detaching.

(riastradh)

2022-03-28 12:38:25 UTC MAIN commitmail json YAML

autoconf(9): Disentangle slightly circuitous config_detach logic.

No functional change intended.

(riastradh)

2022-03-28 12:38:15 UTC MAIN commitmail json YAML

autoconf(9): New function config_detach_commit.

When a driver's .ca_detach function has committed to detaching -- it
definitely won't back out with EBUSY, for instance -- it can call
this to wake all pending calls to device_lookup_acquire and make them
fail immediately.

This is necessary to break a deadlock if the device_lookup_acquire
calls happen inside I/O operations which the driver's .ca_detach
function waits for the completion of -- without config_detach_commit,
I/O operations would be stuck in device_lookup_acquire waiting for
.ca_detach and .ca_detach would be stuck waiting for I/O operations
to return.

Most drivers won't need to call this: for autoconf drivers used the
traditional way by devsw for userland device nodes, the .ca_detach
routine uses vdevgone, and we will arrange to make vdevgone call
config_detach_commit automagically in such drivers anyway.

XXX kernel ABI change to struct device requires bump -- later change
will make struct device opaque to ABI, but we're not there yet

(riastradh)

2022-03-28 12:38:04 UTC MAIN commitmail json YAML

specfs: Reorder struct specnode members to save padding.

Shrinks from 40 bytes to 32 bytes on LP64 systems this way.

(riastradh)

2022-03-28 12:37:56 UTC MAIN commitmail json YAML

specfs: Remove specnode from hash table in spec_node_revoke.

Previously, it was possible for spec_node_lookup_by_dev to handle a
speconde that a concurrent spec_node_destroy is about to remove from
the hash table and then free, as soon as spec_node_lookup_by_dev
releases device_lock.

Now, the ordering is:

1. Remove specnode from hash table in spec_node_revoke.  At this
  point, no _new_ vnode references are possible (other than possibly
  one acquired by vcache_vget under v_interlock), but there may be
  existing ones.

2. Mark vnode reclaimed so vcache_vget will fail.

3. The last vrele (or equivalent logic in vcache_vget) will then free
  the specnode in spec_node_destroy.

This way, _if_ a thread in spec_node_lookup_by_dev finds a specnode
in the hash table under device_lock/v_interlock, _then_ it will not
be freed until the thread completes vcache_vget.

This change requires calling spec_node_revoke unconditionally for
device special nodes, not just for active ones.  Might introduce
slightly more contention on device_lock but not much because we
already have to take it in this path anyway a little later in
spec_node_destroy.

(riastradh)

2022-03-28 12:37:46 UTC MAIN commitmail json YAML

specfs: Let spec_node_lookup_by_dev wait for reclaim to finish.

vdevgone relies on this to ensure that if there is a concurrent
revoke in progress, it will wait for that revoke to finish -- that
way, it can guarantee all I/O operations have completed and the
device is closed.

(riastradh)

2022-03-28 12:37:35 UTC MAIN commitmail json YAML

specfs: Assert opencnt is nonzero before decrementing.

(riastradh)

2022-03-28 12:37:27 UTC MAIN commitmail json YAML

specfs: Take an I/O reference across bdev/cdev_open.

- Revoke is used to invalidate all prior access control checks when
  device permissions are changing, so it must wait for .d_open to exit
  so any new access must go through new access control checks.

- Revoke is used by vdevgone in xyz_detach to wait until all use of
  the driver's data structures have completed before xyz_detach frees
  them.

So we need to make sure spec_close waits for .d_open too.

(riastradh)

2022-03-28 12:37:18 UTC MAIN commitmail json YAML

specfs: Wait for last close in spec_node_revoke.

Otherwise, revoke -- and vdevgone, in the detach path of removable
devices -- may complete while I/O operations are still running
concurrently.

(riastradh)

2022-03-28 12:37:09 UTC MAIN commitmail json YAML

specfs: Prevent new opens while close is waiting to drain.

Otherwise, bdev/cdev_close could have cancelled all _existing_ opens,
and waited for them to complete (and freed resources used by them) --
but a new one could start, and hang (e.g., a tty), at the same time
spec_close tries to drain all pending I/O operations, one of which
(the new open) is now hanging indefinitely.

Preventing the new open from even starting until bdev/cdev_close is
finished and all I/O operations have drained avoids this deadlock.

(riastradh)

2022-03-28 12:37:01 UTC MAIN commitmail json YAML

specfs: Take an I/O reference in spec_node_setmountedfs.

This is not quite correct.  We _should_ require the caller to hold a
vnode lock around spec_node_getmountedfs, and an exclusive vnode lock
around spec_node_setmountedfs, so that it is only necessary to check
whether revoke has already happened, not hold an I/O reference.

Unfortunately, various callers in various file systems don't follow
this sensible rule.  So let's at least make sure the vnode can't be
revoked in spec_node_setmountedfs, while we're in bdev_ioctl, and
leave a comment explaining what the sorry state of affairs is and how
to fix it later.

(riastradh)

2022-03-28 12:36:51 UTC MAIN commitmail json YAML

specfs: Drain all I/O operations after last .d_close call.

New kind of I/O reference on specdevs, sd_iocnt.  This could be done
with psref instead; I chose a reference count instead for now because
we already have to take a per-object lock anyway, v_interlock, for
vdead_check, so another atomic is not likely to hurt much more.  We
can always change the mechanism inside spec_io_enter/exit/drain later
on.

Make sure every access to vp->v_rdev or vp->v_specnode and every call
to a devsw operation is protected either:

- by the vnode lock (with vdead_check if we unlocked/relocked),
- by positive sd_opencnt,
- by spec_io_enter/exit, or
- by sd_opencnt management in open/close.

(riastradh)

2022-03-28 12:36:42 UTC MAIN commitmail json YAML

specfs: Resolve a race between close and a failing reopen.

(riastradh)

2022-03-28 12:36:34 UTC MAIN commitmail json YAML

specfs: Document sn_opencnt, sd_opencnt, sd_refcnt.

(riastradh)

2022-03-28 12:36:27 UTC MAIN commitmail json YAML

specfs: Paranoia: Assert opencnt is zero on reclaim.

(riastradh)

2022-03-28 12:36:18 UTC MAIN commitmail json YAML

specfs: Omit needless vdead_check in spec_fdiscard.

The vnode lock is held, so the vnode cannot be revoked without also
changing v_op so subsequent uses under the vnode lock will go to
deadfs's VOP_FDISCARD instead (which is genfs_eopnotsupp).

(riastradh)

2022-03-28 12:36:09 UTC MAIN commitmail json YAML

specfs: Add a comment and assertion to spec_close about refcnts.

(riastradh)

2022-03-28 12:36:01 UTC MAIN commitmail json YAML

specfs: If sd_opencnt is zero, sn_opencnt had better be zero.

(riastradh)

2022-03-28 12:35:52 UTC MAIN commitmail json YAML

specfs: Factor KASSERT out of switch in spec_open.

No functional change.

(riastradh)

2022-03-28 12:35:44 UTC MAIN commitmail json YAML

specfs: sn_gone cannot be set while we hold the vnode lock.

Revoke runs with the vnode lock too, which is exclusive.  Add an
assertion to this effect in spec_node_revoke to make it clear.

(riastradh)

2022-03-28 12:35:35 UTC MAIN commitmail json YAML

specfs: Reorganize D_DISK tail of spec_open and explain what's up.

No functional change intended.

(riastradh)

2022-03-28 12:35:26 UTC MAIN commitmail json YAML

specfs: Factor VOP_UNLOCK/vn_lock out of switch for clarity.

No functional change.

(riastradh)

2022-03-28 12:35:17 UTC MAIN commitmail json YAML

specfs: Factor common device_lock out of switch for clarity.

No functional change.

(riastradh)

2022-03-28 12:35:08 UTC MAIN commitmail json YAML

specfs: Delete bogus comment about .d_open/.d_close at same time.

Annoying as it is that .d_open and .d_close can run at the same time,
it is also necessary for tty semantics, where open can block
indefinitely, and it is the responsibility of close (called via
revoke) necessary to interrupt it.

(riastradh)

2022-03-28 12:34:59 UTC MAIN commitmail json YAML

specfs: Split spec_open switch into three sections.

The sections are now:

1. Acquire open reference.

1a (intermezzo). Set VV_ISTTY.

2. Drop the vnode lock to call .d_open and autoload modules if
  necessary.

3. Handle concurrent revoke if it happenend, or release open reference
  if .d_open failed.

No functional change.  Sprinkle comments about problems.

(riastradh)

2022-03-28 12:34:51 UTC MAIN commitmail json YAML

specfs: Factor common kauth check out of switch in spec_open.

No functional change.

(riastradh)

2022-03-28 12:34:42 UTC MAIN commitmail json YAML

specfs: Assert v_type is VBLK or VCHR in spec_open.

Nothing else makes sense.  Prune dead branches (and replace default
case by panic).

(riastradh)

2022-03-28 12:34:34 UTC MAIN commitmail json YAML

specfs: Call bdev_open without the vnode lock.

There is no need for it to serialize opens, because they are already
serialized by sd_opencnt which for block devices is always either 0
or 1.

There's not obviously any other reason why the vnode lock should be
held across bdev_open, other than that it might be nice to avoid
dropping it if not necessary.  For character devices we always have
to drop the vnode lock because open might hang indefinitely, when
opening a tty, which is not allowed while holding the vnode lock.

(riastradh)

2022-03-28 12:34:26 UTC MAIN commitmail json YAML

specfs: Note lock order for vnode lock, device_lock, v_interlock.

(riastradh)

2022-03-28 12:34:17 UTC MAIN commitmail json YAML

driver(9): Eliminate D_MCLOSE.

D_MCLOSE was introduced a few years ago by mistake for audio(4),
which should have used -- and now does use -- fd_clone to create
per-open state.  The semantics was originally to call close once
every time the device node is closed, not only for the last close.
Nothing uses it any more, and it complicates reasoning about the
system, so let's simplify it away.

(riastradh)

2022-03-28 12:34:08 UTC MAIN commitmail json YAML

driver(9): New function dev_minor_unit.

(riastradh)

2022-03-28 12:33:59 UTC MAIN commitmail json YAML

disk(9): New function disklabel_dev_unit.

Maps a dev_t like wd3e to an autoconf instance number like 3, with no
partition.  Same as DISKUNIT macro, but is a symbol whose pointer can
be taken.  Meant for use with struct bdevsw, cdevsw::d_devtounit.

(riastradh)

2022-03-28 12:33:50 UTC MAIN commitmail json YAML

driver(9): New devsw members d_cfdriver, d_devtounit.

If set, then bdev_open/cdev_open will use d_devtounit to map the
dev_t to an autoconf instance (e.g., /dev/wd0a -> wd0) and hold a
reference with device_lookup_acquire across the call to d_open.

This guarantees that the autoconf instance cannot be detached while
the devsw's d_open function is trying to open it (and also that the
autoconf instance has finished *_attach before anyone can open it).

Of course, if the underlying hardware has gone away, there will be
I/O errors, but this avoids software synchronization bugs between
open and detach for drivers that opt into it.  It's up to the driver
and bus to figure out how to deal with I/O errors from operations on
hardware that has gone away while the software hasn't finished
notifying everything that it's gone yet.

XXX kernel ABI change to struct bdevsw/cdevsw requires bump

(riastradh)

2022-03-28 12:33:41 UTC MAIN commitmail json YAML

autoconf(9): New localcount-based device instance references.

device_lookup_acquire looks up an autoconf device instance, if found,
and acquires a reference the caller must release with device_release.
If attach or detach is still in progress, device_lookup_acquire waits
until it completes.  While references are held, the device's softc
will not be freed or reused until the last reference is released.

The reference is meant to be held while opening a device in the short
term, and then to be passed off to a longer-term reference that can
be broken explicitly by detach -- usually a device special vnode,
which is broken by vdevgone in the driver's *_detach function.

Sleeping while holding a reference is allowed, e.g. waiting to open a
tty.  A driver must arrange that its *_detach function will interrupt
any threads sleeping while holding references and cause them to back
out so that detach can complete promptly.

Subsequent changes to subr_devsw.c will make bdev_open and cdev_open
automatically take a reference to an autoconf instance for drivers
that opt into this, so there will be no logic changes needed in most
drivers other than to connect the autoconf cfdriver to the
bdevsw/cdevsw I/O operation tables.  The effect will be that *_detach
may run while d_open is in progress, but no new d_open can begin
until *_detach has backed out from or committed to detaching.

XXX kernel ABI change to struct device requires bump -- later change
will make struct device opaque to ABI, but we're not there yet

(riastradh)

2022-03-28 12:33:32 UTC MAIN commitmail json YAML

driver(9): Fix synchronization of devsw_attach/lookup/detach.

(`dev' means either `bdev' or `cdev' for brevity here, e.g. in
`devsw_lookup' (bdevsw_lookup, cdevsw_lookup), `dev_open' (bdev_open,
cdev_open), `maxdevsws', &c., except for `devsw_attach' and
`devsw_detach' which are taken literally.)

- Use atomic_store_release and atomic_load_consume for devsw and
  tables and their entries, which are read unlocked and thus require
  memory barriers to ensure ordering between initialization in
  devsw_attach and use in dev_lookup.

- Use pserialize(9) and localcount(9) to synchronize dev_open and
  devsw_detach.

  => Driver must ensure d_open fails and all open instances have been
    closed by the time it calls devsw_detach.

  => Bonus: dev_open is no longer globally serialized through
    device_lock.

- Use atomic_store_release and atomic_load_acquire for max_devsws,
  which is used in conditionals in the new devsw_lookup_acquire.

  => It is safe to use atomic_load_relaxed in devsw_lookup because
    the caller must guarantee the entry is stable, so any increase
    of max_devsws must have already happened.

  => devsw_lookup and devsw_lookup_acquire assume that max_devsws
    never goes down.  If you change this you must find some way to
    adapt the users, preferably without adding much overhead so that
    devsw operations are cheap.

This change introduces an auxiliary table devswref mapping device
majors to localcounts of opens in progress.  The auxiliary table only
occupies one pointer's worth of memory in a monolithic kernel, and is
allocated on the fly for dynamically loaded modules.  We could ask
the module itself to reserve storage for it, but I don't see much
value in that, and it would require some changes to the ABI and to
config(8).

- Omit needless boolean indirection.

(riastradh)

2022-03-28 12:33:22 UTC MAIN commitmail json YAML

driver(9): devsw_detach never fails.  Make it return void.

Prune a whole lotta dead branches as a result of this.  (Some logic
calling this is also wrong for other reasons; devsw_detach is final
-- you should never have any reason to decide to roll it back.  To be
cleaned up in subsequent commits...)

XXX kernel ABI change to devsw_detach signature requires bump

(riastradh)

2022-03-26 19:38:00 UTC MAIN commitmail json YAML

mips/cavium: Simplify membars around interrupt establishment.

Previously I used xc_barrier to ensure the initialization of the
struct octeon_intrhand was witnessed on all CPUs before publishing
it, in order to avoid needing any barrier on the usage side to be
issued by the interrupt handler.

But there's no need to avoid atomic_load_consume at time of
interrupt: on MIPS it's the same as atomic_load_relaxed anyway, so
there's no additional memory barrier cost here.

(riastradh)

2022-03-26 19:35:57 UTC MAIN commitmail json YAML

igpio(4): Use device_xname, not struct device members.

(riastradh)

2022-03-26 19:35:35 UTC MAIN commitmail json YAML

igpio(4): Nix trailing whitespace.

(setq show-trailing-whitespace t), M-x delete-trailing-whitespace

(riastradh)

2022-03-24 12:59:56 UTC MAIN commitmail json YAML

vfs(9): Add missing vnode lock around VOP_CLOSE in vfs_mountroot.

Maybe vnode_if.c should be taught to KASSERT the vnode lock now that
locks always work.

(riastradh)

2022-03-24 12:58:56 UTC MAIN commitmail json YAML

entropy(9): Call entropy_softintr while bound to CPU.

It looks like We tripped on the new assertion in entropy_account_cpu
when there was pending entropy on cpu0 running lwp0 when xc_broadcast
ran -- since xc_broadcast calls the function directly rather than
calling it through softint_schedule, it's not called via the softint
lwp which would satisfy the assertion.

(riastradh)