Thu Sep 3 00:00:07 2020 UTC
Update membar_ops(3) man page with examples and relation to C11.

Add exhortation to always always always document how membars come in
pairs for synchronization between two CPUs when you use them.


(riastradh)
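
As an illustration of the pairing comments the log message asks for, here is a minimal userland sketch of a try-lock in portable C11. It is not NetBSD code. The names obj_tryenter and obj_exit are hypothetical, and the C11 acquire and release fences only stand in for membar_enter and membar_exit; they are assumed to be close enough for this pattern, not exact equivalents.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct obj {
	atomic_uint lock;	/* 0 = free, 1 = held */
	int mumblefrotz;	/* protected by lock */
};

/* Hypothetical helper: try to take the lock, return true on success. */
static bool
obj_tryenter(struct obj *o)
{
	unsigned expected = 0;

	if (!atomic_compare_exchange_strong_explicit(&o->lock, &expected, 1,
	    memory_order_relaxed, memory_order_relaxed))
		return false;
	/*
	 * Acquire fence after the read/modify/write (stand-in for
	 * membar_enter).  Pairs with the release fence in obj_exit().
	 */
	atomic_thread_fence(memory_order_acquire);
	return true;
}

/* Hypothetical helper: release a lock taken with obj_tryenter(). */
static void
obj_exit(struct obj *o)
{
	/*
	 * Release fence before the store (stand-in for membar_exit).
	 * Pairs with the acquire fence in obj_tryenter().
	 */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&o->lock, 0, memory_order_relaxed);
}
```

Each fence carries a comment naming its partner, so a reader who finds one half of the pair can locate the other.
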
diff -r1.5 -r1.6 src/lib/libc/atomic/membar_ops.3

cvs diff -r1.5 -r1.6 src/lib/libc/atomic/membar_ops.3

--- src/lib/libc/atomic/membar_ops.3 2017/10/24 18:19:17 1.5
+++ src/lib/libc/atomic/membar_ops.3 2020/09/03 00:00:06 1.6
@@ -1,14 +1,14 @@ @@ -1,14 +1,14 @@
1.\" $NetBSD: membar_ops.3,v 1.5 2017/10/24 18:19:17 abhinav Exp $ 1.\" $NetBSD: membar_ops.3,v 1.6 2020/09/03 00:00:06 riastradh Exp $
2.\" 2.\"
3.\" Copyright (c) 2007, 2008 The NetBSD Foundation, Inc. 3.\" Copyright (c) 2007, 2008 The NetBSD Foundation, Inc.
4.\" All rights reserved. 4.\" All rights reserved.
5.\" 5.\"
6.\" This code is derived from software contributed to The NetBSD Foundation 6.\" This code is derived from software contributed to The NetBSD Foundation
7.\" by Jason R. Thorpe. 7.\" by Jason R. Thorpe.
8.\" 8.\"
9.\" Redistribution and use in source and binary forms, with or without 9.\" Redistribution and use in source and binary forms, with or without
10.\" modification, are permitted provided that the following conditions 10.\" modification, are permitted provided that the following conditions
11.\" are met: 11.\" are met:
12.\" 1. Redistributions of source code must retain the above copyright 12.\" 1. Redistributions of source code must retain the above copyright
13.\" notice, this list of conditions and the following disclaimer. 13.\" notice, this list of conditions and the following disclaimer.
14.\" 2. Redistributions in binary form must reproduce the above copyright 14.\" 2. Redistributions in binary form must reproduce the above copyright
@@ -17,121 +17,342 @@ @@ -17,121 +17,342 @@
17.\" 17.\"
18.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS 18.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
19.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED 19.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
20.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 20.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
21.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS 21.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
22.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 22.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
23.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 23.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
24.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 24.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
25.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 25.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
26.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 26.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
27.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 27.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
28.\" POSSIBILITY OF SUCH DAMAGE. 28.\" POSSIBILITY OF SUCH DAMAGE.
29.\" 29.\"
30.Dd November 20, 2014 30.Dd September 2, 2020
31.Dt MEMBAR_OPS 3 31.Dt MEMBAR_OPS 3
32.Os 32.Os
33.Sh NAME 33.Sh NAME
34.Nm membar_ops , 34.Nm membar_ops ,
35.Nm membar_enter , 35.Nm membar_enter ,
36.Nm membar_exit , 36.Nm membar_exit ,
37.Nm membar_producer , 37.Nm membar_producer ,
38.Nm membar_consumer , 38.Nm membar_consumer ,
39.Nm membar_datadep_consumer , 39.Nm membar_datadep_consumer ,
40.Nm membar_sync 40.Nm membar_sync
41.Nd memory access barrier operations 41.Nd memory ordering barriers
42.\" .Sh LIBRARY 42.\" .Sh LIBRARY
43.\" .Lb libc 43.\" .Lb libc
44.Sh SYNOPSIS 44.Sh SYNOPSIS
45.In sys/atomic.h 45.In sys/atomic.h
46.\" 46.\"
47.Ft void 47.Ft void
48.Fn membar_enter "void" 48.Fn membar_enter "void"
49.Ft void 49.Ft void
50.Fn membar_exit "void" 50.Fn membar_exit "void"
51.Ft void 51.Ft void
52.Fn membar_producer "void" 52.Fn membar_producer "void"
53.Ft void 53.Ft void
54.Fn membar_consumer "void" 54.Fn membar_consumer "void"
55.Ft void 55.Ft void
56.Fn membar_datadep_consumer "void" 56.Fn membar_datadep_consumer "void"
57.Ft void 57.Ft void
58.Fn membar_sync "void" 58.Fn membar_sync "void"
59.Sh DESCRIPTION 59.Sh DESCRIPTION
60The 60The
61.Nm membar_ops 61.Nm
62family of functions provide memory access barrier operations necessary 62family of functions prevent reordering of memory operations, as needed
63for synchronization in multiprocessor execution environments that have 63for synchronization in multiprocessor execution environments that have
64relaxed load and store order. 64relaxed load and store order.
65.Bl -tag -width "mem" 65.Pp
 66In general, memory barriers must come in pairs \(em a barrier on one
 67CPU, such as
 68.Fn membar_exit ,
 69must pair with a barrier on another CPU, such as
 70.Fn membar_enter ,
 71in order to synchronize anything between the two CPUs.
 72Code using
 73.Nm
 74should generally be annotated with comments identifying how they are
 75paired.
 76.Pp
 77.Nm
 78affect only operations on regular memory, not on device
 79memory; see
 80.Xr bus_space 9
 81and
 82.Xr bus_dma 9
 83for machine-independent interfaces to handling device memory and DMA
 84operations for device drivers.
 85.Pp
 86Unlike C11,
 87.Em all
 88memory operations \(em that is, all loads and stores on regular
 89memory \(em are affected by
 90.Nm ,
 91not just C11 atomic operations on
 92.Vt _Atomic Ns -qualified
 93objects.
 94.Bl -tag -width abcd
66.It Fn membar_enter 95.It Fn membar_enter
67Any store preceding 96Any store preceding
68.Fn membar_enter 97.Fn membar_enter
69will reach global visibility before all loads and stores following it. 98will happen before all memory operations following it.
 99.Pp
 100An atomic read/modify/write operation
 101.Pq Xr atomic_ops 3
 102followed by a
 103.Fn membar_enter
 104implies a
 105.Em load-acquire
 106operation in the language of C11.
 107.Pp
 108.Sy WARNING :
 109A load followed by
 110.Fn membar_enter
 111.Em does not
 112imply a
 113.Em load-acquire
 114operation, even though
 115.Fn membar_exit
 116followed by a store implies a
 117.Em store-release
 118operation; the symmetry of these names and asymmetry of the semantics
 119is a historical mistake.
 120In the
 121.Nx
 122kernel, you can use
 123.Xr atomic_load_acquire 9
 124for a
 125.Em load-acquire
 126operation without any atomic read/modify/write.
70.Pp 127.Pp
71.Fn membar_enter 128.Fn membar_enter
72is typically used in code that implements locking primitives to ensure 129is typically used in code that implements locking primitives to ensure
73that a lock protects its data. 130that a lock protects its data, and is typically paired with
 131.Fn membar_exit ;
 132see below for an example.
74.It Fn membar_exit 133.It Fn membar_exit
75All loads and stores preceding 134All memory operations preceding
76.Fn membar_exit 135.Fn membar_exit
77will reach global visibility before any store that follows it. 136will happen before any store that follows it.
 137.Pp
 138A
 139.Fn membar_exit
 140followed by a store implies a
 141.Em store-release
 142operation in the language of C11.
 143For a regular store, rather than an atomic read/modify/write store, you
 144should use
 145.Xr atomic_store_release 9
 146instead of
 147.Fn membar_exit
 148followed by the store.
78.Pp 149.Pp
79.Fn membar_exit 150.Fn membar_exit
80is typically used in code that implements locking primitives to ensure 151is typically used in code that implements locking primitives to ensure
81that a lock protects its data. 152that a lock protects its data, and is typically paired with
 153.Fn membar_enter .
 154For example:
 155.Bd -literal -offset abcdefgh
 156/* thread A */
 157obj->state.mumblefrotz = 42;
 158KASSERT(valid(&obj->state));
 159membar_exit();
 160obj->lock = 0;
 161
 162/* thread B */
 163if (atomic_cas_uint(&obj->lock, 0, 1) != 0)
 164 return;
 165membar_enter();
 166KASSERT(valid(&obj->state));
 167obj->state.mumblefrotz--;
 168.Ed
 169.Pp
 170In this example,
 171.Em if
 172the
 173.Fn atomic_cas_uint
 174operation in thread B witnesses the store
 175.Li "obj->lock = 0"
 176from thread A,
 177.Em then
 178everything in thread A before the
 179.Fn membar_exit
 180is guaranteed to happen before everything in thread B after the
 181.Fn membar_enter ,
 182as if the machine had sequentially executed:
 183.Bd -literal -offset abcdefgh
 184obj->state.mumblefrotz = 42; /* from thread A */
 185KASSERT(valid(&obj->state));
 186\&...
 187KASSERT(valid(&obj->state)); /* from thread B */
 188obj->state.mumblefrotz--;
 189.Ed
 190.Pp
 191.Fn membar_exit
 192followed by a store, serving as a
 193.Em store-release
 194operation, may also be paired with a subsequent load followed by
 195.Fn membar_sync ,
 196serving as the corresponding
 197.Em load-acquire
 198operation.
 199However, you should use
 200.Xr atomic_store_release 9
 201and
 202.Xr atomic_load_acquire 9
 203instead in that situation, unless the store is an atomic
 204read/modify/write which requires a separate
 205.Fn membar_exit .
82.It Fn membar_producer 206.It Fn membar_producer
83All stores preceding the memory barrier will reach global visibility 207All stores preceding
84before any stores after the memory barrier reach global visibility. 208.Fn membar_producer
 209will happen before any stores following it.
 210.Pp
 211.Fn membar_producer
 212has no analogue in C11.
 213.Pp
 214.Fn membar_producer
 215is typically used in code that produces data for read-only consumers
 216which use
 217.Fn membar_consumer ,
 218such as
 219.Sq seqlocked
 220snapshots of statistics; see below for an example.
85.It Fn membar_consumer 221.It Fn membar_consumer
86All loads preceding the memory barrier will complete before any loads 222All loads preceding
87after the memory barrier complete. 223.Fn membar_consumer
 224will complete before any loads after it.
 225.Pp
 226.Fn membar_consumer
 227has no analogue in C11.
 228.Pp
 229.Fn membar_consumer
 230is typically used in code that reads data from producers which use
 231.Fn membar_producer ,
 232such as
 233.Sq seqlocked
 234snapshots of statistics.
 235For example:
 236.Bd -literal
 237struct {
 238 /* version number and in-progress bit */
 239 unsigned seq;
 240
 241 /* read-only statistics, too large for atomic load */
 242 unsigned foo;
 243 int bar;
 244 uint64_t baz;
 245} *stats;
 246
 247 /* producer (must be serialized, e.g. with mutex(9)) */
 248 stats->seq |= 1; /* mark update in progress */
 249 membar_producer();
 250 stats->foo = count_foo();
 251 stats->bar = measure_bar();
 252 stats->baz = enumerate_baz();
 253 membar_producer();
 254 stats->seq++; /* bump version number */
 255
 256 /* consumer (in parallel w/ producer, other consumers) */
 257restart:
 258 while ((seq = stats->seq) & 1) /* wait for update */
 259 SPINLOCK_BACKOFF_HOOK;
 260 membar_consumer();
 261 foo = stats->foo; /* read out a candidate snapshot */
 262 bar = stats->bar;
 263 baz = stats->baz;
 264 membar_consumer();
 265 if (seq != stats->seq) /* try again if version changed */
 266 goto restart;
 267.Ed
88.It Fn membar_datadep_consumer 268.It Fn membar_datadep_consumer
89Same as 269Same as
90.Fn membar_consumer , 270.Fn membar_consumer ,
91but limited to loads of addresses dependent on prior loads, or 271but limited to loads of addresses dependent on prior loads, or
92.Sq data-dependent 272.Sq data-dependent
93loads: 273loads:
94.Bd -literal -offset indent 274.Bd -literal -offset indent
95int **pp, *p, v; 275int **pp, *p, v;
96 276
97p = *pp; 277p = *pp;
98membar_datadep_consumer(); 278membar_datadep_consumer();
99v = *p; 279v = *p;
100consume(v); 280consume(v);
101.Ed 281.Ed
102.Pp 282.Pp
103Does not guarantee ordering of loads in branches, or 283.Fn membar_datadep_consumer
 284is typically paired with
 285.Fn membar_exit
 286by code that initializes an object before publishing it.
 287However, you should use
 288.Xr atomic_store_release 9
 289and
 290.Xr atomic_load_consume 9
 291instead, to avoid obscure edge cases in case the consumer is not
 292read-only.
 293.Pp
 294.Fn membar_datadep_consumer
 295does not guarantee ordering of loads in branches, or
104.Sq control-dependent 296.Sq control-dependent
105loads -- you must use 297loads \(em you must use
106.Fn membar_consumer 298.Fn membar_consumer
107instead: 299instead:
108.Bd -literal -offset indent 300.Bd -literal -offset indent
109int *ok, *p, v; 301int *ok, *p, v;
110 302
111if (*ok) { 303if (*ok) {
112 membar_consumer(); 304 membar_consumer();
113 v = *p; 305 v = *p;
114 consume(v); 306 consume(v);
115} 307}
116.Ed 308.Ed
117.Pp 309.Pp
118Most CPUs do not reorder data-dependent loads (i.e., most CPUs 310Most CPUs do not reorder data-dependent loads (i.e., most CPUs
119guarantee that cached values are not stale in that case), so 311guarantee that cached values are not stale in that case), so
120.Fn membar_datadep_consumer 312.Fn membar_datadep_consumer
121is a no-op on those CPUs. 313is a no-op on those CPUs.
122.It Fn membar_sync 314.It Fn membar_sync
123All loads and stores preceding the memory barrier will complete and 315All memory operations preceding
124reach global visibility before any loads and stores after the memory 316.Fn membar_sync
125barrier complete and reach global visibility. 317will happen before any memory operations following it.
 318.Pp
 319.Fn membar_sync
 320is a sequential consistency acquire/release barrier, analogous to
 321.Li "atomic_thread_fence(memory_order_seq_cst)"
 322in C11.
 323.Pp
 324.Fn membar_sync
 325is typically paired with
 326.Fn membar_sync .
 327.Pp
 328A load followed by
 329.Fn membar_sync ,
 330serving as a
 331.Em load-acquire
 332operation, may also be paired with a prior
 333.Fn membar_exit
 334followed by a store, serving as the corresponding
 335.Em store-release
 336operation.
 337However, you should use
 338.Xr atomic_load_acquire 9
 339instead of
 340.No load-then- Ns Fn membar_sync
 341if it is a regular load, or
 342.Fn membar_enter
 343instead of
 344.Fn membar_sync
 345if the load is in an atomic read/modify/write operation.
126.El 346.El
127.Sh SEE ALSO 347.Sh SEE ALSO
128.Xr atomic_ops 3 348.Xr atomic_ops 3 ,
 349.Xr atomic_loadstore 9
129.Sh HISTORY 350.Sh HISTORY
130The 351The
131.Nm membar_ops 352.Nm membar_ops
132functions first appeared in 353functions first appeared in
133.Nx 5.0 . 354.Nx 5.0 .
134The data-dependent load barrier, 355The data-dependent load barrier,
135.Fn membar_datadep_consumer , 356.Fn membar_datadep_consumer ,
136first appeared in 357first appeared in
137.Nx 7.0 . 358.Nx 7.0 .
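
For readers without a NetBSD tree at hand, the seqlock pattern from the new membar_producer and membar_consumer text can be approximated in portable C11. This is a sketch under assumptions, not the man page's code. The page states that these two barriers have no C11 analogue, so the release and acquire fences below are stronger stand-ins. The names stats_update, stats_read, and struct snapshot are hypothetical, and the data fields are made atomic only to avoid data races under the C11 model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct stats {
	atomic_uint seq;	/* version number; odd while updating */
	atomic_uint foo;
	atomic_int bar;
	_Atomic uint64_t baz;
};

struct snapshot {
	unsigned foo;
	int bar;
	uint64_t baz;
};

/* Producer; callers must serialize updates among themselves. */
static void
stats_update(struct stats *s, unsigned foo, int bar, uint64_t baz)
{
	unsigned seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	assert((seq & 1) == 0);	/* no other update in progress */
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	/* Pairs with the second acquire fence in stats_read(). */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->foo, foo, memory_order_relaxed);
	atomic_store_explicit(&s->bar, bar, memory_order_relaxed);
	atomic_store_explicit(&s->baz, baz, memory_order_relaxed);
	/* Pairs with the first acquire fence in stats_read(). */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->seq, seq + 2, memory_order_relaxed);
}

/* Consumer; returns the version number of the snapshot it copied out. */
static unsigned
stats_read(struct stats *s, struct snapshot *out)
{
	unsigned seq;

	for (;;) {
		seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
		if (seq & 1)
			continue;	/* update in progress; retry */
		/* Pairs with the second release fence in stats_update(). */
		atomic_thread_fence(memory_order_acquire);
		out->foo = atomic_load_explicit(&s->foo, memory_order_relaxed);
		out->bar = atomic_load_explicit(&s->bar, memory_order_relaxed);
		out->baz = atomic_load_explicit(&s->baz, memory_order_relaxed);
		/* Pairs with the first release fence in stats_update(). */
		atomic_thread_fence(memory_order_acquire);
		if (seq == atomic_load_explicit(&s->seq, memory_order_relaxed))
			return seq;	/* version unchanged: snapshot is consistent */
	}
}
```

As in the man page example, the barriers pair crosswise: the producer's first barrier pairs with the consumer's second, and the producer's second with the consumer's first.
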