Thu Mar 11 17:13:29 2021 UTC ()
Remove trailing whitespace.


(wiz)
diff -r1.30 -r1.31 src/lib/libc/regex/regex.3

cvs diff -r1.30 -r1.31 src/lib/libc/regex/regex.3

--- src/lib/libc/regex/regex.3 2021/03/11 16:36:41 1.30
+++ src/lib/libc/regex/regex.3 2021/03/11 17:13:29 1.31
@@ -1,861 +1,861 @@ @@ -1,861 +1,861 @@
1.\" $NetBSD: regex.3,v 1.30 2021/03/11 16:36:41 christos Exp $ 1.\" $NetBSD: regex.3,v 1.31 2021/03/11 17:13:29 wiz Exp $
2.\" 2.\"
3.\" Copyright (c) 1992, 1993, 1994 Henry Spencer. 3.\" Copyright (c) 1992, 1993, 1994 Henry Spencer.
4.\" Copyright (c) 1992, 1993, 1994 4.\" Copyright (c) 1992, 1993, 1994
5.\" The Regents of the University of California. All rights reserved. 5.\" The Regents of the University of California. All rights reserved.
6.\" 6.\"
7.\" This code is derived from software contributed to Berkeley by 7.\" This code is derived from software contributed to Berkeley by
8.\" Henry Spencer. 8.\" Henry Spencer.
9.\" 9.\"
10.\" Redistribution and use in source and binary forms, with or without 10.\" Redistribution and use in source and binary forms, with or without
11.\" modification, are permitted provided that the following conditions 11.\" modification, are permitted provided that the following conditions
12.\" are met: 12.\" are met:
13.\" 1. Redistributions of source code must retain the above copyright 13.\" 1. Redistributions of source code must retain the above copyright
14.\" notice, this list of conditions and the following disclaimer. 14.\" notice, this list of conditions and the following disclaimer.
15.\" 2. Redistributions in binary form must reproduce the above copyright 15.\" 2. Redistributions in binary form must reproduce the above copyright
16.\" notice, this list of conditions and the following disclaimer in the 16.\" notice, this list of conditions and the following disclaimer in the
17.\" documentation and/or other materials provided with the distribution. 17.\" documentation and/or other materials provided with the distribution.
18.\" 3. Neither the name of the University nor the names of its contributors 18.\" 3. Neither the name of the University nor the names of its contributors
19.\" may be used to endorse or promote products derived from this software 19.\" may be used to endorse or promote products derived from this software
20.\" without specific prior written permission. 20.\" without specific prior written permission.
21.\" 21.\"
22.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND 22.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
23.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 23.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 24.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
25.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE 25.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
26.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 26.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS 27.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
28.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 28.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
29.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 29.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
30.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 30.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
31.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 31.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
32.\" SUCH DAMAGE. 32.\" SUCH DAMAGE.
33.\" 33.\"
34.\" @(#)regex.3 8.4 (Berkeley) 3/20/94 34.\" @(#)regex.3 8.4 (Berkeley) 3/20/94
35.\" $FreeBSD: head/lib/libc/regex/regex.3 363817 2020-08-04 02:06:49Z kevans $ 35.\" $FreeBSD: head/lib/libc/regex/regex.3 363817 2020-08-04 02:06:49Z kevans $
36.\" 36.\"
37.Dd March 11, 2021 37.Dd March 11, 2021
38.Dt REGEX 3 38.Dt REGEX 3
39.Os 39.Os
40.Sh NAME 40.Sh NAME
41.Nm regcomp , 41.Nm regcomp ,
42.Nm regexec , 42.Nm regexec ,
43.Nm regerror , 43.Nm regerror ,
44.Nm regfree , 44.Nm regfree ,
45.Nm regasub , 45.Nm regasub ,
46.Nm regnsub 46.Nm regnsub
47.Nd regular-expression library 47.Nd regular-expression library
48.Sh LIBRARY 48.Sh LIBRARY
49.Lb libc 49.Lb libc
50.Sh SYNOPSIS 50.Sh SYNOPSIS
51.In regex.h 51.In regex.h
52.Ft int 52.Ft int
53.Fo regcomp 53.Fo regcomp
54.Fa "regex_t * restrict preg" "const char * restrict pattern" "int cflags" 54.Fa "regex_t * restrict preg" "const char * restrict pattern" "int cflags"
55.Fc 55.Fc
56.Ft int 56.Ft int
57.Fo regexec 57.Fo regexec
58.Fa "const regex_t * restrict preg" "const char * restrict string" 58.Fa "const regex_t * restrict preg" "const char * restrict string"
59.Fa "size_t nmatch" "regmatch_t pmatch[restrict]" "int eflags" 59.Fa "size_t nmatch" "regmatch_t pmatch[restrict]" "int eflags"
60.Fc 60.Fc
61.Ft size_t 61.Ft size_t
62.Fo regerror 62.Fo regerror
63.Fa "int errcode" "const regex_t * restrict preg" 63.Fa "int errcode" "const regex_t * restrict preg"
64.Fa "char * restrict errbuf" "size_t errbuf_size" 64.Fa "char * restrict errbuf" "size_t errbuf_size"
65.Fc 65.Fc
66.Ft void 66.Ft void
67.Fn regfree "regex_t *preg" 67.Fn regfree "regex_t *preg"
68.Ft ssize_t 68.Ft ssize_t
69.Fn regnsub "char *buf" "size_t bufsiz" "const char *sub" "const regmatch_t *rm" "const char *str" 69.Fn regnsub "char *buf" "size_t bufsiz" "const char *sub" "const regmatch_t *rm" "const char *str"
70.Ft ssize_t 70.Ft ssize_t
71.Fn regasub "char **buf" "const char *sub" "const regmatch_t *rm" "const char *sstr" 71.Fn regasub "char **buf" "const char *sub" "const regmatch_t *rm" "const char *sstr"
72.Sh DESCRIPTION 72.Sh DESCRIPTION
73These routines implement 73These routines implement
74.St -p1003.2 74.St -p1003.2
75regular expressions 75regular expressions
76.Pq Do RE Dc Ns s ; 76.Pq Do RE Dc Ns s ;
77see 77see
78.Xr re_format 7 . 78.Xr re_format 7 .
79The 79The
80.Fn regcomp 80.Fn regcomp
81function 81function
82compiles an RE written as a string into an internal form, 82compiles an RE written as a string into an internal form,
83.Fn regexec 83.Fn regexec
84matches that internal form against a string and reports results, 84matches that internal form against a string and reports results,
85.Fn regerror 85.Fn regerror
86transforms error codes from either into human-readable messages, 86transforms error codes from either into human-readable messages,
87and 87and
88.Fn regfree 88.Fn regfree
89frees any dynamically-allocated storage used by the internal form 89frees any dynamically-allocated storage used by the internal form
90of an RE. 90of an RE.
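The compile, match, report, free cycle described above can be sketched as follows. This is a minimal illustration, not part of the manual page: the helper name `re_matches` and its return convention are assumptions made for the example.

```c
#include <regex.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if `text` matches the extended RE
 * `pattern`, 0 if it does not, and -1 if the pattern fails to compile.
 */
int
re_matches(const char *pattern, const char *text)
{
	regex_t re;
	char msg[128];
	int rc;

	/* REG_NOSUB: only success or failure is needed, not offsets. */
	rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
	if (rc != 0) {
		/* Turn the error code into a human-readable message. */
		regerror(rc, &re, msg, sizeof(msg));
		fprintf(stderr, "regcomp: %s\n", msg);
		return -1;
	}

	rc = regexec(&re, text, 0, NULL, 0);

	/* Release the storage held by the compiled form. */
	regfree(&re);

	/* Any non-zero result is treated as "no match" in this sketch. */
	return rc == 0 ? 1 : 0;
}
```

Calling `re_matches("^a+b$", "aaab")` should report a match, while an unbalanced pattern such as `"a(b"` takes the error path.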
91.Pp 91.Pp
92The header 92The header
93.In regex.h 93.In regex.h
94declares two structure types, 94declares two structure types,
95.Ft regex_t 95.Ft regex_t
96and 96and
97.Ft regmatch_t , 97.Ft regmatch_t ,
98the former for compiled internal forms and the latter for match reporting. 98the former for compiled internal forms and the latter for match reporting.
99It also declares the four functions, 99It also declares the four functions,
100a type 100a type
101.Ft regoff_t , 101.Ft regoff_t ,
102and a number of constants with names starting with 102and a number of constants with names starting with
103.Dq Dv REG_ . 103.Dq Dv REG_ .
104.Pp 104.Pp
105The 105The
106.Fn regcomp 106.Fn regcomp
107function 107function
108compiles the regular expression contained in the 108compiles the regular expression contained in the
109.Fa pattern 109.Fa pattern
110string, 110string,
111subject to the flags in 111subject to the flags in
112.Fa cflags , 112.Fa cflags ,
113and places the results in the 113and places the results in the
114.Ft regex_t 114.Ft regex_t
115structure pointed to by 115structure pointed to by
116.Fa preg . 116.Fa preg .
117The 117The
118.Fa cflags 118.Fa cflags
119argument 119argument
120is the bitwise OR of zero or more of the following flags: 120is the bitwise OR of zero or more of the following flags:
121.Bl -tag -width REG_EXTENDED 121.Bl -tag -width REG_EXTENDED
122.It Dv REG_EXTENDED 122.It Dv REG_EXTENDED
123Compile modern 123Compile modern
124.Pq Dq extended 124.Pq Dq extended
125REs, 125REs,
126rather than the obsolete 126rather than the obsolete
127.Pq Dq basic 127.Pq Dq basic
128REs that 128REs that
129are the default. 129are the default.
130.It Dv REG_BASIC 130.It Dv REG_BASIC
131This is a synonym for 0, 131This is a synonym for 0,
132provided as a counterpart to 132provided as a counterpart to
133.Dv REG_EXTENDED 133.Dv REG_EXTENDED
134to improve readability. 134to improve readability.
135.It Dv REG_NOSPEC 135.It Dv REG_NOSPEC
136Compile with recognition of all special characters turned off. 136Compile with recognition of all special characters turned off.
137All characters are thus considered ordinary, 137All characters are thus considered ordinary,
138so the 138so the
139.Dq RE 139.Dq RE
140is a literal string. 140is a literal string.
141This is an extension, 141This is an extension,
142compatible with but not specified by 142compatible with but not specified by
143.St -p1003.2 , 143.St -p1003.2 ,
144and should be used with 144and should be used with
145caution in software intended to be portable to other systems. 145caution in software intended to be portable to other systems.
146.Dv REG_EXTENDED 146.Dv REG_EXTENDED
147and 147and
148.Dv REG_NOSPEC 148.Dv REG_NOSPEC
149may not be used 149may not be used
150in the same call to 150in the same call to
151.Fn regcomp . 151.Fn regcomp .
152.It Dv REG_ICASE 152.It Dv REG_ICASE
153Compile for matching that ignores upper/lower case distinctions. 153Compile for matching that ignores upper/lower case distinctions.
154See 154See
155.Xr re_format 7 . 155.Xr re_format 7 .
156.It Dv REG_NOSUB 156.It Dv REG_NOSUB
157Compile for matching that need only report success or failure, 157Compile for matching that need only report success or failure,
158not what was matched. 158not what was matched.
159.It Dv REG_NEWLINE 159.It Dv REG_NEWLINE
160Compile for newline-sensitive matching. 160Compile for newline-sensitive matching.
161By default, newline is a completely ordinary character with no special 161By default, newline is a completely ordinary character with no special
162meaning in either REs or strings. 162meaning in either REs or strings.
163With this flag, 163With this flag,
164.Ql [^ 164.Ql [^
165bracket expressions and 165bracket expressions and
166.Ql .\& 166.Ql .\&
167never match newline, 167never match newline,
168a 168a
169.Ql ^\& 169.Ql ^\&
170anchor matches the null string after any newline in the string 170anchor matches the null string after any newline in the string
171in addition to its normal function, 171in addition to its normal function,
172and the 172and the
173.Ql $\& 173.Ql $\&
174anchor matches the null string before any newline in the 174anchor matches the null string before any newline in the
175string in addition to its normal function. 175string in addition to its normal function.
176.It Dv REG_PEND 176.It Dv REG_PEND
177The regular expression ends, 177The regular expression ends,
178not at the first NUL, 178not at the first NUL,
179but just before the character pointed to by the 179but just before the character pointed to by the
180.Va re_endp 180.Va re_endp
181member of the structure pointed to by 181member of the structure pointed to by
182.Fa preg . 182.Fa preg .
183The 183The
184.Va re_endp 184.Va re_endp
185member is of type 185member is of type
186.Ft "const char *" . 186.Ft "const char *" .
187This flag permits inclusion of NULs in the RE; 187This flag permits inclusion of NULs in the RE;
188they are considered ordinary characters. 188they are considered ordinary characters.
189This is an extension, 189This is an extension,
190compatible with but not specified by 190compatible with but not specified by
191.St -p1003.2 , 191.St -p1003.2 ,
192and should be used with 192and should be used with
193caution in software intended to be portable to other systems. 193caution in software intended to be portable to other systems.
194.It Dv REG_GNU 194.It Dv REG_GNU
195Include GNU-inspired extensions: 195Include GNU-inspired extensions:
196.Pp 196.Pp
197.Bl -tag -offset indent -width XX -compact  197.Bl -tag -offset indent -width XX -compact
198.It \eN 198.It \eN
199Use backreference 199Use backreference
200.Dv N 200.Dv N
201where 201where
202.Dv N 202.Dv N
203is a single digit number between  203is a single digit number between
204.Dv 1  204.Dv 1
205and  205and
206.Dv 9 . 206.Dv 9 .
207.It \ea 207.It \ea
208Alert (bell character) 208Alert (bell character)
209.It \eb 209.It \eb
210Match a position that is a word boundary. 210Match a position that is a word boundary.
211.It \eB 211.It \eB
212Match a position that is not a word boundary. 212Match a position that is not a word boundary.
213.It \ef 213.It \ef
214Form Feed 214Form Feed
215.It \en 215.It \en
216Line Feed 216Line Feed
217.It \er 217.It \er
218Carriage return 218Carriage return
219.It \es 219.It \es
220Alias for [[:space:]] 220Alias for [[:space:]]
221.It \eS 221.It \eS
222Alias for [^[:space:]] 222Alias for [^[:space:]]
223.It \et 223.It \et
224Horizontal Tab 224Horizontal Tab
225.It \ev 225.It \ev
226Vertical Tab 226Vertical Tab
227.It \ew 227.It \ew
228Alias for [[:alnum:]] 228Alias for [[:alnum:]]
229.It \eW 229.It \eW
230Alias for [^[:alnum:]] 230Alias for [^[:alnum:]]
231.It \e' 231.It \e'
232Matches the end of the subject string (the string to be matched). 232Matches the end of the subject string (the string to be matched).
233.It \e` 233.It \e`
234Matches the beginning of the subject string. 234Matches the beginning of the subject string.
235.El 235.El
236.Pp 236.Pp
237This is an extension, 237This is an extension,
238compatible with but not specified by 238compatible with but not specified by
239.St -p1003.2 , 239.St -p1003.2 ,
240and should be used with 240and should be used with
241caution in software intended to be portable to other systems. 241caution in software intended to be portable to other systems.
242.El 242.El
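The effect of the compilation flags listed above can be checked with a small experiment. The sketch below is illustrative only; `matches_with_flags` is a hypothetical helper, and it uses only the portable flags `REG_EXTENDED`, `REG_ICASE`, `REG_NEWLINE`, and `REG_NOSUB`.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: compile `pattern` with the given cflags and
 * return 1 if `text` matches, 0 otherwise (including compile failure).
 */
int
matches_with_flags(const char *pattern, const char *text, int cflags)
{
	regex_t re;
	int rc;

	if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
		return 0;
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return rc == 0;
}
```

With `REG_NEWLINE`, the pattern `^bar$` is expected to match the middle line of `"foo\nbar\nbaz"`; without it, the anchors apply only to the ends of the whole string, so the same call should fail.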
243.Pp 243.Pp
244When successful, 244When successful,
245.Fn regcomp 245.Fn regcomp
246returns 0 and fills in the structure pointed to by 246returns 0 and fills in the structure pointed to by
247.Fa preg . 247.Fa preg .
248One member of that structure 248One member of that structure
249(other than 249(other than
250.Va re_endp ) 250.Va re_endp )
251is publicized: 251is publicized:
252.Va re_nsub , 252.Va re_nsub ,
253of type 253of type
254.Ft size_t , 254.Ft size_t ,
255contains the number of parenthesized subexpressions within the RE 255contains the number of parenthesized subexpressions within the RE
256(except that the value of this member is undefined if the 256(except that the value of this member is undefined if the
257.Dv REG_NOSUB 257.Dv REG_NOSUB
258flag was used). 258flag was used).
259If 259If
260.Fn regcomp 260.Fn regcomp
261fails, it returns a non-zero error code; 261fails, it returns a non-zero error code;
262see 262see
263.Sx DIAGNOSTICS . 263.Sx DIAGNOSTICS .
264.Pp 264.Pp
265The 265The
266.Fn regexec 266.Fn regexec
267function 267function
268matches the compiled RE pointed to by 268matches the compiled RE pointed to by
269.Fa preg 269.Fa preg
270against the 270against the
271.Fa string , 271.Fa string ,
272subject to the flags in 272subject to the flags in
273.Fa eflags , 273.Fa eflags ,
274and reports results using 274and reports results using
275.Fa nmatch , 275.Fa nmatch ,
276.Fa pmatch , 276.Fa pmatch ,
277and the returned value. 277and the returned value.
278The RE must have been compiled by a previous invocation of 278The RE must have been compiled by a previous invocation of
279.Fn regcomp . 279.Fn regcomp .
280The compiled form is not altered during execution of 280The compiled form is not altered during execution of
281.Fn regexec , 281.Fn regexec ,
282so a single compiled RE can be used simultaneously by multiple threads. 282so a single compiled RE can be used simultaneously by multiple threads.
283.Pp 283.Pp
284By default, 284By default,
285the NUL-terminated string pointed to by 285the NUL-terminated string pointed to by
286.Fa string 286.Fa string
287is considered to be the text of an entire line, minus any terminating 287is considered to be the text of an entire line, minus any terminating
288newline. 288newline.
289The 289The
290.Fa eflags 290.Fa eflags
291argument is the bitwise OR of zero or more of the following flags: 291argument is the bitwise OR of zero or more of the following flags:
292.Bl -tag -width REG_STARTEND 292.Bl -tag -width REG_STARTEND
293.It Dv REG_NOTBOL 293.It Dv REG_NOTBOL
294The first character of the string is treated as the continuation 294The first character of the string is treated as the continuation
295of a line. 295of a line.
296This means that the anchors 296This means that the anchors
297.Ql ^\& , 297.Ql ^\& ,
298.Ql [[:<:]] , 298.Ql [[:<:]] ,
299and 299and
300.Ql \e< 300.Ql \e<
301do not match before it; but see 301do not match before it; but see
302.Dv REG_STARTEND 302.Dv REG_STARTEND
303below. 303below.
304This does not affect the behavior of newlines under 304This does not affect the behavior of newlines under
305.Dv REG_NEWLINE . 305.Dv REG_NEWLINE .
306.It Dv REG_NOTEOL 306.It Dv REG_NOTEOL
307The NUL terminating 307The NUL terminating
308the string 308the string
309does not end a line, so the 309does not end a line, so the
310.Ql $\& 310.Ql $\&
311anchor does not match before it. 311anchor does not match before it.
312This does not affect the behavior of newlines under 312This does not affect the behavior of newlines under
313.Dv REG_NEWLINE . 313.Dv REG_NEWLINE .
314.It Dv REG_STARTEND 314.It Dv REG_STARTEND
315The string is considered to start at 315The string is considered to start at
316.Fa string No + 316.Fa string No +
317.Fa pmatch Ns [0]. Ns Fa rm_so 317.Fa pmatch Ns [0]. Ns Fa rm_so
318and to end before the byte located at 318and to end before the byte located at
319.Fa string No + 319.Fa string No +
320.Fa pmatch Ns [0]. Ns Fa rm_eo , 320.Fa pmatch Ns [0]. Ns Fa rm_eo ,
321regardless of the value of 321regardless of the value of
322.Fa nmatch . 322.Fa nmatch .
323See below for the definition of 323See below for the definition of
324.Fa pmatch 324.Fa pmatch
325and 325and
326.Fa nmatch . 326.Fa nmatch .
327This is an extension, 327This is an extension,
328compatible with but not specified by 328compatible with but not specified by
329.St -p1003.2 , 329.St -p1003.2 ,
330and should be used with 330and should be used with
331caution in software intended to be portable to other systems. 331caution in software intended to be portable to other systems.
332.Pp 332.Pp
333Without 333Without
334.Dv REG_NOTBOL , 334.Dv REG_NOTBOL ,
335the position 335the position
336.Fa rm_so 336.Fa rm_so
337is considered the beginning of a line, such that 337is considered the beginning of a line, such that
338.Ql ^ 338.Ql ^
339matches before it, and the beginning of a word if there is a word 339matches before it, and the beginning of a word if there is a word
340character at this position, such that 340character at this position, such that
341.Ql [[:<:]] 341.Ql [[:<:]]
342and 342and
343.Ql \e< 343.Ql \e<
344match before it. 344match before it.
345.Pp 345.Pp
346With 346With
347.Dv REG_NOTBOL , 347.Dv REG_NOTBOL ,
348the character at position 348the character at position
349.Fa rm_so 349.Fa rm_so
350is treated as the continuation of a line, and if 350is treated as the continuation of a line, and if
351.Fa rm_so 351.Fa rm_so
352is greater than 0, the preceding character is taken into consideration. 352is greater than 0, the preceding character is taken into consideration.
353If the preceding character is a newline and the regular expression was compiled 353If the preceding character is a newline and the regular expression was compiled
354with 354with
355.Dv REG_NEWLINE , 355.Dv REG_NEWLINE ,
356.Ql ^ 356.Ql ^
357matches before the string; if the preceding character is not a word character 357matches before the string; if the preceding character is not a word character
358but the string starts with a word character, 358but the string starts with a word character,
359.Ql [[:<:]] 359.Ql [[:<:]]
360and 360and
361.Ql \e< 361.Ql \e<
362match before the string. 362match before the string.
363.El 363.El
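As a rough illustration of `REG_STARTEND`, the sketch below searches only a byte range of a larger buffer. The helper name `match_range` is hypothetical, and because `REG_STARTEND` is an extension, the code assumes a C library that defines it.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: match `re` against bytes [start, end) of `buf`.
 * On success, returns 0 and leaves the match offsets in *out; as with
 * any regexec() call, they are measured from the beginning of `buf`.
 */
int
match_range(const regex_t *re, const char *buf, size_t start, size_t end,
    regmatch_t *out)
{
	/* pmatch[0] carries the input range when REG_STARTEND is set. */
	out->rm_so = (regoff_t)start;
	out->rm_eo = (regoff_t)end;
	return regexec(re, buf, 1, out, REG_STARTEND);
}
```

Because the range is passed as offsets, the buffer does not need to be copied or NUL-terminated at `end`.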
364.Pp 364.Pp
365See 365See
366.Xr re_format 7 366.Xr re_format 7
367for a discussion of what is matched in situations where an RE or a 367for a discussion of what is matched in situations where an RE or a
368portion thereof could match any of several substrings of 368portion thereof could match any of several substrings of
369.Fa string . 369.Fa string .
370.Pp 370.Pp
371Normally, 371Normally,
372.Fn regexec 372.Fn regexec
373returns 0 for success and the non-zero code 373returns 0 for success and the non-zero code
374.Dv REG_NOMATCH 374.Dv REG_NOMATCH
375for failure. 375for failure.
376Other non-zero error codes may be returned in exceptional situations; 376Other non-zero error codes may be returned in exceptional situations;
377see 377see
378.Sx DIAGNOSTICS . 378.Sx DIAGNOSTICS .
379.Pp 379.Pp
380If 380If
381.Dv REG_NOSUB 381.Dv REG_NOSUB
382was specified in the compilation of the RE, 382was specified in the compilation of the RE,
383or if 383or if
384.Fa nmatch 384.Fa nmatch
385is 0, 385is 0,
386.Fn regexec 386.Fn regexec
387ignores the 387ignores the
388.Fa pmatch 388.Fa pmatch
389argument (but see below for the case where 389argument (but see below for the case where
390.Dv REG_STARTEND 390.Dv REG_STARTEND
391is specified). 391is specified).
392Otherwise, 392Otherwise,
393.Fa pmatch 393.Fa pmatch
394points to an array of 394points to an array of
395.Fa nmatch 395.Fa nmatch
396structures of type 396structures of type
397.Ft regmatch_t . 397.Ft regmatch_t .
398Such a structure has at least the members 398Such a structure has at least the members
399.Va rm_so 399.Va rm_so
400and 400and
401.Va rm_eo , 401.Va rm_eo ,
402both of type 402both of type
403.Ft regoff_t 403.Ft regoff_t
404(a signed arithmetic type at least as large as an 404(a signed arithmetic type at least as large as an
405.Ft off_t 405.Ft off_t
406and a 406and a
407.Ft ssize_t ) , 407.Ft ssize_t ) ,
408containing respectively the offset of the first character of a substring 408containing respectively the offset of the first character of a substring
409and the offset of the first character after the end of the substring. 409and the offset of the first character after the end of the substring.
410Offsets are measured from the beginning of the 410Offsets are measured from the beginning of the
411.Fa string 411.Fa string
412argument given to 412argument given to
413.Fn regexec . 413.Fn regexec .
414An empty substring is denoted by equal offsets, 414An empty substring is denoted by equal offsets,
415both indicating the character following the empty substring. 415both indicating the character following the empty substring.
416.Pp 416.Pp
417The 0th member of the 417The 0th member of the
418.Fa pmatch 418.Fa pmatch
419array is filled in to indicate what substring of 419array is filled in to indicate what substring of
420.Fa string 420.Fa string
421was matched by the entire RE. 421was matched by the entire RE.
422Remaining members report what substring was matched by parenthesized 422Remaining members report what substring was matched by parenthesized
423subexpressions within the RE; 423subexpressions within the RE;
424member 424member
425.Va i 425.Va i
426reports subexpression 426reports subexpression
427.Va i , 427.Va i ,
428with subexpressions counted (starting at 1) by the order of their opening 428with subexpressions counted (starting at 1) by the order of their opening
429parentheses in the RE, left to right. 429parentheses in the RE, left to right.
430Unused entries in the array (corresponding either to subexpressions that 430Unused entries in the array (corresponding either to subexpressions that
431did not participate in the match at all, or to subexpressions that do not 431did not participate in the match at all, or to subexpressions that do not
432exist in the RE (that is, 432exist in the RE (that is,
433.Va i 433.Va i
434> 434>
435.Fa preg Ns -> Ns Va re_nsub ) ) 435.Fa preg Ns -> Ns Va re_nsub ) )
436have both 436have both
437.Va rm_so 437.Va rm_so
438and 438and
439.Va rm_eo 439.Va rm_eo
440set to -1. 440set to -1.
441If a subexpression participated in the match several times, 441If a subexpression participated in the match several times,
442the reported substring is the last one it matched. 442the reported substring is the last one it matched.
443(Note, as an example in particular, that when the RE 443(Note, as an example in particular, that when the RE
444.Ql "(b*)+" 444.Ql "(b*)+"
445matches 445matches
446.Ql bbb , 446.Ql bbb ,
447the parenthesized subexpression matches each of the three 447the parenthesized subexpression matches each of the three
448.So Li b Sc Ns s 448.So Li b Sc Ns s
449and then 449and then
450an infinite number of empty strings following the last 450an infinite number of empty strings following the last
451.Ql b , 451.Ql b ,
452so the reported substring is one of the empties.) 452so the reported substring is one of the empties.)
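The way `pmatch` is filled in can be sketched with a short helper. This is an illustration under assumptions: `match_groups` and the slot count `NSLOTS` are names invented for the example.

```c
#include <regex.h>
#include <stddef.h>

#define NSLOTS 4	/* whole match plus up to three subexpressions */

/*
 * Hypothetical helper: compile `pattern` as an extended RE, match it
 * against `text`, and fill pm[0..NSLOTS-1].  Returns 0 on a match,
 * otherwise the non-zero code from regcomp() or regexec().
 */
int
match_groups(const char *pattern, const char *text, regmatch_t pm[NSLOTS])
{
	regex_t re;
	int rc;

	rc = regcomp(&re, pattern, REG_EXTENDED);
	if (rc != 0)
		return rc;
	rc = regexec(&re, text, NSLOTS, pm, 0);
	regfree(&re);
	return rc;
}
```

After a successful call, `pm[0]` spans the whole match and `pm[i]` spans subexpression `i`. Slots for subexpressions that did not participate, or that do not exist in the RE, have both offsets set to -1, so callers should test for that before using `rm_so` and `rm_eo` to slice the string.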
453.Pp 453.Pp
454If 454If
455.Dv REG_STARTEND 455.Dv REG_STARTEND
456is specified, 456is specified,
457.Fa pmatch 457.Fa pmatch
458must point to at least one 458must point to at least one
459.Ft regmatch_t 459.Ft regmatch_t
460(even if 460(even if
461.Fa nmatch 461.Fa nmatch
462is 0 or 462is 0 or
463.Dv REG_NOSUB 463.Dv REG_NOSUB
464was specified), 464was specified),
465to hold the input offsets for 465to hold the input offsets for
466.Dv REG_STARTEND . 466.Dv REG_STARTEND .
467Use for output is still entirely controlled by 467Use for output is still entirely controlled by
468.Fa nmatch ; 468.Fa nmatch ;
469if 469if
470.Fa nmatch 470.Fa nmatch
471is 0 or 471is 0 or
472.Dv REG_NOSUB 472.Dv REG_NOSUB
473was specified, 473was specified,
474the value of 474the value of
475.Fa pmatch Ns [0] 475.Fa pmatch Ns [0]
476will not be changed by a successful 476will not be changed by a successful
477.Fn regexec . 477.Fn regexec .
478.Pp 478.Pp
479The 479The
480.Fn regerror 480.Fn regerror
481function 481function
482maps a non-zero 482maps a non-zero
483.Fa errcode 483.Fa errcode
484from either 484from either
485.Fn regcomp 485.Fn regcomp
486or 486or
487.Fn regexec 487.Fn regexec
488to a human-readable, printable message. 488to a human-readable, printable message.
489If 489If
490.Fa preg 490.Fa preg
491is 491is
492.No non\- Ns Dv NULL , 492.No non\- Ns Dv NULL ,
493the error code should have arisen from use of 493the error code should have arisen from use of
494the 494the
495.Ft regex_t 495.Ft regex_t
496pointed to by 496pointed to by
497.Fa preg , 497.Fa preg ,
498and if the error code came from 498and if the error code came from
499.Fn regcomp , 499.Fn regcomp ,
500it should have been the result from the most recent 500it should have been the result from the most recent
501.Fn regcomp 501.Fn regcomp
502using that 502using that
503.Ft regex_t . 503.Ft regex_t .
504The 504The
505.Po 505.Po
506.Fn regerror 506.Fn regerror
507may be able to supply a more detailed message using information 507may be able to supply a more detailed message using information
508from the 508from the
509.Ft regex_t . 509.Ft regex_t .
510.Pc 510.Pc
511The 511The
512.Fn regerror 512.Fn regerror
513function 513function
514places the NUL-terminated message into the buffer pointed to by 514places the NUL-terminated message into the buffer pointed to by
515.Fa errbuf , 515.Fa errbuf ,
516limiting the length (including the NUL) to at most 516limiting the length (including the NUL) to at most
517.Fa errbuf_size 517.Fa errbuf_size
518bytes. 518bytes.
519If the whole message will not fit, 519If the whole message will not fit,
520as much of it as will fit before the terminating NUL is supplied. 520as much of it as will fit before the terminating NUL is supplied.
521In any case, 521In any case,
522the returned value is the size of buffer needed to hold the whole 522the returned value is the size of buffer needed to hold the whole
523message (including terminating NUL). 523message (including terminating NUL).
524If 524If
525.Fa errbuf_size 525.Fa errbuf_size
526is 0, 526is 0,
527.Fa errbuf 527.Fa errbuf
528is ignored but the return value is still correct. 528is ignored but the return value is still correct.
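Because the return value is the buffer size needed for the whole message, a two-call pattern can size the buffer exactly. The following is a minimal sketch; `regerror_alloc` is a hypothetical helper name, not part of the library.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return a malloc'd copy of the message for
 * `errcode`, or NULL if allocation fails.  The caller frees the result.
 */
char *
regerror_alloc(int errcode, const regex_t *preg)
{
	/* First call: errbuf_size is 0, so errbuf is ignored. */
	size_t need = regerror(errcode, preg, NULL, 0);
	char *buf = malloc(need);

	if (buf == NULL)
		return NULL;

	/* Second call: the buffer is large enough for the full message. */
	regerror(errcode, preg, buf, need);
	return buf;
}
```

A fixed-size stack buffer is simpler and usually sufficient; the two-call form is only worth the allocation when truncated messages are unacceptable.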
529.Pp 529.Pp
530If the 530If the
531.Fa errcode 531.Fa errcode
532given to 532given to
533.Fn regerror 533.Fn regerror
534is first ORed with 534is first ORed with
535.Dv REG_ITOA , 535.Dv REG_ITOA ,
536the 536the
537.Dq message 537.Dq message
538that results is the printable name of the error code, 538that results is the printable name of the error code,
539e.g.\& 539e.g.\&
540.Dq Dv REG_NOMATCH , 540.Dq Dv REG_NOMATCH ,
541rather than an explanation thereof. 541rather than an explanation thereof.
542If 542If
543.Fa errcode 543.Fa errcode
544is 544is
545.Dv REG_ATOI , 545.Dv REG_ATOI ,
546then 546then
547.Fa preg 547.Fa preg
548shall be 548shall be
549.No non\- Ns Dv NULL 549.No non\- Ns Dv NULL
550and the 550and the
551.Va re_endp 551.Va re_endp
552member of the structure it points to 552member of the structure it points to
553must point to the printable name of an error code; 553must point to the printable name of an error code;
554in this case, the result in 554in this case, the result in
555.Fa errbuf 555.Fa errbuf
556is the decimal digits of 556is the decimal digits of
557the numeric value of the error code 557the numeric value of the error code
558(0 if the name is not recognized). 558(0 if the name is not recognized).
559.Dv REG_ITOA 559.Dv REG_ITOA
560and 560and
561.Dv REG_ATOI 561.Dv REG_ATOI
562are intended primarily as debugging facilities; 562are intended primarily as debugging facilities;
563they are extensions, 563they are extensions,
564compatible with but not specified by 564compatible with but not specified by
565.St -p1003.2 , 565.St -p1003.2 ,
566and should be used with 566and should be used with
567caution in software intended to be portable to other systems. 567caution in software intended to be portable to other systems.
568Be warned also that they are considered experimental and changes are possible. 568Be warned also that they are considered experimental and changes are possible.
569.Pp 569.Pp
570The 570The
571.Fn regfree 571.Fn regfree
572function 572function
573frees any dynamically-allocated storage associated with the compiled RE 573frees any dynamically-allocated storage associated with the compiled RE
574pointed to by 574pointed to by
575.Fa preg . 575.Fa preg .
576The remaining 576The remaining
577.Ft regex_t 577.Ft regex_t
578is no longer a valid compiled RE 578is no longer a valid compiled RE
579and the effect of supplying it to 579and the effect of supplying it to
580.Fn regexec 580.Fn regexec
581or 581or
582.Fn regerror 582.Fn regerror
583is undefined. 583is undefined.
584.Pp 584.Pp
585None of these functions references global variables except for tables 585None of these functions references global variables except for tables
586of constants; 586of constants;
587all are safe for use from multiple threads if the arguments are safe. 587all are safe for use from multiple threads if the arguments are safe.
.Pp
The
.Fn regnsub
and
.Fn regasub
functions perform substitutions using syntax similar to that of
.Xr sed 1 .
They return the length of the string that would have been created
had there been enough space, or
.Dv \-1
on error, setting
.Va errno .
The result is placed in
.Fa buf ,
which is user-supplied in
.Fn regnsub
and dynamically allocated in
.Fn regasub .
The
.Fa sub
argument contains a substitution string which may refer to the first
9 matched subexpressions using
.Dq \e<n>
to refer to the nth matched
item, or
.Dq &
(which is equivalent to
.Dq \e0 )
to refer to the full match.
The
.Fa rm
array must be at least 10 elements long, and should contain the result
of the matches from a previous
.Fn regexec
call.
Only the first 10 elements of the
.Fa rm
array are used.
The
.Fa str
argument contains the source string to apply the transformation to.
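.Pp
The substitution rules above can be illustrated with a portable sketch
that uses only regexec() results.
It is not the library's implementation of regnsub():
the function name expand_sub is hypothetical,
and the handling of a backslash before a non-digit
(the next character is copied literally)
is an assumption made for the example.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Illustrative sketch only: expand `sub` into `buf` (at most `len` bytes,
 * NUL-terminated whenever len > 0) using the matches in rm[0..9] taken
 * against `str`.  Returns the length the complete result would have.
 */
size_t
expand_sub(char *buf, size_t len, const char *sub,
    const regmatch_t *rm, const char *str)
{
	size_t out = 0;
	const char *p;

	for (p = sub; *p != '\0'; p++) {
		int idx = -1;

		if (*p == '&')
			idx = 0;		/* whole match */
		else if (*p == '\\' && p[1] >= '0' && p[1] <= '9')
			idx = *++p - '0';	/* \0 .. \9 */
		else if (*p == '\\' && p[1] != '\0')
			p++;			/* assumed: quote next char */

		if (idx < 0) {
			/* Ordinary character: copy it if it still fits. */
			if (out + 1 < len)
				buf[out] = *p;
			out++;
		} else if (rm[idx].rm_so != -1) {
			/* Copy the text of match idx; unset matches add nothing. */
			size_t i, n = (size_t)(rm[idx].rm_eo - rm[idx].rm_so);

			for (i = 0; i < n; i++, out++)
				if (out + 1 < len)
					buf[out] = str[rm[idx].rm_so + i];
		}
	}
	if (len > 0)
		buf[out < len ? out : len - 1] = '\0';
	return out;
}
```

As with the functions described above, the return value is the length
the full result would have, so a caller can detect truncation by
comparing it against the buffer size.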
.Sh IMPLEMENTATION CHOICES
There are a number of decisions that
.St -p1003.2
leaves up to the implementor,
either by explicitly saying
.Dq undefined
or by virtue of them being
forbidden by the RE grammar.
This implementation treats them as follows.
.Pp
See
.Xr re_format 7
for a discussion of the definition of case-independent matching.
.Pp
There is no particular limit on the length of REs,
except insofar as memory is limited.
Memory usage is approximately linear in RE size, and largely insensitive
to RE complexity, except for bounded repetitions.
See
.Sx BUGS
for one short RE using them
that will run almost any system out of memory.
.Pp
A backslashed character other than one specifically given a magic meaning
by
.St -p1003.2
(such magic meanings occur only in obsolete
.Bq Dq basic
REs)
is taken as an ordinary character.
.Pp
Any unmatched
.Ql [\&
is a
.Dv REG_EBRACK
error.
.Pp
Equivalence classes cannot begin or end bracket-expression ranges.
The endpoint of one range cannot begin another.
.Pp
.Dv RE_DUP_MAX ,
the limit on repetition counts in bounded repetitions, is 255.
.Pp
A repetition operator
.Ql ( ?\& ,
.Ql *\& ,
.Ql +\& ,
or bounds)
cannot follow another
repetition operator.
A repetition operator cannot begin an expression or subexpression
or follow
.Ql ^\&
or
.Ql |\& .
.Pp
.Ql |\&
cannot appear first or last in a (sub)expression or after another
.Ql |\& ,
i.e., an operand of
.Ql |\&
cannot be an empty subexpression.
An empty parenthesized subexpression,
.Ql "()" ,
is legal and matches an
empty (sub)string.
An empty string is not a legal RE.
.Pp
A
.Ql {\&
followed by a digit is considered the beginning of bounds for a
bounded repetition, which must then follow the syntax for bounds.
A
.Ql {\&
.Em not
followed by a digit is considered an ordinary character.
.Pp
.Ql ^\&
and
.Ql $\&
beginning and ending subexpressions in obsolete
.Pq Dq basic
REs are anchors, not ordinary characters.
.Sh DIAGNOSTICS
Non-zero error codes from
.Fn regcomp
and
.Fn regexec
include the following:
.Pp
.Bl -tag -width REG_ECOLLATE -compact
.It Dv REG_NOMATCH
The
.Fn regexec
function
failed to match
.It Dv REG_BADPAT
invalid regular expression
.It Dv REG_ECOLLATE
invalid collating element
.It Dv REG_ECTYPE
invalid character class
.It Dv REG_EESCAPE
.Ql \e
applied to unescapable character
.It Dv REG_ESUBREG
invalid backreference number
.It Dv REG_EBRACK
brackets
.Ql "[ ]"
not balanced
.It Dv REG_EPAREN
parentheses
.Ql "( )"
not balanced
.It Dv REG_EBRACE
braces
.Ql "{ }"
not balanced
.It Dv REG_BADBR
invalid repetition count(s) in
.Ql "{ }"
.It Dv REG_ERANGE
invalid character range in
.Ql "[ ]"
.It Dv REG_ESPACE
ran out of memory
.It Dv REG_BADRPT
.Ql ?\& ,
.Ql *\& ,
or
.Ql +\&
operand invalid
.It Dv REG_EMPTY
empty (sub)expression
.It Dv REG_ASSERT
cannot happen - you found a bug
.It Dv REG_INVARG
invalid argument, e.g.\& negative-length string
.It Dv REG_ILLSEQ
illegal byte sequence (bad multibyte character)
.El
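.Pp
A non-zero code from the list above can be turned into a readable message
with regerror().
The following is a minimal sketch;
the helper name compile_or_explain is hypothetical,
and the exact message text is implementation-defined.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: compile `pattern` as an extended RE.  On failure,
 * copy the regerror() message into `msg` and return the error code; on
 * success, set `msg` to the empty string and return 0.
 */
int
compile_or_explain(const char *pattern, char *msg, size_t msglen)
{
	regex_t re;
	int rc;

	rc = regcomp(&re, pattern, REG_EXTENDED);
	if (rc != 0) {
		/* Translate the error code into a human-readable message. */
		regerror(rc, &re, msg, msglen);
		return rc;
	}
	if (msglen > 0)
		msg[0] = '\0';
	regfree(&re);
	return 0;
}
```

For example, an unmatched bracket such as a[b yields a non-zero code
and a non-empty message.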
.Sh SEE ALSO
.Xr grep 1 ,
.Xr re_format 7
.Pp
.St -p1003.2 ,
sections 2.8 (Regular Expression Notation)
and
B.5 (C Binding for Regular Expression Matching).
.Sh HISTORY
Originally written by
.An Henry Spencer .
Altered for inclusion in the
.Bx 4.4
distribution.
.Pp
The
.Fn regnsub
and
.Fn regasub
functions appeared in
.Nx 8 .
.Sh BUGS
This is an alpha release with known defects.
Please report problems.
.Pp
The back-reference code is subtle and doubts linger about its correctness
in complex cases.
.Pp
The performance of the
.Fn regexec
function
is poor.
This will improve with later releases.
The
.Fa nmatch
argument
exceeding 0 is expensive;
.Fa nmatch
exceeding 1 is worse.
The
.Fn regexec
function
is largely insensitive to RE complexity
.Em except
that back
references are massively expensive.
RE length does matter; in particular, there is a strong speed bonus
for keeping RE length under about 30 characters,
with most special characters counting roughly double.
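.Pp
When only a yes/no answer is needed, the cost of a non-zero nmatch
can be avoided by compiling with REG_NOSUB and passing an nmatch of 0.
The sketch below illustrates this;
the helper name count_matching is hypothetical,
and it reports 0 for a pattern that fails to compile,
which a real caller would probably want to distinguish.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: count how many of the `n` strings in `lines`
 * match the extended RE `pattern`.  No match offsets are requested.
 */
size_t
count_matching(const char *pattern, const char *const *lines, size_t n)
{
	regex_t re;
	size_t i, hits = 0;

	/* REG_NOSUB: only success or failure of the match is reported. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return 0;

	for (i = 0; i < n; i++)
		if (regexec(&re, lines[i], 0, NULL, 0) == 0)
			hits++;

	regfree(&re);
	return hits;
}
```

The RE is compiled once and reused for every string,
so the compilation cost is not paid per match attempt.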
.Pp
The
.Fn regcomp
function
implements bounded repetitions by macro expansion,
which is costly in time and space if counts are large
or bounded repetitions are nested.
An RE like, say,
.Ql "((((a{1,100}){1,100}){1,100}){1,100}){1,100}"
will (eventually) run almost any existing machine out of swap space.
.Pp
There are suspected problems with response to obscure error conditions.
Notably,
certain kinds of internal overflow,
produced only by truly enormous REs or by multiply nested bounded repetitions,
are probably not handled well.
.Pp
Due to a mistake in
.St -p1003.2 ,
things like
.Ql "a)b"
are legal REs because
.Ql )\&
is
a special character only in the presence of a previous unmatched
.Ql (\& .
This cannot be fixed until the spec is fixed.
.Pp
The standard's definition of back references is vague.
For example, does
.Ql "a\e(\e(b\e)*\e2\e)*d"
match
.Ql "abbbd" ?
Until the standard is clarified,
behavior in such cases should not be relied on.
.Pp
The implementation of word-boundary matching is a bit of a kludge,
and bugs may lurk in combinations of word-boundary matching and anchoring.
.Pp
Word-boundary matching does not work properly in multibyte locales.