---
- branch: MAIN
  date: Wed Aug 16 23:38:35 UTC 2017
  files:
  - new: '1.1'
    old: '0'
    path: othersrc/external/bsd/agcre/dist/internal.h
    pathrev: othersrc/external/bsd/agcre/dist/internal.h@1.1
    type: added
  id: 20170816T233835Z.44b1b12119c9f2960f4cb4da5f18d0f35eba78cd
  log: "Just what this world needs - another regexp library. However, for\nsomething
    I was doing, I needed a regexp library in C, BSD-licensed,\nand able to be exposed
    to a wide range of expressions, some better\ncontrolled than others.\n\nThe resulting
    library is libagcre, which implements regular expression\ncompilation and execution.
    It uses the Pike Virtual Machine approach,\nand features:\n\n+ standard POSIX
    features where sane\n+ some/most Perl escapes\n+ lazy matching via '?'\n+ non-capture
    parentheses (?:...)\n+ in-expression case-insensitive directives are supported
    (?i)...(?-i)\n+ all case-insensitivity is actioned at expression exec time.\nCase-insensitivity
    can be specified at expression compile-time,\nand, if so, it will be remembered.
    \ But the expression itself, once\ncompiled, can be used to match in both a case-sensitive
    and insensitive\nmanner\n+ utf8 is supported both for expressions and for input
    text when\nmatching\n+ unicode escapes (in the Java format of \\uABCD) are supported\n+
    exact multiple repetition specifiers {N}, and {N,M} are supported\n+ backreferences
    are supported\n+ utf16 (LE and BE) and utf32 (LE and BE) are supported, both for
    the\nexpression and for the input being searched\n+ at the most basic level, individual
    32-bit unicode characters are\nmatched\n+ an egrep/grep implementation for matching
    unicode regexps\nis included\n\nA simple implementation of sets is used to provide
    inclusion and\nexclusion information for unicode characters, which is taken directly\nfrom
    unicode.org. No bitmasks are used - ranges are specified by\nusing an upper and
    a lower bound for the codepoints. Callbacks can\nalso be added to these sets,
    to provide functionality similar to\nthe ctype macros across the whole unicode
    character set.\n\nThe standard regular expression basic3 torture test is passed
    with\n4 known (and, I'd argue, incorrect) results flagged.  As expected,\nthe
    expression '(a?){9999}aaaaaaaaaaaaaaaaaaaaaaaaaaaaa' matches\nin linear time,
    as does the expression\n'((((((((((((((((((((((((((((((x))))))))))))))))))))))))))))))'\n\n\t%
    time agcre '(a?){9999}aaaaaaaaaaaaaaaaaaaaaaaaaaaaa' dist/tests/2.in\n\taaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n\t0.063u
    0.000s 0:00.06 100.0%    0+0k 0+0io 0pf+0w\n\t% time egrep '(a?){9999}aaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
    dist/tests/2.in\n\t^C88.462u 0.730s 1:29.21 99.9%  0+0k 0+0io 0pf+0w\n\t%\n\nThe
    library and agcre utility have been run through valgrind to\nconfirm no memory
    leaks.\n\nIn general, the emphasis is on a modern, predictable, VM-style,\nwell-featured
    regexp library, in C, with a BSD license. In\nparticular, sljit has not been used
    to speed up on certain platforms,\nmost Perl regexp features are supported, as
    are backreferences,\nand UTF-8, UTF-16 and UTF-32.\n\nOnce again, I wouldn't expect
    anyone to use this as the main engine\nin egrep. But I am always amazed at the
    uses for some of the things\nthat I write.\n\nFor more information about the Pike
    VM, and comparison to other\nregexp implementations, please see:\n\n\thttps://swtch.com/~rsc/regexp/regexp2.html\n\nAlistair
    Crooks\nTue Aug 15 07:43:34 PDT 2017\n"
  module: othersrc
  subject: 'CVS commit: othersrc/external/bsd/agcre/dist'
  unixtime: '1502926715'
  user: agc