---
- branch: MAIN
  date: Fri Mar  6 08:18:31 UTC 2020
  files:
  - new: '1.16'
    old: '1.15'
    path: pkgsrc/textproc/miller/Makefile
    pathrev: pkgsrc/textproc/miller/Makefile@1.16
    type: modified
  - new: '1.15'
    old: '1.14'
    path: pkgsrc/textproc/miller/distinfo
    pathrev: pkgsrc/textproc/miller/distinfo@1.15
    type: modified
  id: 20200306T081831Z.58b02f8a71d21704491a2e4106b236cb07efa16e
  log: "miller: update to 5.6.2.\n\nChangeLog:\n\nv5.6.2\n\nBug fixes:\n\n    #271
    fixes a corner-case bug with more than 100 CSV/TSV files with\n    headers of
    varying lengths.\n\nDocumentation:\n\n    The new http://johnkerl.org/miller/doc/whyc-details.html
    is an\n    elaboration on http://johnkerl.org/miller/doc/whyc.html which answers\n
    \   a question posed by @BurntSushi on Reddit a couple years ago which\n    I
    did not address in detail at the time.\n\nv5.6.1\n\n    The only change is that
    http://johnkerl.org/miller/doc is now\n    more mobile-friendly.  All build artifacts
    are the same as at\n    https://github.com/johnkerl/miller/releases/tag/v5.6.0\n\nv5.6.0\n\n
    \   The new system DSL function allows you to run arbitrary shell commands\n    and
    store their output in field values. Some example usages are documented\n    here. This
    is in response to issues #246 and #209.\n\n    There is now support for ASV and
    USV file formats. This is in response\n    to issue #245.\n\n    The new format-values
    verb allows you to apply numerical formatting\n    across all record values. This
    is in response to issue #252.\n\nDocumentation:\n\n    The new DKVP I/O in Python
    sample code now works for Python 2 as\n    well as Python 3.\n\n    There is a
    new cookbook entry on doing multiple joins. This is in\n    response to issue
    #235.\n\nBugfixes:\n\n    The toupper, tolower, and capitalize DSL functions\n
    \   are now UTF-8 aware, thanks to @sheredom's marvelous\n    https://github.com/sheredom/utf8.h.
    The internationalization page\n    has also been expanded. This is in response
    to issue #254.\n\n    #250 fixes a bug using in-place mode in conjunction with
    verbs\n    (such as rename or sort) which take field-name lists as arguments.\n\n
    \   #253 fixes a bug in the label verb when one or more names are common\n    between
    old and new.\n\n    #251 fixes a corner-case bug when (a) input is CSV; (b) the
    last\n    field ends with a comma and no newline; (c) input is from standard\n
    \   input and/or --no-mmap is supplied.\n\nv5.5.0\n\n    The new positional-indexing
    feature resolves #236 from @aborruso. You\n    can now get the name of the 3rd
    field of each record via $[[3]], and\n    its value by $[[[3]]]. These are both
    usable on either the left-hand\n    or right-hand side of assignment statements,
    so you can more easily\n    do things like renaming fields programmatically within
    the DSL.\n\n    There is a new capitalize DSL function, complementing the\n    already-existing
    toupper. This stems from #236.\n\n    There is a new skip-trivial-records verb,
    resolving #197. Similarly,\n    there is a new remove-empty-columns verb, resolving
    #206. Both are\n    useful for data-cleaning use-cases.\n\n    Another pair is
    #181 and #256. While Miller uses mmap internally\n    (and invisibly) to get
    approximately a 20% performance boost over\n    not using it, this can cause out-of-memory
    issues with reading either\n    large files, or too many small ones. Now, Miller
    automatically avoids\n    mmap in these cases. You can still use --mmap or --no-mmap
    if you\n    want manual control of this.\n\n    There is a new --ivar option for
    the nest verb which complements\n    the already-existing --evar. This is from
    #260 thanks to @jgreely.\n\n    There is a new keystroke-saving urandrange DSL
    function:\n    urandrange(low, high) is the same as low + (high - low) *\n    urand().
    This arose from #243.\n\n    There is a new -v option for the cat verb which writes
    a low-level\n    record-structure dump to standard error.\n\n    There is a new
    -N option for mlr which is a keystroke-saver for\n    --implicit-csv-header --headerless-csv-output.\n\nDocumentation:\n\n
    \   The new FAQ entry\n    http://johnkerl.org/miller/doc/faq.html#How_to_escape_'%3F'_in_regexes%3F\n
    \   resolves #203.\n\n    The new FAQ entry\n    http://johnkerl.org/miller/doc/faq.html#How_can_I_filter_by_date%3F\n
    \   resolves #208.\n\n    #244 fixes a documentation issue while highlighting
    the need for #241.\n\nBugfixes:\n\n    There was a SEGV using nest within then-chains,
    fixed in response\n    to #220.\n\n    Quotes and backslashes weren't being escaped
    in JSON output with\n    --jvquoteall; reported on #222.\n\nv5.4.0\n\n    The
    new clean-whitespace verb resolves #190 from @aborruso. Along with\n    the new
    functions strip, lstrip, rstrip, collapse_whitespace, and\n    clean_whitespace,
    there is now both coarse-grained and fine-grained\n    control over whitespace
    within field names and/or values. See the\n    linked-to documentation for examples.\n\n
    \   The new altkv verb resolves #184 which was originally opened via an\n    email
    request. This supports mapping value-lists such as a,b,c,d to\n    alternating
    key-value pairs such as a=b,c=d.\n\n    The new fill-down verb resolves #189 by
    @aborruso. See the linked-to\n    documentation for examples.\n\n    The uniq
    verb now has a uniq -a which resolves #168 from @sjackman.\n\n    The new regextract
    and regextract_or_else functions resolve #183\n    by @aborruso.\n\n    The new
    ssub function arises from #171 by @dohse, as a simplified way\n    to avoid escaping
    characters which are special to regular-expression\n    parsers.\n\n    There
    are new localtime functions in response to #170 by\n    @sitaramc. However note
    that as discussed on #170 these do\n    not undo one another in all circumstances.
    This is a non-issue\n    for timezones which do not do DST. Otherwise, please
    use with\n    disclaimers: localdate, localtime2sec, sec2localdate, sec2localtime,\n
    \   strftime_local, and strptime_local.\n\nBuilds:\n\n    Windows build-artifacts
    are now available in Appveyor at\n    https://ci.appveyor.com/project/johnkerl/miller/build/artifacts,\n
    \   and will be attached to this and future releases. This resolves #167,\n    #148,
    and #109.\n\n    Travis builds at https://travis-ci.org/johnkerl/miller/builds
    now\n    run on OSX as well as Linux.\n\n    An Ubuntu 17 build issue was fixed
    by @singalen on #164.\n\nDocumentation:\n\n    put/filter documentation was confusing
    as reported by @NikosAlexandris\n    on #169.\n\n    The new FAQ entry\n    http://johnkerl.org/miller-releases/miller-head/doc/faq.html#How_to_rectangularize_after_joins_with_unpaired?\n
    \   resolves #193 by @aborruso.\n\n    The new cookbook entry\n    http://johnkerl.org/miller/doc/cookbook.html#Options_for_dealing_with_duplicate_rows\n
    \   arises from #168 from @sjackman.\n\n    The unsparsify documentation had some
    words missing as reported by\n    @tst2005 on #194.\n\n    There was a typo in
    the cookbook page\n    http://johnkerl.org/miller/doc/cookbook.html#Full_field_renames_and_reassigns\n
    \   as fixed by @tst2005 in #192.\n\nBugfixes:\n\n    There was a memory leak
    for TSV-format files only as reported by\n    @treynr on #181.\n\n    Dollar signs
    in regular expressions were not being escaped properly\n    as reported by @dohse
    on #171.\n\nv5.3.0\n\n    Comment strings in data files: mlr --skip-comments allows\n
    \   you to filter out input lines starting with #, for all file\n    formats.
    Likewise, mlr --skip-comments-with X lets you specify\n    the comment-string
    X. Comments are only supported at start of data\n    line. mlr --pass-comments
    and mlr --pass-comments-with X allow you\n    to forward comments to program output
    as they are read.\n\n    The count-similar verb lets you compute cluster sizes
    by cluster\n    labels.\n\n    While Miller DSL arithmetic gracefully overflows
    from 64-bit integer\n    to double-precision float (see also here), there are now
    the\n    integer-preserving arithmetic operators .+ .- .* ./ .// for those\n    times
    when you want integer overflow.\n\n    There is a new bitcount function: for example,
    echo x=0xf0000206 |\n    mlr put '$y=bitcount($x)' produces x=0xf0000206,y=7.\n\n
    \   Issue 158: mlr -T is an alias for --nidx --fs tab, and mlr -t is an\n    alias
    for mlr --tsvlite.\n\n    The mathematical constants π and e have been renamed
    from PI and\n    E to M_PI and M_E, respectively. (It's annoying to get a syntax\n
    \   error when you try to define a variable named E in the DSL, when\n    A through
    D work just fine.) This is a backward incompatibility,\n    but not enough of
    one to justify calling this release Miller 6.0.0.\n\nDocumentation:\n\n    As noted
    here, while Miller has its own DSL there will always be\n    things better expressible
    in a general-purpose language. The new page\n    Sharing data with other languages
    shows how to seamlessly share data\n    back and forth between Miller, Ruby, and
    Python. SQL-input examples\n    and SQL-output examples contain detailed information
    on the interplay\n    between Miller and SQL.\n\n    Issue 150 raised a question
    about suppressing numeric conversion. This\n    resulted in a new FAQ entry How
    do I suppress numeric conversion?,\n    as well as the longer-term follow-on issue
    151 which will make\n    numeric conversion happen on a just-in-time basis.\n\n
    \   To my surprise, csvlite format options weren't listed in mlr --help\n
    \   or the manpage. This has been fixed.\n\n    Documentation for auxiliary commands
    has been expanded, including\n    within the manpage.\n\nBugfixes:\n\n    Issue
    159 fixes regex-match of literal dot.\n\n    Issue 160 fixes out-of-memory cases
    for huge files. This is an old\n    bug, as old as Miller, and is due to inadequate
    testing of huge-file\n    cases. The problem is simple: Miller prefers memory-mapped
    I/O\n    (using mmap) over stdio since mmap is fractionally faster. Yet as\n    any
    processing (even mlr cat) steps through an input file, more and\n    more pages
    are faulted in -- and, unfortunately, previous pages are\n    not paged out once
    memory pressure increases. (This despite gallant\n    attempts with madvise.)
    Once all processing is done, the memory is\n    released; there is no leak per
    se. But the Miller process can crash\n    before the entire file is read. The
    solution is equally simple: to\n    prefer stdio over mmap for files over 4GB
    in size. (This 4GB threshold\n    is tunable via the --mmap-below flag as described
    in the manpage.)\n\n    Issue 161 fixes a CSV-parse error (with error message
    \"unwrapped\n    double quote at line 0\") when a CSV file starts with the UTF-8\n
    \   byte-order-mark (\"BOM\") sequence 0xef 0xbb 0xbf and the header line\n    has
    double-quoted fields. (Release 5.2.0 introduced handling for\n    UTF-8 BOMs,
    but missed the case of double-quoted header line.)\n\n    Issue 162 fixes a corner
    case doing multi-emit of aggregate variables\n    when the first variable name
    is a typo.\n\n    The Miller JSON parser used to error with Unable to parse JSON
    data:\n    Line 1 column 0: Unexpected 0x00 when seeking value on empty input,\n
    \   or input with trailing whitespace; this has been fixed.\n"
  module: pkgsrc
  subject: 'CVS commit: pkgsrc/textproc/miller'
  unixtime: '1583482711'
  user: fcambus