Date: Thu, 15 Aug 2013 08:02:23 +0000
From: "Wen Heping"
Subject: CVS commit: pkgsrc/converters/p5-MARC-Charset
To: pkgsrc-changes@NetBSD.org
Reply-To: wen@netbsd.org
Message-Id: <20130815080223.1FA2A96@cvs.netbsd.org>
List-Id: pkgsrc-changes.NetBSD.org

Module Name:    pkgsrc
Committed By:   wen
Date:           Thu Aug 15 08:02:23 UTC 2013

Modified Files:
        pkgsrc/converters/p5-MARC-Charset: Makefile distinfo

Log Message:
Update to 1.35

Upstream changes:
1.35 Tue Aug 13 19:50:55 PDT 2013
    - improve conversion of certain composed characters to MARC8

      Some characters should not be fully decomposed before
      converting them to MARC8.
      This patch adds a table of such characters, based on Annex A of
      http://www.loc.gov/marc/marbi/2006/2006-04.html and on some
      sample records provided by Jason Stephenson of MVLC.

    - recognize G0 and G1 characters properly

      When converting from MARC8 to UTF8, MARC::Charset now properly
      recognizes whether a (single-byte) MARC8 character falls in G0
      or G1.  This is part of the fix for RT#63271 (converting
      characters in the Extended Cyrillic character set), but should
      also fix similar issues with converting characters in the
      extended Arabic set.

      This commit also means that all MARC8 character sets that
      support both G0 and G1 will be properly converted, regardless
      of whether they're currently set as the G0 or G1 character set.
      For example, it is now possible to convert Extended Latin as G0
      or Basic Latin as G1.

      This fixes RT#63271.

    - have MARC::Charset::Code->marc_value() handle G0/G1 conversion

      Since there's at present no need to do things like have ANSEL
      be the G0 character set when converting from UTF8 to MARC8,
      this commit centralizes the logic for deciding whether to
      return the G0 or G1 MARC8 representation of a character.

      Also add MARC::Charset::Code->g0_marc_value(), which returns
      the G0 representation of the character for use by the
      character DB.

    - New test cases for converting Vietnamese and Extended Cyrillic
      text.

1.34 Mon Feb 11 09:10:35 PST 2013
    - RT#83257: use AnyDBM_File rather than hardcode GDBM_File

      To improve portability, use AnyDBM_File to select a DBM rather
      than rely on GDBM_File.  GDBM_File apparently used to be a core
      module, but not all distributions included it, particularly
      OS X.  In any event, GDBM_File is no longer core.

      This patch also includes a tweak to allow MARC::Charset to work
      with NDBM_File and ODBM_File, neither of which supports
      'exists'.
      I've tested MARC::Charset successfully on the following DBMs:

      - GDBM_File
      - DB_File
      - NDBM_File
      - ODBM_File
      - SDBM_File

      This is also my preferred order; SDBM_File is selected last
      because it produces the biggest data file on disk.

    - RT#38912: fix mapping of double diacritics (ligature and double
      tilde).  Thanks to Thomas P. Ventimiglia for the bug report and
      test case.

To generate a diff of this commit:
cvs rdiff -u -r1.8 -r1.9 pkgsrc/converters/p5-MARC-Charset/Makefile
cvs rdiff -u -r1.2 -r1.3 pkgsrc/converters/p5-MARC-Charset/distinfo

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
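
For readers unfamiliar with the G0/G1 distinction mentioned in the 1.35
notes, here is a minimal sketch in Python. It is not MARC::Charset's
code. It assumes the usual 8-bit layout in which G0 graphic characters
occupy bytes 0x21-0x7E and G1 graphic characters occupy 0xA1-0xFE, so
that the two representations of a character differ only in the high
bit. The function names (graphic_set, to_g0, to_g1) are hypothetical
and chosen only for illustration.

```python
# Illustrative sketch only; names and structure are assumptions,
# not the MARC::Charset implementation.

G0_FIRST, G0_LAST = 0x21, 0x7E  # assumed G0 graphic byte range
G1_FIRST, G1_LAST = 0xA1, 0xFE  # assumed G1 graphic byte range


def graphic_set(byte: int) -> str:
    """Classify a single byte as 'G0', 'G1', or 'other' (control, space, etc.)."""
    if G0_FIRST <= byte <= G0_LAST:
        return "G0"
    if G1_FIRST <= byte <= G1_LAST:
        return "G1"
    return "other"


def to_g0(byte: int) -> int:
    """Return the G0 form of a graphic byte by clearing the high bit."""
    if graphic_set(byte) == "other":
        raise ValueError(f"not a graphic byte: {byte:#04x}")
    return byte & 0x7F


def to_g1(byte: int) -> int:
    """Return the G1 form of a graphic byte by setting the high bit."""
    if graphic_set(byte) == "other":
        raise ValueError(f"not a graphic byte: {byte:#04x}")
    return byte | 0x80
```

Under these assumptions, to_g0(0xE1) and to_g1(0x61) name the same
position in a character set, which is roughly the kind of decision the
notes describe centralizing in marc_value() and g0_marc_value().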