Received: from localhost (localhost [127.0.0.1])
	by mail.netbsd.org (Postfix) with ESMTP id DA00184FBE
	for ; Thu, 28 Sep 2023 22:19:37 +0000 (UTC)
X-Virus-Scanned: amavisd-new at netbsd.org
Received: from mail.netbsd.org ([IPv6:::1])
	by localhost (mail.netbsd.org [IPv6:::1]) (amavisd-new, port 10025)
	with ESMTP id lyrm2Nt9h-hl for ;
	Thu, 28 Sep 2023 22:19:37 +0000 (UTC)
Received: from cvs.NetBSD.org (ivanova.NetBSD.org [IPv6:2001:470:a085:999:28c:faff:fe03:5984])
	by mail.netbsd.org (Postfix) with ESMTP id 48BBD84D24
	for ; Thu, 28 Sep 2023 22:19:37 +0000 (UTC)
Received: by cvs.NetBSD.org (Postfix, from userid 500)
	id 3744FFBDB; Thu, 28 Sep 2023 22:19:37 +0000 (UTC)
Content-Transfer-Encoding: 7bit
Content-Type: multipart/mixed; boundary="_----------=_1695939577233550"
MIME-Version: 1.0
Date: Thu, 28 Sep 2023 22:19:37 +0000
From: "Joerg Sonnenberger"
Subject: CVS commit: pkgsrc/textproc/py-pdfrw
To: pkgsrc-changes@NetBSD.org
Approved: commit_and_comment
Reply-To: joerg@netbsd.org
X-Mailer: log_accum
Message-Id: <20230928221937.3744FFBDB@cvs.NetBSD.org>

This is a multi-part message in MIME format.

--_----------=_1695939577233550
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="US-ASCII"

Module Name:	pkgsrc
Committed By:	joerg
Date:		Thu Sep 28 22:19:37 UTC 2023

Modified Files:
	pkgsrc/textproc/py-pdfrw: Makefile
Added Files:
	pkgsrc/textproc/py-pdfrw/patches: patch-pdfrw_pdfreader.py

Log Message:
Include a fix for certain common broken PDFs.


To generate a diff of this commit:
cvs rdiff -u -r1.4 -r1.5 pkgsrc/textproc/py-pdfrw/Makefile
cvs rdiff -u -r0 -r1.1 \
    pkgsrc/textproc/py-pdfrw/patches/patch-pdfrw_pdfreader.py

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
--_----------=_1695939577233550
Content-Disposition: inline
Content-Length: 2219
Content-Transfer-Encoding: binary
Content-Type: text/x-diff; charset=us-ascii

Modified files:

Index: pkgsrc/textproc/py-pdfrw/Makefile
diff -u pkgsrc/textproc/py-pdfrw/Makefile:1.4 pkgsrc/textproc/py-pdfrw/Makefile:1.5
--- pkgsrc/textproc/py-pdfrw/Makefile:1.4	Tue Jan 4 20:55:02 2022
+++ pkgsrc/textproc/py-pdfrw/Makefile	Thu Sep 28 22:19:36 2023
@@ -1,8 +1,8 @@
-# $NetBSD: Makefile,v 1.4 2022/01/04 20:55:02 wiz Exp $
+# $NetBSD: Makefile,v 1.5 2023/09/28 22:19:36 joerg Exp $
 
 DISTNAME=	pdfrw-0.4
 PKGNAME=	${PYPKGPREFIX}-${DISTNAME}
-PKGREVISION=	1
+PKGREVISION=	2
 CATEGORIES=	textproc python
 MASTER_SITES=	${MASTER_SITE_PYPI:=p/pdfrw/}
 

Added files:

Index: pkgsrc/textproc/py-pdfrw/patches/patch-pdfrw_pdfreader.py
diff -u /dev/null pkgsrc/textproc/py-pdfrw/patches/patch-pdfrw_pdfreader.py:1.1
--- /dev/null	Thu Sep 28 22:19:37 2023
+++ pkgsrc/textproc/py-pdfrw/patches/patch-pdfrw_pdfreader.py	Thu Sep 28 22:19:37 2023
@@ -0,0 +1,30 @@
+$NetBSD: patch-pdfrw_pdfreader.py,v 1.1 2023/09/28 22:19:37 joerg Exp $
+
+Handle the case where the xref index starts with the free list, even if
+it is supposed to be at a non-zero offset.
+
+--- pdfrw/pdfreader.py.orig	2023-08-31 20:47:41.383788598 +0000
++++ pdfrw/pdfreader.py
+@@ -408,7 +408,9 @@ class PdfReader(PdfDict):
+                 if tok == 'trailer':
+                     return
+                 startobj = int(tok)
+-                for objnum in range(startobj, startobj + int(next())):
++                objnum = startobj
++                lastobj = int(next())
++                while objnum < startobj + lastobj:
+                     offset = int(next())
+                     generation = int(next())
+                     inuse = next()
+@@ -417,6 +419,11 @@ class PdfReader(PdfDict):
+                         setdefault((objnum, generation), offset)
+                     elif inuse != 'f':
+                         raise ValueError
++                    elif startobj and objnum==startobj and offset == 0 and generation == 65535:
++                        startobj = 0
++                        objnum = startobj
++                        log.warning('Invalid first object in non-zero-offset xref table, offset ignored')
++                    objnum += 1
+         except:
+             pass
+         try:
--_----------=_1695939577233550--
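
For readers who want to see the effect of the patch outside the diff, the
following is a minimal standalone sketch of the patched loop, not the pdfrw
code itself. The function name parse_xref_subsections and its token-list
input are assumptions made for illustration; pdfrw reads tokens from its own
tokenizer and stores offsets on the source object. The sketch assumes the
usual convention that the head of the free list is object 0 with generation
65535, so a subsection that claims a non-zero start but opens with such an
entry is renumbered from 0.

```python
import logging

log = logging.getLogger(__name__)


def parse_xref_subsections(tokens):
    """Return {(objnum, generation): offset} for a classic xref table.

    `tokens` (hypothetical input format) is the sequence of
    whitespace-separated tokens that follow the `xref` keyword, up to
    and including `trailer`.
    """
    offsets = {}
    it = iter(tokens)
    for tok in it:
        if tok == 'trailer':
            break
        startobj = int(tok)      # first object number of the subsection
        count = int(next(it))    # number of entries in the subsection
        objnum = startobj
        while objnum < startobj + count:
            offset = int(next(it))
            generation = int(next(it))
            inuse = next(it)
            if inuse == 'n':
                if offset != 0:
                    offsets.setdefault((objnum, generation), offset)
            elif inuse != 'f':
                raise ValueError('unexpected xref entry type: %r' % inuse)
            elif (startobj and objnum == startobj
                  and offset == 0 and generation == 65535):
                # The free-list head belongs to object 0, so the stated
                # start offset is wrong: renumber the subsection from 0.
                startobj = 0
                objnum = startobj
                log.warning('Invalid first object in non-zero-offset '
                            'xref table, offset ignored')
            objnum += 1
    return offsets
```

With this sketch, a broken subsection header "1 3" whose first entry is
"0000000000 65535 f" yields the same object numbers as the well-formed
header "0 3" would, which is the behaviour the patch is meant to restore.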