[Therion] Creating PDF/A map files
Benedikt Hallinger
beni at hallinger.org
Thu Oct 22 15:43:02 CEST 2020
Hi,
installation of veraPDF was straightforward.
For a test run I get this:
<details passedRules="96" failedRules="7" passedChecks="25551"
failedChecks="1846">
This will be a long road.
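
To put the counts from the test run above into perspective, here is a minimal Python sketch that turns them into failure ratios. It assumes the summary is available as a single self-closed <details .../> element (the snippet above is only the opening tag of the report element); the function name summarize is made up for illustration and is not part of veraPDF.

```python
import xml.etree.ElementTree as ET


def summarize(details_xml: str) -> dict:
    """Return totals and failure ratios from a veraPDF-style <details> element.

    Hypothetical helper: only reads the four count attributes shown in the
    test run above and assumes at least one rule and one check were run.
    """
    elem = ET.fromstring(details_xml)
    names = ("passedRules", "failedRules", "passedChecks", "failedChecks")
    counts = {name: int(elem.attrib[name]) for name in names}

    total_rules = counts["passedRules"] + counts["failedRules"]
    total_checks = counts["passedChecks"] + counts["failedChecks"]
    return {
        "total_rules": total_rules,
        "total_checks": total_checks,
        "failed_rule_ratio": counts["failedRules"] / total_rules,
        "failed_check_ratio": counts["failedChecks"] / total_checks,
    }


if __name__ == "__main__":
    # Self-closed copy of the element from the test run (an assumption,
    # so that the fragment parses as well-formed XML on its own).
    sample = ('<details passedRules="96" failedRules="7" '
              'passedChecks="25551" failedChecks="1846"/>')
    result = summarize(sample)
    print(f"rules:  {result['failed_rule_ratio']:.1%} "
          f"of {result['total_rules']} failed")
    print(f"checks: {result['failed_check_ratio']:.1%} "
          f"of {result['total_checks']} failed")
```

Roughly speaking, only a small share of the rules fail, but each failing rule can be hit many times in one file, which is why the number of failed checks is so much larger than the number of failed rules.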
But your idea is quite good, as it will preserve valuable work.
OTOH the PDFs are just results; what is really valuable is the
source files. And those are already well suited for archiving, as they
are just text files!
Therion should hopefully keep running for a long time into the future.
And as it's open source, nothing prevents anyone from fixing bugs
in the future to make it work again...
On 2020-10-22 15:17, Bill Gee wrote:
> Hello everyone -
>
> I propose a new feature for Therion. This will probably take some
> work, and I am sure there will be discussion about how to implement
> it.
>
> It seems to me that the maps we produce with Therion are likely going
> to be stored for a very long time, perhaps running into multiple tens
> of years. As we all know, computer technology over that amount of time
> will change drastically. Just think about the contrast in both
> hardware and software in the last 25 years - from Windows 95 running
> on 486dx processors to Linux and Windows 10 running on i7 and i9
> processors.
>
> I think we have some obligation to make sure the cave maps we generate
> are still usable many years from now. Saving them in PDF format is a
> large - but incomplete - step in that direction.
>
> The new feature I propose is to modify the PDF creation code so that
> it produces files that are PDF/A version 1b (or possibly version 2)
> compliant.
>
> https://en.wikipedia.org/wiki/PDF/A [1]
>
> I have checked all of the PDF files I created in Therion, and none of
> them are flagged as PDF/A compliant. It is possible that they are, in
> fact, compliant and simply do not have the necessary flag. The experts
> can check that against the PDF/A specifications.
>
> Existing PDF documents can be checked for PDF/A compliance with a
> command-line tool called "verapdf". The web site for that tool is
>
> https://openpreservation.org/products/verapdf/ [2]
>
> It is possible to use GhostScript to transform an existing PDF into a
> PDF/A file. The command line is daunting.
>
> https://www.mcbsys.com/blog/2018/10/batch-convert-pdf-to-pdf-a-2018-edition/
> [3]
>
> I tried the GhostScript conversion on one of my Therion maps.
> Immediately at startup it produced this message three times:
>
> "GPL Ghostscript 9.53.3: UTF16BE text string detected in DOCINFO
> cannot be represented in XMP for PDF/A1, reverting to normal PDF
> output"
>
> The process continued running and took about 10 minutes. The resulting
> file failed verapdf analysis. It also increased the file size from 4.3
> megabytes to over 52 megabytes! The output file displayed correctly in
> Okular.
>
> I do not have any idea how Therion produces PDF files. It probably
> uses some combination of TeX and GhostScript to get it done. The new
> feature may be as simple as adding some additional parameters to the
> command lines that call the external programs.
>
> Let the discussion begin! :-)
>
> --
>
> Bill Gee
>
>
>
> Links:
> ------
> [1] https://en.wikipedia.org/wiki/PDF/A
> [2] https://openpreservation.org/products/verapdf/
> [3]
> https://www.mcbsys.com/blog/2018/10/batch-convert-pdf-to-pdf-a-2018-edition/
> _______________________________________________
> Therion mailing list
> Therion at speleo.sk
> https://mailman.speleo.sk/listinfo/therion