text to pdf with utf8 encoding (alternative to a2ps)

Solution 1

If you specifically want to use a2ps or enscript (a similar tool), and your only requirement is to feed them a UTF-8 document, you only have to convert the document to ISO-8859-1 or another supported encoding first. Various tools can do this. For instance, here is a workflow for enscript (the same approach should work with a2ps):

iconv -c -f utf-8 -t ISO-8859-1 document.txt | enscript -o document.ps

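But you may lose some characters during the conversion, because such encodings cover a much smaller range than UTF-8: the -c option makes iconv silently drop every character that has no equivalent in the target encoding.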
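
To find out beforehand whether a document would lose characters, one option is to run the conversion without -c and look only at the exit status. The following is a minimal sketch, not part of the original workflow; the function name latin1_safe is made up for illustration.

```shell
# Hypothetical helper: succeeds only if the file given as $1 is valid UTF-8
# and every character in it has an equivalent in ISO-8859-1.
latin1_safe() {
    # Without -c, iconv stops with a non-zero exit status at the first
    # character it cannot convert. The converted output is discarded here.
    iconv -f UTF-8 -t ISO-8859-1 "$1" >/dev/null 2>&1
}

# Possible use:
#   if latin1_safe document.txt; then
#       iconv -f UTF-8 -t ISO-8859-1 document.txt | enscript -o document.ps
#   else
#       echo "document.txt has characters outside ISO-8859-1" >&2
#   fi
```

If the check fails, the document needs one of the UTF-8 aware tools instead.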

On the other hand, if UTF-8 is a requirement, you will have to look for a more recent tool that can convert UTF-8 text to PDF directly. I wrote a Python program called txt2pdf for this purpose; you may find it here. Also have a look at tools like pandoc, gimli, rst2pdf or wkhtmltopdf.

Solution 2

I've gotten acceptable results (for printing code listings) from https://github.com/arsv/u2ps

Solution 3

Use paps! For instance, I use it as follows:

paps --font="Monospace 10" input.txt > output.ps  

and I have no problems with UTF-8 encoding. If you need a PDF file, then run

ps2pdf output.ps
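
The two steps can also be chained so that no intermediate PostScript file is written. This is only a sketch, assuming paps and Ghostscript's ps2pdf are both on the PATH; the function name txt_to_pdf is made up for illustration.

```shell
# Hypothetical helper: convert a UTF-8 text file ($1) to a PDF file ($2).
txt_to_pdf() {
    # paps writes PostScript to stdout; "-" makes ps2pdf read it from stdin.
    paps --font="Monospace 10" "$1" | ps2pdf - "$2"
}

# Possible use:
#   txt_to_pdf input.txt output.pdf
```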

Solution 4

https://gitlab.com/gnomify/u2ps is the replacement for gnome-u2ps.

Solution 5

If the text file is small, paps can convert the text to PostScript, which can then be fed to ps2pdf. The problem is that the PostScript file from paps causes ps2pdf to create a very big PDF file. If that is acceptable, this approach works. Currently, I am getting large PDF files from paps.

Author by

guettli

http://thomas-guettler.de/ Working out loud: https://github.com/guettli/wol

Updated on June 05, 2022

Comments

  • guettli
    guettli almost 2 years

    The program a2ps does not support UTF-8. At least my version only supports the Latin-X encodings:

    a2ps --list=encoding
    

    Version:

    GNU a2ps 4.14
    

    How can I convert a simple utf-8 text to postscript or pdf?

  • Thomas Baruchel
    Thomas Baruchel almost 8 years
    I tried this tool years ago; I don't know if its current conversion engine is better, but at that time I found the result awful. If I remember correctly, outlines from a TrueType font are converted but the hinting is missing. While it may be suitable for printing the output, it is certainly not an acceptable solution for creating standalone PDF documents.
  • bortzmeyer
    bortzmeyer over 7 years
    It seems no longer maintained (it depends on obsolete libraries like gnomeprint)
  • Skippy le Grand Gourou
    Skippy le Grand Gourou almost 6 years
    Using iso-8859-1//TRANSLIT instead of plain iso-8859-1 may save you some "iconv: illegal input sequence at position XX" errors.
  • Roger House
    Roger House over 5 years
    txt2pdf seems to convert all UTF-8 characters (which are not ASCII) into small black squares.
  • Thomas Baruchel
    Thomas Baruchel over 5 years
    @RogerHouse Hi, I use txt2pdf on a daily basis for my own documents with no trouble. Are you sure you specified a correct font?
  • Roger House
    Roger House over 5 years
    @ThomasBaruchel I am using /usr/share/fonts/truetype/ubuntu-font-family/UbuntuMono-R.ttf and the result is mostly empty square boxes. Probably I should find another font, but I don't know which one. When I display my text file with cat or look at it with vim or emacs (terminal version), everything is great, but I have not found out how to successfully convert it to PDF.
  • Roger House
    Roger House over 5 years
    @ThomasBaruchel I should mention that my text document is mostly Unicode characters, not straight ASCII.
  • Thomas Baruchel
    Thomas Baruchel over 5 years
    @RogerHouse I checked with this font and didn't notice any issue. Of course the tool is intended to be used with UTF8 characters and if the glyph is in the font there should be no trouble. We have to understand what is going wrong, but certainly nothing that couldn't be fixed very quickly. Could you open an issue on Github in order to discuss more conveniently than here?
  • Roger House
    Roger House over 5 years
    @ThomasBaruchel See link for text which results in question marks.
  • guettli
    guettli over 4 years
    Yes, utf-8 is a requirement. Converting to latin1 is not a solution.
  • Abhishek Gurjar
    Abhishek Gurjar over 4 years
    Please include the essential details from the link, because the link may expire in the future.
  • Dov Grobgeld
    Dov Grobgeld over 3 years
    I'm the author of paps. The problem of large files from ps2pdf no longer exists in version 0.7.*. Furthermore, version 0.7.* can write PDF files directly, so you don't need ps2pdf anymore. Get the latest version from the git repo.
  • xebeche
    xebeche about 3 years
    Note that v0.7+ is probably preferred.
  • Leo B.
    Leo B. about 2 years
    @DovGrobgeld Just what the doctor ordered for line printer output from mainframe emulators to be sent to a contemporary printer! Thank you!