Replace all font glyphs in a PDF by converting them to outline shapes


Solution 1

Yes, you can use Ghostscript to achieve what you want.

I. For Ghostscript versions up to 9.14

You need to go through 2 steps:

  1. Convert the PDF to a PostScript file, but use the side effect of a relatively unknown parameter: it is called -dNOCACHE. This will convert all used fonts to outline shapes:

    gs -o somepdf.ps -dNOCACHE -sDEVICE=pswrite somepdf.pdf
    
  2. Convert the PS back to PDF (and, optionally, delete the intermediate PS file afterwards):

    gs -o somepdf-with-outlines.pdf -sDEVICE=pdfwrite somepdf.ps
    
    rm somepdf.ps
    

This method is not reliable long-term, because the Ghostscript developers have stated that -dNOCACHE may not be present in future versions.
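
The two steps above can be wrapped in a small shell function so that the intermediate PS file is always cleaned up. This is only a sketch: `outline_via_ps` and the `GS_DRY_RUN` variable are hypothetical names introduced here for illustration, and it assumes a Ghostscript build that still provides the pswrite device and -dNOCACHE, exactly as used in the commands above.

```shell
#!/bin/sh
# Sketch only: outline_via_ps and GS_DRY_RUN are hypothetical names,
# not part of Ghostscript.
# Usage: outline_via_ps input.pdf output.pdf
outline_via_ps() {
    in_pdf=$1
    out_pdf=$2
    # Intermediate PostScript file, named after the output file.
    tmp_ps="${out_pdf%.pdf}.tmp.ps"

    if [ -n "$GS_DRY_RUN" ]; then
        # Print the commands instead of running them.
        echo "gs -o $tmp_ps -dNOCACHE -sDEVICE=pswrite $in_pdf"
        echo "gs -o $out_pdf -sDEVICE=pdfwrite $tmp_ps"
        echo "rm -f $tmp_ps"
        return 0
    fi

    # Step 1: PDF -> PS; -dNOCACHE has the side effect of outlining glyphs.
    gs -o "$tmp_ps" -dNOCACHE -sDEVICE=pswrite "$in_pdf" || return 1
    # Step 2: PS -> PDF, then remove the intermediate file either way.
    gs -o "$out_pdf" -sDEVICE=pdfwrite "$tmp_ps"
    status=$?
    rm -f "$tmp_ps"
    return $status
}
```

Setting `GS_DRY_RUN=1` before calling `outline_via_ps somepdf.pdf somepdf-with-outlines.pdf` prints the three commands without running them, which is a cheap way to check the file names first.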

Note: the resulting PDF will very likely be larger than the original. Also, unless you pass additional command line parameters, Ghostscript will re-process all images in the original PDF according to its built-in defaults, which can have unwanted side effects (for example, images may be recompressed or resampled). You can avoid these by adding command line parameters that override those defaults.


II. For Ghostscript versions 9.15 or newer

Ghostscript version 9.15 (released in September 2014) supports a new command line parameter:

 -dNoOutputFonts

This will cause the output devices pdfwrite, ps2write and eps2write "to 'flatten' glyphs into 'basic' marking operations (rather than writing fonts to the output)".

This means: the two steps described for pre-9.15 GS versions can be avoided. The desired result can be achieved with a single command:

 gs -o file-with-outlines.pdf -dNoOutputFonts -sDEVICE=pdfwrite file.pdf

Note: the same caveat applies as in part I. If your PDF includes images, the simple command line above may introduce unwanted side effects. To avoid these, you need to add more specific parameters.
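
As an illustration of what "more specific parameters" can look like, here is a sketch that combines -dNoOutputFonts with pdfwrite's image-related switches so that images are neither downsampled nor automatically re-encoded as JPEG. The switches themselves are standard pdfwrite parameters, but whether this particular set is the right one for a given file is an assumption you should verify on your own documents. `outline_keep_images` and `GS_DRY_RUN` are hypothetical names used only here.

```shell
#!/bin/sh
# Sketch only: outline_keep_images and GS_DRY_RUN are hypothetical names.
# Usage: outline_keep_images input.pdf output.pdf
outline_keep_images() {
    in_pdf=$1
    out_pdf=$2
    # Build the argument list once; "set --" only affects this function.
    set -- \
        -o "$out_pdf" \
        -sDEVICE=pdfwrite \
        -dNoOutputFonts \
        -dDownsampleColorImages=false \
        -dDownsampleGrayImages=false \
        -dDownsampleMonoImages=false \
        -dAutoFilterColorImages=false \
        -dAutoFilterGrayImages=false \
        -dColorImageFilter=/FlateEncode \
        -dGrayImageFilter=/FlateEncode \
        "$in_pdf"
    if [ -n "$GS_DRY_RUN" ]; then
        # Print the command instead of running it.
        echo "gs $*"
        return 0
    fi
    gs "$@"
}
```

Lossless Flate compression keeps image data intact but can make the output noticeably larger than JPEG would, so this trades file size for fidelity.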

Solution 2

This commit adds a new switch -dNoOutputFonts to the Ghostscript pdfwrite and ps2write devices which will produce a PDF file (or PostScript, depending on the selected device) where all the glyphs have been created as vectors, not as text.

You will need at least version 9.15 of Ghostscript to get this feature. Be aware that the PDF file will almost certainly be larger and copy/paste/search will (obviously) not work.
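
Both points can be checked from a script. The sketch below compares a version string such as the one printed by `gs --version` against 9.15, and counts the fonts reported by `pdffonts` (a separate tool from Poppler/Xpdf, assumed to be installed) to confirm that no fonts are left in the output. `version_at_least` and `count_fonts` are hypothetical helper names, and the parsing assumes that `pdffonts` prints its usual two header lines before the per-font rows.

```shell
#!/bin/sh
# Sketch only: version_at_least and count_fonts are hypothetical helpers.

# version_at_least HAVE WANT
# Succeeds if HAVE >= WANT, comparing major and minor numbers only.
version_at_least() {
    have_major=${1%%.*}
    have_rest=${1#*.}
    have_minor=${have_rest%%.*}
    want_major=${2%%.*}
    want_rest=${2#*.}
    want_minor=${want_rest%%.*}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] &&
          [ "$have_minor" -ge "$want_minor" ]; }
}

# count_fonts
# Reads pdffonts output on stdin and prints the number of font rows,
# skipping the two header lines and any blank lines.
count_fonts() {
    awk 'NR > 2 && NF { n++ } END { print n + 0 }'
}

# Example use (requires gs and pdffonts on the PATH):
#   version_at_least "$(gs --version)" 9.15 || echo "Ghostscript too old" >&2
#   pdffonts file-with-outlines.pdf | count_fonts
```

A font count of 0 for the converted file suggests that every glyph was written as vector outlines rather than as text.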


Author by

Szabolcs

Updated on June 24, 2022

Comments

  • Szabolcs
    Szabolcs almost 2 years

    I am looking for a way to 'outline' all text/fonts in a PDF file, i.e. convert them to curves.

    I would prefer to do this without having to convert the PDF to PostScript and back. Also, I would like to use free lightweight cross-platform tools that can be automated from the command line, such as Ghostscript or MuPDF.

    • Szabolcs
      Szabolcs about 9 years
      LaTeXiT can do this and I believe it uses GhostScript (not sure). I tried to dig through the source and find how it does it but didn't succeed.
    • KenS
      KenS about 9 years
      Ghostscript can do this, now, but it couldn't readily do so previously (you would have had to go via PostScript). I've added the information as an answer below.
    • tejasvi88
      tejasvi88 about 2 years
      PDF-TEXT-To-Outlines with adblocker seems to work well for one off privacy insensitive documents.
    • Szabolcs
      Szabolcs about 2 years
      @tejasvi88 However, it is not a command line tool that can easily be automated, which is what I was looking for.
  • Kaviraj Kanagaraj
    Kaviraj Kanagaraj over 8 years
    Hey Kurt, I have created a photobook PDF with images, captions and emojis, and I need to print it. What is the ideal way to convert any photobook PDF to a "print-ready" PDF format? What are the options to use in Ghostscript? Can you guide me or point to some resources? Thanks a lot in advance. I tried outlining the fonts in my photobook PDF with the command you mentioned in this answer, and it works fine. But since this PDF contains images, emojis and text, I am not sure whether that is the exact command to use, or whether I need some extra options in the long run.
  • Libin Wen
    Libin Wen over 4 years
    @Kurt, nice answer, you really should add the link to another answer by you, about how to keep the raster image resolution: superuser.com/a/373740/207447
  • samm
    samm almost 4 years
    Add a related document reference for -dNoOutputFonts. But note the new output PDF created by Ghostscript is not necessarily much more "intelligent" (overall smaller, better optimized files from bloated input PDF) with default settings. See also How to remove duplicate objects in PDF using ghostscript?
  • samm
    samm almost 4 years
    Yes, I tested, and I found that the larger size was not caused only by converting fonts to outline shapes/vectors/curves. For example, I had a PDF with one watermark image embedded and referenced indirectly on each page. After running Ghostscript, I inspected the output PDF with itext-rups-7.1.11.jar and found that it contained duplicated images on each page:

    ```
    Pages:
    ...
    Page 3 124 0 R => Image Stream
    Page 4 171 0 R => Image Stream
    ...
    XRef:
    ...
    124 => Image Stream
    171 => Image Stream
    ...
    ```
  • KenS
    KenS almost 4 years
    The comment above doesn't seem to have anything to do with the original question or answer. samm, if you have a problem, please start a new question. For other readers: Ghostscript's pdfwrite device (by default) will hash all images, and only use one if they are identical. Of course samm has not provided an input file, a command line, an output file or even information on which OS or version of Ghostscript was used, which makes it impossible to investigate or comment.
  • samm
    samm almost 4 years
    Well, it seems to have little to do with converting texts to curves without fonts embedded. I just wanted to add a note about larger size of the output PDF file if someone is concerned with the size. I used gs v9.52 on windows 10 by ` gs -o book.vectored.pdf -dNoOutputFonts -sDEVICE=pdfwrite book.optimized.pdf` and the pdf had 300+ of pages. I used the same optimization algorithm to book.vectored.pdf as was used to book.optimized.pdf, I could reduce the size by 10 MB.
  • Szabolcs
    Szabolcs almost 3 years
    Solution "II" from the accepted answer does work in Ghostscript 9.54 just as before (I use it regularly). The other answers did not rely on GSView. I am not sure what issue your answer is trying to address.
  • Supernuija
    Supernuija almost 3 years
    I did try that solution, but for some reason some specific fonts still had errors in them (some deformed characters, as if some vertices or control vectors were missing). These were fixed only by first printing to PS with Windows 10's own driver, and then converting that to EPS. I have used Ghostscript for decades to fix all kinds of odd visual errors in vector file conversions; it's a great tool! GSView just made it super easy to use, since it had a graphical UI, and that's no longer available.
  • Szabolcs
    Szabolcs almost 3 years
    It will be helpful to readers if you explain (within the answer itself) what problem your solution is meant to address.