Editing PDF via text editor
Using Notepad++ instead of Sublime Text solved the issue.
Apparently, Sublime Text was changing the file on save even though I hadn't edited anything.
Author: Draex_
Updated on September 18, 2022
Comments
-
Draex_ (over 1 year ago):
I'm trying to add page labels to a PDF file by modifying the file directly with a text editor.
When I open the PDF in a text editor and save it, without making any changes, the file becomes corrupted and can't be opened by Adobe Reader.
Why does this happen?
The only solution I can think of is to use a hex editor, but that doesn't seem like a comfortable way of working with files. Is there any other way?
I use Sublime Text as my text editor.
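One way to confirm that an editor really altered a file on an "unchanged" save is to compare the two copies byte by byte. The following is a minimal sketch, not part of the original question: `first_difference` is a hypothetical helper, and the sample bytes (including the prepended UTF-8 BOM) are invented for illustration.

```python
from itertools import zip_longest


def first_difference(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    # zip_longest pads the shorter input with None, so a length
    # difference is reported at the end of the shorter input.
    for offset, (x, y) in enumerate(zip_longest(a, b)):
        if x != y:
            return offset
    return None


# Invented sample data: the start of a PDF, and a copy that an editor
# is assumed to have re-saved with a UTF-8 BOM in front.
original = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n1 0 obj\n"
resaved = b"\xef\xbb\xbf" + original

print(first_difference(original, original))
print(first_difference(original, resaved))
```

For real files, the two byte strings would come from something like `open(path, "rb").read()`; reading in binary mode matters, since text mode would apply its own decoding.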
-
Admin (over 7 years ago): The problem is probably related to the text encoding. You should check which encoding the text editor defaults to and change it if necessary.
-
Admin (over 7 years ago): I've tried using several encodings with no success. Which encoding should I use? The file is mostly binary. However, since I'm not changing the file, I don't understand why encoding matters.
-
Admin (over 7 years ago): Well, PDFs aren't designed to be edited this way anyway, but if your text editor attempts to change the encoding, it just makes matters worse. Have you tried using Notepad++ instead? If I open a PDF in it and save it, the file still seems to work.
-
Admin (over 7 years ago): The question is not "which encoding should I use". The point is that your text editor probably assumes the PDF's binary data is text in some particular encoding and makes changes that are valid for that encoding (like adding a BOM) but invalid for the PDF's binary data. So your text editor does make changes just by opening and re-saving the file. Fix the problem by using a text editor that doesn't do that. The next problem is that editing the file will invalidate the xref table, so you'll need to recompute it.
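The effect described in this comment can be reproduced without any editor. Below is a minimal sketch, assuming the editor decodes the file to text on open and encodes it again on save; `round_trip` is a hypothetical helper, and the 256-byte input is only a stand-in for binary stream data inside a PDF.

```python
def round_trip(data: bytes, encoding: str, errors: str = "strict") -> bytes:
    """Decode bytes to text and encode them back, as an editor might on open and save."""
    return data.decode(encoding, errors=errors).encode(encoding)


# Stand-in for binary content: every possible byte value once.
binary = bytes(range(256))

# Latin-1 maps each byte to exactly one code point, so nothing is lost.
print(round_trip(binary, "latin-1") == binary)

# Most byte values above 0x7F are not valid UTF-8 on their own. With
# errors="replace" each invalid sequence becomes U+FFFD, which encodes
# to three bytes, so both the content and the length change.
damaged = round_trip(binary, "utf-8", errors="replace")
print(damaged == binary, len(binary), len(damaged))
```

This is why a single-byte encoding such as Latin-1 tends to survive an open-and-save cycle while UTF-8 does not, though an editor may still alter line endings or whitespace regardless of the encoding chosen.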
-
Admin (over 7 years ago): Thanks guys, using Notepad++ solves the issue. @dirkt Even though I didn't touch the xref table, the document opens okay. Any idea why? The xref table should contain the byte offsets of the objects in the file, right? The positions of the objects have now changed.
-
Admin (over 7 years ago): Some viewers repair the xref table automatically if they detect that it's corrupt, some don't. I'm on Linux and mainly use xpdf and mupdf, so I can't tell you what Windows viewers do. But if the positions of the objects changed, the xref table is corrupt and should be regenerated if you want a standard-conforming file.
-
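Whether the byte offsets mentioned in the comments above have gone stale can be checked directly. The sketch below is an illustration under stated assumptions, not a full PDF parser: it only handles a single classic `xref` table (no cross-reference streams, no incremental updates), and `build_sample` and `check_xref` are hypothetical helpers operating on a tiny PDF skeleton built in memory.

```python
import re


def build_sample() -> bytes:
    """Build a tiny PDF skeleton whose xref offsets are correct."""
    objects = [
        b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n",
        b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n",
    ]
    out = b"%PDF-1.4\n"
    offsets = []
    for obj in objects:
        offsets.append(len(out))  # byte offset where this object starts
        out += obj
    xref_pos = len(out)
    out += b"xref\n0 3\n0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size 3 /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % xref_pos
    return out


def check_xref(data: bytes):
    """Return (label, declared offset, ok) for startxref and each in-use entry."""
    declared = int(re.findall(rb"startxref\s+(\d+)", data)[-1])
    # Locate the real table by keyword, skipping the "xref" inside "startxref".
    actual = [m.start() for m in re.finditer(rb"(?<!start)xref\b", data)][-1]
    results = [("xref", declared, declared == actual)]

    tokens = data[actual + 4:].split()
    i = 0
    while tokens[i] != b"trailer":
        first, count = int(tokens[i]), int(tokens[i + 1])
        i += 2
        for num in range(first, first + count):
            offset, gen, kind = int(tokens[i]), int(tokens[i + 1]), tokens[i + 2]
            i += 3
            if kind == b"n":  # "f" marks a free entry with no object to check
                header = rb"%d\s+%d\s+obj" % (num, gen)
                ok = re.match(header, data[offset:offset + 32]) is not None
                results.append(("obj %d" % num, offset, ok))
    return results


sample = build_sample()
# Simulate a hand edit: inserting bytes into object 1 shifts everything after it.
edited = sample.replace(b"/Type /Catalog", b"/Type /Catalog /Lang (en)", 1)

for label, offset, ok in check_xref(edited):
    print(label, offset, "ok" if ok else "STALE")
```

In this sample, object 1 still starts at its recorded offset, while object 2 and the table itself have moved. A viewer that rebuilds the table on its own would still open such a file, which matches what was observed in the thread.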
Admin (over 6 years ago): One thing that can happen: your editor may strip trailing spaces when you save the file, for instance, which can make the PDF no longer valid. (Happened to me just now.)
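One concrete place where trailing spaces matter is the classic xref table: the PDF specification requires each entry to be exactly 20 bytes including its end-of-line marker, so an entry terminated by a single line feed carries a space before it. A minimal sketch of what a "trim trailing whitespace on save" option does to such an entry follows; `strip_trailing_spaces` is a hypothetical stand-in for the editor's behaviour, not any editor's actual code.

```python
def strip_trailing_spaces(data: bytes) -> bytes:
    """Mimic an editor option that trims trailing spaces and tabs on every line."""
    return b"\n".join(line.rstrip(b" \t") for line in data.split(b"\n"))


# One in-use xref entry: 10-digit offset, 5-digit generation, "n", space, LF.
entry = b"0000000009 00000 n \n"
trimmed = strip_trailing_spaces(entry)

print(len(entry), len(trimmed))
```

The trimmed entry is one byte short, and every later byte in the file moves up by one as well, so the recorded offsets stop matching too. Whether a given viewer tolerates this varies.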