How to convert a .pdf file into a folder of images?
Solution 1
OK well, I did some more research and although tohuwawohu's method does work, I found it easier to use a program called pdftoppm to achieve what I wanted done. Since I am pretty much a layperson when it comes to using command line apps, I will do my best to explain how I got this to work for me.
- Navigate to the folder containing the .pdf you wish to edit and open a terminal there. I did this by using the sample command:
cd ~/Documents/PDF
- Let's say the file I want to edit is called Sample.pdf. What I want to do is use pdftoppm to create an image file of each page of the .pdf. Several formats can be chosen (see the man pages link above) but I prefer to use .png. The basic command looks like this:
pdftoppm -FORMAT FILENAME.pdf PREFIX
or in the example above:
pdftoppm -png Sample.pdf Sample
This command creates an image file of each page in the same folder as the original .pdf file, with names like Sample-01.png, Sample-02.png and so on. I have tried it with the -png and -jpeg options successfully. -jpg is apparently not supported.
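For larger files it can help to know that pdftoppm also takes -r to set the resolution in DPI (150 by default) and -f / -l to pick the first and last page. Here is a minimal sketch of that; the function name pdf_pages_to_png and the 300 DPI default are just my own choices for illustration, not anything pdftoppm requires:

```shell
# Hypothetical helper: render pages FIRST..LAST of a PDF as PNG files.
# Usage: pdf_pages_to_png FILE.pdf FIRST LAST [DPI]
pdf_pages_to_png() {
    local pdf="$1" first="$2" last="$3" dpi="${4:-300}"
    local prefix="${pdf%.*}"    # Sample.pdf -> Sample
    # -r sets the resolution, -f and -l the first and last page to render
    pdftoppm -png -r "$dpi" -f "$first" -l "$last" "$pdf" "$prefix"
}

# Example: pages 1 to 5 of Sample.pdf at the default 300 DPI
# pdf_pages_to_png Sample.pdf 1 5
```

A higher DPI gives sharper pages but bigger files, so it is worth trying a few pages first before converting a whole book.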
Then I just use Archive Manager by selecting all the newly-created image files, right-clicking, and choosing "Compress" from the context menu. I then choose the archive format I prefer (in this case .cbz or Comic Book Zip) and create the new archive.
Now I have a shiny new .cbz file called Sample.cbz which I can then view with my Comix reader!
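If you would rather stay in the terminal, the same archive can be made there, since a .cbz is an ordinary ZIP archive with a different extension. A rough sketch, assuming the zip program is installed and the images were created with the Sample prefix as above; make_cbz is a name I made up:

```shell
# Hypothetical helper: pack PREFIX-*.png into PREFIX.cbz.
# Usage: make_cbz PREFIX
make_cbz() {
    local prefix="$1"
    # The shell expands the glob in sorted order, so pages are added in order
    zip "${prefix}.cbz" "${prefix}"-*.png
}

# Example: creates Sample.cbz from Sample-01.png, Sample-02.png, ...
# make_cbz Sample
```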
Hopefully what I have posted above makes enough sense that someone else can learn from it. If I need to change it in any way please let me know.
Solution 2
I'm not very familiar with *.cbr / *.cbz, but it seems you'll have to combine two steps:
- Convert PDF to Images
- Compress them into a ZIP / RAR archive.
Regarding step 1, you could use ImageMagick's convert command. You can feed convert with a PDF comprising multiple pages, and convert will return each page as a single graphics file. I've tested it with a text scanned at 400 dpi, and the following command resulted in nice single JPEGs:
$ convert -verbose -colorspace RGB -interlace none -density 400 -quality 100 yourPdfFile.pdf %03d.jpeg
(credits regarding the -quality option: this forum entry)
As a result, you get 000.jpeg, 001.jpeg and so on. Just zip them into a .cbz file, and you're done.
You could even combine both steps by "concatenating" them:
$ convert -verbose -colorspace RGB -interlace none -density 400 -quality 100 yourPdfFile.pdf %03d.jpg && zip -vm comic.cbz *.jpg
(make sure that there aren't any other JPEGs in your current working directory, since using the code above, zip will move all JPEGs into the cbz file)
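One way around that caveat is to let convert write into a scratch directory, so only the freshly created pages end up in the archive. This is only a sketch under a few assumptions: mktemp is available, the function name pdf_to_cbz_with_convert is made up, and I have left out some of the options from the command above for brevity:

```shell
# Hypothetical helper: convert a PDF to a .cbz via a temporary directory.
# Usage: pdf_to_cbz_with_convert FILE.pdf OUTPUT.cbz
pdf_to_cbz_with_convert() {
    local pdf="$1" cbz="$2" workdir
    workdir="$(mktemp -d)" || return 1
    # %03d zero-pads the page number so the pages still sort after page 9;
    # zip -j stores bare file names without the temporary directory path
    convert -density 400 -quality 100 "$pdf" "$workdir/%03d.jpg" &&
        zip -j "$cbz" "$workdir"/*.jpg
    local status=$?
    rm -rf "$workdir"    # clean up whether or not the conversion worked
    return "$status"
}

# Example:
# pdf_to_cbz_with_convert yourPdfFile.pdf comic.cbz
```

Other JPEGs in the current directory are never touched, because zip only sees the files inside the scratch directory.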
Solution 3
I have written a simple bash script for exactly this purpose. You will need poppler installed first:
sudo apt-get install poppler-utils #ubuntu
brew install poppler # mac
Here is the bash script (save it as convert_to_cbz.sh):
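filename="${1%.*}"                       # input name without its extension
echo "Converting $filename to cbz"
mkdir "./$filename"                      # temporary folder for the page images
pdftoppm -jpeg "$1" "./$filename/000"    # render every page as a JPEG
zip -r9 "${filename}.cbz" "./$filename"  # pack the folder at maximum compression
rm -rf "./$filename"                     # remove the temporary folder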
To use the bash script:
chmod +x convert_to_cbz.sh
./convert_to_cbz.sh "Nintendo Official Magazine 066 (OldGameMags).pdf"
Hopefully this will be useful for someone!
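If pdftoppm turns out to be too slow or memory-hungry on a big file, pdfimages (also part of poppler-utils) may be worth a try. It extracts the images embedded in the PDF instead of rendering each page, so it only really suits scanned PDFs where every page is one image. A sketch of the same script with that swap, wrapped in a function whose name, pdf_images_to_cbz, is my own invention:

```shell
# Hypothetical variant of the script above that uses pdfimages.
# Usage: pdf_images_to_cbz FILE.pdf
pdf_images_to_cbz() {
    local pdf="$1"
    local name="${pdf%.*}"          # file name without its extension
    mkdir "./$name" || return 1     # temporary folder for the extracted images
    # -j saves JPEG-encoded images as .jpg; other images are written as PPM/PBM
    pdfimages -j "$pdf" "./$name/000"
    zip -r9 "${name}.cbz" "./$name"
    rm -rf "./$name"
}

# Example:
# pdf_images_to_cbz "Nintendo Official Magazine 066 (OldGameMags).pdf"
```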
Solution 4
Try using calibre to directly convert the .pdf to .cbr or .cbz.
Shawn
Updated on September 18, 2022
Comments
-
Shawn over 1 year
I have some .pdf files that I would like to convert to my preferred reading format of .cbr or .cbz or, if this isn't directly possible, I need to extract all pages from the .pdf as images and then compress them into my format of choice. I have only been able to save pages one at a time with Document Viewer. Obviously, I'd like to do it a little quicker. I have tried pdfsam, pdf shuffler, and pdfmod all with no luck. I am using Ubuntu 11.10.
-
tohuwawohu over 12 years
Very nice! It seems that pdftoppm is in fact easier to use than ImageMagick's convert.
-
Shawn over 12 years
Thanks for the suggestion, but for me using Calibre as a solution won't work. I installed the program and I am sorry to say that it sticks out like a sore thumb on my desktop! Also, I discovered using the pdftoppm command below is WAY faster than installing and configuring Calibre before converting.
-
Anny Igi over 12 years
@Shawn Yes, I would say that Calibre is ugly and slow, but it does do the job. I'm glad you found a better solution, though :)
-
Pankaj Badukale over 8 years
Yes, it is very helpful and easy. I just want to know: can I use letters instead of numbers in the names, like prefix-a, prefix-b, prefix-c?
-
Eric Duminil over 6 years
Wonderful, thanks. I had to change the script a bit though. pdftoppm used all my RAM and crashed my computer. Replacing the 4th line with pdfimages -j "$1" "./$filename/000" did the trick. It works fine, it's fast, there doesn't seem to be any quality loss and the cbz is slightly smaller than the original pdf. pdfimages is also included in poppler-utils.
-
Eric Duminil over 6 years
pdftoppm is extremely slow and uses all the RAM on my computer. pdfimages -j worked much better.
-
mchid about 3 years
This doesn't work. All you get is a help message when using this syntax for pdftoppm.