Split each PDF page in two?

29,844

Solution 1

PDF Scissors allowed me to bulk split (crop) all pages in a PDF.

Solution 2

mutool works brilliantly for this. The example below chops each page of input.pdf into 3 horizontal and 8 vertical parts (creating 24 output pages for each input page):

mutool poster -x 3 -y 8 input.pdf output.pdf

To install mutool, just install mupdf, which is probably packaged with most GNU/Linux distributions.

(Credits to marttt.)

On Debian-based Linux systems such as Ubuntu, you can install it with:

sudo apt install mupdf
sudo apt install mupdf-tools
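
Since the question is about a large number of files with two stacked slides per page, the command can be wrapped in a short script. This is only a sketch: poster_command and split_all are hypothetical helper names, the "-split" output suffix is an arbitrary choice, and it assumes mutool is on the PATH. It uses -x 1 -y 2 to cut each page into a top and a bottom half; as one of the comments notes, the order of the resulting pages is worth checking on a sample file first.

```python
import subprocess
from pathlib import Path


def poster_command(src, dst, x=1, y=2):
    """Build the argument list for `mutool poster`.

    x and y are the horizontal and vertical split counts, so each
    input page becomes x * y output pages.
    """
    return ["mutool", "poster", "-x", str(x), "-y", str(y), str(src), str(dst)]


def split_all(folder, x=1, y=2):
    """Run mutool on every PDF in `folder` (hypothetical helper)."""
    for src in sorted(Path(folder).glob("*.pdf")):
        if src.stem.endswith("-split"):
            continue  # skip files produced by an earlier run
        dst = src.with_name(src.stem + "-split.pdf")
        # check=True raises CalledProcessError if mutool exits with an error
        subprocess.run(poster_command(src, dst, x, y), check=True)
```

Calling split_all("slides") would then write slides/lecture1-split.pdf next to slides/lecture1.pdf, and so on for each file.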

Solution 3

Briss is "a simple cross-platform (Linux, Windows, Mac OSX) application for cropping PDF files. A simple user interface lets you define exactly the crop-region by fitting a rectangle on the visually overlaid pages." It's open source (GPL).

Works well for me. The GUI is minimal, but functional. It can also be used from the command line.

Solution 4

You can use a Python library called PyPDF. This function will split double pages no matter what the page orientation is:

import copy
import math
import pyPdf

def split_pages(src, dst):
    # open() works where the Python 2-only file() builtin does not
    src_f = open(src, 'rb')
    dst_f = open(dst, 'wb')

    input = pyPdf.PdfFileReader(src_f)
    output = pyPdf.PdfFileWriter()

    for i in range(input.getNumPages()):
        p = input.getPage(i)
        q = copy.copy(p)
        q.mediaBox = copy.copy(p.mediaBox)

        x1, x2 = p.mediaBox.lowerLeft
        x3, x4 = p.mediaBox.upperRight

        x1, x2 = math.floor(x1), math.floor(x2)
        x3, x4 = math.floor(x3), math.floor(x4)
        x5, x6 = math.floor(x3/2), math.floor(x4/2)

        if x3 > x4:
            # horizontal
            p.mediaBox.upperRight = (x5, x4)
            p.mediaBox.lowerLeft = (x1, x2)

            q.mediaBox.upperRight = (x3, x4)
            q.mediaBox.lowerLeft = (x5, x2)
        else:
            # vertical
            p.mediaBox.upperRight = (x3, x4)
            p.mediaBox.lowerLeft = (x1, x6)

            q.mediaBox.upperRight = (x3, x6)
            q.mediaBox.lowerLeft = (x1, x2)

        output.addPage(p)
        output.addPage(q)

    output.write(dst_f)
    src_f.close()
    dst_f.close()

Solution 5

Thanks to Matt Gumbley for his Python Script. I have modified that Python script such that it now also works with PDFs that contain portrait and landscape pages and cropped pages:

# -*- coding: utf-8 -*-
"""
Created on Thu Feb 26 08:49:39 2015

@author: Matt Gumbley  (stackoverflow)
changed by Hanspeter Schmid to deal with already cropped pages
"""

import copy
import math
from PyPDF2 import PdfFileReader, PdfFileWriter

def split_pages2(src, dst):
    # file() does not exist in Python 3; open() works in both Python 2 and 3
    src_f = open(src, 'rb')
    dst_f = open(dst, 'wb')

    input = PdfFileReader(src_f)
    output = PdfFileWriter()

    for i in range(input.getNumPages()):
        # make two copies of the input page
        pp = input.getPage(i)
        p = copy.copy(pp)
        q = copy.copy(pp)

        # the new media boxes are the previous crop boxes
        p.mediaBox = copy.copy(p.cropBox)
        q.mediaBox = copy.copy(p.cropBox)

        x1, x2 = p.mediaBox.lowerLeft
        x3, x4 = p.mediaBox.upperRight

        x1, x2 = math.floor(x1), math.floor(x2)
        x3, x4 = math.floor(x3), math.floor(x4)
        x5, x6 = x1+math.floor((x3-x1)/2), x2+math.floor((x4-x2)/2)

        if (x3-x1) > (x4-x2):
            # horizontal
            q.mediaBox.upperRight = (x5, x4)
            q.mediaBox.lowerLeft = (x1, x2)

            p.mediaBox.upperRight = (x3, x4)
            p.mediaBox.lowerLeft = (x5, x2)
        else:
            # vertical: q is the top half because q is added to the output first
            q.mediaBox.upperRight = (x3, x4)
            q.mediaBox.lowerLeft = (x1, x6)

            p.mediaBox.upperRight = (x3, x6)
            p.mediaBox.lowerLeft = (x1, x2)


        p.artBox = p.mediaBox
        p.bleedBox = p.mediaBox
        p.cropBox = p.mediaBox

        q.artBox = q.mediaBox
        q.bleedBox = q.mediaBox
        q.cropBox = q.mediaBox

        output.addPage(q)
        output.addPage(p)


    output.write(dst_f)
    src_f.close()
    dst_f.close()
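
The box arithmetic in this script can be looked at separately from the PDF library. Below is a minimal sketch of just the geometry, under a few assumptions of mine: split_box is a hypothetical helper name, boxes are plain (left, bottom, right, top) tuples in PDF coordinates (origin at the lower left, y pointing up), and the midpoint is kept fractional instead of being rounded with math.floor. It returns the two halves in reading order, measuring from the box's own lower-left corner so that already cropped pages are handled too.

```python
def split_box(left, bottom, right, top):
    """Split a box into two halves, returned in reading order.

    Each box is a (left, bottom, right, top) tuple in PDF coordinates.
    """
    width = right - left
    height = top - bottom
    if width > height:
        # wider than tall: cut down the middle, left half first
        mid = left + width / 2
        return (left, bottom, mid, top), (mid, bottom, right, top)
    # taller than wide (or square): cut across the middle, top half first
    mid = bottom + height / 2
    return (left, mid, right, top), (left, bottom, right, mid)


# A portrait page of 595 x 842 points (roughly A4) starting at the origin
first, second = split_box(0, 0, 595, 842)
print(first)   # (0, 421.0, 595, 842)
print(second)  # (0, 0, 595, 421.0)
```

For a cropped landscape box such as (100, 50, 900, 450) the cut lands at x = 500, halfway between the box's own edges, not at half of the right-hand coordinate.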
stackoverflowuser95
Author by

stackoverflowuser95

Updated on June 02, 2020

Comments

  • stackoverflowuser95
    stackoverflowuser95 about 4 years

    I have a large number of PDF files which have two slides to a page (for printing).

    The format is A4 pages each with two slides setup like so:

    -----------
    | slide 1 |
    -----------
    | slide 2 |
    -----------
    

    How can I generate a new PDF file with one slide per page?

    Happy to use GUI, CLI, scripts or even interface with a language's PDF library; but I do need the text on the slides to still be selectable.

  • stackoverflowuser95
    stackoverflowuser95 almost 11 years
    They had to go ahead and call it Bris didn't they >.<! - Cheers, I'll check it out :)
  • Magnetic_dud
    Magnetic_dud over 9 years
    Promising, but needs Java :(
  • martz
    martz over 9 years
    any idea why this is producing an empty file with python 2.7 and pypdf 1.13? thanks!
  • Nick
    Nick over 9 years
    @TobiasKienzler I fixed the dead link :)
  • Waldir Leoncio
    Waldir Leoncio about 8 years
    mutool is a very powerful little program. I recommend taking a look at its manual page (manpages.ubuntu.com/manpages/trusty/man1/mutool.1.html).
  • Jose Serodio
    Jose Serodio over 7 years
    Love you man, it worked! I had to split my pages in 4, and this just did the trick!
  • David Magalhães
    David Magalhães over 6 years
    Use this! Works like a charm!
  • kaslusimoes
    kaslusimoes over 6 years
    When I tried using this command (in my case mutool poster -y 2 input.pdf output.pdf) the resulting output.pdf had its pages sorted incorrectly. Is there any way to fix this? I couldn't find it in the man page
  • Skippy le Grand Gourou
    Skippy le Grand Gourou over 6 years
    @kaslusimoes I guess your input file orientation is not consistent with mutool expectations, in which case you could just use e.g. pdf180 (from pdfjam) on your input file before feeding mutool with it. If on the other hand your input file ordering is more complicated, you could just resort pages manually with mutool clean input.pdf output.pdf 2,1,3,4,6,5 where 2,1,3,4,6,5 is the new ordering you'd like.
  • xilopaint
    xilopaint over 4 years
    Why is math.floor() needed here?
  • Andrej Shulaev
    Andrej Shulaev about 3 years
    There is a mismatch in the insertion order of the created pages. It should be output.addPage(p) and then output.addPage(q).
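
For the reordering suggested in the comments (passing a page list such as 2,1,3,4,6,5 to mutool clean), typing the list by hand gets tedious for long files. If the problem is simply that every pair of halves came out swapped, the list can be generated. A small sketch, where swap_pairs_order is a hypothetical helper name and the "swap every pair" pattern is an assumption about what went wrong:

```python
def swap_pairs_order(n_pages):
    """Return a page list like "2,1,4,3" that swaps each pair of pages."""
    order = []
    for first in range(1, n_pages + 1, 2):
        second = first + 1
        if second <= n_pages:
            order.extend([second, first])
        else:
            order.append(first)  # odd page count: the last page stays in place
    return ",".join(str(p) for p in order)


print(swap_pairs_order(6))  # 2,1,4,3,6,5
```

The resulting string can be pasted as the last argument of mutool clean input.pdf output.pdf.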