Generating pdf-latex with python script

Solution 1

You can start by defining the template tex file as a string:

content = r'''\documentclass{article}
\begin{document}
...
\textbf{\huge %(school)s \\}
\vspace{1cm}
\textbf{\Large %(title)s \\}
...
\end{document}
'''

Next, use argparse to accept values for the course, title, name and school:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-c', '--course')
parser.add_argument('-t', '--title')
parser.add_argument('-n', '--name')
parser.add_argument('-s', '--school', default='My U')

A bit of string formatting is all it takes to stick the args into content:

args = parser.parse_args()
content%args.__dict__
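
One caveat with % formatting: a literal % in the template has to be written as %%, and characters such as & or % inside the values are still special to TeX. Here is a minimal sketch of escaping the values before substitution. The helper name escape_latex and the "Pass mark" line are made up for illustration, and the escape table is an assumption that covers the common cases rather than every one.

```python
# Hypothetical helper (not part of the original answer): escape characters
# that are special to TeX in user-supplied values.
LATEX_SPECIAL = {
    '\\': r'\textbackslash{}',
    '&': r'\&',
    '%': r'\%',
    '$': r'\$',
    '#': r'\#',
    '_': r'\_',
    '{': r'\{',
    '}': r'\}',
    '~': r'\textasciitilde{}',
    '^': r'\textasciicircum{}',
}


def escape_latex(text):
    # Work character by character so replacement text is never escaped twice.
    return ''.join(LATEX_SPECIAL.get(ch, ch) for ch in text)


# In the template itself, a TeX percent sign (\%) is written as \%%,
# because Python's % formatting turns %% into a single %.
template = r'''\textbf{\Large %(title)s \\}
Pass mark: 50\%%
'''

values = {'title': escape_latex('P & B: 100% effort')}
page = template % values
print(page)
```

With argparse you could apply the same helper to every value, for example {k: escape_latex(v or '') for k, v in vars(args).items()}.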

After writing the content out to a file, cover.tex,

with open('cover.tex','w') as f:
    f.write(content%args.__dict__)

you could use subprocess to call pdflatex cover.tex.

proc = subprocess.Popen(['pdflatex', 'cover.tex'])
proc.communicate()

You could add an lpr command here too to add printing to the workflow.

Remove unneeded files:

os.unlink('cover.tex')
os.unlink('cover.log')

The script could then be called like this:

make_cover.py -c "Hardest Class Ever" -t "Theoretical Theory" -n Me

Putting it all together,

import argparse
import os
import subprocess

content = r'''\documentclass{article}
\begin{document}
... P \& B 
\textbf{\huge %(school)s \\}
\vspace{1cm}
\textbf{\Large %(title)s \\}
...
\end{document}
'''

parser = argparse.ArgumentParser()
parser.add_argument('-c', '--course')
parser.add_argument('-t', '--title')
parser.add_argument('-n', '--name')
parser.add_argument('-s', '--school', default='My U')

args = parser.parse_args()

with open('cover.tex','w') as f:
    f.write(content%args.__dict__)

cmd = ['pdflatex', '-interaction', 'nonstopmode', 'cover.tex']
proc = subprocess.Popen(cmd)
proc.communicate()

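retcode = proc.returncode
if retcode != 0:
    # pdflatex may leave a PDF behind after a failed run; remove it if present.
    if os.path.exists('cover.pdf'):
        os.unlink('cover.pdf')
    raise ValueError('Error {} executing command: {}'.format(retcode, ' '.join(cmd)))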

os.unlink('cover.tex')
os.unlink('cover.log')
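
If the script runs somewhere you can't watch the console (a remote server, say), one option is to capture pdflatex's output and build everything in a scratch directory. This is only a sketch under some assumptions: it needs Python 3.7+ for capture_output, and the function names pdflatex_command and render_pdf are invented for this example.

```python
import os
import shutil
import subprocess
import tempfile


def pdflatex_command(tex_name):
    # nonstopmode keeps pdflatex from waiting for keyboard input on errors.
    return ['pdflatex', '-interaction', 'nonstopmode', tex_name]


def render_pdf(tex_source, pdf_path):
    """Compile tex_source in a scratch directory and copy the PDF to pdf_path."""
    if shutil.which('pdflatex') is None:
        raise RuntimeError('pdflatex was not found on PATH')
    with tempfile.TemporaryDirectory() as workdir:
        with open(os.path.join(workdir, 'cover.tex'), 'w') as f:
            f.write(tex_source)
        result = subprocess.run(pdflatex_command('cover.tex'), cwd=workdir,
                                capture_output=True, text=True,
                                errors='replace')
        built = os.path.join(workdir, 'cover.pdf')
        if result.returncode != 0 or not os.path.exists(built):
            # Pass on the tail of pdflatex's output so the caller can log it.
            raise RuntimeError('pdflatex failed:\n' + result.stdout[-2000:])
        shutil.copy(built, pdf_path)
    return pdf_path
```

It would be called as render_pdf(content % args.__dict__, 'cover.pdf'). The .tex, .log and .aux files disappear along with the temporary directory, so no separate cleanup is needed.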

Solution 2

There are of course templating systems like Jinja, but they're probably overkill for what you're asking. You can also format the page using RST and use that to generate LaTeX, but again that's probably overkill. Heck, auto-generating the page is probably overkill for the number of fields you've got to define, but since when did overkill stop us! :)

I've done something similar with Python's string formatting. Take your LaTeX document and "tokenize" it by placing %(placeholder_name)s tokens where the values should go. For example, where you want your course name to go, use %(course_name)s:

\textbf{\Large "%(homework_title)s" \\}
\vspace{1cm}
\textbf{\Large "%(course_name)s" \\}

Then, from Python, you can load in that template and format it as:

with open('template.tex') as f:
    template = f.read()
page = template % {'course_name': 'Computer Science 500',
                   'homework_title': 'NP-Complete'}
with open('result.tex', 'w') as f:
    f.write(page)

If you want to find those tokens automatically, the following should do pretty well:

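import re
import subprocess

with open('template.tex') as f:
    template = f.read()

# Match %(name)s-style tokens; the class lists the printf-style conversion types.
pattern = re.compile(r'%\(([^)]+)\)[diouxXeEfFgGcrsa]')
# dict.fromkeys drops repeated tokens while keeping first-seen order.
tokens = list(dict.fromkeys(pattern.findall(template)))

token_values = {}
for token in tokens:
    token_values[token] = input('Enter value for ' + token + ': ').strip()

page = template % token_values
with open('result.tex', 'w') as f:
    f.write(page)

# Pass the command as a list so it runs without shell=True.
subprocess.call(['pdflatex', 'result.tex'])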

The code will iterate across the tokens and print a prompt to the console asking you for an input for each token. In the above example, you'll get two prompts (with example answers):

Enter value for homework_title: NP-Complete
Enter value for course_name: Computer Science 500

The final line calls pdflatex on the resulting file and generates a PDF from it. If you want to go further, you could also ask the user for an output file name or take it as a command-line option.
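
The command-line variant could look roughly like this. It's a sketch: the -o/--output option and the build_command name are my own choices, and it relies on pdflatex's -jobname option to set the base name of the generated files.

```python
import argparse


def build_command(argv=None):
    # Hypothetical option: -o/--output picks the PDF's base name (no extension).
    parser = argparse.ArgumentParser()
    parser.add_argument('-o', '--output', default='result',
                        help='name of the PDF to generate, without .pdf')
    args = parser.parse_args(argv)
    # -jobname makes pdflatex write <output>.pdf, <output>.log and <output>.aux.
    return ['pdflatex', '-jobname=' + args.output, 'result.tex']


print(build_command(['-o', 'homework3']))
```

The returned list can be handed straight to subprocess.call in place of the hard-coded command.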

Solution 3

There's a Python library exactly for that: PyLaTeX. The following code was taken directly from the documentation:

from pylatex import Document, Section, Subsection, Command
from pylatex.utils import italic, NoEscape


def fill_document(doc):
    """Add a section, a subsection and some text to the document.

    :param doc: the document
    :type doc: :class:`pylatex.document.Document` instance
    """
    with doc.create(Section('A section')):
        doc.append('Some regular text and some ')
        doc.append(italic('italic text. '))

        with doc.create(Subsection('A subsection')):
            doc.append('Also some crazy characters: $&#{}')


if __name__ == '__main__':
    # Basic document
    doc = Document('basic')
    fill_document(doc)

    doc.generate_pdf(clean_tex=False)
    doc.generate_tex()

    # Document with `\maketitle` command activated
    doc = Document()

    doc.preamble.append(Command('title', 'Awesome Title'))
    doc.preamble.append(Command('author', 'Anonymous author'))
    doc.preamble.append(Command('date', NoEscape(r'\today')))
    doc.append(NoEscape(r'\maketitle'))

    fill_document(doc)

    doc.generate_pdf('basic_maketitle', clean_tex=False)

    # Add stuff to the document
    with doc.create(Section('A second section')):
        doc.append('Some text.')

    doc.generate_pdf('basic_maketitle2', clean_tex=False)
    tex = doc.dumps()  # The document as string in LaTeX syntax

It's particularly useful for generating automatic reports or slides.

Solution 4

There is also the string.Template class (in the standard library since Python 2.4), which lets you use $that style of token instead of the %(this)s one.
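
A small sketch of how that might look with the cover page fields from the question. One thing to keep in mind is that $ is also TeX's math delimiter, so a literal dollar sign in the template has to be doubled. The math line is only there to illustrate that.

```python
from string import Template

# $name and ${name} are placeholders; $$ produces a literal dollar sign.
template = Template(r'''\textbf{\Large $homework_title \\}
\vspace{1cm}
\textbf{\Large ${course_name} \\}
Inline math needs doubled dollar signs: $$x^2$$
''')

page = template.substitute(homework_title='NP-Complete',
                           course_name='Computer Science 500')
print(page)
```

substitute raises KeyError when a placeholder has no value; safe_substitute leaves unknown placeholders in place instead.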

Author by

juliomalegria

Me: juliomalegria.com Currently based in Antwerp, BE. Worked as an SRE at Google (San Francisco office), and as a Software Engineer at YouTube (Paris office). Studied Computer Science at UCSP.

Updated on January 22, 2020

Comments

  • juliomalegria
    juliomalegria over 4 years

    I'm a college guy, and in my college, to present any kind of homework, it has to have a standard coverpage (with the college logo, course name, professor's name, my name and bla bla bla).

    So, I have a .tex document that generates my standard cover page PDFs. It goes something like:

    ...
    \begin{document}
    %% College logo
    \vspace{5cm}
    \begin{center}
    \textbf{\huge "School and Program Name" \\}
    \vspace{1cm}
    \textbf{\Large "Homework Title" \\}
    \vspace{1cm}
    \textbf{\Large "Course Name" \\}
    \end{center}
    \vspace{2.5cm}
    \begin{flushright}
    {\large "My name" }
    \end{flushright}
    ...
    

    So, I was wondering if there's a way to make a Python script that asks me for the title of my homework, the course name and the rest of the strings and use them to generate the coverpage. After that, it should compile the .tex and generate the pdf with the information given.

    Any opinions, advice, snippet, library, is accepted.

  • juliomalegria
    juliomalegria over 12 years
    Pretty good solution, and simple! Now I have an automatic cover generator :)
  • TimP
    TimP about 11 years
    I needed to add shell=True to subprocess call.
  • sleblanc
    sleblanc over 10 years
    Since we like overkill, I want to see the Jinja answer!
  • Kritz
    Kritz about 8 years
    Is there a way to check whether the pdf was generated successfully? I've found if I have an '&' or '%' in the text it breaks the pdf.
  • unutbu
    unutbu about 8 years
    @Johan: The script shows the output of the call to pdflatex. If there is an error processing the LaTeX, those error messages will show you that the pdf was not successfully generated. The & is not a special character in Python, but it is in TeX, so you need to backslash it if you want a literal ampersand: \&. The % is a special character in both Python and TeX. Depending on where the % is located, it may need to be changed to either \% or %%.
  • Kritz
    Kritz about 8 years
    Thanks Unutbu. I actually want to run this on a remote server, so I won't be able to view the output. What I'm doing now is just checking if the pdf was generated. If it was generated, I assume everything is ok and send the pdf; if not, the server responds with an error message. Or do you have a better suggestion?
  • unutbu
    unutbu about 8 years
    @Johan: I modified the script above to raise a ValueError if the pdflatex command fails. Without -interaction nonstopmode, the Python process could hang waiting for user input. Note that pdflatex will still generate a pdf file even if errors occur. You must check for the ValueError to know if the pdf file generated was due to a bad run.
  • unutbu
    unutbu about 8 years
    Alternatively, you could call os.unlink('cover.pdf') whenever retcode != 0.
  • 3kstc
    3kstc over 5 years
    @unutbu how would one include a logo (/directory/somefolder/logo.jpg)?
  • unutbu
    unutbu over 5 years
    @3kstc: Do you have the TeX code to include the logo? If you do, it should be trivial to add it to content. If you don't have the TeX code, start by searching here for the answer.
  • Rusca8
    Rusca8 about 4 years
    Be aware that things like generate_pdf need to have pdflatex or similar installed (which in turn depends on Perl, I think?). I ended up using PyLaTeX at PythonAnywhere, since it's free and it has all this working almost by default.
  • GCMeccariello
    GCMeccariello almost 2 years
    @unutbu : thanks a lot for the answer! However, whenever I use content%args.__dict__ I get an error message saying: ValueError: unsupported format character '\' (0x5c) at index 94. Do you know what it means?