How to display a PDF that has been downloaded in Python

python pdf poppler pdf-rendering

13,098

Solution 1

It all depends on the OS you're using. One of these usually helps:

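import os
# Windows only: the shell opens the file with its associated application
os.system('my_pdf.pdf')

# Windows only: os.startfile does not exist on other platforms
os.startfile('path_to_pdf.pdf')

import webbrowser
# Opens the file in the default browser or viewer; a file: URL needs an absolute path
webbrowser.open(r'file:///my_pdf.pdf')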
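
The three calls above can be folded into one helper that picks an opener per platform. This is only a sketch: the names `opener_command` and `open_pdf` are made up for illustration, and it assumes `open` is available on macOS and `xdg-open` (from xdg-utils) on other Unix-like desktops.

```python
import os
import subprocess
import sys


def opener_command(path, platform=sys.platform):
    """Return the external command that opens `path`, or None on Windows."""
    if platform.startswith("win"):
        return None  # Windows uses os.startfile instead of a command
    if platform == "darwin":
        return ["open", path]
    # Assumption: any other platform is a desktop with xdg-utils installed
    return ["xdg-open", path]


def open_pdf(path):
    """Open `path` in the default PDF viewer for the current OS."""
    command = opener_command(path)
    if command is None:
        os.startfile(path)  # only exists on Windows
    else:
        subprocess.run(command, check=True)
```

Calling `open_pdf('my_pdf.pdf')` then hands the file to whatever viewer the desktop has registered for PDFs.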

Solution 2

How about using a temporary file?

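import tempfile
# Python 3 locations of the old urlparse.urljoin and urllib.pathname2url
from urllib.parse import urljoin
from urllib.request import pathname2url

import requests

from gi.repository import Poppler, Gtk

pdf = requests.get("http://www.scala-lang.org/docu/files/ScalaByExample.pdf")

with tempfile.NamedTemporaryFile() as pdf_contents:
    # Write the downloaded bytes (pdf.content), not the Response object itself
    pdf_contents.write(pdf.content)
    # Flush so that Poppler sees the complete file on disk
    pdf_contents.flush()
    file_url = urljoin(
        'file:', pathname2url(pdf_contents.name))
    # The temporary file is deleted when the with block ends
    document = Poppler.Document.new_from_file(file_url, None)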
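
One thing to watch with the temporary-file approach is that the file disappears as soon as the `with` block exits. A possible variation, sketched below, keeps the file until the caller removes it and builds the `file:` URI with `pathlib`. The helper name `save_pdf_to_temp` is hypothetical.

```python
import tempfile
from pathlib import Path


def save_pdf_to_temp(data):
    """Write PDF bytes to a temporary file and return (path, file URI)."""
    # delete=False keeps the file after the handle closes, so a viewer can
    # still read it; the caller is responsible for removing it afterwards.
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as handle:
        handle.write(data)
    path = Path(handle.name)
    return path, path.as_uri()
```

With the download from above, this would be used as `path, uri = save_pdf_to_temp(pdf.content)`, then `Poppler.Document.new_from_file(uri, None)`, and finally `path.unlink()` once the document is no longer needed.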

Solution 3

Try this and tell me if it works:

document = Poppler.Document.new_from_data(pdf.content, len(pdf.content), None)
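
If loading from memory reports a damaged document, it may be worth checking that the download really is a PDF before handing it to Poppler. The sketch below does that and then loads the bytes through `Poppler.Document.new_from_bytes`. It assumes Poppler 0.82 or newer (where that constructor was added) and the usual `0.18` introspection namespace; the helper names are invented for illustration.

```python
def looks_like_pdf(data):
    """Return True if `data` starts with the usual PDF header bytes."""
    return data[:5] == b"%PDF-"


def load_document_from_bytes(data):
    """Load PDF bytes into a Poppler document without writing to disk."""
    if not looks_like_pdf(data):
        raise ValueError("data does not start with %PDF-; is it really a PDF?")
    # Imported lazily so the header check works without PyGObject installed.
    import gi
    gi.require_version("Poppler", "0.18")
    from gi.repository import GLib, Poppler
    # Assumption: Poppler >= 0.82, which provides new_from_bytes.
    return Poppler.Document.new_from_bytes(GLib.Bytes.new(data), None)
```

A failed header check usually means the server returned an error page or a redirect body instead of the file, which `requests` will happily store in `pdf.content`.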

Solution 4

If you want to open the PDF with Acrobat Reader, the code below should work:

import subprocess
process = subprocess.Popen(['<here path to acrobat.exe>', '/A', 'page=1', '<here path to pdf>'], shell=False, stdout=subprocess.PIPE)
process.wait()

Solution 5

Since there is a library named pyPdf, you should be able to load the PDF file with it. If you have any further questions, send me a message.

Author by

marshall

Updated on June 15, 2022

Comments

marshall about 2 years

I have grabbed a pdf from the web using for example

import requests
pdf = requests.get("http://www.scala-lang.org/docu/files/ScalaByExample.pdf")

I would like to modify this code to display it

from gi.repository import Poppler, Gtk

def draw(widget, surface):
    page.render(surface)

document = Poppler.Document.new_from_file("file:///home/me/some.pdf", None)
page = document.get_page(0)

window = Gtk.Window(title="Hello World")
window.connect("delete-event", Gtk.main_quit)
window.connect("draw", draw)
window.set_app_paintable(True)

window.show_all()
Gtk.main()

How do I modify the document = line to use the variable pdf that contains the pdf?

(I don't mind using popplerqt4 or anything else if that makes it easier.)

marshall over 10 years

This is my current workaround. It would be great if it could be avoided, however.

logc over 10 years

Are you using python-poppler-qt4, pypoppler, or some other library? Which one defines Poppler.Document?

marshall over 10 years

My import line is from gi.repository import Poppler, Gtk, which defines Poppler.Document. I think I needed to install libpoppler-dev to get it to work. I am happy to move to python-poppler-qt if that is a good idea, however.

logc over 10 years

And which library is, in turn, the one that allows you to import gi.repository? :) BTW, I am not suggesting you move to another library; I do not have very much experience with the others I mentioned ...

Cilyan over 10 years

I still get "PDF document is damaged" with this solution on python3.3, and a segmentation fault on python2.7. But maybe it will work for OP...

Raghav RV over 10 years

I tried it in an IPython notebook and it worked. But since @Cilyan says it did not work for him, you should try it yourself and tell me if it works for you.

Rolf of Saxony over 7 years

import webbrowser +1