How to display a PDF that has been downloaded in Python


Solution 1

It all depends on the OS you're using. One of these will usually help:

import os
os.system('my_pdf.pdf')

or

os.startfile('path_to_pdf.pdf')

or

import webbrowser
webbrowser.open(r'file:///my_pdf.pdf')
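The three options above can be folded into one helper that picks an opener by platform. This is only a sketch: the names viewer_command and open_pdf are made up for illustration, and it assumes the usual "open" command on macOS and "xdg-open" on Linux desktops, which may not be installed everywhere.

```python
import os
import subprocess
import sys
from pathlib import Path


def viewer_command(pdf_path, platform=sys.platform):
    """Return the command that opens pdf_path, or None to use os.startfile."""
    path = str(Path(pdf_path).resolve())
    if platform.startswith("win"):
        # Windows has no launcher command here; os.startfile does the job.
        return None
    if platform == "darwin":
        return ["open", path]
    # Assumption: a Linux/BSD desktop that ships xdg-open.
    return ["xdg-open", path]


def open_pdf(pdf_path):
    """Open pdf_path in whatever application the OS associates with it."""
    command = viewer_command(pdf_path)
    if command is None:
        os.startfile(str(Path(pdf_path).resolve()))  # Windows only
    else:
        subprocess.run(command, check=False)
```

Calling open_pdf('my_pdf.pdf') should then hand the file to the default viewer. Resolving to an absolute path first avoids surprises when the working directory changes.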

Solution 2

How about using a temporary file?

import tempfile
from pathlib import Path

import requests

from gi.repository import Poppler, Gtk

pdf = requests.get("http://www.scala-lang.org/docu/files/ScalaByExample.pdf")

with tempfile.NamedTemporaryFile(suffix=".pdf") as pdf_contents:
    # Write the downloaded bytes (not the Response object) and flush them to disk
    pdf_contents.write(pdf.content)
    pdf_contents.flush()
    # Poppler wants a file:// URI rather than a plain path
    file_url = Path(pdf_contents.name).as_uri()
    # The temporary file is deleted when this block exits
    document = Poppler.Document.new_from_file(file_url, None)
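If the file has to outlive the with block, for example so that an external viewer can open it, one option is to create the temporary file with delete=False. The sketch below assumes that is acceptable. The helper name save_pdf_bytes is hypothetical, and the placeholder bytes stand in for pdf.content.

```python
import tempfile
from pathlib import Path


def save_pdf_bytes(data, suffix=".pdf"):
    """Write raw PDF bytes to a temporary file that outlives this call."""
    # delete=False keeps the file on disk after the handle is closed,
    # so a separate viewer process can still open it.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as handle:
        handle.write(data)
    return Path(handle.name)


# Placeholder bytes; in the question this would be pdf.content from requests.
pdf_path = save_pdf_bytes(b"%PDF-1.4 placeholder")
file_url = pdf_path.as_uri()  # a file:// URI built from the absolute path
```

The caller is then responsible for removing the file, for instance with pdf_path.unlink() once the viewer is done with it.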

Solution 3

Try this and tell me if it works:

document = Poppler.Document.new_from_data(str(pdf.content),len(repr(pdf.content)),None)
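Before handing bytes to a parser, it can help to check two things: that the download really is a PDF, and what length is being passed. The sketch below uses placeholder bytes in place of pdf.content, and looks_like_pdf is a hypothetical helper. It only shows that len() of the bytes and len() of their repr() are different numbers. Whether new_from_data accepts the result depends on the bindings.

```python
def looks_like_pdf(data):
    """Return True if the bytes begin with the usual PDF header."""
    return isinstance(data, bytes) and data.startswith(b"%PDF-")


# Placeholder standing in for pdf.content.
data = b"%PDF-1.4\n\x00\xff"

byte_count = len(data)        # number of bytes actually held
repr_count = len(repr(data))  # characters in the printable form, escapes included
```

A failed looks_like_pdf check often means the server returned an HTML error page instead of the document.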

Solution 4

If you want to open the PDF with Acrobat Reader, the code below should work:

import subprocess
process = subprocess.Popen(['<here path to acrobat.exe>', '/A', 'page=1', '<here path to pdf>'], shell=False, stdout=subprocess.PIPE)
process.wait()
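The same idea can be wrapped in a small function. This is a sketch, not a tested Acrobat recipe: launch_viewer is a hypothetical name, and it uses subprocess.run without stdout=PIPE, since nothing reads the viewer's output here.

```python
import subprocess


def launch_viewer(viewer_path, pdf_path, extra_args=()):
    """Run a viewer on pdf_path, wait for it to exit, return its exit code."""
    command = [viewer_path, *extra_args, pdf_path]
    # No stdout=PIPE: the output is not read, so the pipe is left alone.
    completed = subprocess.run(command, shell=False)
    return completed.returncode
```

With the placeholders from above, the call would be launch_viewer('<here path to acrobat.exe>', '<here path to pdf>', ['/A', 'page=1']).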

Solution 5

There is a library named pyPdf, so you should be able to load the PDF file with that. If you have any further questions, send me a message.

Author by

marshall

Updated on June 15, 2022

Comments

  • marshall
    marshall about 2 years

    I have grabbed a pdf from the web using for example

    import requests
    pdf = requests.get("http://www.scala-lang.org/docu/files/ScalaByExample.pdf")
    

    I would like to modify this code to display it

    from gi.repository import Poppler, Gtk
    
    def draw(widget, surface):
        page.render(surface)
    
    document = Poppler.Document.new_from_file("file:///home/me/some.pdf", None)
    page = document.get_page(0)
    
    window = Gtk.Window(title="Hello World")
    window.connect("delete-event", Gtk.main_quit)
    window.connect("draw", draw)
    window.set_app_paintable(True)
    
    window.show_all()
    Gtk.main()
    

    How do I modify the document = line to use the variable pdf that contains the pdf?

    (I don't mind using popplerqt4 or anything else if that makes it easier.)

  • marshall
    marshall over 10 years
    This is my current workaround. It would be great if it could be avoided however.
  • logc
    logc over 10 years
    Are you using python-poppler-qt4, pypoppler, or something else? Which library is the one that defines Poppler.Document?
  • marshall
    marshall over 10 years
    My import line is from gi.repository import Poppler, Gtk, which defines Poppler.Document. I think I needed to install libpoppler-dev to get it to work. I am happy to move to python-poppler-qt if that is a good idea, however.
  • logc
    logc over 10 years
    And which library is, in turn, the one that allows you to import gi.repository ? :) BTW, I am not suggesting you move to another library, I do not have very much experience with the others I mentioned ...
  • Cilyan
    Cilyan over 10 years
    I still get "PDF document is damaged" with this solution on Python 3.3, and a segmentation fault on Python 2.7. But maybe it will work for OP...
  • Raghav RV
    Raghav RV over 10 years
    I tried it in an IPython notebook and it worked. But since @Cilyan says it did not work for him, you should try it yourself and tell me whether it works for you.
  • Rolf of Saxony
    Rolf of Saxony over 7 years
    import webbrowser +1