Python: download files from google drive using url

140,506

Solution 1

If by "drive's url" you mean the shareable link of a file on Google Drive, then the following might help:

import requests

def download_file_from_google_drive(id, destination):
    URL = "https://docs.google.com/uc?export=download"

    session = requests.Session()

    response = session.get(URL, params = { 'id' : id }, stream = True)
    token = get_confirm_token(response)

    if token:
        params = { 'id' : id, 'confirm' : token }
        response = session.get(URL, params = params, stream = True)

    save_response_content(response, destination)    

def get_confirm_token(response):
    for key, value in response.cookies.items():
        if key.startswith('download_warning'):
            return value

    return None

def save_response_content(response, destination):
    CHUNK_SIZE = 32768

    with open(destination, "wb") as f:
        for chunk in response.iter_content(CHUNK_SIZE):
            if chunk: # filter out keep-alive new chunks
                f.write(chunk)

if __name__ == "__main__":
    file_id = 'TAKE ID FROM SHAREABLE LINK'
    destination = 'DESTINATION FILE ON YOUR DISK'
    download_file_from_google_drive(file_id, destination)

The snippet does not use pydrive or the Google Drive SDK, though. It uses the requests module (which is, in a way, an alternative to urllib2).

When downloading large files from Google Drive, a single GET request is not sufficient. A second one is needed - see wget/curl large file from google drive.
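When the confirmation step fails, the usual symptom is a small HTML page saved in place of the file. Below is a minimal sketch of a guard against that. It assumes that Drive serves its warning page as text/html and that a real download carries a Content-Disposition header; both are assumptions about Drive's behaviour, not documented guarantees. looks_like_file_download is a hypothetical helper name, not part of requests.

```python
def looks_like_file_download(headers):
    """Return True if the headers look like an actual file rather than an HTML page.

    `headers` is any mapping of header names to values, e.g. response.headers.
    """
    # Normalise header names so the check also works with a plain dict.
    lowered = {key.lower(): value for key, value in headers.items()}
    content_type = lowered.get("content-type", "")
    has_attachment = "content-disposition" in lowered
    # Assumption: the interstitial page is HTML and has no attachment header.
    return has_attachment or not content_type.lower().startswith("text/html")
```

One way to use it is to call looks_like_file_download(response.headers) just before save_response_content and raise an error when it returns False, instead of silently writing the HTML to disk.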

Solution 2

Having had similar needs many times, I made an extra simple class, GoogleDriveDownloader, starting from the snippet from @user115202 above. You can find the source code here.

You can also install it through pip:

pip install googledrivedownloader

Then usage is as simple as:

from google_drive_downloader import GoogleDriveDownloader as gdd

gdd.download_file_from_google_drive(file_id='1iytA1n2z4go3uVCwE__vIKouTKyIDjEq',
                                    dest_path='./data/mnist.zip',
                                    unzip=True)

This snippet will download an archive shared on Google Drive. In this case 1iytA1n2z4go3uVCwE__vIKouTKyIDjEq is the file id taken from the shareable link that Google Drive gives you.

Solution 3

I recommend the gdown package.

pip install gdown

Take your share link

https://drive.google.com/file/d/0B9P1L--7Wd2vNm9zMTJWOGxobkU/view?usp=sharing

and grab the id, e.g. 1TLNdIufzwesDbyr_nVTR7Zrx9oRHLM_N (you can also find it in the link that appears when you press the download button), then swap it in after id= in the URL below.

import gdown

url = 'https://drive.google.com/uc?id=0B9P1L--7Wd2vNm9zMTJWOGxobkU'
output = '20150428_collected_images.tgz'
gdown.download(url, output, quiet=False)
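Grabbing the id from a share link by hand can also be done in code. Here is a minimal sketch using only the standard library. It assumes the link has one of the shapes that appear on this page (/file/d/<id>/view, open?id=<id> or uc?id=<id>). extract_file_id and to_uc_url are hypothetical helper names, not part of gdown.

```python
import re
from urllib.parse import urlparse, parse_qs


def extract_file_id(share_link):
    """Return the file id found in a Google Drive link, or None if there is none."""
    parsed = urlparse(share_link)
    # Shape 1: https://drive.google.com/file/d/<id>/view?usp=sharing
    match = re.search(r"/file/d/([^/]+)", parsed.path)
    if match:
        return match.group(1)
    # Shape 2: https://drive.google.com/open?id=<id> or .../uc?id=<id>
    ids = parse_qs(parsed.query).get("id")
    return ids[0] if ids else None


def to_uc_url(share_link):
    """Build the uc?id=... form of the URL from a share link."""
    file_id = extract_file_id(share_link)
    if file_id is None:
        raise ValueError("no file id found in link: %r" % share_link)
    return "https://drive.google.com/uc?id=" + file_id
```

The string returned by to_uc_url(...) can then be passed as the url argument in the snippet above. Folder links (drive.google.com/drive/folders/...) match neither shape, so extract_file_id returns None for them.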

Solution 4

Here's an easy way to do it with a service account, using only Google's own client libraries.

pip install google-api-core google-api-python-client

from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload
from google.oauth2 import service_account
import io

credz = {}  # put JSON credentials here, from a service account or the like
# More info: https://cloud.google.com/docs/authentication

credentials = service_account.Credentials.from_service_account_info(credz)
drive_service = build('drive', 'v3', credentials=credentials)

file_id = '0BwwA4oUTeiV1UVNwOHItT0xfa2M'
request = drive_service.files().get_media(fileId=file_id)
#fh = io.BytesIO() # this can be used to keep in memory
fh = io.FileIO('file.tar.gz', 'wb') # this can be used to write to disk
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))


Solution 5

PyDrive allows you to download a file with the function GetContentFile(). You can find the function's documentation here.

See example below:

# Initialize GoogleDriveFile instance with file id.
file_obj = drive.CreateFile({'id': '<your file ID here>'})
file_obj.GetContentFile('cats.png') # Download file as 'cats.png'.

This code assumes that you have an authenticated drive object, the docs on this can be found here and here.

In the general case this is done like so:

from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

gauth = GoogleAuth()
# Create local webserver which automatically handles authentication.
gauth.LocalWebserverAuth()

# Create GoogleDrive instance with authenticated GoogleAuth instance.
drive = GoogleDrive(gauth)

Info on silent authentication on a server can be found here and involves writing a settings.yaml (example: here) in which you save the authentication details.

Author: rkatkam

A firm believer of "Learn by Doing"! love playing with python, mongodb and definitely only on linux ;)

Updated on May 05, 2022

Comments

  • rkatkam
    rkatkam almost 2 years

    I am trying to download files from google drive and all I have is the drive's URL.

    I have read about the Google API, which talks about some drive_service and MediaIO and also requires some credentials (mainly a JSON file/OAuth). But I can't work out how it all fits together.

    Also, tried urllib2.urlretrieve, but my case is to get files from the drive. Tried wget too but no use.

    Tried PyDrive library. It has good upload functions to drive but no download options.

    Any help will be appreciated. Thanks.

  • Billal Begueradj
    Billal Begueradj almost 7 years
    Your answer is more interesting
  • Joe
    Joe over 6 years
    first link broken :(
  • Robin Nabel
    Robin Nabel over 6 years
    @Joe fixed the link!
  • user3722096
    user3722096 about 6 years
    This gives me a 404-not found, using the ID of a public shared file. Any suggestions what could be wrong?
  • simpleuser
    simpleuser over 5 years
    @RobinNabel every link in the answer is now dead
  • Raksha
    Raksha about 5 years
    can't retrieve file ... 'open(/content/data.json').read() returns '<HTML>\n<HEAD>\n<TITLE>Not Found</TITLE>\n</HEAD>\n<BODY BGCOLOR="#FFFFFF" TEXT="#000000">\n<H1>Not Found</H1>\n<H2>Error 404</H2>\n</BODY>\n</HTML>\n'
  • ndrplz
    ndrplz about 5 years
    @Raksha It's difficult to understand the issue from your comment. If you still encounter this problem, please open a proper issue on GitHub
  • GoingMyWay
    GoingMyWay almost 5 years
    How can I download files from Google Drive given links, for example drive.google.com/file/d/1I05c4-d9OsNwGZnLx85fR8dnX-yVoTWe/view
  • yashas123
    yashas123 over 4 years
    @turdus-merula Anyway to get the downloading file name as it is stored in drive?
  • yashas123
    yashas123 over 4 years
    NVM, I got it by doing this: re.search(r'filename\=\"(.*)\"', response.headers['Content-Disposition']).group(1)
  • mrgloom
    mrgloom about 4 years
    Don't work and just silently downloads 4,0K file without warning or error Example link: drive.google.com/open?id=0B4qLcYyJmiz0TXdaTExNcW03ejA
  • mrgloom
    mrgloom about 4 years
    What modification should be done to download this zip file: drive.google.com/open?id=0B4qLcYyJmiz0TXdaTExNcW03ejA Just using 0B4qLcYyJmiz0TXdaTExNcW03ejA not work.
  • Black Thunder
    Black Thunder about 4 years
    bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: html5lib. Do you need to install a parser library?
  • Agile Bean
    Agile Bean almost 4 years
    Importantly, if you create the link by "Share" or "Get shareable link", the URL doesn't work - you must replace in the URL "open" to "uc". In other words, drive.google.com/open?id= ... to drive.google.com/uc?id= ...
  • Wok
    Wok over 3 years
    You need to add requests to the requirements.
  • Dhiraj Gandhi
    Dhiraj Gandhi over 3 years
    What if I want to access a restricted file using a Gmail id and password?
  • Mudit Bhatia
    Mudit Bhatia about 3 years
    Bro, I don't have enough words to thank you.
  • Aref
    Aref about 3 years
    The best and simplest answer. Thanks!
  • A.Sherif
    A.Sherif almost 3 years
    I tried to do as @AgileBean stated, but my link looks like this https://drive.google.com/file/d/3Xxk5lJSr...UV5eX9M/view?usp=sharing so it did not work. So instead, I used the ID parameter gdown --id 3Xxk5lJSr...UV5eX9M where 3Xxk5lJSr...UV5eX9M is the file id that you can easily extract from the file's link.
  • Jingnan Jia
    Jingnan Jia almost 3 years
    My files are in a folder and the shared link of the folder is https://drive.google.com/drive/folders/14gKg6QW3TnwnaHoYTxxTr6NzgQWqJufa?usp=sharing, but I can not download this folder using this method.
  • Subangkar KrS
    Subangkar KrS over 2 years
    The best one. Thanks a lot!!
  • Om Rastogi
    Om Rastogi about 2 years
    How can we get the name of the document from the link
  • JasonGenX
    JasonGenX about 2 years
    it doesn't work.... even for public files. I find it ridiculous that the output from this, running on python is "you may be able to use the browser". Now I only need to download the library that converts Python to a human who knows how to operate a browser and has hands for keyboard and mouse....
  • Mr Tsjolder
    Mr Tsjolder about 2 years
    seems like something has changed behind the scenes and the token stuff does not quite work anymore. However, simply always including confirm=1 as parameter seems to be a workaround.
  • zetyquickly
    zetyquickly almost 2 years
    Worked for me using when pasted the link that appears after pressing the "Download" button on google drive web page