Read xlsx stored on sharepoint location with openpyxl in python?


openpyxl tries to open the string you pass as a local filename, so it cannot load directly from a web address. Download the file first, for example with urllib:

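from urllib.parse import quote
from urllib.request import urlretrieve  # Python 3; on Python 2 this was urllib.urlretrieve

# Percent-encode the spaces in the path; safe=":/" keeps the scheme and slashes as-is
file = quote("https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx", safe=":/")
urlretrieve(file, "test.xlsx")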

From further research, requests may be preferred over urllib. Try this:

import requests
from requests.auth import HTTPBasicAuth

file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"

username = 'myUsername'
password = 'myPassword'

resp = requests.get(file, auth=HTTPBasicAuth(username, password))
resp.raise_for_status()  # stop on 401/403 instead of saving an error page as test.xlsx
# Write the downloaded bytes to a local file that openpyxl can open
with open('test.xlsx', 'wb') as output:
    output.write(resp.content)

To get requests installed:

pip install requests
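
A login or error page saved as test.xlsx will make openpyxl fail later with a confusing error, so it can help to check the downloaded bytes before parsing them. The sketch below is an illustration, not part of the original answer: `looks_like_xlsx` and `load_workbook_from_bytes` are hypothetical helper names, and the check relies only on the fact that an .xlsx file is a ZIP package containing a `[Content_Types].xml` entry. openpyxl's `load_workbook` is documented to accept a file-like object, so the bytes can be parsed without writing a temporary file.

```python
import io
import zipfile


def looks_like_xlsx(data: bytes) -> bool:
    """Heuristic: True if data is a ZIP archive with an OOXML content-types entry."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return False  # e.g. an HTML login or error page
    with zipfile.ZipFile(buffer) as archive:
        return "[Content_Types].xml" in archive.namelist()


def load_workbook_from_bytes(data: bytes):
    """Parse downloaded bytes with openpyxl, refusing anything that is not an xlsx."""
    if not looks_like_xlsx(data):
        raise ValueError("response body is not an xlsx file (login or error page?)")
    # Imported here so the check above works even without openpyxl installed.
    from openpyxl import load_workbook  # third-party: pip install openpyxl
    return load_workbook(io.BytesIO(data), read_only=True)
```

With the requests example above, this would be called as `wb = load_workbook_from_bytes(resp.content)`.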
Author by

Gunay Anach


Updated on July 15, 2022

Comments

  • Gunay Anach
    Gunay Anach almost 2 years

    quick one.

    I have an XLSX file located on a SharePoint drive and cannot open it using openpyxl in Python. It works well if it is stored on my local drive.

    I tried this:

    from openpyxl import load_workbook
    wb = load_workbook('https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx')
    

    It throws this exception:

    C:\Anaconda\lib\site-packages\openpyxl\reader\excel.py in load_workbook(filename, use_iterators, keep_vba, guess_types, data_only)
        123     except (BadZipfile, RuntimeError, IOError, ValueError):
        124         e = exc_info()[1]
    --> 125         raise InvalidFileException(unicode(e))
        126     wb = Workbook(guess_types=guess_types, data_only=data_only)
        127 
    
    InvalidFileException: [Errno 22] invalid mode ('rb') or filename: 'https://...
    

    Am I missing something? I need to read the content of one of the sheets in python.


    EDIT:

    Using crussell's advice, I receive 401 UNAUTHORIZED:

    import requests
    import urllib
    from openpyxl import load_workbook
    from requests.auth import HTTPBasicAuth
    
    file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"
    
    username = 'PotatoUser'
    password = 'PotatoPassword'
    
    resp=requests.get(file, auth=HTTPBasicAuth(username, password))
    print(resp.content)
    

    It seems that this SharePoint site accepts neither the Basic nor the Digest authentication that requests offers (see http://docs.python-requests.org/en/latest/user/authentication/).

  • Gunay Anach
    Gunay Anach over 8 years
    Thanks for your suggestion, Martin. However, I get: 560 class HTTPRedirectHandler(BaseHandler): HTTPError: HTTP Error 400: Bad Request
  • Gunay Anach
    Gunay Anach over 8 years
    One forward slash throws this exception: URLError: <urlopen error no host given>
  • Gunay Anach
    Gunay Anach over 8 years
    I can open it in a browser, otherwise I get this exception: HTTPError: HTTP Error 401: Unauthorized
  • Martin Evans
    Martin Evans over 8 years
    Are you going through a proxy?
  • Gunay Anach
    Gunay Anach over 8 years
    No, it is supposed to be an internal SharePoint location with a web interface.
  • Gunay Anach
    Gunay Anach over 8 years
    Thanks crussell, this seems to be the right direction for me. Now I need to deal with the authentication request when accessing the file. IOError: ('http error', 401, 'Unauthorized' ...
  • crussell
    crussell over 8 years
    Is the file encrypted?
  • crussell
    crussell over 8 years
    The web server seems to be expecting some authentication credentials, do you have a username and password for the website?
  • Gunay Anach
    Gunay Anach over 8 years
    Yes, I do have credentials, and the file is not encrypted.
  • Charlie Clark
    Charlie Clark over 8 years
    Apart from the headaches of logging into Sharepoint I don't think openpyxl supports the buffer interface. It's restricted to locally accessible files for reasons of simplicity.
  • Gunay Anach
    Gunay Anach over 8 years
    Thank you for your suggestions, crussell. It seems there are authentication issues that requests cannot handle. I will keep looking for another method.
  • Calab
    Calab over 4 years
    I would be interested in seeing an example with requests_ntlm, please.
  • Calab
    Calab over 4 years
    I have no issues opening the spreadsheet from SharePoint in a web browser, but trying to do so from a Python script is giving me a 403 forbidden error. What am I missing?

    import requests_ntlm
    import requests
    file = "https://ourcompany.sharepoint.com/abcd/5UJJPA3D/FILE"
    u=r"domain\username"
    p="password"
    resp = requests.get(file, auth=requests_ntlm.HttpNtlmAuth(u,p))
    print(resp.content)

    b'403 FORBIDDEN'
  • Ben.T
    Ben.T over 4 years
    @Calab Hi, unfortunately, I'm not at the same work anymore and I had to leave my scripts so I can't answer your question. sorry
  • ThinkCode
    ThinkCode almost 4 years
    Since this is an internal SharePoint, I used this instead of basic auth:

    from requests_negotiate_sspi import HttpNegotiateAuth
    resp=requests.get(file, auth=HttpNegotiateAuth())
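
The comments above mention Basic, Digest, NTLM and Negotiate authentication. One way to find out which of these a server is asking for is to look at the `WWW-Authenticate` header of the 401 response. The following is a minimal sketch, not something from the thread: `offered_auth_schemes` is a hypothetical helper, the parsing is deliberately naive (it assumes no commas inside quoted parameter values), and the sample header value is made up for illustration.

```python
def offered_auth_schemes(www_authenticate: str) -> list:
    """Return the scheme names found in a WWW-Authenticate header value.

    Naive parser: splits on commas and keeps the first word of each part
    unless it looks like a key=value parameter of the previous scheme.
    """
    schemes = []
    for part in www_authenticate.split(","):
        words = part.split()
        if not words:
            continue
        first = words[0]
        if "=" not in first and first not in schemes:
            schemes.append(first)
    return schemes


# Made-up header value of the kind an intranet server might send.
sample = 'Negotiate, NTLM, Basic realm="intranet"'
print(offered_auth_schemes(sample))  # ['Negotiate', 'NTLM', 'Basic']
```

With requests, the header value should be available as `resp.headers.get("WWW-Authenticate", "")` on the 401 response. If NTLM or Negotiate is listed and Basic is not, that would explain why `HTTPBasicAuth` is rejected, and the `requests_ntlm` or `requests_negotiate_sspi` packages mentioned in the comments are the candidates to try.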