Read xlsx stored on sharepoint location with openpyxl in python?
Instead of trying to load the workbook directly from a web address, download it first with urllib and then open the local copy.
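import urllib.request  # Python 3; on Python 2 the call was urllib.urlretrieve
file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"
urllib.request.urlretrieve(file, "test.xlsx")  # save a local copy that openpyxl can open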
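The URL above contains literal spaces, which some HTTP clients reject or which the server may answer with an error. A minimal sketch of percent-encoding the path before downloading, assuming the URL has no query string and is not already encoded (the helper name `encode_sharepoint_url` is hypothetical):

```python
from urllib.parse import quote


def encode_sharepoint_url(url: str) -> str:
    """Percent-encode characters such as spaces, leaving ':' and '/' intact."""
    # Assumption: the URL is not already percent-encoded; otherwise '%' would be encoded again.
    return quote(url, safe=":/")


raw = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"
encoded = encode_sharepoint_url(raw)
print(encoded)
```

The encoded string can then be passed to the download call in place of the raw URL. Whether this resolves a given error depends on the server, so treat it as one thing to rule out.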
From further research, requests may be preferred over urllib. Try this:
import requests
from requests.auth import HTTPBasicAuth
file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"
username = 'myUsername'
password = 'myPassword'
resp = requests.get(file, auth=HTTPBasicAuth(username, password))
resp.raise_for_status()  # stop on 401/403 instead of saving an error page
with open('test.xlsx', 'wb') as output:  # the file is closed automatically
    output.write(resp.content)
To get requests installed:
pip install requests
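If authentication fails, the response body is an error page rather than a workbook, and saving it as test.xlsx will make openpyxl fail later with a confusing error. The following is a sketch of a quick sanity check on the downloaded bytes, using only the standard library. It relies on the fact that an .xlsx file is a ZIP archive containing a `[Content_Types].xml` entry; the helper name `looks_like_xlsx` is hypothetical.

```python
import io
import zipfile


def looks_like_xlsx(data: bytes) -> bool:
    """Return True if data is a ZIP archive that contains '[Content_Types].xml'."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return False
    with zipfile.ZipFile(buffer) as archive:
        return "[Content_Types].xml" in archive.namelist()


# An error body such as the one returned on a 401 or 403 is not a ZIP archive.
print(looks_like_xlsx(b"403 FORBIDDEN"))
```

In the requests example above, this check could be applied to `resp.content` before writing the file. It does not prove the workbook is valid, only that the download is not obviously an error page.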
Author by
Gunay Anach
Updated on July 15, 2022
Comments
-
Gunay Anach almost 2 years
quick one.
I have an XLSX file located on a SharePoint drive and cannot open it using openpyxl in Python. It works well if the file is stored on my local drive.
I tried this.
from openpyxl import load_workbook
wb = load_workbook('https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx')
This throws the following exception:
C:\Anaconda\lib\site-packages\openpyxl\reader\excel.py in load_workbook(filename, use_iterators, keep_vba, guess_types, data_only)
    123     except (BadZipfile, RuntimeError, IOError, ValueError):
    124         e = exc_info()[1]
--> 125         raise InvalidFileException(unicode(e))
    126     wb = Workbook(guess_types=guess_types, data_only=data_only)
    127
InvalidFileException: [Errno 22] invalid mode ('rb') or filename: 'https://...
Am I missing something? I need to read the content of one of the sheets in Python.
EDIT:
Using crussell's advice, I receive 401 UNAUTHORIZED:
import requests
import urllib
from openpyxl import load_workbook
from requests.auth import HTTPBasicAuth
file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"
username = 'PotatoUser'
password = 'PotatoPassword'
resp = requests.get(file, auth=HTTPBasicAuth(username, password))
print(resp.content)
It seems that SharePoint and requests are not compatible: both Digest Authentication and Basic Authentication fail (see http://docs.python-requests.org/en/latest/user/authentication/).
-
Gunay Anach over 8 years
Thanks, Martin, for your suggestion. However, I get:
560 class HTTPRedirectHandler(BaseHandler):
HTTPError: HTTP Error 400: Bad Request
-
Gunay Anach over 8 years
One forward slash throws this exception: URLError: <urlopen error no host given>
-
Gunay Anach over 8 years
I can open it in a browser, otherwise I get this exception: HTTPError: HTTP Error 401: Unauthorized
-
Martin Evans over 8 years
Are you going through a proxy?
-
Gunay Anach over 8 years
No, it is supposed to be an internal SharePoint location with a web interface.
-
Gunay Anach over 8 years
Thanks, crussell, this seems to be the right direction for me. Now I need to fight with the authentication request when accessing the file. IOError: ('http error', 401, 'Unauthorized' ...
-
crussell over 8 years
Is the file encrypted?
-
crussell over 8 years
The web server seems to be expecting some authentication credentials. Do you have a username and password for the website?
-
Gunay Anach over 8 years
Yes, I do have credentials, and the file is not encrypted.
-
Charlie Clark over 8 years
Apart from the headaches of logging into SharePoint, I don't think openpyxl supports the buffer interface. It's restricted to locally accessible files for reasons of simplicity.
-
Gunay Anach over 8 years
Thank you for your suggestions, crussell. It seems there are authentication issues which cannot be handled by requests. I will keep looking for another method.
-
Calab over 4 years
I would be interested in seeing an example with requests_ntlm, please.
-
Calab over 4 years
I have no issues opening the spreadsheet from SharePoint in a web browser, but trying to do so from a Python script is giving me a 403 Forbidden error. What am I missing?
import requests_ntlm
import requests
file = "https://ourcompany.sharepoint.com/abcd/5UJJPA3D/FILE"
u = r"domain\username"
p = "password"
resp = requests.get(file, auth=requests_ntlm.HttpNtlmAuth(u, p))
print(resp.content)
b'403 FORBIDDEN'
-
Ben.T over 4 years
@Calab Hi, unfortunately I'm no longer at the same job and had to leave my scripts behind, so I can't answer your question. Sorry.
-
ThinkCode almost 4 years
Since this is an internal SharePoint, I used this instead of basic auth:
from requests_negotiate_sspi import HttpNegotiateAuth
resp = requests.get(file, auth=HttpNegotiateAuth())