Azure Blob - Read using Python
Solution 1
Yes, it is certainly possible. Check out the Azure Storage SDK for Python:
from azure.storage.blob import BlockBlobService
block_blob_service = BlockBlobService(account_name='myaccount', account_key='mykey')
block_blob_service.get_blob_to_path('mycontainer', 'myblockblob', 'out-sunset.png')
You can read the complete SDK documentation here: http://azure-storage.readthedocs.io.
Solution 2
Here's a way to do it with the new version of the SDK (12.0.0):
from azure.storage.blob import BlobClient

blob = BlobClient(account_url="https://<account_name>.blob.core.windows.net",
                  container_name="<container_name>",
                  blob_name="<blob_name>",
                  credential="<account_key>")

# Download the blob and write its contents to a local file
with open("example.csv", "wb") as f:
    data = blob.download_blob()
    data.readinto(f)
See here for details.
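If the goal is to process the CSV without saving it to disk, the downloader returned by download_blob() also exposes readall(), which returns the content as bytes. A minimal sketch, assuming a UTF-8 CSV with a header row; read_csv_rows is a hypothetical helper name, and the argument can be the BlobClient built in the snippet above:

```python
import csv
import io


def read_csv_rows(blob_client, encoding="utf-8"):
    """Download a blob into memory and parse it as CSV (no local file).

    blob_client is expected to behave like azure.storage.blob.BlobClient
    (SDK v12): download_blob() returns a downloader whose readall()
    yields the blob content as bytes.
    """
    raw = blob_client.download_blob().readall()
    text = raw.decode(encoding)  # assumption: the CSV is UTF-8 text
    return list(csv.DictReader(io.StringIO(text)))
```

Calling read_csv_rows(blob) gives a list of dicts keyed by the header row. The same bytes could also be wrapped in io.BytesIO and handed to pandas.read_csv if a data frame is wanted.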
Solution 3
Provide your Azure storage account name as the account name and its secret key as the account key here:
from azure.storage.blob import BlockBlobService
block_blob_service = BlockBlobService(account_name='$$$$$$', account_key='$$$$$$')
This gets the blob and saves it in the current location as 'output.jpg':
block_blob_service.get_blob_to_path('your-container-name', 'your-blob', 'output.jpg')
This gets the blob's content as bytes:
blob_item = block_blob_service.get_blob_to_bytes('your-container-name', 'blob-name')
blob_item.content
Solution 4
One can stream from a blob with Python like this:
from tempfile import NamedTemporaryFile
from azure.storage.blob.blockblobservice import BlockBlobService

# conf is assumed to be a dict-like object holding your settings
entry_path = conf['entry_path']
container_name = conf['container_name']
blob_service = BlockBlobService(
    account_name=conf['account_name'],
    account_key=conf['account_key'])

def get_file(filename):
    # Download the blob into a temporary file and rewind it for reading
    local_file = NamedTemporaryFile()
    blob_service.get_blob_to_stream(container_name, filename, stream=local_file,
                                    max_connections=2)
    local_file.seek(0)
    return local_file
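If a temporary file on disk is not wanted either, get_blob_to_stream can write into any writable stream, for example an in-memory io.BytesIO. A sketch under that assumption, with the legacy BlockBlobService passed in as blob_service; get_stream is a hypothetical helper name:

```python
import io


def get_stream(blob_service, container_name, filename):
    """Fetch a blob into an in-memory buffer instead of a temp file.

    blob_service is expected to be a legacy BlockBlobService instance
    (azure-storage-blob < 12), whose get_blob_to_stream() writes the
    blob content into the stream it is given.
    """
    buffer = io.BytesIO()
    blob_service.get_blob_to_stream(container_name, filename, stream=buffer)
    buffer.seek(0)  # rewind so the caller reads from the start
    return buffer
```

The buffer holds the whole blob in memory, so this suits files that comfortably fit in RAM; for larger blobs the temporary-file version above is the safer choice.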
Solution 5
I recommend using smart_open.
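import os
from azure.storage.blob import BlobServiceClient
from smart_open import open

# smart_open needs an Azure client, passed via transport_params
# (assumes the connection string is stored in this environment variable)
connect_str = os.environ['AZURE_STORAGE_CONNECTION_STRING']
transport_params = {'client': BlobServiceClient.from_connection_string(connect_str)}

# stream from Azure Blob Storage
with open('azure://my_container/my_file.txt', transport_params=transport_params) as fin:
    for line in fin:
        print(line)

# stream content *into* Azure Blob Storage (write mode):
with open('azure://my_container/my_file.txt', 'wb', transport_params=transport_params) as fout:
    fout.write(b'hello world')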
AngiSen
Updated on December 16, 2021
Comments
-
AngiSen over 2 years
Can someone tell me if it is possible to read a csv file directly from Azure blob storage as a stream and process it using Python? I know it can be done using C#.Net (shown below) but wanted to know the equivalent library in Python to do this.
CloudBlobClient client = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = client.GetContainerReference("outfiles");
CloudBlob blob = container.GetBlobReference("Test.csv");
-
AngiSen about 6 years
@Jay, do you have any inputs on this?
-
AngiSen about 6 years
Thanks, Gaurav. I checked the page but was not able to find an equivalent of GetBlobReference for Python.
-
Gaurav Mantri about 6 years
You don't get a reference to a BlockBlob the way you do in the .NET SDK. I have edited my code to show how you can download a blob to the local file system and added a link to the SDK documentation. HTH.
-
AngiSen about 6 years
I know this functionality exists in the Python SDK, but I am looking for a function similar to the .NET one.
-
Gaurav Mantri about 6 years
So if I understand correctly, you wish to create an instance of BlockBlob (like CloudBlockBlob) in Python. Correct? Would you mind explaining the reason behind it?
-
AngiSen about 6 years
It's in alignment with some of our existing work. I need to read a file from the blob as a stream, do some processing, and write it back to the blob. The whole Python app will run as a WebJob. I know I can download the file from the blob to the WebJob console (D:), but I wanted to know whether Python has functionality similar to .NET's that avoids downloading the file to the drive.
-
Gaurav Mantri about 6 years
Oh... Yes, you would use get_blob_to_stream in that case. Please check out this page: azure-storage.readthedocs.io/ref/….
-
hui chen almost 5 years
I would suggest adding the official Microsoft Docs link for the SDK documentation.
-
Hayat over 4 years
It says AzureMissingResourceHttpError. When I run print(list(blob_service.list_blob_names("azureml"))) it shows 'data/leming/00U/dataset.csv'. I don't know what to put in filepath.
-
Rajat Arora over 4 years
Hi, this still downloads the file. Is it possible to get the contents of the blob without downloading the file?
-
Sebastian Dziadzio over 4 years
When you do data = blob.download_blob(), the contents of the blob will be in data; you don't need to write to a file.
-
RB17 about 4 years
@SebastianDziadzio Is there a way to read this data into a Python data frame? I am somehow unable to work using BlockBlobService.
-
Sebastian Dziadzio about 4 years
If you're downloading a CSV file, you should be able to convert the contents of data to a data frame with pd.read_csv(data).
-
Steven Van Dorpe over 3 years
Thanks for this, very useful. Does the TemporaryFile need clean-up afterwards?
-
Daniel R over 3 years
Happy to help :) According to the docs (docs.python.org/3/library/tempfile.html), it will be closed and destroyed, so there is no need to worry about that.
-
Niels Hoogeveen about 2 years
How can I read all CSV files in a folder and append them to my dataframe?
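One possible approach to this last question with the v12 SDK: blob storage has no real folders, so a "folder" is just a name prefix. The sketch below lists the blobs under a prefix and concatenates their CSV rows. read_csv_folder is a hypothetical helper name, container_client is assumed to be an azure.storage.blob.ContainerClient, and every file is assumed to be UTF-8 with a header row.

```python
import csv
import io


def read_csv_folder(container_client, prefix):
    """Collect the rows of every .csv blob whose name starts with prefix."""
    rows = []
    for blob in container_client.list_blobs(name_starts_with=prefix):
        if not blob.name.endswith(".csv"):
            continue  # skip non-CSV blobs under the same prefix
        raw = container_client.download_blob(blob.name).readall()
        rows.extend(csv.DictReader(io.StringIO(raw.decode("utf-8"))))
    return rows
```

For a data frame, the per-file bytes could instead be passed to pandas.read_csv via io.BytesIO and the resulting frames combined with pandas.concat.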