Azure Blob - Read using Python

51,642

Solution 1

Yes, it is certainly possible. Check out the Azure Storage SDK for Python. The example below uses the legacy BlockBlobService API (azure-storage-blob 2.x and earlier):

from azure.storage.blob import BlockBlobService

block_blob_service = BlockBlobService(account_name='myaccount', account_key='mykey')

block_blob_service.get_blob_to_path('mycontainer', 'myblockblob', 'out-sunset.png')

You can read the complete SDK documentation here: http://azure-storage.readthedocs.io.

Solution 2

Here's a way to do it with the new version of the SDK (12.0.0):

from azure.storage.blob import BlobClient

blob = BlobClient(account_url="https://<account_name>.blob.core.windows.net",
                  container_name="<container_name>",
                  blob_name="<blob_name>",
                  credential="<account_key>")

with open("example.csv", "wb") as f:
    data = blob.download_blob()
    data.readinto(f)

See here for details.
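
As noted in the comments, `download_blob()` already returns the blob's contents, so writing to a local file is optional. Below is a minimal sketch of processing a CSV blob entirely in memory. It assumes SDK v12, where the object returned by `download_blob()` has a `readall()` method that returns bytes, and a UTF-8 blob with a header row. The helper names `parse_csv_bytes` and `read_csv_blob` are hypothetical, not part of the SDK.

```python
import csv
import io


def parse_csv_bytes(raw, encoding="utf-8"):
    """Parse raw CSV bytes into a list of dicts keyed by the header row."""
    text = io.StringIO(raw.decode(encoding), newline="")
    return list(csv.DictReader(text))


def read_csv_blob(blob_client, encoding="utf-8"):
    """Download a blob fully into memory and parse it as CSV.

    `blob_client` is assumed to be an azure.storage.blob.BlobClient (v12),
    or any object whose download_blob().readall() returns bytes.
    """
    raw = blob_client.download_blob().readall()
    return parse_csv_bytes(raw, encoding)
```

With the `blob` object from the snippet above, `rows = read_csv_blob(blob)` would give one dict per CSV row. This loads the whole blob into memory, so it suits files that fit comfortably in RAM.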

Solution 3

Provide your Azure storage account name and its secret key (the account key) here:

from azure.storage.blob import BlockBlobService

block_blob_service = BlockBlobService(account_name='$$$$$$', account_key='$$$$$$')

This will get the blob and save it in the current directory as 'output.jpg':

block_blob_service.get_blob_to_path('your-container-name', 'your-blob', 'output.jpg')

This will get the blob's content as bytes:

blob_item = block_blob_service.get_blob_to_bytes('your-container-name', 'your-blob')
blob_item.content

Solution 4

You can stream from a blob with Python like this:

from tempfile import NamedTemporaryFile
from azure.storage.blob.blockblobservice import BlockBlobService

# conf is assumed to be a settings dict defined elsewhere
entry_path = conf['entry_path']
container_name = conf['container_name']
blob_service = BlockBlobService(
            account_name=conf['account_name'],
            account_key=conf['account_key'])

def get_file(filename):
    local_file = NamedTemporaryFile()
    blob_service.get_blob_to_stream(container_name, filename, stream=local_file,
                                    max_connections=2)

    local_file.seek(0)
    return local_file
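
On cleanup: a `NamedTemporaryFile` is deleted when it is closed (the default `delete=True`), so the caller only needs to close the returned file. Here is a sketch of the same idea with the service passed in explicitly and the temp file closed if the download fails. `fetch_to_tempfile` is a hypothetical name, and `blob_service` is assumed to be a legacy `BlockBlobService` (or anything with a compatible `get_blob_to_stream` method).

```python
from tempfile import NamedTemporaryFile


def fetch_to_tempfile(blob_service, container_name, blob_name):
    """Stream a blob into a temporary file and return it rewound to the start.

    The file is removed from disk when the caller closes it.
    """
    local_file = NamedTemporaryFile()
    try:
        blob_service.get_blob_to_stream(container_name, blob_name, stream=local_file)
    except Exception:
        local_file.close()  # do not leak the temp file if the download fails
        raise
    local_file.seek(0)
    return local_file
```

Using the result in a `with` block, e.g. `with fetch_to_tempfile(blob_service, container_name, "Test.csv") as f:`, closes and removes the file as soon as processing is done.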

Solution 5

I recommend using smart_open.

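import os
from azure.storage.blob import BlobServiceClient
from smart_open import open

# smart_open needs a BlobServiceClient for azure:// URLs; the connection
# string is assumed to be in the AZURE_STORAGE_CONNECTION_STRING variable
connect_str = os.environ['AZURE_STORAGE_CONNECTION_STRING']
transport_params = {'client': BlobServiceClient.from_connection_string(connect_str)}

# stream from Azure Blob Storage
with open('azure://my_container/my_file.txt', transport_params=transport_params) as fin:
    for line in fin:
        print(line)

# stream content *into* Azure Blob Storage (write mode):
with open('azure://my_container/my_file.txt', 'wb', transport_params=transport_params) as fout:
    fout.write(b'hello world')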

Author by

AngiSen

Updated on December 16, 2021

Comments

  • AngiSen
    AngiSen over 2 years

    Can someone tell me if it is possible to read a csv file directly from Azure blob storage as a stream and process it using Python? I know it can be done using C#.Net (shown below) but wanted to know the equivalent library in Python to do this.

    CloudBlobClient client = storageAccount.CreateCloudBlobClient();
    CloudBlobContainer container = client.GetContainerReference("outfiles");
    CloudBlob blob = container.GetBlobReference("Test.csv");
    
    • AngiSen
      AngiSen about 6 years
      @Jay..Do you have any inputs on this?
  • AngiSen
    AngiSen about 6 years
    Thanks Gaurav. I checked the page but could not find an equivalent of GetBlobReference for Python.
  • Gaurav Mantri
    Gaurav Mantri about 6 years
    As such you don't get reference to BlockBlob as you can get in .Net SDK. I have edited my code to show how you can download a blob to local file system and added a link to SDK documentation. HTH.
  • AngiSen
    AngiSen about 6 years
    I know this functionality exists in the Python SDK, but I am looking for a function similar to the .NET one.
  • Gaurav Mantri
    Gaurav Mantri about 6 years
    So if I understand correctly, you wish to create an instance of BlockBlob (like CloudBlockBlob) in Python. Correct? Would you mind explaining the reason behind it.
  • AngiSen
    AngiSen about 6 years
    It's in alignment with some of our existing work... I need to read a file from blob storage as a stream, do some processing and write it back to the blob. The whole Python app will run as a WebJob. I know I can download the file from the blob to the WebJob's drive (D:), but I wanted to know if Python has functionality similar to .NET's that avoids downloading the file to disk.
  • Gaurav Mantri
    Gaurav Mantri about 6 years
    Oh...Yes, you would use get_blob_to_stream in that case. Please check out this page: azure-storage.readthedocs.io/ref/….
  • hui chen
    hui chen almost 5 years
    I would suggest putting the official Microsoft Docs link for the SDK documentation.
  • Hayat
    Hayat over 4 years
    It says AzureMissingResourceHttpError. When I print(list(blob_service.list_blob_names("azureml"))) it shows 'data/leming/00U/dataset.csv' I don't know what to put in filepath
  • Rajat Arora
    Rajat Arora over 4 years
    Hi, this still downloads the file. Is it possible to get the contents of the blob without downloading the file?
  • Sebastian Dziadzio
    Sebastian Dziadzio over 4 years
    When you do data = blob.download_blob(), the contents of the blob will be in data, you don't need to write to a file.
  • RB17
    RB17 about 4 years
    @SebastianDziadzio Is there a way to read this data into a pandas data frame? I am somehow unable to get it working with BlockBlobService.
  • Sebastian Dziadzio
    Sebastian Dziadzio about 4 years
    If you're downloading a CSV file, you should be able to convert the contents of data to a data frame with pd.read_csv(data).
  • Steven Van Dorpe
    Steven Van Dorpe over 3 years
    Thanks for this, very useful. Does the TemporaryFile need clean-up afterwards?
  • Daniel R
    Daniel R over 3 years
    Happy to help :) According to the docs (docs.python.org/3/library/tempfile.html), it will be closed and destroyed, so there is no need to worry about that.
  • Niels Hoogeveen
    Niels Hoogeveen about 2 years
    How can I read all csv files in a folder and append them to my dataframe?
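
For this last question, one possible approach with SDK v12 is to list the blobs under a common prefix (blob storage has no real folders, only names containing `/`) and parse each CSV in turn. The sketch below assumes `container` is an `azure.storage.blob.ContainerClient` and that every file is UTF-8 with the same header row. `read_csv_folder` is a hypothetical helper name.

```python
import csv
import io


def read_csv_folder(container, prefix, encoding="utf-8"):
    """Collect the rows of every .csv blob whose name starts with `prefix`."""
    rows = []
    for item in container.list_blobs(name_starts_with=prefix):
        if not item.name.lower().endswith(".csv"):
            continue  # skip non-CSV blobs under the same prefix
        raw = container.download_blob(item.name).readall()
        text = io.StringIO(raw.decode(encoding), newline="")
        rows.extend(csv.DictReader(text))
    return rows
```

If pandas is available, `pd.DataFrame(rows)` turns the combined rows into a single data frame; all values are strings at that point, so numeric columns would still need converting.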