Downloading the contents of an Azure blob as a text string takes too long
To speed up the process, instead of reading the entire blob in one go, you could read it in chunks. Take a look at the DownloadRangeToStream method.
Essentially, the idea is that you first create an empty file of 30 MB (the size of your blob). Then, in parallel, you download 1 MB chunks (or whatever size you see fit) using the DownloadRangeToStream method. As each chunk finishes downloading, you write its stream contents at the matching offset in the file.
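A minimal sketch of the parallel chunked download described above, under some stated assumptions. It is not code from the linked answer. The names ChunkedDownload, RangeReader and DownloadInParallelAsync are hypothetical, the ranged read is abstracted behind a delegate so the sketch does not depend on the storage library, and the chunks are assembled in a pre-allocated in-memory buffer rather than a file (the blob here is only about 30 MB and ends up as a string anyway).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ChunkedDownload
{
    // Hypothetical abstraction: returns `length` bytes of the blob starting at `offset`.
    // With the storage client library this would wrap a ranged download call.
    public delegate Task<byte[]> RangeReader(long offset, int length);

    // Downloads `totalLength` bytes in chunks of `chunkSize`, with at most
    // `maxParallel` ranged reads in flight at any time.
    public static async Task<byte[]> DownloadInParallelAsync(
        RangeReader readRange, long totalLength, int chunkSize, int maxParallel)
    {
        // Pre-allocate the destination, the in-memory analogue of the "empty file".
        var buffer = new byte[totalLength];

        using (var throttle = new SemaphoreSlim(maxParallel))
        {
            var tasks = new List<Task>();
            for (long offset = 0; offset < totalLength; offset += chunkSize)
            {
                // The last chunk may be shorter than chunkSize.
                int length = (int)Math.Min(chunkSize, totalLength - offset);
                tasks.Add(FetchChunkAsync(readRange, buffer, offset, length, throttle));
            }
            await Task.WhenAll(tasks);
        }
        return buffer;
    }

    private static async Task FetchChunkAsync(
        RangeReader readRange, byte[] buffer, long offset, int length, SemaphoreSlim throttle)
    {
        await throttle.WaitAsync();
        try
        {
            byte[] chunk = await readRange(offset, length);
            // Each chunk targets a distinct region of the buffer, so no lock is needed.
            Array.Copy(chunk, 0, buffer, offset, length);
        }
        finally
        {
            throttle.Release();
        }
    }
}
```

To use this against a real blob, the RangeReader could download the requested range into a MemoryStream with DownloadRangeToStream (or its async variant) and return the stream's bytes, with totalLength taken from the blob's properties after fetching its attributes. The resulting buffer can then be decoded with Encoding.UTF8.GetString, as in the original snippet. The chunk size and degree of parallelism are tuning knobs; the values that work best depend on the network.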
I answered a similar question on SO a few days ago: StorageException when downloading a large file over a slow network. Take a look at my answer there. There the chunks are downloaded in sequence but it should give you some idea about how to implement chunked download.
Comments
-
Rohit almost 2 years
I am developing an application that:
1. Uploads a .CSV file to Azure Blob storage from my local machine using a simple HTTP web page (REST methods).
2. Once the .CSV file is uploaded, fetches the stream in order to update my database.
The .CSV file is around 30 MB. It takes 2 minutes to upload to the blob, but 30 minutes to read the stream. Can you please provide inputs to improve the speed? Here is the code snippet being used to read the stream from the file: https://azure.microsoft.com/en-in/documentation/articles/storage-dotnet-how-to-use-blobs/
public string GetReadData(string filename)
{
    // Retrieve storage account from connection string.
    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
        System.Web.Configuration.WebConfigurationManager.AppSettings["StorageConnectionString"]);

    // Create the blob client.
    CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

    // Retrieve reference to a previously created container.
    CloudBlobContainer container = blobClient.GetContainerReference(
        System.Web.Configuration.WebConfigurationManager.AppSettings["BlobStorageContainerName"]);

    // Retrieve reference to a blob named "filename"
    CloudBlockBlob blockBlob2 = container.GetBlockBlobReference(filename);

    string text;
    using (var memoryStream = new MemoryStream())
    {
        blockBlob2.DownloadToStream(memoryStream);
        text = System.Text.Encoding.UTF8.GetString(memoryStream.ToArray());
    }
    return text;
}
-
Zhaoxing Lu over 8 years
Gaurav has already answered the question perfectly, but personally I'd still suggest that you not run your application on your local machine. :) Honestly, taking 30 minutes to download a 30 MB file in a single thread is super terrible. Please consider moving your application into Azure (Web Role/Worker Role/Virtual Machine) or somewhere with a better network environment.
-
GFoley83 about 8 years
@GauravMantri can you provide an example of how you would download in parallel please? I see you can set ParallelOperationThreadCount in BlobRequestOptions but this only works for uploads. Can't find any code on the topic. Thanks!
-
Gaurav Mantri about 8 years
@GFoley ... Did you take a look at the sample code I posted here: stackoverflow.com/questions/31128977/…?
-
GFoley83 about 8 years
I did. And up-voted it too. The problem is (as you mentioned above) it only demos how to download the chunks in sequence, not in parallel.
-
Gaurav Mantri about 8 years
Aah .... I see (& thanks for upvoting :)). Can you please post a new question and I will try to hack some code for you and provide that as an answer?