What does await do in this function?

Solution 1

It awaits the completion of the HTTP request. The loop moves on to the next iteration only after the current request has completed, so the requests run one at a time.

Your 2nd version works precisely because it doesn't wait for each task to complete before starting the next one; it waits only once, after all the tasks have been started.
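
To make the difference concrete, here is a minimal sketch that times both patterns. It assumes a made-up FakeGetStringAsync helper built on Task.Delay as a stand-in for client.GetStringAsync, so it runs without network access; the helper names and the 100 ms delay are illustrative only.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical stand-in for client.GetStringAsync: finishes after about 100 ms.
async Task<string> FakeGetStringAsync(int page)
{
    await Task.Delay(100);
    return $"page {page}";
}

// Pattern 1: await inside the loop, so request i+1 starts only after request i is done.
async Task<TimeSpan> SequentialAsync(int count)
{
    var sw = Stopwatch.StartNew();
    for (int i = 0; i < count; i++)
        await FakeGetStringAsync(i);
    return sw.Elapsed;
}

// Pattern 2: start every request first, then wait once for all of them.
async Task<TimeSpan> ConcurrentAsync(int count)
{
    var sw = Stopwatch.StartNew();
    var tasks = new Task<string>[count];
    for (int i = 0; i < count; i++)
        tasks[i] = FakeGetStringAsync(i);   // no await: the task is already in flight
    await Task.WhenAll(tasks);
    return sw.Elapsed;
}

TimeSpan sequential = await SequentialAsync(5);
TimeSpan concurrent = await ConcurrentAsync(5);
Console.WriteLine($"sequential: {sequential.TotalMilliseconds:F0} ms");
Console.WriteLine($"concurrent: {concurrent.TotalMilliseconds:F0} ms");
```

With five simulated requests, the sequential time should come out at roughly five times the concurrent one, although the exact numbers depend on the machine.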

What async-await is useful for is allowing the calling function to continue doing other things while the asynchronous function is awaiting, as opposed to synchronous ("normal") functions that block the calling function until completion.
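
A small sketch of what "the caller continues" means in practice. FakeDownloadAsync is a hypothetical method that uses Task.Delay in place of a real request, and the log list is only there to record the order of events.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var log = new List<string>();

// Hypothetical stand-in for an HTTP request: completes after a short delay.
async Task<string> FakeDownloadAsync()
{
    log.Add("download started");
    await Task.Delay(200);              // control returns to the caller here
    log.Add("download finished");
    return "content";
}

// The call runs synchronously up to the first await, then hands back an incomplete task.
Task<string> pending = FakeDownloadAsync();
log.Add("caller keeps working");        // runs while the "download" is still pending
string result = await pending;          // now the caller chooses to wait for the result
log.Add($"caller got {result}");

Console.WriteLine(string.Join(" -> ", log));
// download started -> caller keeps working -> download finished -> caller got content
```

The caller's own work is logged before the download finishes, which is exactly what a blocking call would not allow.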

Solution 2

An await is an asynchronous wait. It is not a blocking call and allows the caller of your method to continue. The remainder of the code inside the method after an await will be executed when the Task returned has completed.

In the first version of your code, you allow callers to continue. However, each iteration of the loop will wait until the Task returned by GetStringAsync has completed. This has the effect of sequentially downloading each URL, rather than concurrently.

Note that the second version of your code is not asynchronous insofar as it uses threads to perform the work in parallel.

If it were asynchronous, it would retrieve the webpage content using only one thread but still concurrently.

Something like this (untested):

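public static async Task<int> Test()
{
    int ret = 0;
    HttpClient client = new HttpClient();
    // Task<string> rather than Task, so the downloaded strings stay accessible.
    List<Task<string>> taskList = new List<Task<string>>();
    for (int i = 1000; i <= 1100; i++)
    {
        // No await here: the request is started and the loop continues immediately.
        taskList.Add(client.GetStringAsync($"https://en.wikipedia.org/wiki/{i}"));
    }
    // Asynchronously wait until every download has finished.
    await Task.WhenAll(taskList);
    return ret;
}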

Here, we start the tasks and add them to taskList without awaiting them individually. The tasks are non-blocking and complete once the download has finished and the string has been retrieved. Note the call to Task.WhenAll rather than Task.WaitAll: the former is asynchronous and non-blocking, the latter is synchronous and blocking. This means that, at the await, the caller of Test() receives the returned Task<int>, but that task remains incomplete until all of the strings have been downloaded.
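
The "incomplete task" point can be observed directly. The sketch below is an illustration, not the code above: FakeGetStringAsync and TestAsync are hypothetical names, and Task.Delay stands in for the real download so the example runs offline.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for client.GetStringAsync(url).
async Task<string> FakeGetStringAsync(int page)
{
    await Task.Delay(100);
    return $"content of page {page}";
}

async Task<int> TestAsync()
{
    // Start three simulated downloads without awaiting them one by one.
    Task<string>[] downloads = Enumerable.Range(1000, 3)
        .Select(page => FakeGetStringAsync(page))
        .ToArray();

    // Non-blocking: TestAsync hands its incomplete Task<int> back to the caller here.
    string[] pages = await Task.WhenAll(downloads);
    return pages.Sum(p => p.Length);
}

Task<int> pendingTotal = TestAsync();
bool completedAtReturn = pendingTotal.IsCompleted;   // checked right after the call returns
int total = await pendingTotal;
Console.WriteLine($"completed at return: {completedAtReturn}, total length: {total}");
// completed at return: False, total length: 60
```

Awaiting Task.WhenAll on Task<string> items also yields the strings as an array, which is one way to actually use the downloaded content.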

This is what forces async/await to proliferate throughout the stack. Once the very bottom call is asynchronous, it only makes sense if the rest of the callers all the way up are also asynchronous. Otherwise, you are forced to hand the work to a thread-pool thread via Task.Run() calls or some such.

Solution 3

Per the MSDN documentation:

The await operator is applied to a task in an asynchronous method to suspend the execution of the method until the awaited task completes. The task represents ongoing work.

That means the await operator suspends the for loop until it gets a response from the server, which makes the requests sequential.

What you can do is create all the tasks (so that they begin executing) and then await all of them.

Here's an example from another Stack Overflow question:

public IEnumerable<TContent> DownloadContentFromUrls<TContent>(IEnumerable<string> urls)
{
    var queue = new ConcurrentQueue<TContent>();

    using (var client = new HttpClient())
    {
        Task.WaitAll(urls.Select(url =>
        {
            return client.GetAsync(url).ContinueWith(response =>
            {
                var content = JsonConvert.
                    DeserializeObject<IEnumerable<TContent>>(
                        response.Result.Content.ReadAsStringAsync().Result);

                foreach (var c in content)
                    queue.Enqueue(c);
            });
        }).ToArray());
    }

    return queue;
}

There's also a good article on MSDN that explains how to make parallel requests with await.

Edit:

As @GaryMcLeanHall pointed out in a comment, you can change Task.WaitAll to await Task.WhenAll and add the async modifier (with a Task-based return type), so that the method waits asynchronously instead of blocking.
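
A rough sketch of what that change could look like. This is an illustration under assumptions, not the code from the linked question: the fetch parameter and FakeFetch are hypothetical (they let the example run without a network; in real code fetch could be client.GetStringAsync), and the upper-casing step is only a placeholder for the JSON deserialization.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// fetch is a hypothetical parameter standing in for an HttpClient call.
async Task<List<string>> DownloadContentFromUrlsAsync(
    IEnumerable<string> urls, Func<string, Task<string>> fetch)
{
    // One task per URL: each awaits its own download, then post-processes it.
    IEnumerable<Task<string>> perUrl = urls.Select(async url =>
    {
        string body = await fetch(url);     // replaces ContinueWith + .Result
        return body.ToUpperInvariant();     // placeholder for deserialization
    });

    // Replaces Task.WaitAll: waits for every task without blocking a thread.
    string[] contents = await Task.WhenAll(perUrl);
    return contents.ToList();
}

// Fake fetch for demonstration only.
async Task<string> FakeFetch(string url)
{
    await Task.Delay(50);
    return $"body of {url}";
}

List<string> results = await DownloadContentFromUrlsAsync(
    new[] { "https://example.com/a", "https://example.com/b" }, FakeFetch);
Console.WriteLine(string.Join(", ", results));
```

Because Task.WhenAll returns the results in the same order as the input tasks, the ConcurrentQueue from the blocking version is no longer needed in this sketch.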

Here's another MSDN article that picks up the example from the first one and adds the use of WhenAll.

Author: derekhh

Updated on June 28, 2022

Comments

  • derekhh, almost 2 years ago

    I thought I understood the async-await pattern in C#, but today I found out that I really don't.

    Consider a simple code snippet like this. I have already set System.Net.ServicePointManager.DefaultConnectionLimit = 1000.

    public static async Task Test()
    {
        HttpClient client = new HttpClient();
        string str;
        for (int i = 1000; i <= 1100; i++)
            str = await client.GetStringAsync($"https://en.wikipedia.org/wiki/{i}");
    }
    

    What does await do here? Initially I thought that, since this uses the async-await pattern, HttpClient would initiate all the HTTP GET calls in a multi-threaded fashion, i.e. all the URLs would be fetched at once.

    But when I used Fiddler to analyze the behavior, I found that it really fetches the URLs sequentially.

    I need to change it to this to make it work:

    public static async Task<int> Test()
    {
        int ret = 0;
        HttpClient client = new HttpClient();
        List<Task> taskList = new List<Task>();
        for (int i = 1000; i <= 1100; i++)
        {
            var i1 = i;
            var task = Task.Run(() => client.GetStringAsync($"https://en.wikipedia.org/wiki/{i1}"));
            taskList.Add(task);
        }
        Task.WaitAll(taskList.ToArray());
        return ret;
    }
    

    This time the URLs are fetched in parallel. So what does the await keyword really do in the first code snippet?