How to get the file size from HTTP headers

Solution 1

Yes, assuming the HTTP server you're talking to supports/allows this:

public long GetFileSize(string url)
{
    long result = -1;

    System.Net.WebRequest req = System.Net.WebRequest.Create(url);
    req.Method = "HEAD";
    using (System.Net.WebResponse resp = req.GetResponse())
    {
        if (long.TryParse(resp.Headers.Get("Content-Length"), out long ContentLength))
        {
            result = ContentLength;
        }
    }

    return result;
}

If using the HEAD method is not allowed, or the Content-Length header is not present in the server's reply, the only way to determine the size of the content on the server is to download it. Since that is neither efficient nor particularly reliable, most servers include this information.
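For the case where neither HEAD nor Content-Length is available, the download fallback could look roughly like the sketch below. It assumes an HttpClient is used; the class and method names (ContentSizeFallback, CountBytesAsync, MeasureByDownloadAsync) are made up for illustration. The body is streamed and counted rather than buffered in memory.

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class ContentSizeFallback
{
    // Counts the bytes in a stream without keeping them in memory.
    public static async Task<long> CountBytesAsync(Stream stream)
    {
        var buffer = new byte[81920];
        long total = 0;
        int read;
        while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            total += read;
        }
        return total;
    }

    // Hypothetical helper: downloads the body of `url` and returns its size in bytes.
    public static async Task<long> MeasureByDownloadAsync(HttpClient client, string url)
    {
        // ResponseHeadersRead lets us start reading before the whole body is buffered.
        using (var response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();
            using (var body = await response.Content.ReadAsStreamAsync())
            {
                return await CountBytesAsync(body);
            }
        }
    }
}
```

This transfers the entire file, so it is only worth doing when the size cannot be learned from the headers.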

Solution 2

Can this be done with HTTP headers?

Yes, this is the way to go. If the information is provided, it's in the header as the Content-Length. Note, however, that the server is not required to send it.

Downloading only the header can be done using a HEAD request instead of GET. Maybe the following code helps:

HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://example.com/");
req.Method = "HEAD";
long len;
using(HttpWebResponse resp = (HttpWebResponse)(req.GetResponse()))
{
    len = resp.ContentLength;
}

Notice the property for the content length on the HttpWebResponse object – no need to parse the Content-Length header manually.
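In newer .NET versions, where WebRequest is marked obsolete, the same HEAD request can probably be written with HttpClient instead. The following is only a sketch under that assumption; HeadRequestExample and GetContentLengthAsync are illustrative names, not part of the original answer. Here the content length is a nullable long, so a missing header shows up as null.

```csharp
using System.Net.Http;
using System.Threading.Tasks;

public static class HeadRequestExample
{
    // Returns the Content-Length reported for `url`, or null if the server did not send one.
    public static async Task<long?> GetContentLengthAsync(HttpClient client, string url)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Head, url))
        using (var response = await client.SendAsync(request))
        {
            // Throws if the server rejects the HEAD request (e.g. 403 or 405).
            response.EnsureSuccessStatusCode();
            return response.Content.Headers.ContentLength;
        }
    }
}
```

A caller would reuse one HttpClient instance and treat a null result as "size unknown".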

Solution 3

Note that not every server accepts HTTP HEAD requests. One alternative approach to get the file size is to make an HTTP GET call to the server requesting only a portion of the file to keep the response small and retrieve the file size from the metadata that is returned as part of the response content header.

The standard System.Net.Http.HttpClient can be used to accomplish this. The partial content is requested by setting a byte range on the request message header as:

    request.Headers.Range = new RangeHeaderValue(startByte, endByte);

The server responds with a message containing the requested range as well as the entire file size. This information is returned in the response content headers (response.Content.Headers) with the key "Content-Range".

Here's an example of the content range in the response message content header:

    {
       "Key": "Content-Range",
       "Value": [
         "bytes 0-15/2328372"
       ]
    }

In this example the header value implies the response contains bytes 0 to 15 (i.e., 16 bytes total) and the file is 2,328,372 bytes in its entirety.
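To make the format concrete, a value like the one above can be parsed with the framework's ContentRangeHeaderValue type rather than by hand. This is a minimal sketch; the wrapper class and method name (ContentRangeExample, TotalLength) are hypothetical.

```csharp
using System.Net.Http.Headers;

public static class ContentRangeExample
{
    // Returns the total size from a Content-Range value,
    // or null when the server reports the size as unknown ("*").
    public static long? TotalLength(string contentRange)
    {
        ContentRangeHeaderValue parsed = ContentRangeHeaderValue.Parse(contentRange);
        return parsed.Length;
    }
}
```

For the value "bytes 0-15/2328372", parsed.From is 0, parsed.To is 15, and TotalLength returns 2328372.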

Here's a sample implementation of this method:

using System.Linq;            // for Single()
using System.Threading.Tasks; // for Task<long>

public static class HttpClientExtensions
{
    public static async Task<long> GetContentSizeAsync(this System.Net.Http.HttpClient client, string url)
    {
        using (var request = new System.Net.Http.HttpRequestMessage(System.Net.Http.HttpMethod.Get, url))
        {
            // In order to keep the response as small as possible, set the requested byte range to [0,0] (i.e., only the first byte)
            request.Headers.Range = new System.Net.Http.Headers.RangeHeaderValue(from: 0, to: 0);

            using (var response = await client.SendAsync(request))
            {
                response.EnsureSuccessStatusCode();

                if (response.StatusCode != System.Net.HttpStatusCode.PartialContent) 
                    throw new System.Net.WebException($"expected partial content response ({System.Net.HttpStatusCode.PartialContent}), instead received: {response.StatusCode}");

                var contentRange = response.Content.Headers.GetValues(@"Content-Range").Single();
                var lengthString = System.Text.RegularExpressions.Regex.Match(contentRange, @"(?<=^bytes\s[0-9]+\-[0-9]+/)[0-9]+$").Value;
                return long.Parse(lengthString);
            }
        }
    }
}
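The extension above throws when the server ignores the Range header and answers with 200 OK. One possible way to tolerate that, sketched here with made-up names (RangeFallback, SizeFromResponse), is to fall back to Content-Length in the 200 case and return null when neither header helps.

```csharp
using System.Net;
using System.Net.Http;

public static class RangeFallback
{
    // Hypothetical helper: extracts the total size from a response to a ranged GET.
    public static long? SizeFromResponse(HttpResponseMessage response)
    {
        if (response.StatusCode == HttpStatusCode.PartialContent)
        {
            // 206: the total size is the part after '/' in Content-Range (null if it is "*").
            return response.Content.Headers.ContentRange?.Length;
        }
        if (response.StatusCode == HttpStatusCode.OK)
        {
            // 200: the server ignored the Range header and is sending the whole body,
            // so Content-Length (if present) is the full size.
            return response.Content.Headers.ContentLength;
        }
        return null;
    }
}
```

If this is used, the request should be sent with HttpCompletionOption.ResponseHeadersRead and the response disposed right after reading the headers, so that a 200 reply does not download the whole file.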

Solution 4

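using (WebClient webClient = new WebClient())
using (Stream stream = webClient.OpenRead("http://stackoverflow.com/robots.txt"))
{
    // OpenRead issues a GET; the response headers are available once it returns.
    long totalSizeBytes = Convert.ToInt64(webClient.ResponseHeaders["Content-Length"]);
    Console.WriteLine(totalSizeBytes);
}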
Author by

Admin

Updated on July 05, 2022

Comments

  • Admin almost 2 years
    I want to get the size of an http:/.../file before I download it. The file can be a webpage, image, or a media file. Can this be done with HTTP headers? How do I download just the file HTTP header?
  • ChocoMartin about 13 years
    Won't resp.ContentLength above give you the length of the HEAD response, and not the length of the file you were interested in getting the size of?
  • Konrad Rudolph about 13 years
    @Adam No. The documentation says: “The ContentLength property contains the value of the Content-Length header returned with the response.”
  • Eric Smith about 11 years
    Make sure you call resp.Close() or else you can encounter timeout errors when making multiple requests at a time (my third request was timing out in a foreach loop, which was solved by closing each response).
  • Konrad Rudolph about 11 years
    @Eric In fact you should use a using block here, or implement the disposable pattern to manage the lifetime of the resource explicitly. Manually calling Close is not enough unless you ensure that it always happens, even in the case of error.
  • Eric Smith about 11 years
    @KonradRudolph You're absolutely right. Calling Close() fixed my bug while I was testing this, but a using block is the correct way to do it. Derp.
  • justderb about 11 years
    If you use using, it is disposed automatically. msdn.microsoft.com/en-us/library/yh598w02(v=vs.110).aspx
  • gunr2171 about 11 years
    @KonradRudolph, FYI, ContentLength returns a long. Not a big deal but just in case you want to fix it.
  • Preston over 8 years
    Another note: if you are using this for extremely large files, int is not enough; you'll need to use long ContentLength; and long.TryParse(xxx) to support more than a 2.14GB size return value.
  • Justin almost 8 years
    Won't enabling HTTP compression throw off the actual file size?
  • ScottFoster1000 over 5 years
    This is a great solution, especially if you're already using WebClient to download the file and just want to add checking the file length first.
  • Phani Rithvij over 4 years
    Nice solution, but not every server allows content range requests.
  • Behzad about 4 years
    I used this method to find the size of this link: http://ipv4.download.thinkbroadband.com/200MB.zip but got a 403 error. Why?