Is there any way to get the file extension from a URL?

31,162

Solution 1

It looks odd to call a file-system API on a URL, but it works:

string url = @"http://example.com/file.jpg";
string ext = System.IO.Path.GetExtension(url);
MessageBox.Show(this, ext);

However, as crono remarked, it does not work when the URL has query-string parameters:

string url = @"http://example.com/file.jpg?par=x";
string ext = System.IO.Path.GetExtension(url);
MessageBox.Show(this, ext);

Result: ".jpg?par=x"

Solution 2

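Here's a simple one I use. It works with query-string parameters and with both absolute and relative URLs.

// Requires "using System.Linq;" for Last() and, on older frameworks, for Contains(char).
public static string GetFileExtensionFromUrl(string url)
{
    url = url.Split('?')[0];        // drop the query string
    url = url.Split('/').Last();    // keep only the last path segment
    // Everything from the last '.' onward, or "" if the segment has no dot.
    return url.Contains('.') ? url.Substring(url.LastIndexOf('.')) : "";
}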

A unit test, if you want one:

[TestMethod]
public void TestGetExt()
{
    Assert.AreEqual(".js", Helpers.GetFileExtensionFromUrl("../wtf.js?x=wtf"));
    Assert.AreEqual(".js", Helpers.GetFileExtensionFromUrl("wtf.js"));
    Assert.AreEqual(".js", Helpers.GetFileExtensionFromUrl("http://www.com/wtf.js?wtf"));
    Assert.AreEqual("", Helpers.GetFileExtensionFromUrl("wtf"));
    Assert.AreEqual("", Helpers.GetFileExtensionFromUrl(""));
}

Tune it for your own needs.

P.S. Do not use Path.GetExtension on the raw URL, because it does not handle query-string parameters.

Solution 3

I know this is an old question, but this may help people who find it.

In my view, the best approach for getting the extension of a file name inside a URL, including URLs with parameters, is a regex.

You can use this pattern (it is not limited to URLs):

.+(\.\w{3})\?*.*

Explanation:

.+     Match one or more of any character
(...)  Create a capture group, so the text matched inside the parentheses can be retrieved afterwards
\.     Match the character '.'
\w{3}  Match exactly three word characters, [a-zA-Z0-9_]
\?*    Match zero or more '?' characters
.*     Match zero or more of any character

Example:

http://example.com/file.png
http://example.com/file.png?foo=10

But if you have a URL like this:

http://example.com/asd
For this URL, the pattern takes '.com' as the extension.

So you can use a stricter pattern for URLs, like this:

.+\/{2}.+\/{1}.+(\.\w+)\?*.*

Explanation:

.+        Match one or more of any character
\/{2}     Match two '/' characters
.+        Match one or more of any character
\/{1}     Match one '/' character
.+        Match one or more of any character
(\.\w+)   Capture group: match '.' followed by one or more word characters, [a-zA-Z0-9_]
\?*       Match zero or more '?' characters
.*        Match zero or more of any character

Example:

http://example.com/file.png          (Match .png)
https://example.com/file.png?foo=10  (Match .png)
http://example.com/asd               (No match)
C:\Foo\file.png                      (No match, only urls!)

http://example.com/file.png

    http:        .+
    //           \/{2}
    example.com  .+
    /            \/{1}
    file         .+
    .png         (\.\w+)
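
To show how the stricter pattern could be applied from C#, here is a minimal sketch. The class and method names (UrlRegexExtension, GetExtension) are made up for illustration and are not part of the original answer; only the pattern itself comes from the text above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlRegexExtension
{
    // The stricter URL pattern from the answer, unchanged.
    private static readonly Regex UrlExtension =
        new Regex(@".+\/{2}.+\/{1}.+(\.\w+)\?*.*");

    // Hypothetical helper: returns the captured extension (including the dot),
    // or "" when the input does not match the pattern.
    public static string GetExtension(string url)
    {
        Match m = UrlExtension.Match(url);
        return m.Success ? m.Groups[1].Value : "";
    }
}
```

One caveat worth knowing: because the `.+` before the group is greedy, the group captures the last dot in the whole string. A URL such as http://example.com/file.png?v=1.2 would therefore yield ".2" rather than ".png", so this sketch is only suitable when query strings contain no dots.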

Solution 4

If you just want to get the .jpg part of http://example.com/file.jpg then just use Path.GetExtension as heringer suggests.

// The following evaluates to ".jpg"
Path.GetExtension("http://example.com/file.jpg")

If the download link is something like http://example.com/this_url_will_download_a_file, then the file name will be carried in Content-Disposition, an HTTP header used to suggest a file name to browsers that display a "save file" dialog. To get this file name, you can use the technique suggested in "Get filename without Content-Disposition": initiate the download, read the HTTP headers, and then cancel the download without actually downloading any of the file:

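// Assumes `url` is the download URL and `defaultFileName` is a fallback chosen by the caller.
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
string fileName;
using (HttpWebResponse res = (HttpWebResponse)request.GetResponse())
{
    // Only the headers are inspected; the response body is never read.
    string disposition = res.Headers["Content-Disposition"];
    string location = res.Headers["Location"];
    if (disposition != null)
        fileName = disposition.Replace("attachment; filename=", "").Replace("\"", "");
    else if (location != null)
        fileName = Path.GetFileName(location);
    else if (Path.GetFileName(url).Contains('?') || Path.GetFileName(url).Contains('='))
        fileName = Path.GetFileName(res.ResponseUri.ToString());
    else
        fileName = defaultFileName;
}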
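
Stripping "attachment; filename=" with Replace only works when the header has exactly that shape. As a possible alternative, here is a sketch that lets System.Net.Mime.ContentDisposition parse the header value instead. The helper name GetExtensionFromDisposition is an assumption made for this example, and the parser may still reject some header values seen in practice, which is why the sketch falls back to an empty string.

```csharp
using System;
using System.IO;
using System.Net.Mime;

public static class DispositionHelper
{
    // Hypothetical helper: returns the extension of the file name suggested by a
    // Content-Disposition header value, or "" if the header is missing,
    // cannot be parsed, or carries no file name.
    public static string GetExtensionFromDisposition(string headerValue)
    {
        if (string.IsNullOrWhiteSpace(headerValue)) return "";
        try
        {
            string fileName = new ContentDisposition(headerValue).FileName;
            return string.IsNullOrEmpty(fileName) ? "" : Path.GetExtension(fileName);
        }
        catch (FormatException)
        {
            // The header value did not parse as a Content-Disposition.
            return "";
        }
    }
}
```

It could be called with res.Headers["Content-Disposition"] from the snippet above.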

Solution 5

Here is my solution:

if (Uri.TryCreate(url, UriKind.Absolute, out var uri)){
    Console.WriteLine(Path.GetExtension(uri.LocalPath));
}

First, I verify that the string is a valid absolute URL; then I get the file extension from its local path, which excludes the query string.
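
The same idea can be wrapped in a small helper that returns the extension instead of printing it. This is only a sketch: the class and method names are invented for illustration, and the extra check that restricts input to http and https is an assumption added here (without it, a local path such as C:\Foo\file.png may also parse as an absolute file URI).

```csharp
using System;
using System.IO;

public static class UriExtensionHelper
{
    // Hypothetical helper: returns the extension (including the dot) of the
    // path part of an absolute http/https URL, or "" otherwise.
    public static string GetExtension(string url)
    {
        if (!Uri.TryCreate(url, UriKind.Absolute, out Uri uri)) return "";

        // Assumption for this sketch: only web URLs are of interest.
        if (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps) return "";

        // LocalPath excludes the query string and fragment.
        return Path.GetExtension(uri.LocalPath);
    }
}
```

Relative URLs such as ../wtf.js are rejected by UriKind.Absolute, so this variant returns "" for them; Solution 2 handles that case.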

Author: z3nth10n

Hi, I'm Zenthion. I'm actually developing a port from GTA SA to Unity3D (https://gitlab.com/uta-gi/unity-theft-auto-intro).

Updated on July 28, 2022

Comments

  • z3nth10n (almost 2 years ago)

    I want to know this to make sure that the file my script downloads will have the extension I expect.

    The file will not be at URLs like:

    http://example.com/this_url_will_download_a_file

    Or maybe it will, but I think I will only use this kind of URL:

    http://example.com/file.jpg

    I do not want to check it with Url.Substring(Url.LastIndexOf(".") - 3, 3), because that is a very poor approach.

    So, what do you recommend?