Android: how to parse URL String with spaces to URI object?

114,607

Solution 1

You should in fact URI-encode the "invalid" characters. Since the string contains the complete URL, it's hard to URI-encode it properly: you don't know which slashes / should be treated as separators and which should not, and you cannot predict that from a raw String beforehand. The problem really needs to be solved at a higher level. Where does that String come from? Is it hardcoded? Then just change it yourself accordingly. Does it come in as user input? Then validate it, show an error, and let the user fix it.

In any case, if you can ensure that only the spaces make the URL invalid, then you can also just do a plain string replace with %20:

URI uri = new URI(string.replace(" ", "%20"));

Or, if you can ensure that only the part after the last slash needs to be URI-encoded, then you can also do so with the help of the android.net.Uri utility class:

int pos = string.lastIndexOf('/') + 1;
URI uri = new URI(string.substring(0, pos) + Uri.encode(string.substring(pos)));

Do note that URLEncoder is unsuitable for the task, as it's designed to encode query string parameter names/values as per application/x-www-form-urlencoded rules (as used in HTML forms). See also Java URL encoding of query string parameters.
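
To make the form-encoding point concrete, here is a minimal sketch in plain Java. The class name UrlEncoderDemo and the helper formEncode are made up for illustration; only URLEncoder.encode itself comes from the JDK. It shows that spaces become + and that the scheme and path separators get escaped when a whole URL is passed in.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class UrlEncoderDemo {
    // Hypothetical helper: form-encodes a string and hides the checked exception.
    static String formEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Spaces are written as '+', not "%20".
        System.out.println(formEncode("aa bb cc"));
        // ':' and '/' are escaped too, so the result is no longer a usable URL.
        System.out.println(formEncode("http://myhost.com/media/File Name.mp3"));
    }
}
```

The first call prints aa+bb+cc and the second prints http%3A%2F%2Fmyhost.com%2Fmedia%2FFile+Name.mp3, which is why the encoder should only ever see a single query parameter name or value, never a full URL.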

Solution 2

java.net.URLEncoder.encode(finalPartOfString, "utf-8");

This will URL-encode the string. Note that URLEncoder follows form-encoding rules, so spaces come out as + rather than %20.

finalPartOfString is the part after the last slash; in your case, apparently the name of the song.
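
If URLEncoder is used anyway, one possible workaround is to encode only the last path segment and then turn every + into %20. The sketch below assumes that only the part after the last slash needs encoding; the class LastSegmentEncoder and the method encodeLastSegment are hypothetical names. Replacing + afterwards is safe because URLEncoder has already escaped any literal plus sign as %2B.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;

class LastSegmentEncoder {
    // Hypothetical helper: encodes only the text after the last '/'.
    static URI encodeLastSegment(String url) {
        int pos = url.lastIndexOf('/') + 1;
        String encoded;
        try {
            // URLEncoder writes spaces as '+'; a path needs "%20" instead.
            encoded = URLEncoder.encode(url.substring(pos), "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
        return URI.create(url.substring(0, pos) + encoded);
    }

    public static void main(String[] args) {
        System.out.println(encodeLastSegment(
                "http://myhost.com/media/File Name that has spaces inside.mp3"));
    }
}
```

For the URL from the question this yields http://myhost.com/media/File%20Name%20that%20has%20spaces%20inside.mp3. It does not help when earlier path segments contain illegal characters, or when the URL carries a query string after the last slash.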

Solution 3

// Resolve the demo file relative to the location of the Test class.
URL url = Test.class.getResource(args[0]);
File input = null;
try {
    // Going through toURI() lets File decode escapes such as %20 in the path.
    input = new File(url.toURI());
} catch (URISyntaxException e1) {
    e1.printStackTrace();
}

Solution 4

To handle spaces, @, and other unsafe characters in arbitrary locations in the URL path, use Uri.Builder in combination with a local instance of URL:

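public Uri getUriFromUrl(String thisUrl) throws MalformedURLException {
    URL url = new URL(thisUrl);
    // path() percent-encodes the path but leaves '/' separators intact;
    // appendPath() would treat the whole path as one segment and escape them as %2F.
    return new Uri.Builder()
            .scheme(url.getProtocol())
            .authority(url.getAuthority())
            .path(url.getPath())
            .build();
}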
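
The same split-then-rebuild idea can also be sketched without android.net.Uri, using the multi-argument java.net.URI constructor, which quotes illegal characters in each component. The class UrlToUri and the method toUri are hypothetical names. This assumes the input is not already percent-encoded, since the constructor would escape an existing % again as %25. Note also that the URL(String) constructor is deprecated in recent JDK releases.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class UrlToUri {
    // Hypothetical helper: lets URL split the string, then lets URI quote each part.
    static URI toUri(String raw) {
        try {
            URL url = new URL(raw); // lenient parser, accepts spaces in the path
            return new URI(url.getProtocol(), url.getUserInfo(), url.getHost(),
                    url.getPort(), url.getPath(), url.getQuery(), url.getRef());
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("Cannot convert to URI: " + raw, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toUri(
                "http://myhost.com/media/File Name that has spaces inside.mp3"));
    }
}
```

Because the path is passed as its own component, slashes are kept as separators and spaces anywhere in the path are written as %20, not only those after the last slash.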

Author: whlk


Updated on July 08, 2022

Comments

  • whlk
    whlk almost 2 years

    I have a string representing a URL containing spaces and want to convert it to a URI object. If I simply try to create it via

    String myString = "http://myhost.com/media/File Name that has spaces inside.mp3";
    URI myUri = new URI(myString);
    

    it gives me

    java.net.URISyntaxException: Illegal character in path at index X
    

    where index X is the position of the first space in the URL string.

    How can I parse myString into a URI object?

  • BalusC
    BalusC about 14 years
    It will also URL-encode the colon and the slashes, which would still leave the URL invalid. He basically only needs to URL-encode the spaces to make it valid.
  • whlk
    whlk about 14 years
    Ok, this gets me past the URISyntaxException, but now I get a 404 from the server. The URL I get is http://myhost.com/media/mp3s/9/Agenda+of+swine+-+13.+Persecution+Ascension_+leave+nothing+standing.mp3. I use the URI in an org.apache.http.client.methods.HttpGet.HttpGet Request. Any ideas?
  • Bozho
    Bozho about 14 years
    @Mannaz now that's another thing - you have to show the servlet code - or better, ask another question. The problem is no longer on the client.
  • whlk
    whlk about 14 years
    @Bozho Sure it is a client/encoding problem, because requesting the original URL (myString) in a normal browser does not result in a 404 error.
  • Bozho
    Bozho about 14 years
    @Mannaz and does the resultant (encoded) string result in 404 in a browser?
  • Bozho
    Bozho about 14 years
    @Mannaz - just be careful when another "invalid" symbol appears in a song name.
  • praveenb
    praveenb about 12 years
    @BalusC I tried URLEncoder.encode("query string", "UTF-8"); it returns the + symbol, as in "query+string", where I'm expecting "%20". So I used string.replace with hardcoded values, which solved the issue. Thanks for the info. Is there any other way to encode instead of a manual replace?
  • Sniper
    Sniper over 10 years
    I am using java.net.URLEncoder.encode("aa bb cc", "utf-8"); but instead of replacing each space with %20 it uses +, giving "aa+bb+cc". Why is this happening?
  • yuralife
    yuralife over 10 years
    @Sniper, I've got the same problem ('+' instead of '%20').
  • MetaFight
    MetaFight over 9 years
    Because this isn't answering the question.
  • siddmuk2005
    siddmuk2005 over 9 years
    I posted this to handle the space in the URL, and it solved my problem: when reading the file location, FileInputStream pointed to null and reading from it threw an exception, but by using the URI I didn't get that problem.
  • Hanry
    Hanry about 8 years
    Found any solution for the plus sign?