Do I really need to remove special characters in a URL?
Solution 1
Modern browsers automatically encode the special characters in a URL before requesting it, so you are already using encoded characters; you just don't know it.
I used http://www.url-encode-decode.com/ to encode the file portion of your URL (using UTF-8):
http://www.mydomain.com/downloads/Some+Band+-+En+fran%C3%A7ais+avec+des+caract%C3%A8res+sp%C3%A9ciaux+%282013%29+%5B7%27%27+EP%5D.zip
That should be close to what browsers send when you link without the encoding, except that browsers percent-encode spaces in the path as %20 rather than +. For compatibility with older browsers you should URL encode all your links.
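If you would rather encode the links yourself than rely on an online tool, a minimal sketch in Python could look like the following. It assumes Python 3 is available on the machine you post from and that only the file name portion needs encoding; the variable names are purely illustrative.

```python
from urllib.parse import quote

# Base URL and file name taken from the question.
BASE = "http://www.mydomain.com/downloads/"
filename = "Some Band - En français avec des caractères spéciaux (2013) [7'' EP].zip"

# quote() percent-encodes the UTF-8 bytes of every character except
# letters, digits and "_.-~". Passing safe="" also encodes "/", so the
# file name is guaranteed to stay a single path segment.
encoded = quote(filename, safe="")
url = BASE + encoded
print(url)
```

Spaces come out as %20 here, which is the form expected in the path part of a URL.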
Solution 2
Yes, for uniform compatibility with different browsers and internet accessible applications, you would need to encode all of the following in a URL:
- Spaces
- ASCII Control characters
- Non-ASCII characters
- Reserved characters
- Unsafe characters
For more information as to what these are, see this: What characters need to be encoded and why?
Since it seems that you know what the URLs are, you can try to use online URL encoders like the one in the link above, or in the following link, which also provides information about URL encoding: Url Encode/Decode online
Then test the URLs in as many browsers as possible to confirm they are working before sharing them. You can download several different browsers (e.g., Chrome, Firefox, and Opera) and install them on the same computer for testing.
As you become more familiar with which characters need to be encoded, you can replace or remove them in the names of your files prior to uploading.
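Renaming the files before uploading can also be scripted. The sketch below is one possible approach, not the only one: it assumes you are happy to lose the accents and to replace every other unsafe character with an underscore. The function name ascii_safe_name is hypothetical.

```python
import re
import unicodedata


def ascii_safe_name(name: str) -> str:
    """Return a file name made only of letters, digits, '.', '_' and '-'."""
    # Decompose accented letters (é becomes e + combining accent), then
    # drop every non-ASCII code point, which removes the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace each run of remaining unsafe characters with one underscore.
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", ascii_only)
    return cleaned.strip("_")


original = "Some Band - En français avec des caractères spéciaux (2013) [7'' EP].zip"
print(ascii_safe_name(original))
```

A name produced this way needs no encoding at all, so the link works the same whether or not the browser or forum encodes it.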
Solution 3
...share the downloads in a music forum
This is really just adding to the existing answers... the URL needs to be encoded at some point, either implicitly by the browser (or forum software) or explicitly by you.
You specifically mention you are sharing these links in a forum. Many forums automatically encode links in forum posts, so you might not have to explicitly encode this yourself - but this will depend on the forum.
Stack Exchange (markdown) encodes links to a certain extent, but will fail on the unencoded spaces (as will a lot of forum software) if you simply type the unencoded URL into the post and allow the forum to auto-detect the URL. However, if the forum has a specific prompt for embedding links then it might cope with this, as it does when using the toolbar option on Stack Exchange:
NOTE TO EDITORS: Please don't "correct" the (broken) links below, or surround in <pre> tags - the links are meant to be broken or viewed as-is; it is serving as an example!
Link typed manually
(As you can see, it is broken at the first space)
[link typed manually unencoded](http://www.example.com/downloads/Some Band - En français avec des caractères spéciaux (2013) [7'' EP].zip)
Link entered using the hyperlink option on the toolbar
link is correctly encoded by the forum software
The above link is encoded as:
<a href="http://www.example.com/downloads/Some%20Band%20-%20En%20fran%C3%A7ais%20avec%20des%20caract%C3%A8res%20sp%C3%A9ciaux%20%282013%29%20%5B7%27%27%20EP%5D.zip" rel="nofollow">link is correctly encoded by the <em>forum</em> software</a>
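To check that an encoded link like the one above still points at the intended file, you can decode it again. This is a small sketch using Python's standard library; the variable names are illustrative only.

```python
from urllib.parse import unquote, urlsplit

# The href produced by the forum software in the example above.
href = (
    "http://www.example.com/downloads/"
    "Some%20Band%20-%20En%20fran%C3%A7ais%20avec%20des%20caract%C3%A8res"
    "%20sp%C3%A9ciaux%20%282013%29%20%5B7%27%27%20EP%5D.zip"
)

# Take the last path segment and reverse the percent-encoding (UTF-8).
path = urlsplit(href).path
file_name = unquote(path.rsplit("/", 1)[-1])
print(file_name)
```

If the printed name matches the file name on the server, the forum encoded the link correctly.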
djointster
Updated on September 18, 2022
Comments
-
djointster over 1 year
I have an FTP account shared with friends where we upload underground music albums, and then we use the links to share the downloads in a music forum. The problem is that the album names are in French, so there are a lot of special characters in the names.
So the URL looks like
http://www.mydomain.com/downloads/Some Band - En français avec des caractères spéciaux (2013) [7'' EP].zip
For me it works perfectly and I can download the file by using this URL, but I have read everywhere that special characters are bad in URLs.
Is there any reason why I must remove the special characters or encode the URL? Is everyone able to access a URL with special characters or will some older browsers not be able to download the files?
I really don't care about SEO or anything else. I just want the download links to work for everyone.
Since the files are uploaded through FTP, I can't use PHP to remove the special characters with a regex, so I really don't know what to do.
-
unor almost 11 years
Asked the same question there too: stackoverflow.com/q/17119689/1591669
-
dan almost 11 years
It's considered a duplicate question. I'd suggest deleting it there because you already have answers here, and this site is more appropriate for non-programming questions like this.
-
cl-r almost 11 years
Java has URLEncoder and URLDecoder classes to do this.
-
djointster almost 11 years
Why does Wikipedia use accents like éèà and parentheses in its URLs if they must be encoded?
-
djointster almost 11 years
Also, is there a place where I can find which browsers do not support non-encoded URLs?
-
dan almost 11 years
Accents and parentheses are included under HTML ASCII Characters: nationalfinder.com/html/char-asc.htm
I'm not sure if there's a list like that or not. However, depending on your environment, you might want to consider that devices or apps with limited browsers (e.g., a stripped down WebKit or HTML viewer) might also not support characters that need to be encoded.
-
MrWhite over 8 years
That tool does not seem to encode the URL properly? (The output from that tool is consistent with just passing the source string through PHP's urlencode() function - which is not correct.) Spaces in the path part of the URL should be percent-encoded as %20, not a + (plus), as shown above. The + should only be used to encode spaces in the query string part. A + in the path part of the URL is a literal +, so is likely to result in a 404.
-
MrWhite over 8 years
The second tool linked to above (url-encode-decode.com) is not intended to be used to encode an entire URL, it simply URL encodes the submitted text (it does not parse the URL in any way). That tool is only suitable for encoding submitted form data (the query string).
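The difference between the two encodings discussed in these comments can be seen with Python's standard library, used here only as an illustration. quote() produces the path-style form, and quote_plus() the form-data style that PHP's urlencode() resembles. The sample file name is made up.

```python
from urllib.parse import quote, quote_plus, unquote, urlencode

name = "Some Band.zip"

# Path-style encoding: a space becomes %20.
path_style = quote(name, safe="")

# Form-style encoding (for query strings): a space becomes "+".
form_style = quote_plus(name)

# Decoding as a path leaves "+" as a literal plus sign, which is why a
# form-encoded name in the path refers to a different file.
decoded_as_path = unquote(form_style)

# In a query string the "+" form is the expected one.
query = urlencode({"file": name})

print(path_style, form_style, decoded_as_path, query)
```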