Should URL be case sensitive?

177,715

Solution 1

According to W3's "HTML and URLs" they should:

There may be URLs, or parts of URLs, where case doesn't matter, but identifying these may not be easy. Users should always consider that URLs are case-sensitive.

Solution 2

Domain names are case insensitive according to RFC 4343. The rest of the URL is sent to the server in the HTTP request (for example, with the GET method). Whether that part is case sensitive is up to the server.

Take this page for example: stackoverflow.com receives the GET string /questions/7996919/should-url-be-case-sensitive and sends an HTML document to your browser. Stackoverflow.com is case insensitive here because it produces the same result for /QUEStions/7996919/Should-url-be-case-sensitive.
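As a minimal sketch of why both spellings can resolve to the same page: commenters below point out that only the numeric ID identifies the question. The route pattern and the function name `question_id` are hypothetical and only illustrate the idea; this is not Stack Overflow's actual routing code.

```python
import re
from typing import Optional

# Hypothetical route: the numeric ID identifies the question, and the
# trailing slug is optional and ignored. IGNORECASE makes the literal
# "/questions/" segment match regardless of case.
QUESTION_ROUTE = re.compile(r"^/questions/(\d+)(?:/[^/?#]*)?$", re.IGNORECASE)


def question_id(path: str) -> Optional[int]:
    """Return the question ID found in the path, or None if it does not match."""
    match = QUESTION_ROUTE.match(path)
    return int(match.group(1)) if match else None


# Both spellings from the answer map to the same ID.
print(question_id("/questions/7996919/should-url-be-case-sensitive"))
print(question_id("/QUEStions/7996919/Should-url-be-case-sensitive"))
```

With a route like this, the server decides that case does not matter; nothing in the URL syntax itself guarantees it.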

On the other hand, Wikipedia is case sensitive except for the first character of the title. The URLs https://en.wikipedia.org/wiki/Case_sensitivity and https://en.wikipedia.org/wiki/case_sensitivity lead to the same article, but https://en.wikipedia.org/wiki/CASE_SENSITIVITY returns 404.
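The Wikipedia behavior described above can be sketched as a title normalization that only touches the first character. The function name `canonical_title` is an assumption for illustration; this is not MediaWiki's actual code.

```python
def canonical_title(title: str) -> str:
    """Uppercase only the first character; leave the rest of the title untouched."""
    return title[:1].upper() + title[1:]


# The first two spellings collapse to the same title, the third does not.
print(canonical_title("Case_sensitivity") == canonical_title("case_sensitivity"))
print(canonical_title("CASE_SENSITIVITY") == canonical_title("Case_sensitivity"))
```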

Solution 3

It depends on the hosting OS. Sites that are hosted on Windows tend to be case insensitive, as the underlying file system is case insensitive. Sites hosted on Unix-type systems tend to be case sensitive, as their underlying file systems are typically case sensitive. The host name part of the URL is always case insensitive; it's the rest of the path that varies.

Solution 4

The domain name portion of a URL is not case sensitive since DNS ignores case: http://en.example.org/ and HTTP://EN.EXAMPLE.ORG/ both open the same page.

The path is used to specify and perhaps find the resource requested. It is case-sensitive, though it may be treated as case-insensitive by some servers, especially those based on Microsoft Windows.

If the server is case sensitive and http://en.example.org/wiki/URL is correct, then http://en.example.org/WIKI/URL or http://en.example.org/wiki/url will display an HTTP 404 error page, unless these URLs point to valid resources themselves.
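A small sketch of that rule, using Python's standard `urllib.parse`: lowercase the scheme and host, and leave the path, query, and fragment alone. The helper name `normalize_case` is an assumption made for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_case(url: str) -> str:
    """Lowercase the scheme and host only; keep path, query, and fragment as-is."""
    parts = urlsplit(url)
    # Any userinfo before "@" may be case sensitive, so only the
    # host[:port] portion of the network location is lowercased.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )


# Same resource: only the scheme and host differ in case.
print(normalize_case("HTTP://EN.EXAMPLE.ORG/") == normalize_case("http://en.example.org/"))
# Potentially different resources: the paths differ in case.
print(normalize_case("http://en.example.org/wiki/URL") == normalize_case("http://en.example.org/WIKI/URL"))
```

Two URLs that are equal after this normalization refer to the same resource; two that still differ in the path may or may not, depending on the server.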

Solution 5

I am not a fan of bumping old articles but because this was one of the first responses for this particular issue I felt a need to clarify something.

As @Bhavin Shah's answer states, the domain part of the URL is case insensitive, so

http://google.com 

and

http://GOOGLE.COM 

and

http://GoOgLe.CoM 

are all the same but everything after the domain name part is considered case sensitive.

so...

http://GOOGLE.COM/ABOUT

and

http://GOOGLE.COM/about

are different.

Note: I am speaking "technically" here. In a lot of cases, most actually, servers are set up to handle these URLs the same way, but it is possible to set them up so they are NOT handled the same.

Different servers handle this differently, and in some cases they have to be case sensitive. In many cases query string values are encoded (such as session IDs or Base64-encoded data that's passed as a query string value). These items are case sensitive by their nature, so the server has to be case sensitive in handling them.
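To illustrate the Base64 point, here is a short sketch showing that lowercasing an encoded value changes what it decodes to. The payload `b"session-42"` and the helper name `decode_token` are made up for the example.

```python
import base64


def decode_token(token: str) -> bytes:
    """Decode a URL-safe Base64 token back into its raw bytes."""
    return base64.urlsafe_b64decode(token)


# Hypothetical payload standing in for a session ID.
original = b"session-42"
token = base64.urlsafe_b64encode(original).decode("ascii")

# The exact token round-trips; a lowercased copy decodes to different bytes.
print(decode_token(token) == original)
print(decode_token(token.lower()) == original)
```

A server that folded the case of such a value before decoding it would silently corrupt the data.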

So to answer the question, "should" servers be case sensitive in grabbing this data, the answer is "yes, most definitely."

Of course not everything needs to be case sensitive but the server should be aware of what that is and how to handle those cases.


@Hart Simha's comment basically says the same thing. I missed it before I posted so I want to give credit where credit is due.

Author by

Imageree

Updated on October 17, 2021

Comments

  • Imageree
    Imageree over 2 years

    I noticed that

    HTTP://STACKOVERFLOW.COM/QUESTIONS/ASK
    

    and

    http://stackoverflow.com/questions/ask
    

    both work fine; in fact, the first one is converted to lowercase.

    I think that this makes sense for the user.

    If I look at Google then this URL works fine:

    http://www.google.com/intl/en/about/corporate/index.html  
    

    but this one with "ABOUT" is not working:

    http://www.google.com/intl/en/ABOUT/corporate/index.html   
    

    Should the URL be case sensitive?

  • jldupont
    jldupont over 12 years
    I guess "be liberal in what you accept and conservative in what you send" (IETF speak) would be my guideline.
  • oᴉɹǝɥɔ
    oᴉɹǝɥɔ about 11 years
    The W3 guideline is reasonable. It simply states that one shouldn't make an assumption about how the server handles the URL you are submitting. It is up to the server how to handle the request URL. Most web servers run Unix/Linux, and that means most web servers are case sensitive.
  • Hart Simha
    Hart Simha almost 11 years
    There is one advantage to URLs being case sensitive. In some websites, where objects are encoded with unique IDs that can be referred to through the URL, the encoding can be something like base64 instead of base36. This allows you to encode exponentially more unique objects in the same number of URL characters. For example, foo.com/000 - foo.com/zzz (case insensitive) could refer to 36^3 unique objects, whereas foo.com/000 - foo.com/ZZZ (case sensitive, meaning foo.com/zzz and foo.com/ZZZ are different paths) would refer to 62^3 objects.
  • trysis
    trysis over 10 years
    W3 says USERS should assume that servers are case-sensitive, but does not give a recommendation for SERVERS.
  • trysis
    trysis over 10 years
    Wikipedia is actually very forgiving about case sensitivity in cases where users may think a word should be one case or another, but this is more because of the OCD... sorry, considerate nature of its editors. Its URLs are technically case-sensitive, though.
  • monokrome
    monokrome almost 9 years
    This doesn't answer the question
  • user3367701
    user3367701 over 8 years
    That's because the semantic, readable part of a question's URL in stackoverflow does not identify it, it's identified by 7996919. The semantic part of the URL is just there for SEO purposes.
  • Daniel W.
    Daniel W. about 8 years
    This answer has the only correct wording "it is case-sensitive, though it may be treated as case-insensitive". Only valid answer.
  • realPK
    realPK almost 8 years
    The question is: "Should URL be case sensitive?" Your answer is: "How to make case insensitive URLs"
  • realPK
    realPK almost 8 years
    For resiliency, programs interpreting URLs should treat upper case letters as equivalent to lower case in scheme names (e.g., allow "HTTP" as well as "http"). Source
  • PJP
    PJP almost 8 years
    This isn't an answer, it's an opinionated comment.
  • HenriKoppen
    HenriKoppen almost 8 years
    I back it up with an example. URLs are used by people (see the original question), not computers. It's very hard to see WHY a link isn't working, and since almost ALL domains are case insensitive, so should the rest of the URL be. The downvotes are for my tone of voice (which is bad), or because technical people tend to choose technical beauty over user experience.
  • Laurie Stearn
    Laurie Stearn over 7 years
    Yes, as this one painfully found out on http requests to files on a Unix ftp server.
  • dthrasher
    dthrasher over 7 years
    @PK_ Note that this only holds for the scheme portion of the URL. RFC1738 does not discuss whether other parts of the URL should be interpreted as case sensitive or not.
  • TygerKrash
    TygerKrash over 7 years
    This is an interesting one. Regular ASCII characters (which have an upper and lower case) are not actually converted though, right? It's only spaces and extended characters that are escaped in the URL. Do any extended chars have an upper/lower case modifier?
  • Johnny
    Johnny over 7 years
    @PK_J This part is relevant only for the scheme part of the url (HTTP->http, FTP->ftp)
  • rspring1975
    rspring1975 almost 7 years
    I think this, and many of the answers about what the spec does or does not say, is missing the point of the question. **Should** they be case sensitive? That's a loaded question really. From a user's point of view, case sensitivity is a pain point; not all users know it makes a difference. Whether URIs should or shouldn't be case sensitive depends on the context of the question. For technical flexibility, yes, they should be. For usability, no, they should not be.
  • Bozzy
    Bozzy almost 7 years
    Actually, https://stackoverflow.com/questions/7996919/should-BLABLA-be-or-NOT-to-be also works. This is because stackoverflow.com's server only uses the question's ID to identify it and return the correct URL and HTML page.
  • Toby Speight
    Toby Speight almost 7 years
    It may be worth noting that schemes such as https, ftp, irc and mailto all contain DNS names, which we know are case-insensitive
  • chharvey
    chharvey over 6 years
    To be fair, any question asking "SHOULD" is inherently opinion-based and could be removed from StackOverflow. (More: stackoverflow.blog/2010/09/29/good-subjective-bad-subjective)
  • chharvey
    chharvey over 6 years
    @theTinMan It's an answer to the opinion-evoking question.
  • Valentin Waeselynck
    Valentin Waeselynck over 6 years
    It would be more accurate to say 'depends on the server' in the general sense - because serving files is not the only way to answer HTTP requests.
  • jpmc26
    jpmc26 over 5 years
    Are query strings treated as part of the location? I believe they are treated as separate entities and not used for location resolution.
  • Bob
    Bob over 5 years
    Query strings are separate from location, yes. But the same principles that I've shown there with query parameters can also apply to other parts of the URL. Some CMSes, for example, might purposefully rewrite "/user.php?id=3756" to "/users/PaulMcCartney" for better SEO-friendly human-readable URLs (Wordpress does this, for example). The point is that the standards deliberately back off from prescription over that which is context-dependent. It's left to the server to decide, as the server understands the context, where a universal standard can't.
  • garnet
    garnet over 5 years
    @DanFromGermany, that the path is case-sensitive can be deduced only vaguely from here: "URLs in general are case-sensitive (with the exception of machine names). There may be URLs, or parts of URLs, where case doesn't matter, but identifying these may not be easy." But that deduction is ambiguous. As mentioned in a comment above, RFC1738 does not discuss whether parts of the URL other than the scheme should be interpreted as case sensitive or not. Do you have any link which clarifies which parts of a URL are case-sensitive?
  • Daniel W.
    Daniel W. over 5 years
    @garnet From RFC3986 6.2.2.1. Case Normalization: "When a URI uses components of the generic syntax, the component syntax equivalence rules always apply; namely, that the scheme and host are case-insensitive and therefore should be normalized to lowercase. For example, the URI HTTP://www.EXAMPLE.com/ is equivalent to http://www.example.com/. The other generic syntax components are assumed to be case-sensitive unless specifically defined otherwise by the scheme."
  • Daniel W.
    Daniel W. over 5 years
    @garnet And from the HTTP RFC: "When comparing two URIs to decide if they match or not, a client SHOULD use a case-sensitive octet-by-octet comparison of the entire URIs [...]" (with exception of scheme and host itself).
  • jaybro
    jaybro over 5 years
    I agree with @HartSimha and since the question asks for opinion: Unless part of the URL route is being used to identify a unique object, please for the love of all that is good on the internet, DO NOT make it case sensitive.
  • Jeremy Caney
    Jeremy Caney about 3 years
    @chharvey is correct; I'd recommend flagging this question as opinion based.
  • risingballs
    risingballs almost 3 years
    This is correct, but since it's not possible to distinguish these two cases, the part of the URL that is sent to the server (the path, including parameters, up to the #anchor, which is not sent to the server) should always be considered case-sensitive.
  • Aaron Franke
    Aaron Franke almost 2 years
    What characters are valid in schemes? Only letters? Only alphanumeric characters?
  • Nikolai Novik
    Nikolai Novik almost 2 years
    Section 2.2 of the RFC 3986 describes reserved characters. Section 2.3 lists the ranges of characters which can be passed without "percent-encoding".
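
Hart Simha's comment above about ID space can be checked with a few lines of arithmetic. This is only a sketch; the alphabet constants and the `id_space` helper are names chosen for the example.

```python
import string

# Case-insensitive IDs can only use digits plus one case of letters (36 symbols);
# case-sensitive IDs can use digits plus both cases (62 symbols).
BASE36 = string.digits + string.ascii_lowercase
BASE62 = BASE36 + string.ascii_uppercase


def id_space(alphabet: str, length: int) -> int:
    """Number of distinct IDs of the given length over the given alphabet."""
    return len(alphabet) ** length


print(id_space(BASE36, 3))  # 36^3 = 46656
print(id_space(BASE62, 3))  # 62^3 = 238328
```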