Parameter separator in URLs, the case of misused question mark

Solution 1

From the URI spec's (RFC 3986) point of view, the only separator here is "?". The format of the query is opaque; the ampersands are just something that HTML happens to use for form submissions.
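To illustrate, here is a minimal sketch using Python's standard `urllib.parse`. The URL is made up for illustration (it borrows the `terms&conditions.html` file name from the question). It only shows that a generic URL splitter cuts at the first "?" and hands back the query as one uninterpreted string:

```python
from urllib.parse import urlsplit

# Hypothetical URL: an '&' in the path and '&' inside the query.
url = "http://www.blah.com/terms&conditions.html?param1=foo&param2=bar"

parts = urlsplit(url)

# The '&' before the first '?' stays in the path.
print(parts.path)

# Everything after the first '?' is returned as a single string;
# urlsplit does not break it into key=value pairs.
print(parts.query)
```

Turning that string into parameters is a separate step, done by whatever convention the application picks.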

Solution 2

The answer is pretty much in this article: http://www.skorks.com/2010/05/what-every-developer-should-know-about-urls/ . The relevant passage:

Query is the preferred way to send some parameters to a resource on the server. These are key=value pairs and are separated from the rest of the URL by a ? (question mark) character and are normally separated from each other by & (ampersand) characters. What you may not know is the fact that it is legal to separate them from each other by the ; (semi-colon) character as well. The following URLs are equivalent:

http://www.blah.com/some/crazy/path.html?param1=foo&param2=bar

http://www.blah.com/some/crazy/path.html?param1=foo;param2=bar
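Whether the two forms really come out the same depends on the parser. As a rough sketch with Python's `urllib.parse.parse_qsl` (this assumes a Python version recent enough to have the `separator` argument, where the default separator is "&" only):

```python
from urllib.parse import parse_qsl

# Ampersand-separated query, parsed with the default separator.
amp_pairs = parse_qsl("param1=foo&param2=bar")

# Semicolon-separated query; the separator has to be requested explicitly.
semi_pairs = parse_qsl("param1=foo;param2=bar", separator=";")

print(amp_pairs)
print(semi_pairs)
```

With this parser the two queries yield the same pairs only when it is told to split on ";", so the equivalence is a convention of the parser, not something the URI syntax guarantees.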

Solution 3

RFC 3986 (https://www.ietf.org/rfc/rfc3986.txt) defines general delimiters (gen-delims) and sub-delimiters (sub-delims): '?' is a general delimiter, while '&' and ';' are sub-delimiters. The spec is pretty clear about that.

In this case the second and any later '?' characters would be treated as part of the query. A query parser that follows the spec strictly would pass the whole query on to the destination app. Whether the app then processes the query string in a way that treats '?' as a delimiter between name-value pairs is up to the app's designers.
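A small sketch of that behaviour, using Python's `urllib.parse` as one example of a parser (the URL is a made-up variant of the one in Solution 2, with '?' used as the second separator):

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical URL that uses '?' where '&' would normally go.
url = "http://www.blah.com/some/crazy/path.html?param1=foo?param2=bar"

# Only the first '?' ends the path; the second one is part of the query.
query = urlsplit(url).query
print(query)

# The default query parser splits on '&' only, so it sees one parameter
# whose value contains the rest of the string.
pairs = parse_qsl(query)
print(pairs)
```

So the URL itself is valid, but an application using this parser receives a single parameter unless it does its own extra splitting on '?'.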

My guess is that this often 'just works' because the code that splits the URI, and then the query string, matches against all delimiters: 1) the URI is first split on '?' to isolate the query, then 2) the query string is parsed using a character match list that includes '?' (for convenience only). This could already be happening in ubiquitous parsing libraries.
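The kind of lenient parser guessed at above could look roughly like this. It is purely a hypothetical sketch (the function name `lenient_pairs` and the delimiter set are assumptions, not taken from any real library):

```python
import re
from urllib.parse import urlsplit, unquote_plus


def lenient_pairs(url):
    """Hypothetical lenient parser: treats '?', '&' and ';' alike inside the query."""
    # Step 1: the first '?' separates the query from the rest of the URL.
    query = urlsplit(url).query

    pairs = []
    # Step 2: split the query on a character list that also includes '?'.
    for chunk in re.split(r"[?&;]", query):
        if not chunk:
            continue  # skip empty pieces such as those from "a=1&&b=2"
        name, _, value = chunk.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


result = lenient_pairs("http://www.blah.com/some/crazy/path.html?param1=foo?param2=bar")
print(result)
```

A parser written this way would accept '?' as a parameter separator, at the cost of no longer being able to carry a literal '?' or ';' in a value unless it is percent-encoded.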

Author: el_shayan

Updated on August 07, 2020

Comments

  • el_shayan
    el_shayan almost 4 years

    What I don't really understand is the benefit of using '?' instead of '&' in urls:

    [Image: question mark vs ampersand]

    It makes nobody's life easier if we use a different character as the first separator character. Can you come up with a reasonable explanation?

    EDIT: after more research I found that "&" can be part of a file name (terms&conditions.html), so "?" is a good first separator. But I still think using "?" for all the separators would make life easier (from the point of view of URL generators and parsers):

    [Image: question mark as separator]

    Is there any advantage in using "&" which is not clear at the first glance?

  • el_shayan
    el_shayan about 12 years
    So they have developed in different times? HTML was there before URI?
  • Julian Reschke
    Julian Reschke about 12 years
    Yes, they developed separately, but no, HTML wasn't there before URIs (previously called URLs).
  • el_shayan
    el_shayan about 12 years
    So the answer can be something like "HTML and URIs developed separately and HTML developers haven't noticed the benefit of using ? as parameter separator instead of &" except I would say "haven't noticed" is not accurate. They might have a good reason.
  • David Tonhofer
    David Tonhofer almost 5 years
    "just are something that HTML happens to use for form submissions" rather, it's HTTP (the data request/data transfer protocol) that uses them for that: tools.ietf.org/html/rfc2616#section-3.2.2
  • Julian Reschke
    Julian Reschke almost 5 years
    @DavidTonhofer - no; the HTTP spec doesn't define the structure of the query part at all.
  • David Tonhofer
    David Tonhofer almost 5 years
    @JulianReschke Just follow the link Julian. "RFC 2616 Hypertext Transfer Protocol -- HTTP/1.1": http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]].
  • David Tonhofer
    David Tonhofer almost 5 years
    More precisely: You have HTML (the markup language, defined by whatwg.org nowadays), the HTTP protocol and message format (which defines URL layout) and the definition of the URI in RFC3986 of which the URL is a special case. All on top of TCP or whatever. The HTTP spec says nothing about syntax or semantics of what's in the "query" part (or in a form submission for that matter). Use XML, JSON, EDN, encoded ASN.1 ... you are free!
  • Julian Reschke
    Julian Reschke almost 5 years
    @DavidTonhofer - your last comment is correct, the one before is not. HTTP does not define the structure of the query part (and furthermore, RFC 2616 has been obsoleted long ago by RFCs 723*).
  • hultqvist
    hultqvist over 3 years
    That article only explains WHAT not WHY.