URL route matching with Regex

Solution

First, concentrate on how the developer will declare routes: what must she type to specify a dynamic parameter? Once that syntax is settled, writing the code that matches the dynamic parameters becomes easier.

Example

In Java, I recently worked with Jersey. Here is how one can define a URL route:

/api/{id:[\dA-F]+}.{type:(?:xml|json|csv)}

Some URLs this route is expected to match:

/api/EF123.csv
/api/ABC.json
/api/1234567890.xml

The matcher would parse the route provided by the developer to find the dynamic parameters, using a regex like this:

{([^:]+)\s*:\s*(.+?)(?<!\\)}

Check the demo: http://regex101.com/r/iH1gY3
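
As a minimal sketch of that parsing step, the Python snippet below applies the pattern to the example route and collects each parameter name with its regex. Python is used only for illustration; the names `PARAM`, `template` and `params` are assumptions made for this example and are not part of Jersey. The braces are escaped here so the pattern is read the same way by different regex engines.

```python
import re

# The pattern from above, with the literal braces escaped.
# Group 1 is the parameter name, group 2 is the regex attached to it.
PARAM = re.compile(r"\{([^:]+)\s*:\s*(.+?)(?<!\\)\}")

# The example route as the developer would type it.
template = r"/api/{id:[\dA-F]+}.{type:(?:xml|json|csv)}"

# Collect (name, regex) pairs in the order they appear in the route.
params = [(m.group(1), m.group(2)) for m in PARAM.finditer(template)]

for name, pattern in params:
    print(name, "->", pattern)
```

Because the second group is lazy, a parameter regex that itself contains an unescaped `}` (for example a `{2,3}` quantifier) would be cut short at the first `}`; the negative lookbehind only protects a `}` that is escaped with a backslash.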

Once done, the matcher can build the regex below on the fly for matching the route:

/api/[\dA-F]+\.(?:xml|json|csv)
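
One possible way to build that regex on the fly is sketched below in Python, under a few assumptions: the function name `compile_route` is hypothetical, the literal parts of the route are escaped with `re.escape`, and each parameter is wrapped in a named group so its value can be read back after a match. The named groups are an addition for illustration; the regex shown above has no capture groups.

```python
import re

# Same parameter pattern as in the answer, with the braces escaped.
PARAM = re.compile(r"\{([^:]+)\s*:\s*(.+?)(?<!\\)\}")

def compile_route(template):
    """Turn a route such as /api/{id:[\\dA-F]+} into a compiled regex."""
    parts = []
    pos = 0
    for m in PARAM.finditer(template):
        # Literal text between parameters is matched verbatim.
        parts.append(re.escape(template[pos:m.start()]))
        name, pattern = m.group(1).strip(), m.group(2)
        # Wrap the developer's regex in a named group (illustrative choice).
        parts.append(f"(?P<{name}>{pattern})")
        pos = m.end()
    # Trailing literal text after the last parameter.
    parts.append(re.escape(template[pos:]))
    return re.compile("".join(parts))

route = compile_route(r"/api/{id:[\dA-F]+}.{type:(?:xml|json|csv)}")

for url in ["/api/EF123.csv", "/api/ABC.json", "/api/1234567890.xml"]:
    print(url, route.fullmatch(url).groupdict())
```

Using `fullmatch` rather than `search` keeps a longer path such as `/api/EF123.csv/extra` from being accepted by accident.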
Author by

Mark

Updated on June 04, 2022

Comments

  • Mark
    Mark almost 2 years

    I'm trying to build my own URL route matching engine, trying to match routes using regular expressions.

    For example, consider a server application that lets the developer register custom parameterized routes and then executes a function when a route is invoked by an HTTP request. The developer could create the following routes:

    • /users/:id/doSomething
    • /hello/world
    • /:format/convert

    And each one of them would be associated with a different request handler/function.

    Now, on an incoming request, the server should be able to match the requested path to the proper handler. For example, if a client application requests http://myservice.com/users/john/doSomething, the server should be able to tell that the requested URL belongs to the /users/:id/doSomething route definition, and then execute the associated handler.

    Personally, the way I would build the route matcher is to take the requested URL, loop over the route definitions and, if a definition matches the requested URL, execute its handler. The tricky part is matching the dynamic parameters.
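
The loop described above could look roughly like the following Python sketch. Everything named here (`ROUTES`, `dispatch`, and the three handler functions) is hypothetical and exists only for illustration; the regexes are written by hand for the three example routes, and the first matching route wins.

```python
import re
from urllib.parse import urlsplit

# Hypothetical handlers, one per route definition.
def do_something(id):
    return f"doSomething for {id}"

def hello_world():
    return "hello world"

def convert(format):
    return f"convert to {format}"

# Each route definition pairs a compiled regex with its handler.
ROUTES = [
    (re.compile(r"/users/(?P<id>[^/]+)/doSomething"), do_something),
    (re.compile(r"/hello/world"), hello_world),
    (re.compile(r"/(?P<format>[^/]+)/convert"), convert),
]

def dispatch(url):
    """Run the handler of the first route whose regex matches the URL path."""
    path = urlsplit(url).path  # drop scheme, host and query string
    for pattern, handler in ROUTES:
        m = pattern.fullmatch(path)
        if m:
            # Named groups become keyword arguments of the handler.
            return handler(**m.groupdict())
    return None  # no route matched

print(dispatch("http://myservice.com/users/john/doSomething"))
```

Since the list is scanned in order, more specific routes would need to be registered before more general ones.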

    How would you build a regular expression that matches the URL segments?

    EDIT:

    I'm currently using the following regular expression to match segments: ([^/\?])+.

    For example, to check whether a request path belongs to the first route, I would match it against:

    /users/([^/])+/doSomething

    Which is a very permissive regex.
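
A small Python sketch of how the `:name` routes above might be turned into regexes is given below. The helper name `route_to_regex` is hypothetical, and the choice of `[^/?]+` for a segment is carried over from the expression in the question as an assumption, not a recommendation. Note that the `+` is placed inside the group: with `([^/])+` the group is repeated one character at a time, so it ends up holding only the last character of the segment.

```python
import re

def route_to_regex(route):
    """Convert a route such as /users/:id/doSomething into a compiled regex."""
    segments = []
    for segment in route.split("/"):
        if segment.startswith(":"):
            # Dynamic segment: capture the whole segment under its name.
            segments.append(f"(?P<{segment[1:]}>[^/?]+)")
        else:
            # Static segment: match it literally.
            segments.append(re.escape(segment))
    return re.compile("/".join(segments))

users = route_to_regex("/users/:id/doSomething")
print(users.fullmatch("/users/john/doSomething").group("id"))

# The expression from the question, for comparison: the repeated group
# keeps only its final iteration.
old = re.fullmatch(r"/users/([^/])+/doSomething", "/users/john/doSomething")
print(old.group(1))
```

A stricter segment pattern (for example digits only for an id) could be substituted for `[^/?]+` where the route calls for it.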