Apache LocationMatch Regex


The regex is a little too permissive: the .+ in the cat/subcat/subsubcat groups also matches "/", so the scope of each group needs to be constrained. There is also a slight error in the final expression: ("./*") should be ("/.*").

Working LocationMatch:

<LocationMatch ^(?!.*\.html$)/products/(?<cat>([A-Za-z0-9\-\_])+)/(?<subcat>([A-Za-z0-9\-\_])+)/(?<subsubcat>([A-Za-z0-9\-\_])+)((/?)|(/.*))$>
ProxyPass balancer://mycluster/search/
ProxyPassReverse balancer://mycluster/search/
</LocationMatch>
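
As a rough sanity check outside Apache, the two patterns can be compared in Python. This is only a sketch under some assumptions: Python's re module spells named groups (?P<name>...) where Apache's PCRE accepts (?<name>...), the character class is written as [A-Za-z0-9_-] (equivalent to the one above), and the helper name categories is made up for illustration. It says nothing about how ProxyPass itself behaves.

```python
import re

# One path segment: letters, digits, underscore, hyphen (no "/").
SEGMENT = r"[A-Za-z0-9_-]+"

# Corrected pattern from the answer, in Python named-group syntax.
CORRECTED = re.compile(
    r"^(?!.*\.html$)/products/(?P<cat>" + SEGMENT + r")/(?P<subcat>" + SEGMENT
    + r")/(?P<subsubcat>" + SEGMENT + r")((/?)|(/.*))$"
)

# Pattern as originally posted in the question, kept verbatim apart from (?P<...>).
ORIGINAL = re.compile(
    r"^(?!.*\.html$)/products/(?P<cat>.+)/(?P<subcat>.+)/(?P<subsubcat>.+)((/?)|(./*))$"
)


def categories(pattern, path):
    """Return (cat, subcat, subsubcat) if the path matches, else None."""
    m = pattern.match(path)
    if m is None:
        return None
    return (m.group("cat"), m.group("subcat"), m.group("subsubcat"))


for path in (
    "/products/aaa/bbb/ccc/",
    "/products/aaa/bbb/ccc/compare",
    "/products/aaa/bbb/ccc/ddd3456.html",
):
    print(path, categories(CORRECTED, path), categories(ORIGINAL, path))
```

With the constrained groups, a path such as /products/aaa/bbb/ccc/compare yields cat=aaa, subcat=bbb, subsubcat=ccc, whereas the greedy .+ version lets cat swallow "aaa/bbb". Both patterns reject paths ending in .html because of the leading negative lookahead.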
Author: Crollster

Extensive IT industry experience, covering many industry sectors including defense, telecoms, automotive, etc. I have over 20yrs professional experience working on web-based business applications, with the majority of those years working with Java EE as the back-end technology. I'm currently the Lead Enterprise Architect at a large semiconductor company. My work interests lie within internet technologies, and containerization, as well as the realm of Enterprise Architecture and Digital Transformation.

Updated on July 05, 2022

Comments

  • Crollster
    Crollster almost 2 years

    My Problem

    I need Apache HTTP Server (v2.4.10) to proxy requests to Tomcat for dynamic applications whose public paths not only differ from the paths in Tomcat, but are also similar to one another. For example:

    /products/<category>/<sub-category>/<sub-sub-category>/<product-id>.html proxy to: http://mycluster/pf/<product-id>.html

    ...and also...

    /products/<category>/<sub-category>/<sub-sub-category>/<anything-not-ending-in-html> proxy to: http://mycluster/search/<anything-not-ending-in-html>

    My Attempts

    I'm trying to use LocationMatch regexes to handle this, but have not been fully successful. The following LocationMatch regex works on its own (it proxies the *.html request to <tomcat>/pf/*.html):

    <LocationMatch ^/products/(?<cat>.+)/(?<subcat>.+)/(?<subsubcat>.+)/(?<partnum>.+).html>
    ProxyPass balancer://mycluster/pf/%{env:MATCH_PARTNUM}.html
    ProxyPassReverse balancer://mycluster/pf/%{env:MATCH_PARTNUM}.html
    </LocationMatch>
    

    This correctly proxies URLs such as the following example path: /products/aaa/bbb/ccc/ddd3456.html

    However, when I also enable the regex below:

    <LocationMatch ^(?!.*\.html$)/products/(?<cat>.+)/(?<subcat>.+)/(?<subsubcat>.+)((/?)|(./*))$>
    ProxyPass balancer://mycluster/search/
    ProxyPassReverse balancer://mycluster/search/
    </LocationMatch>
    

    Trying to access /products/aaa/bbb/ccc/ results in the 404 page. Here I'm expecting any request under "/products/aaa/bbb/ccc/" that does NOT end in .html to be passed to /search/, including any subsequent path info (e.g. .../search/compare).

    My Question

    I can't quite figure out what is wrong. According to Rubular, the supplied regex is correct.

    What am I missing here?

    I'd appreciate any advice on resolving this!
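
For the *.html rule in the question above, a similar check can show what would end up in the partnum group. This is a sketch, again in Python named-group syntax. AS_POSTED mirrors the question's first pattern; TIGHTENED is a hypothetical variant that is not part of the original post (it escapes the dot before "html", anchors the end, and keeps each segment slash-free), and the helper name partnum is likewise made up.

```python
import re

# The question's first pattern, with (?P<name>...) in place of (?<name>...).
AS_POSTED = re.compile(
    r"^/products/(?P<cat>.+)/(?P<subcat>.+)/(?P<subsubcat>.+)/(?P<partnum>.+).html"
)

# Hypothetical tightening (an assumption, not from the original post):
# literal ".html", end anchor, and no "/" inside any segment.
TIGHTENED = re.compile(
    r"^/products/(?P<cat>[^/]+)/(?P<subcat>[^/]+)/(?P<subsubcat>[^/]+)"
    r"/(?P<partnum>[^/]+)\.html$"
)


def partnum(pattern, path):
    """Return the captured product id, or None if the path does not match."""
    m = pattern.match(path)
    return m.group("partnum") if m else None


print(partnum(AS_POSTED, "/products/aaa/bbb/ccc/ddd3456.html"))
print(partnum(AS_POSTED, "/products/aaa/bbb/ccc/ddd3456xhtml"))
print(partnum(TIGHTENED, "/products/aaa/bbb/ccc/ddd3456xhtml"))
```

Because the dot before "html" is unescaped in the posted pattern, it matches any character, so a path ending in "xhtml" is also accepted; the tightened variant rejects it while still capturing ddd3456 from the example path.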