Is it possible to replace content on every page passed through a proxy similar to how mod_rewrite is used for URLs?
Solution 1
There's an Apache module called mod_substitute that can do this. Here's a short example:
<Location "/">
AddOutputFilterByType SUBSTITUTE text/html
Substitute "s/uat.site.co.jp/jp.uat.site2uk.co.uk/ni"
</Location>
Or, when combined with mod_proxy:
ProxyPass / http://uat.site.co.jp/
ProxyPassReverse / http://uat.site.co.jp/
Substitute "s|http://uat.site.co.jp/|http://jp.uat.site2uk.co.uk/|ni"
There's more information at the Apache documentation for mod_substitute.
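For reference, here is a fuller sketch of how the pieces might fit together in one virtual host on Apache 2.4. The LoadModule paths are assumptions (they vary by distribution and are often already handled by the packaged configuration), and the host names are simply the ones used in the question. Asking the backend for uncompressed responses is included because the substitution filter can only match plain text.

```apache
# Module paths below are assumptions; adjust them or rely on your distribution's defaults.
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule headers_module modules/mod_headers.so
LoadModule filter_module modules/mod_filter.so
LoadModule substitute_module modules/mod_substitute.so

<VirtualHost *:80>
    ServerName jp.uat.site2uk.co.uk

    ProxyRequests Off
    ProxyPass / http://uat.site.co.jp/
    ProxyPassReverse / http://uat.site.co.jp/

    # Ask the backend for uncompressed responses so the filter sees plain text.
    RequestHeader unset Accept-Encoding

    <Location "/">
        # Only run the substitution on HTML responses.
        AddOutputFilterByType SUBSTITUTE text/html
        # "n" treats the pattern as a fixed string, "i" ignores case.
        Substitute "s|uat.site.co.jp|jp.uat.site2uk.co.uk|ni"
    </Location>
</VirtualHost>
```

After changing the configuration, restart Apache and check the page source in the browser to confirm that the host name has been replaced.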
Solution 2
If you haven't restarted Apache since changing the configuration, do that first. If you have already restarted it, you could try a global output filter that runs a custom PHP script to do the replacement, just to see whether that approach works.
EDIT: Based on your comment, it could be that Substitute isn't working because the content is compressed. To turn off compression, add these lines to your VirtualHost (they require mod_headers):
RequestHeader unset Accept-Encoding
RequestHeader set Accept-Encoding identity
If that doesn't work, try the following:
Add these to your conf, updating the paths of course:
#add this outside of any VirtualHost tags
ExtFilterDefine proxiedcontentfilter mode=output cmd="/usr/bin/php /var/www/proxyfilter.php"
#add these in your VirtualHost tag
RequestHeader unset Accept-Encoding
RequestHeader set Accept-Encoding identity
SetOutputFilter proxiedcontentfilter
In proxyfilter.php have some code like the following:
#!/usr/bin/php
<?php
$html = file_get_contents('php://stdin');
$html = str_ireplace('uat.site.co.jp', 'jp.uat.site2uk.co.uk', $html);
file_put_contents('php://stdout', $html);
If this works, then narrow the filter down to just text/html content, as you have in your example.
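One way to do that narrowing is in the filter definition itself: mod_ext_filter's ExtFilterDefine accepts an intype= parameter that limits the filter to a single media type. A minimal sketch, reusing the filter name and script path from above (both are placeholders for your own setup):

```apache
# Outside of any VirtualHost: run the script only on text/html responses.
ExtFilterDefine proxiedcontentfilter mode=output intype=text/html cmd="/usr/bin/php /var/www/proxyfilter.php"

# Inside the VirtualHost: request uncompressed content and apply the filter.
RequestHeader unset Accept-Encoding
SetOutputFilter proxiedcontentfilter
```

Responses with any other content type, such as images, should then pass through without being piped to the PHP script.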
Solution 3
According to https://httpd.apache.org/docs/2.4/mod/mod_proxy.html#proxypassreverse, ProxyPassReverse only rewrites the response headers. It does not rewrite URL references inside HTML pages.
To rewrite HTML content to match the proxy, you must load and enable mod_proxy_html.
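As a rough illustration, a mod_proxy_html setup for the hosts in the question might look like the sketch below. It is untested against the asker's site and rests on a few assumptions: the LoadModule paths are placeholders, and the ProxyHTMLLinks definitions from the sample proxy-html.conf shipped with Apache are expected to be included elsewhere, since without them the module does not know which attributes to rewrite.

```apache
# Module paths are assumptions; mod_xml2enc is commonly loaded alongside mod_proxy_html.
LoadModule proxy_html_module modules/mod_proxy_html.so
LoadModule xml2enc_module modules/mod_xml2enc.so

<VirtualHost *:80>
    ServerName jp.uat.site2uk.co.uk

    ProxyPass / http://uat.site.co.jp/
    ProxyPassReverse / http://uat.site.co.jp/

    # The HTML parser needs uncompressed markup from the backend.
    RequestHeader unset Accept-Encoding

    ProxyHTMLEnable On
    # Map absolute and protocol-relative links onto the proxy's host name.
    ProxyHTMLURLMap http://uat.site.co.jp/ http://jp.uat.site2uk.co.uk/
    ProxyHTMLURLMap //uat.site.co.jp/ //jp.uat.site2uk.co.uk/
</VirtualHost>
```

Unlike a plain text substitution, this approach parses the HTML and rewrites only link-carrying attributes, so host names that appear in ordinary page text are left alone.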
ZZ9
Updated on September 18, 2022
Comments
-
ZZ9 over 1 year
Is it possible to replace content on every page passed through a proxy similar to how mod_rewrite is used for URLs? The documentation on substitute is not clear.
I have some pages I am reverse proxying that have absolute paths. This breaks the site. They need replacing and tools like mod_rewrite are not picking them up as they are not URL requests.
<VirtualHost *:80>
ServerName servername1
ServerAlias servername2
ErrorLog "/var/log/proxy/jpuat_prox_error_log"
CustomLog "/var/log/proxy/jpuat_prox_access_log" common
RewriteEngine on
LogLevel alert rewrite:trace2
RewriteCond %{HTTP_HOST} /uat.site.co.jp$ [NC]
RewriteRule ^(.*)$ http://jp.uat.site2uk.co.uk/$1 [P]
AddOutputFilterByType SUBSTITUTE text/html
Substitute "s|uat.site.co.jp|jp.uat.site2uk.co.uk|i"
ProxyRequests Off
<Proxy *>
Order deny,allow
Allow from all
</Proxy>
ProxyPass / http://uat.site.co.jp/
ProxyPassReverse / http://uat.site.co.jp/
</VirtualHost>
Neither of the above works at replacing the HTML string
<link href="//uat.site.co.jp/css/css.css
with
<link href="//uat.site2uk.co.uk/css/css.css
Conf after changes:
<VirtualHost *:80>
ServerName jp.uat.site2uk.co.uk
ServerAlias uat.site.co.jp
ErrorLog "/var/log/proxy/jpuat_prox_error_log"
CustomLog "/var/log/proxy/jpuat_prox_access_log" common
ProxyRequests Off
<Proxy *>
Order deny,allow
Allow from all
</Proxy>
ProxyPass / http://uat.site.co.jp/
ProxyPassReverse / http://uat.site.co.jp/
AddOutputFilterByType SUBSTITUTE text/html
Substitute "s|uat.site.co.jp|jp.uat.site2uk.co.uk|ni"
</VirtualHost>
-
GregL about 9 years
I'm confused. That looks like it's from an HTML <a> tag. Clicking on that link likely won't result in the web browser following the link, but rather a file browser (Windows Explorer) trying to open the UNC. Are you trying to replace that string in HTML text?
-
ZZ9 about 9 years
The site works correctly. However, once we put it behind a firewall we of course get 404s on a bunch of CSS and images. Normally everything gets a 200.
-
ZZ9 about 9 years
They are from link tags on an IIS server: <link href="//fqdn/asset"
-
GregL about 9 years
I don't think you can provide UNC paths in link tags. If you can, I can't say it would be a good idea. In any event, that's not your question. According to the Apache docs, the Substitute directive is only valid inside Directory blocks or .htaccess files. Try creating a <Location> block (even if it's for /) and put the directive in there.
-
GregL about 9 years
Try a Location block instead, or read about their differences and use whichever one is better.
-
Tero Kilkanen about 9 years
@GregL, this format of URL is a "protocol-relative" URL. It is a perfectly valid way to link to pages, although it is not that commonly known. "//domain.com/path" makes the browser request the document with the same protocol that was used to request the page containing the link.
-
-
ZZ9 about 9 years
Hi, thanks for the suggestion. Unfortunately, I have not had much luck down this path. I have tested it outside of the proxy successfully, though. It appears mod_proxy ignores it.
-
Jenny D about 9 years
I added some more info which may be helpful.
-
ZZ9 about 9 years
I get an HTTP 200 on the page, but the browser shows: "Content Encoding Error. The page you are trying to view cannot be shown because it uses an invalid or unsupported form of compression."
-
g491 about 9 years
Ah, add these to your VirtualHost: RequestHeader unset Accept-Encoding and also RequestHeader set Accept-Encoding identity
-
g491 about 9 years
I updated my answer with something to try to get your original Substitute line working. I'd recommend trying that first, as it's simpler and may be what's going on.
-
ZZ9 about 9 years
Upvote for a great answer, but I got the other answer working first.
-
ZZ9 about 9 years
Thanks a lot, this works. Turned out to be a glitch with Apache picking up backups of my files in /etc/httpd/conf.d/ that didn't end in .conf (vhost.bak).
-
user3071284 about 3 years
This is true when ProxyPass and ProxyPassReverse are not used in <Location>
-
lorenzo-s over 2 years
Alternatively, you can decompress the incoming content before substituting and then compress it again afterwards, using just
SetOutputFilter INFLATE;DEFLATE