Removing 'index.html' from url and adding 'www' with one single 301 redirect

Solution 1

To avoid the double redirect, add another rule to the .htaccess file that handles both conditions at once, like this:

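Options +FollowSymlinks -MultiViews
RewriteEngine on

# Non-www host AND trailing index.html: fix both in one redirect
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteCond %{REQUEST_URI} ^(.*/)index\.html$ [NC]
RewriteRule . http://www.%{HTTP_HOST}%1 [R=301,NE,L]

# Non-www host only: add the www prefix
# (^ instead of . so the site root, whose per-directory path is empty, also matches)
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^ http://www.%{HTTP_HOST}%{REQUEST_URI} [NE,R=301,L]

# Trailing index.html only: strip it
RewriteCond %{REQUEST_URI} ^(.*/)index\.html$ [NC]
RewriteRule . %1 [R=301,NE,L]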

So if the input URL is http://mydomain.com/path/index.html, both conditions of the first rule are satisfied and there is a single 301 redirect to http://www.mydomain.com/path/.

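Also, I believe the QSA flag is not needed above, since the rules do not manipulate the query string: when the substitution contains no query string, mod_rewrite passes the original one through unchanged.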
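
For contrast, here is a minimal sketch of a case where QSA would matter. The file names old.html and new.html and the from parameter are hypothetical, chosen only for illustration. When the substitution carries its own query string, mod_rewrite by default replaces the original query string with it; QSA appends the original one instead.

```apache
RewriteEngine On

# Hypothetical rule: the substitution adds its own query string (?from=old).
# Without QSA, /old.html?hello=babe would redirect to /new.html?from=old
# With QSA, it redirects to /new.html?from=old&hello=babe
RewriteRule ^old\.html$ /new.html?from=old [R=301,QSA,L]
```

None of the rules in Solution 1 put a query string in the substitution, which is why they can leave QSA out.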

Solution 2

A better solution would be to place the index.html rule ahead of the www rule and, inside the index.html rule, add the www prefix to the destination URL. This way someone requesting http://domain.com/index.html is sent to http://www.domain.com/ by the first rule. The second (www) rule then applies only when the URL has no index.html and the www prefix is missing, which is again only one redirect.
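
Solution 2 might look like the following sketch. It assumes the canonical host is www.mydomain.com (the domain used in the question) and that the .htaccess file sits in the document root. The host is hard-coded instead of prepending www. to %{HTTP_HOST}, so the first rule produces the right target whether or not the request already carries the www prefix.

```apache
RewriteEngine On

# Rule 1: strip a trailing index.html and force the www host in a single 301.
# %1 is the path up to and including the last slash, captured by the RewriteCond.
RewriteCond %{REQUEST_URI} ^(.*/)index\.html$ [NC]
RewriteRule ^ http://www.mydomain.com%1 [R=301,NE,L]

# Rule 2: any remaining request on a non-www host only needs the www prefix.
RewriteCond %{HTTP_HOST} !^www\.mydomain\.com$ [NC]
RewriteRule ^ http://www.mydomain.com%{REQUEST_URI} [R=301,NE,L]
```

With these two rules, http://mydomain.com/path/index.html is caught by rule 1 and goes straight to http://www.mydomain.com/path/, while http://mydomain.com/path/ is caught by rule 2. Either way the browser sees one redirect.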

Author by

Marco Demaio

Updated on June 04, 2022

Comments

  • Marco Demaio, almost 2 years ago

    To remove index.html or index.htm from URLs, I use the following in my .htaccess:

    RewriteCond %{REQUEST_URI} /index\.html?$ [NC]
    RewriteRule ^(.*)index\.html?$ "/$1" [NC,R=301,NE,L]
    

    This works! (More info about flags at the end of this question *)

    Then, to add www to URLs, I use the following in my .htaccess:

    RewriteCond %{HTTP_HOST} !^www\.mydomain\.com$ [NC]
    RewriteRule ^(.*)$ "http://www.mydomain.com/$1" [R=301,NE,L]
    

    This works too!

    The question is how to avoid the double redirection that the two rules above produce in cases like this one:

    1. The browser asks for http://mydomain.com/path/index.html
    2. The server sends a 301 header to redirect the browser to http://mydomain.com/path/
    3. The browser then requests http://mydomain.com/path/
    4. Now the server sends a 301 header to redirect the browser to http://www.mydomain.com/path/

    This is obviously not very smart, because a user who asks for http://mydomain.com/path/index.html is redirected twice and would feel the page loads too slowly. Moreover, Googlebot might stop following the link because of the double redirection (I'm not sure about this last point and I don't want to get into a discussion on it; it's just another possible issue).

    Thanks!


    * For anyone interested:

    • NC is used so that uppercase file names are redirected too, e.g. INDEX.HTML / InDeX.HtM
    • NE is used to avoid double URL encoding, so that http://.../index.html?hello=ba%20be is not redirected to http://.../index.html?hello=ba%2520be
    • QSA is used so that query strings are redirected too, e.g. http://.../index.html?hello=babe to http://.../?hello=babe (not needed, thanks to anubhava's answer)