The right way of using index.html
Solution 1
The reason we use index.html, home.html, or derivatives thereof is that the web server software itself looks for those names and serves the matching file. For example:
This is INVALID: (www-directory)
/var/www/
|_blog.html
|_blog/
   |_math.html
   |_page2.html
   |_page3.html
   |_(...)
With this structure, a request for the site root gets served as a page listing the folders and files (not what you want). You can try this structure and then add an index.html file next to blog.html: the server will pick up index.html and will not serve blog.html unless you explicitly request http://www.site.com/blog.html. This is why http://www.google.com/ shows the page without you having to specify http://www.google.com/index.html.
This is VALID:
/var/www/
|_index.html (renamed blog.html to index.html)
|_blog/
   |_math.html
   |_page2.html
   |_page3.html
   |_(...)
This will serve your blog.html file (now named index.html) AS THE HOMEPAGE, instead of listing all the folders/files in that directory.
The web server software has, in its config, a list of file names that will be served as the homepage or the main page of a folder. (In my experience, index.html takes precedence over index.php, so if you have both index.html and index.php in a folder, index.html is what the public will see.) Of course, that can all be changed, and you can even set blog.html to be recognized as an "index".
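To make that lookup concrete, here is a small Python model of it. It is only a sketch, not the server's actual code: the function name pick_index and the order of INDEX_NAMES are assumptions for illustration. On Apache, the real list and its order come from the DirectoryIndex directive in the configuration.

```python
from typing import Iterable, Optional, Sequence

# Assumed order for illustration only; a real server reads this list from
# its configuration (on Apache, the DirectoryIndex directive).
INDEX_NAMES = ("index.html", "index.php")


def pick_index(files_in_dir: Iterable[str],
               index_names: Sequence[str] = INDEX_NAMES) -> Optional[str]:
    """Return the first configured index name that exists in the directory."""
    present = set(files_in_dir)
    for name in index_names:
        if name in present:
            return name
    return None  # no index file: the server falls back to a listing or an error


if __name__ == "__main__":
    # Both candidates exist, so the one listed first wins.
    print(pick_index(["index.php", "index.html", "math.html"]))
    # No file in the directory matches the configured names.
    print(pick_index(["blog.html", "math.html"]))
    # blog.html is served as the index once it is added to the list.
    print(pick_index(["blog.html", "math.html"], index_names=("blog.html",)))
```

The precedence between index.html and index.php is therefore nothing more than the order of the names in that list.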
Addressing your comment:
"This trick would change the address of my blog from www.xxx.com/blog.html into www.xxx.com/blog/."
This would be done by moving blog.html into /blog/ and renaming it to index.html.
Your new structure would be:
/var/www/
|_blog/
   |_index.html (renamed from blog.html)
   |_math.html
   |_page2.html
   |_page3.html
   |_(...)
This should correctly serve http://www.site.com/blog/ with the contents of your blog.html, which we renamed to index.html so the software would treat it as the index of your /blog/ directory. You're also free now to put an index.html file into the root of your site, http://www.site.com/(index.html), with links to /blog/ and whatever else you wish.
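If you would rather script the move than do it by hand, it could look roughly like the Python sketch below. The helper name move_into_folder_index is made up for illustration, and the demonstration runs inside a throwaway temporary directory rather than a real /var/www/.

```python
import tempfile
from pathlib import Path


def move_into_folder_index(root: Path, page: str = "blog.html") -> Path:
    """Move root/blog.html to root/blog/index.html and return the new path."""
    source = root / page
    folder = root / Path(page).stem          # "blog.html" -> "blog"
    target = folder / "index.html"
    if target.exists():
        # Refuse to overwrite an index that is already in place.
        raise FileExistsError(f"{target} already exists")
    folder.mkdir(exist_ok=True)
    source.rename(target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Recreate the starting layout: blog.html next to a blog/ folder.
        (root / "blog").mkdir()
        (root / "blog" / "math.html").write_text("<h1>Math</h1>")
        (root / "blog.html").write_text("<h1>My blog</h1>")

        new_path = move_into_folder_index(root)
        print(new_path.relative_to(root).as_posix())
        print(sorted(p.name for p in (root / "blog").iterdir()))
```

Keep in mind that relative links inside the moved page change too: a link that used to read blog/math.html must become math.html once the page lives inside /blog/.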
Specifically answering your questions in short statements:
- Is it a good practice to have the index.html file in every subfolder or is it intended to be only in the root folder?
  Yes, because it prevents people from seeing what files are in your directories. You can also prevent this with a .htaccess file containing Options -Indexes
- Are there any disadvantages or problems that may occur when using the second, "index in every folder" method?
  None that I can think of.
- Which one of the two ways of structuring the website described above would you prefer?
  I usually have an index.html or index.php file in the root, subfolders based on category (such as forum, news, login, etc.), and then some sort of index inside each of those.
Solution 2
The technical term for index.html is Directory Index for Apache and Default Document for IIS. The other Apache directive of interest is the Options directive. As indicated in the documentation, when Options Indexes is set:
If a URL which maps to a directory is requested, and there is no DirectoryIndex (e.g., index.html) in that directory, then mod_autoindex will return a formatted listing of the directory.
When I set up a website that is not using a content management system, my preferred setup is to have one content page per directory. That page is the directory index (default document) for the directory. All links on the site link only to the directory and end with a trailing slash (e.g., http://example.com/blog/ instead of http://example.com/blog/index.html, or ./blog/ instead of ./blog/index.html). The trailing slash is important to avoid what is commonly referred to as a courtesy redirect. (If the trailing slash is omitted, everything still resolves correctly, but the server first answers with a redirect to the slash-terminated URL, so the number of HTTP requests, and thus bandwidth, increases.)
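The extra round trip can be modeled with a few lines of Python. This is a simplified sketch under stated assumptions: the DIRECTORIES set is a hypothetical site layout, every other path is assumed to exist, and the redirect is modeled as a 301, which is what Apache sends by default for a directory URL that lacks the slash.

```python
from typing import List, Set, Tuple

# Hypothetical site layout: the URL paths that map to directories.
DIRECTORIES: Set[str] = {"/", "/blog/"}


def fetch(path: str, directories: Set[str] = DIRECTORIES) -> List[Tuple[int, str]]:
    """Return the (status, path) pairs a client goes through to load `path`.

    Simplification: any path that is not a slash-less directory is assumed
    to exist and is answered with 200.
    """
    hops = []
    if not path.endswith("/") and path + "/" in directories:
        # The path names a directory but lacks the slash, so the server
        # answers with a redirect to the slash-terminated URL first.
        hops.append((301, path))
        path += "/"
    hops.append((200, path))
    return hops


if __name__ == "__main__":
    print(fetch("/blog"))    # two requests: redirect, then the page
    print(fetch("/blog/"))   # one request
```

Linking to /blog/ directly means every visitor takes the one-request path.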
My primary motivation for the above methodology is twofold. First, it facilitates switching the technology used on the website. For example, I can change a page from index.html to index.php without breaking any links or search engine listings. Second, the file extension of a content page is "noise"; removing the file extension from the URL results in shorter and hopefully more readable URLs.
As for other file types:
- All CSS files reside in a css directory in the root of the website.
- All image files reside in an image directory or subdirectory thereof in the root of the website.
- All JavaScript files reside in a scripts directory in the root of the website.
- All flash and other movie files reside in a video directory or subdirectory thereof in the root of the website.
On an Apache server, I disable Options Indexes for the abovementioned directories. On both Apache and IIS servers, I do not specify a directory index (default document) for the abovementioned directories. Thus, a request for any of the directories results in an HTTP 403 error.
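The three possible outcomes for a request that maps to a directory can be summarized in a short Python sketch. It is an illustrative model of the behavior described above, not server code; the function name directory_response and its parameters are assumptions.

```python
from typing import Iterable, Sequence, Tuple


def directory_response(files: Iterable[str],
                       index_names: Sequence[str] = ("index.html",),
                       listing_enabled: bool = False) -> Tuple[int, str]:
    """Return (status, what is served) for a request that maps to a directory."""
    present = set(files)
    for name in index_names:
        if name in present:
            return 200, name                 # the directory index is served
    if listing_enabled:
        return 200, "directory listing"      # auto-generated listing
    return 403, "forbidden"                  # no index and listings disabled


if __name__ == "__main__":
    # A content directory with its index page.
    print(directory_response(["index.html", "math.html"]))
    # An asset directory with no index configured but listings left on.
    print(directory_response(["site.css"], index_names=(), listing_enabled=True))
    # The setup described above: no index, listings off.
    print(directory_response(["site.css"], index_names=()))
```

Only the last combination hides the contents of the css, image, scripts, and video directories without placing a dummy index file in each of them.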
lukaszzenko
Updated on September 18, 2022
Comments
-
lukaszzenko over 1 year
I have quite a lot of issues I'd like to hear your opinion on, so I hope I'll manage to explain them well enough. I should also note that I'm a beginner equipped only with knowledge of HTML and CSS, so although I'm almost sure that there is a simple solution using powerful PHP, it won't help me.
Let's say that I have my personal blog at the address example.com/blog.html and there are links to several sub-blogs: example.com/blog/math.html, example.com/blog/coding.html, etc. So my root folder contains blog.html and a blog folder; the blog folder itself contains the files math.html and coding.html.
First of all, I learned (from Google Webmaster Tools) that for SEO and aesthetic purposes it's good to unify example.com and example.com/index.html by adding the rel="canonical" attribute to the source of index.html. Using a couple of other tricks (like linking to ../ and ./) I got rid of the ugly index.html appearing in my web addresses.
And now I wonder if this trick can be used not only for the root folder but for any folder. I mean, I would move my blog.html into the blog folder, rename it to index.html, and add rel="canonical" to unify example.com/blog/index.html with example.com/blog/.
This trick would change the address of my blog from example.com/blog.html into example.com/blog/.
Not finished! I'm also experiencing problems with the Google robot indexing my folders. When I type site:example.com/ into Google search, the link to my folder example.com/blog/ with raw files, icons, etc. appears among the other results. I guess there are also other ways to fix it, but IMHO the change mentioned above would do the trick too: the index.html in the blog folder would prevent the user from viewing the actual raw content of that folder, only the right link example.com/blog/ would appear in Google search, and (I hope) rel="canonical" would keep the second, unwanted link example.com/blog/index.html from appearing in the search results.
So my questions are:
- Is it a good practice to have the index.html file in every subfolder or is it intended to be only in the root folder?
- Are there any disadvantages or problems that may occur when using the second, "index in every folder" method?
- Which one of the two ways of structuring the website described above would you prefer?
-
Admin almost 11 years
For my clarification, do search engines see site.com/blog and site.com/blog/index.html as being 2 distinct files? If links with both URLs are being used, is there a chance you are splitting link juice/page authority between 2 locations?
-
Admin almost 11 years
As far as I know, search engines (at least Google) DO see them as two distinct files, because they actually can be distinct even though the two links may differ by only a single slash. (Read more here.) And yes, if you're using two different links to one page, the rank of the page is split between those two links and your page effectively loses half of its rank. That's why I suggest the mentioned link canonicalization to prevent those leaks.
-
lukaszzenko almost 12 years
Thank you for such a comprehensive answer! The public access to my folders, and the fact that they are indexed by Google, makes me quite angry, so now that I know there's no problem with the "index in every folder" trick, I will change my website this way. I just hope that the rel="canonical" trick will work and all those indexes won't appear in Google search... :D
-
Ryan Prechel almost 12 years
Due to the two-link limit restriction, I could not include links to Directory Index and Default Document in my answer, so here they are.