Excluding Pages from a Search Engine’s Index continued
In the previous post I gave two examples of excluding a directory from being crawled and indexed by search engine bots. Now, if you simply want your whole website NOT to be indexed (for whatever reason, which is mystifying considering that you are reading an SEO blog and so presumably want to promote your site!), all you need to do is include this in your robots.txt file:
User-agent: *
Disallow: /
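If you would like to check what those two lines do before uploading them, here is a minimal sketch using Python's standard-library urllib.robotparser. The helper name is_allowed, the ExampleBot user agent and the www.example.com URLs are all made up for illustration, and a real search engine bot may read the file somewhat differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt from the post, one rule per list item.
BLOCK_EVERYTHING = [
    "User-agent: *",
    "Disallow: /",
]


def is_allowed(robots_lines, url, user_agent="ExampleBot"):
    """Return True if the given robots.txt lines let user_agent fetch url.

    "ExampleBot" is a hypothetical crawler name used only for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse in-memory lines, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs on a placeholder domain.
    for url in (
        "http://www.example.com/",
        "http://www.example.com/about.html",
        "http://www.example.com/diary/entry1.html",
    ):
        print(url, "->", is_allowed(BLOCK_EVERYTHING, url))
```

With "Disallow: /" in place, every path on the site starts with the blocked prefix, so a bot that respects robots.txt should not fetch any of them.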
You might think this isn’t necessary, especially if you never submitted the site to any directories or search engine submission pages. The truth is that as long as at least one site out there links to your page, there is a possibility that your page will be indexed, even though it might take a long time for that to happen. So if you really want to keep your site out of the SERPs, do use the robots.txt file.
If you want some files in a directory indexed but the rest of that directory kept out of search engine databases, you can also allow just those specific files. For example:
User-agent: *
Disallow: /diary/
Allow: /diary/keepoutwarning.html
This means that the contents of the directory diary will not be indexed except for the allowed page keepoutwarning.html.
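One thing worth testing is rule order. As I understand it, the major search engines pick the most specific (longest) matching rule, so the order of the Allow and Disallow lines does not matter to them. Some simpler parsers apply the first rule that matches instead, and Python's standard-library urllib.robotparser appears to be one of them in the versions I have looked at. The sketch below therefore puts the Allow line first, which should give the same answer under either interpretation. That reordering, the can_crawl helper, the ExampleBot user agent and the www.example.com domain are my own assumptions for illustration, not part of the example above.

```python
from urllib.robotparser import RobotFileParser

# Same rules as in the post, but with the Allow line moved first.
# This ordering is an assumption made for safety: it gives the same
# result whether a parser uses "first match" or "longest match".
ALLOW_FIRST = [
    "User-agent: *",
    "Allow: /diary/keepoutwarning.html",
    "Disallow: /diary/",
]


def can_crawl(robots_lines, path, user_agent="ExampleBot"):
    """Return True if user_agent may fetch path under the given rules.

    The domain and the "ExampleBot" name are placeholders for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse in-memory lines, no network access
    return parser.can_fetch(user_agent, "http://www.example.com" + path)


if __name__ == "__main__":
    for path in (
        "/diary/keepoutwarning.html",  # the one page we want crawled
        "/diary/secret.html",          # hypothetical page inside the blocked directory
        "/index.html",                 # hypothetical page outside the directory
    ):
        print(path, "->", can_crawl(ALLOW_FIRST, path))
```

If you rely on a particular crawler, it is worth checking its own documentation for how it resolves an Allow rule that overlaps a Disallow rule.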
For more information on how to use the robots.txt file, visit The Web Robots Pages. They also have a Robots Database, a robots.txt checker and an IP lookup, which are very useful tools.