Why hide your website from search engines?
Here are a number of reasons why you may want to make a site impossible to find:
- Exclusivity - your website has a special section reserved only for people coming from a specific place - such as holders of a loyalty card that gives access to a promotional offer.
- Privacy - businesses often need a private area on their website for employees. They usually do not want this section to be indexed.
- Low quality webpage - an important factor in search engine optimization is the quality of the content on your website. If you have some low quality pages, it is good practice to block search bots from indexing the pages.
- The page is irrelevant to your business - another key SEO ranking factor is how your pages are interlinked and how they relate to one another and your business.
You may wish to hide pages that are off-topic or outside your line of business from search engines. This sends search engines a clear message about what you actually do and avoids diluting it.
Obviously I cannot leave a link here to our uncrawlable site, as our client required a hidden website that was only accessible to clients passing through the private area of a popular shopping website.
How to build an unindexable, uncrawlable site
There are lots of ways to keep your website out of search, and some practices that must be avoided. In our experience the first 4 should do the trick:
1. Absolutely no links on the web to your website
Avoid leaving any scent that a robot may pick up on.
2. Robots noindex meta tag on every page
The meta tag - <meta name="robots" content="noindex"> - placed in the head section of every page sends a direct instruction to a robot NOT to index the page, even if the robot stumbles across it.
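Since the tag has to be present on every page, it can help to check pages automatically before launch. The following is a minimal sketch in Python, using only the standard library; the names RobotsMetaParser and has_noindex are our own illustrations, not part of any existing tool.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list, e.g. "noindex, nofollow".
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

You could run has_noindex over the HTML of each page of the site and list any page where it returns False.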
3. Robots.txt
Robots.txt is a file written specifically for robots and placed in the website root folder; most websites have one. In it you can write instructions telling robots not to crawl part, or all, of your website. This is a highly effective way of blocking major search engines. The following two lines of code would block robots from your entire website:
User-agent: *
Disallow: /
Use robots.txt only if step (2) above has not worked, because the two methods conflict. Blocking a page in robots.txt stops robots from crawling it, so they never read the meta noindex instruction on that page. It does not necessarily stop the page from being indexed, for example if a robot finds a link to it elsewhere.
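To see how a well-behaved robot would interpret a robots.txt file, you can test the rules with Python's standard urllib.robotparser module. This is a sketch only: the helper name is_blocked and the example.com URLs are our own assumptions, and it tells you nothing about robots that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every robot to stay away from the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)
```

For example, is_blocked(BLOCK_ALL, "https://example.com/private/offer.html") reports that the page is off limits, whereas an empty robots.txt blocks nothing.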
4. Remove the pages from your site map
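The reasoning is the same as in (1) above: a site map is a list of links written for robots, so any page listed in it can be found.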
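If your site map is an XML file in the standard sitemaps.org format, the hidden pages can be stripped out with a short script. The sketch below assumes that format and assumes the hidden pages share a common URL prefix; remove_from_sitemap and the example URLs are hypothetical names for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML site maps.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def remove_from_sitemap(sitemap_xml: str, hidden_prefix: str) -> str:
    """Return the site map with every <url> whose <loc> starts with hidden_prefix removed."""
    # Keep the default namespace unprefixed when the tree is written back out.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    # Copy the list first, because we remove elements while looping.
    for url in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        if loc is not None and (loc.text or "").strip().startswith(hidden_prefix):
            root.remove(url)
    return ET.tostring(root, encoding="unicode")
```

Calling remove_from_sitemap(xml_text, "https://example.com/private/") would return the same site map without any entry under /private/.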
5. Password protect your site
Still paranoid, or just want to be absolutely sure? Add password protection or a user login to your website.
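How you add a password depends on your server or framework, and most offer it as a built-in setting. Purely as an illustration of the idea, here is a sketch of the check behind HTTP Basic authentication, one common form of password protection. The function name is_authorized and the credentials are hypothetical, and Basic authentication should only be used over HTTPS.

```python
import base64
import hmac


def is_authorized(auth_header, username: str, password: str) -> bool:
    """Check an HTTP 'Authorization: Basic ...' header value against one login."""
    if not auth_header or not auth_header.startswith("Basic "):
        return False
    try:
        raw = base64.b64decode(auth_header[len("Basic "):], validate=True)
        decoded = raw.decode("utf-8")
    except ValueError:
        # Covers malformed base64 and bytes that are not valid UTF-8.
        return False
    user, sep, pwd = decoded.partition(":")
    if not sep:
        return False
    # compare_digest avoids leaking information through timing differences.
    user_ok = hmac.compare_digest(user.encode("utf-8"), username.encode("utf-8"))
    pwd_ok = hmac.compare_digest(pwd.encode("utf-8"), password.encode("utf-8"))
    return user_ok and pwd_ok
```

A server would answer any request that fails this check with status 401, so a robot without the password never sees the page content.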
Does it work?
Yes. We are currently building our fourth unindexable website (using methods 1, 2 and 4 above), and to date none of these sites has had a single visit via search.