Tutorial
10.18.2009 00:00    Categories: Webmaster Resources     

How to create a robots.txt file for your Boonex Dolphin web site or any site.

Whether you have a Boonex Dolphin site or any other type of site, and regardless of which company hosts it, you should consider adding a robots.txt file.

A robots.txt file tells robots which parts of your domain they may read and which parts they should stay out of. There are hundreds of internet agents, or robots, out there, so it's worth giving some direction to those that actually do read the robots.txt file.

There are a number of robots that claim to follow and abide by the rules of your robots.txt file, but there are plenty that simply ignore it.

If there is a particular robot that you don't want visiting your site and pages at all, you should use .htaccess, which can block it completely.

There are many discussions about whether a robots.txt file is really that useful. It only takes a few moments to create one, so it's something to think about and work on in your spare time.

The general concept behind a robots.txt file is that you can tell crawlers or bots, such as the Googlebot, that you don't want them looking in certain folders/directories or files on your site. You are basically telling them not to crawl those locations, with the aim of keeping them out of the index. By default, many robots (bots) will look to see if your site has a robots.txt file.

As I mentioned, many of these bots ignore the rules you set up in robots.txt these days. Even Google has stated, in a video on Matt Cutts' blog, that Google will obey your robots.txt file, but if other sites link to a blocked page, Google may still add it to the index. Google recommends adding noindex to the robots meta tag in the head of your web page if you do not want it indexed. To me it doesn't seem right that the Googlebot supposedly obeys your robots.txt file but may still index the page. It kind of defeats the purpose of having a robots.txt file to begin with.

Say you wanted to prevent bots from crawling your Boonex Dolphin admin folder / directory (http://www.yoursite.com/admin).

The rule itself looks like this:
Disallow: /admin/

A Disallow line only takes effect inside a group that starts with a User-agent line, so the complete entry is:
User-agent: *
Disallow: /admin/


Continue adding folders/directories and files you do not want crawled:
User-agent: *
Disallow: /admin/
Disallow: /inc/
Disallow: /langs/
Disallow: /xml/

Or add specific files to it:
User-agent: *
Disallow: /admin/
Disallow: /inc/
Disallow: /langs/
Disallow: /xml/
Disallow: /greet.php

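The path is relative to the site root and must begin with "/". A folder/directory should also end with a trailing "/" so that the rule covers everything inside it; a single file such as /greet.php has no trailing "/".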
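
If you want to check how a well-behaved crawler would read rules like these before uploading them, here is a minimal sketch using Python's standard urllib.robotparser module. The helper names build_parser and is_allowed are made up for this example, and www.yoursite.com is the same placeholder domain used throughout this tutorial. It only shows how a bot that honors robots.txt would interpret the rules, not what any particular bot actually does.

```python
from urllib.robotparser import RobotFileParser

# The example rules from this tutorial.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /inc/
Disallow: /langs/
Disallow: /xml/
Disallow: /greet.php
"""


def build_parser(text):
    """Parse robots.txt content from a string (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, path, agent="*"):
    """Return True if a rule-abiding bot may fetch the given path."""
    return parser.can_fetch(agent, "http://www.yoursite.com" + path)


parser = build_parser(ROBOTS_TXT)
for path in ("/admin/index.php", "/greet.php", "/index.php", "/admin"):
    print(path, "allowed" if is_allowed(parser, path) else "blocked")
```

Note that /admin without the trailing "/" is reported as allowed, because the rule /admin/ only matches paths that start with exactly that text.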

Simply use any text editor, such as Notepad, and save the file as robots.txt, then upload it to the root of your site (http://www.yoursite.com/robots.txt).

Continue adding folders/directories and files to suit your site's needs. This is not Dolphin specific; it can and should be used for any website.
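
If you would rather generate the file than type it by hand, something along these lines should work. This is only a sketch: the function names build_robots_txt and write_robots_txt are invented for this example, and you still have to upload the resulting file to your site root yourself.

```python
from pathlib import Path


def build_robots_txt(disallowed_paths, user_agent="*"):
    """Return robots.txt text with one Disallow line per path (hypothetical helper)."""
    lines = [f"User-agent: {user_agent}"]
    for path in disallowed_paths:
        # Every path must be relative to the site root, so it starts with "/".
        if not path.startswith("/"):
            raise ValueError(f"path must start with '/': {path!r}")
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


def write_robots_txt(directory, disallowed_paths):
    """Save the generated rules as robots.txt inside the given local directory."""
    target = Path(directory) / "robots.txt"
    target.write_text(build_robots_txt(disallowed_paths), encoding="utf-8")
    return target


print(build_robots_txt(["/admin/", "/inc/", "/langs/", "/xml/", "/greet.php"]))
```

Calling write_robots_txt(".", ["/admin/", "/greet.php"]) would save the file in the current folder, ready to upload.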


One helpful thing you can do is add the location of your XML sitemap to robots.txt. This way, when a crawler or bot comes to your site and finds your robots.txt file, you can tell it where to find your sitemap.xml containing all your links and web pages. Sort of a little plus or bonus.

Simply add this to the bottom of your robots.txt file:
Sitemap: http://www.yoursite.com/sitemap.xml

*Obviously, you should have created a sitemap.xml first and uploaded it to your root too (http://www.yoursite.com/sitemap.xml).

So your robots.txt file would then look something like:
User-agent: *
Disallow: /admin/
Disallow: /inc/
Disallow: /langs/
Disallow: /xml/
Disallow: /greet.php
Sitemap: http://www.yoursite.com/sitemap.xml
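
To confirm that the Sitemap line in a file like the one above can be picked up, here is a small sketch that parses the same text with Python's standard urllib.robotparser module. It assumes Python 3.8 or newer, since that is when the site_maps() method was added, and read_sitemaps is a made-up helper name for this example.

```python
from urllib.robotparser import RobotFileParser

# The complete example file from this tutorial.
FULL_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /inc/
Disallow: /langs/
Disallow: /xml/
Disallow: /greet.php
Sitemap: http://www.yoursite.com/sitemap.xml
"""


def read_sitemaps(text):
    """Return the sitemap URLs listed in robots.txt text (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(read_sitemaps(FULL_ROBOTS_TXT))
```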


After you have uploaded the robots.txt file, there is nothing more you need to do. The bots will find the file all by themselves once they visit your site.

If you haven't created a sitemap for your site, you can see this article for more information: Create and Generate Sitemap Files for your Site.

For a quick and easy robots.txt generator check this site out:
Robots.txt Generator Tool

And .htaccess File Generator Tool:
.htaccess File Generator
