Robots.txt for Silo SEO
The robots.txt file is like a roadmap: each road is an access point, and by using robots.txt you can control access to specific roads.
“Robots” refers to bots. A bot is generally an automated crawler that goes through your site, locating and following links and drawing out a roadmap as it goes.
The problem with bots is that they don’t know when or where to stop. They have zero intelligence; they just collect a site’s data and take it back home for the bigger brothers to crush up and analyze. The bots’ bigger brothers are the more powerful automated machines that sit in server farms and crunch numbers based on rule sets, in this case algorithms. These algorithms are used to determine what your site is about and, eventually, its ranking in the SERPs (Search Engine Result Pages).
The problem is that both can start to get confused if you have a messy site structure. WordPress contains so many overlapping roads that it can, and does, make our sites appear more like a set of roundabouts.
Silo SEO WordPress Robots.txt
Never allow indexing of your cgi-bin for the love of god.
- User-agent: *
- Disallow: /cgi-bin
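You can sanity-check a simple prefix rule like this with Python’s built-in urllib.robotparser (a quick sketch; the example.com domain is just a placeholder for your site):

```python
from urllib.robotparser import RobotFileParser

# Feed the two rules above to the parser as if they came from a live robots.txt
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /cgi-bin",
])

# Anything under /cgi-bin is off limits; normal content is still crawlable
print(rp.can_fetch("*", "http://example.com/cgi-bin/script.cgi"))  # False
print(rp.can_fetch("*", "http://example.com/blog/my-post/"))       # True
```

Note that urllib.robotparser follows the classic robots.txt spec and treats `*` inside a path literally, so it is only reliable for plain prefix rules like this one, not for the wildcard rules further down.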
Next up we need to tell the bots not to bother indexing our private WordPress directories.
- Disallow: /wp-admin
- Disallow: /wp-includes
- Disallow: /wp-content/plugins
- Disallow: /wp-content/cache
- Disallow: /wp-content/themes
We also need to block access to our feeds. Why would we want the bots crawling through our feeds, right? We want them crawling our onsite content so it ranks well in the SERPs.
- Disallow: /feed
- Disallow: /*/feed
Next up is comments. We want our comments treated as part of the onsite content, not indexed through the comment feed.
- Disallow: /comments
You also don’t want the bots indexing author archives, because they just add more and more onsite duplicate content.
- Disallow: /author
Another one that adds duplicate content is our tag archives.
- Disallow: /tag
And believe it or not, the date archives are also a problem for SEO, so let’s just block the entire archives from the search engines.
- Disallow: /archives
And just to make sure the bots don’t go near the date archives, put this in.
- Disallow: /2010/*
- Disallow: /2011/*
- Disallow: /2012/*
You also don’t want any iframes being indexed. NOTE: this is pointless unless you create an iframes directory.
- Disallow: /iframes
The same goes for boilerplate legal pages like your privacy policy and site agreement.
- Disallow: /privacy-policy.html
- Disallow: /web-site-agreement.html
You also don’t want your categories being indexed. We cut this out in the Basic Bogan Training, but you can do it here as well. Note: don’t add this to your robots.txt unless you have followed along with Module 3.2 WordPress SEO in the Bogan Basic Training.
- Disallow: /category/*/*
And forget indexing trackbacks.
- Disallow: */trackback
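Rules like /category/*/* and */trackback rely on Google’s wildcard extension to robots.txt, where * matches any run of characters (including an empty one). Here is a rough sketch of how that matching works; the rule_matches helper is my own simplified model, not an official API, and it skips details like percent-encoding and longest-match precedence:

```python
import re

def rule_matches(rule: str, path: str) -> bool:
    """Match a robots.txt rule against a URL path, Google-style:
    '*' matches any sequence of characters, a trailing '$' anchors the end."""
    pattern = re.escape(rule).replace(r"\*", ".*")
    if pattern.endswith(r"\$"):
        pattern = pattern[:-2] + "$"
    return re.match(pattern, path) is not None

# /category/*/* needs at least one more slash after the category name
print(rule_matches("/category/*/*", "/category/seo/silo-structure/"))  # True
print(rule_matches("/category/*/*", "/category/"))                     # False

# */trackback catches the trackback URL hanging off any post
print(rule_matches("*/trackback", "/my-post/trackback"))  # True
print(rule_matches("*/trackback", "/my-post/"))           # False
```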
Cool, now we are looking sweet in terms of WordPress Silo SEO.
But you also don’t want certain file types being indexed. For example, run a filetype: search in Google, such as filetype:pdf plus a keyword, and look at what comes back.
Scary, right? I can remember doing all sorts of crazy stuff with this back in the day; people had no idea Google was indexing file types.
Here is a good starting list of file extensions to block. You can also make up your own file extensions and block them, which lets you store hidden files; it works well.
- User-agent: Googlebot
- Disallow: /*.php$
- Disallow: /*.js$
- Disallow: /*.inc$
- Disallow: /*.css$
- Disallow: /*.gz$
- Disallow: /*.wmv$
- Disallow: /*.cgi$
- Disallow: /*.xhtml$
- Disallow: /*.xlsx$
- Disallow: /*.doc$
- Disallow: /*.pdf$
- Disallow: /*.zip$
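The $ at the end of each of these rules anchors the match to the end of the URL, so /*.pdf$ blocks /files/report.pdf but not /report.pdf?download=1. A quick sketch of that end-anchor behavior (the rule_matches helper is my own simplified model of Google-style matching, not an official API):

```python
import re

def rule_matches(rule: str, path: str) -> bool:
    """Google-style robots.txt matching: '*' is a wildcard,
    a trailing '$' means the rule must match up to the very end."""
    pattern = re.escape(rule).replace(r"\*", ".*")
    if pattern.endswith(r"\$"):
        pattern = pattern[:-2] + "$"
    return re.match(pattern, path) is not None

print(rule_matches("/*.pdf$", "/files/report.pdf"))       # True
print(rule_matches("/*.pdf$", "/report.pdf?download=1"))  # False: $ demands .pdf at the end
```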
Because we blocked so much under the wp-* directories, you will need to move your uploads out of wp-content/uploads into another directory. I suggest you just create images.
- Allow: /images
Now just add a link to your sitemap (take this out for mass blog installs). You will need to install the XML Sitemap plugin to generate this file.
- Sitemap: http://yourdomain.com/sitemap.xml.gz
That’s it, you’re now solid. Forget paying for Silo plugins or whatever. If you want more, I suggest you check out Module 3.2 WordPress SEO in the Basic Bogan Training so you can get your permalinks perfect for SEO.
Below is the full robots.txt file. If you copy and paste the code below into a .txt file called robots.txt and upload it to your site’s root directory, the bots will treat your site as a Silo SEO WordPress blog.
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /feed
Disallow: /*/feed
Disallow: /comments
Disallow: /author
Disallow: /tag
Disallow: /archives
Disallow: /2010/*
Disallow: /2011/*
Disallow: /2012/*
Disallow: /iframes
Disallow: /privacy-policy.html
Disallow: /web-site-agreement.html
Disallow: /category/*/*
Disallow: */trackback
Allow: /images

User-agent: Googlebot
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.gz$
Disallow: /*.wmv$
Disallow: /*.cgi$
Disallow: /*.xhtml$
Disallow: /*.xlsx$
Disallow: /*.doc$
Disallow: /*.pdf$
Disallow: /*.zip$

Sitemap: http://yourdomain.com/sitemap.xml.gz
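As a final sanity check, here is a rough sketch that parses a cut-down version of the file above into per-user-agent groups and tests a few URLs against it with Google-style wildcard matching. The helper names are my own, and this is a simplified model; real crawlers also apply longest-match precedence when Allow and Disallow both match:

```python
import re

def rule_matches(rule, path):
    # '*' is a wildcard, a trailing '$' anchors the match to the end of the URL
    pattern = re.escape(rule).replace(r"\*", ".*")
    if pattern.endswith(r"\$"):
        pattern = pattern[:-2] + "$"
    return re.match(pattern, path) is not None

def parse_groups(robots_txt):
    """Split robots.txt into {user-agent: [(directive, value), ...]}."""
    groups, current = {}, None
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            current = value
            groups.setdefault(current, [])
        elif field in ("allow", "disallow") and current is not None:
            groups[current].append((field, value))
    return groups

def is_blocked(groups, agent, path):
    # A crawler uses its own group if one exists, otherwise the '*' group
    rules = groups.get(agent)
    if rules is None:
        rules = groups.get("*", [])
    # Simplification: a matching Allow wins outright
    for directive, value in rules:
        if directive == "allow" and rule_matches(value, path):
            return False
    return any(directive == "disallow" and rule_matches(value, path)
               for directive, value in rules)

robots = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /tag
Disallow: /category/*/*
Allow: /images

User-agent: Googlebot
Disallow: /*.pdf$
"""

g = parse_groups(robots)
print(is_blocked(g, "*", "/tag/seo/"))             # True
print(is_blocked(g, "*", "/images/logo.png"))      # False
print(is_blocked(g, "Googlebot", "/files/x.pdf"))  # True
```

Notice the group-selection rule: because Googlebot has its own group, it reads only that group, not the `*` group as well, which is a common robots.txt gotcha worth keeping in mind.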