Robots.txt vs meta tag conflicts
Have you ever wondered what happens when your robots.txt and your meta tags contain conflicting directives about content indexing? Don't worry, this question can challenge even very advanced SEOs. To make the issue clearer, I decided to write this guide, which shows exactly what happens in each of these situations.
Robots.txt disallow vs meta noindex: which one has priority?
Case 1: Robots.txt blocks a URL, but the meta tag allows indexing. Outcome: the page is blocked by robots.txt, so Googlebot never sees the meta tag and it is ignored. The page will only appear in the results as a bare reference on very rare occasions (no description, only a title and URL).
Case 2: Robots.txt allows crawling of a URL, but the meta tag forbids indexing. Outcome: the page will not be indexed and will not be shown in the search results at all.
Case 3: Both robots.txt and the meta tag forbid indexing of a URL. Outcome: the page is blocked by robots.txt, so Googlebot never sees the meta tag and it is ignored, which still allows the page to appear as a snippet with no description in the search results on rare occasions.
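The crawl-blocking side of the three cases above can be simulated with Python's standard urllib.robotparser. The domain and paths below are made up for illustration, assuming a robots.txt that disallows a /private/ section:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site
robots_txt = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Cases 1 and 3: blocked by robots.txt. A compliant crawler never
# fetches the page, so any meta robots tag on it is never seen.
blocked = rp.can_fetch("Googlebot", "https://example.com/private/page.html")
print(blocked)  # False

# Case 2: allowed by robots.txt. The crawler fetches the page and only
# then discovers a meta noindex tag, which it can obey.
allowed = rp.can_fetch("Googlebot", "https://example.com/public/page.html")
print(allowed)  # True
```

This is why the meta tag "loses" whenever robots.txt blocks the URL: the directive sits inside content the crawler is forbidden to download.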
Meta noindex vs meta index conflict: if a single page has both a meta noindex and a meta index tag, the more restrictive noindex wins. The page will not be shown in the search results.
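This conflicting-tags case can be sketched with a minimal, hypothetical HTML head:

```html
<head>
  <!-- Two contradictory robots directives on the same page -->
  <meta name="robots" content="index, follow">
  <meta name="robots" content="noindex">
  <!-- The most restrictive directive wins: the page is not indexed -->
</head>
```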
Advanced stuff: passing PageRank in robots.txt and meta conflicts
Now that you understand the cases above, let's move on to something even more advanced: when will link juice be passed, and when will it not? As you may or may not know, even pages that are forbidden from being indexed by any means can still accumulate PageRank. If you don't handle this properly, you might be wasting a large amount of link juice that could have flowed to pages you actually want to receive it.
If you block a URL with robots.txt, Google cannot see the content of the page, so none of the links on that URL will ever be seen and they will not be able to pass their link juice.
On the other hand, if you use meta name="robots" content="noindex" and don't block that URL through robots.txt, Google will be able to crawl the page, follow its links and pass the link juice, but will not show the page in the search results.
From this we can conclude that in most cases it's a much better decision to forbid indexing through the meta tag than through robots.txt.
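A minimal sketch of the recommended setup, assuming you want the page kept out of the results but still crawlable so its links can pass equity (and assuming no robots.txt Disallow rule covers this URL):

```html
<!-- Crawlable but not indexed: links on the page still pass PageRank -->
<meta name="robots" content="noindex, follow">
```

Note that "follow" is the default behavior anyway; spelling it out just makes the intent explicit.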
Need help with your Magento store’s SEO? Request a quote!
How do I give a different meta tag to each page? I have a single website serving multiple languages using a WordPress plugin.
See the URL, for example.
That one is for the English content, and the same goes for Hindi (/hi), Korean (/kr), etc.
Nice case study, but I have a question. Suppose a page is blocked in robots.txt, but I get a backlink to that page from another website. What happens in this case? Will the page be indexed in the SERPs or not?
That’s a very interesting question.
It will not get indexed. It will most likely be displayed in the SERPs without a description, with a note saying that the page cannot be shown due to the robots.txt file.
Since Google can't crawl the page, you cannot benefit from the link juice that the external backlink gives you.
For this reason, it would be better if the page in question was not disallowed in robots.txt but noindexed through the meta tag instead. That would let the page forward the link juice to other pages on your website through internal navigation while still staying out of the SERPs.
Awesome, easy to understand explanation Toni.
I’ve implemented the directives suggested in this article: http://www.byte.nl/blog/magento-robots-txt/
Is this right, or am I blocking things that I shouldn’t block via robots.txt?
I also appreciated your video on dealing with the layered navigation SEO issue, although I am still unsure exactly how to block the search engines from this part of my site; the pages are growing fast in WMT! Can you recommend an extension that could handle this for me, as I'm still getting my head around Magento's code?
Great one! I also don't rely much on robots.txt.