18 Aug, 2011 | Adam Henige in SEO

Google Not Indexing Your Site? Your Robots.txt Might Be To Blame.

A few months back we were contracted by a local organization that came to us with a very clear problem: they weren’t showing up on search engines.  Any of them.  For anything.  After a brief conversation with the developer we found out that the site had been live with a noindex command in the robots.txt file (for roughly two years, we were told).  This seemed like a simple enough fix.  We’d remove the robots.txt file, do some basic on-site tweaks, submit some XML sitemaps and run through the rest of the usual playbook, and next thing you know we’d have this site ranking.
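If you want to rule out the same problem on your own site, here is a minimal sketch in Python using the standard library's `urllib.robotparser`. The robots.txt content and the example.org URL are made up for illustration, since we can't share the real file, and the helper names `is_crawlable` and `unofficial_noindex_lines` are our own. The standard-library parser only understands Allow/Disallow rules, so the second helper simply scans for `Noindex:` lines as plain text.

```python
from typing import List
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption; not the client's real file).
BLOCKING_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def unofficial_noindex_lines(robots_txt: str) -> List[str]:
    """Return any 'Noindex:' lines, which the stdlib parser silently ignores."""
    return [
        line.strip()
        for line in robots_txt.splitlines()
        if line.strip().lower().startswith("noindex:")
    ]


if __name__ == "__main__":
    print(is_crawlable(BLOCKING_ROBOTS_TXT, "http://www.example.org/"))
    print(unofficial_noindex_lines("User-agent: *\nNoindex: /\n"))
```

To check a live site, you could fetch its `/robots.txt` yourself and pass the text to both helpers. A pass here only tells you the crawler is allowed in, not that the pages will actually be indexed.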

Now, the site in question was closely related to a government agency, so it’s not as if it didn’t have any links.  In fact, it had some high-quality .gov and .edu links under its belt already.  So we started the typical playbook stuff as we had intended, and before you know it the site was indexed and ranking in Yahoo and Bing, but there was still no sign of life in Google.  Well, that is of course if you don’t count the fact that Webmaster Tools was telling us the site IS being indexed.

The sitemap says we're indexed, right?

Pray tell Google, what is going on?

Obviously, we found this kind of odd.  So we figured we’d go directly to the Google and ask if they would reconsider the site, since everything looked good to us.  The response we got from big brother was as follows:

Dear site owner or webmaster of http://www.I’MNOTTELLING.org/,

We received a request from a site owner to reconsider http://www.I’MNOTTELLING.org/ for compliance with Google's Webmaster Guidelines.

We reviewed your site and found no manual actions by the webspam team that might affect your site's ranking in Google. There's no need to file a reconsideration request for your site, because any ranking issues you may be experiencing are not related to a manual action taken by the webspam team.

Of course, there may be other issues with your site that affect your site's ranking. Google's computers determine the order of our search results using a series of formulas known as algorithms. We make hundreds of changes to our search algorithms each year, and we employ more than 200 different signals when ranking pages. As our algorithms change and as the web (including your site) changes, some fluctuation in ranking can happen as we make updates to present the best results to our users.

If you've experienced a change in ranking which you suspect may be more than a simple algorithm change, there are other things you may want to investigate as possible causes, such as a major change to your site's content, content management system, or server architecture. For example, a site may not rank well if your server stops serving pages to Googlebot, or if you've changed the URLs for a large portion of your site's pages. This article has a list of other potential reasons your site may not be doing well in search.

If you're still unable to resolve your issue, please see our Webmaster Help Forum for support.

Sooooooo?  This left us stumped.  We engaged other SEOs to see what they thought (thank you SEOmoz Q&A) and most of our brethren seemed to agree that this was a rather peculiar situation.  We built a few more quality links into the site and didn’t see any changes, so we’re working with the client on deciding on some more drastic courses of action (moving the domain and using 301s is probably our preferred option).

The final bit of weirdness was that with the latest Google PageRank update, the site went from a PR0 (not an n/a, which I also thought was a little strange) to a PR4.  Several subpages are also PR4, which makes this the most authoritative group of non-indexed pages we’ve ever seen, both from a PageRank standpoint and from a subjective, qualitative one.

Anyhow, I just wanted to share this tale since it seems so out of the ordinary.  I’d like to hear if anyone else has had similar experiences, and to throw out some caution: Google seems to treat a long-standing noindex in robots.txt like Axl Rose treats other members of Guns N’ Roses.

UPDATE 8.18.11.

Last week the site was FINALLY indexed by Google.  From the moment we removed the offending robots.txt file, it took just over three months to get indexed.  It would seem the extensive period of self-induced non-indexation was a real problem.  This may not be a perfect guide for everyone, but at least it’s an example of how Google might react in a similar situation.

About Adam Henige

Adam Henige is Managing Partner of Netvantage Marketing, an online marketing company specializing in SEO, PPC and social media. Adam heads the SEO and link building efforts for Netvantage and has been a contributing blogger for industry publications like Search Engine Journal and SEOmoz.


Comments

  1. That addresses several of my concerns actually.
