I have been experiencing a rather weird issue with my blog recently, one that I hadn’t noticed until I accessed my Google Webmaster account. I hadn’t noticed it because it wasn’t easily identifiable when viewing my blog or its administrative pages. The problem was that several of my posts were returning a “404 Not Found” status code instead of a “200 OK” status code.
What made this problem weird was that it was only happening on about 5 pages of my blog. All the other pages were returning a “200 OK” status code. Since all post pages use the same template file, and are therefore created in exactly the same way, I couldn’t figure out what caused the problem. While the headers of the pages indicated a 404, the pages displayed in Web browsers without any problems.
Identifying the Problem

Every once in a while I access my Google Webmaster account to get an idea of what Google has to say about my blog. Usually, I don’t see anything too alarming, which is a good thing, but the other day I noticed that there were several crawl errors.
Usually the crawl errors occur because I have restricted the Googlebot from accessing some pages that I don’t wish to be indexed. This time, however, the crawl errors indicated that several of my post pages were returning a “404 Not Found” status code.
Thinking that there had been a glitch, such as my blog being down for a few minutes so that the pages couldn’t be accessed, I began looking at the headers returned by the pages. To do this I used Google’s Chrome browser:
- I right-clicked an empty area of the post returning a 404, and selected “Inspect element” from the menu. This opened a frame or window at the bottom of the browser.
- Next I clicked the “Resources” option in the bottom window to display a list of all resources that were downloaded for the post page.
- To view the header information, I clicked the page in the list of resources. The headers for the page opened up on the right, with the status code shown. Sure enough, the post page returned a “404 Not Found” status code.
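For checking more than a handful of pages, the same header inspection could be scripted. The following is a minimal sketch in PHP, assuming the built-in get_headers() function is allowed to make outbound requests on the machine running it. The function names parse_status_code and fetch_status_code, and the example URL in the comment, are hypothetical and not part of the original investigation.

```php
<?php
// Extract the numeric status code from a status line such as
// "HTTP/1.1 404 Not Found". Returns null if the line is not a status line.
function parse_status_code(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Request a URL and return the status code of the first response.
// get_headers() returns the status line as element 0 of its result,
// or false if the request fails.
function fetch_status_code(string $url): ?int
{
    $headers = @get_headers($url);
    if ($headers === false) {
        return null; // request failed, no status available
    }
    return parse_status_code($headers[0]);
}

// Example call (hypothetical URL):
// var_dump(fetch_status_code('http://www.example.com/some-post/'));
```

Running fetch_status_code() over a list of post URLs would make it easy to spot the pages that report a 404 even though they display normally in the browser.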
I tried other post pages on my blog, and they all returned a “200 OK” status. I was confused as to why a few of the pages would return a 404 status code.
Temporary Solution
I began searching for a solution to this problem, but wasn’t able to find one. Apparently, not many people have run into the same problem. I decided to temporarily manage the status code from within my blog instead of having the server send it to the requesting client.
To accomplish that task, I included the following lines of code at the very top (above the DOCTYPE and html definitions) of my header.php file:
<?php
// Send a 404 when the main query found no posts, otherwise send a 200.
if (!have_posts()) {
    header("Status: 404");
} else {
    header("Status: 200");
}
?>
This is only a temporary solution, as there are problems with this approach. The web server is no longer managing the return codes of my blog, and only two return codes are sent back to the client. This means that for redirects, the client would receive a 200 status code instead of a 301. I may be able to alter the above code to return other status codes, but I would much rather find a solution to my problem than rely on including status code management in my template.
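One possible refinement of the snippet above, sketched here under several assumptions, is to force a 200 only when WordPress found posts and did not itself flag the request as a 404, and to leave every other response (redirects, genuine 404s) untouched. add_action(), have_posts(), is_404() and status_header() are WordPress functions; the helper should_force_200, the choice of the template_redirect hook, and placing the code in the theme's functions.php are assumptions, and the sketch has not been tested against the problem described here.

```php
<?php
// Decide whether the status should be forced to 200.
// Only true when posts were found AND WordPress did not flag a 404,
// so genuine 404s are never overwritten.
function should_force_200(bool $hasPosts, bool $flaggedAs404): bool
{
    return $hasPosts && !$flaggedAs404;
}

// Wire the check into WordPress only when WordPress is loaded,
// so this file can also be run on its own without errors.
if (function_exists('add_action')) {
    add_action('template_redirect', function () {
        if (should_force_200(have_posts(), is_404())) {
            status_header(200); // WordPress helper that sends the status line
        }
    });
}
```

Because nothing is sent in the other cases, whatever status the server or WordPress has already chosen stays in place, which avoids the two-codes-only limitation. It still works around the symptom rather than explaining why the 404 appears in the first place.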
So now a few questions: has anyone experienced the same issue that I am experiencing? Do you have an idea of what is causing this problem? I would like to find the answer to this problem.

on August 4, 2010 at 12:27 am
It is bizarre that it wasn’t duplicated on all the pages. I have had 404 errors:
1) When the hosting company was doing some maintenance, and once when they were working on an error with some PHP files. Weird in that my blog looked OK; I just couldn’t log into my WordPress admin. Luckily it was short-lived.
2) When a plugin caused the pages to mess up. I never figured out which one, because I just uninstalled all the plugins and then reinstalled them, and the error went away. This seems to be the cure for all the weird stuff I encounter.
on August 4, 2010 at 8:15 am
I agree, and am probably leaning towards a plugin as the problem. I’m thinking it may be my cache plugin, so I’ll have to look deeper into the issue.
on August 22, 2010 at 4:31 am
Hmm, weird. I never had this problem before. As in, I had 404 problems, but that was because the bots were still using the old links instead of the new ones.
on August 22, 2010 at 8:49 pm
I’ve had the problem of search engines using old links instead of the new ones as well. I find this problem, however, to be eerie as it only happens on a few pages.
on March 19, 2011 at 12:20 pm
Well, call me nuts, guys, but I’ve been having this same problem with just one page on my site and it’s been driving me crazy.
The support staff at GoDaddy could only visit the page and insist on telling me it was OK, without ever actually checking that it was returning 404 instead of 200.
So I started investigating further…
Firstly in my mind there were two problems –
1) http://www.proposedsolution.com/welcome/ was showing me a WordPress Page using a template called Welcome, which was correct and exactly what I wanted, except that it was returning a 404 at the same time as showing the correct page. Hence the mystery.
2) http://www.proposedsolution.com/welcome (without the trailing slash) was bringing me to GoDaddy’s bog-standard 404 page and not even redirecting to my own 404 page, which is what any 404 on my site is meant to do, but that’s another story.
So I decided to place a little blank welcome.html file in my root directory and BAM… both URLs started going to the correct welcome page and returned a 200 response code. Weird or what?
Now I can only surmise that this mysterious phenomenon is being caused by a number of factors including but possibly not limited to:
1) My .htaccess file – I have it set to rewrite URLs without a trailing slash to include a slash. But this raises the question: why did /welcome without the slash not get rewritten by .htaccess instead of going to GoDaddy’s 404 page?
2) My WordPress page template is called welcome as is my page itself – is this causing issues?
3) I’ve read in other places that the server configuration can cause pages to return incorrect HTTP status codes, but I never got deep enough to start investigating that due to GoDaddy’s utter incompetence.
So, for the moment, I’m happy.
Hope this helps.
Rosco