I have been experiencing a rather weird issue with my blog recently, one that I hadn’t noticed until I accessed my Google Webmaster account. I hadn’t noticed it because it isn’t visible when viewing my blog or the administrative pages. The problem is that several of my posts were returning a “404 Not Found” status code instead of a “200 OK” status code.
What makes this problem weird is that it was only happening on about 5 pages of my blog. All the other pages were returning a “200 OK” status code. Since all post pages use the same template file, and are therefore created in exactly the same way, I couldn’t figure out what caused the problem. While the headers of the pages indicated a 404, the pages displayed in Web browsers without any problems.
Identifying the Problem
Every once in a while I access my Google Webmaster account to get an idea of what Google has to say about my blog. Usually, I don’t see anything too alarming, which is a good thing, but the other day I noticed that there were several crawl errors.
Usually the crawl errors occur because I have restricted the Googlebot from accessing some pages that I don’t wish to be indexed. This time, however, the crawl errors indicated that several of my post pages were returning a “404 Not Found” status code.
Thinking that there had been a glitch, such as my blog being down for a few minutes so that the pages couldn’t be accessed, I began looking at the headers returned by the pages. To do this I used Google’s Chrome browser:
- I right-clicked an empty area of the post returning a 404, and selected “Inspect element” from the menu. This opened a frame or window at the bottom of the browser.
- Next I clicked the “Resources” option in the bottom window to display a list of all resources that were downloaded for the post page.
- To view the header information, I clicked the page in the list of resources. The headers for the page opened up on the right, with the status code shown. Sure enough, the post page returned a “404 Not Found” status code.
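The same check can also be scripted rather than clicked through in the browser. The following is only a minimal sketch, assuming PHP 7.1 or later with URL fopen wrappers enabled; the function names `parse_status_code`, `fetch_status_code` and `report_status_codes` are made up for illustration, while `get_headers()` is PHP's built-in function for retrieving response headers.

```php
<?php
// Extract the numeric status code from a status line such as
// "HTTP/1.1 404 Not Found". Returns null if the line is not a status line.
function parse_status_code(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $matches)) {
        return (int) $matches[1];
    }
    return null;
}

// Request a URL and return the status code of the first response,
// or null if the request failed. get_headers() puts the status line
// of the first response at index 0 of the returned array.
function fetch_status_code(string $url): ?int
{
    $headers = @get_headers($url);
    if ($headers === false || !isset($headers[0])) {
        return null;
    }
    return parse_status_code($headers[0]);
}

// Print one line per URL with the status code that came back.
function report_status_codes(array $urls): void
{
    foreach ($urls as $url) {
        $code = fetch_status_code($url);
        echo $url . ' => ' . ($code === null ? 'request failed' : $code) . PHP_EOL;
    }
}
```

Calling `report_status_codes()` with a list of post URLs (the affected ones plus a few healthy ones for comparison) prints the code each one returns, which makes it easier to see whether the 404s are limited to the same handful of posts.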
I tried other post pages on my blog, and they all returned a “200 OK” status. I was confused as to why a few of the pages would return a 404 status code.
Temporary Solution
I began searching for a solution to this problem, but wasn’t able to find one. Apparently, not many people have run into the same problem. I decided to temporarily manage the status code from within my blog instead of having the server send it to the requesting client.
To accomplish that task, I included the following lines of code at the very top (above the DOCTYPE and html definitions) of my header.php file:
<?php
// Send a 404 when the current query has no posts, otherwise send a 200.
if (!have_posts()) {
    header("Status: 404");
} else {
    header("Status: 200");
}
?>
This is only a temporary solution, as there are problems with this approach. The web server is no longer managing the return codes of my blog, and only two return codes are ever sent back to the client. This means that for redirects, the client would receive a 200 status code instead of a 301. I may be able to alter the above code to return other status codes, but I would much rather find the cause of my problem than rely on status code management in my template.
So now a few questions: has anyone else experienced this issue? Do you have any idea what is causing it? I would like to find the answer to this problem.