Soft 404s are one of those niggling things that make technical SEO that little bit more fiddly. They can be a real issue: they make crawling your site for 404s much more difficult, and they can act as a bottleneck for link equity.
Soft 404s, sometimes known as crypto 404s, are pages which look like "not found" pages to the visitor (often showing a standard 404 page), but which return a response code of 200.
(Redirect Path by Ayima is a very useful tool for finding soft 404s.)
This is the kind of problem which occurs when you try to construct your own swanky 404 page, and then forget that you need to return the appropriate server status code – but it also happens when redirects are done badly.
Soft 404s are an issue as, according to Google, “soft 404s can limit a site’s crawl coverage by search engines because these duplicate URLs may be crawled instead of pages with unique content”.
Soft 404s tell search engines that there is a real page at that particular URL (hence the status code 200), which means that it can be indexed and can continue to appear in results pages. This means your crawl coverage will take a blow, especially if there are lots of them.
Finding Soft 404s
Thankfully, you can find soft 404s in Webmaster Tools relatively easily. Google first introduced the feature in 2010.
Webmaster Tools will show a message if it has found soft 404s on your site.
You will need to download the list of soft 404 URLs and correct them in your .htaccess file.
It is not always a good idea to redirect all retired pages to the homepage – if you do so, you may end up with a load of soft 404s.
When you implement a redirect, the target page must be relevant to the original page. If you don’t have a suitable alternative page, it is often better to let the retired page return a 404 error, especially on an e-commerce site with high product turnover.
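If a retired URL would otherwise still be served, one option is to force the 404 explicitly. The line below is a minimal sketch, assuming an Apache server with mod_alias enabled; the product path is hypothetical.

```apache
# Hypothetical retired product page: answer with a real 404 instead of redirecting.
# For status codes outside the 300-399 range, Redirect takes no target URL.
Redirect 404 /products/retired-widget.html
```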
It is for this reason that a good 404 page is key. A good 404 page will offer a search function, categories and anything else which might be useful to someone who has come to your site looking for something and not found it straight away. Helpfully, Google provides a custom 404 page that makes things a bit easier.
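On Apache, a custom 404 page is usually wired up with the ErrorDocument directive, and how the page is referenced decides whether the 404 status survives. The sketch below assumes the custom page lives at /404.html on your own site; that filename and the example domain are placeholders.

```apache
# Local path: Apache serves /404.html and keeps the 404 status code.
ErrorDocument 404 /404.html

# Avoid: with a full URL Apache sends a redirect instead, so the visitor
# lands on a page that returns 200, which is a soft 404.
# ErrorDocument 404 http://www.thisisoneexample.com/404.html
```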
You need a 301 redirect if your page has moved permanently: a 301 redirect tells search engines to index only the landing URL and to pass link equity on to that URL. Only use a 302 redirect if you intend to return to using the original page in the future.
If you have a WordPress site, there are plenty of plugins which will do the redirecting for you. However, you may need to access your site’s .htaccess file (via FTP) in order to set up redirects properly. ALWAYS copy and save the existing code before editing, and note that the .htaccess method only works on Apache servers.
All you need to put into the .htaccess file is a single redirect rule that names the old path and the new URL.
For example, the page /cats/kittens.html can be redirected to /cats/kittens-in-baskets.html.
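A minimal sketch of such a rule, assuming Apache's mod_alias is enabled. The domain www.thisisoneexample.com is a stand-in for your own.

```apache
# Permanently (301) redirect one retired page to its replacement.
# www.thisisoneexample.com stands in for your own domain.
Redirect 301 /cats/kittens.html http://www.thisisoneexample.com/cats/kittens-in-baskets.html
```

Using a full URL as the target keeps the rule valid on older Apache versions that do not accept a bare path there.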
Redirect Entire Domains
In order to redirect a domain to a new one, and redirect all pages to their new equivalent (for example, www.thisisoneexample.com/cats/kittens.html redirected to www.thisisanotherexample.com/cats/kittens.html), you need to use a regular expression (regex) in the redirect rule.
(Neither of these .htaccess methods will work with IIS.)
In a rule of this kind, a capture group such as ‘(.*)’ matches everything after the main domain name, and the ‘$1’ in the target reinserts it, so each request is redirected to the URL with the same filename and directory on the new domain. This should only be used if you haven’t made any changes to your current site structure, or you’ll get a bunch of soft 404s again.
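One common way to write a domain-wide redirect is with mod_rewrite. The sketch below assumes mod_rewrite is enabled and that the rules sit in the .htaccess file at the root of the old site; it is one possible form, not the only one.

```apache
RewriteEngine On
# Only act on requests for the old domain (with or without www).
RewriteCond %{HTTP_HOST} ^(www\.)?thisisoneexample\.com$ [NC]
# (.*) captures the requested path; $1 reinserts it on the new domain.
RewriteRule ^(.*)$ http://www.thisisanotherexample.com/$1 [R=301,L]
```

The RewriteCond line is there to avoid a redirect loop if both domains happen to be served from the same directory.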
Done correctly, each page will return the correct HTTP status code. You can use the ‘Fetch as Google’ tool to check that everything is now in order.