If you run a staging or testing site alongside your live website, it is important to have a strategy that stops the staging site from damaging your SEO efforts.
There are a number of ways a staging site can cause you problems with SEO.
For one thing, the search engines might find and index your staging site. Worse, it might actually rank better than your main site – especially if you are road testing SEO-focused technical upgrades on the staging site first. Worse still, the search engines might decide to drop the main site from the index in favour of the staging site.
This is especially a problem for an ecommerce site, even more so if (as they should be) the admin systems for your staging and main websites are separate. If a user places an order on the staging site, it might never be fulfilled, which damages your online reputation in the long term, even if you later fix the technical issue to prevent it happening again.
You might try to prevent the staging site being indexed using a robots exclusion file (robots.txt). This is quite common, but it carries its own risks. I have often seen robots exclusion files from staging sites uploaded to main websites. The result is that the rule which stopped the staging site being indexed suddenly stops the main site being indexed. And until the site starts dropping from its normal rankings, it can be quite difficult to notice – do you go and check your robots.txt file every day to make sure it hasn’t changed?
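Since nobody checks robots.txt by hand every day, a small automated check can help. The following is a minimal sketch in Python using the standard library's urllib.robotparser. The function name homepage_is_blocked and the two sample files are made up for illustration, and in practice you would fetch the live file from your main site and run the check on a schedule.

```python
from urllib.robotparser import RobotFileParser


def homepage_is_blocked(robots_txt: str, homepage_url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text forbids crawling the homepage."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, homepage_url)


# Hypothetical examples: a typical "block everything" staging file
# and a main-site file that only hides an admin area.
STAGING_ROBOTS = "User-agent: *\nDisallow: /\n"
MAIN_ROBOTS = "User-agent: *\nDisallow: /admin/\n"

if __name__ == "__main__":
    print(homepage_is_blocked(STAGING_ROBOTS, "http://www.domain.com/"))  # True
    print(homepage_is_blocked(MAIN_ROBOTS, "http://www.domain.com/"))     # False
```

If the check ever returns True for the file served by the main site, the staging robots.txt has probably been uploaded by mistake.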
To overcome these risks, there are two things you should do.
Firstly, you should implement the canonical tag as standard in your code. This has to be done correctly, as a poor implementation is also risky. However, the benefit of doing this right is that even if search engines do begin to index your staging site, the URLs they list should be those on the main website, so searchers should not inadvertently end up on the staging site in the first place.
The canonical tag should always use the domain of the main website, followed by the path of the page which is being viewed.
For example, on http://staging.domain.com/page10.htm, the canonical tag should be
<link rel="canonical" href="http://www.domain.com/page10.htm" />
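As a rough sketch of how that rule could be applied in templates, the Python function below rebuilds the requested URL on the main site's domain and wraps it in a canonical tag. MAIN_SCHEME, MAIN_HOST and canonical_tag are names invented here for illustration. Keeping the query string and dropping the fragment is an assumption, so adjust it to suit how your own URLs work.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumed values: the scheme and host of the main (live) website.
MAIN_SCHEME = "http"
MAIN_HOST = "www.domain.com"


def canonical_tag(requested_url: str) -> str:
    """Build a canonical tag that always points at the main site."""
    parts = urlsplit(requested_url)
    # Keep the path and query of the page being viewed, but swap in the
    # main site's scheme and host, whichever server handled the request.
    canonical = urlunsplit((MAIN_SCHEME, MAIN_HOST, parts.path or "/", parts.query, ""))
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}" />'


if __name__ == "__main__":
    print(canonical_tag("http://staging.domain.com/page10.htm"))
    # <link rel="canonical" href="http://www.domain.com/page10.htm" />
```

Because the host is fixed in one place, the same code produces the same tag on staging and on the main site.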
The second thing that you should do is to password protect your staging site. This has two benefits. The primary benefit is that the site cannot be indexed by search engines, and so should not appear in search results (remember to send the correct 401 “Unauthorized” status in the response, so the search engines understand it is protected content). The secondary benefit is that competitors cannot get an early heads up on new features or functionality that are being tested on the staging site.
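To illustrate what a correct 401 response looks like, here is a minimal sketch of HTTP Basic authentication written as Python WSGI middleware. It assumes a Python-based staging site, which may not match your setup. The names protect, USERNAME and PASSWORD are hypothetical, and real credentials should come from configuration, not from the source code.

```python
import base64
import hmac

# Hypothetical credentials for illustration only.
USERNAME = "staging"
PASSWORD = "change-me"


def _authorised(header: str) -> bool:
    """Check an Authorization header of the form 'Basic <base64 user:pass>'."""
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False
    user, _, password = decoded.partition(":")
    # compare_digest avoids leaking information through timing differences.
    user_ok = hmac.compare_digest(user.encode(), USERNAME.encode())
    pass_ok = hmac.compare_digest(password.encode(), PASSWORD.encode())
    return user_ok and pass_ok


def protect(app):
    """Wrap a WSGI app so every request needs valid Basic credentials."""
    def wrapper(environ, start_response):
        if _authorised(environ.get("HTTP_AUTHORIZATION", "")):
            return app(environ, start_response)
        # 401 plus WWW-Authenticate tells browsers and crawlers alike
        # that the content is protected.
        start_response("401 Unauthorized", [
            ("WWW-Authenticate", 'Basic realm="Staging"'),
            ("Content-Type", "text/plain; charset=utf-8"),
        ])
        return [b"Authentication required\n"]
    return wrapper
```

Basic authentication sends the password in an easily decoded form, so it is best used over HTTPS.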
As an aside, have you been able to find and explore your competitors’ staging sites at all? If they have left them open to public interrogation by not doing the above, it gives you a chance to keep up with what they may be planning. Thanks to Sam Crocker for that tip at last year’s ProSEO!
By implementing the staging site in the manner described above, you are able to maintain the same robots exclusion (robots.txt) file on both your main website and your staging site (written from the point of view of your main website), preventing ranking catastrophes should it be inadvertently uploaded to the main site from staging at any point.
Bear in mind that if you are using .htaccess on Apache to password protect your staging site, there is a risk you might upload the .htaccess file that enforces the password requirement to the main site. This is much easier to spot than an inadvertent robots.txt upload, though, as all users will be asked for a password. Make sure you have a backup of your normal root .htaccess file if your site uses one, so you can restore it if you need to.
You might (rightly) point out that if you carry out the password protection, the canonical tag is unnecessary. This is true, but in my experience few developers will password protect their staging servers. However, whether they do or not – if the password protection is ever absent (for example, if an .htaccess file is deleted by mistake), the canonical tag provides back-up protection.
Beyond that, in my opinion, correct use of the canonical tag should simply be part of your best practice anyway.