In this guest post, regular columnist and B&T’s resident SEO expert, Octos MD Nital Shah, talks all things “canonical”. If search is your game, here’s what you need to know…
Traditionally, ‘canonical law’ refers to the body of laws laid down by religious leaders, for the internal regulation of a church and its members. Given the almost-holy role of search in the modern digital age, it’s fitting that, when the almighty search overlords Google, Yahoo and Microsoft decreed a new rule for the internal regulation of search engines, they used the term ‘canonical’ to describe it.
When Google, Yahoo! and Bing adopted the rel=canonical link element in 2009, the search community went into an instant flap trying to figure out how to make the most of the new coding commandments. Some of my colleagues mastered the new protocol swiftly and expertly, leveraging it to improve traction in search results. Others are still floundering to make headway, seven years on.
Though the initial hubbub about canonical URL tags has long subsided, many still don’t quite know how to apply the tags for optimum SEO performance. If you haven’t yet nailed the nuances of canonical tags, the following guide offers some practical advice to get you on track.
What Are Canonical Tags?
Put simply, a canonical URL tag is intended to help webmasters clean up their indices by eliminating self-created duplicate content. The canonical tag acts as a master key of sorts, unlocking a preferred version of the content and saving the search engines from rummaging through a bunch of similar-looking keys.
Many websites contain several similar versions of the same content and, as we know, search engines don’t like duplicate content. Duplication slows down the crawlers, making it harder and more time-consuming for the search engines to do their job. Consequently, websites with duplicated content can be penalised with lower search rankings.
But, of course, sometimes duplication is unavoidable and integral to the function of a website – particularly for e-commerce sites. In these cases, canonical tags offer a smart solution for the content duplication dilemma. Where a site contains multiple similar versions of the same content, you simply choose one ‘canonical’ version and direct the search engines towards that, making it clear to them which version of the content to show.
The tag can be applied to all duplicated versions of the content – it looks like this:
<link rel="canonical" href="http://samplecontent.com/page.html"/>
The canonical tag sits in a web page’s HTML <head>, the same section where you add the title and meta description. By adding a canonical tag to the <head> section, you are telling the search engines that the page should be treated as a copy of the URL samplecontent.com/page.html and that, therefore, any applicable link and content metrics should be tied back to that URL.
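As a rough illustration, here is how the <head> of one duplicate page might look. The duplicate URL mentioned in the comment, along with the title and description text, are made-up examples; the canonical target is the sample URL used above.

```html
<head>
  <!-- Assumption for illustration: this page is a duplicate served at
       http://samplecontent.com/page.html?sort=price -->

  <!-- Tells search engines which URL is the preferred version -->
  <link rel="canonical" href="http://samplecontent.com/page.html"/>

  <title>Sample Products</title>
  <meta name="description" content="A sample product listing.">
</head>
```

Visitors who land on the sorted version still see that page as normal; only the search engines act on the tag.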
An SEO Blessing: 3 Tips For Using Canonical Tags
- Find the Culprit
Website content tends to multiply like gremlins: every social media visit, referral link, and internal site search has the potential to generate a unique URL. On top of that, many content management systems create multiple URL paths to access the same content. Without canonical tags, you can end up with a complex web of duplication that will potentially impact negatively on your search rankings.
Duplicate content is a big SEO no-no. Use canonical tags wherever duplication occurs, to help the search engines identify the original content, and let them know which URL should be crawled, indexed and returned on SERPs. A thorough SEO audit may be necessary to identify the unique duplication issues that are impeding your search rankings.
- Canonical Tag vs 301 Redirect: Know the Difference
Although they share similarities, 301 redirects and canonical tags are NOT the same. The former redirects all traffic (including humans and bots) to a specified location, whereas the latter speaks purely to the search engines, so visits to each unique URL version can still be separately tracked.
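Another distinction concerns cross-domain use. A 301 redirect has always worked across domains, so you can redirect from one domain to another and retain search engine metrics. The canonical URL tag, as first launched, operated only within a single root domain, which meant it carried over across subfolders and subdomains only. Google has since extended its support to cross-domain canonicals, which helps when a server-side redirect isn’t possible, although a 301 is still the usual choice for a permanent move to a new domain.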
- Duplicate Content Not Always Identical
Google has made it clear that it’s okay if the canonical is not an exact duplicate of the content. The search giant allows for minor variations, such as differences in the sort order of a table of products. It also acknowledges that it may crawl the canonical and duplicate pages at different points in time, and therefore expects to see different versions of your content over time. The other search engines take a similar tack.
3 SEO Sins: How Not to Use Canonical Tags
- Use Your <Head> – Don’t Put Canonical Tags in the <Body>
In order for the search engines to find and recognise your canonical tag, it must be placed in the <head> section. If you place rel=canonical anywhere other than the <head> section, it will be ignored. Furthermore, the tag should ideally appear as early as possible to avoid any parsing problems.
And don’t get carried away with canonical tags; if you use more than one in the <head> section, the command will be ignored. Multiple canonical links can sometimes occur without you realising it – so always tread with caution when installing SEO plugins and editing themes or templates.
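To make the two placement rules concrete, here is a minimal sketch of a page that follows them. The title and page content are placeholders, and the URL is the sample one used earlier in this guide.

```html
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <!-- One canonical link only, placed early in the head -->
  <link rel="canonical" href="http://samplecontent.com/page.html"/>
  <title>Sample Page</title>
  <!-- Check that plugins, themes or templates do not add a second
       rel="canonical" link further down the head -->
</head>
<body>
  <!-- No rel="canonical" link belongs here: search engines ignore
       the tag when it appears outside the head -->
  <p>Page content goes here.</p>
</body>
</html>
```

A quick way to check a live page is to view its source and search for "canonical": you should find exactly one match, and it should sit between the opening and closing head tags.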
- Don’t Set Home Page As Your Preferred URL
There are some rare occasions where your home page will be the preferred URL, but they are the exception to the rule. If all of your canonical tags point to your home page, you run the risk of having no pages, other than your index, crawled and indexed by the search engines.
- HTTPS Over HTTP
Google usually prefers HTTPS pages to equivalent HTTP pages for canonical links, except where there are conflicting signals. To prevent Google from incorrectly making the HTTP page canonical, avoid bad SSL certificates and don’t include the HTTP page in your sitemap in place of the HTTPS version. Also, when you block a resource with a robots.txt file, be sure to block both versions – HTTP and HTTPS.
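One way to keep the signals consistent is to point the canonical tag at the HTTPS address on both versions of a page. The snippet below is a sketch only, and it assumes the sample page is also available at https://samplecontent.com/page.html, which is an assumption made for illustration.

```html
<!-- Placed in the head of both the HTTP and the HTTPS version of the
     page, assuming the HTTPS version is the one you want indexed -->
<link rel="canonical" href="https://samplecontent.com/page.html"/>
```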