What are URL canonicalization issues and how do you fix them?
A very common issue that we run into when performing an SEO analysis of a website is canonicalization. It’s a simple concept, but one that so many webmasters overlook. So what is it, and more importantly, how do you fix it?
What are URL canonicalization issues?
Canonicalization is the process that search engines use to pick the best URL for a given page when there are several choices available. This often refers to homepages, though the concept applies across entire domains. This is an important issue in search engine optimization because instead of your PageRank and link authority being concentrated on one page (or domain), it is split among several.
Common examples of URL canonicalization issues.
The most common place we find this is in www vs. non-www versions of a website (and sometimes in http vs. https). While you may think that typing in “www.yourdomain.com” or just “yourdomain.com” is the same thing, search engines can treat the two as separate sites. What this means is that you could essentially have a duplicate of your entire website in the eyes of the search engines. Most often, Google or the other search engines will simply choose which version they think is most relevant for any given page, but once you factor in link building and the like, things can become a mess quickly. Not to mention, do you really want to give that control over to the search engines instead of keeping it for yourself? Didn’t think so!
Example 1: Search engines can treat the following as separate sites, even though you or I would see them as the same site.
- http://www.yourdomain.com
- http://yourdomain.com
- https://www.yourdomain.com
- https://yourdomain.com
In Example 1, we could have four versions of the site being crawled by search engine spiders.
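To make the four-versions problem concrete, here is a minimal Python sketch that maps each variant from Example 1 onto one preferred URL. The preference for https and the www host is only an assumption for illustration, and the `canonical_url` helper is a hypothetical name, not part of any particular tool.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for illustration: we prefer https and the www host.
CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.yourdomain.com"
KNOWN_HOSTS = {"yourdomain.com", "www.yourdomain.com"}


def canonical_url(url: str) -> str:
    """Map any scheme/host variant of our site onto the preferred one."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in KNOWN_HOSTS:
        return url  # not our site, leave it untouched
    path = parts.path or "/"  # treat a missing path as the root
    # Note: any explicit port in the original URL is dropped in this sketch.
    return urlunsplit(
        (CANONICAL_SCHEME, CANONICAL_HOST, path, parts.query, parts.fragment)
    )


variants = [
    "http://www.yourdomain.com",
    "http://yourdomain.com",
    "https://www.yourdomain.com",
    "https://yourdomain.com",
]
print({canonical_url(u) for u in variants})
# {'https://www.yourdomain.com/'}
```

All four variants collapse to a single URL, which is the outcome you want a search engine to arrive at as well.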
Another very common source of canonicalization issues is multiple versions of the same page. This often happens through inconsistent linking, like linking to the root domain in your header but to a variation of the index page elsewhere.
Example 2:
- http://www.yourdomain.com/
- http://www.yourdomain.com/index.html
In example 2, we essentially have two versions of the homepage, both identical, because of sloppy linking. Mix this kind of problem with the one outlined in example 1 and you can quickly see how this can pose an SEO problem.
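One way to catch the inconsistent links from Example 2 is to rewrite every internal link to the directory form before it goes into a template. The sketch below assumes a small, hypothetical list of index file names. Your server's default documents may differ, so treat `INDEX_FILES` and `strip_index_file` as placeholders.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of default-document names; adjust to match your server.
INDEX_FILES = ("index.html", "index.htm", "index.php")


def strip_index_file(url: str) -> str:
    """Rewrite .../index.html style links to the bare directory URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Split the path into everything before the last slash and the last segment.
    head, _, last = path.rpartition("/")
    if last.lower() in INDEX_FILES:
        path = head + "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


for link in (
    "http://www.yourdomain.com/",
    "http://www.yourdomain.com/index.html",
):
    print(strip_index_file(link))
```

Both links print as the same root URL, so the header link and the "variation of the index page" no longer compete with each other.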
How to fix URL canonicalization problems.
Addressing www vs. non-www canonicalization issues is usually done by performing a 301 redirect from the undesired URL to the desired URL. A 301 redirect tells a browser (and the search engines) that a page has moved permanently, and is considered to be “SEO friendly.” On Apache servers, this is often done on the server side in the .htaccess file.
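If your site runs on a Python application server rather than behind an .htaccess file, the same idea can be sketched as a small WSGI wrapper. This is only an illustration under stated assumptions: the preferred version is assumed to be https://www.yourdomain.com, `canonical_redirect` is a hypothetical name, and the sketch ignores `SCRIPT_NAME` and any proxy that terminates HTTPS in front of the app (in that setup the scheme check would need to read a forwarded header instead, or it could loop).

```python
from urllib.parse import quote

# Assumption for illustration: the preferred version is https + www.
CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.yourdomain.com"


def canonical_redirect(app):
    """Wrap a WSGI app so non-canonical requests get a 301 to the preferred URL."""

    def wrapper(environ, start_response):
        scheme = environ.get("wsgi.url_scheme", "http")
        host = environ.get("HTTP_HOST", "").lower()
        if scheme == CANONICAL_SCHEME and host == CANONICAL_HOST:
            # Already on the preferred version: serve the page as usual.
            return app(environ, start_response)

        # Rebuild the same path and query string on the preferred scheme/host.
        path = quote(environ.get("PATH_INFO", "") or "/", safe="/;=,")
        location = f"{CANONICAL_SCHEME}://{CANONICAL_HOST}{path}"
        query = environ.get("QUERY_STRING", "")
        if query:
            location += "?" + query

        start_response(
            "301 Moved Permanently",
            [("Location", location), ("Content-Length", "0")],
        )
        return [b""]

    return wrapper
```

Wrapping your existing application, for example `application = canonical_redirect(application)`, sends visitors and crawlers arriving on any other variant to the one URL you chose, with the permanent status code that signals the move.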
You can also use the rel=canonical tag, though there are reasons you might or might not want to do this. Search engines treat this tag as a hint rather than a directive, and they do not all handle it the same way. Google, for example, will take your preference into account but does not offer unconditional support.
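For reference, the tag itself is a `link` element placed in the page's `<head>`. A minimal sketch of generating it in Python might look like the following. The `canonical_link_tag` helper is a hypothetical name, and the URL passed in is assumed to be the version you have already chosen as preferred.

```python
from html import escape


def canonical_link_tag(preferred_url: str) -> str:
    """Build the <link rel="canonical"> element for a page's <head>."""
    # Escape the URL so characters like & and " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


print(canonical_link_tag("https://www.yourdomain.com/"))
# <link rel="canonical" href="https://www.yourdomain.com/">
```

Every duplicate version of a page would carry the same tag pointing at the one preferred URL.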
Or, if you use a content management system (CMS) like Joomla, there are many plugins and extensions available to address canonical issues.
Regardless of how you choose to address canonicalization, it is a topic worthy of your consideration. For more reading on the topic, check out Google’s webmaster page on canonicalization.