Understanding 301 Redirects
From Tom:
Please welcome Bruce Chapman as a seablick.com guest blogger. If you are a “regular” around here, Bruce won’t need much of an introduction as he has made a name for himself in the DNN community as a developer of SEO-focused modules and components. In upcoming blog posts, Bruce will shed light onto the technical side of DNN, SEO, and DNN SEO and I’m thrilled to have him on board … welcome Bruce!
The most important piece of advice for obtaining and maintaining a high position in search engine result pages for your website is to get relevant, one-way, inbound links from other respected websites. Relevant in that the link originates from a page with content related to yours, and the link anchor text preferably contains the keywords you wish to rank highly for. Respected websites are those not involved in shady practices such as link farms, and, ideally, represent an authoritative source.
This advice is given just about everywhere - and for good reason. It's good advice, and quality inbound links will have a greater effect on your search result placement than any other optimization technique. However, there's a problem with collecting inbound links that may not be obvious straight away. The problem appears down the road when you decide to change your site in some way. You may need to reorganize your content into a new structure, or perhaps you want to change the platform the site runs on (not that you ever want to move away from DNN :)- You might wish to change the URL scheme, or simply change the URL of a given page. All of these actions have the same effect: the location of the page has changed.
Here's an example:
You set up a page called http://mydomain.com/myverycoolthing.html.
It's an interesting page, and someone submits it to a social news site. As a result, you get plenty of inbound links to it, and you start to rank well in the search results for searches of 'very cool thing.' But after reading a few articles, you decide the URL would be better if it separated out the keywords with a hyphen. So you rename the page 'my-very-cool-thing.html.'
Once you change the URL of a page in any way, here's what happens:
- visitors following that link from the source page will arrive at a page that no longer exists ('page not found' error)
- visitors who have bookmarked that page will return to find the page no longer exists
- search engine robots will revisit the link to discover it no longer exists and remove it from the search index
- search engine robots following the link from the source pages will discover it no longer exists, and the "vote" of those incoming links is removed from your site
You'll start to slip down the search engine results pages, and soon enough you'll disappear from it altogether. You've undone all of that hard work by renaming the page URL.
What are Http status codes?
I've already mentioned '404' in this post, and it's barely past the introduction. A lot of people talk about 404's and 302's like everyone understands them, so here's a jargon-busting refresher on status codes. Status codes are hidden (for the most part) by modern browsers, but I think that's a shame, as they are useful information once you understand them.
Http status codes are the numeric values returned by a web server in response to a user agent (such as your web browser) making a request for a page or other recource. The status code lets the browser know, in a shorthand way, whether or not the web server could fulfil the request. The full list of status codes is quite long, and is good for fixing sleepless nights, but are the most common ones:
200 : Request completed OK (translation : great, here's the page you want!)
301 : Page / Resource permanently moved to new location (translation : we've moved location, please go across the street.)
302 : Page / Resource Moved Temporarily (translation : this checkout closed. Please use aisle 7.)
404 : Page / resource not found (translation : nope, I don't have one of those. Try something else.)
500 : Server error (translation : Oh dear, I seem to be broken and can't help you right now.)
Wikipedia has the full list of http status codes if you are up late and need something to help you sleep. In general, status codes beginning with '2' means things went OK. Status codes beginning with '3' means the resource has moved somewhere else, and status codes beginning with '4' mean that you can't get the thing you asked for. Status codes beginning with '5' mean that something is wrong with the web server.
Enter the Redirect
The solution, of course, is to redirect all requests for 'myverycoolthing.htm' to 'my-very-cool-thing.htm.' A redirect means that if anyone requests the 'old' URL, they will get forwarded to the 'new' URL. Redirects are nothing new - in fact they are as old as the World Wide Web itself.
The most common redirect you see on websites is called the '302' redirect. 302 refers to the Http status code (see sidebar), and it will temporarily redirect the request to the specified new location (the new location is supplied along with the status code.)
A 302 redirect will ensure that people requesting the Url will get to see the content they expected (points 1 and 2 above). It will also direct search engine robots to the new page and ensure it doesn't get removed (point 3). However, search engine bots will vary in how they treat the weighting of the link (point 4).
It's like when you go away from your house for a while, and get the mail forwarded. Should people update their address books to your new address? Not really, because forwarded mail generally means you'll be back at the original address at some point. A 302 redirect is equivalent to telling the post office to send your mail to a beach house for the holidays, and to stop when you return.
Part of the problem with using a 302 redirect is that it is has been misused by people trying to trick search engines, and it indicates a temporary shift in the URL. So some search engines will update their index to the new URL, others won't, and the end result is not what you are after.
Matt Cutt's blog (Matt works for Google) has an interesting post on the difficulties in interpreting 302 redirects. It's well worth a read for more details on the issue.
A 301 Will Do
A 301 redirect (again, 301 refers to the Http status code) means a permanent redirect, as in 'the resource has permanently moved.' Reverting to the mail metaphor, if you move into a new house, you get the mail redirected to your new location, and at least where I live, the post office takes care of informing your bank, insurance company and government agencies that you have permanently changed your address. A 301 is equivalent to this scenario : we've moved and we're not coming back.
When search engines receive a 301 redirect status code, they know the URL in question has permanently changed, and they will go and investigate and index the new page. They will also assign any link weighting from incoming links over from the old URL to the new URL. This means your page, even if it has completely new content and a new URL, will retain the Google PageRank of your old page. Your page has permanently moved to a new location.
If you search around the Internet long enough, you'll find all sorts of conflicting information about using 301's and search engines. But I'm here to tell you this: it works, and it works well. I've changed the URL on many, many pages on websites, and without exception, it works well.
An Example Using 301 Redirects
I'm always tinkering with my own site, ifinity.com.au. I read my search keyword statistics and make adjustments to URLs and site structure based on what people are actually searching (and finding) the site with.
Originally, I set up a section / subsection in my website called 'What We Do.' I thought this title sounded more personable. But really, I found that people never search for 'what we do.' They search for things like 'products', 'services', 'consultants' and keywords like that. I made the decision to change 'what we do' to 'services.' It's the sort of language that people actually use, and as people speak, so do they search.
The menu structure before the change is shown on the image at the left. Because my site is built with DNN, the URLs are based on the page names, which are also used to generate the menu items. This resulted in a URL of 'What_we_do/Software_Development' for the page.
You can see a screenshot of the Google search results for 'ifinity software development' below. (the fact that Google assumes I mispelled the name - we'll just skip over and cover another day).
The screenshot was taken on May 21, 2008, the day I made the change to the site structure. The relevant URL for the page has been highlighted in red.
The menu structure after the change is shown in the image on the left. The new URL looks like this 'services/software_development/.' It's important to note that I also changed the entire content of the page, rewriting the copy for the page to better reflect what I think people are looking for when they find this page. In terms of search engines, the page is completely different - different URL, different content.
When I changed the page over, I also created a 301 redirect from the old URL 'What_We_Do/Software_Development' to the new page 'Services/Software_Development.' I then monitored search engine bots visiting the new page and whether or not the Google index was properly updated. Within a week, I had my answer from the Google:
This screenshot was taken a week later, for the same search term. As you can see in the red-highlighted area, the page URL was updated, and the page maintained its position in the search results (overall, the site jumped one position, which may or may not be related.) The 301 redirect did its magic, and did so within a week of being issued. Now, you can't rely on 'a week' as the time it takes to get updated. It certainly isn't instantaneous, and it can take much, much longer. There are still some links (a month later) in the index which haven't been updated since my reorganization - these are links which rank low in search results. However, the higher ranked your page, the faster the index will get updated.
Incidentally, you can see my change in the page meta description field, with my 'new' copy intended to encourage more click-throughs from the search results. I'm interested in feedback which seems more 'clickable.'
Checking the Logs
If you're the technical type, like me, you might want to check your website log files to see if the redirect is working as intended. Here is an excerpt from the log files of ifinity.com.au:
2008-05-22 17:22:25 GET /What_we_do/Software_Development/ - 80 HTTP/1.0 Mozilla/5.0+(compatible;+Yahoo!+Slurp;) - www.ifinity.com.au 301 0 0 473 267 250 2008-05-22 17:22:26 GET /Services/Software_Development/ - 80 HTTP/1.0 Mozilla/5.0+(compatible;+Yahoo!+Slurp;) - www.ifinity.com.au 200 0 0 27433 215 1015
The first line is the Yahoo bot reading the old URL it expects : 'What_We_Do/Software_Development'. It receives the 301 status code (shown after the domain name) and, 1 second later, returns looking for the new Url of 'Services/Software_Development.' This is the page that ultimately gets indexed and kept, and all old references in the search index are updated. You can see the '200' status code returned after the second URL to indicate that all went OK.
Note: I took some detail out of the log lines for simplicity.
301 Redirects as Repair Strategies
We all make mistakes, and I make quite a few with my own website as I often try out new ideas and software before releasing it to anyone else. One of those mistakes in the past was somehow making the IP address available as domain name. This resulted in a complete, duplicate indexing of my site with associated dilution of ranking and all sorts of other dramas. Once I realized this had happened, I need to correct the index by removing the 'wrong' domain name (the IP address.) You can go through removal tools, but there was a chance that someone had linked to my site using the IP-based domain name. And besides, the site was in the index, I might as well try and merge the sites (ifinity.com.au and the IP address) together.
You can see the problem in the Google result on the left. You definitely don't want this to happen to your site, and if it does, you do want to correct it as quickly as possible.
Again, the 301 redirect is the answer. I built a custom tool which intercepted the IP address based domain, and issued a 301 redirect to the equivalent page using the www.ifinity.com.au domain - in effect, telling search engines that the old domain of 202.60.91.201 no longer exists, and to find all that same information at www.ifinity.com.au'.
The results were successful, a short time after putting in the fix, all of the old IP address indexing had been migrated across to the correct domain name. Although it's not without problems as are still a few references to the IP address still floating around in the search indexes. At least the 301 redirect does put the visitor onto the right page if they click on it, though.
Other Popular Uses of 301 Redirects
Now that you have a basic understanding of using 301 redirects to update search engine indexes, it's time to discuss other applications of the 301 redirect. Mostly, these center around creating 'canonical URLs.' A canonical URL is one that is the 'best' URL for a page - or, if you like, a single URL to unite all the different ways pages can be represented. You absolutely want to encourage canonical URLs for your pages, so that each page in your site only has one single representation. For this it's important to think of a 'page' as a unique piece of content, referenced by a unique URL. So, while /products.aspx?productid=45 and /products.aspx?productId=46 refer to the same physical .aspx page, as far as search engines are concerned, they are two different pages of content.
Taking the products.aspx page a step further, perhaps that same page can be access through additional URLs such as : /products/productid/45.aspx or /products/45/my-cool-product.aspx. These might all show the same content, but the problem is that a search engine might index the first, 5 people link to the second one, and 3 other people link to the third version. You've got the potential for three separate pages to show up in the index, and probably one or more of them will show up in 'supplemental results', otherwise known as 'duplicate content.'
In this case, what needs to happen is for a canonical URL to be chosen by you, and all requests for the page end up at the same URL. The end result is that search engines index a single URL, and anyone linking to your content does so through the same URL. That way, all the value from the incoming links is concentrated onto a single page, giving it a much greater chance of ranking success.
Again, Matt Cutts' blog has a very informative post on this topic : SEO Advice Url Canonicalization.
Having a canonical URL means a single URL for each page of content, no exceptions. When I ended up with an IP address for a domain name, it was the absolute opposite of having canonical URLs. Every single page in my site was available as a duplicate URL.
While you may not have this problem with your own site, you may have another, more common problem : www vs no-www. It's really your choice as a web administrator whether you want to advertise your site as domain.com or www.domain.com. Certainly it makes little difference from a search engine point of view, but what you should be doing is redirecting all the 'www' to no-www, or the 'no-www' to 'www' URLs.
You can take this as far as you want - right down to forcing all versions of a URL to be in the same case (Domain.com -> domain.com) and adding/removing forward slashes (domain.com -> domain.com/). Personally I don't bother with this level as it is my belief that search engines are smart enough to know that 'domain.com' and 'Domain.com' are the same thing. Sure, some Web servers allow content to be differentiated by different case URLs, but most of the time websites will give you the same content, regardless of what case it is requested in. But it's up to you as the website owner and person ultimately responsible for search traffic to make sure it functions the way you would like.
301 Redirects and DotNetNuke
How does all this apply to DotNetNuke-based websites you ask? For a start, there isn't any 301 functionality 'baked' into the DNN core framework. But that's OK, because DNN runs on top of IIS, and IIS has plenty of functionality for forwarding and redirecting. For those in a shared hosting environment without direct access to IIS there is the option of installing third-party products to set up custom redirects.
The three biggest problems with DNN URLs and search engine optimization, as I see them, are:
- No canonical URL functionality:
The home page is typically available on the site root (/), /default.aspx?tabid=36, /tabid/36/default.aspx, /home/tabid/36/default.aspx, home.aspx and just plain old /default.aspx. - No separation of keywords in the generated Page URLs"
'My Cool Thing' ends up being 'mycoolthing.' - No redirection of deleted or changed pages:
Once you delete a page, it won't forward you on or show any content.
These three issues can all be remedied by using 301 redirects. There are many ways to do this - either by setting up individual redirects in IIS, using a third-party IIS tool for setting up redirects, or using a dedicated DNN 301 module, of which there are multiple available.
Of course, now is the right time to plug my Url Master DNN Url Redirecting and Rewriting module. This DNN module provide a solution to all of the above problems, and many more. It features automatic 301 redirects to enforce canonical URLs, plus the ability to add custom redirects to handle all types of situations where content has been moved or deleted.
Summing Things Up
With any luck, you have learned what a 301 redirect is, and why you should be using them to maintain and increase your position in the search engine results pages. Once you have established yourself in the indexes, it's important you manage your URLs effectively otherwise all your hard work might just be undone.
Do you disagree with anything? Do you have more to add? Submit your thoughts using the comments field below.
Name (required)
Email (will not be published) (required)
Website