Posted by Bruce on Wednesday, June 25, 2008 to DotNetNuke, SEO, DNN Tips and Tricks
From Tom:
Please welcome Bruce Chapman as a seablick.com guest blogger. If you are a “regular” around here, Bruce won’t need much of an introduction as he has made a name for himself in the DNN community as a developer of SEO-focused modules and components. In upcoming blog posts, Bruce will shed light onto the technical side of DNN, SEO, and DNN SEO and I’m thrilled to have him on board … welcome Bruce!
The most important piece of advice for obtaining and maintaining a high position in search engine result pages for your website is to get relevant, one-way, inbound links from other respected websites. Relevant in that the link originates from a page with content related to yours, and the link anchor text preferably contains the keywords you wish to rank highly for. Respected websites are those not involved in shady practices such as link farms, and, ideally, represent an authoritative source.
This advice is given just about everywhere - and for good reason. It's good advice, and quality inbound links will have a greater effect on your search result placement than any other optimization technique. However, there's a problem with collecting inbound links that may not be obvious straight away. The problem appears down the road when you decide to change your site in some way. You may need to reorganize your content into a new structure, or perhaps you want to change the platform the site runs on (not that you would ever want to move away from DNN :) ). You might wish to change the URL scheme, or simply change the URL of a given page. All of these actions have the same effect: the location of the page has changed.
Here's an example:
You set up a page called http://mydomain.com/myverycoolthing.html.
It's an interesting page, and someone submits it to a social news site. As a result, you get plenty of inbound links to it, and you start to rank well in the search results for searches of 'very cool thing.' But after reading a few articles, you decide the URL would be better if it separated out the keywords with a hyphen. So you rename the page 'my-very-cool-thing.html.'
Once you change the URL of a page in any way, here's what happens:
- visitors following that link from the source page will arrive at a page that no longer exists ('page not found' error)
- visitors who have bookmarked that page will return to find the page no longer exists
- search engine robots will revisit the link to discover it no longer exists and remove it from the search index
- search engine robots following the link from the source pages will discover it no longer exists, and the "vote" of those incoming links is removed from your site
You'll start to slip down the search engine results pages, and soon enough you'll disappear from it altogether. You've undone all of that hard work by renaming the page URL.
What are Http status codes?
I've already mentioned '404' in this post, and it's barely past the introduction. A lot of people talk about 404's and 302's like everyone understands them, so here's a jargon-busting refresher on status codes. Status codes are hidden (for the most part) by modern browsers, but I think that's a shame, as they are useful information once you understand them.
Http status codes are the numeric values returned by a web server in response to a user agent (such as your web browser) making a request for a page or other resource. The status code lets the browser know, in a shorthand way, whether or not the web server could fulfil the request. The full list of status codes is quite long, and is good for fixing sleepless nights, but here are the most common ones:
200 : Request completed OK (translation : great, here's the page you want!)
301 : Page / Resource permanently moved to new location (translation : we've moved location, please go across the street.)
302 : Page / Resource Moved Temporarily (translation : this checkout closed. Please use aisle 7.)
404 : Page / resource not found (translation : nope, I don't have one of those. Try something else.)
500 : Server error (translation : Oh dear, I seem to be broken and can't help you right now.)
Wikipedia has the full list of http status codes if you are up late and need something to help you sleep. In general, status codes beginning with '2' mean things went OK. Status codes beginning with '3' mean the resource has moved somewhere else, and status codes beginning with '4' mean that you can't get the thing you asked for. Status codes beginning with '5' mean that something is wrong with the web server.
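If it helps to see the "first digit" rule spelled out, here is a minimal Python sketch. The function name `describe_status` and the wording of the descriptions are my own illustration, not part of any standard library.

```python
def describe_status(code: int) -> str:
    """Return a rough, plain-English meaning based on the first digit of the code."""
    families = {
        2: "OK: the request went fine",
        3: "Moved: the resource is somewhere else",
        4: "Client error: you can't get the thing you asked for",
        5: "Server error: something is wrong with the web server",
    }
    # Integer division by 100 gives the leading digit of a three-digit code.
    return families.get(code // 100, "Other: not covered by this refresher")


for code in (200, 301, 302, 404, 500):
    print(code, "->", describe_status(code))
```

The lookup deliberately ignores the last two digits, which is exactly the shortcut described above: 301 and 302 both land in the "moved" family, and the difference between them is the subject of the rest of this post.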
Enter the Redirect
The solution, of course, is to redirect all requests for 'myverycoolthing.html' to 'my-very-cool-thing.html.' A redirect means that if anyone requests the 'old' URL, they will get forwarded to the 'new' URL. Redirects are nothing new - in fact they are as old as the World Wide Web itself.
The most common redirect you see on websites is called the '302' redirect. 302 refers to the Http status code (see sidebar), and it will temporarily redirect the request to the specified new location (the new location is supplied along with the status code.)
A 302 redirect will ensure that people requesting the Url will get to see the content they expected (points 1 and 2 above). It will also direct search engine robots to the new page and ensure it doesn't get removed (point 3). However, search engine bots will vary in how they treat the weighting of the link (point 4).
It's like when you go away from your house for a while, and get the mail forwarded. Should people update their address books to your new address? Not really, because forwarded mail generally means you'll be back at the original address at some point. A 302 redirect is equivalent to telling the post office to send your mail to a beach house for the holidays, and to stop when you return.
Part of the problem with using a 302 redirect is that it has been misused by people trying to trick search engines, and it indicates only a temporary shift in the URL. So some search engines will update their index to the new URL, others won't, and the end result is not what you are after.
Matt Cutts' blog (Matt works for Google) has an interesting post on the difficulties in interpreting 302 redirects. It's well worth a read for more details on the issue.
A 301 Will Do
A 301 redirect (again, 301 refers to the Http status code) means a permanent redirect, as in 'the resource has permanently moved.' Reverting to the mail metaphor, if you move into a new house, you get the mail redirected to your new location, and at least where I live, the post office takes care of informing your bank, insurance company and government agencies that you have permanently changed your address. A 301 is equivalent to this scenario : we've moved and we're not coming back.
When search engines receive a 301 redirect status code, they know the URL in question has permanently changed, and they will go and investigate and index the new page. They will also assign any link weighting from incoming links over from the old URL to the new URL. This means your page, even if it has completely new content and a new URL, will retain the Google PageRank of your old page. Your page has permanently moved to a new location.
If you search around the Internet long enough, you'll find all sorts of conflicting information about using 301's and search engines. But I'm here to tell you this: it works, and it works well. I've changed the URL on many, many pages on websites, and without exception, it works well.
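At the protocol level, a 301 is nothing more than a status line plus a 'Location' header naming the new URL. The following is a minimal sketch as a Python WSGI application, not how any particular web server or DNN module does it. The `PERMANENT_REDIRECTS` table and the `app` function are hypothetical names, and the paths are the 'very cool thing' example from earlier in this post.

```python
# Hypothetical table of old paths and where they permanently moved to.
PERMANENT_REDIRECTS = {
    "/myverycoolthing.html": "/my-very-cool-thing.html",
}


def app(environ, start_response):
    """Tiny WSGI app: answer 301 for moved pages, 200 for anything else."""
    path = environ.get("PATH_INFO", "/")
    new_path = PERMANENT_REDIRECTS.get(path)
    if new_path is not None:
        # The status code says "permanently moved"; Location says where to.
        start_response("301 Moved Permanently", [("Location", new_path)])
        return [b""]
    # Simplification: every other path is treated as an existing page.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Great, here's the page you want!"]
```

To try it locally, the standard library's `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` will serve it. Swapping the status string for "302 Found" would turn the same response into the temporary redirect discussed above, which is why the choice of code matters so much to search engines.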
An Example Using 301 Redirects
I'm always tinkering with my own site, ifinity.com.au. I read my search keyword statistics and make adjustments to URLs and site structure based on what people are actually searching (and finding) the site with.
Originally, I set up a section / subsection in my website called 'What We Do.' I thought this title sounded more personable. But really, I found that people never search for 'what we do.' They search for things like 'products', 'services', 'consultants' and keywords like that. I made the decision to change 'what we do' to 'services.' It's the sort of language that people actually use, and as people speak, so do they search.
The menu structure before the change is shown on the image at the left. Because my site is built with DNN, the URLs are based on the page names, which are also used to generate the menu items. This resulted in a URL of 'What_we_do/Software_Development' for the page.
You can see a screenshot of the Google search results for 'ifinity software development' below. (The fact that Google assumes I misspelled the name is something we'll skip over and cover another day.)
The screenshot was taken on May 21, 2008, the day I made the change to the site structure. The relevant URL for the page has been highlighted in red.
The menu structure after the change is shown in the image on the left. The new URL looks like this: 'services/software_development/.' It's important to note that I also changed the entire content of the page, rewriting the copy for the page to better reflect what I think people are looking for when they find this page. In terms of search engines, the page is completely different - different URL, different content.
When I changed the page over, I also created a 301 redirect from the old URL 'What_We_Do/Software_Development' to the new page 'Services/Software_Development.' I then monitored search engine bots visiting the new page and whether or not the Google index was properly updated. Within a week, I had my answer from Google:
This screenshot was taken a week later, for the same search term. As you can see in the red-highlighted area, the page URL was updated, and the page maintained its position in the search results (overall, the site jumped one position, which may or may not be related.) The 301 redirect did its magic, and did so within a week of being issued. Now, you can't rely on 'a week' as the time it takes to get updated. It certainly isn't instantaneous, and it can take much, much longer. There are still some links (a month later) in the index which haven't been updated since my reorganization - these are links which rank low in search results. However, the higher ranked your page, the faster the index will get updated.
Incidentally, you can see my change in the page meta description field, with my 'new' copy intended to encourage more click-throughs from the search results. I'm interested in feedback on which seems more 'clickable.'
Checking the Logs
If you're the technical type, like me, you might want to check your website log files to see if the redirect is working as intended. Here is an excerpt from the log files of ifinity.com.au:
2008-05-22 17:22:25 GET /What_we_do/Software_Development/ - 80 HTTP/1.0 Mozilla/5.0+(compatible;+Yahoo!+Slurp;) - www.ifinity.com.au 301 0 0 473 267 250
2008-05-22 17:22:26 GET /Services/Software_Development/ - 80 HTTP/1.0 Mozilla/5.0+(compatible;+Yahoo!+Slurp;) - www.ifinity.com.au 200 0 0 27433 215 1015
The first line is the Yahoo bot reading the old URL it expects : 'What_We_Do/Software_Development'. It receives the 301 status code (shown after the domain name) and, 1 second later, returns looking for the new Url of 'Services/Software_Development.' This is the page that ultimately gets indexed and kept, and all old references in the search index are updated. You can see the '200' status code returned after the second URL to indicate that all went OK.
Note: I took some detail out of the log lines for simplicity.
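If you have more than a couple of lines to look through, a few lines of script can pull out the status codes for you. This is a minimal Python sketch that assumes the field positions seen in the trimmed excerpt above (status in the eleventh space-separated field). Real IIS log files declare their field order in a '#Fields' header line and yours will likely differ, so treat the indexes and the `parse_line` helper as hypothetical.

```python
LOG_LINES = [
    "2008-05-22 17:22:25 GET /What_we_do/Software_Development/ - 80 HTTP/1.0 "
    "Mozilla/5.0+(compatible;+Yahoo!+Slurp;) - www.ifinity.com.au 301 0 0 473 267 250",
    "2008-05-22 17:22:26 GET /Services/Software_Development/ - 80 HTTP/1.0 "
    "Mozilla/5.0+(compatible;+Yahoo!+Slurp;) - www.ifinity.com.au 200 0 0 27433 215 1015",
]


def parse_line(line):
    """Split one log line on spaces and pick out the fields of interest.

    The positions below are assumed from the excerpt, not from a real
    '#Fields' header.
    """
    fields = line.split()
    return {
        "time": fields[0] + " " + fields[1],
        "path": fields[3],
        "user_agent": fields[7],
        "host": fields[9],
        "status": int(fields[10]),
    }


entries = [parse_line(line) for line in LOG_LINES]

# List every request that was answered with a permanent redirect.
redirected = [entry["path"] for entry in entries if entry["status"] == 301]

for entry in entries:
    print(entry["time"], entry["status"], entry["path"])
```

Filtering on `user_agent` as well would narrow the list down to search engine bots, which is the traffic you care about when checking that the index is being updated.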
301 Redirects as Repair Strategies
We all make mistakes, and I make quite a few with my own website as I often try out new ideas and software before releasing it to anyone else. One of those mistakes in the past was somehow making the IP address available as a domain name. This resulted in a complete, duplicate indexing of my site, with associated dilution of ranking and all sorts of other dramas. Once I realized this had happened, I needed to correct the index by removing the 'wrong' domain name (the IP address). You can go through removal tools, but there was a chance that someone had linked to my site using the IP-based domain name. And besides, since the site was already in the index, I might as well try to merge the two sites (ifinity.com.au and the IP address) together.
You can see the problem in the Google result on the left. You definitely don't want this to happen to your site, and if it does, you do want to correct it as quickly as possible.
Again, the 301 redirect is the answer. I built a custom tool which intercepted the IP address based domain, and issued a 301 redirect to the equivalent page using the www.ifinity.com.au domain - in effect, telling search engines that the old domain of 202.60.91.201 no longer exists, and to find all that same information at www.ifinity.com.au.
The results were successful: a short time after putting in the fix, all of the old IP address indexing had been migrated across to the correct domain name. It's not without problems, though, as there are still a few references to the IP address floating around in the search indexes. At least the 301 redirect puts the visitor onto the right page if they click on one of them.
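The custom tool itself isn't shown here, but the idea behind it can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `redirect_for` function and the `RETIRED_HOSTS` set are hypothetical names, and a real implementation would hook into the web server's request pipeline rather than take plain strings.

```python
CANONICAL_HOST = "www.ifinity.com.au"
RETIRED_HOSTS = {"202.60.91.201"}  # hosts that should no longer be indexed


def redirect_for(host, path_and_query):
    """Return (status, location) if the request came in on a retired host, else None."""
    hostname = host.split(":")[0].lower()  # drop any ':80' style port suffix
    if hostname in RETIRED_HOSTS:
        # Same page, same query string, but on the canonical domain.
        return 301, "http://" + CANONICAL_HOST + path_and_query
    return None


print(redirect_for("202.60.91.201", "/Services/Software_Development/"))
print(redirect_for("www.ifinity.com.au", "/Services/Software_Development/"))
```

Keeping the path and query intact is the important part: each duplicate page is pointed at its exact equivalent, rather than dumping every request onto the home page.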
Other Popular Uses of 301 Redirects
Now that you have a basic understanding of using 301 redirects to update search engine indexes, it's time to discuss other applications of the 301 redirect. Mostly, these center around creating 'canonical URLs.' A canonical URL is one that is the 'best' URL for a page - or, if you like, a single URL to unite all the different ways pages can be represented. You absolutely want to encourage canonical URLs for your pages, so that each page in your site only has one single representation. For this it's important to think of a 'page' as a unique piece of content, referenced by a unique URL. So, while /products.aspx?productid=45 and /products.aspx?productId=46 refer to the same physical .aspx page, as far as search engines are concerned, they are two different pages of content.
Taking the products.aspx page a step further, perhaps that same page can be accessed through additional URLs such as : /products/productid/45.aspx or /products/45/my-cool-product.aspx. These might all show the same content, but the problem is that a search engine might index the first, 5 people link to the second one, and 3 other people link to the third version. You've got the potential for three separate pages to show up in the index, and probably one or more of them will show up in 'supplemental results', otherwise known as 'duplicate content.'
In this case, what needs to happen is for a canonical URL to be chosen by you, and all requests for the page end up at the same URL. The end result is that search engines index a single URL, and anyone linking to your content does so through the same URL. That way, all the value from the incoming links is concentrated onto a single page, giving it a much greater chance of ranking success.
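Here is one way the "pick a canonical URL and send everything else to it" rule could be sketched in Python, using the three product URL shapes from the example. The choice of the third form as canonical, the `PRODUCT_SLUGS` lookup, and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical lookup from product id to the keyword-rich part of the URL.
PRODUCT_SLUGS = {45: "my-cool-product"}

# The three URL shapes from the example, each capturing the product id.
PATTERNS = [
    re.compile(r"^/products\.aspx\?productid=(\d+)$", re.IGNORECASE),
    re.compile(r"^/products/productid/(\d+)\.aspx$", re.IGNORECASE),
    re.compile(r"^/products/(\d+)/[^/]+\.aspx$", re.IGNORECASE),
]


def canonical_product_url(request_url):
    """Return the single canonical URL for a product request, or None if unknown."""
    for pattern in PATTERNS:
        match = pattern.match(request_url)
        if match:
            product_id = int(match.group(1))
            slug = PRODUCT_SLUGS.get(product_id)
            if slug is None:
                return None
            return f"/products/{product_id}/{slug}.aspx"
    return None


def respond(request_url):
    """Return (status, location): 200 on the canonical URL, 301 from any variant."""
    canonical = canonical_product_url(request_url)
    if canonical is None:
        return 404, None  # simplification: anything unrecognized is 'not found'
    if canonical != request_url:
        return 301, canonical
    return 200, None


print(respond("/products.aspx?productid=45"))
print(respond("/products/45/my-cool-product.aspx"))
```

Whichever form you choose as canonical matters less than choosing one and sticking to it, so that every variant answers with a 301 pointing at the same place.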
Again, Matt Cutts' blog has a very informative post on this topic : SEO Advice Url Canonicalization.
Having a canonical URL means a single URL for each page of content, no exceptions. When I ended up with an IP address for a domain name, it was the absolute opposite of having canonical URLs. Every single page in my site was available as a duplicate URL.
While you may not have this problem with your own site, you may have another, more common problem : www vs no-www. It's really your choice as a web administrator whether you want to advertise your site as domain.com or www.domain.com. Certainly it makes little difference from a search engine point of view, but what you should be doing is redirecting all the 'www' to no-www, or the 'no-www' to 'www' URLs.
You can take this as far as you want - right down to forcing all versions of a URL to be in the same case (Domain.com -> domain.com) and adding/removing forward slashes (domain.com -> domain.com/). Personally I don't bother with this level as it is my belief that search engines are smart enough to know that 'domain.com' and 'Domain.com' are the same thing. Sure, some Web servers allow content to be differentiated by different case URLs, but most of the time websites will give you the same content, regardless of what case it is requested in. But it's up to you as the website owner and person ultimately responsible for search traffic to make sure it functions the way you would like.
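For the www vs no-www case, the normalization can be sketched with Python's standard `urllib.parse` module. This is a rough illustration that assumes 'www.domain.com' is the preferred form. It lowercases only the host and adds the trailing slash on a bare domain, leaving the case of the path alone. The names `PREFERRED_HOST`, `HOST_ALIASES`, `canonical_url` and `redirect_needed` are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.domain.com"
HOST_ALIASES = {"domain.com", "www.domain.com"}


def canonical_url(url):
    """Normalize host case, www vs no-www, and the bare-domain trailing slash."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host in HOST_ALIASES:
        host = PREFERRED_HOST
    path = parts.path or "/"  # 'domain.com' becomes 'domain.com/'
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, parts.fragment))


def redirect_needed(url):
    """Return (301, target) when the requested URL is not already canonical."""
    target = canonical_url(url)
    return None if target == url else (301, target)


print(redirect_needed("http://Domain.com"))
print(redirect_needed("http://www.domain.com/"))
```

If you prefer the no-www form, only `PREFERRED_HOST` needs to change. The point is that one of the two always redirects to the other.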
301 Redirects and DotNetNuke
How does all this apply to DotNetNuke-based websites, you ask? For a start, there isn't any 301 functionality 'baked' into the DNN core framework. But that's OK, because DNN runs on top of IIS, and IIS has plenty of functionality for forwarding and redirecting. For those in a shared hosting environment without direct access to IIS there is the option of installing third-party products to set up custom redirects.
The three biggest problems with DNN URLs and search engine optimization, as I see them, are:
- No canonical URL functionality: The home page is typically available on the site root (/), /default.aspx?tabid=36, /tabid/36/default.aspx, /home/tabid/36/default.aspx, home.aspx and just plain old /default.aspx.
- No separation of keywords in the generated Page URLs: 'My Cool Thing' ends up being 'mycoolthing.'
- No redirection of deleted or changed pages: Once you delete a page, it won't forward you on or show any content.
These three issues can all be remedied by using 301 redirects. There are many ways to do this - either by setting up individual redirects in IIS, using a third-party IIS tool for setting up redirects, or using a dedicated DNN 301 module, of which there are multiple available.
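The keyword separation issue is easy to picture with a small Python sketch. It shows the kind of transformation you would want applied to a page name, not anything DNN does itself. The function name `hyphenate_page_name` is hypothetical.

```python
import re


def hyphenate_page_name(page_name):
    """Turn a page name into a lowercase, hyphen-separated URL segment."""
    # Replace every run of non-alphanumeric characters with a single hyphen,
    # then trim any hyphens left at either end.
    return re.sub(r"[^a-z0-9]+", "-", page_name.lower()).strip("-")


print(hyphenate_page_name("My Cool Thing"))  # prints my-cool-thing
```

If you introduce hyphenated URLs on a site that already has pages indexed under the old names, each old URL needs a 301 redirect to its new form for the reasons covered earlier.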
Of course, now is the right time to plug my Url Master DNN Url Redirecting and Rewriting module. This DNN module provides a solution to all of the above problems, and many more. It features automatic 301 redirects to enforce canonical URLs, plus the ability to add custom redirects to handle all types of situations where content has been moved or deleted.
Summing Things Up
With any luck, you have learned what a 301 redirect is, and why you should be using them to maintain and increase your position in the search engine results pages. Once you have established yourself in the indexes, it's important you manage your URLs effectively, otherwise all your hard work might just be undone.
Do you disagree with anything? Do you have more to add? Submit your thoughts using the comments field below.
Posted by Tom on Sunday, June 08, 2008 to DotNetNuke, DNN News, OpenForce 08
Welcome to a day of live blogging from OpenForce Connect Orlando. After a few days of family fun in Cocoa Beach, I’ve made my way to the Orange County Convention Center. Of course, I parked just outside the North Building and learned shortly thereafter that the event is held in the South Building in room S210E. I never mind a walk though, except when I’m wearing a shirt and jacket and it’s over 80 degrees at 8:00 am … welcome to Florida!
It’s about 8:30 am now and people are making their way into the room. The first session is scheduled to kick off at 9:00.
Shortly after 9:00 am, Brian Scarbeau, founder of the Orlando DNN user group, greets a good-size crowd and sets the stage for the day by quickly reviewing how this first OpenForce Connect event came about. Then Brian continues in a well articulated way to provide an overview of the DotNetNuke web application framework and shows how to get a local install up and running based on the DNN starter kit. He goes on to point out blogs, tutorials, and other resources (including Vasilis' site) to help newcomers ease their way into the world of DotNetNuke.
At 10:00 am, Will Strohl, lead developer of RezHub.com, takes the stage for his presentation on DNN Skinning Tips and Tricks. He starts off with a high-level overview of the DNN skinning engine including fundamental concepts such as skin tokens, panes, and skin packaging. In between bombarding the audience with RezHub swag, Will outlines techniques to make panes collapsible, sheds light on taking advantage of about.htm as part of the skin package and tries hard not to get drawn into the heated debate of HTML table-based layouts versus CSS-based skin layouts.
Next up is Raul Rodila of Arrow Consulting who talks about understanding and using settings in custom modules. As this was a very developer-focused talk, I’ve turned most of my attention to writing up the morning sessions. Check back later in the week for links to all sessions and downloads.
Tracy Wittenkeller of T-WORX, Inc kicks off the afternoon with his session on building websites with DotNetNuke. Tracy sets the tone of the talk by asserting that the “core modules” included with DNN provide a powerful set of tools that is often overlooked. He proves his point by walking through the most important settings and configurations of the Announcements, Links, Media, and Surveys modules highlighting the power of layout templates along the way.
With his session on securing the DNN connection string, Darrell Hardy of Hardy Consulting rounds out a day of local speakers. He stresses the importance of hardening your DNN installs by following general security guidelines such as using strong passwords and installing only trustworthy modules. Then Darrell goes deeper into various methods of encrypting and decrypting the DNN connection strings in the web.config file. This was an eye-opening session for me and I strongly encourage you to download Darrell’s slides and code as soon as they are available.
The last session of the day is presented by Nik Kalyani, co-founder and CEO of DotNetNuke Corporation and Microsoft MVP. Nik gives a preview of the enhancements to the DNN skinning engine to be released with Cambrian (DNN 5.0.) According to Nik, the overall goal of these new features has been to let the designer be a designer instead of having to grow into a “half-programmer” just to be able to skin and style DotNetNuke. Most interesting to me is the introduction of layout and “super” stylesheets to guide skin developers on their way to a more CSS-based approach to skinning.
And finally, Shaun Walker, Nik Kalyani, Joe Brinkman, and Scott Willhite, all founding members of DotNetNuke Corporation, took the stage for an open panel discussion on the past and future of the DNN web application framework. The guys revisited the major goals (social networking, dynamic content localization, workflow and versioning, and skinning enhancements) for Cambrian set forth at last year’s OpenForce conference in Las Vegas and updated the audience on the current status of these new features and enhancements as well as the overall development status of the upcoming release. The team has worked through 5 betas and is currently “pretty much” in feature lockdown mode. The first iteration of Cambrian will include the above mentioned skinning enhancements as well as an installer roll-back feature and the decoupling of admin and host pages and modules among other enhancements. These new features will form the base necessary for the more “involved” goals of taking on social networking, dynamic content localization, and workflow and versioning.
The inaugural OpenForce Connect conference in Orlando set a high standard for similar events to come. Thanks again to Microsoft’s Joe Healy for spearheading the event and providing the facilities. Further appreciation goes to the entire team of the Orlando DNN user group for their organizational talent and attention to detail. Thanks again also to my fellow sponsors who made this event possible … see you all no later than November in Vegas!
Posted by Tom on Wednesday, May 14, 2008 to DotNetNuke, SEO
Almost a year ago in my DNN SEO quick start guide I talked about minimizing duplicate content by crafting “well-formed internal links” and over the last few months many of you wrote in to ask what exactly I meant by that.
We’ll get to the bottom of the issue in a moment, but first let’s refresh the idea of “duplicate content” again. Ideally, every URL of your website should correspond to exactly one unique page within your website. And that’s generally how it worked back in the day when most sites were made up of static HTML pages. That all changed though as larger sites started moving their content into databases and pages were assembled “on the fly.” And as ecommerce and content management systems gained in popularity, multiple URLs leading to the same page became quite common. That in turn did not sit well with Google and other search engines as it undermines the quality of web search results, which sparked rumors of search engines coming down on webmasters with “duplicate content penalties.”
Today, according to Google, there is no need to lose sleep over duplicate content as long as you try to minimize it by reasonable means. One way of doing so is to pay attention to the internal links you create within DotNetNuke. Our number one enemy here is the DNN URL control (also known as LinkClick.aspx), which facilitates link building in modules such as Announcements, Links, Text/HTML and others.



As you can see in the above screenshots, when you go through the steps of creating a link the “point and click” way in the FCKeditor, you’ll end up with an anchor tag that looks like this:
<a href="/LinkClick.aspx?link=53&tabid=56">DNN SEO Blog</a>
Technically this link obviously works, meaning it will take your visitor to the intended page on your site. However, you’ve just produced a piece of duplicated content in the eyes of search engines as you now have two URLs leading to one and the same page. Furthermore, and maybe more importantly, you are wasting “link juice” or “votes” for the page you are linking to by referencing it via multiple URLs. Here is what the anchor tag should look like instead:
<a href="/blog.aspx">DNN SEO Blog</a>
As a general rule, follow the URL structure as laid out in your main menu. If you have to rely on creating links via FCK’s Insert/Edit Link button, then typing or pasting the URL from the browser address bar is your only option:

And if you think that switching to a different WYSIWYG editor will solve the problem, think again. Telerik’s editor, for instance, creates URLs such as this one when picking from the Custom Links dropdown:
<a href="/Default.aspx?TabName=Blog">DNN SEO Blog</a>
Hardly any better.
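If you want to hunt down links like these in existing content, a short script can do the looking for you. The following is a minimal Python sketch using the standard library's `html.parser`. The `LinkAuditor` class is a hypothetical helper, and the two checks (a LinkClick.aspx path, or Default.aspx with a query string) are my own reading of the examples above, so adjust them to your own site.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkAuditor(HTMLParser):
    """Collect hrefs that point at LinkClick.aspx or Default.aspx?... style URLs."""

    def __init__(self):
        super().__init__()
        self.flagged = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        parts = urlsplit(href)
        path = parts.path.lower()
        is_link_click = path.endswith("linkclick.aspx")
        is_default_with_query = path.endswith("default.aspx") and bool(parts.query)
        if is_link_click or is_default_with_query:
            self.flagged.append(href)


SAMPLE_HTML = """
<a href="/LinkClick.aspx?link=53&tabid=56">DNN SEO Blog</a>
<a href="/Default.aspx?TabName=Blog">DNN SEO Blog</a>
<a href="/blog.aspx">DNN SEO Blog</a>
"""

auditor = LinkAuditor()
auditor.feed(SAMPLE_HTML)
for href in auditor.flagged:
    print("Not well-formed:", href)
```

Run against the three anchor tags from this post, it flags the first two and leaves the clean /blog.aspx link alone.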
The very same problem arises when using the Links and the Announcements module with link counter turned on:

Simply unchecking “Track Number of Times this Link is Clicked?” takes care of the issue here. I like to argue that link tracking is better handled by your web analytics provider instead of DNN itself.
Starting with version 4, DotNetNuke has considerably cut down on duplicate content issues, but the dreaded LinkClick.aspx is still with us to this day. Fortunately, a heightened awareness of what’s going on behind the UI and a basic understanding of what constitutes a well-formed link is all it takes to minimize duplicated content and maximize link equity when linking between pages in DNN.
As always, I’d like to hear from you. Do you consider LinkClick.aspx your friend or foe? What other duplicate content issues have you run into in your daily DotNetNuke adventures?