

Capacity planning and RSS

Robert Scoble points to MSDN having issues with full-entry RSS feeds. What it comes down to is a capacity planning exercise.

In his note, he says that RSS is broken. I personally believe that the issue is not whether RSS is working. RSS works, but it has complicated the bandwidth picture, because RSS feeds generally generate more traffic to a site. Since RSS readers poll the site to check whether a feed has been updated, traffic patterns change, with more spikes on an hourly basis. This is similar to some of the issues network administrators started facing when PointCast first appeared.
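As a back-of-the-envelope illustration of why polling turns this into a capacity planning question, the sketch below estimates the daily traffic for one feed. The function name, its parameters, and any numbers fed into it are hypothetical planning inputs, not measurements from MSDN or TNL.net, and HTTP header overhead is ignored.

```python
def daily_feed_bytes(subscribers: int, polls_per_day: int, feed_bytes: int,
                     unchanged_fraction: float = 0.0,
                     compression_savings: float = 0.0) -> float:
    """Rough estimate of the bytes served per day for a single feed.

    unchanged_fraction: share of polls that can be answered without a body
        (hypothetical; depends on how often the feed really changes).
    compression_savings: share of the payload removed by compression.
    """
    total_polls = subscribers * polls_per_day
    # Only polls that actually return the feed body cost real bandwidth.
    polls_with_body = total_polls * (1.0 - unchanged_fraction)
    return polls_with_body * feed_bytes * (1.0 - compression_savings)
```

For instance, 1,000 hypothetical subscribers polling hourly for a 50 KB feed come to 1.2 GB a day (`daily_feed_bytes(1000, 24, 50_000)`), before any mitigation is applied.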

There are a number of ways to mitigate the issue.

HTTP Conditional GET for RSS

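First of all, one of the things to consider when publishing RSS is to support HTTP conditional GET on the feeds: the server sends Last-Modified and ETag headers with the feed, and readers send those values back as If-Modified-Since and If-None-Match on the next poll. This helps mitigate some of the impact by ensuring that the full feed is only served if the content has changed; otherwise the server answers with a short 304 Not Modified response.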
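A minimal sketch of the server-side decision might look like the following. The function name `conditional_response` and the plain-dict header interface are assumptions made for illustration and are not tied to any particular web server or framework; deriving the ETag from a hash of the feed body is just one possible choice.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import formatdate, parsedate_to_datetime


def conditional_response(feed_body: bytes, last_modified: datetime,
                         request_headers: dict):
    """Return (status, headers, body) for a feed request.

    last_modified must be a timezone-aware datetime. Header names in
    request_headers are matched exactly (an assumption of this sketch).
    """
    # Hypothetical ETag scheme: a short hash of the current feed body.
    etag = '"' + hashlib.sha256(feed_body).hexdigest()[:16] + '"'
    headers = {
        "ETag": etag,
        "Last-Modified": formatdate(last_modified.timestamp(), usegmt=True),
    }

    # If-None-Match takes precedence over If-Modified-Since.
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        client_tags = [tag.strip() for tag in if_none_match.split(",")]
        if etag in client_tags or "*" in client_tags:
            return 304, headers, b""
        return 200, headers, feed_body

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # Unparseable date: fall through to a full response.
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution.
            if last_modified.replace(microsecond=0) <= since:
                return 304, headers, b""

    return 200, headers, feed_body
```

A reader that polls again with the ETag or Last-Modified value it received earlier gets an empty 304 response as long as the feed has not changed.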

Feed Compression

The next item to think of is to use compression when serving feeds. Doing so reduces the size of the payload, which makes bandwidth much easier to manage. In my own experience, because RSS is primarily text, I've seen bandwidth drop by 80% when delivering RSS feeds in a compressed format. That represents a fairly large gain in bandwidth that can then accommodate more users.
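To get a feel for the savings on your own feed, a sketch along these lines can help. It uses gzip from the Python standard library; the function names are hypothetical, and the Accept-Encoding handling is deliberately simplified (quality values such as `q=0` are ignored). The actual ratio depends entirely on the feed content.

```python
import gzip


def gzip_savings(feed_xml: bytes) -> float:
    """Fraction of the payload saved by gzip (0.8 would mean 80% smaller)."""
    compressed = gzip.compress(feed_xml)
    return 1.0 - len(compressed) / len(feed_xml)


def negotiate_body(feed_xml: bytes, accept_encoding: str):
    """Return (headers, body), compressing only when the client lists gzip.

    Simplification: any mention of gzip counts as support; q-values
    are not interpreted.
    """
    encodings = [token.split(";")[0].strip().lower()
                 for token in accept_encoding.split(",")]
    # Vary tells caches that the response depends on Accept-Encoding.
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in encodings:
        headers["Content-Encoding"] = "gzip"
        return headers, gzip.compress(feed_xml)
    return headers, feed_xml
```

Clients that do not advertise gzip support still receive the uncompressed feed, so nothing breaks for older readers.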

Change the polling schedule

The RSS 2.0 specification already offers a number of optional elements to give RSS readers a better idea as to when to get content. For example, the pubDate element offers information as to when a feed was last published, as does the lastBuildDate one. ttl (time to live) can also be used to tell the software how many minutes the feed can be cached before it needs refreshing. Finally, skipHours and skipDays offer more pointers as to when RSS reader software should not poll. With all those mechanisms in place, it looks like a lot of flexibility exists in the format to accommodate scalability.
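On the reader side, honoring these hints could look roughly like the sketch below. The function name `should_poll` and its arguments are hypothetical; it treats ttl as minutes, skipHours as GMT hours from 0 to 23, and skipDays as English day names, which is my reading of the RSS 2.0 specification. Real readers apply their own policies on top of this.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def should_poll(feed_xml: str, last_fetch: datetime, now: datetime) -> bool:
    """Decide whether a reader should fetch the feed again.

    feed_xml is the previously fetched copy of the feed; last_fetch and
    now must be timezone-aware datetimes.
    """
    channel = ET.fromstring(feed_xml).find("channel")
    if channel is None:
        return True  # No hints available, so poll as usual.

    now_gmt = now.astimezone(timezone.utc)

    # skipHours: hours of the day (GMT) during which the feed may be skipped.
    skip_hours = {int(hour.text.strip())
                  for hour in channel.findall("skipHours/hour")
                  if hour.text and hour.text.strip().isdigit()}
    if now_gmt.hour in skip_hours:
        return False

    # skipDays: day names on which the feed may be skipped.
    skip_days = {day.text.strip()
                 for day in channel.findall("skipDays/day") if day.text}
    if DAY_NAMES[now_gmt.weekday()] in skip_days:
        return False

    # ttl: minutes the cached copy may be kept before refreshing.
    ttl_text = channel.findtext("ttl")
    if ttl_text and ttl_text.strip().isdigit():
        if now - last_fetch < timedelta(minutes=int(ttl_text.strip())):
            return False

    return True
```

A publisher only benefits from these elements if readers actually implement checks like this, so they complement the server-side measures rather than replace them.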

When all else fails, reduce

If all of the above still fail, RSS publishers should look at reducing the size of their feeds. There are two ways you can do this. First, you can simply decide not to offer full-text feeds. This seems to be the option that Scoble hates. The other way is to offer both abbreviated and full-text feeds, or to offer more detailed feeds, as I do on TNL.net.
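One simple way to produce the abbreviated version of an entry is to strip the markup and keep only the first few words. The sketch below is an assumption about how that could be done, not a description of how TNL.net builds its feeds; the function name and the 40-word default are hypothetical, and the regular expression is a crude tag stripper rather than a full HTML parser.

```python
import re
from html import unescape


def abbreviate(html_body: str, max_words: int = 40) -> str:
    """Return a plain-text excerpt of an entry for an abbreviated feed."""
    # Crude tag removal; good enough for a sketch, not for arbitrary HTML.
    text = unescape(re.sub(r"<[^>]+>", " ", html_body))
    words = text.split()
    if len(words) <= max_words:
        return " ".join(words)
    # Mark the cut so readers know there is more on the site.
    return " ".join(words[:max_words]) + " ..."
```

The full-text feed would carry the original entry body, while the abbreviated feed carries only the excerpt and a link back to the entry.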

An important consideration when doing something like this is how to direct users to each feed. By default, users who just use the RSS autodiscovery feature will only get the abbreviated feed. However, they still have the option to go and get the full-text version. The compromise here is that users who just want to subscribe quickly can do so at a lower bandwidth cost, while power users can seek out the fuller feed and subscribe to that. The result, in my experience, is that most people use the autodiscovery feature, grabbing the smaller feed. Some power users do seek out the fuller feed and subscribe to that instead (based on the numbers, I'm seeing about 5% usage of the full-text feed as opposed to the default abbreviated one). This is a compromise solution that seems to accommodate everyone involved to date.
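In practice, making the abbreviated feed the default means advertising only that feed in the page's autodiscovery `<link>` tag and exposing the full-text feed through an ordinary visible link. A sketch of generating that tag follows; the function name and the example.com URL in the usage note are hypothetical, while the `rel="alternate"` and `type="application/rss+xml"` attributes follow the usual RSS autodiscovery convention.

```python
from html import escape


def autodiscovery_link(feed_url: str, title: str) -> str:
    """Build the <link> tag that readers and browsers use to find a feed."""
    return ('<link rel="alternate" type="application/rss+xml" '
            f'title="{escape(title, quote=True)}" '
            f'href="{escape(feed_url, quote=True)}">')
```

Placing `autodiscovery_link("https://example.com/rss/summary.xml", "Summary feed")` in the page head points one-click subscribers at the smaller feed, while the full-text URL stays available to anyone who goes looking for it.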

Final considerations

When you publish RSS feeds, your audience grows, which results in traffic growth too. One of the things to realize is that RSS feeds are generally stickier than the rest of a site. What this means is that, for every new subscriber you get, you will see an ongoing increase in your overall site traffic stats. This is not a bad thing, as messages emanating from your site do get a higher passive readership. One of the things that new syndication standards should consider is a follow-up on this. While RSS publishers know how many feeds are being pushed out, there is little in the way of information as to what percentage of those feeds is being read. Stronger metrics need to be developed to get an understanding of passive vs. active subscribers (passive subscribers receive the feed but do not read it, while active subscribers are actually reading the content and clicking through). This, I believe, is one of the next challenges that needs to be addressed in order to make RSS a more viable and widespread distribution platform.

Originally published on September 9, 2004 in Technology.