<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>TNL.net &#187; Search</title>
	<atom:link href="http://www.tnl.net/blog/tag/search/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.tnl.net/blog</link>
	<description>Turning Data into Knowledge</description>
	<lastBuildDate>Wed, 08 Feb 2012 20:15:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
<cloud domain='www.tnl.net' port='80' path='/blog/?rsscloud=notify' registerProcedure='' protocol='http-post' />
		<item>
		<title>Google Goes Real-Time</title>
		<link>http://www.tnl.net/blog/2009/12/03/google-goes-real-time/</link>
		<comments>http://www.tnl.net/blog/2009/12/03/google-goes-real-time/#comments</comments>
		<pubDate>Thu, 03 Dec 2009 22:00:06 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[DNS]]></category>
		<category><![CDATA[DNS system]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[real-time search]]></category>
		<category><![CDATA[real-time web]]></category>
		<category><![CDATA[search engine]]></category>

		<guid isPermaLink="false">http://www.tnl.net/blog/?p=1530</guid>
		<description><![CDATA[Google's DNS could have as much to do with search and advertising as it does with other technical concepts.<p><p><i><a href="http://tnl.net/who" rel="author" title="Who is Tristan Louis?">Tristan Louis</a> is the founder and CEO of <a href="http://www.keepskor.com" title="Keepskor">Keepskor</a> and  writes the influential <a href="http://www.tnl.net/" title="tnl.net">tnl.net</a> weblog, where this was initially posted under the title <a href="http://www.tnl.net/blog/2009/12/03/google-goes-real-time/">Google Goes Real-Time</a>. You can follow him on twitter <a href="https://twitter.com/TNLNYC">here</a> or receive his weekly newsletter by subscribing <a href="http://eepurl.com/gb6zD">here</a>.</i></p>
</p>
]]></description>
			<content:encoded><![CDATA[<p>Over the last few days, Google has made a concerted effort to show that it cares about how fast the internet is. The company announced <a href="http://analytics.blogspot.com/2009/12/google-analytics-launches-asynchronous.html">changes to its analytics engine</a> to speed up sites, <a href="http://googlewebmastercentral.blogspot.com/2009/12/how-fast-is-your-site.html">provided new tools to webmasters</a> to enhance their offerings’ performance, and is now <a href="http://googleblog.blogspot.com/2009/12/introducing-google-public-dns.html">offering a set of DNS servers</a> to anyone who wants to use Google instead of their own ISP to look up web addresses.</p>
<h2>Faster access to pages… for Google</h2>
<p>DNS servers, for people who don’t know the technical details, are basically <a href="http://en.wikipedia.org/wiki/Domain_Name_System">the phone books</a> of the internet. On the internet, every server is identified by a set of digits known as its IP address. For example, when you typed tnl.net, the DNS server looked up the name and discovered that it was at 206.127.35.2, allowing your computer to connect to mine.</p>
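<p>As a rough illustration of the lookup described above, here is a minimal Python sketch that turns a host name into an IP address. The function name <code>resolve</code> and its <code>lookup</code> parameter are assumptions made for this example; by default it defers to the standard library's <code>socket.gethostbyname</code>, which asks whatever DNS resolver the operating system is configured to use.</p>

```python
import socket


def resolve(hostname, lookup=socket.gethostbyname):
    """Return the IPv4 address for hostname, or None if it cannot be resolved.

    `lookup` is a hypothetical hook so the resolver can be swapped out,
    for example with a fake one when no network is available.
    """
    try:
        return lookup(hostname)
    except OSError:
        # socket.gaierror (raised when a name is not found) subclasses OSError.
        return None
```

<p>Calling <code>resolve("tnl.net")</code> returns whatever address the live DNS records hold at that moment, which may well differ from the one quoted above.</p>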
<p>I’ve argued, <a href="http://www.tnl.net/blog/2005/05/06/google-accelerates-search/">for many years</a>, that Google wanted to find a way to access new web pages at a much faster rate. The challenge the company has had is that it is difficult to find new pages when they appear. While traditional technology to discover what pages are available on the internet has evolved and Google has managed to coerce some site owners into providing it with a quick update when changes are available on their site, the efforts the company has pushed have also been helpful to its competitors, who could build on top of the processes and <a href="http://en.wikipedia.org/wiki/Sitemaps">open standards</a> Google was fostering.</p>
<p>With Microsoft getting some early successes in the search game with their new <a href="http://www.bing.com">Bing</a> offering, and <a href="http://www.nytimes.com/auth/login?URI=/2009/06/14/business/14digi.html&#038;OQ=_rQ3D5&#038;REFUSE_COOKIE_ERROR=SHOW_ERROR">new entrants</a> in the <a href="http://en.wikipedia.org/wiki/Real-time_web#Real-time_search">real-time search</a> business <a href="http://www.guardian.co.uk/business/2009/may/19/google-twitter-partnership">eating up</a> some of Google’s mindshare in search, the company needs to do something radical, lest the cornerstone of its business, and the source of most of its revenue, be undermined.</p>
<p>Enter the DNS system. Every time a page is called, your browser makes a DNS call (several, in fact, as every web asset can result in a different one). In other words, the DNS system truly serves as the heartbeat of the internet, and convincing a large swath of users to route their lookups through its servers could give Google an idea as to what’s new on the internet.</p>
<p>If, for example, a user were to access a new page that’s not in the Google index, Google’s own DNS servers could be wired up to alert its search spiders to immediately pick up the page, analyse it, index it, and make it available to its search users within seconds of the page first being accessed. This could give Google a substantial advantage over Microsoft and others in indexing the web in real time.</p>
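<p>The mechanism speculated about above could look something like the following Python sketch. It is purely hypothetical: the class <code>CrawlTrigger</code>, its method names, and the in-memory queue are invented for illustration, and nothing here reflects how Google's DNS service actually works. Note that a DNS query carries only a host name, not a full page address, so the sketch operates at the host level.</p>

```python
from collections import deque


class CrawlTrigger:
    """Queue a host for crawling the first time a DNS query mentions it."""

    def __init__(self, indexed_hosts):
        self.indexed_hosts = set(indexed_hosts)  # hosts already in the index
        self.crawl_queue = deque()               # hosts waiting for the spider

    def on_dns_query(self, hostname):
        """Handle one DNS query; return True if it triggered a crawl."""
        # Normalize case and drop the trailing dot of a fully qualified name.
        host = hostname.lower().rstrip(".")
        if host in self.indexed_hosts:
            return False
        self.indexed_hosts.add(host)   # avoid queueing the same host twice
        self.crawl_queue.append(host)
        return True
```

<p>A real system would also need to decide which pages on a newly seen host to fetch, since the DNS query alone does not say.</p>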
<h2>Some potential risks</h2>
<p>There are, however, some huge caveats in all this. For starters, Google needs to convince a large number of people to use its DNS. Providing the product for free may work for some but will not convince everyone. Another issue it may have to deal with is the perception that it might snoop on personal data (something that is <a href="http://code.google.com/speed/public-dns/privacy.html">already being addressed on their site</a>). The ability to access information about everything you do on the internet, whether it is via a web browser or another application like Skype, online games, or video and music players, grants Google some brand new capabilities, and not everyone may be willing to share such information.</p>
<p>Google will also have to contend with <a href="http://www.internetnews.com/dev-news/article.php/1486981">large numbers</a> of potential <a href="http://en.wikipedia.org/wiki/DNS_cache_poisoning">denial of service attacks (or worse)</a>, which have become <a href="http://searchsecurity.techtarget.com/news/interview/0,289202,sid14_gci859083,00.html">more common</a> of late, against those DNS servers. Such attacks could represent a substantial reputational risk to the company. If, for example, one of Google’s DNS servers were compromised, the attacker could redirect the traffic of banks or other financial institutions to their own sites. The potential financial impact of such a thing would become a legal nightmare for the company.</p>
<h2>The Prize</h2>
<p>All this, however, can be counterbalanced by the rich prize the company would get in being able to index every bit of internet content within seconds (or even nanoseconds) of such content being available on the internet. If that were to be achieved, the company’s perch at the top of the search heap would be guaranteed for a long, long time and its continued dominance in the advertising world, based on the rich analytical data it could get from snooping on users of its services, would provide it with cashflow that other players on the internet would have a hard time matching.</p>
<p>Furthermore, Google could have control over where people go and could, if they decided to be evil, redirect such traffic. That would be a tremendous amount of power.</p>
<p>All told, an interesting move, and it will be fascinating to see where they are planning to take this.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2009/12/03/google-goes-real-time/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Is Techmeme myopic?</title>
		<link>http://www.tnl.net/blog/2008/06/02/is-techmeme-myopic/</link>
		<comments>http://www.tnl.net/blog/2008/06/02/is-techmeme-myopic/#comments</comments>
		<pubDate>Tue, 03 Jun 2008 00:00:41 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Media]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[3G]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[History]]></category>
		<category><![CDATA[Intel]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Trends]]></category>
		<category><![CDATA[Wireless]]></category>
		<category><![CDATA[XML]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://www.tnl.net/blog/?p=528</guid>
		<description><![CDATA[I’m a big fan of TechMeme, a web aggregation service that provides, at a glance, a view of what’s being discussed in the technology-focused part of the blogosphere. It has allowed me to unsubscribe from a large number of RSS feeds that were providing me with redundant information and I’ve long hoped for a version [...]<p><p><i><a href="http://tnl.net/who" rel="author" title="Who is Tristan Louis?">Tristan Louis</a> is the founder and CEO of <a href="http://www.keepskor.com" title="Keepskor">Keepskor</a> and  writes the influential <a href="http://www.tnl.net/" title="tnl.net">tnl.net</a> weblog, where this was initially posted under the title <a href="http://www.tnl.net/blog/2008/06/02/is-techmeme-myopic/">Is Techmeme myopic?</a>. You can follow him on twitter <a href="https://twitter.com/TNLNYC">here</a> or receive his weekly newsletter by subscribing <a href="http://eepurl.com/gb6zD">here</a>.</i></p>
</p>
]]></description>
			<content:encoded><![CDATA[<p>I’m a big fan of <a title="Techmeme" href="http://www.techmeme.com">TechMeme</a>, a web aggregation service that provides, at a glance, a view of what’s being discussed in the technology-focused part of the blogosphere. It has allowed me to unsubscribe from a large number of RSS feeds that were providing me with redundant information, and I’ve long hoped for a version of TechMeme that would give me a customized view, with a similar user interface, of my own personal feeds.</p>
<p>Recently, though, TechMeme has gotten me thinking about the tech blogosphere conversations as a whole and their longer term relevance. To the small “web 2.0” community, TechMeme serves as a bit of a paper of record; the subhead even claims that it represents the “Tech Web, page A1”, claiming to bring us the important stories. But how do those stories fare over time? Is today’s hot topic a step in understanding a longer term trend or is it just a temporary distraction that will be forgotten a month, three months, six months, or a year from now?</p>
<p>Fortunately, Gabe Rivera, the founder of TechMeme, must have anticipated such a question and provided a way to look at TechMeme as it was at a particular point in its short history. Using the simple interface, it’s easy to see the page as it existed at a precise point in time. So I decided to look at the site at the same times of day, at intervals ranging from a week to two years. Midnight and noon ought to be good time stamps, so I decided to look at the site at 12am and 12pm on each selected date. I also had to account for the fact that April 1st is April Fools’ Day, so I could not use the first of the month, as this could skew the data. Here are the dates and times I ended up with:</p>
<ul>
<li>Today: June 2, 2008 at <a href="http://www.techmeme.com/080602/h0000">12am</a> and <a href="http://www.techmeme.com/080602/h1200">12pm</a></li>
<li>A week ago: May 26, 2008 at <a href="http://www.techmeme.com/080526/h0000">12am</a> and <a href="http://www.techmeme.com/080526/h1200">12pm</a></li>
<li>Two weeks ago: May 19, 2008 at <a href="http://www.techmeme.com/080519/h0000">12am</a> and <a href="http://www.techmeme.com/080519/h1200">12pm</a></li>
<li>One month ago: May 2, 2008 at <a href="http://www.techmeme.com/080502/h0000">12am</a> and <a href="http://www.techmeme.com/080502/h1200">12pm</a></li>
<li>Two months ago: April 2, 2008 at <a href="http://www.techmeme.com/080402/h0000">12am</a> and <a href="http://www.techmeme.com/080402/h1200">12pm</a></li>
<li>Three months ago: March 2, 2008 at <a href="http://www.techmeme.com/080302/h0000">12am</a> and <a href="http://www.techmeme.com/080302/h1200">12pm</a></li>
<li>Six months ago: December 2, 2007 at <a href="http://www.techmeme.com/071202/h0000">12am</a> and <a href="http://www.techmeme.com/071202/h1200">12pm</a></li>
<li>Nine months ago: September 2, 2007 at <a href="http://www.techmeme.com/070902/h0000">12am</a> and <a href="http://www.techmeme.com/070902/h1200">12pm</a></li>
<li>One year ago: June 2, 2007 at <a href="http://www.techmeme.com/070602/h0000">12am</a> and <a href="http://www.techmeme.com/070602/h1200">12pm</a></li>
<li>Two years ago: June 2, 2006 at <a href="http://www.techmeme.com/060602/h0000">12am</a> and <a href="http://www.techmeme.com/060602/h1200">12pm</a></li>
</ul>
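<p>The links in the list above follow a regular pattern, which the Python sketch below reproduces. The pattern (a two-digit year, month and day, followed by <code>h</code> and a four-digit time) is inferred from those links rather than from any official TechMeme documentation, and the function names are made up for this example.</p>

```python
from datetime import date


def techmeme_snapshot_url(day, hour):
    """Build the archive URL for a given date and hour (0 to 23).

    The pattern http://www.techmeme.com/YYMMDD/hHH00 is an assumption
    inferred from the links in the list above.
    """
    stamp = day.strftime("%y%m%d")
    return "http://www.techmeme.com/{}/h{:02d}00".format(stamp, hour)


def snapshot_pair(day):
    """Return the midnight and noon URLs for one date."""
    return techmeme_snapshot_url(day, 0), techmeme_snapshot_url(day, 12)
```

<p>For example, <code>snapshot_pair(date(2008, 6, 2))</code> yields the two links given for "Today" in the list.</p>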
<p>With 20 data points, here’s what I discovered.</p>
<h3>Today</h3>
<p>Based on <a href="http://www.techmeme.com/080602/h1200">today’s news at noon</a>, it looks like the important subject in the blogosphere is Adobe’s latest move: combining Flash and Acrobat for its entry into the already crowded (Google, Microsoft, Zoho, etc.) web-based office suite market. <a href="http://www.techmeme.com/080602/h0000">At midnight</a>, things were a little less exciting, with discussion around the privacy issues Google Maps is raising with its StreetView offering.</p>
<p>Of course, it’s still too early to tell whether those stories will have a long term impact so let’s roll the tape back a little.</p>
<h3>One Week Ago: May 26, 2008</h3>
<p><a href="http://www.techmeme.com/080526/h1200">At noon, a week ago</a>, the top story was about a new type of SSD, developed by Samsung. Since it’s hardware, I assume that the impact of this news can’t be felt initially but there could be longer term repercussions. Also of note, lower on that page, is a small item about PayPal outages. My research shows this story developing slowly over a period of weeks and months, with the noise level around it increasing.</p>
<p><a href="http://www.techmeme.com/080526/h0000">At midnight</a>, the discussion was around Google’s power and the need for another organization to act as a counterbalance to that powerful force in the search engine space. Coupled with the discussions last night about privacy issues relating to Google Maps, it seems we are seeing an emerging pattern here.</p>
<h3>Two Weeks Ago: May 19, 2008</h3>
<p><a href="http://www.techmeme.com/080519/h1200">Two weeks ago, at noontime</a>, the claim that Microsoft would eventually buy Facebook and keep it close was dominating TechMeme. At this point, no announcement has been made so this is largely conjecture and, while an interesting opinion, it’s not really news. This editorial was largely in response to the news item that dominated the previous <a href="http://www.techmeme.com/080519/h0000">12 hour cycle</a> about Microsoft’s statements regarding pursuing possible deals with Yahoo! other than a full acquisition.</p>
<h3>One Month Ago: May 2, 2008</h3>
<p>On <a href="http://www.techmeme.com/080502/h1200">May 2, 2008 at noon</a>, the big news was… that the Google RSS reader is now available for the iPhone. I’m sure many people consider this event as a major turning point when… well, hmm… a big big deal. Amusingly, Adobe was also in the news that day, with news that its Flash plugin would escape computers and appear in set top boxes and mobile phones.</p>
<p>Another big subject was Steve Ballmer’s mention that Microsoft could go it alone without Yahoo, a discussion that dominated the <a href="http://www.techmeme.com/080502/h0000">midnight page on that day</a>. The Yahoo/Microsoft chat has been kind of the soap opera of our industry and this latest installment was remembered as a turning point (or not) by many.</p>
<p>Another possible trend piece, around midnight, was also intriguing: <a href="http://www.techmeme.com/080501/p101#a080501p101">Will Grand Theft Auto IV hurt Iron Man opening weekend sales</a>. I haven’t seen a follow-up on that piece yet, which could tell us whether video games are displacing movies as the primary form of entertainment, but my guess is that the answer is no.</p>
<h3>Two Months Ago: April 2, 2008</h3>
<p>On <a href="http://www.techmeme.com/080402/h1200">April 2, 2008 at noon</a>, the top story on TechMeme was about Intel’s plan for chips that would power more mobile devices. Interestingly, this story was largely driven by mainstream media as the lead was taken by John Markoff of the New York Times, followed by comments from Forbes magazine and InfoWorld. The other related story was the press release itself, which can be seen as bloggers pointing straight to the source of the news. I suspect that this story will probably have more legs moving forward. A cursory look provides glimpses of developing stories ranging from the rumor stage (that all important Google/Skype partnership or acquisition… which didn’t happen) to the focus on process (like the approval of Office Open XML as an ISO standard).</p>
<p>The departure of Google’s CIO dominated the <a href="http://www.techmeme.com/080402/h0000">prior night’s news cycle</a> and word of Apple’s 3G iPhone started to filter through.</p>
<h3>Three Months Ago: March 2, 2008</h3>
<p><a href="http://www.techmeme.com/080302/h1200">March 2, 2008 at noon</a> provides us perspective on today’s news, thanks to Microsoft’s announcement of its entry into the web-based office suite market. When put side by side with <a href="http://www.readwriteweb.com/archives/adobe_launches_online_office_suite.php">today’s announcement by Adobe</a>, it seems to start pointing to more of a trend. Beyond that, there is little news that seems memorable.</p>
<p>The interesting thing here is that the same subject was leading the <a href="http://www.techmeme.com/080302/h0000">previous night’s news cycle</a>. This seems to establish a first rule for TechMeme: <strong>subjects that survive on the front page more than 12 hours may be worth paying attention to</strong>.</p>
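<p>That rule of thumb can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each snapshot is represented as a set of topic labels assigned by hand, the snapshots are taken 12 hours apart, and the function name <code>persistent_topics</code> is invented for the example.</p>

```python
def persistent_topics(snapshots):
    """Return the topics that appear in two consecutive snapshots.

    `snapshots` is a list of sets of topic labels, ordered in time and
    assumed to be taken 12 hours apart.
    """
    survivors = set()
    for earlier, later in zip(snapshots, snapshots[1:]):
        # A topic survives if it is on the front page both times.
        survivors.update(earlier.intersection(later))
    return survivors
```

<p>The hard part in practice is deciding when two differently worded headlines count as the same topic, which this sketch leaves to whoever assigns the labels.</p>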
<h3>Six Months Ago: December 2, 2007</h3>
<p>There’s an old saying in journalism that 3 items make for a trend. In the case of this study, it looks like web-based office suites are definitely the hottest trend around, as the <a href="http://www.techmeme.com/071202/h1200">top news on December 2, 2007 at noon</a> was information about the future of Google’s offering in that space (either that or there is an unwritten rule in the technology field that information about web-based office suites MUST be introduced on the second day of the month or wait until the following month).</p>
<p>The subject was starting to climb the chart <a href="http://www.techmeme.com/071202/h0000">12 hours earlier</a>, even though the discussion at the time was dominated by a Facebook misstep (remember Facebook Beacon? Well, that was around that time). From an interface standpoint, it also brings up something that I’d like to recommend to Gabe: could you add an up or down arrow to highlight whether a subject is getting more play or not? On something like this, it would be nice to get an idea of the stickiness of a topic. It appears many topics start low on the page and move up over time; how quickly they move up seems to indicate the importance of the story, and it would be a nice addition to have that info on the screen.</p>
<h3>Nine Months Ago: September 2, 2007</h3>
<p>September 2, 2007 was a quiet news day. I guess everyone was mourning the death of the newspaper, which was forced by Google on that day, according to the <a href="http://www.techmeme.com/070902/h1200">noon-time headlines</a>. There doesn’t seem to have been any other major news <a href="http://www.techmeme.com/070902/h0000">around midnight</a> either. This, however, could be an artifact in the data as September 2, 2007 was a Sunday, which is generally a pretty quiet news day as most people don’t work on Sunday.</p>
<p>Interestingly, a story that is just now starting to get more notice is the continuing brushfires around PayPal’s outages. Not that sexy a subject but <a href="http://www.techmeme.com/070903/h0000">one that started to be raised around that time</a>. At the time, <a href="http://www.techmeme.com/070903/h1200">discussion of Google’s entry in the mobile market</a> centered around the idea they would deliver a device instead of a platform.</p>
<h3>Last Year and Two Years Ago</h3>
<p><a href="http://www.techmeme.com/070602/h1200">A year ago, at noon</a>, the TechMeme conversation was around porn. <a href="http://www.techmeme.com/070602/h0000">During the night</a>, though, the conversation centered on the acquisition of Feedburner by Google. This is probably remembered by people in the industry as an important milestone and here, TechMeme shines at organizing a package with the appropriate conversations.</p>
<p>Things do not improve much if you go further back: 2 years ago, at <a href="http://www.techmeme.com/060602/h1200">noon</a>, and <a href="http://www.techmeme.com/060602/h0000">midnight</a>, gives us little to mull over.</p>
<h3>Conclusion</h3>
<p>The data seems to suggest that the front page of TechMeme largely represents what’s hot right now but does not necessarily highlight stories which have a longer term impact. In that sense, it may also be highlighting that discussions in the tech blogosphere are largely centered on insider-type minutiae while failing to put things in a larger context. This appears to present a myopic view of the tech world that leaves us with lots of data but precious little information. So while TechMeme provides a useful tool in terms of getting an idea of the pulse of the conversation “right now,” it does little in providing data that would allow someone to understand the larger trends that are affecting our world as a result of the internet (and web 2.0 revolution).</p>
<p>I would argue that the answer to the question I posed in the title of this post is a resounding yes. Because it deals largely with the trivial and assigns little value to longer term impact, TechMeme imposes a myopia on its readers and participants. A possible exception is when a story manages to survive through multiple 12-hour instances, providing many angles on the same events. But those events are few and far between.</p>
<p>Whether the lack of headlines with a major impact is a phenomenon that is unique to TechMeme or to the tech world in general is a question I’d like to leave to readers and I’d appreciate comments as to your thinking around this.</p>
<p>But all this comes down to a simple fact: if you’ve missed what happened on TechMeme in the last XX hours, days or weeks, you may not necessarily have missed much. So kick back, relax, step away from the computer and, if you need to catch up, you can always pick up a mainstream publication that may cover a distilled version of what happened if it’s of any particular significance.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2008/06/02/is-techmeme-myopic/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
		<item>
		<title>Economic Activity in Virtual Worlds</title>
		<link>http://www.tnl.net/blog/2006/07/31/economic-activity-in-virtual-worlds/</link>
		<comments>http://www.tnl.net/blog/2006/07/31/economic-activity-in-virtual-worlds/#comments</comments>
		<pubDate>Tue, 01 Aug 2006 03:53:38 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[China]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Sony]]></category>
		<category><![CDATA[United States]]></category>
		<category><![CDATA[e - commerce]]></category>
		<category><![CDATA[eBay]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2006/07/31/economic-activity-in-virtual-worlds/</guid>
		<description><![CDATA[Over the last few months, I’ve been trying to get a better understanding of what is happening with the concept of virtual worlds. Let me go into more details as to why I think this phenomenon has some real potentials. In this first entry in a series, I will explore the economic activity surrounding this [...]<p><p><i><a href="http://tnl.net/who" rel="author" title="Who is Tristan Louis?">Tristan Louis</a> is the founder and CEO of <a href="http://www.keepskor.com" title="Keepskor">Keepskor</a> and  writes the influential <a href="http://www.tnl.net/" title="tnl.net">tnl.net</a> weblog, where this was initially posted under the title <a href="http://www.tnl.net/blog/2006/07/31/economic-activity-in-virtual-worlds/">Economic Activity in Virtual Worlds</a>. You can follow him on twitter <a href="https://twitter.com/TNLNYC">here</a> or receive his weekly newsletter by subscribing <a href="http://eepurl.com/gb6zD">here</a>.</i></p>
</p>
]]></description>
			<content:encoded><![CDATA[<p>Over the last few months, I’ve been trying to get a <a href="http://www.tnl.net/blog/2006/03/31/where-virtual-and-physical-meet/">better</a> <a href="http://www.tnl.net/blog/2006/05/15/future-tense-participatory-applications/">understanding</a> of what is happening with the concept of virtual worlds. Let me go into more detail as to why I think this phenomenon has real potential. In this first entry in a series, I will explore the economic activity surrounding this phenomenon.</p>
<h3>Size of the market</h3>
<p>When talking about virtual worlds, I am focusing on the new space created by the gaming industry that allows players to create online avatars and interact with other players in a fully immersive environment. From an economic standpoint, estimates range from around US$100 million to a high of <a href="http://papers.ssrn.com/sol3/papers.cfm?abstract_id=294828">US$1.5 billion a year</a>. These are not insignificant numbers and they point to an emerging phenomenon and potentially the rise of a new industry, with its own set of marketplaces, gatherers, owners, creators, and marketers.</p>
<h3>Marketplaces</h3>
<p>To understand virtual world marketplaces, one must first understand what is going on in those virtual worlds. When a player sets up an account, he or she is given a basic set of skills. As he or she progresses and interacts with the virtual world and its denizens, the player gains more and more skills and goods. However, this type of interaction requires time. Some people have figured that, because time is money, the amount of time spent in a virtual world could be converted into real hard currency. Thus was born the concept of <a href="http://en.wikipedia.org/wiki/Real-money_trading">Real Money Trading (aka RMT)</a>, whereby players go to specialized sites and buy virtual goods with real financial currency.</p>
<p>The action initially started on auction sites like <a href="http://video-games.shop.ebay.com/Games-/139973/i.html?_armrs=1&#038;_dmd=1&#038;_mdo=Video-Games&#038;_mspp=&#038;_pcats=1249&#038;_sop=3">eBay</a>, where characters or other virtual goods range in price from a few cents to several thousands of dollars. Because the trades were largely unregulated, some companies, like Sony, decided to set up their own exchange while others (Internet Game Exchange, <a href="http://www.mogs.com/">Massive Online Gaming Sales</a>, <a href="http://www.tekgaming.com/">Tek Gaming Supplies</a>, <a href="http://www.swagvault.com/">Swag Vault</a>, and <a href="https://gamersloot.net/catalog/">Gamers’ Loot</a>) have created specialized marketplaces to cater to this new phenomenon. This, in turn, has led to the rise of two new classes of activities: informational ones that provide analysis on the financial goings-on in those worlds, and arbitrage, whereby companies use people in the developing world to build up assets they resell to people in the developed world. Let’s go deeper into those areas.</p>
<h4>Information Sites</h4>
<p>There is now a nascent information industry surrounding the costs of goods in virtual worlds. For example, Eyes on Mogs is a shopping search engine for virtual goods. All the attributes of other search engines are part of it, including comparison shopping, comparisons of the different vendors, pricing, delivery date, and buy it now info. GameUSD tracks the financial value of virtual currencies over time, providing price trends across not only the provider but also the alternative marketplaces. MMOfx claims to track “over 18,000 price quotes daily” and provide information on the fluctuation of virtual currencies.</p>
<h4>Arbitrage</h4>
<p>Another type of economic activity to have arisen out of the marketplace phenomenon is the arbitrage of virtual work. As the primary pursuit in these worlds is the acquisition of wealth, status or levels, an emerging market has arisen to give people with real money a chance to bypass the time investment required to acquire those things. For example, <a href="http://www.nytimes.com/2005/12/09/technology/09gaming.html?ex=1291784400&amp;en=48a72408592dffe6&amp;ei=5088" title="Ogre to Slay? Outsource it to China">Chinese workers get paid between $75 and $250 a month to work in World of Warcraft, in 12-hour shifts, “killing onscreen monsters and winning battles, harvesting artificial gold coins and other virtual goods</a>. Affluent online gamers who lack the time and patience to work their way up to the higher levels of gamedom are willing to pay the young Chinese to play the early rounds for them.” Similarly, Romanian players can make a living wage (the ABC News story I linked to says that $200 is a good wage for Romania) on the same kind of activity.</p>
<p>Edward Castronova, the leading economist on the subject of money in virtual worlds, has been quoted as saying that “They’re exploiting the wage difference between the U.S. and China for unskilled labor.” What is basically happening here is that these companies have found a niche in the global marketplace to accumulate goods at a low cost and resell them at a premium. This type of arbitrage has been the way a lot of developing markets have revolutionized industries, from the export of manufacturing capabilities in the 20th century to the export of some service jobs nowadays. It’s a natural phenomenon and shows that those marketplaces are starting to develop a high level of maturity, which should be noticed by a lot more people.</p>
<h3>Virtual Goods Ownership</h3>
<p>Beyond the buying and selling of virtual goods in virtual marketplaces, there is also an emerging trend in the real estate business, which can be broken down into three main groups: real-estate owners, creators and integrators, and marketers.</p>
<p><a href="http://secondlife.com/?v=1.1">Second Life</a> is a virtual world more focused on the social aspect of virtual environments than on the goal-oriented aspect of missions and war-craft. <a href="http://money.cnn.com/magazines/fortune/fortune_archive/2005/11/28/8361953/index.htm">Fortune Magazine reported last year about the interesting case of Anshe Chung</a>, a character created by a German woman who has accumulated more than US$200,000 in virtual land holdings in Second Life. After developing the property, she rents it out to other people. Similarly, the <a href="http://news.bbc.co.uk/2/hi/technology/4421496.stm">BBC reports that a 23-year-old spent £13,770 in Project Entropia and recouped his investment in under a year</a>. In fact, the land rush has been so strong that <a href="http://secondlife.com/land/pricing.php">Second Life has built a model around land use fees</a>, generating a nice chunk of income in the process.</p>
<p>While visiting this world, I’ve talked to people who had few problems paying $75 per month to Linden Lab for those fees. This is pretty incredible when you think that all they are buying is a portion of disk space on a server. In a way, the real estate market presented by those virtual worlds can be seen as a hosting fee in a 3D environment and could represent a high growth market (in a future entry, I will look at the opportunities in the Virtual Spaces in more detail).</p>
<h3>The Integration Model</h3>
<p>Another nascent portion of this new industry is the integration game. As with any new technology, developing and managing something in a virtual world is an endeavor that requires specialized skills. New companies like <a href="http://www.electricsheepcompany.com">The Electric Sheep Company</a> and Space Think Dream have emerged as developers/integrators, offering their services to other companies. Their main business is to use the skills they’ve acquired to help existing companies experiment in these new worlds. This is, in a way, similar to the type of work that was done by early web design agencies, treating virtual worlds as a new interface either to existing systems or to create a new value proposition.</p>
<p>Other companies have emerged with the sole purpose of selling digital goods in those worlds. <a href="https://id.secondlife.com/openid/cc?n=0&#038;going_next=https%3A%2F%2Fxstreetsl.com%2Fauth_start.php%3Fredirect%3Dhttps%253A%252F%252Fxstreetsl.com%252F%26openid_identifier%3Dhttps%253A%252F%252Fid.secondlife.com%252Fid%252Fanonymous&#038;session=af6ba9c1-ec62-6094-65da-5b12da9e68f0">SLexchange</a> is a virtual market where people can buy and sell such goods. Similarly, the Electric Sheep Company has created SLBoutique as a competitor to SLexchange. What is interesting here is that there is a whole ecosystem building around Second Life, allowing other companies to prosper based on this new platform. This is similar to what has happened with eBay and allows us to better understand SecondLife as a platform for e-commerce rather than just a game, a fact that <a href="http://andrewkeen.typepad.com/aftertv/2006/07/interview_with_.html">Philip Rosedale, CEO of LindenLab and the power behind Second Life, likes to emphasize</a>. This explains why the company has <a href="http://www.siliconbeat.com/entries/2006/03/28/linden_lab_raises_11_million_to_go_more_mainstream.html">received investments</a> from people like <a href="http://www.blogcharm.com/index.php" class="broken_link">Amazon.com CEO Jeff Bezos</a>, Lotus founder Mitch Kapor, eBay founder Pierre Omidyar, and <a href="http://scobleizer.com/2006/07/28/why-ozzie-doesnt-think-the-web-is-the-be-all-and-end-all/">Microsoft CTO Ray Ozzie</a>. Those people understand that this is a new emerging platform and <a href="http://www.businessweek.com/technology/content/mar2006/tc20060328_688225.htm">could see potentially high return on their investment</a>.</p>
<h3>Bridging the gap</h3>
<p>The development of virtual worlds as a new platform is starting to take shape. Companies and organizations like <a href="http://www.secretlair.com/index.php?/clickableculture/entry/american_apparel_establishes_second_life_island/">American Apparel</a>, <a href="http://news.bbc.co.uk/2/hi/technology/4766755.stm">the BBC</a>, Major League Baseball, NASA, <a href="http://www.jeff-barr.com/?p=537">The American Cancer Society</a>, <a href="http://aws.typepad.com/aws/2006/07/life2life_ecspo.html">Amazon.com</a> and <a href="http://wellsupdate.wellsfargo.com/m/p/wls/ibk/sc.asp">Wells Fargo</a> are starting to experiment in that space. Increasingly, virtual worlds are becoming not only <a href="http://www.secretlair.com/index.php?/clickableculture/entry/harvard_business_review_on_avatar_based_marketing/">a new way to market</a> but also a new integration point for e-commerce.</p>
<p>Some of the virtual worlds (<a href="http://www.techdirt.com/articles/20060502/0937209.shtml">Project Entropia, for example</a>) have even gone as far as issuing ATM cards that allow denizens of those worlds to take virtual money and trade it for real money that they can use for regular economic activity.</p>
<h3>Conclusion</h3>
<p>With large amounts of real currency already moving through virtual worlds, we are looking at a major new economic phenomenon that parallels the initial development of the commercial web and the rise of software as platform in the last few years.</p>
<p>With a new ecosystem forming around some of the virtual worlds, there is a fair amount of incentive for a lot of people to see this phenomenon succeed. SecondLife will probably be an early winner in this race, largely due to how quickly it has managed to get other companies to rely on it. A few more established companies are also early in staking ground in this new space and will probably reap rich rewards for their efforts, expanding their brand into those virtual spaces.</p>
<p>While it may appear that this is largely a subculture of gaming, the phenomenon is much more widespread. In my next entry, I will go through the demographic profile of denizens of those virtual spaces, showcasing a rich and varied texture to this phenomenon.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2006/07/31/economic-activity-in-virtual-worlds/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>5 opportunities around social networks</title>
		<link>http://www.tnl.net/blog/2006/06/30/5-opportunities-around-social-networks/</link>
		<comments>http://www.tnl.net/blog/2006/06/30/5-opportunities-around-social-networks/#comments</comments>
		<pubDate>Sat, 01 Jul 2006 02:54:34 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Social Networks]]></category>
		<category><![CDATA[eBay]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2006/06/30/5-opportunities-around-social-networks/</guid>
		<description><![CDATA[In previous entries, I looked at the benefits and issues with social networks. As they move forward, here’s a list of opportunities relating to social networks. 5. Data Mining/Research A main attribute of social networks is how much data people provide to them. On top of it, this data and the interaction of users on [...]
]]></description>
			<content:encoded><![CDATA[<p>In previous entries, I looked at the benefits and issues with social networks. As they move forward, here’s a list of opportunities relating to social networks.</p>
<h3>5. Data Mining/Research</h3>
<p>A main attribute of social networks is how much data people provide to them. On top of it, this data and the interaction of users on those networks are rich fodder for data mining. For example, researchers recently used <a href="http://www.wheresgeorge.com/">Where’s George</a>, a website tracking dollar bills in the real world, to <a href="http://www.newscientist.com/article/dn8636">assess how disease spreads</a>. Similarly, <a href="http://www.linkedin.com">LinkedIn</a> provides its users with demographic/geographic data about members of their social network.</p>
<p>Traditional companies spend millions of dollars trying to understand the flow of people, flow of ideas (or memes) and how to exploit them. From <a href="http://www.smallworldexperiment.com/2007/07/welcome_16.html">Milgram’s small world experiment</a> to the success of “<a href="http://www.amazon.com/dp/0316346624/?tag=tnlnetinassociwi">The Tipping Point</a>” by Malcolm Gladwell, there has been a fairly large body of research in this area but, for what may be the first time in history, there is now a heavy trove of data that can be analyzed.</p>
<h3>4. Problem Solving</h3>
<p>Sites like <a href="http://answers.google.com/answers/">Google Answers</a> are working on providing better answers to questions. Add in some social network glue and one could figure out whether the person is a subject matter expert in the area he/she is answering the question about. For example, you might want to trust an individual with strong network ties in technology on questions related to technology but might be a little more wary of answers that person would provide about medical care (and similarly, you might trust a doctor more about medical care than you would a computer geek). Social networks, when seen through the lens of expertise, can provide quick access to answers from subject matter experts in one area. It is impossible to know everything but you might have a friend of a friend of a friend who has the answer in a specific area you are researching.</p>
<p>Similarly, social networks can provide a way to get subject matter experts to connect and work collectively on difficult problems. When combined with <a href="http://digg.com/news">digg</a>-like features, social networks could become a way to speed up the vetting process on scientific publications by allowing a large set of peers to review articles and rank them according to value. This, in itself, could help humanity make radical moves forward in the area of scientific research.</p>
<p>Take, for example, my friends at ACOR, who have been thinking of developing, in partnership with the National Cancer Institute, a data-mining system that analyzes information about patients to identify potential root causes for different cancers. Here, we see social networks (in this case, via mailing lists that are finely targeted) potentially being useful to help advance science and hopefully discover some root causes for cancer. A set of tools for such a granular community could help a scientist, for example, send a questionnaire to a sub-segment of the population to test a hypothesis (e.g. “let’s see if people who have skin cancer and drank more than 1 glass of milk a day are reacting better to this type of drug?”) before deciding to do a clinical trial. If a specialized social network for such a community were created, there might be no end to how much data could be gathered. Think of it as a shotgun approach to medicine.</p>
<h3>3. Marketing</h3>
<p>Marketing, of course, is all about deep knowledge of the audience. The best way to market a message is to discover what motivates people and how to craft the message to match the motivations. When combined with <a href="http://battellemedia.com/archives/2003/11/the_database_of_intentions">the database of intentions</a>, a social network can work as a set of focus groups for messages. Testing different messages on a narrow audience can allow people to better market their products.</p>
<h3>2. Reputation Management</h3>
<p>The old adage that “it’s not what you know, it’s who you know” is at the core of social networking. As more and more people are online and more and more interactions are happening between people with weak ties, assessing a person’s reputation is increasingly important. LinkedIn has keyed in on that effort by giving people the ability to “endorse” members of their social network, providing more information about how a person performed in a particular job. In a similar fashion, profiles on eBay allow buyers and sellers to assess the track record of a buyer or seller before making a transaction. Endorsements from one’s strong ties generally carry much more weight than those from someone you don’t know. Thus, social networks can work as the glue to reputation management. It is not enough for people to know that a person is seen as important by some random stranger but when one discovers that their friends or colleagues have endorsed a particular individual, they tend to trust those opinions more heavily.</p>
<p>Let’s take a pedestrian example: imagine you need to get some electrical work done in your house but don’t know any electricians. By looking at your social network, you could find such an expert with ease as the best electrician might be linked to your friends. In a way, social networks are just an extension of asking people for recommendations. Which brings me to the last opportunity on this list.</p>
<h3>1. Recommendation</h3>
<p>Recommendation is a very powerful driver of decision making: whether it is for hiring a person, picking a new product, or finding a general direction, humans tend to look to their existing network and do a subconscious “most-like” analysis of the information they receive. For example, Amazon has been very successful with the “people who bought this also bought…” and “people who looked at this also looked at…” features. As they gather more data, patterns emerge.</p>
<p>Similar approaches can be taken into the search space (where what people linked to or clicked on is ranked higher than other stuff) and in other areas like music (<a href="http://www.last.fm">last.fm</a> comes to mind) or other media consumption (for example, the success of aggregators like Digg, <a href="http://www.techmeme.com">techmeme</a> or tailrank can be attributed in large part to the need people have to know what other people think is good).</p>
<h3>Conclusion</h3>
<p>Social networks should not really be a set of standalone tools but they are essential to building the next set of applications that leverage the power of the crowds. As such, social networking should be a feature and not an end-goal unto itself. The companies that understand this basic rule will be the ones that succeed in that space, leveraging opportunities created by social networks in a fashion that will provide unprecedented benefits.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2006/06/30/5-opportunities-around-social-networks/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Building Buzz</title>
		<link>http://www.tnl.net/blog/2006/03/21/building-buzz/</link>
		<comments>http://www.tnl.net/blog/2006/03/21/building-buzz/#comments</comments>
		<pubDate>Tue, 21 Mar 2006 20:55:15 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Energy]]></category>
		<category><![CDATA[Finance]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[MP3]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Sony]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2006/03/21/building-buzz/</guid>
		<description><![CDATA[Apple has it. Google has it. Microsoft fails at it. Yahoo! sometimes does and sometimes doesn’t. What I am talking about is buzz and coolness. It seems every time Apple or Google introduces a new product, the buzz is high. For example, Apple recently introduced a $350 speaker and, while the reaction was more tepid [...]
]]></description>
			<content:encoded><![CDATA[<p>Apple has it. Google has it. Microsoft fails at it. Yahoo! sometimes does and sometimes doesn’t. What I am talking about is buzz and coolness.</p>
<p>It seems every time Apple or Google introduces a new product, the buzz is high. For example, Apple recently introduced <a href="http://www.apple.com/itunes/">a $350 speaker</a> and, while the reaction was more tepid than it has been for other Apple products, no one seemed to point out that the emperor was looking very, very naked. Yet, Microsoft keeps throwing out new products and few people seem to be very interested (no matter how Scoble tries to browbeat us into thinking of Microsoft as cool).</p>
<p>Similarly, today, Google introduced <a href="http://www.google.com/finance">a finance section</a> that mimicked much of what <a href="http://finance.yahoo.com">Yahoo! Finance</a> has been doing for years. It has a couple of nice AJAX-based features but, all in all, it’s not enough of an improvement to be considered something that could potentially dominate the tech news cycle. And yet, every major tech pub or mainstream publication has covered the release.</p>
<p>Why?</p>
<h3>Trying to divine the source of coolness</h3>
<p>What Google and Apple seem to have understood is that there are ways to make oneself look cool. I’m going to try to lay out some of the things I’ve seen (and I hope that others will chime in in the comments):</p>
<h4>Rumor Mill</h4>
<p>First, let the rumors float or give the appearance that you don’t want rumors spreading. Google Finance has long been a rumored product (as is Google payment, for example) but no word ever came out of the company about their intentions. In fact, Google is relatively stingy in terms of providing advance information about their products. They have learned to let the rumors run wild, leaving their competitors tearing their hair out trying to divine what Google will do next.</p>
<p>Apple takes a different approach to this. In the past, the company has been relatively ruthless in its attempts to shut leaks down. However, it seems that, when leaks are presenting compelling products and the company doesn’t really have anything to announce, Apple is happy to let the rumor mill run wild. So, before the release of the iSpeaker, uh, iPod Hi-Fi, Apple did not crack down on rumors about a new video iPod.</p>
<p>The two approaches speak to two different traits: one is to be extremely secretive about your actions and the other is to let rumors go wild as long as they paint a picture of your company that is far cheerier than its reality.</p>
<h4>The one feature</h4>
<p>When selling technology, there are two publics to serve: the early adopters, and the general public. The early adopters are a fickle bunch but they can have some influence on the general public. So giving the early adopters one feature that they will like is an important part of creating good buzz. Similarly, when dealing with the general public, emphasize the one feature that makes your product different. It doesn’t have to be something that is actually innovative (many companies were making MP3 players years before the iPod; many companies have offered services (other than search) which did what Google did in categories like mail, web hosting, classifieds, news, etc.) but it has to be presented as such. The early adopters may groan but they are eventually drowned out by the masses.</p>
<p>Thus, Apple did not release a featureless MP3 player without a screen; they released the “Shuffle,” which allowed people to get a little more randomness out of their music collection. Apple didn’t release a $350 speaker; they released a Hi-Fi system that will work with an iPod (iPod sold separately). Similarly, Google did not release a Geocities retread; they released Pages, a cool online web editor and page hosting service. They did not release a me-too version of finance: look at the cool graphs they have.</p>
<p>I may sound a little cynical in that last paragraph but I believe it is this kind of cynicism that infuses the marketing of cool products. They may not be the top technology in the market but they are different. And emphasizing that they are different gives a chance to the users to feel like they, too, are different.</p>
<h4>Cool by association</h4>
<p>The next item on the list, in terms of generating buzz, is to create an appearance of exclusivity from the get-go. Thus, Apple does not complain too much when the police report a rise in iPod thefts, due to the high visibility of the white headphones (see, our product is so popular, people steal it). Similarly, Google did not offer a free web-mail service for all; you had to receive an invitation.</p>
<p>By creating a certain level of exclusivity or belonging to a certain tribe, Apple and Google have managed to go beyond the product. They’ve created an aura of cool in being associated with them. When a new product comes out, you have to check it out or you will be out of the loop. The trend folds on itself, ensuring that future product launches benefit from the buzz of previous product launches. Over time, the duds are forgotten, and the companies are seen as innovative.</p>
<h4>Look! Feel!</h4>
<p>One of the other things to consider, when creating some level of buzz, is the fizz and whiz of look and feel. Apple is known for designing beautiful computers (in the mainstream PC world, only Sony puts as much thought into how their computers look). The energy they put into the design allows them to bypass some of the technology issues that other vendors would encounter.</p>
<p>Similarly, Google has become an expert at using AJAX for their interfaces. As a result, new products generally look more polished than the competition. In the case of the Finance application, it was interesting to see <a href="http://www.internetoutsider.com/2006/03/google_finance_.html#comment-15261337">comments by people in the financial space</a> on the performance of the product in terms of delays giving stock quote prices, etc. However, few users would drill in and discover that stock prices were about 20–25 behind, or that</p>
<h3>What value does buzz have?</h3>
<p>At the end of the day, though, much remains to be seen about the value of such buzz. While Apple generates a lot of buzz about its computers, it still only retains between 5 and 10 percent of the market. Similarly, while Google has generated much buzz for all its new products, its bread and butter is still revenue from advertising on the search engine. So the question that still needs to be considered is whether buzz has value beyond the introduction of a new product and what that value translates to in terms of real dollars.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2006/03/21/building-buzz/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Technorati 100 Here Today Gone Tomorrow</title>
		<link>http://www.tnl.net/blog/2006/02/21/technorati-100-here-today-gone-tomorrow/</link>
		<comments>http://www.tnl.net/blog/2006/02/21/technorati-100-here-today-gone-tomorrow/#comments</comments>
		<pubDate>Tue, 21 Feb 2006 08:59:59 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2006/02/21/technorati-100-here-today-gone-tomorrow/</guid>
		<description><![CDATA[Based on the recent discussion about new gatekeepers, I recently wondered whether we were just deluding ourselves in thinking that there were gatekeepers. What provoked this line of thinking was a recent comment by Doc Searls in which he says that “being an alpha blogger was like being an alpha paramecium.” This pushed me to [...]
]]></description>
			<content:encoded><![CDATA[<p>Based on the recent discussion about new gatekeepers, I recently wondered whether we were just deluding ourselves in thinking that there were gatekeepers. What provoked this line of thinking was <a href="http://doc-weblogs.com/2006/02/18">a recent comment by Doc Searls</a> in which he says that “being an alpha blogger was like being an alpha paramecium.” This pushed me to analyze the movement in rank within the Technorati 100. As frequent readers of this blog know, I did <a target="_blank" href="http://www.tnl.net/blog/2005/06/01/secrets-of-the-a-list-bloggers-technorati-links/" title="Secrets of the A-List Bloggers: Technorati Links">a study back in May 2005</a>, in which I analyzed linkage to members of the Technorati 100. Using this data as a point in time, I have now decided to revisit the list and see how much movement happened.</p>
<p>The first thing to do was to map out which of the May 19, 2005 members were still on the list. The results looked like this:</p>
<table border="1" summary="Technorati 100 - May 19, 2005">
<tr>
<th>Blog Title</th>
<th>Position 5/19/05</th>
<th>Position 2/20/06</th>
</tr>
<tr>
<td>Boing Boing</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>InstaPundit</td>
<td>2</td>
<td>12</td>
</tr>
<tr>
<td>Daily Kos</td>
<td>3</td>
<td>5</td>
</tr>
<tr>
<td>Gizmodo</td>
<td>4</td>
<td>9</td>
</tr>
<tr>
<td>Fark</td>
<td>5</td>
<td>23</td>
</tr>
<tr>
<td>Engadget</td>
<td>6</td>
<td>2</td>
</tr>
<tr>
<td>Davenetics</td>
<td>7</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Eschaton</td>
<td>8</td>
<td>36</td>
</tr>
<tr>
<td>Dooce</td>
<td>9</td>
<td>15</td>
</tr>
<tr>
<td>Andrew Sullivan</td>
<td>10</td>
<td>51</td>
</tr>
<tr>
<td>The Best Page In The Universe</td>
<td>11</td>
<td>52</td>
</tr>
<tr>
<td>Talking Points Memo: by Joshua Micah Marshall</td>
<td>12</td>
<td>26</td>
</tr>
<tr>
<td>lgf: anti-idiotarian</td>
<td>13</td>
<td>35</td>
</tr>
<tr>
<td>kottke.org</td>
<td>14</td>
<td>21</td>
</tr>
<tr>
<td>WIL WHEATON DOT NET</td>
<td>15</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Metafilter</td>
<td>16</td>
<td>47</td>
</tr>
<tr>
<td>Doc Searls</td>
<td>17</td>
<td>92</td>
</tr>
<tr>
<td>(In)formacao e (In)utilidade</td>
<td>18</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Wonkette</td>
<td>19</td>
<td>25</td>
</tr>
<tr>
<td>Scripting News</td>
<td>20</td>
<td>95</td>
</tr>
<tr>
<td>Power Line</td>
<td>21</td>
<td>33</td>
</tr>
<tr>
<td>Balmasque</td>
<td>22</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Corante</td>
<td>23</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>A List Apart</td>
<td>24</td>
<td>17</td>
</tr>
<tr>
<td>Something Awful</td>
<td>25</td>
<td>44</td>
</tr>
<tr>
<td>Megatokyo</td>
<td>26</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Michelle Malkin</td>
<td>27</td>
<td>10</td>
</tr>
<tr>
<td>Arts and Letters Daily</td>
<td>28</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Gawker</td>
<td>29</td>
<td>19</td>
</tr>
<tr>
<td>Afterall it was the best I ever had</td>
<td>30</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>The Volokh Conspiracy</td>
<td>31</td>
<td>74</td>
</tr>
<tr>
<td>Scobleizer</td>
<td>32</td>
<td>34</td>
</tr>
<tr>
<td>Jeffrey Zeldman</td>
<td>33</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>This Modern World</td>
<td>34</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>The Web Standards Project</td>
<td>35</td>
<td>57</td>
</tr>
<tr>
<td>Joel on Software</td>
<td>36</td>
<td>39</td>
</tr>
<tr>
<td>Media Matters for America</td>
<td>37</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Television without pity</td>
<td>38</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Kuro5hin</td>
<td>39</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Lileks</td>
<td>40</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Hugh Hewitt</td>
<td>41</td>
<td>55</td>
</tr>
<tr>
<td>Joel Veitch</td>
<td>42</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Truthout</td>
<td>43</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Baghdad Burning</td>
<td>44</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Buzz machine</td>
<td>45</td>
<td>60</td>
</tr>
<tr>
<td>fleugel</td>
<td>46</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Informed Comment</td>
<td>47</td>
<td>93</td>
</tr>
<tr>
<td>Doppler: redefining podcasting</td>
<td>48</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>geek and proud</td>
<td>49</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>loadmemory (Asian site)</td>
<td>50</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Photojunkie</td>
<td>51</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Ross Rader</td>
<td>52</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>The Truth Laid Bear</td>
<td>53</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Joi Ito</td>
<td>54</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>ScrappleFace</td>
<td>55</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>LexText</td>
<td>56</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Google Blog</td>
<td>57</td>
<td>8</td>
</tr>
<tr>
<td>Xbox</td>
<td>58</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>My life in a Bush of Ghosts</td>
<td>59</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Astronomy picture of the day</td>
<td>60</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Crooked Timber</td>
<td>61</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Vodka Pundit</td>
<td>62</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Captain’s Quarters</td>
<td>63</td>
<td>70</td>
</tr>
<tr>
<td>A small victory</td>
<td>64</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Gato Fedorento</td>
<td>65</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Mezzoblue</td>
<td>66</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>PostSecret</td>
<td>67</td>
<td>4</td>
</tr>
<tr>
<td>Samizdata.net</td>
<td>68</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Lawrence Lessig</td>
<td>69</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Counterpunch</td>
<td>70</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Democratic Underground</td>
<td>71</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Right Wing News</td>
<td>72</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>StopDesign</td>
<td>73</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>iBiblio</td>
<td>74</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Samizdata.net (mistake?)</td>
<td>75</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Abrupto</td>
<td>76</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>gene7299 (Asian MSNSpaces site)</td>
<td>77</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Where is Raed</td>
<td>78</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>B3TA: We love the web</td>
<td>79</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Talkleft</td>
<td>80</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Wizbang</td>
<td>81</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>m1net (MSN spaces site)</td>
<td>82</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Hoder</td>
<td>83</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>CTRL+Alt+Del</td>
<td>84</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Brad DeLong</td>
<td>85</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Blogs for Bush</td>
<td>86</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Neil Gaiman</td>
<td>87</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Gothamist</td>
<td>88</td>
<td>85</td>
</tr>
<tr>
<td>Thought Mechanics</td>
<td>89</td>
<td>7</td>
</tr>
<tr>
<td>IMAO</td>
<td>90</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Dan Gillmor (old weblog)</td>
<td>91</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>HINAGATA</td>
<td>92</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Dean’s World</td>
<td>93</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Defamer</td>
<td>94</td>
<td>53</td>
</tr>
<tr>
<td>USS Clueless</td>
<td>95</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Dive into Mark</td>
<td>96</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Pandagon</td>
<td>97</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Blogging.la</td>
<td>98</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Why are you worshipping the ground I b<span>log on?</span></td>
<td>99</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>Daring Fireball</td>
<td>100</td>
<td>&nbsp;</td>
</tr>
</table>
<p>This provided me with a departure point but it wasn’t really getting at what I wanted. Obviously, a fair number of people had changed position. So I decided to take a cut of the same data on the 20th of February and start mapping out movement. It looked as follows:</p>
<table border="1" summary="Technorati 100 - February 20, 2006">
<tr>
<th>Position on 2/20/06</th>
<th>Name</th>
<th>Position on 5/19/05</th>
</tr>
<tr>
<td>1</td>
<td><a href="http://www.boingboing.net/">Boing Boing</a></td>
<td>1</td>
</tr>
<tr>
<td>2</td>
<td><a href="http://www.engadget.com/">Engadget</a></td>
<td>6</td>
</tr>
<tr>
<td>3</td>
<td><a href="http://www.filelodge.com/">File Lodge</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>4</td>
<td><a href="http://www.postsecret.com/">PostSecret</a></td>
<td>67</td>
</tr>
<tr>
<td>5</td>
<td><a href="http://www.dailykos.com/">Daily Kos</a></td>
<td>3</td>
</tr>
<tr>
<td>6</td>
<td><a href="http://www.huffingtonpost.com/">The Huffington Post</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>7</td>
<td><a href="http://www.thoughtmechanics.com/">Thought Mechanics</a></td>
<td>89</td>
</tr>
<tr>
<td>8</td>
<td><a href="http://googleblog.blogspot.com/">Official Google Blog</a></td>
<td>57</td>
</tr>
<tr>
<td>9</td>
<td><a href="http://gizmodo.com/">Gizmodo</a></td>
<td>4</td>
</tr>
<tr>
<td>10</td>
<td><a href="http://michellemalkin.com/">Michelle Malkin</a></td>
<td>27</td>
</tr>
<tr>
<td>11</td>
<td><a href="http://www.beppegrillo.it/">Blog di Beppe Grillo</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>12</td>
<td><a href="http://pajamasmedia.com/instapundit/">Instapundit</a></td>
<td>2</td>
</tr>
<tr>
<td>13</td>
<td><a href="http://crooksandliars.com/">Crooks and Liars</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>14</td>
<td><a href="http://lifehacker.com/">Lifehacker</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>15</td>
<td><a href="http://www.dooce.com/">dooce</a></td>
<td>9</td>
</tr>
<tr>
<td>16</td>
<td>Herramientas para Blogs</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>17</td>
<td><a href="http://www.alistapart.com/">A List Apart</a></td>
<td>24</td>
</tr>
<tr>
<td>18</td>
<td><a href="http://thinkprogress.org/">Think Progress</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>19</td>
<td><a href="http://gawker.com/">Gawker</a></td>
<td>29</td>
</tr>
<tr>
<td>20</td>
<td><a href="http://MSN-SA.spaces.live.com">MSN-SA (MSN Spaces)</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>21</td>
<td><a href="http://www.kottke.org/">kottke.org</a></td>
<td>14</td>
</tr>
<tr>
<td>22</td>
<td><a href="http://shiraishi.seesaa.net/">shiraishi.seesaa.net</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>23</td>
<td><a href="http://www.fark.com/">Fark</a></td>
<td>5</td>
</tr>
<tr>
<td>24</td>
<td><a href="http://av.watch.impress.co.jp">AV Watch Title Page</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>25</td>
<td><a href="http://www.wonkette.com/">Wonkette</a></td>
<td>19</td>
</tr>
<tr>
<td>26</td>
<td><a href="http://www.talkingpointsmemo.com/">Talking Points Memo: by Joshua Micah Marshall</a></td>
<td>12</td>
</tr>
<tr>
<td>27</td>
<td><a href="http://thespacecraft.spaces.live.com">The Space Craft</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>28</td>
<td><a href="http://www.joystiq.com/">Joystiq</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>29</td>
<td><a href="http://www.thesuperficial.com/">The Superficial</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>30</td>
<td><a href="http://techcrunch.com/">TechCrunch</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>31</td>
<td><a href="http://www.weebls-stuff.com/">Weebls Stuff News</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>32</td>
<td><a href="http://manabekawori.cocolog-nifty.com/blog/">manabekawori (Japanese)</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>33</td>
<td><a href="http://powerlineblog.com/">Power Line</a></td>
<td>21</td>
</tr>
<tr>
<td>34</td>
<td><a href="http://scobleizer.com/" class="broken_link">Scobleizer</a></td>
<td>32</td>
</tr>
<tr>
<td>35</td>
<td><a href="http://littlegreenfootballs.com/weblog/">lgf</a></td>
<td>13</td>
</tr>
<tr>
<td>36</td>
<td><a href="http://www.eschatonblog.com/">Eschaton</a></td>
<td>8</td>
</tr>
<tr>
<td>37</td>
<td><a href="http://cn.autoblog.com/">Autoblog China</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>38</td>
<td><a href="http://blogoscoped.com">Google Blogoscoped</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>39</td>
<td><a href="http://www.joelonsoftware.com/">Joel on Software</a></td>
<td>36</td>
</tr>
<tr>
<td>40</td>
<td><a href="http://xiaxue.blogspot.com/">Xiaxue</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>41</td>
<td><a href="http://www.americablog.com/">AMERICAblog</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>42</td>
<td>atnewz.jp</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>43</td>
<td>WRETCH Blog</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>44</td>
<td><a href="http://www.somethingawful.com/">Something Awful</a></td>
<td>25</td>
</tr>
<tr>
<td>45</td>
<td><a href="http://blogs.yahoo.co.jp/nosz50j">nosz50j</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>46</td>
<td><a href="http://www.overheardinnewyork.com/">Overheard in New York</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>47</td>
<td><a href="http://www.metafilter.com/">Metafilter</a></td>
<td>16</td>
</tr>
<tr>
<td>48</td>
<td><a href="http://cuteoverload.com/">Cute Overload</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>49</td>
<td><a href="http://www.paulgraham.com/">Paul Graham</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>50</td>
<td><a href="http://www.tuaw.com/">The Unofficial Apple Weblog</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>51</td>
<td><a href="http://andrewsullivan.theatlantic.com/">Andrew Sullivan</a></td>
<td>10</td>
</tr>
<tr>
<td>52</td>
<td>The Best Page In The Universe.</td>
<td>11</td>
</tr>
<tr>
<td>53</td>
<td><a href="http://defamer.gawker.com/">Defamer</a></td>
<td>94</td>
</tr>
<tr>
<td>54</td>
<td>Mark’s Sysinternals Blog</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>55</td>
<td><a href="http://hughhewitt.com/blog/">Hugh Hewitt</a></td>
<td>41</td>
</tr>
<tr>
<td>56</td>
<td><a href="http://www.techdirt.com/">Techdirt.</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>57</td>
<td><a href="http://www.webstandards.org/">The Web Standards Project</a></td>
<td>35</td>
</tr>
<tr>
<td>58</td>
<td><a href="http://www.stuffonmycat.com/">Stuff On My Cat</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>59</td>
<td><a href="http://gigaom.com/">Om Malik</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>60</td>
<td><a href="http://www.buzzmachine.com/">BuzzMachine</a></td>
<td>45</td>
</tr>
<tr>
<td>61</td>
<td><a href="http://www.break.com/">Break.com</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>62</td>
<td>Dr Dave</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>63</td>
<td><a href="http://trent.blogspot.com/">Pink Is The New Blog</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>64</td>
<td><a href="http://www.microsiervos.com/">Microsiervos</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>65</td>
<td><a href="http://37signals.com/svn">Signal vs. Noise (by 37signals)</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>66</td>
<td><a href="http://www.micropersuasion.com/">Micro Persuasion</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>67</td>
<td><a href="http://blogcritics.org/">Blogcritics.org</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>68</td>
<td><a href="http://www.poynter.org/">Poynter Online</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>69</td>
<td>excite.co.jp/News/odd</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>70</td>
<td><a href="http://hotair.com/">Captain’s Quarters</a></td>
<td>63</td>
</tr>
<tr>
<td>71</td>
<td><a href="http://blog.makezine.com/">MAKE: Blog</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>72</td>
<td>Aamukaste</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>73</td>
<td><a href="http://battellemedia.com/">John Battelle</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>74</td>
<td><a href="http://volokh.com/">The Volokh Conspiracy</a></td>
<td>31</td>
</tr>
<tr>
<td>75</td>
<td><a href="http://tpmcafe.talkingpointsmemo.com/">TPMCafe</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>76</td>
<td><a href="http://www.dumpalink.com/" class="broken_link">dumpalink.com</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>77</td>
<td><a href="http://iammew.spaces.live.com">iammew</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>78</td>
<td><a href="http://sethgodin.typepad.com/seths_blog/">Seth Godin</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>79</td>
<td><a href="http://hcy521.spaces.live.com">hcy521</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>80</td>
<td><a href="http://blog.searchenginewatch.com/">Search Engine Watch</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>81</td>
<td><a href="http://www.nationalreview.com/corner">The Corner on National Review Online</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>82</td>
<td><a href="http://toothpastefordinner.com/">toothpaste for dinner</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>83</td>
<td><a href="http://blog.livedoor.jp/aki09041/">aki09041</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>84</td>
<td><a href="http://slim.spaces.live.com">slim</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>85</td>
<td><a href="http://gothamist.com/">Gothamist</a></td>
<td>88</td>
</tr>
<tr>
<td>86</td>
<td><a href="http://yaplog.jp/strawberry2/">strawberry2</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>87</td>
<td><a href="http://www.autoblog.com/">Autoblog</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>88</td>
<td><a href="http://www.vgcats.com/">VG Cats</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>89</td>
<td><a href="http://yarnharlot.ca/blog/">Yarn Harlot</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>90</td>
<td><a href="http://www.bildblog.de/">BILDblog</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>91</td>
<td><a href="http://www.aintitcool.com/">Ain’t It Cool News</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>92</td>
<td>The Doc Searls Weblog</td>
<td>17</td>
</tr>
<tr>
<td>93</td>
<td><a href="http://www.juancole.com/">Informed Comment</a></td>
<td>47</td>
</tr>
<tr>
<td>94</td>
<td><a href="http://www.rathergood.com/">Rather Good</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>95</td>
<td><a href="http://www.scripting.com/">Scripting News</a></td>
<td>20</td>
</tr>
<tr>
<td>96</td>
<td><a href="http://www.semiologic.com/">Semiologic</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>97</td>
<td><a href="http://www.we-make-money-not-art.com/">we make money not art</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>98</td>
<td><a href="http://waiterrant.net/">waiterrant.net</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>99</td>
<td><a href="http://atsuya-furuta.blog.so-net.ne.jp">atsuya furuta</a></td>
<td>&nbsp;</td>
</tr>
<tr>
<td>100</td>
<td><a href="http://www.treehugger.com/">Treehugger</a></td>
<td>&nbsp;</td>
</tr>
</table>
<p>This provided me with two points in time: one in May 2005 and one in February 2006, 9 months later. If the theory of gatekeepers held true, the two lists should have been pretty consistent.</p>
<p>What the data showed, however, was that the Technorati 100 list is a very dynamic one. Let’s take a look at some of the moves.</p>
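<p>For readers who want to repeat the comparison, here is a minimal Python sketch of how movement between the two snapshots can be labelled. The five sample rows are copied from the February 2006 table above; the function name and the labels it returns are illustrative assumptions, not anything Technorati provides.</p>

```python
from typing import Optional

# Sample rows copied from the February 2006 table above, stored as
# (May 2005 position, February 2006 position). None marks a blog that
# had no May 2005 position, i.e. a new entrant.
SAMPLE = {
    "Boing Boing": (1, 1),
    "Engadget": (6, 2),
    "PostSecret": (67, 4),
    "Daily Kos": (3, 5),
    "The Huffington Post": (None, 6),
}


def classify_move(old_rank: Optional[int], new_rank: int) -> str:
    """Label a blog's movement between the two snapshots (rank 1 is best)."""
    if old_rank is None:
        return "new"
    if new_rank == old_rank:
        return "steady"
    # A smaller number is a better position, so a positive difference is a climb.
    return "up" if old_rank - new_rank > 0 else "down"


# Hypothetical helper output: one label per sample blog.
moves = {name: classify_move(old, new) for name, (old, new) in SAMPLE.items()}
```

<p>Running the same function over all 100 rows of the February table gives the groups discussed below: blogs that held steady, climbers, decliners and new entrants.</p>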
<h3>Boing Boing: King of the blogosphere</h3>
<p>Only one blog, Boing Boing, managed to hold its position steady over the last 9 months. Sitting in the top spot, it looks like it won’t move for a long time to come.</p>
<h3>The movers and shakers</h3>
<p>In this new list, 9 blogs successfully moved up in the last 9 months. They are:</p>
<ul>
<li>Engadget (from 6 to 2)</li>
<li>PostSecret (67 to 4)</li>
<li>Thought Mechanics (89 to 7)</li>
<li>Official Google Blog (57 to 8)</li>
<li>Michelle Malkin (27 to 10)</li>
<li>A List Apart (24 to 17)</li>
<li>Gawker (29 to 19)</li>
<li>Defamer (94 to 53)</li>
<li>Gothamist (88 to 85)</li>
</ul>
<p>Those were all blogs that appeared on both lists and managed to climb in the ranks. More surprising, however, was the fact that 65 new blogs appeared on the list, each a new claimant to the title of top blogger. A quick analysis seems to point to Asian blogs becoming a major force, one that I personally have not heard much about in discussions of the evolution of the blogosphere. <a href="http://blog.technorati.com">David Sifry’s State of the Blogosphere</a> did not cover this type of movement in its last overview. I don’t know if he deliberately decided to ignore the data or whether he did not see it as that important, but I consider this a pretty powerful observation. In a world where globalisation is key, the blogosphere has not yet fully grappled with the impact of the Asia-Pacific region, and there will probably be some interesting discussion around this in the future.</p>
<p>From a legacy standpoint, it also seems that upward moves are not evenly distributed across the list. The following table shows how the upward moves of legacy blogs were distributed, based on where each blog landed:</p>
<table border="1" summary="Upward move distribution">
<tr>
<td>Top 10</td>
<td>5</td>
</tr>
<tr>
<td>Top 25</td>
<td>7</td>
</tr>
<tr>
<td>Top 50</td>
<td>7</td>
</tr>
<tr>
<td>Bottom 10</td>
<td>0</td>
</tr>
<tr>
<td>Bottom 25</td>
<td>1</td>
</tr>
<tr>
<td>Bottom 50</td>
<td>2</td>
</tr>
</table>
<p>So being in the top half of the list seems to make it easier to move up, which would give some credence to a network effect. However, because we are talking about such a small segment of the population, it is impossible to draw any meaningful conclusion from the data.</p>
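<p>The distribution above can be recomputed from the list of climbers. The sketch below is only an illustration: it assumes each bucket is an inclusive range of February 2006 positions on a 100-entry list (for example, "Bottom 25" meaning positions 76 through 100), and the names used are hypothetical.</p>

```python
from typing import Dict, Iterable

# February 2006 positions of the nine climbers listed above.
CLIMBER_POSITIONS = [2, 4, 7, 8, 10, 17, 19, 53, 85]

# Inclusive position ranges for each bucket of a 100-entry list.
# The labels mirror the table; the bounds are an assumed reading of them.
BUCKETS = {
    "Top 10": (1, 10),
    "Top 25": (1, 25),
    "Top 50": (1, 50),
    "Bottom 10": (91, 100),
    "Bottom 25": (76, 100),
    "Bottom 50": (51, 100),
}


def bucket_counts(positions: Iterable[int]) -> Dict[str, int]:
    """Count how many positions fall inside each (overlapping) bucket."""
    ranks = list(positions)
    return {
        label: sum(1 for rank in ranks if rank in range(first, last + 1))
        for label, (first, last) in BUCKETS.items()
    }


counts = bucket_counts(CLIMBER_POSITIONS)
```

<p>Because the buckets overlap (every top 10 blog is also a top 25 and top 50 blog), the counts are cumulative and do not add up to nine.</p>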
<h3>The endangered list</h3>
<p>While 65 blogs have already dropped off the list, the following 25 blogs are in danger over the next 9 months, as they have all dropped in the rankings over the last 9:</p>
<ul>
<li>Daily Kos (from 3 to 5)</li>
<li>Gizmodo (4 to 9)</li>
<li>Instapundit (2 to 12)</li>
<li>Dooce (9 to 15)</li>
<li>Kottke (14 to 21)</li>
<li>Fark (5 to 23)</li>
<li>Wonkette (19 to 25)</li>
<li>Talking Points Memo (12 to 26)</li>
<li>Power Line (21 to 33)</li>
<li>Scobleizer (32 to 34)</li>
<li>LGF (13 to 35)</li>
<li>Eschaton (8 to 36)</li>
<li>Joel On Software (36 to 39)</li>
<li>Something Awful (25 to 44)</li>
<li>Metafilter (16 to 47)</li>
<li>Andrew Sullivan (10 to 51)</li>
<li>Best Page in the Universe (11 to 52)</li>
<li>Hugh Hewitt (41 to 55)</li>
<li>The Web Standards Project (35 to 57)</li>
<li>BuzzMachine (45 to 60)</li>
<li>Captain’s Quarters (63 to 70)</li>
<li>The Volokh Conspiracy (31 to 74)</li>
<li>Doc Searls (17 to 92)</li>
<li>Informed Comment (47 to 93)</li>
<li>Scripting News (20 to 95)</li>
</ul>
<p>The interesting thing about that drop is that it seems to affect members across the whole list in a similar fashion. A quick breakdown of the drops, again based on where each blog landed, shows no clear advantage in being near the top of the list versus being closer to the bottom:</p>
<table border="1" summary="downward move distribution">
<tr>
<td>Top 10</td>
<td>2</td>
</tr>
<tr>
<td>Top 25</td>
<td>7</td>
</tr>
<tr>
<td>Top 50</td>
<td>15</td>
</tr>
<tr>
<td>Bottom 10</td>
<td>3</td>
</tr>
<tr>
<td>Bottom 25</td>
<td>3</td>
</tr>
<tr>
<td>Bottom 50</td>
<td>10</td>
</tr>
</table>
<p>More interesting is that this number is low compared to the number of blogs that disappeared <em>completely</em> from the top 100. That number stands at 65 and breaks down as follows:</p>
<table border="1" summary="disappearance distribution">
<tr>
<td>From top 10</td>
<td>1</td>
</tr>
<tr>
<td>From top 25</td>
<td>5</td>
</tr>
<tr>
<td>From top 50</td>
<td>20</td>
</tr>
<tr>
<td>From bottom 50</td>
<td>45</td>
</tr>
<tr>
<td>From bottom 25</td>
<td>22</td>
</tr>
<tr>
<td>From bottom 10</td>
<td>9</td>
</tr>
</table>
<h3>A dynamic list</h3>
<p>If you take those numbers, it means that a total of 90 blogs (25 dropping within the list and another 65 dropping off the list completely) ended up with a lower position in 9 months. Combined with the fact that 9 blogs moved up, this means that 99 percent of the list was dynamic.</p>
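<p>The arithmetic behind that figure is simple enough to check in a few lines of Python. This is just a sketch using the counts quoted in this post; the variable names are illustrative.</p>

```python
LIST_SIZE = 100       # entries in the Technorati 100

dropped_within = 25   # blogs that fell but stayed on the list
dropped_off = 65      # blogs that left the list entirely
moved_up = 9          # blogs that climbed

fell = dropped_within + dropped_off   # blogs with a lower position
changed = fell + moved_up             # blogs whose position changed at all
steady = LIST_SIZE - changed          # blogs that did not move
churn_share = changed / LIST_SIZE     # fraction of the list that moved
```

<p>The single blog left over in <code>steady</code> is Boing Boing, which held the top spot in both snapshots.</p>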
<p>This, to me, was a pretty stunning revelation: while there is much obsession about who is and isn’t on those lists, it seems that their nature is a lot more dynamic than expected. Going beyond that, it also looks like being on top is no guarantee that you will stay there (if anything, it is a guarantee that you will not, as 9 out of 10 blogs fell and 65 percent disappeared from the list altogether).</p>
<p>Because the overwhelming majority of the blogs listed in May 2005 experienced a downward slide, it seems that the concept of a network effect is widely overstated. In fact, there seems to be the equivalent of a reverse pull, where being in the Technorati 100 is only a short-lived glory.</p>
<p><i><a href="http://tnl.net/who" rel="author" title="Who is Tristan Louis?">Tristan Louis</a> is the founder and CEO of <a href="http://www.keepskor.com" title="Keepskor">Keepskor</a> and writes the influential <a href="http://www.tnl.net/" title="tnl.net">tnl.net</a> weblog, where this was initially posted under the title <a href="http://www.tnl.net/blog/2006/02/21/technorati-100-here-today-gone-tomorrow/">Technorati 100 Here Today Gone Tomorrow</a>. You can follow him on Twitter <a href="https://twitter.com/TNLNYC">here</a> or receive his weekly newsletter by subscribing <a href="http://eepurl.com/gb6zD">here</a>.</i></p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2006/02/21/technorati-100-here-today-gone-tomorrow/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Googling Netscape</title>
		<link>http://www.tnl.net/blog/2006/02/01/googling-netscape/</link>
		<comments>http://www.tnl.net/blog/2006/02/01/googling-netscape/#comments</comments>
		<pubDate>Wed, 01 Feb 2006 08:16:23 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[AOL]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Internet Explorer]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Wall Street]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2006/02/01/googling-netscape/</guid>
		<description><![CDATA[The Google stock is getting hurt in after hours trading as the company’s earnings disappointed Wall Street. It was to be expected but now is the time for executives at Google to look at history and, hopefully, not repeat it. The history I am talking about, in particular, is that of a company that was [...]<p><i><a href="http://tnl.net/who" rel="author" title="Who is Tristan Louis?">Tristan Louis</a> is the founder and CEO of <a href="http://www.keepskor.com" title="Keepskor">Keepskor</a> and writes the influential <a href="http://www.tnl.net/" title="tnl.net">tnl.net</a> weblog, where this was initially posted under the title <a href="http://www.tnl.net/blog/2006/02/01/googling-netscape/">Googling Netscape</a>. You can follow him on Twitter <a href="https://twitter.com/TNLNYC">here</a> or receive his weekly newsletter by subscribing <a href="http://eepurl.com/gb6zD">here</a>.</i></p>
]]></description>
			<content:encoded><![CDATA[<p>The Google stock is getting hurt in after hours trading as the company’s earnings disappointed Wall Street. It was to be expected but now is the time for executives at Google to look at history and, hopefully, not repeat it. The history I am talking about, in particular, is that of a company that was in a similar position about a decade ago: Netscape.</p>
<p>Before I go any further, I want to offer a huge disclaimer: I’m a pretty big fan of some Google products. One can see Google ads running on this site (I’m an AdSense user) and a portion of my traffic gets here thanks to Google’s search engine. I’m also a big user of the search engine, I have a Gmail account (although it is not my primary email system) and I use Google Maps and Google News often. I’ve played with the search API in the past and, for the most part, I’ve been happy with my overall Google experience. However, I worry that the company is heading in the wrong direction, and I want it to remain a viable player because it has re-ignited investment in the search space, which has benefited all users on the Internet. I fear that, if it is not careful, it could suffer a fate similar to that of Netscape, which popularized web browsing and ended up being gobbled up by AOL, where it is now a shadow of its former self.</p>
<p>That said, let’s look at some of the disturbing similarities.</p>
<h3>Market Shares are no guarantee</h3>
<p>In the early days of the commercial Internet (let’s say 1996), Netscape was a very successful company. It had beaten every Wall Street expectation and completed a stock offering that had captured the imagination of the general public. The Netscape management graced the covers of most magazines in America and the little browser that could (then in version 2.0) had captured an impressive 75+ percent of the market. Netscape had also introduced its own line of web servers, with a proprietary language called LiveWire, which allowed developers to create more dynamic applications. The company was also offering a web page development tool, and had struck partnerships with many companies to integrate their audio and video components with the browser.</p>
<p>Microsoft had come out with Windows 95, which included a browser (Internet Explorer) that it had licensed from an outside source (NCSA, where Marc Andreessen had worked prior to Netscape; the browser was Mosaic, an early web browser Marc had been involved with). The world had mostly laughed at the pitiful version 1.0 offering from Redmond. It was simply a bad product, and version 2.0 did little to redeem it.</p>
<h3>Microsoft on the Offensive</h3>
<p>The folks at Netscape were feeling pretty smug. After all, they dominated the browser market and had found a way to sell server products, and comments about the upcoming irrelevance of Microsoft started making the rounds. But the giant was awake and the clouds over Redmond only covered a flurry of activity. By the time IE 3.0 was released, most people had written Microsoft off. If it couldn’t get as simple a piece of code as a browser to catch up, how could it have a chance to survive?</p>
<p>Netscape had come out with version 3.0 and it was good, if a little bloated from the everything-but-the-kitchen-sink approach the company was taking. Netscape was now offering an Internet suite that included a browser, a mail client, a newsreader client, an IRC client, some groupware capabilities, and so on. There was no way Microsoft could catch up.</p>
<p>Netscape Navigator 4 came out and it was good. It was running Java applets, it could do DHTML, etc. Basically, people liked it and didn’t see a reason to switch…</p>
<p>But Microsoft released <a href="http://www.tnl.net/who/bibliography/ie4.php">IE 4.0</a> and it was better than people expected. It matched the Netscape browser feature for feature and threw in a few extras. One of the people in charge of that development was a guy by the name of Yusuf Mehdi, who now happens to be the head of MSN.</p>
<p>While Microsoft had fired a major shot with that new browser, everyone expected that all that would change again when Netscape 5 came out.</p>
<p>Netscape 5 never came out. In fact, Microsoft released IE 5.0 and started gaining market share (stealing it from Netscape). Netscape seemed to be trapped in its own legacy and had problems getting a new product out. Microsoft released IE 5.5 while Netscape was still working on a rewrite of its product.</p>
<p>Finally, <a href="http://www.tnl.net/blog/2000/04/05/netscape-navigator-60-better/" title="TNL.net: Review of Netscape 6">Netscape 6 came out</a>, conveniently skipping a version. Was it the answer to Microsoft that all had hoped? Not quite and by that point it was too late.</p>
<p>Netscape never recovered and now lives as a shadow of its former self. Microsoft put out a 6.0 version of its browser, capturing the last parts of the market it wanted, and then went to sleep, in terms of browsers, until the recent competitive threat of Firefox reared its head, eating up some of its hard-earned market share.</p>
<p>So what went wrong? The answer is complex but I believe that a mix of hubris (we can beat Microsoft, we have a huge market share) combined with some sloppy releases, the development of a bit of a monoculture (we set the agenda, the industry will follow), an unwillingness to deal with massive competitive threats, a loss of focus on core assets, and a media world that loves to take down the companies it has built up all added up.</p>
<h3>How does this apply to Google?</h3>
<p>For starters, it is clear that a massive market share is no guarantee of success. Google currently holds around 60 percent of the search market, which is good but is also a reason for concern, as it is more likely that this share will go down than that it will go up.</p>
<p>More worrisome, however, is the development of the Google monoculture. Much of what is going on at Google is happening with little involvement and input from the community. This is where Microsoft generally starts striking. Say what you want about the Redmond giant, it knows how to listen and how to take brutal feedback and turn it into a decent product. Microsoft is not known for great products but it is known for decent ones. Last week, Microsoft organized Search Champs, gathering a bunch of smart people from the industry in a room and having them talk to the company. I was there and was surprised by how focused Microsoft is on winning this one. It is the kind of focus I have not seen from the company since the browser wars.</p>
<p>If it wants to survive, Google needs to do something similar. Throwing a product out to the world with the word beta on it is not a feedback loop. Sitting down with users, developers and thought leaders is. The feedback is not always good but it helps improve the product, which is how one wins this war. Furthermore, the goodwill generated by getting people invested in its products and their success allows a company to develop a strong following among a small group of dedicated users, who then serve as advocates in the marketplace. They can have an impact in changing opinion, and not involving them can be dangerous.</p>
<p>Of those people, developers tend to be the most finicky. Alas, the success of many platforms on the Internet depends on developers. As developers go, so tends the marketplace, because developers tend to be early adopters. Developers were the first people to switch from Yahoo to Altavista. They were the first group to switch from Altavista to Google. Where will they go next? Is it guaranteed that they will stay with Google? (This is an interesting case, however, as developers tend to have a bias against Microsoft. The corollary is that Microsoft has to offer something radically better in order to make gains in the developer world.) A good way for Google to mend some of the rift with the development community would be to support RSS along with Atom as a syndication format. At the current time, Google is the only major search engine without native RSS support.</p>
<p>Another area to watch out for is the loss of focus. Could someone at Google please explain to me how the Google Pack, Google WiFi, Google IM or the Google Web Accelerator fit Google’s mission (to organize the world’s information)? How does owning a radio advertising business (something they acquired recently) fit in that model? It seems that Google is trying to do a lot of things in a lot of areas. I’m sure they’re all interesting things, but what does that do to the core search assets on which the business was built? (Or is it that search is just a side business and Google’s mission is really about advertising?) There has been much discussion in the search world about the relevancy of results in the Google search engine suffering from some level of degradation. As always, expectations are high and any decrease (or lack of improvement) in the quality of the search index will be seen as a loss of focus.</p>
<p>Recalling Netscape’s sloppy releases, Google also has to worry about better testing before putting products out. Its recent stumbles with the release of Google NewsReader and Google Analytics showed the world products that were not fully ready for market release. The market’s acceptance of the word beta goes only so far, and Google may suffer some reputational damage if it continues along a curve of release first and fix it later. (This, however, is not necessarily a standalone cause for failure, as we’ve learned from the release of many Microsoft products that needed their own round of stabilization.)</p>
<p>Last but not least is the burning glare of the media world and of Wall Street. As can be seen, now that lofty (and, one could add, unrealistic) expectations could not be met, punishment (in the form of a declining stock price) is coming. Similarly, the press is getting more critical. This is part of a normal cycle: a company is hyped up and then taken down. These are just fads (ask your friends at Yahoo!, who have managed to go through the whole cycle and are starting to go back through a build-up phase now).</p>
<p>And, as a postscript, take the advice of pundits like myself with a grain of salt. There are lessons to be learned but I can’t guarantee that these are the right ones to learn. However, what is certain is that Google needs to remain a viable player in search, if for no other reason than to keep companies like Microsoft honest. As we’ve seen in the browser wars, once a company wins, it tends to slow down on the innovation front, and search is still so young a field that it needs major progress on that front.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2006/02/01/googling-netscape/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Getting to Know You</title>
		<link>http://www.tnl.net/blog/2005/12/16/getting-to-know-you/</link>
		<comments>http://www.tnl.net/blog/2005/12/16/getting-to-know-you/#comments</comments>
		<pubDate>Fri, 16 Dec 2005 08:02:51 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Advertising]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2005/12/16/getting-to-know-you/</guid>
		<description><![CDATA[Google’s introduction of new extensions for Firefox is all about knowing more about some users. This week, Google introduced two new Firefox extensions: Google Safe Browsing and Blogger Web Comments which are providing richer integration with the desktop and a number of new features based on your surfing patterns. But the question, when looking at [...]<p><p><i><a href="http://tnl.net/who" rel="author" title="Who is Tristan Louis?">Tristan Louis</a> is the founder and CEO of <a href="http://www.keepskor.com" title="Keepskor">Keepskor</a> and  writes the influential <a href="http://www.tnl.net/" title="tnl.net">tnl.net</a> weblog, where this was initially posted under the title <a href="http://www.tnl.net/blog/2005/12/16/getting-to-know-you/">Getting to Know You</a>. You can follow him on twitter <a href="https://twitter.com/TNLNYC">here</a> or receive his weekly newsletter by subscribing <a href="http://eepurl.com/gb6zD">here</a>.</i></p>
</p>
]]></description>
			<content:encoded><![CDATA[<p>Google’s introduction of new extensions for Firefox is all about knowing more about some users.</p>
<p>This week, Google introduced two new Firefox extensions: <a href="http://www.google.com/tools/firefox/safebrowsing/index.html" title="Google Safe Browsing">Google Safe Browsing</a> and <a href="http://www.google.com/tools/firefox/webcomments/index.html" title="Blogger Web Comments">Blogger Web Comments</a> which are providing richer integration with the desktop and a number of new features based on your surfing patterns.</p>
<p>But the question, when looking at those, is why Google is interested in areas that don’t seem that close to search. The truth is that they are closely tied to Google’s business model, even though that is not totally clear at first look.</p>
<h3>The Google business model: advertising</h3>
<p>When you look at Google’s revenue, it becomes immediately clear that search is not really what the company is about: Google is in the business of advertising and search is the way in which it targets its advertising properly. Viewed through that prism, Google is an advertising company and advertising companies generally need a couple of things: eyeballs and data about those eyeballs.</p>
<p>The first part of this is easy to understand: eyeballs, to an advertising company, represent the inventory it has available for sale. However, eyeballs in and of themselves are pretty useless. The common misconception held by many companies in the late 90s was that eyeballs alone were important. The truth is that, without any other type of information, eyeballs are close to useless.</p>
<p>However, the more information you have about a set of eyeballs, the more useful it becomes. This was the realization that Google made when it turned the advertising model on its head by targeting ads based on search terms. Google then increased the number of eyeballs it could get by offering AdSense, a program that increased inventory and provided information back to Google about what people were looking at.</p>
<p>With each new member of the AdSense program, Google gets more information about Internet users. The more information it has about Internet users, the better it can target its advertising.</p>
<h3>Enter the add-ons</h3>
<p>In May, <a href="http://www.tnl.net/blog/2005/05/06/google-accelerates-search/" title="TNL.net: Google Accelerates Search">I posited that the Google Accelerator was about distributing the indexing work</a>. What I failed to realize at the time was that Google was also getting a lot of user information in the process: what do people look at, how long, etc… This information is extremely useful. However, the accelerator had some issues and failed to achieve high velocity.</p>
<p>More deals have followed, with large partnerships aimed at pushing the Google toolbar on as many desktops as possible. One could wonder why the toolbar is so important to Google. After all, they keep trying to get it bundled left and right (with Java, for example) and are pushing it very heavily in their search engine results page. The toolbar is all about getting more information about what people visit.</p>
<p>The new extensions are about the same thing: getting to know you better. <a href="http://www.google.com/tools/firefox/agreement.html" title="Google Firefox Extensions Agreement">The Google Firefox Extensions Agreement</a> spells it out very clearly:</p>
<blockquote><p>By using the Extensions, you acknowledge and agree that Google may access, preserve, and disclose information regarding your use of the services if required to do so by law or under other conditions set forth in the Google Privacy Policy</p></blockquote>
<p>Digging into the <a href="http://www.google.com/privacy.html" title="Google Privacy Policy">privacy policy</a> spells things out clearly (the emphasis is mine):</p>
<blockquote>
<ul>
<li>Google collects personal information when you register for a Google service or otherwise voluntarily provide such information. <em>We may combine personal information collected from you with information from other Google services or third parties</em> to provide a better user experience, including customizing content for you.</li>
<li>Google uses cookies and other technologies to enhance your online experience and to learn about how you use Google services in order to improve the quality of our services.</li>
<li><em>Google’s servers automatically record information when you visit our website or use some of our products, including the URL, IP address, browser type and language, and the date and time of your request</em>.</li>
</ul>
</blockquote>
<p>From here, we learn that Google aggregates data (no big surprise here) and can share it with third parties. Among the data are the URL you visited, your IP address (which can then provide some information about your physical location), and the language you use. Those are all good attributes to narrow down information about a user. For example, if someone looks at a lot of technical web sites, Google will know that this person might respond better to a technical ad. Over time, that information can be aggregated to get a better understanding of different groups and sell very targeted advertising. Let’s look at how Google uses this information (once again, emphasis is mine and this is from their privacy policy):</p>
<blockquote>
<ul>
<li>We may use personal information to provide the services you’ve requested, including services that display customized content and <em>advertising</em>.</li>
<li>We may also use personal information for <em>auditing, research and analysis to operate and improve Google technologies and services</em>.</li>
<li><em>We may share aggregated non-personal information with third parties outside of Google.</em></li>
<li>When we use third parties to assist us in processing your personal information, we require that they comply with our Privacy Policy and any other appropriate confidentiality and security measures.</li>
<li>We may also share information with third parties in limited circumstances, including when complying with legal process, preventing fraud or imminent harm, and ensuring the security of our network and services.</li>
<li>Google processes personal information on our servers in the United States of America and in other countries. In some cases, we process personal information on a server outside your own country.</li>
</ul>
</blockquote>
<p>If you remember my first point (Google is an advertising company), it starts to click. The technology and services they provide are not necessarily for the end user; they can also be for advertisers. This is why there is little worry about Google identifying you personally: being able to provide aggregated non-personal information to a third party is what advertising is all about.</p>
<p>In the television world, much of that work is being done by Nielsen (the infamous Nielsen ratings) to define what the audience of a show is and target the advertising properly. This is how a request like “give me around 100,000 eyeballs for men 18–24 in the New York area” yields an ad on a sports show about a New York sports team.</p>
<p>However, Google can do that better in that they can offer advertisers something along the lines of “17,000 eyeballs of 19-year-old men based in Manhattan, NY with an interest in the Knicks and the Xbox 360, who also read sports news three times last week from ESPN, like the Daily Show and bought hardware and books from Amazon.com in the last 30 days.” It may sound extreme but let me explain how it works:</p>
<ul>
<li>The bought hardware and books can be gathered from the fact that they looked at URLs on Amazon.com and ended up on a purchase path as a result of that session (this would all be URL info)</li>
<li>The same can be true of the ESPN.com and theDailyShow sites (gathered from the URL field)</li>
<li>The interests (Xbox, the Knicks) can be inferred from where they spend time in their online session or what they searched for on Google</li>
<li>The location (Manhattan, NY) can be inferred from the IP address they used during their surfing session (alternately, if they use Google WiFi, it can be gathered from the info that client has reported)</li>
<li>The 19-year-old men can be inferred from their email (Gmail) or from usage patterns that resemble those of other 19-year-old men (this is where the research and analysis part comes in)</li>
</ul>
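<p>The inference steps listed above can be sketched in a few lines of Python. This is only an illustrative sketch under loud assumptions: the <code>Visit</code> record, the lookup tables, the IP prefixes and the function names are all hypothetical and say nothing about how Google actually stores or processes its logs. It simply shows how URL and IP fields alone are enough to aggregate an audience segment.</p>

```python
from collections import Counter
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Visit:
    """One hypothetical log record: which IP requested which URL."""
    ip: str
    url: str


# Hypothetical lookup tables (assumptions for illustration only).
IP_PREFIX_TO_AREA = {"203.0.113.": "Manhattan, NY"}
HOST_TO_INTEREST = {
    "www.espn.com": "sports news",
    "www.amazon.com": "shopping",
    "www.xbox.com": "Xbox 360",
}


def area_for(ip):
    """Guess a coarse location from the IP prefix."""
    for prefix, area in IP_PREFIX_TO_AREA.items():
        if ip.startswith(prefix):
            return area
    return "unknown"


def build_profiles(visits):
    """Aggregate visits into one profile per IP: area plus interest counts."""
    profiles = {}
    for visit in visits:
        profile = profiles.setdefault(
            visit.ip, {"area": area_for(visit.ip), "interests": Counter()}
        )
        interest = HOST_TO_INTEREST.get(urlparse(visit.url).hostname)
        if interest is not None:
            profile["interests"][interest] += 1
    return profiles


def segment_size(profiles, area, interest, min_visits):
    """Count profiles in an area with at least min_visits hits on an interest."""
    return sum(
        1
        for profile in profiles.values()
        if profile["area"] == area
        and profile["interests"][interest] >= min_visits
    )
```

<p>Calling <code>segment_size(profiles, "Manhattan, NY", "sports news", 3)</code> would then answer the advertiser’s question without ever naming an individual, which is the “aggregated non-personal information” the privacy policy refers to.</p>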
<p>In fact, Google is so sure of its data that it will guarantee advertisers that, if their ad does not get a response, they don’t pay for it. The TV station doesn’t do that. Once it has run the ad and verified the audience, if it has met the number, you pay.</p>
<p>Let’s assume that you have a million dollars in advertising to try to sell your new widget (which targets that public): where would you put it?</p>
<p>The rich data set that Google is building has tremendous monetary value and that is why they keep pushing new clients that provide them with more info.</p>
<h3>Attention as value</h3>
<p>Because that data is very valuable, the emergence of organizations like <a href="http://attentiontrust.org/" title="Attention Trust">AttentionTrust</a> is one thing Google will have to deal with down the line. Such efforts put the power in the hands of the users and could potentially represent a threat to Google, if people start refusing to provide data to it. It will be interesting to see how Google deals with this new world.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2005/12/16/getting-to-know-you/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Copy and Print</title>
		<link>http://www.tnl.net/blog/2005/11/21/copy-and-print/</link>
		<comments>http://www.tnl.net/blog/2005/11/21/copy-and-print/#comments</comments>
		<pubDate>Mon, 21 Nov 2005 08:16:05 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Media]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[United States]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2005/11/21/copy-and-print/</guid>
		<description><![CDATA[As a member of both the New York Library and Creative Commons, I received a lot of advance notice about this week’s discussion entitled “The Battle Over Books: Authors and Publishers Take on the Google Print Library Project”. And, thanks to Larry Lessig, I got a chance to be in the audience during this match-up [...]<p><p><i><a href="http://tnl.net/who" rel="author" title="Who is Tristan Louis?">Tristan Louis</a> is the founder and CEO of <a href="http://www.keepskor.com" title="Keepskor">Keepskor</a> and  writes the influential <a href="http://www.tnl.net/" title="tnl.net">tnl.net</a> weblog, where this was initially posted under the title <a href="http://www.tnl.net/blog/2005/11/21/copy-and-print/">Copy and Print</a>. You can follow him on twitter <a href="https://twitter.com/TNLNYC">here</a> or receive his weekly newsletter by subscribing <a href="http://eepurl.com/gb6zD">here</a>.</i></p>
</p>
]]></description>
			<content:encoded><![CDATA[<p>As a member of both the New York Public Library and Creative Commons, I received a lot of advance notice about this week’s <a href="http://www.nypl.org/events/live-nypl" title="New York Public Library; The Battle Over Books">discussion entitled “The Battle Over Books: Authors and Publishers Take on the Google Print Library Project”</a>. And, thanks to Larry Lessig, I got a chance to be in the audience during this match-up, which forced me to reshape my thinking about Google, about Web 2.0, and about copyright regimes.</p>
<h3>Framing the debate</h3>
<p>The discussion centered largely around <a href="http://books.google.com/googlebooks/library.html" title="Google Print Library Project">the Google Print Library Project</a> and Google’s decision to scan books without first asking for authorization from the copyright holders. They do contend, however, that they will remove books from their index if the copyright holder asks them to do so. In the last few months, the Authors Guild and the Association of American Publishers have sued Google, alleging violations of copyright law.</p>
<p>Meanwhile, a separate effort set up by some of Google’s competitors (notably Yahoo! and Microsoft) and called <a href="http://www.opencontentalliance.org/" title="Open Content Alliance">the Open Content Alliance</a> has taken an opt-in approach to scanning copyright holdings, including only content that is no longer under copyright protection or content that has been expressly authorized by the copyright holder. This effort has not been sued by the two groups.</p>
<h3>Private vs. Public</h3>
<p>What is interesting here is that much of the debate really centers around an issue of public vs. private. Google is really creating a private holding out of content initially created by other people. While I initially was on the side of Google when I first heard about this debate, I am starting to wonder whether their position is correct. While it is a good thing that Google gives access to a way to search content which was not previously searchable, why is it OK for Google to not share that content with others? Why is it that they are not joining the Open Content Alliance and sharing access to content they have created? Why is it that they are creating a walled garden around content they did not create and only allow interaction with that content through Google? Those are questions that Google has not answered and that need to be answered if we are to trust the company’s unofficial “Don’t be evil” motto.</p>
<p>However, this is an issue that goes far beyond books when you start thinking about it. Google has largely been building a reputation based on its ability to search various types of data, assuming that the copyright holders were allowing them to do so. I first looked into that issue about 5 years ago, when <a href="http://www.tnl.net/blog/2000/10/22/double-trouble-for-dejacom/" title="TNL.net: Double Trouble for Deja.com">Deja News put out the “For Sale” sign</a>, which was eventually picked up by Google. What is interesting is that Google needs data. Without it, Google is useless: <em>the value of a search engine is related to how many assets it holds and how well it can organize them</em>. This is why <a href="http://www.tnl.net/blog/2005/09/27/google-has-24-billion-items-index-considers-msn-search-nearest-competitor/" title="TNL.net: Google has 24 billion items index, considers MSN search nearest competitor">size does matter</a> even though some now try to claim it no longer does.</p>
<p>I would go so far as to extrapolate that this is the biggest dilemma for most web 2.0 companies: as more and more of them rely on systems where the data is almost as important as how one interacts with it, they find themselves starving for data. However, they have to balance that with the ideal of being more transparent and sharing that data with other entities. The dilemma then becomes how to keep a private set of data in the public eye while keeping the public from stealing and/or misusing your private data.</p>
<p>I asked the panelists whether the issue was that Google was turning the authors’ data into private Google property and whether Google joining the Open Content Alliance would solve the problem. David Drummond, who was there representing Google, did not answer the question. Allan Adler, from the Association of American Publishers, stated that they would drop their objections (and thus potentially their lawsuit) if Google were to follow the established principles of the Open Content Alliance. In order to decipher that statement, I went back to the OCA’s website and looked for what those principles were. They are as follows:</p>
<blockquote>
<ul>
<li>The OCA will encourage the greatest possible degree of access to and reuse of collections in the archive, while respecting the rights of content owners and contributors.</li>
<li>Contributors will determine the terms and conditions under which their collections are distributed and how attribution should be made.</li>
<li>The OCA need not be obligated to accept all content that is offered to it and may give preference to that which can be made widely accessible.</li>
<li>The OCA will offer collection and item-level metadata of its hosted collections in a variety of formats.</li>
<li>The OCA welcomes efforts to create and offer tools (including finding aids, catalogs, and indexes) that will enhance the usability of the materials in the archive.</li>
<li>Copies of the OCA collections will reside in multiple archives internationally to ensure their long-term preservation and accessibility to all.</li>
</ul>
</blockquote>
<p>The last few words (“and accessibility to all”) are particularly interesting. These, I believe, may be a large part of the reason Google is not going to join the OCA.</p>
<p>In a way, Google is appropriating other people’s work (the actual content of the books) and creating a private property around it. Had Google created the content or provided the tools to do so, they might have a claim to being part of the creation. However, it seems that, in scanning the content, they are appropriating content which is not rightfully theirs without first asking for authority to do so. That can’t be right.</p>
<h3>What price for those rights?</h3>
<p>It is interesting that Google has wrapped its argument around the <a href="http://www.copyright.gov/title17/92chap1.html#107" title="Fair Use" class="broken_link">Fair Use</a> doctrine as the copyright office seems to clearly state that one of the factors to consider is</p>
<blockquote><p>the amount and substantiality of the portion used in relation to the copyrighted work as a whole</p></blockquote>
<p>The reason that is interesting is that it points to an issue in terms of whether they are infringing or not. Considering the fact that they do have to copy the works in full in order to be successful in their undertaking, it seems that they would indeed be in infringement under a strict reading of that section of Copyright law.</p>
<p>One of the items overlooked by most of the media coverage is the question of price for the rights. Larry Lessig, during an exchange with Nick Taylor, of the Authors Guild, stated that he feared that the Authors Guild and the Association of American Publishers would eventually settle their lawsuit with Google. This fear is well grounded when one realizes that the majority of lawsuits are settled out of court but it gains extra weight if there is a potential that Google will lose. To understand Lessig’s fears, however, one has to go one step further and start looking into the effect of such a settlement. First of all, Google is rich (as of this writing, Google had a market capitalization sitting north of $100 billion); there is nothing wrong with that, except for the fact that they can pay a lot more than other companies could. If they were to settle with the authors and publishers for a lot of money (which is what the receiving parties will be pushing for), they will create a precedent whereby rights that previously were available for free will now have a fairly hefty price tag.</p>
<p>This is not only bad for people trying to develop new businesses to compete with Google but has a potential for being bad for democracy in general as it might create two different groups in a society: those who can pay for access to certain content and those who can’t. In the long run, that sounds like a pretty evil thing to me and this is, once again, where a system that is accessible to all and collections that reside in multiple archives are important prerequisites. If the authors and publishers are serious about being remunerated for their work, they are going to have to play this one for the long run. What it means is that settling is not an option! They must see this case all the way through to the Supreme Court of the United States. The reason this is necessary is that, if they settle, they change the negotiation from one where they are of equal weight to an asymmetric one where Google has all the power (because it keeps the access locked down). In the future, Google could decide whether and when those authors and publishers have a say in that relationship. This is very dangerous. In a way, the relationship is one that fits a prisoner’s dilemma scheme nicely, showing that the only solution is to keep fighting:</p>
<table border="1" summary="prisoner's dilemna">
<tr>
<td>&#160;</td>
<th>Authors settle</th>
<th>Authors don’t settle</th>
</tr>
<tr>
<th>Publishers settle</th>
<td>Google wins complete control</td>
<td>Google asks publishers to lean on authors.</td>
</tr>
<tr>
<th>Publishers don’t settle</th>
<td>Google asks authors to avoid non-settling publishers. Offers way around them.</td>
<td>Decision is eventually made in the Supreme Court</td>
</tr>
</table>
<p>It is interesting to see that there is really no room but to fight. In a weird way, Google has become its own antithesis, being evil as a direct result of its own actions. Because, in order to protect its own economic interest, it must keep a walled garden, Google is stuck in a position where it will have to negotiate rights or lose the right to go after print. From the Google standpoint, the strategy is to get one party to settle and leverage that into a position of strength to force the other party to settle. Once a settlement has been reached with both parties, however, Google will have established a price tag on rights. Because of that price tag, many parties (whether individuals or companies) will no longer be able to play in that space. Many could debate whether this is intentionally evil or not but few can deny that it creates an evil state of affairs.</p>
<h3>And what about CC?</h3>
<p>A lot of this discussion, of course, cannot happen without taking <a href="http://creativecommons.org/" title="Creative Commons">Creative Commons</a> into account. I was surprised that Lessig was not making more of a case for the CC license to the publishers and authors. However, it was interesting to see him grilling Allan Adler on what constituted fair rights. Adler took a very evasive approach to dodge the question, leaving it absolutely unanswered. It is, however, an important question that needs to be dealt with if any resolution is to come.</p>
<p>One possible compromise would be for Google to agree that it will no longer force an opt-out model in exchange for a blanket endorsement of CC by the publishers and authors. Because CC licenses have a number of variables, they might speed up the process by making rights holders more willing to grant rights. This would also greatly benefit the Open Content Alliance project and thus ensure that content is widely shared and distributed while allowing content authors and publishers some level of control over what rights they would give away. The funny thing is that this may, in the end, be the only way out of the mess Google has created and that no one else seems to have suggested it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2005/11/21/copy-and-print/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Reading the Google Tea Leaves</title>
		<link>http://www.tnl.net/blog/2005/11/06/reading-the-google-tea-leaves/</link>
		<comments>http://www.tnl.net/blog/2005/11/06/reading-the-google-tea-leaves/#comments</comments>
		<pubDate>Mon, 07 Nov 2005 02:45:31 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[AOL]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2005/11/06/reading-the-google-tea-leaves/</guid>
		<description><![CDATA[Every time Google comes out with a new product, many people talk about how great it is and highlight the product as a category killer. However, it increasingly appears to me that Google is filling up holes in their offering, in an attempt to match its competitors. Based on that assumption, I started wondering if [...]<p><p><i><a href="http://tnl.net/who" rel="author" title="Who is Tristan Louis?">Tristan Louis</a> is the founder and CEO of <a href="http://www.keepskor.com" title="Keepskor">Keepskor</a> and  writes the influential <a href="http://www.tnl.net/" title="tnl.net">tnl.net</a> weblog, where this was initially posted under the title <a href="http://www.tnl.net/blog/2005/11/06/reading-the-google-tea-leaves/">Reading the Google Tea Leaves</a>. You can follow him on twitter <a href="https://twitter.com/TNLNYC">here</a> or receive his weekly newsletter by subscribing <a href="http://eepurl.com/gb6zD">here</a>.</i></p>
</p>
]]></description>
			<content:encoded><![CDATA[<p>Every time Google comes out with a new product, many people talk about how great it is and highlight the product as a category killer. However, it increasingly appears to me that Google is filling up holes in their offering, in an attempt to match its competitors. Based on that assumption, I started wondering if Google had any product that was truly unique. To do so, I started a chart that mapped Google offerings against its competitors. For the purpose of this analysis, I decided that Google’s main competitors were Microsoft, Yahoo!, and AOL.</p>
<h3>The Search Space</h3>
<p>Google is undoubtedly the leader in search. It is what they specialized in and continues to be their most cherished asset. But does Google offer search products that fill a niche which is not covered by its competitors? Let’s take a look…</p>
<table border="1" summary="search space">
<tr>
<th>Indexes</th>
<th>Google</th>
<th>Microsoft</th>
<th>Yahoo!</th>
<th>AOL</th>
</tr>
<tr>
<th>Audio</th>
<td>No</td>
<td><a href="http://music.msn.com/">Yes</a></td>
<td><a href="http://new.music.yahoo.com">Yes</a></td>
<td><a href="http://search.aol.com/aol/browserup">Yes</a></td>
</tr>
<tr>
<th>Blogs</th>
<td><a href="http://blogsearch.google.com/">Yes</a></td>
<td>Unknown</td>
<td><a href="http://news.search.yahoo.com/">Yes</a> (mixed with news)</td>
<td><a href="http://www.searchenginejournal.com/aol-launching-blog-search-this-week/2328/">In development</a></td>
</tr>
<tr>
<th>Books</th>
<td><a href="http://books.google.com/">Yes</a></td>
<td><a href="http://www.microsoft.com/presspass/press/2005/oct05/10-25MSNBookSearchPR.mspx">In development</a></td>
<td><a href="http://searchenginewatch.com/3553086">In development</a></td>
<td>No</td>
</tr>
<tr>
<th>Catalog</th>
<td><a href="http://www.google.com/">Yes</a></td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<th>Directory</th>
<td>Yes</td>
<td>No</td>
<td><a href="http://dir.yahoo.com/">Yes</a></td>
<td><a href="http://search.aol.com/aol/browserup">Yes</a></td>
</tr>
<tr>
<th>Encyclopedia</th>
<td>No</td>
<td>Yes (Encarta)</td>
<td><a href="http://education.yahoo.com/reference/">Yes</a></td>
<td><a href="http://www.referencecenter.com/ref/browserup">Yes</a></td>
</tr>
<tr>
<th>Images</th>
<td><a href="http://images.google.com/">Yes</a></td>
<td><a href="http://www.bing.com/images/">Yes</a></td>
<td><a href="http://images.search.yahoo.com/">Yes</a></td>
<td>Provided by Google</td>
</tr>
<tr>
<th>Local</th>
<td><a href="http://local.google.com/">Yes</a></td>
<td><a href="http://www.bing.com/local/">Yes</a></td>
<td><a href="http://local.yahoo.com/">Yes</a></td>
<td><a href="http://yellowpages.aol.com/">Yes</a></td>
</tr>
<tr>
<th>News</th>
<td><a href="http://news.google.com/">Yes</a></td>
<td>Yes</td>
<td><a href="http://news.yahoo.com/">Yes</a></td>
<td><a href="http://search.aol.com/aol/browserup">Yes</a></td>
</tr>
<tr>
<th>Podcasts</th>
<td><a href="http://www.threadwatch.org/node/3193">Rumored</a></td>
<td>No</td>
<td>Yes</td>
<td><a href="http://blog.searchenginewatch.com/050914-054203">Limited</a></td>
</tr>
<tr>
<th>Shopping</th>
<td><a href="http://www.google.com/products">Yes</a></td>
<td>Yes</td>
<td><a href="http://search.yahoo.com/products">Yes</a></td>
<td>Yes</td>
</tr>
<tr>
<th>Usenet</th>
<td><a href="http://groups.google.com/">Yes</a></td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<th>Video</th>
<td><a href="http://video.google.com/">Yes</a></td>
<td><a href="http://www.bing.com/videos/browse">Yes</a></td>
<td><a href="http://video.search.yahoo.com/">Yes</a></td>
<td><a href="http://search.aol.com/aol/browserup">Yes</a></td>
</tr>
<tr>
<th>Web</th>
<td><a href="http://www.google.com/">Yes</a></td>
<td><a href="http://www.bing.com/">Yes</a></td>
<td><a href="http://search.yahoo.com/">Yes</a></td>
<td>Provided by Google</td>
</tr>
</table>
<p>The interesting thing, when looking at this data, is that, apart from Catalog and Usenet search, Google does not offer any collection that is not already offered by others or currently under development. Interestingly, Google does not have any offering in the Audio space (nor a Podcast offering) or in the Encyclopedia space (although Wikipedia results sometimes pop up in search results). This seems to highlight two potential areas where Google will introduce new products: an audio search engine, which will include podcasts, and some type of partnership with Wikipedia to fill the reference space.</p>
<p>What is interesting here is that Google has generally been the first to market with many of the search collections listed. From this, one can deduce that Google works mostly as a competitive threat to its rivals, forcing them to invest more in their search products and, in the process, improving the quality and breadth of search data for every user on the Internet. This is a good thing but not revolutionary in its own right.</p>
<h3>Search Services</h3>
<p>The next area I decided to look into, in order to divine the whats and wheres of Google, was the type of search-specific services it offered, compared to the same competitors.</p>
<table border="1" summary="search services">
<tr>
<th>Search Services</th>
<th>Google</th>
<th>Microsoft</th>
<th>Yahoo!</th>
<th>AOL</th>
</tr>
<tr>
<th>Answers</th>
<td><a href="http://answers.google.com/answers/">Yes</a></td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<th>Clustered results</th>
<td>No</td>
<td>In development</td>
<td>No</td>
<td>Yes (default)</td>
</tr>
<tr>
<th>Desktop Search</th>
<td><a href="http://desktop.google.com/">Yes</a></td>
<td>Yes</td>
<td><a href="http://pro.x1.com/?utm_source=Yahoo&#038;utm_medium=Affiliate&#038;utm_campaign=Yahoo&#038;source=Yahoo">Yes</a></td>
<td><a href="http://downloads.channel.aol.com/browserdts">Yes</a></td>
</tr>
<tr>
<th>Mobile Search</th>
<td><a href="http://www.google.com/mobile/">Yes</a></td>
<td><a href="http://home.mobile.msn.com/en-us/default.aspx">Yes</a></td>
<td><a href="http://mobile.yahoo.com/">Yes</a></td>
<td><a href="http://mobile.aol.com/">Yes</a></td>
</tr>
<tr>
<th>Personalized Search</th>
<td><a href="https://www.google.com/accounts/ServiceLogin?hl=en&#038;continue=http://www.google.com/psearch&#038;nui=1&#038;service=hist">Yes</a></td>
<td>Rumored</td>
<td>Yes</td>
<td>No</td>
</tr>
<tr>
<th>Search History</th>
<td><a href="https://www.google.com/accounts/ServiceLogin?hl=en&#038;continue=http://www.google.com/history/&#038;nui=1&#038;service=hist">Yes</a></td>
<td>No</td>
<td><a href="https://login.yahoo.com/config/login?.src=bmk2&#038;.intl=us&#038;.done=http%3A%2F%2Fbookmarks.yahoo.com%2F">Yes</a></td>
<td>Yes (default)</td>
</tr>
</table>
<p>This is actually interesting in that the offerings are pretty close. Of note here is a departure on the part of Microsoft, which is experimenting with clustered search. None of its competitors have shown a product in that space and this may be an interesting indication of how Microsoft plans to play there.</p>
<p>Also of note is the fact that only Google offers a paid answering service (Google Answers). A question for players in that space could be whether something like the recent <a href="https://www.mturk.com/mturk/welcome">Mechanical Turk</a> offering from Amazon could help a company fill that niche. This seems to be an untapped market that is only being mined by Google.</p>
<h3>Non-search services</h3>
<p>OK, so we can clearly see that Google has done a good job in the search space and its competitors are working hard to play catch-up in that area. While they’re doing so, Google has been busy ramping up its offerings and closing some holes in terms of being an online media player. Let’s take a look at how it is faring in the non-search space.</p>
<table border="1" summary="non-search service">
<tr>
<th>Other Services</th>
<th>Google</th>
<th>Microsoft</th>
<th>Yahoo!</th>
<th>AOL</th>
</tr>
<tr>
<th>Auctions</th>
<td>No</td>
<td>Search only</td>
<td><a href="http://shopping.yahoo.com">Yes</a></td>
<td>No</td>
</tr>
<tr>
<th>Blogs</th>
<td><a href="https://www.blogger.com/start">Yes</a></td>
<td><a href="http://home.spaces.live.com/">Yes</a></td>
<td><a href="https://login.yahoo.com?.done=http%3A%2F%2Fprofiles.yahoo.com%2F&#038;.intl=us&#038;.src=prf&#038;.pd=c%3DpjYaRE2p2e7qnVyDc3WyJsc-">Yes</a></td>
<td><a href="http://peopleconnection.aol.com/blogs">Yes</a></td>
</tr>
<tr>
<th>Calendar</th>
<td>No</td>
<td><a href="http://login.live.com/login.srf?wa=wsignin1.0&#038;rpsnv=11&#038;ct=1264188087&#038;rver=6.0.5285.0&#038;wp=MBI&#038;wreply=http:%2F%2Fcalendar.live.com%2F%2Fcalendar%2Fcalendar.aspx&#038;lc=1033&#038;id=64362&#038;mkt=en-us">Yes</a></td>
<td><a href="https://login.yahoo.com/?.done=http%3A%2F%2Fcalendar.yahoo.com%2F">Yes</a></td>
<td>Yes</td>
</tr>
<tr>
<th>Discussion Groups</th>
<td><a href="http://groups.google.com/">Yes</a></td>
<td>Yes</td>
<td><a href="http://groups.yahoo.com">Yes</a></td>
<td>Yes</td>
</tr>
<tr>
<th>Email</th>
<td>Yes</td>
<td><a href="http://login.live.com/login.srf?wa=wsignin1.0&#038;rpsnv=11&#038;ct=1264188094&#038;rver=6.0.5285.0&#038;wp=MBI&#038;wreply=http:%2F%2Fmail.live.com%2Fdefault.aspx&#038;lc=1033&#038;id=64855&#038;mkt=en-US">Yes</a></td>
<td><a href="https://login.yahoo.com/config/login_verify2?&#038;.src=ym">Yes</a></td>
<td><a href="http://webmail.aol.com/30462-111/aol-1/en-us/common/SystemRequirements.aspx">Yes</a></td>
</tr>
<tr>
<th>IM</th>
<td>Yes</td>
<td><a href="http://windowslive.com/desktop/messenger">Yes</a></td>
<td><a href="http://messenger.yahoo.com/">Yes</a></td>
<td><a href="http://www.aim.com/">Yes</a></td>
</tr>
<tr>
<th>Internet Access</th>
<td><a href="http://www.wired.com/gadgets/wireless/news/2005/09/68920">Very Limited</a></td>
<td>Yes</td>
<td>Yes</td>
<td><a href="http://access.web.aol.com/">Yes</a></td>
</tr>
<tr>
<th>Maps</th>
<td>Yes</td>
<td><a href="http://www.bing.com/maps/help/en-us/browsernotsupported.htm?http%3a%2f%2fwww.bing.com%3a80%2fmaps%2f">Yes</a> (with <a href="http://www.bing.com/maps/default.aspx?wip=2&amp;v=2&amp;style=r&amp;rtp=~&amp;msnurl=home.aspx?%26redirect%3dfalse&amp;msnculture=en-US">2</a> more)</td>
<td>Yes</td>
<td><a href="http://www.mapquest.com/">Yes</a></td>
</tr>
<tr>
<th>Personal Page (My.*)</th>
<td><a href="http://www.google.com/ig">Yes</a></td>
<td><a href="http://www.bing.com/?fdr=lc">Yes</a></td>
<td>Yes</td>
<td>Yes (via My Netscape)</td>
</tr>
</table>
<p>Of note in that area is the fact that Google has managed to revamp the email space with its Gmail offering, forcing Yahoo! and Microsoft to rework product user interfaces that had not really evolved much since their introduction. A couple of interesting holes in the Google offerings, namely auctions and calendaring, will probably be filled in the near future with online offerings closing the gap in those areas. Heck, <a href="http://jeremy.zawodny.com/blog/archives/004282.html" title="The world could really use Google Calendar">even people working at some of their competitors are clamoring for such offerings</a>.</p>
<p>More interesting, however, is the fact that Google is the only player in that space without a substantial access offering. Basically, they’ve been relying on the public internet to reach their users. This may help explain the recent rumors of their developing a large-scale WiFi network, <a href="http://news.cnet.com/Google-wants-dark-fiber/2100-1034_3-5537392.html">some of their interest in purchasing dark fiber</a>, and other rumors about their interest in AOL.</p>
<p>Once again, it seems that Google has done well at spurring its competitors into action but the magic Google sauce does not seem to reside in the product offerings.</p>
<h3>Developer Services</h3>
<p>While all those offerings seem of interest to the general public, Google has been doing a good job in catering to early adopters, who tend to shape general opinion. When doing a comparison in that space, it was fascinating to see that Google took the lead in most categories and that AOL did not even play in any of them.</p>
<table border="1" summary="developer services">
<tr>
<th>Developer Services</th>
<th>Google</th>
<th>Microsoft</th>
<th>Yahoo!</th>
<th>AOL</th>
</tr>
<tr>
<th>Advertising Program</th>
<td><a href="https://www.google.com/adsense/login/en_US/?gsessionid=XijvbwQFeQPCc6JvmeQjRA">Yes</a></td>
<td><a href="http://advertising.microsoft.com/search-advertising">Yes</a></td>
<td><a href="http://advertisingcentral.yahoo.com/publisher/index">Yes</a></td>
<td>No</td>
</tr>
<tr>
<th>Development APIs</th>
<td><a href="http://code.google.com/more/">Yes</a></td>
<td>Yes</td>
<td><a href="http://developer.yahoo.com/">Yes</a></td>
<td>No</td>
</tr>
<tr>
<th>New Services Preview</th>
<td><a href="http://www.googlelabs.com/">Yes</a></td>
<td>Yes</td>
<td><a href="http://developer.yahoo.com">Yes</a></td>
<td>No</td>
</tr>
<tr>
<th>Web Hosting</th>
<td>No</td>
<td><a href="http://www.microsoft.com/business/en-us/default.aspx">Yes</a></td>
<td>Yes</td>
<td>No</td>
</tr>
</table>
<p>That last item is one to ponder. Google is not in the hosting business yet. But it seems that there is potential for them there and, once again, it could fit with Google trying to acquire dark fiber. They’ve revolutionized the online email space by offering a larger amount of disk space than any competitor. It seems they could do the same in the hosting space by offering a combination of tools that are easy to set up and update (based on the Blogger set of templates) with some more powerful features like database management (the rumored <a href="http://base.google.com/base/?gsessionid=kex30hjEaXlklgApvfFSBQ">GoogleBase</a>, which now has its own URL, even though the code still seems to be sitting <a href="http://base.google.com/base/?gsessionid=puo6vQq7bRRNz5FCY04tOA">behind a login area</a>).</p>
<h3>Conclusions</h3>
<p>Google does innovate in some spaces but has largely innovated in order to gain entry into markets that already existed. As a rule of thumb, they’ve been very smart at bringing new innovations to those markets. However, their competitors are generally quick to notice and are catching up.</p>
<p>In terms of future offerings, I would not be surprised to see the following products coming from Google over the next few months:</p>
<ul>
<li>An audio search engine, which will include a podcasting component (and possibly a podcast authoring component via blogger)</li>
<li>A strategic partnership with Wikipedia or some other encyclopedia</li>
<li>Some type of clustered search offering</li>
<li>A calendar product, which will probably inject new life in that space</li>
<li>An auction offering, tied with an internal payment system</li>
<li>A web hosting service that will scale from small entities to large ones and will include Gmail as part of the email offering</li>
<li>Some type of access service, probably using their WiFi solution</li>
</ul>
<p>Whether that all happens is, of course, pure speculation on my part, and whether it is enough to sustain their market capitalization (north of $100 billion as I write this) is something I had better leave to people who know how to invest.</p>
<p><p><i><a href="http://tnl.net/who" rel="author" title="Who is Tristan Louis?">Tristan Louis</a> is the founder and CEO of <a href="http://www.keepskor.com" title="Keepskor">Keepskor</a> and  writes the influential <a href="http://www.tnl.net/" title="tnl.net">tnl.net</a> weblog, where this was initially posted under the title <a href="http://www.tnl.net/blog/2005/11/06/reading-the-google-tea-leaves/">Reading the Google Tea Leaves</a>. You can follow him on twitter <a href="https://twitter.com/TNLNYC">here</a> or receive his weekly newsletter by subscribing <a href="http://eepurl.com/gb6zD">here</a>.</i></p>
</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2005/11/06/reading-the-google-tea-leaves/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Metrics — Weighting the Metrics</title>
		<link>http://www.tnl.net/blog/2005/10/20/metrics-weighting-the-metrics/</link>
		<comments>http://www.tnl.net/blog/2005/10/20/metrics-weighting-the-metrics/#comments</comments>
		<pubDate>Thu, 20 Oct 2005 15:50:45 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[Internet Explorer]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[eBay]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2005/10/20/metrics-weighting-the-metrics/</guid>
		<description><![CDATA[Metrics week continues with a review of how to weight metrics. So far, I’ve looked into who, in a company, could benefit from metrics. I then delved into two different types of metrics: hard metrics, which can easily be measured, and soft metrics, which cannot. Today, I’m going to try to figure out how this [...]<p><p><i><a href="http://tnl.net/who" rel="author" title="Who is Tristan Louis?">Tristan Louis</a> is the founder and CEO of <a href="http://www.keepskor.com" title="Keepskor">Keepskor</a> and  writes the influential <a href="http://www.tnl.net/" title="tnl.net">tnl.net</a> weblog, where this was initially posted under the title <a href="http://www.tnl.net/blog/2005/10/20/metrics-weighting-the-metrics/">Metrics — Weighting the Metrics</a>. You can follow him on twitter <a href="https://twitter.com/TNLNYC">here</a> or receive his weekly newsletter by subscribing <a href="http://eepurl.com/gb6zD">here</a>.</i></p>
</p>
]]></description>
			<content:encoded><![CDATA[<p>Metrics week continues with a review of how to weight metrics. So far, I’ve looked into <a href="http://www.tnl.net/blog/2005/10/16/metrics-introduction/" title="TNL.net: Metrics Introduction">who, in a company, could benefit from metrics</a>. I then delved into two different types of metrics: <a href="http://www.tnl.net/blog/2005/10/18/metrics-hard-metrics/" title="TNL.net: Hard Metrics">hard metrics</a>, which can easily be measured, and <a href="http://www.tnl.net/blog/2005/10/19/metrics-soft-metrics/" title="TNL.net: Soft Metrics">soft metrics</a>, which cannot. Today, I’m going to try to figure out how this all weighs out.</p>
<h3>Grouping the metrics</h3>
<p>In order to figure out weighting, I first started to think about how to group different metrics. For this purpose, I looked at things like the base value (which would give us a baseline as to how much a business is worth based solely on revenue and revenue growth), inventory (looking at things like traffic, reach, and output, because they all give us some data points as to the growth of monetizable assets in the future), consumer involvement (looking at info like links, subscriptions, and comments to define the value of customers), and growth potential (including some fuzzier measures of potential growth and the advantages of the integration value).</p>
<p>My reasoning for grouping things in this way was that it might make it easier to figure out weighting across those large catch-all categories (and, if there is any discussion at all, I am sure that people will debate the percentage assumptions for those categories). I do not, by any means, present those as the be-all and end-all approach to valuing a business. They are, at this time, the metrics that give me the best comparative view of a business when I try to assess its value. However, not being much of a metrics guy to start with (my main reason for doing this series is to provoke debate among people smarter than me so there can be some consensus on metrics in this new web 1+n.x world), I hope that others will step in and show me the error of my ways, along with providing some interesting information that will get all of us closer to something useful.</p>
<h3>Base Value</h3>
<p>The base value, as I see it, is defined by revenue and revenue growth based on historical data. The reason I would consider this to be the base value is that it is a reflection of the business as it exists today and can provide a baseline as to where the business would be headed if growth suddenly slowed or investment in the business stopped. It does not provide any information as to how to accelerate the growth of the business and does not provide more than a view into the present cash value of a business.</p>
<p>However, for young companies, such value does not provide much information. Start-ups, by nature, have a lower revenue and profit line than established companies because they need to recover some of their initial cost and may still be in high growth and research and development phases. As a result, to solely base one’s view of a business on its current ability to generate cash is short-sighted when it comes to start-ups.</p>
<p>Another question when developing the base value is how to factor in risks to the revenue base. For example, if the business relies primarily on advertising from an external network as its basis for revenue (many people have talked about businesses looking to AdSense as the primary source of revenue), one has to wonder what would happen if the dynamics of that relationship were to change.</p>
<p>As a whole, however, because of its overall importance in assessing the present financial value of a business, I would assume that the base value should represent about 20 to 30 percent of the overall value of a business. Initially, the value would be in the 20 percent range because potentials are higher than the current revenue line but, as the business matures, and potentials decrease, it would edge up towards 30 percent.</p>
<h3>Inventory</h3>
<p>Inventory would be the next potential grouping of different metrics. In it, I would include traffic (and traffic growth), as well as visitors, site counts, reach, and output. Let’s go into more detail on each of those.</p>
<p>Traffic is important because the number of page views is something that is monetizable. However, in a web 2.0 space, page views are not the only traffic metric one should track. For example, the RSS subscriber count is another useful value (and controversy has already swirled around ads in RSS feeds). However, I would argue that there is one value in the inventory count that is of utmost importance: access to an API. The reason I would venture this is the most important inventory metric is that APIs, once implemented, are harder to unhook from. As such, they represent a harder type of value since they solidify a site’s reach within a particular market segment.</p>
<p>I believe that reach is actually going to be seen as one of the more important values in terms of inventory. The reason is that reach gives us an idea of the potential growth opportunity in a market. If a company has a high reach in an individual market, its potentials are more limited. Witness, for example, a company like Netscape, which once had a reach of 80% (i.e., 80% of all internet users were using it). Tactically, this kind of position is one where they should have been on the defensive, the reason being that there was more potential for a drop in their reach than for an expansion of it. Microsoft is now finding itself in the same position on a number of fronts: Windows, Office, and Internet Explorer are all playing in a world where they will not reach a higher percentage of the market. As a result, they are forced to play defensively. One could argue that web 2.0 companies, with their rich APIs and more powerful front ends (thanks to technologies like AJAX), represent the threat Microsoft saw coming from the Internet in the mid-90s. And one could argue that, this time, the position they’re in (i.e., largest player) is endangering their future if they don’t make a radical change (because they can’t grow from the position they’re in).</p>
<p>Going beyond reach, which provides some information on past growth and potentials moving forward, one has to look at output from the company. For example, in the case of a company like eBay (arguably a web 2.x company already), the inventory is the number of auctions submitted. Similarly, in the case of Craigslist, it would be the number of new ads posted or, in the case of a blog, the number of posts created. One has to be careful about output, however, and should measure the cost of output in order to figure out whether the output is good or not. In the case of web 1+n.x companies, output is a very good thing as it is generally created by outside parties for free. That free product is one that those companies then monetize. However, one has to be careful and evaluate whether output is outpacing the company’s ability to monetize it because, if such an imbalance were to appear, the value of the output could potentially decrease.</p>
<p>All in all, because inventory has a measurable value and, in general, is the very thing that a company will monetize, I would guess its weight, when figuring out the value of a company, would probably sit in the 10–15 percent range.</p>
<h3>Consumer involvement</h3>
<p>Consumer involvement, which was known in the past as stickiness, is another major group of metrics. This section would include links, subscriptions (both to RSS or API feeds and, if offered, to any type of paid service), and any type of interaction a user may have with a system. For example, if you are trying to get some interaction information on consumers of a blog, you could look at the number of comments posted. Alternately, if you’re looking at a search engine, you could look at the number of searches performed. Or, if you’re looking at a company which offers an API, you could look at the number of times that API has been integrated in other products and the number of times it is accessed.</p>
<p>I would venture to say that this metric is one of the most important ones when assessing the value of a business. The reason I would value it higher than the ones I mentioned earlier is that this is where one can see whether a business has potential or not. The interaction with customers (either directly or via APIs) provides so much useful information that a company not looking at this metric is probably off track in terms of evaluating itself (and, considering the hype that generally surrounds new businesses, such a company could fool itself into extinction as it fails to see major issues arising out of the increase or decrease in consumer involvement).</p>
<p>Because it represents the value of the existing customer base and provides some input as to the trends surrounding that customer base, I would assign 30–35% of an overall valuation to factors relating to consumer involvement.</p>
<h3>Growth potentials</h3>
<p>However, metrics, in and of themselves, are pretty useless as point-in-time numbers. As a result, one has to estimate the growth potential when evaluating a business. The growth potential can be assessed in a number of ways but, when it comes to web 1+n.x properties, it comes down to a number of subcomponents. In the interest of provoking more controversy, I would venture that there is a formula to calculate potential and that it is as follows:</p>
<blockquote><p>Potential = traffic growth rate * reputation vector * brand equity vector * (integration vector)<sup>2</sup> − (risks vector / percentage of risk that can be mitigated)</p></blockquote>
<p>In the growth rate area, I would put an aggregate growth rate that averages out growth rates over a period of time (6 months to a year if you are computing a monthly growth rate). The reputation vector and brand equity vector would be values based on reputational and brand equity trends, which <a href="http://www.tnl.net/blog/2005/10/19/metrics-soft-metrics/" title="TNL.net: Soft Metrics">I talked about in a previous entry</a>. You will notice that I consider the integration vector to be of such high importance when defining potential that I’ve decided to square its value. I will talk about integration vectors in a future entry but, in short, the integration vector is the magic glue that makes acquiring or merging a company very valuable because integrating it with another company will derive greater value for the combined entity. This is what is generally known as synergy, and trying to put a value on it has the potential to make for better, more successful acquisitions and mergers. Last, but not least, is the risk side of the equation. Because risks have a huge impact on potentials, it is important to measure them in order to get an idea as to their potential impact. However, because some of the risks can be mitigated, it is important to capture this figure in order to assess the importance of different risks.</p>
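<p>To make the formula a little more concrete, here is a minimal Python sketch of it. It rests on assumptions that are not in the formula itself: each "vector" is treated as a single number, the mitigable share of risk is a fraction between 0 and 1, and the function name, parameter names, and sample values are purely illustrative.</p>

```python
def growth_potential(growth_rate, reputation, brand_equity, integration,
                     risks, mitigable_share):
    """Sketch of the potential formula, with every 'vector' reduced to a scalar.

    All names here are hypothetical; mitigable_share is assumed to be a
    fraction (0.6 means 60 percent of the risk can be mitigated).
    """
    if mitigable_share == 0:
        raise ValueError("mitigable_share must be non-zero")
    # Upside: growth scaled by reputation, brand equity, and integration squared.
    upside = growth_rate * reputation * brand_equity * integration ** 2
    # Downside: the more of the risk that can be mitigated, the smaller the penalty.
    risk_penalty = risks / mitigable_share
    return upside - risk_penalty


# Made-up inputs, only to show the call shape.
example = growth_potential(growth_rate=0.10, reputation=1.2, brand_equity=1.5,
                           integration=2.0, risks=0.3, mitigable_share=0.6)
print(round(example, 2))
```

<p>Because the integration value is squared, doubling it quadruples the upside term, which is the emphasis the formula is going for. The scale of each input is left open here, so the output is only meaningful when comparing companies scored the same way.</p>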
<p>Ultimately, growth potentials represent the largest part of any equation when trying to value a company. Few companies are bought without an expectation of potential and this is why, in my weighting, I would assume potential to represent a substantial (30–40%) part of the equation when trying to measure a company’s value.</p>
<h3>Conclusion: Wait, that’s more than 100 percent</h3>
<p>If you do the math, it appears the different weights end up adding up to more than 100%. The reason is that those are ranges. However, the truth is that, in any business dealing, there is also an amount of faith and luck that comes in. For example, I sat in a meeting once where an individual was given an option to buy outright, for around a million dollars, a company which is now very successful on the Internet. Looking back, it might have been worth that much at the time but I doubt that it would be worth what it is worth now (several billion dollars) had that deal been consummated. Over time, the management of that company was smart enough to mine opportunities and put people in place that helped them realize huge growth. Had that company been in the hands of more conservative (and by conservative, I mean averse to risk) investors, it would probably not have flourished in the same way.</p>
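<p>Going back to the arithmetic that opens this section, the ranges quoted for the four groups can be added up in a few lines of Python. The totals come straight from those ranges; the rescaling of midpoints to 100 is only an assumption about how one might turn the ranges into usable weights, not something proposed above.</p>

```python
# Weight ranges (in percent) quoted for each group of metrics.
WEIGHT_RANGES = {
    "base value": (20, 30),
    "inventory": (10, 15),
    "consumer involvement": (30, 35),
    "growth potential": (30, 40),
}


def total_range(ranges):
    """Return the sum of the low ends and the sum of the high ends."""
    low = sum(low_end for low_end, _ in ranges.values())
    high = sum(high_end for _, high_end in ranges.values())
    return low, high


def normalized_midpoints(ranges):
    """Hypothetical helper: rescale each range's midpoint so the weights total 100."""
    midpoints = {name: (low_end + high_end) / 2
                 for name, (low_end, high_end) in ranges.items()}
    total = sum(midpoints.values())
    return {name: 100 * mid / total for name, mid in midpoints.items()}


print(total_range(WEIGHT_RANGES))  # prints (90, 120)
for name, weight in normalized_midpoints(WEIGHT_RANGES).items():
    print(f"{name}: {weight:.1f}%")
```

<p>So the weights land anywhere between 90% and 120% depending on where in each range one sits, which is why the total only overshoots 100% at the high end.</p>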
<p>Having gone through a few days thinking about metrics, it is clear that there are a number of opportunities for people smarter than me to figure out some solid metrics in assessing the value of new companies. Metrics, however, should not be the sole guide when assessing a company.</p>
<p>Many people have asked me why I bothered to look at such a boring subject (and why I’ve been blogging so incessantly about numbers lately). My main reasoning is that one of the failures of Web 1.0 (the bubble we lived through in the 90s) was a lack of accountability and/or expectation management, which led to very inflated numbers that eventually left a lot of investors with very poor investments. Having lived through (and survived) that bubble, I want people to start thinking more critically about Web 2.0 companies and, hopefully, we can all learn from the mistakes of the past and avoid over-hyping new companies into extinction… because for every bubble, there is eventually a big pop, and no one really enjoys that part.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2005/10/20/metrics-weighting-the-metrics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google has 24 billion items index, considers MSN search nearest competitor</title>
		<link>http://www.tnl.net/blog/2005/09/27/google-has-24-billion-items-index-considers-msn-search-nearest-competitor/</link>
		<comments>http://www.tnl.net/blog/2005/09/27/google-has-24-billion-items-index-considers-msn-search-nearest-competitor/#comments</comments>
		<pubDate>Tue, 27 Sep 2005 06:36:41 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2005/09/27/google-has-24-billion-items-index-considers-msn-search-nearest-competitor/</guid>
		<description><![CDATA[From John Battelle’s site comes the news that Google has decided to drop the number of documents it listed on its front page. The company now claims its index is three times larger than its nearest competitor. Let’s look at the number. Google vs. Yahoo A few weeks ago, Yahoo! claimed that its index was [...]<p><p><i><a href="http://tnl.net/who" rel="author" title="Who is Tristan Louis?">Tristan Louis</a> is the founder and CEO of <a href="http://www.keepskor.com" title="Keepskor">Keepskor</a> and  writes the influential <a href="http://www.tnl.net/" title="tnl.net">tnl.net</a> weblog, where this was initially posted under the title <a href="http://www.tnl.net/blog/2005/09/27/google-has-24-billion-items-index-considers-msn-search-nearest-competitor/">Google has 24 billion items index, considers MSN search nearest competitor</a>. You can follow him on twitter <a href="https://twitter.com/TNLNYC">here</a> or receive his weekly newsletter by subscribing <a href="http://eepurl.com/gb6zD">here</a>.</i></p>
</p>
]]></description>
			<content:encoded><![CDATA[<p>From <a href="http://battellemedia.com/archives/2005/09/google_announces_new_index_size_shifts_focus_from_counting" title="Google Announces New Index Size, Shifts Focus from Counting">John Battelle’s site</a> comes the news that Google has decided to drop the number of documents it listed on its front page. The company now claims its index is three times larger than its nearest competitor. Let’s look at the number.</p>
<h3>Google vs. Yahoo</h3>
<p>A few weeks ago, Yahoo! claimed that its index was over 20 billion items large, broken down as follows:</p>
<blockquote><p>just over 19.2 billion web documents, 1.6 billion images, and over 50 million audio and video files</p></blockquote>
<p>If we assume that Google believes its nearest competitor is Yahoo!, this would put the Google index at roughly 60 billion items, a fairly large number, which is probably on the high side. So we need to do more analysis in order to get closer to the truth.</p>
<h3>Google vs. Google</h3>
<p>As part of Google’s seventh birthday celebration, <a href="http://googleblog.blogspot.com/2005/09/we-wanted-something-special-for-our.html" title="We wanted something special for our birthday">Google staffers posted an entry on the official Google blog, claiming that their index is now 1,000 times the size of their original index.</a> If that’s the case, figuring out what the original index size was should give us a good number. Fortunately, I have a copy of <a href="http://battellemedia.com/" title="John Battelle's Searchblog">John Battelle</a>’s excellent book about the company (entitled The Search, it is a must-read for anyone interested in the search space. No other book has gotten as deeply into the history of internet search and few have analyzed potential futures for Google more keenly). In the book, Battelle relays an email from Larry Page to Terry Winograd dated July 15, 1996. To give some context, one has to realize that Google started in March of 1996 so, in July of that year, Google was all of four months old. The email is about some of the growth issues that the search engine was having and reads (emphasis is mine):</p>
<blockquote><p>I am almost out of disk space.</p>
<h3>I have downloaded about… 24 million unique URLs</h3>
<p>and about 100 million links… I think I will need 8 gigs more to store everything… Current retail prices are about $1000/4 gigs… I have only about 15% of the pages but it seems promising</p></blockquote>
<p>If we take that number as a starting point, the original index was around 24 million pages. From there, it is easy to multiply by the factor of 1,000 they mention in their blog and get the number of items in the Google index.</p>
<p>That number would be</p>
<h3>24 billion items in the Google Index</h3>
<p>, a little more than what Yahoo! has in their index.</p>
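<p>For readers who want to check the arithmetic so far, here is a minimal Python sketch of the two estimates above. The figures are the ones quoted in this post; the constant and function names are purely illustrative, and the “three times larger” multiplier is taken at face value from Google’s claim.</p>

```python
# Figures quoted in the post (all counts are items in an index).
YAHOO_WEB_DOCS = 19_200_000_000    # "just over 19.2 billion web documents"
YAHOO_IMAGES = 1_600_000_000       # "1.6 billion images"
YAHOO_AUDIO_VIDEO = 50_000_000     # "over 50 million audio and video files"
ORIGINAL_GOOGLE_URLS = 24_000_000  # Larry Page's July 1996 email
GROWTH_FACTOR = 1_000              # "1,000 times the size of their original index"
COMPETITOR_MULTIPLE = 3            # "three times larger" claim, read as a factor of 3


def yahoo_total():
    """Sum the three Yahoo! figures into one index size."""
    return YAHOO_WEB_DOCS + YAHOO_IMAGES + YAHOO_AUDIO_VIDEO


def google_vs_yahoo():
    """Google's implied size if Yahoo! is the nearest competitor."""
    return COMPETITOR_MULTIPLE * yahoo_total()


def google_vs_google():
    """Google's implied size if the 1996 URL count is the original index."""
    return ORIGINAL_GOOGLE_URLS * GROWTH_FACTOR


if __name__ == "__main__":
    for label, value in [
        ("Yahoo! total", yahoo_total()),
        ("Google vs. Yahoo", google_vs_yahoo()),
        ("Google vs. Google", google_vs_google()),
    ]:
        print(f"{label}: {value / 1e9:.2f} billion")
```

<p>Using Yahoo!’s itemized figures rather than the rounded 20 billion puts the Yahoo!-based estimate a little above 62 billion, which is still in the “roughly 60 billion” range.</p>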
<h3>Google vs. MSN</h3>
<p>In November 2004, <a href="http://blog.searchenginewatch.com/041111-084221" title="Search Engine Size Wars V Erupts">MSN was estimated to have about 5 billion pages</a>. <a href="http://kenmo.spaces.live.com/Blog/" title="20 Billion -- is that the magic number?">Ken Moss, the General Manager of MSN Search, claimed that they have added a lot to their index</a>. While he’s not forthcoming with any detailed information in his post, we can still assume that the MSN search index is now larger than 5 billion.</p>
<p>This is interesting in itself, as it may help us triangulate the right size for the Google index. If we apply different growth rates to the MSN Search index since November 2004, we get the following:</p>
<ul>
<li>Growth Curve of 50%: MSN Index is now 7.5 billion items</li>
<li>Growth curve of 75%: MSN Index is now 8.75 billion items</li>
<li>Growth curve of 100%: MSN index is now 10 billion items</li>
</ul>
<p>If we take Google’s assessment that it is three times larger than its nearest competitor and assume that Google is considering MSN search to be its nearest competitor, those growth curves translate as follows:</p>
<ul>
<li>Growth Curve of 50% at MSN: Google index is 22.5 billion items</li>
<li>Growth Curve of 75% at MSN: Google index is 26.25 billion items</li>
<li>Growth Curve of 100% at MSN: Google index is 30 billion items</li>
</ul>
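<p>The two lists above can be reproduced with a few lines of Python. This is only a sketch: the 5 billion baseline is the November 2004 estimate, the growth rates are guesses rather than anything MSN has published, and the function names are made up for illustration.</p>

```python
MSN_BASELINE_BILLIONS = 5.0        # November 2004 estimate
COMPETITOR_MULTIPLE = 3            # Google's "three times larger" claim
GROWTH_RATES = [0.50, 0.75, 1.00]  # guessed growth since November 2004


def msn_index(growth_rate, baseline=MSN_BASELINE_BILLIONS):
    """MSN index size in billions after growing by growth_rate."""
    return baseline * (1 + growth_rate)


def implied_google_index(growth_rate):
    """Google index size in billions if MSN is the nearest competitor."""
    return COMPETITOR_MULTIPLE * msn_index(growth_rate)


if __name__ == "__main__":
    for rate in GROWTH_RATES:
        msn = msn_index(rate)
        google = implied_google_index(rate)
        print(f"{rate:.0%} growth: MSN {msn} billion, Google {google} billion")
```

<p>Changing the entries in <code>GROWTH_RATES</code> is an easy way to see how sensitive the Google estimate is to the assumed MSN growth.</p>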
<p>When one looks at those results, a pattern emerges. Let’s first remember the rough estimate of 24 billion based on the Google vs. Google analysis above. On the 50% MSN growth curve, Google is at 22.5 billion items indexed. On the 75% MSN growth curve, Google is at 26.25 billion items indexed. It could then be that Google considers MSN Search, and not Yahoo!, to be its nearest competitor, as the 24 billion mark falls right in between.</p>
<h3>Conclusion</h3>
<p>While the index size is largely a game of public relations, it appears that the Google index is sitting somewhere between 22.5 and 26.25 billion items indexed and, more probably than not, at the 24 billion mark. This gives it a slight edge over the Yahoo! index and suggests that the company considers Microsoft its nearest competitor. Of course, this is my own speculation, so your mileage may vary.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2005/09/27/google-has-24-billion-items-index-considers-msn-search-nearest-competitor/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Money in the archives</title>
		<link>http://www.tnl.net/blog/2005/09/23/money-in-the-archives/</link>
		<comments>http://www.tnl.net/blog/2005/09/23/money-in-the-archives/#comments</comments>
		<pubDate>Fri, 23 Sep 2005 06:28:01 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Media]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2005/09/23/money-in-the-archives/</guid>
		<description><![CDATA[Following a recent article in Wired News about the viability of blogging as a revenue generating model, I started thinking about the value of archival material to a blogger. As readers of this site might have noticed when using the web interface, I am using the Google Ad service called AdSense. As I am not [...]
]]></description>
			<content:encoded><![CDATA[<p>Following <a href="http://www.wired.com/culture/lifestyle/news/2005/09/68934" title="Can Bloggers Strike It Rich?  ">a recent article in Wired News</a> about the viability of blogging as a revenue generating model, I started thinking about the value of archival material to a blogger.</p>
<p>As readers of this site might have noticed when using the web interface, I am using the Google Ad service called AdSense. As I am not at liberty to reveal the terms of my contract with them or discuss specifics, I’ll talk in general terms about online advertising programs.</p>
<p>The first thing to take into account is that the model of using advertising in archives is largely predicated on the <a href="http://www.wired.com/wired/archive/12.10/tail.html" title="The Long Tail">long tail</a> concept, whereby one can make more money from small increments over the long run than by trying to score the big hit. In my case, this means trying to get a few good stories out on a regular basis. None of them is going to make lots of money on a single day, but a few cents or a few dollars a day can add up to quite a nice payoff on a yearly basis. I believe that people who blog and develop a nice audience can see some of those results.</p>
<p>Let’s take a hypothetical story of 1000 words. A long-standing view in the journalism world is that a dollar per word is the standard (I’ve seen much lower rates as a result of the downturn in technology publications but more on this later). So a writer would write the standard story and get $1000 back for his or her effort. At that point, the publication would print it and/or publish it on their website and start generating revenue against it. The model here is that the publication is taking a risk with the writer, fronting the money, and will recover the money over time. The other part of the equation is that the publication has a built-in audience and therefore markets the writer to that audience. Because they market their own publication, there is supposed to be a halo effect that shrouds the writer in the great light of being associated with publication X.</p>
<p>The truth, however, is a little different. Unless you’re already an established brand or have a story that is of such import that it will rock the nation or the world, no one will care who the writer is. For a quick test, try to think of who were the writers who wrote the front page story of any given mainstream publication this week. Let’s assume you passed that test, what else have you read by them?</p>
<p>By contrast, blogs offer the writer a single platform. When I visit a particular blog, I have a relationship with the writer. Over time, as I visit it more, I get to know the writer’s brand. This is important because several bloggers have already made the transition from the blogging world into traditional media on the strength of their audience.</p>
<p>This, of course, is a role reversal as blog writers now establish their own audience, decreasing the need for the much vaunted halo effect large publications can give them.</p>
<p>Going further, one has to think about the long term dollar (or Yuan, or Euro) value of a story. As I mentioned above, the traditional freelance writer hands over the story and gets paid. That’s where the money stops.</p>
<p>However, on a blog, the story stays online. If the blog is reasonably optimized to figure prominently in search engines, this is where it starts having a second life. Long after a blog post has been made, it still gets traffic. I see this here at TNL.net on a few popular posts covering areas that no one else seems to have bothered with. And this is where the incremental revenue starts to come in.</p>
<p>Once the entry has advertising on it, any revenue generated from that advertising goes to the blog writer. Initially, it’s not comparable to the thousand dollars the writer got from a mainstream publication but, if the entry has legs (i.e., it keeps serving an audience), it continues to generate money, pretty much until one of a few possible things happens:</p>
<ul>
<li>The writer decides to remove ads from his/her site</li>
<li>The writer decides to remove the entry from the site</li>
<li>The story has run its course and is no longer useful, or has been superseded by a better one</li>
</ul>
<p>If one writes with such a long run view, a story can generate several times what the initial payback was from a publication.</p>
<p>Another important thing, when one writes a weblog, is to ensure you promote your work properly. I read a number of weblogs (over 300 at this time) and have gotten familiar with a few of the writers. If I write an entry that may be of interest to them, I have no qualms about dropping them a quick email with the link to the story and the content of the full entry in it, along with a short note as to why I thought it might interest them. This kind of self promotion generally helps in driving initial traffic to the entry.</p>
<p>But this does not have to be something that is limited to bloggers. Writers can also take advantage of it. As per word rates have been decreasing, it has been easier to negotiate on certain terms in a contract. One of the terms a writer should always try to negotiate is the length of exclusivity on rights. I generally negotiate for rights on a piece to revert to me after a certain period of time (for example, exclusive rights being given until 90 days after initial publication). Writers will find that most editors are more willing to negotiate such terms (as they have some leeway there) than they are to increase the per word rate. Once a story is published, I then make an entry in my calendar (or in the case of the custom publishing solution I have running this site, in the content management system itself) to publish the story on the site 90 days after the first day the story was published (I check that either by having a copy of the print publication in hand or seeing it live on their site). That content then stays online for as long as I want. In some cases, it gets interesting as stories I wrote years ago are getting a little extra traffic and making some extra money. It’s the fire and forget approach.</p>
<p>The result is that there is a lot of money to be made in such archival content, small dimes adding up to, hopefully, full dollars.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2005/09/23/money-in-the-archives/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Introduces Blog Search</title>
		<link>http://www.tnl.net/blog/2005/09/13/google-introduces-blog-search/</link>
		<comments>http://www.tnl.net/blog/2005/09/13/google-introduces-blog-search/#comments</comments>
		<pubDate>Tue, 13 Sep 2005 18:10:15 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2005/09/13/google-introduces-blog-search/</guid>
		<description><![CDATA[Overnight, Google has made the blog search space a little more competitive with their introduction of a new blog search engine. New ones are also coming soon from Yahoo!, Microsoft, and AskJeeves (that last one being the main reason behind their acquisition of bloglines.) At first look, it seems to work fine. Blogs are easily [...]
]]></description>
			<content:encoded><![CDATA[<p>Overnight, Google has made the blog search space a little more competitive with their introduction of a new <a href="http://blogsearch.google.com/" title="Google Blog Search">blog search engine</a>. New ones are also coming soon from Yahoo!, Microsoft, and AskJeeves (that last one being the main reason behind their acquisition of bloglines.)</p>
<p>At first look, it seems to work fine. Blogs are easily findable, although it’s not clear what the ranking algorithm is.</p>
<p>One of the things I’d like to see, however, is for big players like Google to start offering a ping service. At the current time, every major blog search engine seems to be using <a href="http://www.weblogs.com" title="Weblogs.com">Weblogs.com</a> as their main alert mechanism. <a href="http://www.scripting.com" title="Scripting.com">Dave Winer</a> has been extremely generous in providing this service to the community but, now that for-profit entities are starting to use it, it’s time for them to step up and provide some help here. They could add to the community by offering equivalent services and making them as available as weblogs.com. Another way a company like Google (or Yahoo!, or Microsoft (hey, <a href="http://radio-weblogs.com/0001011/" title="Scobleizer">Scoble</a>, can you get involved here <img src='http://www.tnl.net/editor/wp/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  ), or…) could help would be to offer free hosting on their servers to Dave for the weblogs.com service. They have loads of bandwidth and servers and could provide it as a service to the community.</p>
<p>All in all, a good start, and let’s hope that we’ll see some of the improvements I suggested soon.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2005/09/13/google-introduces-blog-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Links and Search Engines: The MSN edition</title>
		<link>http://www.tnl.net/blog/2005/07/30/links-and-search-engines-the-msn-edition/</link>
		<comments>http://www.tnl.net/blog/2005/07/30/links-and-search-engines-the-msn-edition/#comments</comments>
		<pubDate>Sat, 30 Jul 2005 22:24:24 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[United States]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2005/07/30/links-and-search-engines-the-msn-edition/</guid>
		<description><![CDATA[I’ve been promising for a while to complete this series with results relating to MSN (and, for the record, this has nothing to do with Scoble begging for it). I finally got around to cleaning up the HTML output of Excel and can now present the third (and probably final) installment in my analysis of [...]
]]></description>
			<content:encoded><![CDATA[<p>I’ve been promising for a while to complete this series with results relating to MSN (and, for the record, this has nothing to do with <a href="http://radio-weblogs.com/0001011/" title="Robert Scoble">Scoble</a> begging for it). I finally got around to cleaning up the HTML output of Excel and can now present the third (and probably final) installment in my analysis of search engine link features.</p>
<p>To recap, I took the list of Top 100 blogs listed by Technorati on May 19th, 2005 and started doing side-by-side comparisons. I first looked at <a href="http://www.tnl.net/blog/2005/06/01/secrets-of-the-a-list-bloggers-technorati-links/" title="TNL.net: Secrets of the A-List Bloggers: Technorati Links">distribution of links among the top 100</a>, then followed up with <a href="http://www.tnl.net/blog/2005/06/13/secrets-of-the-a-list-bloggers-technorati-vs-google/" title="TNL.net: Secrets of the A-list bloggers: Technorati vs. Google">an analysis of Technorati against Google</a>, which brought me to <a href="http://www.tnl.net/blog/2005/06/20/technorati-yahoo-and-google-too/" title="TNL.net: Technorati Yahoo and Google Too">a subsequent chapter on Technorati against Google and Yahoo! (then comparing Google and Yahoo! to each other)</a>. All this created a fair amount of buzz in the search world, with reactions ranging from people saying it was interesting to others saying I was way off the mark. Either way, it’s time to take a look at MSN in order to complete this round-up.</p>
<p>So, to create some benchmarks, let’s start by looking at the distribution of Technorati links against MSN’s:</p>
<table border="1" summary="Technorati vs MSN">
<tr>
<th>Technorati Top 100</th>
<th>MSN Links</th>
<th>Technorati Links</th>
<th>Technorati/MSN Links</th>
</tr>
<tr>
<td>Boing Boing</td>
<td>407172</td>
<td>22532</td>
<td>5.53378%</td>
</tr>
<tr>
<td>InstaPundit</td>
<td>241472</td>
<td>15190</td>
<td>6.29058%</td>
</tr>
<tr>
<td>Daily Kos</td>
<td>184666</td>
<td>15833</td>
<td>8.57386%</td>
</tr>
<tr>
<td>Gizmodo</td>
<td>252869</td>
<td>12278</td>
<td>4.85548%</td>
</tr>
<tr>
<td>Fark</td>
<td>352289</td>
<td>10216</td>
<td>2.89989%</td>
</tr>
<tr>
<td>EnGadget</td>
<td>198584</td>
<td>15051</td>
<td>7.57916%</td>
</tr>
<tr>
<td>Davenetics</td>
<td>3334</td>
<td>7571</td>
<td>227.08458%</td>
</tr>
<tr>
<td>Eschaton</td>
<td>138241</td>
<td>8713</td>
<td>6.30276%</td>
</tr>
<tr>
<td>Dooce</td>
<td>118385</td>
<td>6797</td>
<td>5.74144%</td>
</tr>
<tr>
<td>Andrew Sullivan</td>
<td>96315</td>
<td>7680</td>
<td>7.97384%</td>
</tr>
<tr>
<td>The Best Page In The Universe</td>
<td>92232</td>
<td>6333</td>
<td>6.86638%</td>
</tr>
<tr>
<td>Talking Points Memo: by Joshua Micah Marshall</td>
<td>193438</td>
<td>7592</td>
<td>3.92477%</td>
</tr>
<tr>
<td>lgf: anti-idiotarian</td>
<td>6067</td>
<td>8275</td>
<td>136.39360%</td>
</tr>
<tr>
<td>kottke.org</td>
<td>159861</td>
<td>7278</td>
<td>4.55271%</td>
</tr>
<tr>
<td>WIL WHEATON DOT NET</td>
<td>148587</td>
<td>6314</td>
<td>4.24936%</td>
</tr>
<tr>
<td>Metafilter</td>
<td>136052</td>
<td>7591</td>
<td>5.57948%</td>
</tr>
<tr>
<td>Doc Searls</td>
<td>95781</td>
<td>5690</td>
<td>5.94064%</td>
</tr>
<tr>
<td>(In)formacao e (In)utilidade</td>
<td>3272</td>
<td>6040</td>
<td>184.59658%</td>
</tr>
<tr>
<td>Wonkette</td>
<td>96768</td>
<td>5877</td>
<td>6.07329%</td>
</tr>
<tr>
<td>Scripting News</td>
<td>183067</td>
<td>5728</td>
<td>3.12891%</td>
</tr>
<tr>
<td>Power Line</td>
<td>92069</td>
<td>7477</td>
<td>8.12108%</td>
</tr>
<tr>
<td>Balmasque</td>
<td>409</td>
<td>4544</td>
<td>1111.00244%</td>
</tr>
<tr>
<td>Corante</td>
<td>23107</td>
<td>7686</td>
<td>33.26265%</td>
</tr>
<tr>
<td>A list Apart</td>
<td>220584</td>
<td>5536</td>
<td>2.50970%</td>
</tr>
<tr>
<td>Something Awful</td>
<td>97908</td>
<td>4512</td>
<td>4.60841%</td>
</tr>
<tr>
<td>Megatokyo</td>
<td>112902</td>
<td>4154</td>
<td>3.67930%</td>
</tr>
<tr>
<td>Michelle Malkin</td>
<td>72190</td>
<td>6091</td>
<td>8.43746%</td>
</tr>
<tr>
<td>Arts and Letters Daily</td>
<td>94718</td>
<td>3983</td>
<td>4.20511%</td>
</tr>
<tr>
<td>Gawker</td>
<td>72773</td>
<td>4453</td>
<td>6.11903%</td>
</tr>
<tr>
<td>Afterall it was the best I ever had</td>
<td>922</td>
<td>3591</td>
<td>389.47939%</td>
</tr>
<tr>
<td>The Volokh Conspiracy</td>
<td>88818</td>
<td>5873</td>
<td>6.61240%</td>
</tr>
<tr>
<td>Scobleizer</td>
<td>68282</td>
<td>5524</td>
<td>8.08998%</td>
</tr>
<tr>
<td>Jeffrey Zeldman</td>
<td>149539</td>
<td>4134</td>
<td>2.76450%</td>
</tr>
<tr>
<td>This Modern World</td>
<td>79038</td>
<td>3913</td>
<td>4.95078%</td>
</tr>
<tr>
<td>The Web Standards Project</td>
<td>211917</td>
<td>3810</td>
<td>1.79787%</td>
</tr>
<tr>
<td>Joel on Software</td>
<td>133853</td>
<td>4514</td>
<td>3.37236%</td>
</tr>
<tr>
<td>Media Matters for America</td>
<td>64867</td>
<td>6809</td>
<td>10.49686%</td>
</tr>
<tr>
<td>Television without pity</td>
<td>46391</td>
<td>3859</td>
<td>8.31842%</td>
</tr>
<tr>
<td>Kuro5hin</td>
<td>130549</td>
<td>4208</td>
<td>3.22331%</td>
</tr>
<tr>
<td>Lileks</td>
<td>50706</td>
<td>3824</td>
<td>7.54151%</td>
</tr>
<tr>
<td>Hugh Hewitt</td>
<td>64118</td>
<td>4573</td>
<td>7.13216%</td>
</tr>
<tr>
<td>Joel Veitch</td>
<td>23302</td>
<td>3774</td>
<td>16.19603%</td>
</tr>
<tr>
<td>Truthout</td>
<td>42693</td>
<td>6528</td>
<td>15.29056%</td>
</tr>
<tr>
<td>Baghdad Burning</td>
<td>51647</td>
<td>3519</td>
<td>6.81356%</td>
</tr>
<tr>
<td>Buzz machine</td>
<td>72649</td>
<td>4145</td>
<td>5.70552%</td>
</tr>
<tr>
<td>fleugel</td>
<td>201995</td>
<td>3670</td>
<td>1.81688%</td>
</tr>
<tr>
<td>Informed Comment</td>
<td>62822</td>
<td>3905</td>
<td>6.21598%</td>
</tr>
<tr>
<td>Doppler: redefining podcasting</td>
<td>12512</td>
<td>3040</td>
<td>24.29668%</td>
</tr>
<tr>
<td>geek and proud</td>
<td>714</td>
<td>3166</td>
<td>443.41737%</td>
</tr>
<tr>
<td>loadmemory (Asian site)</td>
<td>198</td>
<td>3324</td>
<td>1678.78788%</td>
</tr>
<tr>
<td>Photojunkie</td>
<td>3721</td>
<td>2860</td>
<td>76.86106%</td>
</tr>
<tr>
<td>Ross Rader</td>
<td>4830</td>
<td>2976</td>
<td>61.61491%</td>
</tr>
<tr>
<td>The Truth Laid Bear</td>
<td>51806</td>
<td>4127</td>
<td>7.96626%</td>
</tr>
<tr>
<td>Joi Ito</td>
<td>62642</td>
<td>5165</td>
<td>8.24527%</td>
</tr>
<tr>
<td>ScrappleFace</td>
<td>49953</td>
<td>3480</td>
<td>6.96655%</td>
</tr>
<tr>
<td>LexText</td>
<td>1741</td>
<td>2671</td>
<td>153.41758%</td>
</tr>
<tr>
<td>Google Blog</td>
<td>42967</td>
<td>3688</td>
<td>8.58333%</td>
</tr>
<tr>
<td>Xbox</td>
<td>86021</td>
<td>4221</td>
<td>4.90694%</td>
</tr>
<tr>
<td>My life in a Bush of Ghosts</td>
<td>12</td>
<td>2519</td>
<td>20991.66667%</td>
</tr>
<tr>
<td>Astronomy picture of the day</td>
<td>33625</td>
<td>3498</td>
<td>10.40297%</td>
</tr>
<tr>
<td>Crooked Timber</td>
<td>60675</td>
<td>3617</td>
<td>5.96127%</td>
</tr>
<tr>
<td>Vodka Pundit</td>
<td>58205</td>
<td>3085</td>
<td>5.30023%</td>
</tr>
<tr>
<td>Captain’s quarter</td>
<td>45609</td>
<td>3671</td>
<td>8.04885%</td>
</tr>
<tr>
<td>A small victory</td>
<td>54767</td>
<td>3223</td>
<td>5.88493%</td>
</tr>
<tr>
<td>Gato Fedorento</td>
<td>2294</td>
<td>2574</td>
<td>112.20575%</td>
</tr>
<tr>
<td>Mezzoblue</td>
<td>99511</td>
<td>2952</td>
<td>2.96651%</td>
</tr>
<tr>
<td>PostSecret</td>
<td>30794</td>
<td>2707</td>
<td>8.79067%</td>
</tr>
<tr>
<td>Samizdata.net</td>
<td>1712</td>
<td>2872</td>
<td>167.75701%</td>
</tr>
<tr>
<td>Lawrence Lessig</td>
<td>81047</td>
<td>2949</td>
<td>3.63863%</td>
</tr>
<tr>
<td>Counterpunch</td>
<td>52642</td>
<td>3278</td>
<td>6.22697%</td>
</tr>
<tr>
<td>Democratic Underground</td>
<td>35595</td>
<td>3913</td>
<td>10.99312%</td>
</tr>
<tr>
<td>Right Wing News</td>
<td>61379</td>
<td>2967</td>
<td>4.83390%</td>
</tr>
<tr>
<td>StopDesign</td>
<td>86165</td>
<td>3037</td>
<td>3.52463%</td>
</tr>
<tr>
<td>iBiblio</td>
<td>32301</td>
<td>3105</td>
<td>9.61271%</td>
</tr>
<tr>
<td>Samizdata.net (mistake?)</td>
<td>61443</td>
<td>2743</td>
<td>4.46430%</td>
</tr>
<tr>
<td>Abrupto</td>
<td>2698</td>
<td>2935</td>
<td>108.78428%</td>
</tr>
<tr>
<td>gene7299 (Asian MSNSpaces site)</td>
<td>28</td>
<td>3215</td>
<td>11482.14286%</td>
</tr>
<tr>
<td>Where is Raed?</td>
<td>24848</td>
<td>2409</td>
<td>9.69495%</td>
</tr>
<tr>
<td>B3TA: We love the web</td>
<td>38386</td>
<td>2614</td>
<td>6.80977%</td>
</tr>
<tr>
<td>Talkleft</td>
<td>60169</td>
<td>2901</td>
<td>4.82142%</td>
</tr>
<tr>
<td>Wizbang</td>
<td>60259</td>
<td>3358</td>
<td>5.57261%</td>
</tr>
<tr>
<td>m1net (MSN spaces site)</td>
<td>22</td>
<td>3548</td>
<td>16127.27273%</td>
</tr>
<tr>
<td>Hoder</td>
<td>1620</td>
<td>5422</td>
<td>334.69136%</td>
</tr>
<tr>
<td>CTRL+Alt+Del</td>
<td>32277</td>
<td>2315</td>
<td>7.17229%</td>
</tr>
<tr>
<td>Brad DeLong</td>
<td>48403</td>
<td>2715</td>
<td>5.60916%</td>
</tr>
<tr>
<td>Blogs for Bush</td>
<td>50820</td>
<td>3560</td>
<td>7.00512%</td>
</tr>
<tr>
<td>Neil Gaiman</td>
<td>71916</td>
<td>2194</td>
<td>3.05078%</td>
</tr>
<tr>
<td>Gothamist</td>
<td>47848</td>
<td>2729</td>
<td>5.70348%</td>
</tr>
<tr>
<td>Thought Mechanics</td>
<td>60736</td>
<td>2197</td>
<td>3.61729%</td>
</tr>
<tr>
<td>IMAO</td>
<td>45822</td>
<td>2905</td>
<td>6.33975%</td>
</tr>
<tr>
<td>Dan Gillmor (old weblog)</td>
<td>36369</td>
<td>2600</td>
<td>7.14895%</td>
</tr>
<tr>
<td>HINAGATA</td>
<td>176519</td>
<td>2186</td>
<td>1.23839%</td>
</tr>
<tr>
<td>Dean’s World</td>
<td>53150</td>
<td>2985</td>
<td>5.61618%</td>
</tr>
<tr>
<td>Defamer</td>
<td>49132</td>
<td>2372</td>
<td>4.82781%</td>
</tr>
<tr>
<td>USS Clueless</td>
<td>64725</td>
<td>2570</td>
<td>3.97065%</td>
</tr>
<tr>
<td>Dive into Mark</td>
<td>54167</td>
<td>2540</td>
<td>4.68920%</td>
</tr>
<tr>
<td>Pandagon</td>
<td>51286</td>
<td>2822</td>
<td>5.50248%</td>
</tr>
<tr>
<td>Blogging.la</td>
<td>8495</td>
<td>3061</td>
<td>36.03296%</td>
</tr>
<tr>
<td>Why are you worshipping the ground I blog on?</td>
<td>3481</td>
<td>2238</td>
<td>64.29187%</td>
</tr>
<tr>
<td>Daring Fireball</td>
<td>52381</td>
<td>2573</td>
<td>4.91209%</td>
</tr>
</table>
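<p>For anyone who wants to recheck the last column, it is simply Technorati links divided by MSN links, expressed as a percentage. Here is a minimal Python sketch using three rows from the table; the function name and the choice of rows are for illustration only.</p>

```python
def technorati_to_msn_percent(technorati_links, msn_links):
    """Technorati link count as a percentage of the MSN link count."""
    if msn_links == 0:
        raise ValueError("MSN link count must be non-zero")
    return 100.0 * technorati_links / msn_links


# (blog, MSN links, Technorati links), copied from the table above.
SAMPLE_ROWS = [
    ("Boing Boing", 407172, 22532),
    ("Davenetics", 3334, 7571),
    ("Balmasque", 409, 4544),
]

if __name__ == "__main__":
    for blog, msn, technorati in SAMPLE_ROWS:
        percent = technorati_to_msn_percent(technorati, msn)
        print(f"{blog}: {percent:.5f}%")
```

<p>A value above 100% means Technorati reports more inbound links than MSN does for that blog, as with Davenetics and Balmasque.</p>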
<p>Of course, no big surprise here. This seems to be pretty consistent with what I had found in dealing with Google and Yahoo!, showing that Technorati does a good but not complete job at indexing link-backs. What’s interesting, however, is that Technorati seems to have a different pattern when dealing with MSN than it does with Yahoo or Google. Let me show you what I’m talking about. Following is the pattern of Technorati differential with MSN:<br />
<img src="http://www.tnl.net/assets/images/blog/secrets/TM2.gif" alt="Technorati vs. MSN" /><br />
… and here is the differential between Technorati and Yahoo:<br />
<img src="http://www.tnl.net/assets/images/blog/secrets/TY2.gif" alt="Technorati vs. Yahoo" /><br />
… and finally, the same graph for Technorati and Google:<br />
<img src="http://www.tnl.net/assets/images/blog/secrets/TGaverages.gif" alt="Technorati vs. Google" /></p>
<p>I’ve been trying to understand why this is and, to be fully honest, still have no clear answer. It could be something; it could be nothing. Not knowing whether there is anything there was, in large part, one of the things that made this entry frustrating to work on.</p>
<h3>Comparing the Search Engines</h3>
<p>However, the picture gets more interesting when you put the three search engines side by side. Here’s a quick spreadsheet of the results:</p>
<table border="1" summary="side by side">
<tr>
<th>Technorati Top 100</th>
<th>Google Links</th>
<th>Yahoo Links</th>
<th>MSN Links</th>
<th>MSN Links/Google Links</th>
<th>MSN Links/Yahoo Links</th>
</tr>
<tr>
<td>Boing Boing</td>
<td>45200</td>
<td>1880000</td>
<td>407172</td>
<td>900.8230%</td>
<td>21.6581%</td>
</tr>
<tr>
<td>InstaPundit</td>
<td>75000</td>
<td>2160000</td>
<td>241472</td>
<td>321.9627%</td>
<td>11.1793%</td>
</tr>
<tr>
<td>Daily Kos</td>
<td>59800</td>
<td>1690000</td>
<td>184666</td>
<td>308.8060%</td>
<td>10.9270%</td>
</tr>
<tr>
<td>Gizmodo</td>
<td>39300</td>
<td>1970000</td>
<td>252869</td>
<td>643.4326%</td>
<td>12.8360%</td>
</tr>
<tr>
<td>Fark</td>
<td>43600</td>
<td>1420000</td>
<td>352289</td>
<td>808.0023%</td>
<td>24.8091%</td>
</tr>
<tr>
<td>EnGadget</td>
<td>46800</td>
<td>2820000</td>
<td>198584</td>
<td>424.3248%</td>
<td>7.0420%</td>
</tr>
<tr>
<td>Davenetics</td>
<td>1780</td>
<td>66400</td>
<td>3334</td>
<td>187.3034%</td>
<td>5.0211%</td>
</tr>
<tr>
<td>Eschaton</td>
<td>62400</td>
<td>1400000</td>
<td>138241</td>
<td>221.5401%</td>
<td>9.8744%</td>
</tr>
<tr>
<td>Dooce</td>
<td>23600</td>
<td>653000</td>
<td>118385</td>
<td>501.6314%</td>
<td>18.1294%</td>
</tr>
<tr>
<td>Andrew Sullivan</td>
<td>41100</td>
<td>1260000</td>
<td>96315</td>
<td>234.3431%</td>
<td>7.6440%</td>
</tr>
<tr>
<td>The Best Page In The Universe</td>
<td>656</td>
<td>62000</td>
<td>92232</td>
<td>14059.7561%</td>
<td>148.7613%</td>
</tr>
<tr>
<td>Talking Points Memo: by Joshua Micah Marshall</td>
<td>74600</td>
<td>563000</td>
<td>193438</td>
<td>259.3003%</td>
<td>34.3584%</td>
</tr>
<tr>
<td>lgf: anti-idiotarian</td>
<td>14700</td>
<td>49300</td>
<td>6067</td>
<td>41.2721%</td>
<td>12.3063%</td>
</tr>
<tr>
<td>kottke.org</td>
<td>32000</td>
<td>1200000</td>
<td>159861</td>
<td>499.5656%</td>
<td>13.3218%</td>
</tr>
<tr>
<td>WIL WHEATON DOT NET</td>
<td>16900</td>
<td>564000</td>
<td>148587</td>
<td>879.2130%</td>
<td>26.3452%</td>
</tr>
<tr>
<td>Metafilter</td>
<td>34500</td>
<td>1160000</td>
<td>136052</td>
<td>394.3536%</td>
<td>11.7286%</td>
</tr>
<tr>
<td>Doc Searls</td>
<td>33600</td>
<td>1150000</td>
<td>95781</td>
<td>285.0625%</td>
<td>8.3288%</td>
</tr>
<tr>
<td>(In)formacao e (In)utilidade</td>
<td>1780</td>
<td>110000</td>
<td>3272</td>
<td>183.8202%</td>
<td>2.9745%</td>
</tr>
<tr>
<td>Wonkette</td>
<td>28800</td>
<td>1370000</td>
<td>96768</td>
<td>336.0000%</td>
<td>7.0634%</td>
</tr>
<tr>
<td>Scripting News</td>
<td>39400</td>
<td>1470000</td>
<td>183067</td>
<td>464.6371%</td>
<td>12.4535%</td>
</tr>
<tr>
<td>Power Line</td>
<td>7510</td>
<td>344000</td>
<td>92069</td>
<td>1225.9521%</td>
<td>26.7642%</td>
</tr>
<tr>
<td>Balmasque</td>
<td>24</td>
<td>40500</td>
<td>409</td>
<td>1704.1667%</td>
<td>1.0099%</td>
</tr>
<tr>
<td>Corante</td>
<td>6770</td>
<td>265000</td>
<td>23107</td>
<td>341.3146%</td>
<td>8.7196%</td>
</tr>
<tr>
<td>A list Apart</td>
<td>21100</td>
<td>620000</td>
<td>220584</td>
<td>1045.4218%</td>
<td>35.5781%</td>
</tr>
<tr>
<td>Something Awful</td>
<td>9020</td>
<td>372000</td>
<td>97908</td>
<td>1085.4545%</td>
<td>26.3194%</td>
</tr>
<tr>
<td>Megatokyo</td>
<td>7310</td>
<td>361000</td>
<td>112902</td>
<td>1544.4870%</td>
<td>31.2748%</td>
</tr>
<tr>
<td>Michelle Malkin</td>
<td>17300</td>
<td>537000</td>
<td>72190</td>
<td>417.2832%</td>
<td>13.4432%</td>
</tr>
<tr>
<td>Arts and Letters Daily</td>
<td>23900</td>
<td>866000</td>
<td>94718</td>
<td>396.3096%</td>
<td>10.9374%</td>
</tr>
<tr>
<td>Gawker</td>
<td>23500</td>
<td>1060000</td>
<td>72773</td>
<td>309.6723%</td>
<td>6.8654%</td>
</tr>
<tr>
<td>Afterall it was the best I ever had</td>
<td>95</td>
<td>34900</td>
<td>922</td>
<td>970.5263%</td>
<td>2.6418%</td>
</tr>
<tr>
<td>The Volokh Conspiracy</td>
<td>42000</td>
<td>1190000</td>
<td>88818</td>
<td>211.4714%</td>
<td>7.4637%</td>
</tr>
<tr>
<td>Scobleizer</td>
<td>21800</td>
<td>937000</td>
<td>68282</td>
<td>313.2202%</td>
<td>7.2873%</td>
</tr>
<tr>
<td>Jeffrey Zeldman</td>
<td>22500</td>
<td>528000</td>
<td>149539</td>
<td>664.6178%</td>
<td>28.3218%</td>
</tr>
<tr>
<td>This Modern World</td>
<td>32100</td>
<td>813000</td>
<td>79038</td>
<td>246.2243%</td>
<td>9.7218%</td>
</tr>
<tr>
<td>The Web Standards Project</td>
<td>1850</td>
<td>59800</td>
<td>211917</td>
<td>11454.9730%</td>
<td>354.3763%</td>
</tr>
<tr>
<td>Joel on Software</td>
<td>22400</td>
<td>966000</td>
<td>133853</td>
<td>597.5580%</td>
<td>13.8564%</td>
</tr>
<tr>
<td>Media Matters for America</td>
<td>24800</td>
<td>536000</td>
<td>64867</td>
<td>261.5605%</td>
<td>12.1021%</td>
</tr>
<tr>
<td>Television without pity</td>
<td>13300</td>
<td>356000</td>
<td>46391</td>
<td>348.8045%</td>
<td>13.0312%</td>
</tr>
<tr>
<td>Kuro5hin</td>
<td>17300</td>
<td>866000</td>
<td>130549</td>
<td>754.6185%</td>
<td>15.0749%</td>
</tr>
<tr>
<td>Lileks</td>
<td>&#160;</td>
<td>39700</td>
<td>50706</td>
<td>&#160;</td>
<td>127.7229%</td>
</tr>
<tr>
<td>Hugh Hewitt</td>
<td>26700</td>
<td>929000</td>
<td>64118</td>
<td>240.1423%</td>
<td>6.9018%</td>
</tr>
<tr>
<td>Joel Veitch</td>
<td>2830</td>
<td>135000</td>
<td>23302</td>
<td>823.3922%</td>
<td>17.2607%</td>
</tr>
<tr>
<td>Truthout</td>
<td>8780</td>
<td>371000</td>
<td>42693</td>
<td>486.2528%</td>
<td>11.5075%</td>
</tr>
<tr>
<td>Baghdad Burning</td>
<td>22700</td>
<td>552000</td>
<td>51647</td>
<td>227.5198%</td>
<td>9.3563%</td>
</tr>
<tr>
<td>Buzz machine</td>
<td>30600</td>
<td>1010000</td>
<td>72649</td>
<td>237.4150%</td>
<td>7.1930%</td>
</tr>
<tr>
<td>fleugel</td>
<td>1890</td>
<td>201000</td>
<td>201995</td>
<td>10687.5661%</td>
<td>100.4950%</td>
</tr>
<tr>
<td>Informed Comment</td>
<td>27900</td>
<td>787000</td>
<td>62822</td>
<td>225.1685%</td>
<td>7.9825%</td>
</tr>
<tr>
<td>Doppler: redefining podcasting</td>
<td>4420</td>
<td>607000</td>
<td>12512</td>
<td>283.0769%</td>
<td>2.0613%</td>
</tr>
<tr>
<td>geek and proud</td>
<td>355</td>
<td>9110</td>
<td>714</td>
<td>201.1268%</td>
<td>7.8375%</td>
</tr>
<tr>
<td>loadmemory (Asian site)</td>
<td>83</td>
<td>1550</td>
<td>198</td>
<td>238.5542%</td>
<td>12.7742%</td>
</tr>
<tr>
<td>Photojunkie</td>
<td>1540</td>
<td>51200</td>
<td>3721</td>
<td>241.6234%</td>
<td>7.2676%</td>
</tr>
<tr>
<td>Ross Rader</td>
<td>1070</td>
<td>48200</td>
<td>4830</td>
<td>451.4019%</td>
<td>10.0207%</td>
</tr>
<tr>
<td>The Truth Laid Bear</td>
<td>23900</td>
<td>717000</td>
<td>51806</td>
<td>216.7615%</td>
<td>7.2254%</td>
</tr>
<tr>
<td>Joi Ito</td>
<td>23400</td>
<td>1050000</td>
<td>62642</td>
<td>267.7009%</td>
<td>5.9659%</td>
</tr>
<tr>
<td>ScrappleFace</td>
<td>31100</td>
<td>807000</td>
<td>49953</td>
<td>160.6206%</td>
<td>6.1900%</td>
</tr>
<tr>
<td>LexText</td>
<td>1970</td>
<td>31200</td>
<td>1741</td>
<td>88.3756%</td>
<td>5.5801%</td>
</tr>
<tr>
<td>Google Blog</td>
<td>46</td>
<td>297000</td>
<td>42967</td>
<td>93406.5217%</td>
<td>14.4670%</td>
</tr>
<tr>
<td>Xbox</td>
<td>6600</td>
<td>237000</td>
<td>86021</td>
<td>1303.3485%</td>
<td>36.2958%</td>
</tr>
<tr>
<td>My life in a Bush of Ghosts</td>
<td>6</td>
<td>903</td>
<td>12</td>
<td>200.0000%</td>
<td>1.3289%</td>
</tr>
<tr>
<td>Astronomy picture of the day</td>
<td>5020</td>
<td>113000</td>
<td>33625</td>
<td>669.8207%</td>
<td>29.7566%</td>
</tr>
<tr>
<td>Crooked Timber</td>
<td>3560</td>
<td>67500</td>
<td>60675</td>
<td>1704.3539%</td>
<td>89.8889%</td>
</tr>
<tr>
<td>Vodka Pundit</td>
<td>4520</td>
<td>169000</td>
<td>58205</td>
<td>1287.7212%</td>
<td>34.4408%</td>
</tr>
<tr>
<td>Captain’s quarter</td>
<td>27100</td>
<td>730000</td>
<td>45609</td>
<td>168.2989%</td>
<td>6.2478%</td>
</tr>
<tr>
<td>A small victory</td>
<td>16700</td>
<td>460000</td>
<td>54767</td>
<td>327.9461%</td>
<td>11.9059%</td>
</tr>
<tr>
<td>Gato Fedorento</td>
<td>1630</td>
<td>126000</td>
<td>2294</td>
<td>140.7362%</td>
<td>1.8206%</td>
</tr>
<tr>
<td>Mezzoblue</td>
<td>12000</td>
<td>278000</td>
<td>99511</td>
<td>829.2583%</td>
<td>35.7953%</td>
</tr>
<tr>
<td>PostSecret</td>
<td>5790</td>
<td>202000</td>
<td>30794</td>
<td>531.8480%</td>
<td>15.2446%</td>
</tr>
<tr>
<td>Samizdata.net</td>
<td>1050</td>
<td>18000</td>
<td>1712</td>
<td>163.0476%</td>
<td>9.5111%</td>
</tr>
<tr>
<td>Lawrence Lessig</td>
<td>30600</td>
<td>959000</td>
<td>81047</td>
<td>264.8595%</td>
<td>8.4512%</td>
</tr>
<tr>
<td>Counterpunch</td>
<td>11700</td>
<td>295000</td>
<td>52642</td>
<td>449.9316%</td>
<td>17.8447%</td>
</tr>
<tr>
<td>Democratic Underground</td>
<td>14900</td>
<td>417000</td>
<td>35595</td>
<td>238.8926%</td>
<td>8.5360%</td>
</tr>
<tr>
<td>Right Wing News</td>
<td>27900</td>
<td>794000</td>
<td>61379</td>
<td>219.9964%</td>
<td>7.7304%</td>
</tr>
<tr>
<td>StopDesign</td>
<td>10200</td>
<td>255000</td>
<td>86165</td>
<td>844.7549%</td>
<td>33.7902%</td>
</tr>
<tr>
<td>iBiblio</td>
<td>9730</td>
<td>197000</td>
<td>32301</td>
<td>331.9733%</td>
<td>16.3964%</td>
</tr>
<tr>
<td>Samizdata.net (mistake?)</td>
<td>25500</td>
<td>697000</td>
<td>61443</td>
<td>240.9529%</td>
<td>8.8154%</td>
</tr>
<tr>
<td>Abrupto</td>
<td>550</td>
<td>44700</td>
<td>2698</td>
<td>490.5455%</td>
<td>6.0358%</td>
</tr>
<tr>
<td>gene7299 (Asian MSNSpaces site)</td>
<td>58</td>
<td>764</td>
<td>28</td>
<td>48.2759%</td>
<td>3.6649%</td>
</tr>
<tr>
<td>Where is Raed?</td>
<td>10100</td>
<td>232000</td>
<td>24848</td>
<td>246.0198%</td>
<td>10.7103%</td>
</tr>
<tr>
<td>B3TA: We love the web</td>
<td>12000</td>
<td>839000</td>
<td>38386</td>
<td>319.8833%</td>
<td>4.5752%</td>
</tr>
<tr>
<td>Talkleft</td>
<td>7170</td>
<td>221000</td>
<td>60169</td>
<td>839.1771%</td>
<td>27.2258%</td>
</tr>
<tr>
<td>Wizbang</td>
<td>21000</td>
<td>634000</td>
<td>60259</td>
<td>286.9476%</td>
<td>9.5046%</td>
</tr>
<tr>
<td>m1net (MSN spaces site)</td>
<td>104</td>
<td>579</td>
<td>22</td>
<td>21.1538%</td>
<td>3.7997%</td>
</tr>
<tr>
<td>Hoder</td>
<td>1480</td>
<td>20900</td>
<td>1620</td>
<td>109.4595%</td>
<td>7.7512%</td>
</tr>
<tr>
<td>CTRL+Alt+Del</td>
<td>2310</td>
<td>171000</td>
<td>32277</td>
<td>1397.2727%</td>
<td>18.8754%</td>
</tr>
<tr>
<td>Brad DeLong</td>
<td>30100</td>
<td>882000</td>
<td>48403</td>
<td>160.8073%</td>
<td>5.4879%</td>
</tr>
<tr>
<td>Blogs for Bush</td>
<td>16200</td>
<td>824000</td>
<td>50820</td>
<td>313.7037%</td>
<td>6.1675%</td>
</tr>
<tr>
<td>Neil Gaiman</td>
<td>13700</td>
<td>319000</td>
<td>71916</td>
<td>524.9343%</td>
<td>22.5442%</td>
</tr>
<tr>
<td>Gothamist</td>
<td>15200</td>
<td>491000</td>
<td>47848</td>
<td>314.7895%</td>
<td>9.7450%</td>
</tr>
<tr>
<td>Thought Mechanics</td>
<td>4400</td>
<td>190000</td>
<td>60736</td>
<td>1380.3636%</td>
<td>31.9663%</td>
</tr>
<tr>
<td>IMAO</td>
<td>23800</td>
<td>407000</td>
<td>45822</td>
<td>192.5294%</td>
<td>11.2585%</td>
</tr>
<tr>
<td>Dan Gillmor (old weblog)</td>
<td>10800</td>
<td>298000</td>
<td>36369</td>
<td>336.7500%</td>
<td>12.2044%</td>
</tr>
<tr>
<td>HINAGATA</td>
<td>10100</td>
<td>21100</td>
<td>176519</td>
<td>1747.7129%</td>
<td>836.5829%</td>
</tr>
<tr>
<td>Dean’s World</td>
<td>30600</td>
<td>784000</td>
<td>53150</td>
<td>173.6928%</td>
<td>6.7793%</td>
</tr>
<tr>
<td>Defamer</td>
<td>9310</td>
<td>725000</td>
<td>49132</td>
<td>527.7336%</td>
<td>6.7768%</td>
</tr>
<tr>
<td>USS Clueless</td>
<td>8470</td>
<td>264000</td>
<td>64725</td>
<td>764.1677%</td>
<td>24.5170%</td>
</tr>
<tr>
<td>Dive into Mark</td>
<td>14600</td>
<td>235000</td>
<td>54167</td>
<td>371.0068%</td>
<td>23.0498%</td>
</tr>
<tr>
<td>Pandagon</td>
<td>27300</td>
<td>743000</td>
<td>51286</td>
<td>187.8608%</td>
<td>6.9026%</td>
</tr>
<tr>
<td>Blogging.la</td>
<td>3200</td>
<td>67700</td>
<td>8495</td>
<td>265.4688%</td>
<td>12.5480%</td>
</tr>
<tr>
<td>Why are you worshipping the ground I blog on?</td>
<td>1430</td>
<td>85000</td>
<td>3481</td>
<td>243.4266%</td>
<td>4.0953%</td>
</tr>
<tr>
<td>Daring Fireball</td>
<td>12000</td>
<td>221000</td>
<td>52381</td>
<td>436.5083%</td>
<td>23.7018%</td>
</tr>
</table>
<p>The most interesting thing here is that the MSN numbers seem to support the assertion I had made about Google not reporting as many links as Yahoo does: for most of these sites, MSN also reports more links than Google. There were, however, a few surprises as far as I’m concerned:</p>
<ul>
<li>Sites located in the United States seem to fare better on MSN than other sites do. Google and Yahoo seem to have a stronger indexing presence outside the US than MSN does.</li>
<li>MSN Spaces sites are not getting particularly great representation in MSN Search compared to its competitors. I was surprised by this, since they are part of the same service.</li>
</ul>
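<p>For anyone who wants to recompute the two ratio columns, here is a minimal Python sketch. It assumes each ratio is simply the MSN count divided by the other engine’s count, which matches the figures in the table; the function name is made up for illustration and the three sample rows are copied from the table.</p>

```python
def ratio_percent(numerator, denominator):
    """Return numerator/denominator as a percentage string, or None when the
    denominator is missing (as with the blank Google count for Lileks)."""
    if not denominator:
        return None
    return f"{numerator / denominator * 100:.4f}%"


# (site, Google links, Yahoo links, MSN links), copied from the table above
rows = [
    ("Boing Boing", 45200, 1880000, 407172),
    ("lgf: anti-idiotarian", 14700, 49300, 6067),
    ("Lileks", None, 39700, 50706),
]

for site, google, yahoo, msn in rows:
    # Prints the site followed by MSN/Google and MSN/Yahoo
    print(site, ratio_percent(msn, google), ratio_percent(msn, yahoo))
```

<p>Returning <code>None</code> for a missing count keeps rows like Lileks in the output instead of failing on a division by zero.</p>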
<h3>Conclusions and more!</h3>
<p>So there you have it: no great insight here, apart from the fact that this linking stuff is interesting and that even small-scale analysis can bring up some interesting trends. As I mentioned before, I am not an expert on this; I simply thought I would put the numbers together and start an analysis. However, I know that this series has attracted experts, so here’s a deal: I’m making the spreadsheet of data I compiled available under a Creative Commons license (Attribution-ShareAlike) here on TNL.net. If you manage to do anything interesting with it, drop me a note and please make sure that you share it with the wider public. Enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2005/07/30/links-and-search-engines-the-msn-edition/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>RSS and Media: Can’t we all just get along?</title>
		<link>http://www.tnl.net/blog/2005/06/29/rss-and-media-cant-we-all-just-get-along/</link>
		<comments>http://www.tnl.net/blog/2005/06/29/rss-and-media-cant-we-all-just-get-along/#comments</comments>
		<pubDate>Wed, 29 Jun 2005 07:39:16 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Syndication]]></category>
		<category><![CDATA[Video]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2005/06/29/rss-and-media-cant-we-all-just-get-along/</guid>
		<description><![CDATA[I keep trying to work on an entry to close the loop on the search engine and links research but RSS news is getting in the way. Last week, it was Microsoft’s welcome endorsement and a new set of extensions and this week, it’s Apple and its announcement of a new specification to add more [...]
]]></description>
			<content:encoded><![CDATA[<p>I keep trying to work on an entry to close the loop on the search engine and links research but RSS news is getting in the way. Last week, it was <a title="TNL.net: Microsoft Loves RSS" href="http://www.tnl.net/blog/2005/06/23/microsoft-loves-rss/">Microsoft’s welcome endorsement and a new set of extensions</a> and this week, it’s Apple and its announcement of a new specification to add more data to RSS feeds used for podcasting. All this is nice but it seems that we’re seeing the beginning of a fairly new battle around RSS.</p>
<h3>Some History</h3>
<p>Before I go into details about Apple’s new offering, I want to give a little background that will explain some of my confusion. I’ve been involved in the RSS community since 1999, way back when it was just the domain of geeks.</p>
<p><a title="Yahoo! groups: Some suggestions for RSS .92 - Fri Oct 13, 2000  7:19 pm" href="http://tech.groups.yahoo.com/group/syndication/auth?check=G&#038;done=http%3A%2F%2Fgroups%2Eyahoo%2Ecom%2Fgroup%2Fsyndication%2Fmessage%2F698">Back in 2000, I made a few suggestions as to how RSS could be improved</a>. At the time, the main version of RSS was 0.91 and there was some interest in making a new version that would be called RSS 0.92 (yes, it was the alpha days of RSS). So five years ago, I was pushing for crazy concepts like adding a <code>date</code> to an item or finding ways to attach sound files and video files to RSS feeds. Because of that, some people have asked me to opine on things like podcasting. My general contention is that podcasting is a good thing and that the way support for richer files is implemented in RSS is much sounder than what I had offered in the past.</p>
<p>Subsequent battles created a fork in the RSS movement, with one of the main issues being the use of namespaces in RSS. From there came the great split, with RSS 1.0 breaking ranks with previous versions of the format, and RSS 2.0 breaking ranks with RSS 1.0. The result was two formats moving in parallel. Dave Winer did a great job promoting the 2.0 format and eventually a majority started supporting it. Since then, a third syndication format (known as Atom) has popped up and it’s making its way toward a 1.0 release. With all this, we’re seeing a lot of smart people basically trying to solve some of the same problems without really working together.</p>
<h3>A proposal</h3>
<p>Looking at this, I regret that it took us so long to get as far as we’ve gotten. However, with large players now dancing in the syndication space, I am starting to worry that things are going to get worse before they get better. As a result, I’d like to offer a modest proposal: let’s merge all this work and come up with established data sets that are compatible. Giving each vendor its own namespace is a great idea, but shouldn’t vendors first think about what they are trying to accomplish and look at prior art before trying to reinvent the wheel? Let’s look at the example of today’s announcement from Apple.</p>
<h3>RSS Does Media</h3>
<p>So podcasting is becoming much bigger. And videocasting is coming soon. How about looking at media use in RSS? Wait, what do you know: <a title="Media RSS Specification Version 1.0.0" href="http://video.search.yahoo.com/mrss">Yahoo! has done some of the work already with the Media RSS specification</a> (I know this is the second time in as many weeks that I’ve pulled out the Yahoo! name but it’s because they’ve been doing good work). The specification provides a number of interesting things, so I would suggest that Yahoo! and Apple developers sit down together and come up with an agreed-upon set of definitions. Here are a few things that I would put on the table for discussion by both entities:</p>
<ol>
<li>A common namespace: it would be nice if they both agreed to a common namespace. I’d recommend something that does not include a version number (a mistake made in the Apple spec) but it might be nice to have it defined by a DTD, which could ease validation.</li>
<li>Add <code>media:group</code> to the final specification. It looks like a very valuable element, especially for content that is encoded in more than one way (this will probably be something Apple does not want).</li>
<li>Retain <code>media:category</code> and have it replace <code>itunes:category</code>. Here, the Yahoo! version seems to provide more flexibility.</li>
<li>Replace <code>itunes:explicit</code> and <code>media:adult</code> with <code>media:explicit</code>. What is defined as adult varies from country to country whereas explicit is, well, more explicit.</li>
<li><code>media:text</code> should replace <code>itunes:subtitle</code> and <code>itunes:summary</code> but it should also get something added to differentiate the two (maybe a <code>content</code> attribute?).</li>
<li><code>itunes:author</code> could be taken care of with <code>media:credit</code>. Maybe this one could be required. The role of <code>owner</code> should be added to it and an extra <code>email</code> attribute could be added, which would cover the whole <code>itunes:owner</code> section.</li>
<li><code>itunes:images</code> and <code>media:thumbnail</code> could be merged.</li>
<li><code>itunes:block</code> is a good idea and could be created as a new <code>media:block</code> element, which would also have a <code>distributor</code> attribute. This attribute would make it possible to block specific distributors moving forward, so a creator could decide to distribute certain content only to certain channels.</li>
</ol>
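<p>To make the list above a little more concrete, here is a rough Python sketch that builds one feed item using the merged elements. None of this exists today: the namespace URI, <code>media:explicit</code>, <code>media:block</code>, the <code>content</code>, <code>email</code> and <code>distributor</code> attributes, the <code>owner</code> role and all the sample values are hypothetical illustrations of the proposal, not part of either the Yahoo! or the Apple specification.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical, version-free namespace URI for a merged specification.
MEDIA_NS = "http://example.org/media-rss"
ET.register_namespace("media", MEDIA_NS)


def media(tag):
    """Return the qualified name ElementTree expects for a media: element."""
    return f"{{{MEDIA_NS}}}{tag}"


def build_item():
    """Build a sample <item> that uses the proposed merged elements."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = "Episode 1"

    # media:group holds the same content encoded in more than one way.
    group = ET.SubElement(item, media("group"))
    ET.SubElement(group, media("content"),
                  url="http://example.org/ep1.mp3", type="audio/mpeg")
    ET.SubElement(group, media("content"),
                  url="http://example.org/ep1.m4a", type="audio/x-m4a")

    # media:category stands in for itunes:category.
    ET.SubElement(item, media("category")).text = "Technology"
    # Proposed media:explicit replaces itunes:explicit and media:adult.
    ET.SubElement(item, media("explicit")).text = "no"
    # media:text with a proposed content attribute covers subtitle and summary.
    ET.SubElement(item, media("text"), content="subtitle").text = "A short tagline"
    ET.SubElement(item, media("text"), content="summary").text = "A longer summary."
    # media:credit with the proposed owner role and email attribute.
    ET.SubElement(item, media("credit"), role="owner",
                  email="owner@example.org").text = "Jane Podcaster"
    # media:thumbnail doubles as the merged image element.
    ET.SubElement(item, media("thumbnail"), url="http://example.org/cover.jpg")
    # Proposed media:block, scoped to one made-up distributor.
    ET.SubElement(item, media("block"), distributor="example-store").text = "yes"
    return item


print(ET.tostring(build_item(), encoding="unicode"))
```

<p>The point of the sketch is that a single <code>media</code> prefix can carry everything both specifications are after, so an aggregator would only need to understand one vocabulary.</p>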
<p>If both Yahoo! and Apple were to agree to do this, they would end up with a much stronger joint specification. I believe it would also represent a show of good faith from both companies and an understanding that cooperation is good for everyone. I may be dreaming but I hope that we will see this kind of partnership happen, which is why I’d like to ask everyone to tell their friends about this entry. Together, maybe we can get Apple and Yahoo! to work together on cleaning this stuff up (along with anyone else who wants to play in that space, including Microsoft and Google). Otherwise, we will see increasing fragmentation of the market, which will result in less content for each of the specification proponents.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2005/06/29/rss-and-media-cant-we-all-just-get-along/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Technorati Yahoo and Google Too</title>
		<link>http://www.tnl.net/blog/2005/06/20/technorati-yahoo-and-google-too/</link>
		<comments>http://www.tnl.net/blog/2005/06/20/technorati-yahoo-and-google-too/#comments</comments>
		<pubDate>Mon, 20 Jun 2005 06:16:47 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2005/06/20/technorati-yahoo-and-google-too/</guid>
		<description><![CDATA[In the last entry on the subject, we took a look at how Technorati and Google compared. From there, we discovered that Technorati was getting roughly a fourth of the links Google could locate. Which brought up some interesting questions: could we rely on the Google numbers? Were they so much larger than any other [...]
]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.tnl.net/blog/2005/06/13/secrets-of-the-a-list-bloggers-technorati-vs-google/" title="TNL.net: Secrets of the A-list bloggers: Technorati vs. Google">In the last entry on the subject</a>, we took a look at how Technorati and Google compared. From there, we discovered that Technorati was getting roughly a fourth of the links Google could locate. That brought up some interesting questions: could we rely on the Google numbers? Were they so much larger than those of any other search engine that we were building an unfair comparison? And, as some alert readers pointed out in email, was Google under-reporting the number of links to a site? To answer some of those questions, I decided to build some more comparisons by taking a look at some of Google’s competitors. Today, I’ll go into how Yahoo! fared (hint: I was surprised by the results).</p>
<h3>Gathering the data</h3>
<p>As I had done for the previous effort, I gathered data against the same list of sites around the same date. This gave me some consistency in the data set, which allows for better comparison. Compare one or two sites and you may get some false positives. Compare 100 sites and things start getting a little more interesting. The Yahoo! data ended up looking like this (for people who are new to the series, I am building the same table for a number of search engines):</p>
<table border="1" summary="technorati vs. yahoo">
<tr>
<th>Technorati Top 100</th>
<th>Yahoo Links</th>
<th>Technorati Links</th>
<th>Technorati/Yahoo Links</th>
</tr>
<tr>
<td>Boing Boing</td>
<td>1880000</td>
<td>22532</td>
<td>1.19851%</td>
</tr>
<tr>
<td>InstaPundit</td>
<td>2160000</td>
<td>15190</td>
<td>0.70324%</td>
</tr>
<tr>
<td>Daily Kos</td>
<td>1690000</td>
<td>15833</td>
<td>0.93686%</td>
</tr>
<tr>
<td>Gizmodo</td>
<td>1970000</td>
<td>12278</td>
<td>0.62325%</td>
</tr>
<tr>
<td>Fark</td>
<td>1420000</td>
<td>10216</td>
<td>0.71944%</td>
</tr>
<tr>
<td>EnGadget</td>
<td>2820000</td>
<td>15051</td>
<td>0.53372%</td>
</tr>
<tr>
<td>Davenetics</td>
<td>66400</td>
<td>7571</td>
<td>11.40211%</td>
</tr>
<tr>
<td>Eschaton</td>
<td>1400000</td>
<td>8713</td>
<td>0.62236%</td>
</tr>
<tr>
<td>Dooce</td>
<td>653000</td>
<td>6797</td>
<td>1.04089%</td>
</tr>
<tr>
<td>Andrew Sullivan</td>
<td>1260000</td>
<td>7680</td>
<td>0.60952%</td>
</tr>
<tr>
<td>The Best Page In The Universe</td>
<td>62000</td>
<td>6333</td>
<td>10.21452%</td>
</tr>
<tr>
<td>Talking Points Memo: by Joshua Micah Marshall</td>
<td>563000</td>
<td>7592</td>
<td>1.34849%</td>
</tr>
<tr>
<td>lgf: anti-idiotarian</td>
<td>49300</td>
<td>8275</td>
<td>16.78499%</td>
</tr>
<tr>
<td>kottke.org</td>
<td>1200000</td>
<td>7278</td>
<td>0.60650%</td>
</tr>
<tr>
<td>WIL WHEATON DOT NET</td>
<td>564000</td>
<td>6314</td>
<td>1.11950%</td>
</tr>
<tr>
<td>Metafilter</td>
<td>1160000</td>
<td>7591</td>
<td>0.65440%</td>
</tr>
<tr>
<td>Doc Searls</td>
<td>1150000</td>
<td>5690</td>
<td>0.49478%</td>
</tr>
<tr>
<td>(In)formacao e (In)utilidade</td>
<td>110000</td>
<td>6040</td>
<td>5.49091%</td>
</tr>
<tr>
<td>Wonkette</td>
<td>1370000</td>
<td>5877</td>
<td>0.42898%</td>
</tr>
<tr>
<td>Scripting News</td>
<td>1470000</td>
<td>5728</td>
<td>0.38966%</td>
</tr>
<tr>
<td>Power Line</td>
<td>344000</td>
<td>7477</td>
<td>2.17355%</td>
</tr>
<tr>
<td>Balmasque</td>
<td>40500</td>
<td>4544</td>
<td>11.21975%</td>
</tr>
<tr>
<td>Corante</td>
<td>265000</td>
<td>7686</td>
<td>2.90038%</td>
</tr>
<tr>
<td>A list Apart</td>
<td>620000</td>
<td>5536</td>
<td>0.89290%</td>
</tr>
<tr>
<td>Something Awful</td>
<td>372000</td>
<td>4512</td>
<td>1.21290%</td>
</tr>
<tr>
<td>Megatokyo</td>
<td>361000</td>
<td>4154</td>
<td>1.15069%</td>
</tr>
<tr>
<td>Michelle Malkin</td>
<td>537000</td>
<td>6091</td>
<td>1.13426%</td>
</tr>
<tr>
<td>Arts and Letters Daily</td>
<td>866000</td>
<td>3983</td>
<td>0.45993%</td>
</tr>
<tr>
<td>Gawker</td>
<td>1060000</td>
<td>4453</td>
<td>0.42009%</td>
</tr>
<tr>
<td>Afterall it was the best I ever had</td>
<td>34900</td>
<td>3591</td>
<td>10.28940%</td>
</tr>
<tr>
<td>The Volokh Conspiracy</td>
<td>1190000</td>
<td>5873</td>
<td>0.49353%</td>
</tr>
<tr>
<td>Scobleizer</td>
<td>937000</td>
<td>5524</td>
<td>0.58954%</td>
</tr>
<tr>
<td>Jeffrey Zeldman</td>
<td>528000</td>
<td>4134</td>
<td>0.78295%</td>
</tr>
<tr>
<td>This Modern World</td>
<td>813000</td>
<td>3913</td>
<td>0.48130%</td>
</tr>
<tr>
<td>The Web Standards Project</td>
<td>59800</td>
<td>3810</td>
<td>6.37124%</td>
</tr>
<tr>
<td>Joel on Software</td>
<td>966000</td>
<td>4514</td>
<td>0.46729%</td>
</tr>
<tr>
<td>Media Matters for America</td>
<td>536000</td>
<td>6809</td>
<td>1.27034%</td>
</tr>
<tr>
<td>Television without pity</td>
<td>356000</td>
<td>3859</td>
<td>1.08399%</td>
</tr>
<tr>
<td>Kuro5hin</td>
<td>866000</td>
<td>4208</td>
<td>0.48591%</td>
</tr>
<tr>
<td>Lileks</td>
<td>39700</td>
<td>3824</td>
<td>9.63224%</td>
</tr>
<tr>
<td>Hugh Hewitt</td>
<td>929000</td>
<td>4573</td>
<td>0.49225%</td>
</tr>
<tr>
<td>Joel Veitch</td>
<td>135000</td>
<td>3774</td>
<td>2.79556%</td>
</tr>
<tr>
<td>Truthout</td>
<td>371000</td>
<td>6528</td>
<td>1.75957%</td>
</tr>
<tr>
<td>Baghdad Burning</td>
<td>552000</td>
<td>3519</td>
<td>0.63750%</td>
</tr>
<tr>
<td>Buzz machine</td>
<td>1010000</td>
<td>4145</td>
<td>0.41040%</td>
</tr>
<tr>
<td>fleugel</td>
<td>201000</td>
<td>3670</td>
<td>1.82587%</td>
</tr>
<tr>
<td>Informed Comment</td>
<td>787000</td>
<td>3905</td>
<td>0.49619%</td>
</tr>
<tr>
<td>Doppler: redefining podcasting</td>
<td>607000</td>
<td>3040</td>
<td>0.50082%</td>
</tr>
<tr>
<td>geek and proud</td>
<td>9110</td>
<td>3166</td>
<td>34.75302%</td>
</tr>
<tr>
<td>loadmemory (Asian site)</td>
<td>1550</td>
<td>3324</td>
<td>214.45161%</td>
</tr>
<tr>
<td>Photojunkie</td>
<td>51200</td>
<td>2860</td>
<td>5.58594%</td>
</tr>
<tr>
<td>Ross Rader</td>
<td>48200</td>
<td>2976</td>
<td>6.17427%</td>
</tr>
<tr>
<td>The Truth Laid Bear</td>
<td>717000</td>
<td>4127</td>
<td>0.57559%</td>
</tr>
<tr>
<td>Joi Ito</td>
<td>1050000</td>
<td>5165</td>
<td>0.49190%</td>
</tr>
<tr>
<td>ScrappleFace</td>
<td>807000</td>
<td>3480</td>
<td>0.43123%</td>
</tr>
<tr>
<td>LexText</td>
<td>31200</td>
<td>2671</td>
<td>8.56090%</td>
</tr>
<tr>
<td>Google Blog</td>
<td>297000</td>
<td>3688</td>
<td>1.24175%</td>
</tr>
<tr>
<td>Xbox</td>
<td>237000</td>
<td>4221</td>
<td>1.78101%</td>
</tr>
<tr>
<td>My life in a Bush of Ghosts</td>
<td>903</td>
<td>2519</td>
<td>278.95903%</td>
</tr>
<tr>
<td>Astronomy picture of the day</td>
<td>113000</td>
<td>3498</td>
<td>3.09558%</td>
</tr>
<tr>
<td>Crooked Timber</td>
<td>67500</td>
<td>3617</td>
<td>5.35852%</td>
</tr>
<tr>
<td>Vodka Pundit</td>
<td>169000</td>
<td>3085</td>
<td>1.82544%</td>
</tr>
<tr>
<td>Captain’s quarter</td>
<td>730000</td>
<td>3671</td>
<td>0.50288%</td>
</tr>
<tr>
<td>A small victory</td>
<td>460000</td>
<td>3223</td>
<td>0.70065%</td>
</tr>
<tr>
<td>Gato Fedorento</td>
<td>126000</td>
<td>2574</td>
<td>2.04286%</td>
</tr>
<tr>
<td>Mezzoblue</td>
<td>278000</td>
<td>2952</td>
<td>1.06187%</td>
</tr>
<tr>
<td>PostSecret</td>
<td>202000</td>
<td>2707</td>
<td>1.34010%</td>
</tr>
<tr>
<td>Samizdata.net</td>
<td>18000</td>
<td>2872</td>
<td>15.95556%</td>
</tr>
<tr>
<td>Lawrence Lessig</td>
<td>959000</td>
<td>2949</td>
<td>0.30751%</td>
</tr>
<tr>
<td>Counterpunch</td>
<td>295000</td>
<td>3278</td>
<td>1.11119%</td>
</tr>
<tr>
<td>Democratic Underground</td>
<td>417000</td>
<td>3913</td>
<td>0.93837%</td>
</tr>
<tr>
<td>Right Wing News</td>
<td>794000</td>
<td>2967</td>
<td>0.37368%</td>
</tr>
<tr>
<td>StopDesign</td>
<td>255000</td>
<td>3037</td>
<td>1.19098%</td>
</tr>
<tr>
<td>iBiblio</td>
<td>197000</td>
<td>3105</td>
<td>1.57614%</td>
</tr>
<tr>
<td>Samizdata.net (mistake?)</td>
<td>697000</td>
<td>2743</td>
<td>0.39354%</td>
</tr>
<tr>
<td>Abrupto</td>
<td>44700</td>
<td>2935</td>
<td>6.56600%</td>
</tr>
<tr>
<td>gene7299 (Asian MSNSpaces site)</td>
<td>764</td>
<td>3215</td>
<td>420.81152%</td>
</tr>
<tr>
<td>Where is Raed?</td>
<td>232000</td>
<td>2409</td>
<td>1.03836%</td>
</tr>
<tr>
<td>B3TA: We love the web</td>
<td>839000</td>
<td>2614</td>
<td>0.31156%</td>
</tr>
<tr>
<td>Talkleft</td>
<td>221000</td>
<td>2901</td>
<td>1.31267%</td>
</tr>
<tr>
<td>Wizbang</td>
<td>634000</td>
<td>3358</td>
<td>0.52965%</td>
</tr>
<tr>
<td>m1net (MSN spaces site)</td>
<td>579</td>
<td>3548</td>
<td>612.78066%</td>
</tr>
<tr>
<td>Hoder</td>
<td>20900</td>
<td>5422</td>
<td>25.94258%</td>
</tr>
<tr>
<td>CTRL+Alt+Del</td>
<td>171000</td>
<td>2315</td>
<td>1.35380%</td>
</tr>
<tr>
<td>Brad DeLong</td>
<td>882000</td>
<td>2715</td>
<td>0.30782%</td>
</tr>
<tr>
<td>Blogs for Bush</td>
<td>824000</td>
<td>3560</td>
<td>0.43204%</td>
</tr>
<tr>
<td>Neil Gaiman</td>
<td>319000</td>
<td>2194</td>
<td>0.68777%</td>
</tr>
<tr>
<td>Gothamist</td>
<td>491000</td>
<td>2729</td>
<td>0.55580%</td>
</tr>
<tr>
<td>Thought Mechanics</td>
<td>190000</td>
<td>2197</td>
<td>1.15632%</td>
</tr>
<tr>
<td>IMAO</td>
<td>407000</td>
<td>2905</td>
<td>0.71376%</td>
</tr>
<tr>
<td>Dan Gillmor (old weblog)</td>
<td>298000</td>
<td>2600</td>
<td>0.87248%</td>
</tr>
<tr>
<td>HINAGATA</td>
<td>21100</td>
<td>2186</td>
<td>10.36019%</td>
</tr>
<tr>
<td>Dean’s World</td>
<td>784000</td>
<td>2985</td>
<td>0.38074%</td>
</tr>
<tr>
<td>Defamer</td>
<td>725000</td>
<td>2372</td>
<td>0.32717%</td>
</tr>
<tr>
<td>USS Clueless</td>
<td>264000</td>
<td>2570</td>
<td>0.97348%</td>
</tr>
<tr>
<td>Dive into Mark</td>
<td>235000</td>
<td>2540</td>
<td>1.08085%</td>
</tr>
<tr>
<td>Pandagon</td>
<td>743000</td>
<td>2822</td>
<td>0.37981%</td>
</tr>
<tr>
<td>Blogging.la</td>
<td>67700</td>
<td>3061</td>
<td>4.52142%</td>
</tr>
<tr>
<td>Why are you worshipping the ground I blog on?</td>
<td>85000</td>
<td>2238</td>
<td>2.63294%</td>
</tr>
<tr>
<td>Daring Fireball</td>
<td>221000</td>
<td>2573</td>
<td>1.16425%</td>
</tr>
</table>
<p>The first thing of interest when putting together that set of numbers was how many more links turned up in the Yahoo! index than in either Technorati or Google. The second was that Asian sites consistently fared worse in the Yahoo! index than in the Technorati one. It seems that Technorati is getting a better handle on the Asian blogosphere than Yahoo! is, a surprising result considering how much time and effort the latter has put into its Asian operations.</p>
<p>In order to get some real visual comparison, I decided to draw a similar diagram of the link percentages distributed across all 100 sites. It looked like this:</p>
<p><img src="http://www.tnl.net/assets/images/blog/secrets/TY1.gif" alt="link distribution" /></p>
<p>The interesting story, looking at this, is that there appeared to be much greater variance from site to site in the Google index than there was in the Yahoo! one. In the Yahoo! system, the vast majority of sites fall below the one percent mark. Even more interesting, the variance was really not that high: the gap between the median and the average turned out to be less than 0.1 percentage points:</p>
<table border="1" summary="averages">
<tr>
<th>Technorati Top 100</th>
<th>Yahoo Links</th>
<th>Technorati Links</th>
<th>Technorati/Yahoo Links</th>
</tr>
<tr>
<td>Total</td>
<td>56150006</td>
<td>479580</td>
<td>0.85410%</td>
</tr>
<tr>
<td>Median</td>
<td>389500</td>
<td>3679.5</td>
<td>0.94467%</td>
</tr>
</table>
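<p>As a minimal sketch of the arithmetic behind that table (the function and variable names are my own, chosen for illustration), the two percentages and the gap between them can be recomputed from the four figures above:</p>

```python
# Figures copied from the summary table above.
yahoo_total = 56150006
technorati_total = 479580
yahoo_median = 389500
technorati_median = 3679.5


def ratio_pct(technorati_links, yahoo_links):
    """Technorati links expressed as a percentage of Yahoo! links."""
    return 100.0 * technorati_links / yahoo_links


overall = ratio_pct(technorati_total, yahoo_total)      # ratio of the totals
at_median = ratio_pct(technorati_median, yahoo_median)  # ratio of the medians
gap = at_median - overall                               # in percentage points

print(f"overall ratio: {overall:.3f}%")
print(f"median ratio:  {at_median:.3f}%")
print(f"gap:           {gap:.3f} percentage points")
```

<p>Note that this compares the ratio of the totals with the ratio of the medians, which is what the table reports; it is not an average of the per-site percentages.</p>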
<p>While the numbers were vastly different in terms of size (Yahoo! appeared to have a lot more links), I figured the patterns would be roughly the same in terms of coverage: I expected the top sites to get better coverage than smaller sites in a large search engine like Yahoo!. Imagine my surprise, then, when I started to do some group analysis:</p>
<table border="1" summary="by groups">
<tr>
<th>Technorati Top 100</th>
<th>Yahoo Links</th>
<th>Technorati Links</th>
<th>Technorati/Yahoo Links</th>
</tr>
<tr>
<td>AVERAGE TOP 10</td>
<td>1531940</td>
<td>12186.1</td>
<td>0.79547%</td>
</tr>
<tr>
<td>AVERAGE TOP 25</td>
<td>986368</td>
<td>8733.36</td>
<td>0.88541%</td>
</tr>
<tr>
<td>AVERAGE TOP 50</td>
<td>768245.2</td>
<td>6534.36</td>
<td>0.85056%</td>
</tr>
<tr>
<td>AVERAGE BOTTOM 50</td>
<td>354754.92</td>
<td>3057.24</td>
<td>0.86179%</td>
</tr>
<tr>
<td>AVERAGE BOTTOM 25</td>
<td>362220.8846</td>
<td>2834.884615</td>
<td>0.78264%</td>
</tr>
<tr>
<td>AVERAGE BOTTOM 10</td>
<td>350072.7273</td>
<td>2622.909091</td>
<td>0.74925%</td>
</tr>
</table>
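<p>For readers who want to check the group figures, here is a rough sketch of how the "AVERAGE TOP 10" row can be reproduced. The ten rows are copied from the per-site tables in this series; the helper name <code>group_summary</code> is hypothetical, and the other groups would follow the same pattern with a different slice of the list:</p>

```python
from statistics import mean

# (name, yahoo_links, technorati_links) for the ten highest-ranked blogs,
# copied from the per-site tables in this series.
top_10 = [
    ("Boing Boing", 1880000, 22532),
    ("InstaPundit", 2160000, 15190),
    ("Daily Kos", 1690000, 15833),
    ("Gizmodo", 1970000, 12278),
    ("Fark", 1420000, 10216),
    ("EnGadget", 2820000, 15051),
    ("Davenetics", 66400, 7571),
    ("Eschaton", 1400000, 8713),
    ("Dooce", 653000, 6797),
    ("Andrew Sullivan", 1260000, 7680),
]


def group_summary(rows):
    """Average Yahoo! links, average Technorati links, and their ratio in percent."""
    avg_yahoo = mean(yahoo for _, yahoo, _ in rows)
    avg_technorati = mean(technorati for _, _, technorati in rows)
    return avg_yahoo, avg_technorati, 100.0 * avg_technorati / avg_yahoo


avg_yahoo, avg_technorati, ratio = group_summary(top_10)
print(f"Yahoo! average:     {avg_yahoo}")
print(f"Technorati average: {avg_technorati}")
print(f"ratio:              {ratio:.3f}%")
```

<p>The ratio here is the ratio of the two group averages, not the average of the ten individual percentages, which is why a single outlier such as Davenetics does not dominate it.</p>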
<p>Those numbers seemed to be all over the map, a fact that became much clearer once I graphed them:</p>
<p><img src="http://www.tnl.net/assets/images/blog/secrets/TY2.gif" alt="don't you like a nice graph" /></p>
<p>None of the nice downward curve I had with the Google set. Here was a much more disparate set, providing little support for a theory of bias on the part of a search engine. If anything, it did more to suggest that such a theory is wrong.</p>
<p>Was my data set wrong? I rechecked it and it was not. So what was happening here? As dreams of long tail and power law distributions fell apart, I started to wonder how Yahoo! and Google compared. So, of course, I decided to run the numbers again…</p>
<h3>Yahoo! vs. Google</h3>
<p>This time I decided to compare Google and Yahoo! directly. First, I figured I would look for some reference data on the subject. I was surprised not to find any actual side-by-side comparison across a large set of sites. Anecdotal evidence existed, but nothing comparable to the data set I had amassed, so I figured I would trust my own (note: if you have a better one, please leave a comment saying where it is located). The set ended up looking like this:</p>
<table border="1" summary="position">
<tr>
<th>Name</th>
<th>Position 5/19/05</th>
<th>Google</th>
<th>Yahoo</th>
<th>Google/Yahoo Links</th>
</tr>
<tr>
<td>Boing Boing</td>
<td>1</td>
<td>45200</td>
<td>1880000</td>
<td>2.40%</td>
</tr>
<tr>
<td>InstaPundit</td>
<td>2</td>
<td>75000</td>
<td>2160000</td>
<td>3.47%</td>
</tr>
<tr>
<td>Daily Kos</td>
<td>3</td>
<td>59800</td>
<td>1690000</td>
<td>3.54%</td>
</tr>
<tr>
<td>Gizmodo</td>
<td>4</td>
<td>39300</td>
<td>1970000</td>
<td>1.99%</td>
</tr>
<tr>
<td>Fark</td>
<td>5</td>
<td>43600</td>
<td>1420000</td>
<td>3.07%</td>
</tr>
<tr>
<td>EnGadget</td>
<td>6</td>
<td>46800</td>
<td>2820000</td>
<td>1.66%</td>
</tr>
<tr>
<td>Davenetics</td>
<td>7</td>
<td>1780</td>
<td>66400</td>
<td>2.68%</td>
</tr>
<tr>
<td>Eschaton</td>
<td>8</td>
<td>62400</td>
<td>1400000</td>
<td>4.46%</td>
</tr>
<tr>
<td>Dooce</td>
<td>9</td>
<td>23600</td>
<td>653000</td>
<td>3.61%</td>
</tr>
<tr>
<td>Andrew Sullivan</td>
<td>10</td>
<td>41100</td>
<td>1260000</td>
<td>3.26%</td>
</tr>
<tr>
<td>The Best Page In The Universe</td>
<td>11</td>
<td>656</td>
<td>62000</td>
<td>1.06%</td>
</tr>
<tr>
<td>Talking Points Memo: by Joshua Micah Marshall</td>
<td>12</td>
<td>74600</td>
<td>563000</td>
<td>13.25%</td>
</tr>
<tr>
<td>lgf: anti-idiotarian</td>
<td>13</td>
<td>14700</td>
<td>49300</td>
<td>29.82%</td>
</tr>
<tr>
<td>kottke.org</td>
<td>14</td>
<td>32000</td>
<td>1200000</td>
<td>2.67%</td>
</tr>
<tr>
<td>WIL WHEATON DOT NET</td>
<td>15</td>
<td>16900</td>
<td>564000</td>
<td>3.00%</td>
</tr>
<tr>
<td>Metafilter</td>
<td>16</td>
<td>34500</td>
<td>1160000</td>
<td>2.97%</td>
</tr>
<tr>
<td>Doc Searls</td>
<td>17</td>
<td>33600</td>
<td>1150000</td>
<td>2.92%</td>
</tr>
<tr>
<td>(In)formacao e (In)utilidade</td>
<td>18</td>
<td>1780</td>
<td>110000</td>
<td>1.62%</td>
</tr>
<tr>
<td>Wonkette</td>
<td>19</td>
<td>28800</td>
<td>1370000</td>
<td>2.10%</td>
</tr>
<tr>
<td>Scripting News</td>
<td>20</td>
<td>39400</td>
<td>1470000</td>
<td>2.68%</td>
</tr>
<tr>
<td>Power Line</td>
<td>21</td>
<td>7510</td>
<td>344000</td>
<td>2.18%</td>
</tr>
<tr>
<td>Balmasque</td>
<td>22</td>
<td>24</td>
<td>40500</td>
<td>0.06%</td>
</tr>
<tr>
<td>Corante</td>
<td>23</td>
<td>6770</td>
<td>265000</td>
<td>2.55%</td>
</tr>
<tr>
<td>A list Apart</td>
<td>24</td>
<td>21100</td>
<td>620000</td>
<td>3.40%</td>
</tr>
<tr>
<td>Something Awful</td>
<td>25</td>
<td>9020</td>
<td>372000</td>
<td>2.42%</td>
</tr>
<tr>
<td>Megatokyo</td>
<td>26</td>
<td>7310</td>
<td>361000</td>
<td>2.02%</td>
</tr>
<tr>
<td>Michelle Malkin</td>
<td>27</td>
<td>17300</td>
<td>537000</td>
<td>3.22%</td>
</tr>
<tr>
<td>Arts and Letters Daily</td>
<td>28</td>
<td>23900</td>
<td>866000</td>
<td>2.76%</td>
</tr>
<tr>
<td>Gawker</td>
<td>29</td>
<td>23500</td>
<td>1060000</td>
<td>2.22%</td>
</tr>
<tr>
<td>Afterall it was the best I ever had</td>
<td>30</td>
<td>95</td>
<td>34900</td>
<td>0.27%</td>
</tr>
<tr>
<td>The Volokh Conspiracy</td>
<td>31</td>
<td>42000</td>
<td>1190000</td>
<td>3.53%</td>
</tr>
<tr>
<td>Scobleizer</td>
<td>32</td>
<td>21800</td>
<td>937000</td>
<td>2.33%</td>
</tr>
<tr>
<td>Jeffrey Zeldman</td>
<td>33</td>
<td>22500</td>
<td>528000</td>
<td>4.26%</td>
</tr>
<tr>
<td>This Modern World</td>
<td>34</td>
<td>32100</td>
<td>813000</td>
<td>3.95%</td>
</tr>
<tr>
<td>The Web Standards Project</td>
<td>35</td>
<td>1850</td>
<td>59800</td>
<td>3.09%</td>
</tr>
<tr>
<td>Joel on Software</td>
<td>36</td>
<td>22400</td>
<td>966000</td>
<td>2.32%</td>
</tr>
<tr>
<td>Media Matters for America</td>
<td>37</td>
<td>24800</td>
<td>536000</td>
<td>4.63%</td>
</tr>
<tr>
<td>Television without pity</td>
<td>38</td>
<td>13300</td>
<td>356000</td>
<td>3.74%</td>
</tr>
<tr>
<td>Kuro5hin</td>
<td>39</td>
<td>17300</td>
<td>866000</td>
<td>2.00%</td>
</tr>
<tr>
<td>Lileks</td>
<td>40</td>
<td>&nbsp;</td>
<td>39700</td>
<td>0.00%</td>
</tr>
<tr>
<td>Hugh Hewitt</td>
<td>41</td>
<td>26700</td>
<td>929000</td>
<td>2.87%</td>
</tr>
<tr>
<td>Joel Veitch</td>
<td>42</td>
<td>2830</td>
<td>135000</td>
<td>2.10%</td>
</tr>
<tr>
<td>Truthout</td>
<td>43</td>
<td>8780</td>
<td>371000</td>
<td>2.37%</td>
</tr>
<tr>
<td>Baghdad Burning</td>
<td>44</td>
<td>22700</td>
<td>552000</td>
<td>4.11%</td>
</tr>
<tr>
<td>Buzz machine</td>
<td>45</td>
<td>30600</td>
<td>1010000</td>
<td>3.03%</td>
</tr>
<tr>
<td>fleugel</td>
<td>46</td>
<td>1890</td>
<td>201000</td>
<td>0.94%</td>
</tr>
<tr>
<td>Informed Comment</td>
<td>47</td>
<td>27900</td>
<td>787000</td>
<td>3.55%</td>
</tr>
<tr>
<td>Doppler: redefining podcasting</td>
<td>48</td>
<td>4420</td>
<td>607000</td>
<td>0.73%</td>
</tr>
<tr>
<td>geek and proud</td>
<td>49</td>
<td>355</td>
<td>9110</td>
<td>3.90%</td>
</tr>
<tr>
<td>loadmemory (Asian site)</td>
<td>50</td>
<td>83</td>
<td>1550</td>
<td>5.35%</td>
</tr>
<tr>
<td>Photojunkie</td>
<td>51</td>
<td>1540</td>
<td>51200</td>
<td>3.01%</td>
</tr>
<tr>
<td>Ross Rader</td>
<td>52</td>
<td>1070</td>
<td>48200</td>
<td>2.22%</td>
</tr>
<tr>
<td>The Truth Laid Bear</td>
<td>53</td>
<td>23900</td>
<td>717000</td>
<td>3.33%</td>
</tr>
<tr>
<td>Joi Ito</td>
<td>54</td>
<td>23400</td>
<td>1050000</td>
<td>2.23%</td>
</tr>
<tr>
<td>ScrappleFace</td>
<td>55</td>
<td>31100</td>
<td>807000</td>
<td>3.85%</td>
</tr>
<tr>
<td>LexText</td>
<td>56</td>
<td>1970</td>
<td>31200</td>
<td>6.31%</td>
</tr>
<tr>
<td>Google Blog</td>
<td>57</td>
<td>46</td>
<td>297000</td>
<td>0.02%</td>
</tr>
<tr>
<td>Xbox</td>
<td>58</td>
<td>6600</td>
<td>237000</td>
<td>2.78%</td>
</tr>
<tr>
<td>My life in a Bush of Ghosts</td>
<td>59</td>
<td>6</td>
<td>903</td>
<td>0.66%</td>
</tr>
<tr>
<td>Astronomy picture of the day</td>
<td>60</td>
<td>5020</td>
<td>113000</td>
<td>4.44%</td>
</tr>
<tr>
<td>Crooked Timber</td>
<td>61</td>
<td>3560</td>
<td>67500</td>
<td>5.27%</td>
</tr>
<tr>
<td>Vodka Pundit</td>
<td>62</td>
<td>4520</td>
<td>169000</td>
<td>2.67%</td>
</tr>
<tr>
<td>Captain’s quarter</td>
<td>63</td>
<td>27100</td>
<td>730000</td>
<td>3.71%</td>
</tr>
<tr>
<td>A small victory</td>
<td>64</td>
<td>16700</td>
<td>460000</td>
<td>3.63%</td>
</tr>
<tr>
<td>Gato Fedorento</td>
<td>65</td>
<td>1630</td>
<td>126000</td>
<td>1.29%</td>
</tr>
<tr>
<td>Mezzoblue</td>
<td>66</td>
<td>12000</td>
<td>278000</td>
<td>4.32%</td>
</tr>
<tr>
<td>PostSecret</td>
<td>67</td>
<td>5790</td>
<td>202000</td>
<td>2.87%</td>
</tr>
<tr>
<td>Samizdata.net</td>
<td>68</td>
<td>1050</td>
<td>18000</td>
<td>5.83%</td>
</tr>
<tr>
<td>Lawrence Lessig</td>
<td>69</td>
<td>30600</td>
<td>959000</td>
<td>3.19%</td>
</tr>
<tr>
<td>Counterpunch</td>
<td>70</td>
<td>11700</td>
<td>295000</td>
<td>3.97%</td>
</tr>
<tr>
<td>Democratic Underground</td>
<td>71</td>
<td>14900</td>
<td>417000</td>
<td>3.57%</td>
</tr>
<tr>
<td>Right Wing News</td>
<td>72</td>
<td>27900</td>
<td>794000</td>
<td>3.51%</td>
</tr>
<tr>
<td>StopDesign</td>
<td>73</td>
<td>10200</td>
<td>255000</td>
<td>4.00%</td>
</tr>
<tr>
<td>iBiblio</td>
<td>74</td>
<td>9730</td>
<td>197000</td>
<td>4.94%</td>
</tr>
<tr>
<td>Samizdata.net (mistake?)</td>
<td>75</td>
<td>25500</td>
<td>697000</td>
<td>3.66%</td>
</tr>
<tr>
<td>Abrupto</td>
<td>76</td>
<td>550</td>
<td>44700</td>
<td>1.23%</td>
</tr>
<tr>
<td>gene7299 (Asian MSNSpaces site)</td>
<td>77</td>
<td>58</td>
<td>764</td>
<td>7.59%</td>
</tr>
<tr>
<td>Where is Raed?</td>
<td>78</td>
<td>10100</td>
<td>232000</td>
<td>4.35%</td>
</tr>
<tr>
<td>B3TA: We love the web</td>
<td>79</td>
<td>12000</td>
<td>839000</td>
<td>1.43%</td>
</tr>
<tr>
<td>Talkleft</td>
<td>80</td>
<td>7170</td>
<td>221000</td>
<td>3.24%</td>
</tr>
<tr>
<td>Wizbang</td>
<td>81</td>
<td>21000</td>
<td>634000</td>
<td>3.31%</td>
</tr>
<tr>
<td>m1net (MSN spaces site)</td>
<td>82</td>
<td>104</td>
<td>579</td>
<td>17.96%</td>
</tr>
<tr>
<td>Hoder</td>
<td>83</td>
<td>1480</td>
<td>20900</td>
<td>7.08%</td>
</tr>
<tr>
<td>CTRL+Alt+Del</td>
<td>84</td>
<td>2310</td>
<td>171000</td>
<td>1.35%</td>
</tr>
<tr>
<td>Brad DeLong</td>
<td>85</td>
<td>30100</td>
<td>882000</td>
<td>3.41%</td>
</tr>
<tr>
<td>Blogs for Bush</td>
<td>86</td>
<td>16200</td>
<td>824000</td>
<td>1.97%</td>
</tr>
<tr>
<td>Neil Gaiman</td>
<td>87</td>
<td>13700</td>
<td>319000</td>
<td>4.29%</td>
</tr>
<tr>
<td>Gothamist</td>
<td>88</td>
<td>15200</td>
<td>491000</td>
<td>3.10%</td>
</tr>
<tr>
<td>Thought Mechanics</td>
<td>89</td>
<td>4400</td>
<td>190000</td>
<td>2.32%</td>
</tr>
<tr>
<td>IMAO</td>
<td>90</td>
<td>23800</td>
<td>407000</td>
<td>5.85%</td>
</tr>
<tr>
<td>Dan Gillmor (old weblog)</td>
<td>91</td>
<td>10800</td>
<td>298000</td>
<td>3.62%</td>
</tr>
<tr>
<td>HINAGATA</td>
<td>92</td>
<td>10100</td>
<td>21100</td>
<td>47.87%</td>
</tr>
<tr>
<td>Dean’s World</td>
<td>93</td>
<td>30600</td>
<td>784000</td>
<td>3.90%</td>
</tr>
<tr>
<td>Defamer</td>
<td>94</td>
<td>9310</td>
<td>725000</td>
<td>1.28%</td>
</tr>
<tr>
<td>USS Clueless</td>
<td>95</td>
<td>8470</td>
<td>264000</td>
<td>3.21%</td>
</tr>
<tr>
<td>Dive into Mark</td>
<td>96</td>
<td>14600</td>
<td>235000</td>
<td>6.21%</td>
</tr>
<tr>
<td>Pandagon</td>
<td>97</td>
<td>27300</td>
<td>743000</td>
<td>3.67%</td>
</tr>
<tr>
<td>Blogging.la</td>
<td>98</td>
<td>3200</td>
<td>67700</td>
<td>4.73%</td>
</tr>
<tr>
<td>Why are you worshipping the ground I blog on?</td>
<td>99</td>
<td>1430</td>
<td>85000</td>
<td>1.68%</td>
</tr>
<tr>
<td>Daring Fireball</td>
<td>100</td>
<td>12000</td>
<td>221000</td>
<td>5.43%</td>
</tr>
</table>
<p>Nothing particularly impressive there. It seemed that Google, on average, ended up with only about 3% of the links Yahoo! had in its index. However, the story got more interesting when looking at the gap between the average and the median, as the two diverged by almost half a percentage point:</p>
<table border="1" summary="divergence">
<tr>
<th>Technorati Top 100</th>
<th>Google</th>
<th>Yahoo</th>
<th>Google/Yahoo Links</th>
</tr>
<tr>
<td>Total</td>
<td>1739867</td>
<td>56150006</td>
<td>3.10%</td>
</tr>
<tr>
<td>Median</td>
<td>13700</td>
<td>389500</td>
<td>3.52%</td>
</tr>
</table>
<p>But wait, for the weirdness is only getting started. Next up was looking at the distributions (as I’ve done for Technorati vs. each of the engines):</p>
<table border="1" summary="distributions">
<tr>
<th>Technorati Top 100</th>
<th>Google</th>
<th>Yahoo</th>
<th>Google/Yahoo Links</th>
</tr>
<tr>
<td>AVERAGE TOP 10</td>
<td>43858</td>
<td>1531940</td>
<td>2.86%</td>
</tr>
<tr>
<td>AVERAGE TOP 25</td>
<td>30397.6</td>
<td>986368</td>
<td>3.08%</td>
</tr>
<tr>
<td>AVERAGE TOP 50</td>
<td>23599.04082</td>
<td>768245.2</td>
<td>3.07%</td>
</tr>
<tr>
<td>AVERAGE BOTTOM 50</td>
<td>11443.07843</td>
<td>354754.92</td>
<td>3.23%</td>
</tr>
<tr>
<td>AVERAGE BOTTOM 25</td>
<td>11980.07692</td>
<td>362220.8846</td>
<td>3.31%</td>
</tr>
<tr>
<td>AVERAGE BOTTOM 10</td>
<td>13782.72727</td>
<td>350072.7273</td>
<td>3.94%</td>
</tr>
</table>
<p>I looked at the numbers and they did not seem right, so I ran them again and ended up with the same results. I ran them a third time and still couldn’t make sense of them. So I graphed them:</p>
<p><img src="http://www.tnl.net/assets/images/blog/secrets/GY2.gif" alt="Google vs. Yahoo round 2" /></p>
<p>… and to my surprise, it appeared that the further down the list one went, the greater the differential. In fact, sites that are at the bottom of the top 100 are one full percentage point more likely to get indexed in Yahoo! than in Google.</p>
<h3>Conclusions</h3>
<p>From here, we can draw a few conclusions:</p>
<ul>
<li>Yahoo! generally does a better job at indexing the blogosphere than Google does. We know they have been working hard to improve their index and here’s proof that they are getting results.</li>
<li>Even if Google is the one with the motto about not doing evil, Yahoo! seems to be the one interested in giving equal opportunity to the little guy: smaller blogs seem to have a better chance of being recognized by Yahoo! than they do of being recognized by Google.</li>
<li>While the front page of Google advertises that they are currently indexing over 8 billion pages, it is very difficult to find ways to support that claim via the link feature they offer: this can be seen as confirmation that Google does not tell you about all the links it has in its index.</li>
<li>Sure, volume counts, but in the case of search indexes, it may count against sites: if a blog is less likely to appear in Google than it is to appear in Yahoo! and the Google index is much larger than the Yahoo! one, then, if Yahoo! and Google had the same amount of traffic, a single blog could find itself receiving more traffic from Yahoo! than it does from Google. This would be because each individual page carries more weight in Yahoo! than it does in Google.</li>
<li>The top 100 blogs have over 56 million links in the Yahoo! index. That’s a lot of links, and it clearly shows that links are the currency of the blogging world. It would be interesting to get data that would help analyze how much interlinking exists across those sites.</li>
</ul>
<p>Up next, we’ll take a look at how MSN figures in all of this. So stay tuned!</p>
<p><i><a href="http://tnl.net/who" rel="author" title="Who is Tristan Louis?">Tristan Louis</a> is the founder and CEO of <a href="http://www.keepskor.com" title="Keepskor">Keepskor</a> and writes the influential <a href="http://www.tnl.net/" title="tnl.net">tnl.net</a> weblog, where this was initially posted under the title <a href="http://www.tnl.net/blog/2005/06/20/technorati-yahoo-and-google-too/">Technorati Yahoo and Google Too</a>. You can follow him on Twitter <a href="https://twitter.com/TNLNYC">here</a> or receive his weekly newsletter by subscribing <a href="http://eepurl.com/gb6zD">here</a>.</i></p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2005/06/20/technorati-yahoo-and-google-too/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Secrets of the A-list bloggers: Technorati vs. Google</title>
		<link>http://www.tnl.net/blog/2005/06/13/secrets-of-the-a-list-bloggers-technorati-vs-google/</link>
		<comments>http://www.tnl.net/blog/2005/06/13/secrets-of-the-a-list-bloggers-technorati-vs-google/#comments</comments>
		<pubDate>Mon, 13 Jun 2005 07:52:46 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2005/06/13/secrets-of-the-a-list-bloggers-technorati-vs-google/</guid>
		<description><![CDATA[Looking at data about the A-list, first on its own, and later as part of a wider scope, made me wonder about the initial data set I was using. What does it mean to be on the Technorati 100? Is Technorati presenting an accurate representation of the world? And how does it compare [...]<p><i><a href="http://tnl.net/who" rel="author" title="Who is Tristan Louis?">Tristan Louis</a> is the founder and CEO of <a href="http://www.keepskor.com" title="Keepskor">Keepskor</a> and writes the influential <a href="http://www.tnl.net/" title="tnl.net">tnl.net</a> weblog, where this was initially posted under the title <a href="http://www.tnl.net/blog/2005/06/13/secrets-of-the-a-list-bloggers-technorati-vs-google/">Secrets of the A-list bloggers: Technorati vs. Google</a>. You can follow him on Twitter <a href="https://twitter.com/TNLNYC">here</a> or receive his weekly newsletter by subscribing <a href="http://eepurl.com/gb6zD">here</a>.</i></p>
]]></description>
			<content:encoded><![CDATA[<p>Looking at data about the A-list, first <a href="http://www.tnl.net/blog/2005/05/24/secrets-of-the-a-list-bloggers-lots-of-short-entries/" title="TNL.net: Secrets of the A-List Bloggers: Lots of short entries">on its own</a>, and later as <a href="http://www.tnl.net/blog/2005/06/01/secrets-of-the-a-list-bloggers-technorati-links/" title="TNL.net: Secrets of the A-List Bloggers: Technorati Links">part of a wider scope</a>, made me wonder about the initial data set I was using. What does it mean to be on the <a href="http://technorati.com/" title="Technorati Top 100">Technorati 100</a>? Is Technorati presenting an accurate representation of the world? And how does it compare against the wider world? So I decided to start gathering data from other search engines. In this entry, I will go into a more detailed analysis of that information and attempt to answer some of the questions raised above.</p>
<h3>Source Check</h3>
<p>In my initial review, I noticed that Technorati was ranking sites based on sources. However, the major search engines do not really expose source data, whether incoming or outgoing. So, for this particular investigation, I decided to set source data aside and focus on link data. I went and got link data from the three largest search engines: Google, Yahoo! and MSN (that last one was included at the last minute just because I knew that <a href="http://radio-weblogs.com/0001011/" title="The Scobleizer">Robert Scoble</a> would complain about the study being biased if I didn’t include MSN).</p>
<p>Picking three search engines was also interesting because it provided some sort of reference check. If one of the engines did not line up with the other two, we could point to a potential flaw in that engine instead of trying to understand why the data was wrong.</p>
<p>Having picked the engines, I started gathering the data. Let me say that it’s a lot of information and, should I try to do this again in the future, writing software to gather the information will probably be less time-consuming than trying to get it by hand.</p>
<p>But enough about the process, let’s get into the numbers.</p>
<h3>Technorati vs. Google</h3>
<p>So the first data set I created was a comparative index of Technorati and Google. The set was created by grabbing the number of links to a site in Google and getting the equivalent value for Technorati. The result looked like this:</p>
<table border="1" summary="technorati vs. google">
<tr>
<th>Technorati Top 100</th>
<th>Google Links</th>
<th>Technorati Links</th>
<th>Technorati/Google</th>
</tr>
<tr>
<td>Boing Boing</td>
<td>45200</td>
<td>22532</td>
<td>49.8496%</td>
</tr>
<tr>
<td>InstaPundit</td>
<td>75000</td>
<td>15190</td>
<td>20.2533%</td>
</tr>
<tr>
<td>Daily Kos</td>
<td>59800</td>
<td>15833</td>
<td>26.4766%</td>
</tr>
<tr>
<td>Gizmodo</td>
<td>39300</td>
<td>12278</td>
<td>31.2417%</td>
</tr>
<tr>
<td>Fark</td>
<td>43600</td>
<td>10216</td>
<td>23.4312%</td>
</tr>
<tr>
<td>EnGadget</td>
<td>46800</td>
<td>15051</td>
<td>32.1603%</td>
</tr>
<tr>
<td>Davenetics</td>
<td>1780</td>
<td>7571</td>
<td>425.3371%</td>
</tr>
<tr>
<td>Eschaton</td>
<td>62400</td>
<td>8713</td>
<td>13.9631%</td>
</tr>
<tr>
<td>Dooce</td>
<td>23600</td>
<td>6797</td>
<td>28.8008%</td>
</tr>
<tr>
<td>Andrew Sullivan</td>
<td>41100</td>
<td>7680</td>
<td>18.6861%</td>
</tr>
<tr>
<td>The Best Page In The Universe</td>
<td>656</td>
<td>6333</td>
<td>965.3963%</td>
</tr>
<tr>
<td>Talking Points Memo: by Joshua Micah Marshall</td>
<td>74600</td>
<td>7592</td>
<td>10.1769%</td>
</tr>
<tr>
<td>lgf: anti-idiotarian</td>
<td>14700</td>
<td>8275</td>
<td>56.2925%</td>
</tr>
<tr>
<td>kottke.org</td>
<td>32000</td>
<td>7278</td>
<td>22.7438%</td>
</tr>
<tr>
<td>WIL WHEATON DOT NET</td>
<td>16900</td>
<td>6314</td>
<td>37.3609%</td>
</tr>
<tr>
<td>Metafilter</td>
<td>34500</td>
<td>7591</td>
<td>22.0029%</td>
</tr>
<tr>
<td>Doc Searls</td>
<td>33600</td>
<td>5690</td>
<td>16.9345%</td>
</tr>
<tr>
<td>(In)formacao e (In)utilidade</td>
<td>1780</td>
<td>6040</td>
<td>339.3258%</td>
</tr>
<tr>
<td>Wonkette</td>
<td>28800</td>
<td>5877</td>
<td>20.4063%</td>
</tr>
<tr>
<td>Scripting News</td>
<td>39400</td>
<td>5728</td>
<td>14.5381%</td>
</tr>
<tr>
<td>Power Line</td>
<td>7510</td>
<td>7477</td>
<td>99.5606%</td>
</tr>
<tr>
<td>Balmasque</td>
<td>24</td>
<td>4544</td>
<td>18933.3333%</td>
</tr>
<tr>
<td>Corante</td>
<td>6770</td>
<td>7686</td>
<td>113.5303%</td>
</tr>
<tr>
<td>A list Apart</td>
<td>21100</td>
<td>5536</td>
<td>26.2370%</td>
</tr>
<tr>
<td>Something Awful</td>
<td>9020</td>
<td>4512</td>
<td>50.0222%</td>
</tr>
<tr>
<td>Megatokyo</td>
<td>7310</td>
<td>4154</td>
<td>56.8263%</td>
</tr>
<tr>
<td>Michelle Malkin</td>
<td>17300</td>
<td>6091</td>
<td>35.2081%</td>
</tr>
<tr>
<td>Arts and Letters Daily</td>
<td>23900</td>
<td>3983</td>
<td>16.6653%</td>
</tr>
<tr>
<td>Gawker</td>
<td>23500</td>
<td>4453</td>
<td>18.9489%</td>
</tr>
<tr>
<td>Afterall it was the best I ever had</td>
<td>95</td>
<td>3591</td>
<td>3780.0000%</td>
</tr>
<tr>
<td>The Volokh Conspiracy</td>
<td>42000</td>
<td>5873</td>
<td>13.9833%</td>
</tr>
<tr>
<td>Scobleizer</td>
<td>21800</td>
<td>5524</td>
<td>25.3394%</td>
</tr>
<tr>
<td>Jeffrey Zeldman</td>
<td>22500</td>
<td>4134</td>
<td>18.3733%</td>
</tr>
<tr>
<td>This Modern World</td>
<td>32100</td>
<td>3913</td>
<td>12.1900%</td>
</tr>
<tr>
<td>The Web Standards Project</td>
<td>1850</td>
<td>3810</td>
<td>205.9459%</td>
</tr>
<tr>
<td>Joel on Software</td>
<td>22400</td>
<td>4514</td>
<td>20.1518%</td>
</tr>
<tr>
<td>Media Matters for America</td>
<td>24800</td>
<td>6809</td>
<td>27.4556%</td>
</tr>
<tr>
<td>Television without pity</td>
<td>13300</td>
<td>3859</td>
<td>29.0150%</td>
</tr>
<tr>
<td>Kuro5hin</td>
<td>17300</td>
<td>4208</td>
<td>24.3237%</td>
</tr>
<tr>
<td>Lileks</td>
<td>0</td>
<td>3824</td>
<td>N/A</td>
</tr>
<tr>
<td>Hugh Hewitt</td>
<td>26700</td>
<td>4573</td>
<td>17.1273%</td>
</tr>
<tr>
<td>Joel Veitch</td>
<td>2830</td>
<td>3774</td>
<td>133.3569%</td>
</tr>
<tr>
<td>Truthout</td>
<td>8780</td>
<td>6528</td>
<td>74.3508%</td>
</tr>
<tr>
<td>Baghdad Burning</td>
<td>22700</td>
<td>3519</td>
<td>15.5022%</td>
</tr>
<tr>
<td>Buzz machine</td>
<td>30600</td>
<td>4145</td>
<td>13.5458%</td>
</tr>
<tr>
<td>fleugel</td>
<td>1890</td>
<td>3670</td>
<td>194.1799%</td>
</tr>
<tr>
<td>Informed Comment</td>
<td>27900</td>
<td>3905</td>
<td>13.9964%</td>
</tr>
<tr>
<td>Doppler: redefining podcasting</td>
<td>4420</td>
<td>3040</td>
<td>68.7783%</td>
</tr>
<tr>
<td>geek and proud</td>
<td>355</td>
<td>3166</td>
<td>891.8310%</td>
</tr>
<tr>
<td>loadmemory (Asian site)</td>
<td>83</td>
<td>3324</td>
<td>4004.8193%</td>
</tr>
<tr>
<td>Photojunkie</td>
<td>1540</td>
<td>2860</td>
<td>185.7143%</td>
</tr>
<tr>
<td>Ross Rader</td>
<td>1070</td>
<td>2976</td>
<td>278.1308%</td>
</tr>
<tr>
<td>The Truth Laid Bear</td>
<td>23900</td>
<td>4127</td>
<td>17.2678%</td>
</tr>
<tr>
<td>Joi Ito</td>
<td>23400</td>
<td>5165</td>
<td>22.0726%</td>
</tr>
<tr>
<td>ScrappleFace</td>
<td>31100</td>
<td>3480</td>
<td>11.1897%</td>
</tr>
<tr>
<td>LexText</td>
<td>1970</td>
<td>2671</td>
<td>135.5838%</td>
</tr>
<tr>
<td>Google Blog</td>
<td>46</td>
<td>3688</td>
<td>8017.3913%</td>
</tr>
<tr>
<td>Xbox</td>
<td>6600</td>
<td>4221</td>
<td>63.9545%</td>
</tr>
<tr>
<td>My life in a Bush of Ghosts</td>
<td>6</td>
<td>2519</td>
<td>41983.3333%</td>
</tr>
<tr>
<td>Astronomy picture of the day</td>
<td>5020</td>
<td>3498</td>
<td>69.6813%</td>
</tr>
<tr>
<td>Crooked Timber</td>
<td>3560</td>
<td>3617</td>
<td>101.6011%</td>
</tr>
<tr>
<td>Vodka Pundit</td>
<td>4520</td>
<td>3085</td>
<td>68.2522%</td>
</tr>
<tr>
<td>Captain’s quarter</td>
<td>27100</td>
<td>3671</td>
<td>13.5461%</td>
</tr>
<tr>
<td>A small victory</td>
<td>16700</td>
<td>3223</td>
<td>19.2994%</td>
</tr>
<tr>
<td>Gato Fedorento</td>
<td>1630</td>
<td>2574</td>
<td>157.9141%</td>
</tr>
<tr>
<td>Mezzoblue</td>
<td>12000</td>
<td>2952</td>
<td>24.6000%</td>
</tr>
<tr>
<td>PostSecret</td>
<td>5790</td>
<td>2707</td>
<td>46.7530%</td>
</tr>
<tr>
<td>Samizdata.net</td>
<td>1050</td>
<td>2872</td>
<td>273.5238%</td>
</tr>
<tr>
<td>Lawrence Lessig</td>
<td>30600</td>
<td>2949</td>
<td>9.6373%</td>
</tr>
<tr>
<td>Counterpunch</td>
<td>11700</td>
<td>3278</td>
<td>28.0171%</td>
</tr>
<tr>
<td>Democratic Underground</td>
<td>14900</td>
<td>3913</td>
<td>26.2617%</td>
</tr>
<tr>
<td>Right Wing News</td>
<td>27900</td>
<td>2967</td>
<td>10.6344%</td>
</tr>
<tr>
<td>StopDesign</td>
<td>10200</td>
<td>3037</td>
<td>29.7745%</td>
</tr>
<tr>
<td>iBiblio</td>
<td>9730</td>
<td>3105</td>
<td>31.9116%</td>
</tr>
<tr>
<td>Samizdata.net (mistake?)</td>
<td>25500</td>
<td>2743</td>
<td>10.7569%</td>
</tr>
<tr>
<td>Abrupto</td>
<td>550</td>
<td>2935</td>
<td>533.6364%</td>
</tr>
<tr>
<td>gene7299 (Asian MSNSpaces site)</td>
<td>58</td>
<td>3215</td>
<td>5543.1034%</td>
</tr>
<tr>
<td>Where is Raed?</td>
<td>10100</td>
<td>2409</td>
<td>23.8515%</td>
</tr>
<tr>
<td>B3TA: We love the web</td>
<td>12000</td>
<td>2614</td>
<td>21.7833%</td>
</tr>
<tr>
<td>Talkleft</td>
<td>7170</td>
<td>2901</td>
<td>40.4603%</td>
</tr>
<tr>
<td>Wizbang</td>
<td>21000</td>
<td>3358</td>
<td>15.9905%</td>
</tr>
<tr>
<td>m1net (MSN spaces site)</td>
<td>104</td>
<td>3548</td>
<td>3411.5385%</td>
</tr>
<tr>
<td>Hoder</td>
<td>1480</td>
<td>5422</td>
<td>366.3514%</td>
</tr>
<tr>
<td>CTRL+Alt+Del</td>
<td>2310</td>
<td>2315</td>
<td>100.2165%</td>
</tr>
<tr>
<td>Brad DeLong</td>
<td>30100</td>
<td>2715</td>
<td>9.0199%</td>
</tr>
<tr>
<td>Blogs for Bush</td>
<td>16200</td>
<td>3560</td>
<td>21.9753%</td>
</tr>
<tr>
<td>Neil Gaiman</td>
<td>13700</td>
<td>2194</td>
<td>16.0146%</td>
</tr>
<tr>
<td>Gothamist</td>
<td>15200</td>
<td>2729</td>
<td>17.9539%</td>
</tr>
<tr>
<td>Thought Mechanics</td>
<td>4400</td>
<td>2197</td>
<td>49.9318%</td>
</tr>
<tr>
<td>IMAO</td>
<td>23800</td>
<td>2905</td>
<td>12.2059%</td>
</tr>
<tr>
<td>Dan Gillmor (old weblog)</td>
<td>10800</td>
<td>2600</td>
<td>24.0741%</td>
</tr>
<tr>
<td>HINAGATA</td>
<td>10100</td>
<td>2186</td>
<td>21.6436%</td>
</tr>
<tr>
<td>Dean’s World</td>
<td>30600</td>
<td>2985</td>
<td>9.7549%</td>
</tr>
<tr>
<td>Defamer</td>
<td>9310</td>
<td>2372</td>
<td>25.4780%</td>
</tr>
<tr>
<td>USS Clueless</td>
<td>8470</td>
<td>2570</td>
<td>30.3424%</td>
</tr>
<tr>
<td>Dive into Mark</td>
<td>14600</td>
<td>2540</td>
<td>17.3973%</td>
</tr>
<tr>
<td>Pandagon</td>
<td>27300</td>
<td>2822</td>
<td>10.3370%</td>
</tr>
<tr>
<td>Blogging.la</td>
<td>3200</td>
<td>3061</td>
<td>95.6563%</td>
</tr>
<tr>
<td>Why are you worshipping the ground I blog on?</td>
<td>1430</td>
<td>2238</td>
<td>156.5035%</td>
</tr>
<tr>
<td>Daring Fireball</td>
<td>12000</td>
<td>2573</td>
<td>21.4417%</td>
</tr>
</table>
<p>The last column is a quick calculation showing what percentage of Google’s links was available in Technorati. From there, we’re already noticing some interesting trends. While most of the data shows Google as having a larger set of links in its index than Technorati, there are 16 cases where the Technorati index of links is larger than the Google one. That is roughly 16% of the dataset, too large a share to dismiss as noise. How Technorati ends up getting more data than Google is something that someone might want to investigate. Beyond that, it appears that Technorati gets about 30% of the links that Google gets to a particular site, as illustrated in the chart below:</p>
<p><img src="http://www.tnl.net/assets/images/blog/secrets/TGP.gif" alt="technorati vs. google" /></p>
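<p>For readers who want to check the arithmetic, here is a minimal Python sketch of how the Technorati/Google column can be reproduced. The five rows are copied from the table above; the function and variable names are made up for illustration and are not part of any tool used for the original study.</p>

```python
# Link counts copied from a few rows of the table above: (Google, Technorati).
rows = {
    "Samizdata.net": (25500, 2743),
    "Abrupto": (550, 2935),
    "Where is Raed?": (10100, 2409),
    "CTRL+Alt+Del": (2310, 2315),
    "Brad DeLong": (30100, 2715),
}


def technorati_share(google_links, technorati_links):
    """Technorati links expressed as a percentage of Google links."""
    return 100.0 * technorati_links / google_links


shares = {name: technorati_share(g, t) for name, (g, t) in rows.items()}

# Blogs where Technorati reports more links than Google (share above 100%).
technorati_ahead = sorted(name for name, pct in shares.items() if pct > 100)

for name, pct in shares.items():
    print(f"{name}: {pct:.4f}%")
print("Technorati ahead:", technorati_ahead)
```

<p>Any row that lands above 100% is one of the cases where Technorati knows about more links than Google does.</p>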
<p>The next interesting finding is that while the link count from Technorati is generally lower than Google’s, it is consistently lower. A quick analysis of the data set shows that the overall percentage of Technorati links compared to Google links (computed from the totals) is not far from the same percentage computed from the medians. Confused by that last sentence? Don’t worry (I was too after I wrote it) and let me show you, by pulling out another data chart:</p>
<table border="1" summary="data">
<tr>
<th>Technorati Top 100</th>
<th>Google Links</th>
<th>Technorati Links</th>
<th>Technorati/Google</th>
</tr>
<tr>
<td>TOTAL</td>
<td>1739867</td>
<td>479580</td>
<td>27.5642%</td>
</tr>
<tr>
<td>MEDIAN</td>
<td>13500</td>
<td>3679.5</td>
<td>27.2556%</td>
</tr>
</table>
<p>Doesn’t it all become clearer? Across the top 100 bloggers, Technorati holds 27.56% of the links that Google holds. Part of the reason behind this may be that Technorati only represents the blog subset of the web while Google represents linkage for the web as a whole. From here, we could gather that for every link a blog provides, other sources on the web provide between 2 and 3 links (72.44 / 27.56 is roughly 2.6). Since blogs still represent a small portion of the web, however, the importance of links in the blog world may be outpacing the importance of links in the non-blog world. Part of the reason behind this could be that links are one of the big currencies in the web space and many blogs are offering little content but are heavy on the linking. An average blog entry may run under 300 words, yet it often contains at least one link. This could mean that Technorati and other blog search engines are right to consider links as a strong measurement, but may show that blogs, as a medium, are not providing that much content beyond linking.</p>
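<p>The arithmetic behind those figures can be sketched in a few lines of Python. The numbers come straight from the TOTAL and MEDIAN rows above; the one assumption, flagged in the code, is that every link Technorati sees is also counted by Google, which the data does not actually guarantee.</p>

```python
google_total = 1_739_867       # TOTAL row, Google links
technorati_total = 479_580     # TOTAL row, Technorati links
google_median = 13_500         # MEDIAN row, Google links
technorati_median = 3_679.5    # MEDIAN row, Technorati links

share_of_totals = technorati_total / google_total
share_of_medians = technorati_median / google_median

# Assumption: Technorati's links are a strict subset of Google's, so the
# remainder of Google's count comes from non-blog sources.
other_links_per_blog_link = (google_total - technorati_total) / technorati_total

print(f"share from totals:  {share_of_totals:.4%}")
print(f"share from medians: {share_of_medians:.4%}")
print(f"non-blog links per blog link: {other_links_per_blog_link:.2f}")
```

<p>The two shares land within about a third of a percentage point of each other, which is the consistency described above.</p>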
<p>However, it gets even more interesting if you dig in. Looking at the data, these values are actually misleading. What is happening is not truly an egalitarian match. Doing a quick review of the distribution, we start seeing some interesting trends.</p>
<table border="1" summary="trends">
<tr>
<th>Technorati Top 100</th>
<th>Google Links</th>
<th>Technorati Links</th>
<th>Technorati/Google</th>
</tr>
<tr>
<td>AVERAGE TOP 10</td>
<td>43858</td>
<td>12186.1</td>
<td>27.7854%</td>
</tr>
<tr>
<td>AVERAGE TOP 25</td>
<td>30397.6</td>
<td>8733.36</td>
<td>28.7304%</td>
</tr>
<tr>
<td>AVERAGE TOP 50</td>
<td>23127.06</td>
<td>6534.36</td>
<td>28.2542%</td>
</tr>
<tr>
<td>AVERAGE BOTTOM 50</td>
<td>11443.07843</td>
<td>3057.24</td>
<td>26.7169%</td>
</tr>
<tr>
<td>AVERAGE BOTTOM 25</td>
<td>11980.07692</td>
<td>2834.884615</td>
<td>23.6633%</td>
</tr>
<tr>
<td>AVERAGE BOTTOM 10</td>
<td>13782.72727</td>
<td>2622.909091</td>
<td>19.0304%</td>
</tr>
</table>
<p>Let’s graph the Technorati links as percentage of Google to see a little more of what I’m inferring:</p>
<p><img src="http://www.tnl.net/assets/images/blog/secrets/TGaverages.gif" alt="averages" /></p>
<p>Looking at this, it seems that our friends at Technorati have a bias. For blogs in the top 10, Technorati captures about 27.8% of the links Google knows about; for blogs in the bottom 10, that share drops to about 19%, a gap of more than 8 percentage points. Considering that Google already admits to some level of bias in their system (part of the foundation for PageRank is that sites with higher PageRanks get indexed more often), it is a bit worrisome, especially if the trend holds across the whole of Technorati’s universe. If Google favors indexing more popular sites more often, a clear opportunity for world-live-web search engines like Technorati would be in the long tail of less-often-indexed sites, but Technorati seems to ignore that opportunity and concentrate on the top sites. What that will translate into is a direct reproduction of the power laws when it comes to indexing of blogs.</p>
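<p>To make the size of that gap concrete, here is a small Python sketch that recomputes the Technorati/Google share for two rows of the averages table and subtracts one from the other. The figures are copied from the table; the dictionary layout and names are illustrative only.</p>

```python
# (average Google links, average Technorati links), copied from the table above.
averages = {
    "top 10": (43858, 12186.1),
    "bottom 10": (13782.72727, 2622.909091),
}

# Technorati links as a percentage of Google links for each slice.
share = {
    slice_name: 100.0 * technorati / google
    for slice_name, (google, technorati) in averages.items()
}

# Difference between the two slices, in percentage points.
gap = share["top 10"] - share["bottom 10"]

for slice_name, pct in share.items():
    print(f"{slice_name}: {pct:.4f}%")
print(f"gap: {gap:.2f} percentage points")
```

<p>Note that this is a difference in percentage points between two shares, not a probability that any single page gets indexed.</p>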
<p>But is that true of Google vs Technorati only? Or do the same rules apply for other search engines? We’ll look at that in the next entry.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2005/06/13/secrets-of-the-a-list-bloggers-technorati-vs-google/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Secrets of the A-List Bloggers: Lots of short entries</title>
		<link>http://www.tnl.net/blog/2005/05/24/secrets-of-the-a-list-bloggers-lots-of-short-entries/</link>
		<comments>http://www.tnl.net/blog/2005/05/24/secrets-of-the-a-list-bloggers-lots-of-short-entries/#comments</comments>
		<pubDate>Tue, 24 May 2005 08:00:00 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2005/05/24/secrets-of-the-a-list-bloggers-lots-of-short-entries/</guid>
		<description><![CDATA[A couple of weeks ago, when working on the entry about salaries for bloggers, I did a quick analysis of the entries in a day slice. Many people pointed out that this was a small slice and was not representative of what other blogs were doing. From there, I ended up with two questions basically [...]
]]></description>
			<content:encoded><![CDATA[<p>A couple of weeks ago, when working on <a href="http://www.tnl.net/blog/2005/05/09/gawker-bucks-vs-journalists-bucks/" title="TNL.net: Blogger bucks vs. journalist bucks">the entry about salaries for bloggers</a>, I did a quick analysis of the entries in a day slice. Many people pointed out that this was a small slice and was not representative of what other blogs were doing. From there, I ended up with two questions bugging me: first, how many entries does the average blog produce on a daily basis? Second, what is the size of those entries? To answer those questions, I decided to start analyzing the A-list of the blog world.</p>
<h3>Method to the madness</h3>
<p>The first thing in preparing for this analysis was to figure out who I should pick as a subject for the study. That was the easy part, as Technorati, the blogging search engine, is generous enough to provide a <a href="http://technorati.com/" title="Technorati Top 100">Top 100 list</a>, highlighting the superstars of the blogging world.</p>
<p>Sitting at the top of the list were the following blogs:</p>
<ul>
<li><a href="http://www.boingboing.net/" title="Boing Boing">Boing Boing</a>, which is written by a number of people, some of whom <a href="http://www.boingboing.net/2005/05/24/technorati_tracking_.html" title="I am a proud advisor to Technorati">are also advising the company</a></li>
<li><a href="http://pajamasmedia.com/instapundit/" title="InstaPundit">InstaPundit</a>, a libertarian weblog run by Glenn Reynolds</li>
<li><a href="http://www.dailykos.com/" title="Daily Kos">Daily Kos</a>, a left wing weblog run by Markos Moulitsas Zuniga</li>
<li><a href="http://gizmodo.com/" title="Gizmodo">Gizmodo</a>, a weblog about gadgets</li>
<li><a href="http://www.fark.com" title="Fark">Fark</a>, a list of links</li>
<li><a href="http://www.engadget.com">Engadget</a>, a competitor to Gizmodo for the gadget audience</li>
</ul>
<p>I decided to eliminate Fark from my analysis as it was the outlier, generally not producing more than four or five words per link and not separating entries but rather keeping everything on one page, making it look very different from the regular blog format. Having done so, I decided to pick a 24-hour cycle and analyze data from that cycle for the remaining five blogs. The day I picked, May 19th, was a good day for gadget news and political blogs: in the gadget space, the E3 show was closing down. Meanwhile, the potential of a dramatic showdown in the American Congress made for a lot of material for political bloggers.</p>
<p>I waited until the day was completed to start my research. Picking every entry one by one, I cut the entry out, pasted it into Microsoft Word, did a word count on the entry, and recorded the number in an Excel spreadsheet.</p>
<p>So let’s take a look at the numbers:</p>
<table border="1" summary="the numbers">
<tr>
<td>&nbsp;</td>
<th>Boing Boing</th>
<th>InstaPundit</th>
<th>Daily Kos</th>
<th>Gizmodo</th>
<th>Engadget</th>
</tr>
<tr>
<th>1</th>
<td>120</td>
<td>175</td>
<td>202</td>
<td>77</td>
<td>13</td>
</tr>
<tr>
<th>2</th>
<td>136</td>
<td>75</td>
<td>105</td>
<td>115</td>
<td>118</td>
</tr>
<tr>
<th>3</th>
<td>247</td>
<td>3</td>
<td>2</td>
<td>67</td>
<td>147</td>
</tr>
<tr>
<th>4</th>
<td>145</td>
<td>10</td>
<td>4</td>
<td>107</td>
<td>78</td>
</tr>
<tr>
<th>5</th>
<td>94</td>
<td>48</td>
<td>171</td>
<td>133</td>
<td>151</td>
</tr>
<tr>
<th>6</th>
<td>62</td>
<td>33</td>
<td>297</td>
<td>131</td>
<td>111</td>
</tr>
<tr>
<th>7</th>
<td>67</td>
<td>6</td>
<td>7</td>
<td>102</td>
<td>171</td>
</tr>
<tr>
<th>8</th>
<td>196</td>
<td>11</td>
<td>159</td>
<td>134</td>
<td>101</td>
</tr>
<tr>
<th>9</th>
<td>67</td>
<td>27</td>
<td>785</td>
<td>135</td>
<td>111</td>
</tr>
<tr>
<th>10</th>
<td>101</td>
<td>25</td>
<td>527</td>
<td>225</td>
<td>85</td>
</tr>
<tr>
<th>11</th>
<td>294</td>
<td>13</td>
<td>231</td>
<td>99</td>
<td>85</td>
</tr>
<tr>
<th>12</th>
<td>165</td>
<td>9</td>
<td>316</td>
<td>98</td>
<td>152</td>
</tr>
<tr>
<th>13</th>
<td>50</td>
<td>691</td>
<td>401</td>
<td>104</td>
<td>101</td>
</tr>
<tr>
<th>14</th>
<td>64</td>
<td>60</td>
<td>130</td>
<td>92</td>
<td>103</td>
</tr>
<tr>
<th>15</th>
<td>32</td>
<td>16</td>
<td>892</td>
<td>90</td>
<td>88</td>
</tr>
<tr>
<th>16</th>
<td>111</td>
<td>24</td>
<td>352</td>
<td>59</td>
<td>114</td>
</tr>
<tr>
<th>17</th>
<td>202</td>
<td>50</td>
<td>201</td>
<td>82</td>
<td>174</td>
</tr>
<tr>
<th>18</th>
<td>283</td>
<td>71</td>
<td>470</td>
<td>129</td>
<td>210</td>
</tr>
<tr>
<th>19</th>
<td>50</td>
<td>11</td>
<td>391</td>
<td>121</td>
<td>122</td>
</tr>
<tr>
<th>20</th>
<td>49</td>
<td>15</td>
<td>2</td>
<td>49</td>
<td>204</td>
</tr>
<tr>
<th>21</th>
<td>32</td>
<td>864</td>
<td>642</td>
<td>81</td>
<td>94</td>
</tr>
<tr>
<th>22</th>
<td>32</td>
<td>249</td>
<td>647</td>
<td>97</td>
<td>68</td>
</tr>
<tr>
<th>23</th>
<td>40</td>
<td>22</td>
<td>47</td>
<td>69</td>
<td>113</td>
</tr>
<tr>
<th>24</th>
<td>23</td>
<td>10</td>
<td>245</td>
<td>&nbsp;</td>
<td>86</td>
</tr>
<tr>
<th>25</th>
<td>42</td>
<td>9</td>
<td>238</td>
<td>&nbsp;</td>
<td>127</td>
</tr>
<tr>
<th>26</th>
<td>68</td>
<td>254</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>119</td>
</tr>
<tr>
<th>27</th>
<td>42</td>
<td>334</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>99</td>
</tr>
<tr>
<th>28</th>
<td>56</td>
<td>169</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>106</td>
</tr>
<tr>
<th>29</th>
<td>72</td>
<td>33</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>65</td>
</tr>
<tr>
<th>30</th>
<td>151</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>104</td>
</tr>
<tr>
<th>31</th>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>830</td>
</tr>
<tr>
<th>32</th>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>114</td>
</tr>
<tr>
<th>33</th>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>69</td>
</tr>
<tr>
<th>34</th>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>353</td>
</tr>
<tr>
<th>35</th>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>147</td>
</tr>
<tr>
<th>36</th>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>58</td>
</tr>
<tr>
<th>37</th>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>127</td>
</tr>
<tr>
<th>38</th>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>158</td>
</tr>
<tr>
<th>39</th>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>95</td>
</tr>
<tr>
<th>40</th>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>209</td>
</tr>
</table>
<p>A cursory look at this shows a lot of interesting data. For starters, all the A-list bloggers in that group posted at a rate of roughly an entry per hour or more (Gizmodo, with 23 entries, was the slowest). However, looking at this, it was unclear how long a typical entry was.</p>
<p>I decided to massage the data a bit. Individual entry data did not provide much in the way of a clear view but aggregated information did give me a clearer picture. Let’s take a look:</p>
<table border="1" summary="aggregate numbers">
<tr>
<td>&nbsp;</td>
<th>Boing Boing</th>
<th>InstaPundit</th>
<th>Daily Kos</th>
<th>Gizmodo</th>
<th>Engadget</th>
<th>Average</th>
<th>Total</th>
</tr>
<tr>
<th>Daily Total</th>
<td>3093</td>
<td>3317</td>
<td>7464</td>
<td>2396</td>
<td>5580</td>
<td>4370</td>
<td>21850</td>
</tr>
<tr>
<th># of entries</th>
<td>30</td>
<td>29</td>
<td>25</td>
<td>23</td>
<td>40</td>
<td>29.4</td>
<td>147</td>
</tr>
<tr>
<th>Average words/entry</th>
<td>103.1</td>
<td>114.37931</td>
<td>298.56</td>
<td>104.173913</td>
<td>139.5</td>
<td>148.639456</td>
<td>&nbsp;</td>
</tr>
</table>
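<p>For anyone who would rather not trust my spreadsheet, the aggregate rows above can be rebuilt with a short Python sketch. The daily totals and entry counts are the ones from the table; the variable names are made up for illustration.</p>

```python
# Daily word totals and entry counts for May 19th, copied from the table above.
daily_words = {
    "Boing Boing": 3093,
    "InstaPundit": 3317,
    "Daily Kos": 7464,
    "Gizmodo": 2396,
    "Engadget": 5580,
}
entries = {
    "Boing Boing": 30,
    "InstaPundit": 29,
    "Daily Kos": 25,
    "Gizmodo": 23,
    "Engadget": 40,
}

total_words = sum(daily_words.values())
total_entries = sum(entries.values())

# Average words per entry for each blog, then across all five blogs.
words_per_entry = {blog: daily_words[blog] / entries[blog] for blog in daily_words}
overall_words_per_entry = total_words / total_entries

print("total words:", total_words)
print("total entries:", total_entries)
print("average entries per blog:", total_entries / len(entries))
for blog, avg in words_per_entry.items():
    print(f"{blog}: {avg:.1f} words per entry")
print(f"overall: {overall_words_per_entry:.1f} words per entry")
```

<p>The overall figure divides all words by all entries, so blogs that post more often weigh more heavily than they would in a simple average of the five per-blog averages.</p>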
<p>The data became clearer. On that particular day, the top five blogs published an average of about 30 entries each, with the average entry running just under 150 words. This reminds me of something <a href="http://blogs.law.harvard.edu/philg/" title="Philip Greenspun's weblog">Philip Greenspun, another A-list blogger</a>, had said about why he liked blogs:</p>
<blockquote><p>It allows me to experiment with the three paragraph form</p></blockquote>
<p>Considering the size of the average entry from this, it seems very clear that an entry should be brief.</p>
<p>However, going beyond that is the number of entries that come in on a day. Looking at this, the average Top 5 A-list blog published almost 30 entries. Think about it for a second or two. 30 entries! It’s a huge number for a single day.</p>
<h3>From the reader standpoint</h3>
<p>So let’s say you popped up your news aggregator of choice and have subscribed to each of those blogs. How much would you read? How much information would you get? Our little analysis shows you would have read a bit under 22,000 words. That would amount, in terms of printed pages, to 44 single-spaced pages.</p>
<p>Your alternative? Well, on that day, you could have <a href="http://ask.metafilter.com/18970/How-many-words-on-the-front-page" title="How Many Words on the front page?">picked up the New York Times and read every story on the front page</a>. That would have netted you 12,964 words, or about 26 single-spaced printed pages. You could have listened to the evening news, for about 3,000 more words. Ultimately, you would have consumed more words reading blogs than going with mainstream media: 5 TV shows would have netted you about 15,000 words. 5 newspaper stories (assuming a different report on each story) would have netted you about 8,000 words. So blogs are much more prolific in terms of words.</p>
<h3>Blogger burn out</h3>
<p>Notice that I’ve carefully avoided the subject of quality in this particular analysis. This seems to be an increasing issue in the blogging world. Some bloggers, like <a href="http://joi.ito.com/weblog/2005/05/21/becoming-boring.html" title="Becoming Boring">Joi Ito</a> and <a href="http://avc.blogs.com/a_vc/2005/05/bagging_the_pos.html" title="Bagging the Post">Fred Wilson</a>, are starting to worry about the quality of entries. Is this the onset of a rush to more substantial but less frequent posts? Only time will tell.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2005/05/24/secrets-of-the-a-list-bloggers-lots-of-short-entries/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Google Accelerates Search</title>
		<link>http://www.tnl.net/blog/2005/05/06/google-accelerates-search/</link>
		<comments>http://www.tnl.net/blog/2005/05/06/google-accelerates-search/#comments</comments>
		<pubDate>Fri, 06 May 2005 07:03:01 +0000</pubDate>
		<dc:creator>Tristan Louis</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[Browser]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://tnl.net/blog/2005/05/06/google-accelerates-search/</guid>
		<description><![CDATA[Google introduced a new tool called Web Accelerator. While much will be made of the fears about the privacy implications of that move, I personally believe that this move is one that is deeply rooted in the search mission of the company and will be seen as a gambit of the same size as the [...]
]]></description>
			<content:encoded><![CDATA[<p>Google introduced a new tool called <a href="http://webaccelerator.google.com" title="Google Web Accelerator">Web Accelerator</a>. While much will be made of the fears about the privacy implications of that move, I personally believe that this move is one that is deeply rooted in the search mission of the company and will be seen as a gambit of the same size as the one taken by its founders when they first looked at Yahoo! in the mid-90s and figured they could deliver a better search product.</p>
<h3>How Search Engines Work</h3>
<p>Before I get into details as to why I think this web accelerator is a major search move by Google, I first need to educate some of my readers as to the basics of search and some of the issues relating to creating a good search product. If you already know about search indexing, you can skip to the next section.</p>
<p>Search engines basically act not only as giant card catalogs, similar to the ones you can find in a library, but also as giant libraries in and of themselves. When you type a word in a search box, what happens next includes a number of different steps that allow the engine to look through a giant index, which is basically an image of all the pages the search engine knows about.</p>
<p>Those indexes are created by programs known as spiders (sometimes also referred to as web robots or crawlers). Those programs are independent pieces of software that go and basically surf the web at very high speed, making copies of everything they encounter and comparing what they find to what other spiders have found. That giant set of pages copied by spiders is called an index (it is also sometimes referred to as a collection). Spiders run around the clock and their sole job is to get more pages and ensure that the pages they’ve gotten in the past still exist and that they have not changed (if they have changed, the spider will “re-index” the page, i.e., delete the previous one from the index and put the new version in its place).</p>
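<p>The spider loop described above can be sketched in a few lines of Python. This is a toy under loud assumptions: the “web” is a small in-memory dictionary with made-up example URLs, change detection uses a content hash (one common way to do it, not necessarily what any real search engine does), and nothing here touches the network.</p>

```python
import hashlib
from collections import deque

# A tiny in-memory "web": URL -> (page text, outgoing links). Purely illustrative.
WEB = {
    "a.example/": ("home page", ["a.example/about", "b.example/"]),
    "a.example/about": ("about us", []),
    "b.example/": ("another site", ["a.example/"]),
}


def fingerprint(text):
    """Hash of the page text, used to tell whether a page changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def crawl(start_url, index):
    """Breadth-first crawl that copies pages into index and re-indexes changes.

    Returns the URLs that were added or replaced during this pass.
    """
    updated = []
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url not in WEB:
            index.pop(url, None)      # page no longer exists: drop our copy
            continue
        text, links = WEB[url]
        digest = fingerprint(text)
        if url not in index or index[url]["digest"] != digest:
            index[url] = {"text": text, "digest": digest}   # (re-)index
            updated.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return updated


index = {}
first_pass = crawl("a.example/", index)
WEB["a.example/about"] = ("about us, updated", [])   # simulate an edited page
second_pass = crawl("a.example/", index)
print(first_pass)
print(second_pass)
```

<p>The first pass copies all three pages; the second pass only replaces the one page whose content changed, which is the “re-index” step.</p>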
<h3>Size Matters</h3>
<p>The idea is a surprisingly simple one and was first introduced in the early days of the web. At the time, creating an index of all the pages on the web was relatively easy, largely due to the fact that there were not that many pages and that not that many people were creating them (I actually enjoy surprising newbies by telling them that I once saw the whole web, every single page on it. What I omit until later in the story is that I did this in 1993, at a time when you could count the number of web servers without hitting 100 and when you could actually see the whole web in only a few hours.)</p>
<p>The amazing thing is that, although the number of web sites (and hence the number of web pages) has exploded, the basic technology to build a search index has not evolved that much. The concepts are basically the same today as they were in 1994–1995 but the web is now much, much larger.</p>
<p>How large, you wonder? Well, a good indicator would be to take a look at the bottom of the <a href="http://www.google.com" title="Google">Google home page</a> for a number. As of this writing, that number stands at 8,058,044,651. That’s over 8 billion pages, a very large number and one that folks at Google are appropriately proud of.</p>
<p>There’s only one little issue with that number. It’s on the low side. In fact, it’s estimated that it represents less than one percent of the actual number of pages on the web. In 2001, that number was estimated at over 500 billion pages in what is called the <a href="http://en.wikipedia.org/wiki/Deep_web" title="Deep Web">Deep Web</a>, a part of the web that has not been indexed by search engines yet. With the growth of weblogs, which generate tons of content on a daily basis, and the connection of more systems like books, satellite maps, etc. to the web, you can only imagine that the number has grown.</p>
<p>Let’s pause for a moment and assume that only as many pages were created between 2001 and now as were created in the previous four years, at the height of the dotcom boom. This means that there would be over a trillion web pages on the Internet. Now that gets to be a much more interesting number.</p>
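<p>Here is that back-of-the-envelope estimate as a Python sketch. The inputs are the figures quoted in this entry (Google’s reported index size and the 2001 estimate, rounded down to 500 billion); the doubling is my own assumption from the paragraph above, not a measurement.</p>

```python
google_index = 8_058_044_651        # number shown on the Google home page
deep_web_2001 = 500_000_000_000     # 2001 estimate, rounded down to 500 billion

# Assumption from the text: as many pages were added since 2001 as existed then.
estimated_pages_now = deep_web_2001 * 2

coverage_2001_estimate = google_index / deep_web_2001
coverage_now_estimate = google_index / estimated_pages_now

print(f"estimated pages now: {estimated_pages_now:,}")
print(f"Google coverage against the 2001 estimate: {coverage_2001_estimate:.2%}")
print(f"Google coverage against the doubled estimate: {coverage_now_estimate:.2%}")
```

<p>Against the doubled estimate, the index covers under one percent of the web, which is where the “less than one percent” figure comes from.</p>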
<h3>You Call THIS Fresh?!?</h3>
<p>So we know that Google has a problem in finding a lot of the pages that already exist on the Internet. But that’s nothing compared to the other problem Google has.</p>
<p>Imagine an index with 1 million pages. If you assume that a spider can index those one million pages in a day, the copy of each page in the index is refreshed daily, meaning that the index gets a new version of the pages only once a day. Now try to do the same with 8 billion pages and it becomes a pretty complicated problem. Google has solved some of that problem by basically deciding that some sites have a higher worth than others. As a result, sites which are known to refresh their content on a regular basis get more attention from Google than sites that do not.</p>
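<p>To put numbers on the freshness problem, here is a minimal Python sketch. The one-million-pages-per-day spider is the hypothetical from the paragraph above; the prioritization function is only my guess at the general idea of favoring frequently updated sites, with made-up site names and change rates, and is not a description of how Google actually schedules its crawl.</p>

```python
import math


def days_for_full_pass(total_pages, pages_per_day):
    """Days one spider needs to revisit every page in the index once."""
    return math.ceil(total_pages / pages_per_day)


def pick_sites_to_crawl(changes_per_day, budget):
    """Return the `budget` sites with the highest observed change rate."""
    ranked = sorted(changes_per_day, key=changes_per_day.get, reverse=True)
    return ranked[:budget]


small_index_days = days_for_full_pass(1_000_000, 1_000_000)
large_index_days = days_for_full_pass(8_000_000_000, 1_000_000)

# Hypothetical change rates (updates per day) for three made-up sites.
observed = {"news.example": 24.0, "blog.example": 3.0, "static.example": 0.01}
crawl_first = pick_sites_to_crawl(observed, budget=2)

print(small_index_days, large_index_days)
print(crawl_first)
```

<p>With a single spider at that speed, a full pass over 8 billion pages would take 8,000 days, which is why some form of prioritization (or a lot more spiders) is unavoidable.</p>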
<p>With the explosion of weblogs, however, a new breed of sites has created a problem for Google. For starters, there are a lot of them, and most of them refresh their content regularly, in some cases more than once a day. This makes the job of producing relevant indexes almost impossible for Google, turning their search engine into something more akin to a library, the kind of place that you use when you are looking for a reference, than an up to date source.</p>
<p>Not only that but, if Google is to also index the deep web, keeping track of all the changes across all the web becomes impossible… Impossible, that is, if you are using crawlers.</p>
<p>So we now know that the crawlers are no longer the right option when it comes to keeping fresh information within a proper search engine index. Looking at this, Google needs to do something radical. On the one hand, they can try to build a system that will get the most up to date information through notification from the sites that are updating content. This is where services like <a href="http://technorati.com/" title="Technorati">Technorati</a> and Feedster come in, getting updates from RSS feeds and thus building indexes with more recent information than Google’s.</p>
<p>On the other hand, they could look at increasing the number of crawlers they are using. We know that <a href="http://www.tnl.net/blog/2004/04/30/how-many-google-machines/" title="TNL.net: How Many Google Machines">Google has a lot of machines</a> but trying to scale to the point where they can monitor a trillion pages via crawl would require a lot more power than that.</p>
<p>Enter Web Accelerator!</p>
<h3>Spreading the Load</h3>
<p>In the late 90s, distributed computing took hold as a concept. Projects like SETI@home and <a href="http://folding.stanford.edu/" title="Folding@Home">Folding@Home</a> have shown the way in terms of harnessing the power of millions of computers to solve processor-intensive kinds of problems. Google started looking at this with the rollout of their toolbar with a feature called <a href="http://www.google.com/toolbar/ie/index.html" title="Google Compute">Google Compute</a>.</p>
<p>Now let’s move forward. What if you could get information as to what pages are new and what pages have changed by just observing where people are surfing? This is the space that the accelerator occupies. Sitting neatly between your web browser and the Google architecture is a mini proxy that keeps checking if it can find a way to give you pages at a faster rate from the Google index than it does from the actual existing site. Along the way, Google finds out what pages are missing from its index (and gets a chance to add them) and what pages in its index are not up to date.</p>
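<p>To illustrate the idea (and only the idea), here is a toy Python sketch of a proxy that learns about missing and stale pages as a side effect of people browsing. Everything in it is hypothetical: the class name, the hash-based comparison, and the example URLs are mine, and I have no inside knowledge of how Web Accelerator actually works.</p>

```python
import hashlib


def digest(body):
    """Hash of a page body, used to compare the live page with the index copy."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class ObservingProxy:
    """Toy proxy that notes what the index is missing or has out of date."""

    def __init__(self, index):
        self.index = index      # url -> digest of the copy held in the index
        self.missing = set()    # URLs the index had never seen
        self.stale = set()      # URLs whose live copy differed from the index

    def observe(self, url, live_body):
        """Record what a user's visit reveals, then pass the page through."""
        live = digest(live_body)
        known = self.index.get(url)
        if known is None:
            self.missing.add(url)
        elif known != live:
            self.stale.add(url)
        self.index[url] = live  # bring the index up to date
        return live_body


# The index starts with an old copy of one page and no copy of the other.
proxy = ObservingProxy({"a.example/": digest("old front page")})
proxy.observe("a.example/", "new front page")   # changed since last crawl
proxy.observe("b.example/", "hello")            # never crawled
proxy.observe("b.example/", "hello")            # unchanged on the second visit

print(sorted(proxy.missing), sorted(proxy.stale))
```

<p>The point of the sketch is that every visit doubles as a freshness check, so the users end up doing the spiders’ job for them.</p>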
<p>Imagine a million people downloading the Google Web Accelerator and all of a sudden, you have an infrastructure that finds out about a lot of pages very quickly.</p>
<p>Microsoft and Yahoo! are already in competition with Google in the search space. In order to maintain its leadership, Google needs to not only provide an index that is larger than its competitors’ but also more up to date. With this accelerator, they can do that, and only one of its competitors can ever hope to match the feature: Microsoft.</p>
<p>The <a href="http://webaccelerator.google.com/webmasterhelp.html" title="What Webmasters Need To Know About Google Web Accelerator">webmaster FAQ</a> points out that the accelerator covers neither secure pages (nicely bypassing security issues) nor large media files. I suspect that we will see that change in the future, with the addition of images coming first.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tnl.net/blog/2005/05/06/google-accelerates-search/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

