Links and Search Engines: The MSN edition

I’ve been promising for a while to complete this series with results relating to MSN (and, for the record, this has nothing to do with Scoble begging for it). I finally got around to cleaning up the HTML output of Excel and can now present the third (and probably final) installment in my analysis of search engine link features.

To recap, I took the list of Top 100 blogs published by Technorati on May 19th, 2005 and started doing side-by-side comparisons. I first looked at the distribution of links among the top 100, then followed up with an analysis of Technorati against Google. That led to a subsequent chapter on Technorati against Google and Yahoo! (and then a comparison of Google and Yahoo! with each other). All this created a fair amount of buzz in the search world, with some people saying it was interesting and others saying I was way off the mark. Either way, it’s time to take a look at MSN to complete this round-up.

So, to create some benchmarks, let’s start by looking at the distribution of Technorati links against MSN’s:

Technorati Top 100 | MSN Links | Technorati Links | Technorati/MSN Links
Boing Boing | 407172 | 22532 | 5.53378%
InstaPundit | 241472 | 15190 | 6.29058%
Daily Kos | 184666 | 15833 | 8.57386%
Gizmodo | 252869 | 12278 | 4.85548%
Fark | 352289 | 10216 | 2.89989%
EnGadget | 198584 | 15051 | 7.57916%
Davenetics | 3334 | 7571 | 227.08458%
Eschaton | 138241 | 8713 | 6.30276%
Dooce | 118385 | 6797 | 5.74144%
Andrew Sullivan | 96315 | 7680 | 7.97384%
The Best Page In The Universe | 92232 | 6333 | 6.86638%
Talking Points Memo: by Joshua Micah Marshall | 193438 | 7592 | 3.92477%
lgf: anti-idiotarian | 6067 | 8275 | 136.39360%
kottke.org | 159861 | 7278 | 4.55271%
WIL WHEATON DOT NET | 148587 | 6314 | 4.24936%
Metafilter | 136052 | 7591 | 5.57948%
Doc Searls | 95781 | 5690 | 5.94064%
(In)formacao e (In)utilidade | 3272 | 6040 | 184.59658%
Wonkette | 96768 | 5877 | 6.07329%
Scripting News | 183067 | 5728 | 3.12891%
Power Line | 92069 | 7477 | 8.12108%
Balmasque | 409 | 4544 | 1111.00244%
Corante | 23107 | 7686 | 33.26265%
A list Apart | 220584 | 5536 | 2.50970%
Something Awful | 97908 | 4512 | 4.60841%
Megatokyo | 112902 | 4154 | 3.67930%
Michelle Malkin | 72190 | 6091 | 8.43746%
Arts and Letters Daily | 94718 | 3983 | 4.20511%
Gawker | 72773 | 4453 | 6.11903%
Afterall it was the best I ever had | 922 | 3591 | 389.47939%
The Volokh Conspiracy | 88818 | 5873 | 6.61240%
Scobleizer | 68282 | 5524 | 8.08998%
Jeffrey Zeldman | 149539 | 4134 | 2.76450%
This Modern World | 79038 | 3913 | 4.95078%
The Web Standards Project | 211917 | 3810 | 1.79787%
Joel on Software | 133853 | 4514 | 3.37236%
Media Matters for America | 64867 | 6809 | 10.49686%
Television without pity | 46391 | 3859 | 8.31842%
Kuro5hin | 130549 | 4208 | 3.22331%
Lileks | 50706 | 3824 | 7.54151%
Hugh Hewitt | 64118 | 4573 | 7.13216%
Joel Veitch | 23302 | 3774 | 16.19603%
Truthout | 42693 | 6528 | 15.29056%
Baghdad Burning | 51647 | 3519 | 6.81356%
Buzz machine | 72649 | 4145 | 5.70552%
fleugel | 201995 | 3670 | 1.81688%
Informed Comment | 62822 | 3905 | 6.21598%
Doppler: redefining podcasting | 12512 | 3040 | 24.29668%
geek and proud | 714 | 3166 | 443.41737%
loadmemory (Asian site) | 198 | 3324 | 1678.78788%
Photojunkie | 3721 | 2860 | 76.86106%
Ross Rader | 4830 | 2976 | 61.61491%
The Truth Laid Bear | 51806 | 4127 | 7.96626%
Joi Ito | 62642 | 5165 | 8.24527%
ScrappleFace | 49953 | 3480 | 6.96655%
LexText | 1741 | 2671 | 153.41758%
Google Blog | 42967 | 3688 | 8.58333%
Xbox | 86021 | 4221 | 4.90694%
My life in a Bush of Ghosts | 12 | 2519 | 20991.66667%
Astronomy picture of the day | 33625 | 3498 | 10.40297%
Crooked Timber | 60675 | 3617 | 5.96127%
Vodka Pundit | 58205 | 3085 | 5.30023%
Captain’s quarter | 45609 | 3671 | 8.04885%
A small victory | 54767 | 3223 | 5.88493%
Gato Fedorento | 2294 | 2574 | 112.20575%
Mezzoblue | 99511 | 2952 | 2.96651%
PostSecret | 30794 | 2707 | 8.79067%
Samizdata.net | 1712 | 2872 | 167.75701%
Lawrence Lessig | 81047 | 2949 | 3.63863%
Counterpunch | 52642 | 3278 | 6.22697%
Democratic Underground | 35595 | 3913 | 10.99312%
Right Wing News | 61379 | 2967 | 4.83390%
StopDesign | 86165 | 3037 | 3.52463%
iBiblio | 32301 | 3105 | 9.61271%
Samizdata.net (mistake?) | 61443 | 2743 | 4.46430%
Abrupto | 2698 | 2935 | 108.78428%
gene7299 (Asian MSNSpaces site) | 28 | 3215 | 11482.14286%
Where is Raed? | 24848 | 2409 | 9.69495%
B3TA: We love the web | 38386 | 2614 | 6.80977%
Talkleft | 60169 | 2901 | 4.82142%
Wizbang | 60259 | 3358 | 5.57261%
m1net (MSN spaces site) | 22 | 3548 | 16127.27273%
Hoder | 1620 | 5422 | 334.69136%
CTRL+Alt+Del | 32277 | 2315 | 7.17229%
Brad DeLong | 48403 | 2715 | 5.60916%
Blogs for Bush | 50820 | 3560 | 7.00512%
Neil Gaiman | 71916 | 2194 | 3.05078%
Gothamist | 47848 | 2729 | 5.70348%
Thought Mechanics | 60736 | 2197 | 3.61729%
IMAO | 45822 | 2905 | 6.33975%
Dan Gillmor (old weblog) | 36369 | 2600 | 7.14895%
HINAGATA | 176519 | 2186 | 1.23839%
Dean’s World | 53150 | 2985 | 5.61618%
Defamer | 49132 | 2372 | 4.82781%
USS Clueless | 64725 | 2570 | 3.97065%
Dive into Mark | 54167 | 2540 | 4.68920%
Pandagon | 51286 | 2822 | 5.50248%
Blogging.la | 8495 | 3061 | 36.03296%
Why are you worshipping the ground I blog on? | 3481 | 2238 | 64.29187%
Daring Fireball | 52381 | 2573 | 4.91209%
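
For anyone who wants to check the last column: it appears to be simply the Technorati link count divided by the MSN link count, expressed as a percentage. Here is a minimal Python sketch of that calculation, using three rows copied from the table above. The function name `technorati_share` is my own label for illustration, not something from the original spreadsheet.

```python
# Rows copied from the table above: (blog, MSN links, Technorati links).
rows = [
    ("Boing Boing", 407172, 22532),
    ("Davenetics", 3334, 7571),
    ("m1net (MSN spaces site)", 22, 3548),
]


def technorati_share(msn_links, technorati_links):
    """Return Technorati links as a percentage of MSN links.

    Assumption: this mirrors the spreadsheet's Technorati/MSN column.
    """
    return 100.0 * technorati_links / msn_links


for blog, msn, technorati in rows:
    # Five decimals, to match the precision used in the table.
    print(f"{blog}: {technorati_share(msn, technorati):.5f}%")
```

A value under 100% means MSN reports more links than Technorati; a value over 100% (as with Davenetics or the MSN Spaces site) means Technorati reports more.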

Of course, no big surprise here. This is pretty consistent with what I found with Google and Yahoo!, showing that Technorati does a good but not complete job of indexing link-backs. What’s interesting, however, is that Technorati seems to follow a different pattern against MSN than it does against Yahoo! or Google. Let me show you what I’m talking about. Here is the pattern of the Technorati differential with MSN:

[Figure: Technorati vs. MSN]

... and here is the differential between Technorati and Yahoo!:

[Figure: Technorati vs. Yahoo]

... and finally, the same graph for Technorati and Google:

[Figure: Technorati vs. Google: Averages]

I’ve been trying to understand why this is and, to be honest, I still have no clear answer. Could be something, could be nothing. That uncertainty is, in large part, one of the things that made working on this entry frustrating.

Comparing the Search Engines

However, the picture gets more interesting when you get the three search engines side by side. Here’s a quick spreadsheet of the results:

Technorati Top 100 | Google Links | Yahoo Links | MSN Links | MSN Links/Google Links | MSN Links/Yahoo Links
Boing Boing | 45200 | 1880000 | 407172 | 900.8230% | 21.6581%
InstaPundit | 75000 | 2160000 | 241472 | 321.9627% | 11.1793%
Daily Kos | 59800 | 1690000 | 184666 | 308.8060% | 10.9270%
Gizmodo | 39300 | 1970000 | 252869 | 643.4326% | 12.8360%
Fark | 43600 | 1420000 | 352289 | 808.0023% | 24.8091%
EnGadget | 46800 | 2820000 | 198584 | 424.3248% | 7.0420%
Davenetics | 1780 | 66400 | 3334 | 187.3034% | 5.0211%
Eschaton | 62400 | 1400000 | 138241 | 221.5401% | 9.8744%
Dooce | 23600 | 653000 | 118385 | 501.6314% | 18.1294%
Andrew Sullivan | 41100 | 1260000 | 96315 | 234.3431% | 7.6440%
The Best Page In The Universe | 656 | 62000 | 92232 | 14059.7561% | 148.7613%
Talking Points Memo: by Joshua Micah Marshall | 74600 | 563000 | 193438 | 259.3003% | 34.3584%
lgf: anti-idiotarian | 14700 | 49300 | 6067 | 41.2721% | 12.3063%
kottke.org | 32000 | 1200000 | 159861 | 499.5656% | 13.3218%
WIL WHEATON DOT NET | 16900 | 564000 | 148587 | 879.2130% | 26.3452%
Metafilter | 34500 | 1160000 | 136052 | 394.3536% | 11.7286%
Doc Searls | 33600 | 1150000 | 95781 | 285.0625% | 8.3288%
(In)formacao e (In)utilidade | 1780 | 110000 | 3272 | 183.8202% | 2.9745%
Wonkette | 28800 | 1370000 | 96768 | 336.0000% | 7.0634%
Scripting News | 39400 | 1470000 | 183067 | 464.6371% | 12.4535%
Power Line | 7510 | 344000 | 92069 | 1225.9521% | 26.7642%
Balmasque | 24 | 40500 | 409 | 1704.1667% | 1.0099%
Corante | 6770 | 265000 | 23107 | 341.3146% | 8.7196%
A list Apart | 21100 | 620000 | 220584 | 1045.4218% | 35.5781%
Something Awful | 9020 | 372000 | 97908 | 1085.4545% | 26.3194%
Megatokyo | 7310 | 361000 | 112902 | 1544.4870% | 31.2748%
Michelle Malkin | 17300 | 537000 | 72190 | 417.2832% | 13.4432%
Arts and Letters Daily | 23900 | 866000 | 94718 | 396.3096% | 10.9374%
Gawker | 23500 | 1060000 | 72773 | 309.6723% | 6.8654%
Afterall it was the best I ever had | 95 | 34900 | 922 | 970.5263% | 2.6418%
The Volokh Conspiracy | 42000 | 1190000 | 88818 | 211.4714% | 7.4637%
Scobleizer | 21800 | 937000 | 68282 | 313.2202% | 7.2873%
Jeffrey Zeldman | 22500 | 528000 | 149539 | 664.6178% | 28.3218%
This Modern World | 32100 | 813000 | 79038 | 246.2243% | 9.7218%
The Web Standards Project | 1850 | 59800 | 211917 | 11454.9730% | 354.3763%
Joel on Software | 22400 | 966000 | 133853 | 597.5580% | 13.8564%
Media Matters for America | 24800 | 536000 | 64867 | 261.5605% | 12.1021%
Television without pity | 13300 | 356000 | 46391 | 348.8045% | 13.0312%
Kuro5hin | 17300 | 866000 | 130549 | 754.6185% | 15.0749%
Lileks | n/a | 39700 | 50706 | n/a | 127.7229%
Hugh Hewitt | 26700 | 929000 | 64118 | 240.1423% | 6.9018%
Joel Veitch | 2830 | 135000 | 23302 | 823.3922% | 17.2607%
Truthout | 8780 | 371000 | 42693 | 486.2528% | 11.5075%
Baghdad Burning | 22700 | 552000 | 51647 | 227.5198% | 9.3563%
Buzz machine | 30600 | 1010000 | 72649 | 237.4150% | 7.1930%
fleugel | 1890 | 201000 | 201995 | 10687.5661% | 100.4950%
Informed Comment | 27900 | 787000 | 62822 | 225.1685% | 7.9825%
Doppler: redefining podcasting | 4420 | 607000 | 12512 | 283.0769% | 2.0613%
geek and proud | 355 | 9110 | 714 | 201.1268% | 7.8375%
loadmemory (Asian site) | 83 | 1550 | 198 | 238.5542% | 12.7742%
Photojunkie | 1540 | 51200 | 3721 | 241.6234% | 7.2676%
Ross Rader | 1070 | 48200 | 4830 | 451.4019% | 10.0207%
The Truth Laid Bear | 23900 | 717000 | 51806 | 216.7615% | 7.2254%
Joi Ito | 23400 | 1050000 | 62642 | 267.7009% | 5.9659%
ScrappleFace | 31100 | 807000 | 49953 | 160.6206% | 6.1900%
LexText | 1970 | 31200 | 1741 | 88.3756% | 5.5801%
Google Blog | 46 | 297000 | 42967 | 93406.5217% | 14.4670%
Xbox | 6600 | 237000 | 86021 | 1303.3485% | 36.2958%
My life in a Bush of Ghosts | 6 | 903 | 12 | 200.0000% | 1.3289%
Astronomy picture of the day | 5020 | 113000 | 33625 | 669.8207% | 29.7566%
Crooked Timber | 3560 | 67500 | 60675 | 1704.3539% | 89.8889%
Vodka Pundit | 4520 | 169000 | 58205 | 1287.7212% | 34.4408%
Captain’s quarter | 27100 | 730000 | 45609 | 168.2989% | 6.2478%
A small victory | 16700 | 460000 | 54767 | 327.9461% | 11.9059%
Gato Fedorento | 1630 | 126000 | 2294 | 140.7362% | 1.8206%
Mezzoblue | 12000 | 278000 | 99511 | 829.2583% | 35.7953%
PostSecret | 5790 | 202000 | 30794 | 531.8480% | 15.2446%
Samizdata.net | 1050 | 18000 | 1712 | 163.0476% | 9.5111%
Lawrence Lessig | 30600 | 959000 | 81047 | 264.8595% | 8.4512%
Counterpunch | 11700 | 295000 | 52642 | 449.9316% | 17.8447%
Democratic Underground | 14900 | 417000 | 35595 | 238.8926% | 8.5360%
Right Wing News | 27900 | 794000 | 61379 | 219.9964% | 7.7304%
StopDesign | 10200 | 255000 | 86165 | 844.7549% | 33.7902%
iBiblio | 9730 | 197000 | 32301 | 331.9733% | 16.3964%
Samizdata.net (mistake?) | 25500 | 697000 | 61443 | 240.9529% | 8.8154%
Abrupto | 550 | 44700 | 2698 | 490.5455% | 6.0358%
gene7299 (Asian MSNSpaces site) | 58 | 764 | 28 | 48.2759% | 3.6649%
Where is Raed? | 10100 | 232000 | 24848 | 246.0198% | 10.7103%
B3TA: We love the web | 12000 | 839000 | 38386 | 319.8833% | 4.5752%
Talkleft | 7170 | 221000 | 60169 | 839.1771% | 27.2258%
Wizbang | 21000 | 634000 | 60259 | 286.9476% | 9.5046%
m1net (MSN spaces site) | 104 | 579 | 22 | 21.1538% | 3.7997%
Hoder | 1480 | 20900 | 1620 | 109.4595% | 7.7512%
CTRL+Alt+Del | 2310 | 171000 | 32277 | 1397.2727% | 18.8754%
Brad DeLong | 30100 | 882000 | 48403 | 160.8073% | 5.4879%
Blogs for Bush | 16200 | 824000 | 50820 | 313.7037% | 6.1675%
Neil Gaiman | 13700 | 319000 | 71916 | 524.9343% | 22.5442%
Gothamist | 15200 | 491000 | 47848 | 314.7895% | 9.7450%
Thought Mechanics | 4400 | 190000 | 60736 | 1380.3636% | 31.9663%
IMAO | 23800 | 407000 | 45822 | 192.5294% | 11.2585%
Dan Gillmor (old weblog) | 10800 | 298000 | 36369 | 336.7500% | 12.2044%
HINAGATA | 10100 | 21100 | 176519 | 1747.7129% | 836.5829%
Dean’s World | 30600 | 784000 | 53150 | 173.6928% | 6.7793%
Defamer | 9310 | 725000 | 49132 | 527.7336% | 6.7768%
USS Clueless | 8470 | 264000 | 64725 | 764.1677% | 24.5170%
Dive into Mark | 14600 | 235000 | 54167 | 371.0068% | 23.0498%
Pandagon | 27300 | 743000 | 51286 | 187.8608% | 6.9026%
Blogging.la | 3200 | 67700 | 8495 | 265.4688% | 12.5480%
Why are you worshipping the ground I blog on? | 1430 | 85000 | 3481 | 243.4266% | 4.0953%
Daring Fireball | 12000 | 221000 | 52381 | 436.5083% | 23.7018%

The most interesting thing here is that MSN seems to support the assertion I made earlier that Google does not report as many links as Yahoo! does. The same seems to be true of Google compared with MSN: for most of these blogs, MSN reports more links than Google. There were, however, a few surprises here, as far as I’m concerned:

  • Sites located in the United States seem to fare better on MSN than other sites. Google and Yahoo! seem to have a stronger indexing presence outside the US than MSN does.
  • MSN Spaces sites are not getting particularly great representation in MSN search compared to its competitors. I was surprised by this, since they are part of the same service.
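
To make the MSN Spaces point concrete, here is a small Python sketch that recomputes the two ratio columns for four rows copied from the comparison table and lists the blogs for which MSN reports fewer links than Google. The helper names `msn_ratios` and `below_google` are hypothetical, chosen only for this illustration, and four rows are of course not a substitute for the full table.

```python
# Rows copied from the comparison table: (blog, Google, Yahoo, MSN).
rows = [
    ("Boing Boing", 45200, 1880000, 407172),
    ("InstaPundit", 75000, 2160000, 241472),
    ("gene7299 (Asian MSNSpaces site)", 58, 764, 28),
    ("m1net (MSN spaces site)", 104, 579, 22),
]


def msn_ratios(google, yahoo, msn):
    """Return MSN links as a percentage of Google and of Yahoo links."""
    return 100.0 * msn / google, 100.0 * msn / yahoo


def below_google(table):
    """Names of blogs for which MSN reports fewer links than Google."""
    return [name for name, google, _yahoo, msn in table if msn < google]


for name, google, yahoo, msn in rows:
    vs_google, vs_yahoo = msn_ratios(google, yahoo, msn)
    print(f"{name}: {vs_google:.4f}% of Google, {vs_yahoo:.4f}% of Yahoo")

# In this sample, only the two MSN Spaces rows fall below Google.
print(below_google(rows))
```

In this sample the two MSN Spaces blogs are the only ones where MSN trails Google, which is what prompted the second bullet above.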

Conclusions and more!

So there you have it: no great insight here, apart from the fact that this linking stuff is interesting and that even small-scale analysis can bring up some interesting trends. As I mentioned before, I am not an expert on this; I just thought I would put the numbers together and start an analysis. Enjoy!
