Links and Search Engines: The MSN edition

I’ve been promis­ing for a while to com­plete this series with results relat­ing to MSN (and, for the record, this has noth­ing to do with Scoble beg­ging for it). I finally got around to clean­ing up the HTML out­put of Excel and can now present the third (and prob­a­bly final) install­ment in my analy­sis of search engine link features.

To recap, I took the list of Top 100 blogs listed by Technorati on May 19th, 2005 and started doing side-by-side comparisons. I first looked at the distribution of links among the top 100, then followed up with an analysis of Technorati against Google. That led to a subsequent chapter on Technorati against Google and Yahoo! (and then comparing Google and Yahoo! to each other). All this created a fair amount of buzz in the search world, with some people saying it was interesting and others saying I was way off the mark. Either way, it’s time to take a look at MSN in order to complete this round-up.

So, to create some benchmarks, let’s start by looking at the distribution of Technorati links against MSN’s:

Blog (Technorati Top 100)  MSN Links  Technorati Links  Technorati Links as % of MSN Links
Boing Boing 407172 22532 5.53378%
InstaPun­dit 241472 15190 6.29058%
Daily Kos 184666 15833 8.57386%
Giz­modo 252869 12278 4.85548%
Fark 352289 10216 2.89989%
EnGad­get 198584 15051 7.57916%
Dav­e­net­ics 3334 7571 227.08458%
Escha­ton 138241 8713 6.30276%
Dooce 118385 6797 5.74144%
Andrew Sul­li­van 96315 7680 7.97384%
The Best Page In The Universe 92232 6333 6.86638%
Talk­ing Points Memo: by Joshua Micah Marshall 193438 7592 3.92477%
lgf: anti-idiotarian 6067 8275 136.39360%
kottke.org 159861 7278 4.55271%
WIL WHEATON DOT NET 148587 6314 4.24936%
Metafil­ter 136052 7591 5.57948%
Doc Searls 95781 5690 5.94064%
(In)formacao e (In)utilidade 3272 6040 184.59658%
Won­kette 96768 5877 6.07329%
Script­ing News 183067 5728 3.12891%
Power Line 92069 7477 8.12108%
Bal­masque 409 4544 1111.00244%
Corante 23107 7686 33.26265%
A list Apart 220584 5536 2.50970%
Some­thing Awful 97908 4512 4.60841%
Mega­tokyo 112902 4154 3.67930%
Michelle Malkin 72190 6091 8.43746%
Arts and Let­ters Daily 94718 3983 4.20511%
Gawker 72773 4453 6.11903%
After­all it was the best I ever had 922 3591 389.47939%
The Volokh Conspiracy 88818 5873 6.61240%
Sco­belizer 68282 5524 8.08998%
Jef­frey Zeldman 149539 4134 2.76450%
This Mod­ern World 79038 3913 4.95078%
The Web Stan­dards Project 211917 3810 1.79787%
Joel on Software 133853 4514 3.37236%
Media Mat­ters for America 64867 6809 10.49686%
Tele­vi­sion with­out pity 46391 3859 8.31842%
Kuro5hin 130549 4208 3.22331%
Lileks 50706 3824 7.54151%
Hugh Hewitt 64118 4573 7.13216%
Joel Veitch 23302 3774 16.19603%
Truthout 42693 6528 15.29056%
Bagh­dad Burning 51647 3519 6.81356%
Buzz machine 72649 4145 5.70552%
fleugel 201995 3670 1.81688%
Informed Com­ment 62822 3905 6.21598%
Doppler: redefin­ing podcasting 12512 3040 24.29668%
geek and proud 714 3166 443.41737%
load­mem­ory (Asian site) 198 3324 1678.78788%
Pho­to­junkie 3721 2860 76.86106%
Ross Rader 4830 2976 61.61491%
The Truth Laid Bear 51806 4127 7.96626%
Joi Ito 62642 5165 8.24527%
Scrap­ple­Face 49953 3480 6.96655%
Lex­Text 1741 2671 153.41758%
Google Blog 42967 3688 8.58333%
Xbox 86021 4221 4.90694%
My life in a Bush of Ghosts 12 2519 20991.66667%
Astron­omy pic­ture of the day 33625 3498 10.40297%
Crooked Tim­ber 60675 3617 5.96127%
Vodka Pun­dit 58205 3085 5.30023%
Captain’s Quarters 45609 3671 8.04885%
A small victory 54767 3223 5.88493%
Gato Fedorento 2294 2574 112.20575%
Mez­zoblue 99511 2952 2.96651%
Post­Se­cret 30794 2707 8.79067%
Samizdata.net 1712 2872 167.75701%
Lawrence Lessig 81047 2949 3.63863%
Coun­ter­punch 52642 3278 6.22697%
Democratic Underground 35595 3913 10.99312%
Right Wing News 61379 2967 4.83390%
StopDe­sign 86165 3037 3.52463%
iBib­lio 32301 3105 9.61271%
Samizdata.net (mis­take?) 61443 2743 4.46430%
Abrupto 2698 2935 108.78428%
gene7299 (Asian MSNSpaces site) 28 3215 11482.14286%
Where is Raed? 24848 2409 9.69495%
B3TA: We love the web 38386 2614 6.80977%
Talk­left 60169 2901 4.82142%
Wiz­bang 60259 3358 5.57261%
m1net (MSN spaces site) 22 3548 16127.27273%
Hoder 1620 5422 334.69136%
CTRL+Alt+Del 32277 2315 7.17229%
Brad DeLong 48403 2715 5.60916%
Blogs for Bush 50820 3560 7.00512%
Neil Gaiman 71916 2194 3.05078%
Gothamist 47848 2729 5.70348%
Thought Mechan­ics 60736 2197 3.61729%
IMAO 45822 2905 6.33975%
Dan Gill­mor (old weblog) 36369 2600 7.14895%
HINAGATA 176519 2186 1.23839%
Dean’s World 53150 2985 5.61618%
Defamer 49132 2372 4.82781%
USS Clue­less 64725 2570 3.97065%
Dive into Mark 54167 2540 4.68920%
Pandagon 51286 2822 5.50248%
Blogging.la 8495 3061 36.03296%
Why are you wor­ship­ping the ground I blog on? 3481 2238 64.29187%
Dar­ing Fireball 52381 2573 4.91209%
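For anyone who wants to check the last column, here is a minimal sketch in Python. It is not what produced the numbers above (those came out of Excel); it only assumes that the last column is Technorati links divided by MSN links, expressed as a percentage. The function name `technorati_share` and the three-row sample are made up for illustration.

```python
# A few rows copied from the table above: blog -> (MSN links, Technorati links).
ROWS = {
    "Boing Boing": (407172, 22532),
    "Fark": (352289, 10216),
    "Davenetics": (3334, 7571),
}


def technorati_share(msn_links: int, technorati_links: int) -> float:
    """Technorati links as a percentage of MSN links (hypothetical helper)."""
    return 100.0 * technorati_links / msn_links


for blog, (msn, technorati) in ROWS.items():
    share = technorati_share(msn, technorati)
    # A value above 100% means Technorati reports more links than MSN does.
    marker = "Technorati > MSN" if share > 100 else "MSN > Technorati"
    print(f"{blog}: {share:.5f}% ({marker})")
```

A value above 100% in that column simply means Technorati found more links to the blog than MSN did, which is the case for Davenetics in this sample.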

Of course, no big surprise here. This seems to be pretty consistent with what I had found in dealing with Google and Yahoo!, showing that Technorati does a good but not complete job at indexing link-backs. What’s interesting, however, is that Technorati seems to have a different pattern when dealing with MSN than it does with Yahoo! or Google. Let me show you what I’m talking about. Following is the pattern of the Technorati differential with MSN:
[Chart: Technorati vs. MSN]
... and now the differential between Technorati and Yahoo!:
[Chart: Technorati vs. Yahoo]
... and finally the same graph for Technorati and Google:
[Chart: Technorati vs. Google]

I’ve been trying to understand why this is and, to be fully honest, still have no clear answer. It could be something or it could be nothing. I’m not sure at this point, and that is, in large part, one of the things that made working on this entry frustrating.

Com­par­ing the Search Engines

However, the picture gets more interesting when you put the three search engines side by side. Here’s a quick spreadsheet of the results:

Blog (Technorati Top 100)  Google Links  Yahoo Links  MSN Links  MSN Links as % of Google Links  MSN Links as % of Yahoo Links
Boing Boing 45200 1880000 407172 900.8230% 21.6581%
InstaPun­dit 75000 2160000 241472 321.9627% 11.1793%
Daily Kos 59800 1690000 184666 308.8060% 10.9270%
Giz­modo 39300 1970000 252869 643.4326% 12.8360%
Fark 43600 1420000 352289 808.0023% 24.8091%
EnGad­get 46800 2820000 198584 424.3248% 7.0420%
Dav­e­net­ics 1780 66400 3334 187.3034% 5.0211%
Escha­ton 62400 1400000 138241 221.5401% 9.8744%
Dooce 23600 653000 118385 501.6314% 18.1294%
Andrew Sul­li­van 41100 1260000 96315 234.3431% 7.6440%
The Best Page In The Universe 656 62000 92232 14059.7561% 148.7613%
Talk­ing Points Memo: by Joshua Micah Marshall 74600 563000 193438 259.3003% 34.3584%
lgf: anti-idiotarian 14700 49300 6067 41.2721% 12.3063%
kottke.org 32000 1200000 159861 499.5656% 13.3218%
WIL WHEATON DOT NET 16900 564000 148587 879.2130% 26.3452%
Metafil­ter 34500 1160000 136052 394.3536% 11.7286%
Doc Searls 33600 1150000 95781 285.0625% 8.3288%
(In)formacao e (In)utilidade 1780 110000 3272 183.8202% 2.9745%
Won­kette 28800 1370000 96768 336.0000% 7.0634%
Script­ing News 39400 1470000 183067 464.6371% 12.4535%
Power Line 7510 344000 92069 1225.9521% 26.7642%
Bal­masque 24 40500 409 1704.1667% 1.0099%
Corante 6770 265000 23107 341.3146% 8.7196%
A list Apart 21100 620000 220584 1045.4218% 35.5781%
Some­thing Awful 9020 372000 97908 1085.4545% 26.3194%
Mega­tokyo 7310 361000 112902 1544.4870% 31.2748%
Michelle Malkin 17300 537000 72190 417.2832% 13.4432%
Arts and Let­ters Daily 23900 866000 94718 396.3096% 10.9374%
Gawker 23500 1060000 72773 309.6723% 6.8654%
After­all it was the best I ever had 95 34900 922 970.5263% 2.6418%
The Volokh Conspiracy 42000 1190000 88818 211.4714% 7.4637%
Sco­belizer 21800 937000 68282 313.2202% 7.2873%
Jef­frey Zeldman 22500 528000 149539 664.6178% 28.3218%
This Mod­ern World 32100 813000 79038 246.2243% 9.7218%
The Web Stan­dards Project 1850 59800 211917 11454.9730% 354.3763%
Joel on Software 22400 966000 133853 597.5580% 13.8564%
Media Mat­ters for America 24800 536000 64867 261.5605% 12.1021%
Tele­vi­sion with­out pity 13300 356000 46391 348.8045% 13.0312%
Kuro5hin 17300 866000 130549 754.6185% 15.0749%
Lileks   39700 50706   127.7229%
Hugh Hewitt 26700 929000 64118 240.1423% 6.9018%
Joel Veitch 2830 135000 23302 823.3922% 17.2607%
Truthout 8780 371000 42693 486.2528% 11.5075%
Bagh­dad Burning 22700 552000 51647 227.5198% 9.3563%
Buzz machine 30600 1010000 72649 237.4150% 7.1930%
fleugel 1890 201000 201995 10687.5661% 100.4950%
Informed Com­ment 27900 787000 62822 225.1685% 7.9825%
Doppler: redefin­ing podcasting 4420 607000 12512 283.0769% 2.0613%
geek and proud 355 9110 714 201.1268% 7.8375%
load­mem­ory (Asian site) 83 1550 198 238.5542% 12.7742%
Pho­to­junkie 1540 51200 3721 241.6234% 7.2676%
Ross Rader 1070 48200 4830 451.4019% 10.0207%
The Truth Laid Bear 23900 717000 51806 216.7615% 7.2254%
Joi Ito 23400 1050000 62642 267.7009% 5.9659%
Scrap­ple­Face 31100 807000 49953 160.6206% 6.1900%
Lex­Text 1970 31200 1741 88.3756% 5.5801%
Google Blog 46 297000 42967 93406.5217% 14.4670%
Xbox 6600 237000 86021 1303.3485% 36.2958%
My life in a Bush of Ghosts 6 903 12 200.0000% 1.3289%
Astron­omy pic­ture of the day 5020 113000 33625 669.8207% 29.7566%
Crooked Tim­ber 3560 67500 60675 1704.3539% 89.8889%
Vodka Pun­dit 4520 169000 58205 1287.7212% 34.4408%
Captain’s Quarters 27100 730000 45609 168.2989% 6.2478%
A small victory 16700 460000 54767 327.9461% 11.9059%
Gato Fedorento 1630 126000 2294 140.7362% 1.8206%
Mez­zoblue 12000 278000 99511 829.2583% 35.7953%
Post­Se­cret 5790 202000 30794 531.8480% 15.2446%
Samizdata.net 1050 18000 1712 163.0476% 9.5111%
Lawrence Lessig 30600 959000 81047 264.8595% 8.4512%
Coun­ter­punch 11700 295000 52642 449.9316% 17.8447%
Democratic Underground 14900 417000 35595 238.8926% 8.5360%
Right Wing News 27900 794000 61379 219.9964% 7.7304%
StopDe­sign 10200 255000 86165 844.7549% 33.7902%
iBib­lio 9730 197000 32301 331.9733% 16.3964%
Samizdata.net (mis­take?) 25500 697000 61443 240.9529% 8.8154%
Abrupto 550 44700 2698 490.5455% 6.0358%
gene7299 (Asian MSNSpaces site) 58 764 28 48.2759% 3.6649%
Where is Raed? 10100 232000 24848 246.0198% 10.7103%
B3TA: We love the web 12000 839000 38386 319.8833% 4.5752%
Talk­left 7170 221000 60169 839.1771% 27.2258%
Wiz­bang 21000 634000 60259 286.9476% 9.5046%
m1net (MSN spaces site) 104 579 22 21.1538% 3.7997%
Hoder 1480 20900 1620 109.4595% 7.7512%
CTRL+Alt+Del 2310 171000 32277 1397.2727% 18.8754%
Brad DeLong 30100 882000 48403 160.8073% 5.4879%
Blogs for Bush 16200 824000 50820 313.7037% 6.1675%
Neil Gaiman 13700 319000 71916 524.9343% 22.5442%
Gothamist 15200 491000 47848 314.7895% 9.7450%
Thought Mechan­ics 4400 190000 60736 1380.3636% 31.9663%
IMAO 23800 407000 45822 192.5294% 11.2585%
Dan Gill­mor (old weblog) 10800 298000 36369 336.7500% 12.2044%
HINAGATA 10100 21100 176519 1747.7129% 836.5829%
Dean’s World 30600 784000 53150 173.6928% 6.7793%
Defamer 9310 725000 49132 527.7336% 6.7768%
USS Clue­less 8470 264000 64725 764.1677% 24.5170%
Dive into Mark 14600 235000 54167 371.0068% 23.0498%
Pandagon 27300 743000 51286 187.8608% 6.9026%
Blogging.la 3200 67700 8495 265.4688% 12.5480%
Why are you wor­ship­ping the ground I blog on? 1430 85000 3481 243.4266% 4.0953%
Dar­ing Fireball 12000 221000 52381 436.5083% 23.7018%
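The two ratio columns can be recomputed the same way. The sketch below is only an illustration in Python, assuming each ratio is MSN links divided by the other engine's links, times 100. The `LinkCounts` type, the helper names and the choice of three sample rows are my own and are not part of the original spreadsheet.

```python
from typing import NamedTuple


class LinkCounts(NamedTuple):
    google: int
    yahoo: int
    msn: int


# Three rows copied from the table above, picked only as examples.
SAMPLE = {
    "Boing Boing": LinkCounts(google=45200, yahoo=1880000, msn=407172),
    "lgf: anti-idiotarian": LinkCounts(google=14700, yahoo=49300, msn=6067),
    "HINAGATA": LinkCounts(google=10100, yahoo=21100, msn=176519),
}


def percent(numerator: int, denominator: int) -> float:
    """Express numerator as a percentage of denominator."""
    return 100.0 * numerator / denominator


def engines_by_count(counts: LinkCounts) -> list[str]:
    """Engine names ordered from most to fewest reported links."""
    return sorted(counts._fields, key=lambda name: getattr(counts, name), reverse=True)


for blog, counts in SAMPLE.items():
    print(
        f"{blog}: MSN/Google {percent(counts.msn, counts.google):.4f}%, "
        f"MSN/Yahoo {percent(counts.msn, counts.yahoo):.4f}%, "
        f"order: {' > '.join(engines_by_count(counts))}"
    )
```

In this small sample, Boing Boing follows the Yahoo, MSN, Google ordering, while lgf and HINAGATA each break it in a different direction.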

The most interesting thing here is that MSN seems to support the assertion I had made about Google not providing as many links as Yahoo! does: the same seems to be true between MSN and Google. There were, however, a few surprises here, as far as I’m concerned:

Con­clu­sions and more!

So there you have it: no great insight here, apart from the fact that this linking stuff is interesting and that even small-scale analysis can bring up some interesting trends. As I mentioned before, I am not an expert on this; I just thought I’d put the numbers together and start an analysis. However, I know that this series has attracted experts, so here’s a deal: I’m making the spreadsheet of data I compiled available under a Creative Commons License (By Attribution, Share Alike) here on TNL.net. If you manage to do anything interesting with it, drop me a note and please make sure that you share it with the wider public. Enjoy!

Originally published on July 30, 2005 in Business, Technology.