<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss'><id>tag:blogger.com,1999:blog-12987422</id><updated>2009-12-18T08:24:03.490-08:00</updated><title type='text'>Piccolblog</title><subtitle type='html'>Antonio's highlights and notes</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://blog.piccolboni.info/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12987422/posts/default/-/technology'/><link rel='alternate' type='text/html' href='http://blog.piccolboni.info/search/label/technology'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Antonio Piccolboni</name><uri>http://www.blogger.com/profile/18181181557046696245</uri><email>noreply@blogger.com</email></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>3</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-12987422.post-2204670956275188003</id><published>2008-08-17T12:15:00.000-07:00</published><updated>2008-08-18T14:33:29.302-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='technology'/><category scheme='http://www.blogger.com/atom/ns#' term='computing'/><title type='text'>Google returns more AdSense rich results than Live: conflict of interest?</title><content type='html'>Several search pundits have commented on conflicts of interest Google is potentially exposed to.  
Google itself &lt;a href="http://www.techcrunch.com/2008/04/02/google-to-sell-performics/"&gt;admitted&lt;/a&gt; to not wanting to hold onto Performics, the search engine marketing division of their recent purchase DoubleClick, and in fact they quickly sold it. But that's only one of many conflicts of interest Google has to deal with. John Battelle &lt;a href="http://battellemedia.com/archives/004574.php"&gt;observes&lt;/a&gt; in his searchblog that, in a comparison between Google Ad Planner and Comscore data, sites that use AdSense, Google's advertising platform, get better traffic numbers in Ad Planner. If you use the free Ad Planner to plan an ad campaign, your money is more likely to flow to Google's coffers than if you were using Comscore. Google provided a statement denying any deliberate bias, but it's hard to find alternative explanations. A similar conflict of interest arises for Quantcast and their Media Planner and rating services with respect to being fair to publishers that don't use their tag (they are not &lt;i&gt;quantified&lt;/i&gt; in their jargon). They provide ratings and demographic analyses for both kinds of publishers, but get data directly only from quantified ones. Since I am a former employee I will abstain from a more detailed analysis in this case.&lt;br /&gt;&lt;br /&gt;Another conflict of interest is between Google the search company and Google the media company: will Google search give prominence to Google properties such as Picasa, Knol, Blogger and YouTube over the rest of the net? Timo Paloheimo is afraid that could be the case, so he created a &lt;a href="http://www.startupbin.com/2008/08/11/google-minus-google/"&gt;custom search engine&lt;/a&gt; that excludes such properties, a draconian solution in my opinion but a clear expression of the malaise generated by Google's dominant position in search and the multiple conflicts of interest with other Google products. 
&lt;br /&gt;&lt;br /&gt;But this is only the tip of the iceberg IMHO: the main conflict of interest is between search and advertisement. When companies like Answers.com report that their traffic is down 28% because of a change in their Google ranking, Om Malik rightfully &lt;a href="http://gigaom.com/2007/08/02/answerscom-raises-questions-about-googles-power/"&gt;comments&lt;/a&gt; that this very fact &lt;i&gt;"raises questions about Google’s power"&lt;/i&gt;.&lt;br /&gt;Google has an interest in returning the most relevant and useful results for each search; it also has an interest in directing traffic to sites that use AdSense, because they provide a big part of Google's revenue. How do they balance the two? Google's answer is that the two businesses are run as independently as possible (I forgot the source here, but I am almost sure it was a Google spokesperson). But is that a sufficient answer? Or should we trust Microsoft not to tweak their Live Search? I think I have &lt;a href="http://spreadsheets.google.com/pub?key=pVGoEOC1mxMvU79Ok8b0-AA"&gt;&lt;b&gt;strong evidence&lt;/b&gt;&lt;/a&gt; that Google search results are more likely to carry AdSense ads than Microsoft's live.com results.&lt;br /&gt;&lt;span class="fullpost"&gt;&lt;br /&gt;The methodology is as follows:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Sample random search keywords&lt;/li&gt;&lt;li&gt;Use each of them as a query on google.com and live.com&lt;/li&gt;&lt;li&gt;Check which of the results on the first page use AdSense&lt;/li&gt;&lt;/ol&gt;For each keyword and each engine, I considered the ratio between the number of search matches using AdSense and the total number of accessible search results (normally 10, but sometimes some could not be accessed within a reasonable time). Out of 632 searches, Google-provided matches were more AdSense rich than live.com ones 316 times, the opposite was true 126 times, and there were 190 ties. Pretty remarkable, isn't it? 
If your intuition does not convince you, or you are statistically inclined, a Wilcoxon signed-rank test rejects the hypothesis that the two search engines' results are equally AdSense rich, with a p-value smaller than 2.2e-16. &lt;br /&gt;&lt;br /&gt;Correlation is not causation. It could be that either company is tweaking their engine to favor or penalize AdSense; or it could be some other mechanism I haven't figured out yet. But consider this hypothetical scenario: Google doesn't feel it needs to compete so hard in search anymore, wants to boost revenue, and decides to give a small but competitively significant rank boost to AdSense-running sites, which in turn outcompete their non-AdSense-running competitors. We really don't want this to happen. We just recovered from another stranglehold on the computing world.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12987422-2204670956275188003?l=blog.piccolboni.info' alt='' /&gt;&lt;/div&gt;</content><link rel='related' href='http://spreadsheets.google.com/pub?key=pVGoEOC1mxMvU79Ok8b0-AA' title='Google returns more AdSense rich results than Live: conflict of interest?'/><link rel='replies' type='application/atom+xml' href='http://blog.piccolboni.info/feeds/2204670956275188003/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.piccolboni.info/2008/08/google-returns-more-adsense-rich.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12987422/posts/default/2204670956275188003'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12987422/posts/default/2204670956275188003'/><link rel='alternate' type='text/html' href='http://blog.piccolboni.info/2008/08/google-returns-more-adsense-rich.html' title='Google returns more AdSense rich results than Live: conflict of interest?'/><author><name>Antonio 
Piccolboni</name><uri>http://www.blogger.com/profile/18181181557046696245</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='17670611985532990073'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12987422.post-3276089771649253998</id><published>2008-07-30T15:34:00.000-07:00</published><updated>2008-08-04T08:27:43.892-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='technology'/><category scheme='http://www.blogger.com/atom/ns#' term='computing'/><title type='text'>reCAPTCHA, a bogus motivation</title><content type='html'>The authors claim they use spare cycles for a good cause, but it isn't true.&lt;br /&gt;&lt;span class="fullpost"&gt;&lt;br /&gt;CAPTCHAs are little reading tests that make it possible to distinguish humans from machines when accessing a computer interface. A word is displayed in a slightly noisy form and the user is required to type it in, which is easy for humans but not for machines. The authors of reCAPTCHA wanted to associate a useful task with this test. They say &lt;/span&gt;&lt;br /&gt;&lt;blockquote&gt;... in aggregate these little puzzles consume more than 150,000 hours of work each day. What if we could make positive use of this human effort?&lt;/blockquote&gt;So instead of picking random words for the test, they pick word-sized fragments from large public digitization efforts (currently the one led by the Internet Archive) and submit them to the users. That is, reCAPTCHA words are words that are not yet in digital format, appear in a paper document of interest, and could not be read by machines. So by solving a reCAPTCHA you are helping a useful project. So where is the catch? Well, it took me a moment of thought but it's kind of obvious: if these are fragments of real digitization projects, by definition we don't know the solution. 
If we don't know it, we can't use them as CAPTCHAs! So I went back to the reCAPTCHA web site, and what they do is present two CAPTCHAs, one with a known but useless solution and the other with an unknown but useful one, in a random order. If you want to use a website, you have to complete both. Therefore there is no use of spare cycles whatsoever: the effort for the user has been doubled. If reCAPTCHA were the standard, there would be another 150,000 hours a day spent solving the second word. Fine by me in exchange for a free service, but not as efficient as advertised, not even close.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12987422-3276089771649253998?l=blog.piccolboni.info' alt='' /&gt;&lt;/div&gt;</content><link rel='related' href='http://recaptcha.net/learnmore.html' title='reCAPTCHA, a bogus motivation'/><link rel='replies' type='application/atom+xml' href='http://blog.piccolboni.info/feeds/3276089771649253998/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.piccolboni.info/2008/07/recaptcha-bogus-motivation.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12987422/posts/default/3276089771649253998'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12987422/posts/default/3276089771649253998'/><link rel='alternate' type='text/html' href='http://blog.piccolboni.info/2008/07/recaptcha-bogus-motivation.html' title='reCAPTCHA, a bogus motivation'/><author><name>Antonio Piccolboni</name><uri>http://www.blogger.com/profile/18181181557046696245</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='17670611985532990073'/></author><thr:total 
xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12987422.post-5641416747338185821</id><published>2008-06-08T23:47:00.000-07:00</published><updated>2008-06-27T00:30:17.456-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='technology'/><category scheme='http://www.blogger.com/atom/ns#' term='science'/><title type='text'>Map of Science</title><content type='html'>Fascinating map of the relation between scientific disciplines. Not all of the methodology is clear, but it is fun to look at. Notice the lack of ties between CSE and Biology.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://mapofscience.com/"&gt;Map of Science&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="fullpost"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12987422-5641416747338185821?l=blog.piccolboni.info' alt='' /&gt;&lt;/div&gt;</content><link rel='related' href='http://mapofscience.com/' title='Map of Science'/><link rel='replies' type='application/atom+xml' href='http://blog.piccolboni.info/feeds/5641416747338185821/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.piccolboni.info/2008/06/map-of-science.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12987422/posts/default/5641416747338185821'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12987422/posts/default/5641416747338185821'/><link rel='alternate' type='text/html' href='http://blog.piccolboni.info/2008/06/map-of-science.html' title='Map of Science'/><author><name>Antonio Piccolboni</name><uri>http://www.blogger.com/profile/18181181557046696245</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' 
value='17670611985532990073'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry></feed>