Google vs. Bing: Spying, Cheating, Stealing, An Overview
If you have followed the news lately, you may have noticed articles on all major tech blogs about Bing stealing Google's search results. Lifehacker, Download Squad, Neowin and dozens of other blogs repeated what the original source over at Searchengineland claimed.
According to Danny Sullivan's article, Google set up a honeypot to lure Bing into a trap. Google manipulated its search engine to rank honeypot pages for 100 words for which neither Bing nor Google had previously found matches. In the second step, 20 Google engineers began to run test queries from their home computers, running Internet Explorer with Suggested Sites and the Bing toolbar enabled. The engineers were also asked to click on the first search result that came up on Google.
Some of the results started to appear in Bing 14 days after the experiment had started. Interestingly enough, only 9 out of the 100 searches produced the same result on Bing and Google.
As a result, Google assumed that Bing was copying Google Search results.
Harry Shum, Corporate VP at Bing, today replied at the Future of Search event:
We use over 1,000 different signals and features in our ranking algorithm. A small piece of that is clickstream data we get from some of our customers, who opt-in to sharing anonymous data as they navigate the web in order to help us improve the experience for all users.
To be clear, we learn from all of our customers. What we saw in today’s story was a spy-novelesque stunt to generate extreme outliers in tail query ranking. It was a creative tactic by a competitor, and we’ll take it as a back-handed compliment. But it doesn’t accurately portray how we use opt-in customer data as one of many inputs to help improve our user experience.
It is a fact that Bing uses data from its toolbar to improve its search results.
The question is: Did Bing copy Google's search results, or did it merely use the anonymous usage data from those 20 Google engineers (which included a search term and a URL they clicked on) to improve Bing's search results for that query?
There are too many open questions for the claims to be justified. For instance: Why were only 9% of the search results identical, and not a higher number or even all of them?
The honeypot alone is no proof that Bing is indeed copying search results from Google. The explanation that Microsoft is making use of user queries and actions seems more reasonable.
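To make the second explanation concrete, here is a minimal sketch of how (search term, clicked URL) pairs reported by a toolbar could end up producing a result for a term that has no other matches. It is an illustration only, not Bing's actual pipeline: the function names, the query string and the URLs are all made up.

```python
from collections import Counter, defaultdict

# Hypothetical clickstream events: (search term, URL the user clicked afterwards).
# Both the term and the URLs are invented for this sketch.
events = [
    ("nonsenseterm1", "http://example.com/honeypot"),
    ("nonsenseterm1", "http://example.com/honeypot"),
    ("nonsenseterm1", "http://example.com/elsewhere"),
]

def aggregate_clicks(events):
    """Count how often each URL was clicked for each search term."""
    counts = defaultdict(Counter)
    for query, url in events:
        counts[query][url] += 1
    return counts

def top_url(counts, query):
    """Return the most-clicked URL for a term, or None if the term was never seen."""
    per_query = counts.get(query)
    if not per_query:
        return None
    return per_query.most_common(1)[0][0]

counts = aggregate_clicks(events)
print(top_url(counts, "nonsenseterm1"))
print(top_url(counts, "never-searched"))
```

Note that nothing in this sketch reads a Google result page. It only sees what the user typed and where the user went next, which is the distinction the question above turns on.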
What's your take on this? Let me know in the comments.
Google lying
It is not just their search bar. Recently I was looking at MSN and the “recent searches” bar had two random articles, then the name of the online game I play there. I’m not paranoid about what Bing does, but I don’t appreciate them leeching my searches. It’s creepy when I’m just looking up news and find out Bing was programmed to watch me. I can see the joke now
“in soviet russia engine searches you”
dont mess with google they own every thing
Personally I prefer Google for search, maps, mail etc., because I feel at least for me they have the best user-interface, simple and effective. Bing claiming to be a “decision engine” is hilarious.
However, I agree with Martin, in that Bing is not “copying” Google. If Yahoo or even some random search engine had set up this “honey pot”, and had users consistently enter the rubbish search term, and click on a certain URL, bing based on what you entered in the search query box, followed by what you clicked on, would automatically give you the same result because this is what is of apparent interest to YOU.
Search term->URL->Click on URL implies that the URL is somehow related to the search term at least for YOU. It is not the same as Google(Search Term). If it was, then all users, not just 9/100 representing the users who clicked through to the URL from Google, would get the same result.
Way to go bing!
In any case, I would really not mind even if Bing actually did Google(Search Term), if all else failed; this is not an exam, folks, and everything on the internet is public domain and fair game in my book. Bing is trying to provide you the most relevant search results, period, like a concierge. It’s not about whose algorithm is more elegant or better, it’s about who helps you find what you’re looking for, even if they have to send out bloodhounds to do it.
Cry me a river, Google…I know if Google had thought of this first, they would have used it too. And these are the same guys who acquired DoubleClick which monitors all your internet activity…
Search Term -> Url -> click on URL
Wait… who’s that hiding under the first arrow?
More like:
Search Term -> Google returns results it worked hard to get -> Lazy user clicks on one of the top ten Google results
If Bing is “writing that down”, then it’s basically writing down Google’s hard work, not the user’s.
If Bing thinks it should do Google searches, fine, but it should say so. Not copy Google’s results and present it as its own achievement.
well, for me it is simple. i avoid anything microsoft.. bling doesn’t exist nor do microsofts tactics.. every computer i own or use at work either has the bling default changed to google or i make the homepage google. my linux doesnt mind one bit..
I think this is clever by Microsoft/Bing. Use every resource you can to provide a better search …
I’ll explain it instead of being aggressive:
Google invented “fake” search terms which served “fake” results, which don’t contain any of the search terms, nor are they linked to the search terms in any other way.
Google’s result page is the only connection between the fake search terms and the specific results.
If Bing served those results, it could only have done so because it got them from Google’s results. Microsoft themselves admitted that the Bing toolbar collects user queries and clicked results.
Thus, they effectively (via 3rd party) mined Google’s results. This is not something they’re supposed to do.
And the user query entered in the search form does not count? Bing could just as well have looked at the user query and the URL that the user clicked on, and come to the same conclusion regardless of the page. It could very well have happened on Ghacks or any other site, provided that Bing monitors not only user queries on Google but all queries. And now?
Martin, I wasn’t suggesting my example was exactly the same as Bing. In my example I used exclusively competitor’s results, to make a point – that no matter how you dilute it, “crowd-sourcing” your opponents, is not a valid tactic.
Bing has stated to Danny Sullivan (here http://searchengineland.com/bing-why-googles-wrong-in-its-accusations-63279) that they get signals from all major search sites. In other words, they take into account competitors’ results. Whatever weight they give them is not important. The point is it does shine through to some queries, and that is wrong.
As Danny Sullivan summed up in a comment:
“if Bing learns that a page is “relevant” to “users” simply because Google was incredibly smart in (1) finding the page on the web and (2) ranking it above all others, especially if the user entered an ambiguous term, did Bing really learn from the user behavior, or did the user behavior simple serve to funnel back to Bing what Google thinks is right for a query?”
Which is exactly why it is unethical to do what Bing does.
Bing could have validated what it received from users
Bing doesn’t just tie together websites by time proximity. Bing specifically identifies google (and other search engine) queries, and identifies entered search terms (which are highlighted in the toolbar, from what I read).
If they later tie those search terms to the sites a user visited immediately after, then they’re still copying Google by proxy, even if it’s via URI and not via HTML code. Had Google not been there, they wouldn’t have that result, would they? That’s what counts.
Assume I develop a toolbar that does exactly that. It watches your queries at Google, Yahoo and AlltheWeb, and then watches where you go.
I then launch a “search engine” based exclusively on these results, and market myself as “a new amazing search engine”. What would you call my product? I would call it a useful hack – an addon, perhaps, or a “utility”.
I would certainly not categorize it as its own search engine, and most definitely wouldn’t categorize it as a “competitor”.
Now, I’m sure that unlike my example, Microsoft has their own algorithms. I know cause I used their technology. The trouble is, their URI tracking is used to free-ride Google’s algorithms.
I think you are confusing a few issues here. I would not call 9 out of 100 results “exclusively”, for instance. I do not think that it is worthwhile to continue this discussion, however, since neither of us is able to prove anything.
Martin – you’re wrong. Google proved conclusively that they ARE.
The experiment was crafted in such a way, that the only way those results could arrive to Bing, is if they got them from Google’s results for that term search (via the Bing toolbar).
If you don’t understand that, re-read the original article.
Idk about you but a 9/100 is a fail in most tests.
If Bing was just copying Google they would have gotten a lot more.
Bing is obviously copying. I believe them when they say it’s only one of “1000” signals, but it is still very wrong.
Google is their direct competitor, in the field of internet search. So if out of “1000 signals” 999 signals for “random word” are blank, it is NOT OK to use Google as the 1000th signal.
There is no way that Bing’s algorithm should somehow rely on Google’s results. Absolutely no way.
The hard fact is that, in the end, the Bing algorithm is partially dependent on the Google algorithm. To put it in mathematical terms:
Bing(term) = f(x_1,x_2,….,x_999, Google(term));
And that is wrong. Unless Bing wants to declare itself as a “meta search engine”.
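The formula above can be sketched in code. The real f and its inputs are not public, so this assumes the simplest possible form, a weighted sum, with three invented signal names standing in for the 1,000. It shows why a click-derived signal would decide the result only when every other signal is blank.

```python
# Hypothetical signal names and weights; a weighted sum is an assumed stand-in for f.
WEIGHTS = {"text_match": 5.0, "inbound_links": 3.0, "toolbar_clicks": 1.0}

def score(signals, weights=WEIGHTS):
    """Weighted sum of a page's signal values; missing signals count as 0."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)

def rank(pages, weights=WEIGHTS):
    """Order page URLs from highest to lowest score."""
    return sorted(pages, key=lambda url: score(pages[url], weights), reverse=True)

# Nonsense term: every ordinary signal is blank, so the click signal decides.
nonsense_query = {
    "http://example.com/honeypot": {"toolbar_clicks": 1.0},
    "http://example.com/other": {},
}

# Ordinary term: the same click signal is outweighed by the other inputs.
ordinary_query = {
    "http://example.com/clicked": {"text_match": 0.25, "toolbar_clicks": 1.0},
    "http://example.com/relevant": {"text_match": 1.0, "inbound_links": 0.5},
}

print(rank(nonsense_query))
print(rank(ordinary_query))
```

Under these assumed weights the clicked page wins for the nonsense term and loses for the ordinary one, which is one way to read both the 9 honeypot hits and the "extreme outliers in tail query ranking" remark in Bing's statement.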
Bing does not copy, at the very least the proof is not conclusive.
The thing is the term is nowhere on the page. The only place it appears ANYWHERE is on Google’s fabricated pages, NOT on the pages those link to.
And of course in the user query, right?
They both give poor results quality wise, so it’s like a mediocre student copying at a test from another mediocre student… They BOTH should spend resources to improve their engine quality, not fight each other on the PR battlefield.
Alright, so which search engine gives you quality results?
To me, Google is still the best way to find what I’m looking for, although their result quality has been declining lately. As for Bing, I never used it much, coz when I search in my own language (which is Bangla, by the way) it doesn’t return as many useful results as Google does. So Bing is not really good for localized search. And about the copying thing, even before Google noticed it, I personally noticed too that Bing’s first page results are very similar to Google’s first page results for the same query.
Wow, MS’ apologists? The fact that you can type in something like afhdshjacds and Bing returns the exact same results as Google says they are copying. It’s that simple.
No. This is just one of the many signals that Bing and, for that matter, Google too consider when ranking websites.
Say a website ranks 9th or 10th for the keyword “mangoes” and a lot of users click on that link. That indicates the link is more relevant to “mangoes” than some of the links that appear ahead of it, so the website is promoted a couple of ranks. This is exactly what happened here.
The difference is that Google monitors clickstream data only for users of Google, while Bing was smart enough to take advantage of the Bing toolbar to monitor the clickstream for competitors’ search engines as well. So it’s no surprise that Bing began ranking that particular page for that particular keyword.
Given that the keyword Google used is utter nonsense and has no relevant page, the only signal that Bing could rely upon is clickstream and they did.
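The “mangoes” example can be sketched roughly like this. It is a toy illustration of click-based promotion, not how either search engine actually works; the threshold, the step size and the site names are all assumptions.

```python
def promote_clicked(ranking, clicks, min_clicks=100, step=2):
    """Move every result with at least `min_clicks` clicks up by `step` positions."""
    ranking = list(ranking)  # work on a copy, leave the input untouched
    for url in list(ranking):
        if clicks.get(url, 0) >= min_clicks:
            i = ranking.index(url)
            new_i = max(0, i - step)  # never move past the first position
            ranking.insert(new_i, ranking.pop(i))
    return ranking

# Ten made-up results for "mangoes", in their original order.
original = [f"site{n}" for n in range(1, 11)]

# Hypothetical click counts: users keep choosing the 9th result.
clicks = {"site9": 250}

print(promote_clicked(original, clicks))
```

With these numbers site9 moves from 9th to 7th place and everything else keeps its relative order.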
No it does not.
Google – good PR, “we are the victim here”.
Bing – bad PR, “we are not smart enough to match Google quality” (note that this is not necessarily what it is, but definitely what it looks like).
Users – enjoying the pillow fight, instead of considering that this is probably one of the most innocent things that happens with the data they willingly submit to companies.