<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>gHacks technology news &#187; aol logs</title>
	<atom:link href="http://www.ghacks.net/tag/aol-logs/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.ghacks.net</link>
	<description>A technology blog covering software, mobile phones, gadgets, security, the Internet and other relevant areas.</description>
	<lastBuildDate>Tue, 24 Nov 2009 11:26:04 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Updates on the AOL Scandal</title>
		<link>http://www.ghacks.net/2006/08/09/updates-on-the-aol-scandal/</link>
		<comments>http://www.ghacks.net/2006/08/09/updates-on-the-aol-scandal/#comments</comments>
		<pubDate>Wed, 09 Aug 2006 08:27:24 +0000</pubDate>
		<dc:creator>Martin</dc:creator>
				<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[aol]]></category>
		<category><![CDATA[aol logs]]></category>
		<category><![CDATA[aol scandal]]></category>

		<guid isPermaLink="false">http://www.ghacks.net/2006/08/09/updates-on-the-aol-scandal/</guid>
		<description><![CDATA[AOL released the private search queries of 500,000 AOL users to the public. Analysts and journalists alike are busy analysing the data for various reasons. I was able to identify three different motivations: 1. gauging how big the privacy breach is, 2. finding out whether someone can be identified from the queries, and 3. analysing the queries from a marketing point of view.]]></description>
			<content:encoded><![CDATA[<p><a title="500000 aol users, 20 million queries" target="_blank" href="http://www.ghacks.net/2006/08/07/anonymised-logs-of-500000-aol-users-on-the-net/">AOL released</a> the private search queries of 500,000 AOL users to the public. Analysts and journalists alike are busy analysing the data for various reasons. I was able to identify three different motivations: 1. gauging how big the privacy breach is, 2. finding out whether someone can be identified from the queries, and 3. analysing the queries from a marketing point of view.</p>
<p><a title="eliott back aol gate" target="_blank" href="http://elliottback.com/wp/archives/2006/08/07/aol-gate-search-query-data-scandal/">Elliott Back</a> searched the data for credit card numbers, social security numbers and email addresses, and the results are disturbing: he found an undisclosed number of credit card numbers, about 200 social security numbers and nearly 60 email addresses. The danger that someone will misuse this data is real, and misuse is likely.</p>
<p><span id="more-699"></span>The first AOL user has been <a title="first aol user has been identified" target="_blank" href="http://www.nytimes.com/2006/08/09/technology/09aol.html">identified</a> by the New York Times. AOL user 4417749 is Thelma Arnold, a 62-year-old widow who lives in Lilburn, Ga.; she was identified by analysing the searches she conducted. I suppose she will not be the last one to be identified, and there will surely be some sort of backlash for AOL.</p>
<p>Webmasters have begun to import the data into databases and offer ways to search it on the web for free, for example <a title="search the aol user data" target="_blank" href="http://simplifiedsec.com/KeywordDigger.html">here</a>.</p>
<p>AOL's first response was that the data was released by mistake and that the intention was to reach the academic world. Unfortunately for them, they decided to make the data publicly available instead of giving it only to people who request it for research. They really should have known that this could happen.</p>


]]></content:encoded>
			<wfw:commentRss>http://www.ghacks.net/2006/08/09/updates-on-the-aol-scandal/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Anonymized Logs of 500000 AOL users on the net</title>
		<link>http://www.ghacks.net/2006/08/07/anonymised-logs-of-500000-aol-users-on-the-net/</link>
		<comments>http://www.ghacks.net/2006/08/07/anonymised-logs-of-500000-aol-users-on-the-net/#comments</comments>
		<pubDate>Mon, 07 Aug 2006 07:39:01 +0000</pubDate>
		<dc:creator>Martin</dc:creator>
				<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[aol]]></category>
		<category><![CDATA[aol anonymized logs]]></category>
		<category><![CDATA[aol logs]]></category>

		<guid isPermaLink="false">http://www.ghacks.net/2006/08/07/anonymised-logs-of-500000-aol-users-on-the-net/</guid>
		<description><![CDATA[AOL surely did not anticipate the immense backlash they would receive from the internet community when they released anonymized logs of 500,000 AOL users on the AOL research website. The file consisted of about 20 million web queries from about 500,000 AOL users over the course of three months (March to May 2006). The AOL username was replaced by a unique ID; everything else in the logs was left unchanged.]]></description>
			<content:encoded><![CDATA[<p>AOL surely did not anticipate the immense backlash they would receive from the internet community when they released anonymized logs of 500,000 AOL users on the AOL research website. The file consisted of about 20 million web queries from about 500,000 AOL users over the course of three months (March to May 2006). The AOL username was replaced by a unique ID; everything else in the logs was left unchanged.</p>
<p>AOL quickly took down the website, but a Google cache copy is still available. The big compressed logfile (over 400 megabytes) can be obtained from <a title="aol log files download" target="_blank" href="http://www.gregsadetsky.com/aol-data/">Greg Sadetsky's website</a> as a web or torrent download. Pulling the information from the AOL website surely looks like an admission of wrongdoing, and AOL is now in full damage control mode. The uncompressed logfile consists of text files with a combined size of over two gigabytes.</p>
<p><span id="more-693"></span>Some questions naturally arise: Why did AOL release the data? Why is there such a big outcry in the web community? I think that AOL released the data for research and <a target="_blank" title="marketing and the aol 500 k logs" href="http://plentyoffish.wordpress.com/2006/08/06/aol-releases-googles-most-prized-keyword-list-google-is-gonna-get-mega-spammed/">marketing purposes</a>. This is a goldmine for every researcher working on search engines and user interaction, and marketers will surely analyse the keywords and search phrases extensively. A question remains: Why did they make the file available for download to the public? Would it not be better to offer the file on DVD to researchers only?</p>
<p><a title="google 6 dvd user searches" target="_blank" href="http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html">Google</a> will be offering six DVDs with search data soon as well, so it is not only evil AOL that is &#8220;sharing&#8221; user searches with others. There are two clear differences between the AOL and the Google approach. The first is that AOL released customer data, that is, data of people who are paying AOL for internet access, while Google has no such customer relationship. The second is that AOL released the data to the public, while Google will supposedly be offering the data on DVD to researchers only.</p>
<p>I took a look at the first text file, which has a size of more than 200 megabytes. Each line begins with the unique ID that replaced the username, followed by the search query, the time of the search and, where available, the destination (URL) the user went to. Everything is unfiltered; you could surely create some pretty accurate profiles from the user queries. User 205405 is searching for rape, child abuse and the like, while user 2603120 is looking for Spanish language courses in Chicago.</p>
<p>It will be interesting to see whether some of the 500k users will sue AOL over this privacy infringement. Others will most likely demand that AOL report some of its users to the authorities because of their searches.</p>
<p>Maybe it is even possible to uncover a real name by analysing the searches. I did not take a closer look at them, but I saw searches for real names, and <a target="_blank" title="techcrunch" href="http://www.techcrunch.com/2006/08/06/aol-proudly-releases-massive-amounts-of-user-search-data/">others</a> are reporting that &#8220;<span style="font-style: italic">searches for names of specific people, addresses, telephone numbers, illegal drugs, and more can be found in the logs as well.</span>&#8221;</p>


]]></content:encoded>
			<wfw:commentRss>http://www.ghacks.net/2006/08/07/anonymised-logs-of-500000-aol-users-on-the-net/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
	</channel>
</rss>
