

August 6, 2006

AOL's Data Death Rattle

Markus nails one of the reasons why AOL's release of reams of search data is a big deal. Among other things -- like gigantic privacy violations, and so on -- it is going to help reprice the keyword market, to Google's detriment.

[Update] I see that AOL has taken the link down. While I have the data, a) I'm uneasy about cooperating in this privacy violation, and b) I don't have the bandwidth at my host to allow downloading it. I see there is a (short) mirror list here.

[Update^2] Lots of privacy-related hand-wringing going on over this issue. While I'm generally sympathetic, I also think it's important to point out that doing self-incriminating searches over the public Internet is a little like sending self-incriminating emails over the public Internet: An expectation of privacy is not the same thing as a guarantee.


Comments

AOL execs must be waking up ... the blog post and the zip file are no longer there... now let's just wait for the heads to roll...

The privacy blowup is the big thing here -- I suspect some of the user IDs will be linked to actual screen names and people soon, and perhaps prove embarrassing to the searchers.

I suspect Markus has overblown the SEO/spam implications. Big-league spammers should already have great data on what people are searching for -- it can be collected so many ways: toolbars, spyware, ISP caches, metasearch engines. Google and others also already give lots of info on relative frequency of search terms when you buy (or even contemplate buying) keyword-triggered ads. And when you come close to something that's searched for more often, they'll suggest the variants for you. So this is unlikely to teach spammers much new about what people are searching for.

How many people search for their own credit card numbers as a way to look for listings on fraud/scam websites? You will see at least a few valid card numbers in this data. Same for SSNs.

I'm surprised at how knee-jerk the reaction is to this by people like Mike Arrington.

This type of data has heretofore been available to ISPs, large web sites and people who pay for it.

The fact that there are a ton of porn searches shouldn't surprise anyone. And data like credit card numbers and SSNs is also much easier to find via Google than people think.

In the end, AOL is giving something useful away for free. Wouldn't that usually make the web 2.0 hypesters happy?

John - I kinda agree. While AOL didn't handle this release well, people need to relax a little on the alarmism. Perfect privacy on a public network is an oxymoron.

What happens when a few dozen (or hundred, or thousand) of the 'anonymized' user-ids get definitively mapped to real screen-names/people, because the collection of queries made essentially determines just one person? And the queries include things that get those people in hot water with their peers, employer, spouse, or government?

I don't know if Google's 'search history' is now enabled for everyone, or how many of you stay logged-in to Google while you search. But if you do, and don't think it's a big deal to have your search history revealed, visit this page...

http://www.google.com/searchhistory/?hl=en

Step through your search history -- Google may very well have thousands of your searches recorded -- and see if you'd like it posted for the world to see, attached to your name.

I wonder if the government has anything to do with this. Remember, earlier they asked Google and Yahoo to provide the details as well.