"Should Web Giants Let Startups Use the Information They Have About You?
By Josh McHugh
12.20.07 | 6:00 PM
Just after 10 am on June 7, 2007, Ryan Sit glanced at
his Gmail inbox and saw the message he had been waiting nine months to
receive. Sit, a 29-year-old software developer from San Diego, is the
founder of Listpic, a site that used bots — automatic software-based
agents — to pull images from craigslist for-sale listings and
reorganize them into an easier-to-navigate, more attractive format.
Instead of tediously clicking individual links to view photos, Listpic
users could see them all collected onto a single page. The service was
an instant success, and by early June it was pulling in more than
43,000 visitors a day and thousands of dollars a month in Google
AdSense revenue...."
Great article; read the whole thing.
This is a topic I'll be blogging about quite a bit this year. In particular, I'm thinking a lot about what this means as we begin to mine this data to create better and better user experiences. If it's on the web, is it free? How can we make our audiences into our best distributors? How can we avoid repeating the mistakes of the music business in the new economy? What rights do copyright holders, authors, musicians and writers have over their content? Do they get a say in how their work is used? How can we guarantee and protect their right to participate in monies generated by new forms and uses?
In my music business days we used to ask artists to sign contracts that talked about licensing their work "in any and all media known and unknown, or hereafter invented throughout the universe, in perpetuity." I used to think that just meant we intended to corner the market on CDs on Venus. Now when I think about mash-ups, sampling, DJs, iPlayers, etc., it feels to me like we missed the point entirely.
Then there is user data, metadata: the detritus of a digital life. If you took my cookie files, Amazon wish list, click-paths and half a dozen or so playlists from my Last.fm scrobbles, you would have a very odd picture of me. Mix that up with other data and data mining becomes very, very strange. I saw Jeff Bezos give a talk about the "my TiVo thinks I'm gay" phenomenon at Web 2.0. His feeling was that this is a self-correcting problem: bad data = bad sales = corrected algorithm = better data. I think he's largely right on that.
But in his mind, you come into his store, your behavior belongs to him - end of discussion. To me it's more complicated. I think it's fine if you collect that data, especially in aggregate, from your property. I think it's fine if you use it to improve your product offerings and suggestions to me. Tone down the Tampax commercials and give me more gadget stuff. Oh yeah, and music, more music. Less cricket.
On the other hand, this data starts to build up, collect, get rusty and, frankly, go out of date (my historical synth-pop a-ha phase is largely over, for those of you who share my Last.fm friends list). How can we keep it current, and how do I prevent this information from being used in a way that hurts me, or even breaches my personal privacy? (I mean, come on, we all like at least ONE Neil Diamond song.)
Basically, we need a framework, a set of rules around this, and we all need to agree them. I'm a huge fan of the work Creative Commons have done in this area, but we need a lot more. At the BBC we talk a lot about this these days, particularly as we exist to serve the public and their interest. We aren't selling you anything, except maybe the idea that the BBC is valuable and you should continue to support it. More on this in future blogs.
Originally posted on rxdxt.vox.com