Monetizing the data flood and the future of culture at SXSW 2010
Data on the web proliferates faster than ever before; at the same time, it has never been easier to access.
Three challenges have arisen. How can we help people put this flood of tweets, YouTube videos, and blog posts to use? How do we make money in the process? And how will cultural authorities (academics, film critics, publishers) grapple with fundamental changes in their work as mathematical tools become indispensable for recognizing and presenting noteworthy content?
Vertical markets
In Monday's keynote, Twitter co-founder Evan Williams announced a monetization strategy for Twitter that consists of making tweets available for third parties to organize and analyze. I know you will say "Duh!", but here is a point I believe is worth considering. We used to think of large-scale data analysis as data mining: typically confined within the walls of a research or consulting firm, and aimed at tuning a separate product or marketing campaign. But, I believe, this is not what Williams had in mind. Instead, the data will be reformatted, organized, and fed directly back to customers in vertical markets (say doctors, or graphic designers, or real estate agents). Large-scale data analysis is used to create value for consumers directly, and a product is built around that value, just as Google and Bing do today. Think also of the way Bing has claimed to focus on its health, travel, and shopping verticals.
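To make the idea concrete, here is a minimal sketch, in Java, of what "organizing tweets for a vertical" could amount to in its crudest form: routing raw tweets into vertical-specific feeds by keyword. Everything here is hypothetical; the class name, the verticals, and the keyword lists are invented for illustration and have nothing to do with Twitter's actual API.

import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: route raw tweets into vertical-specific feeds.
// Class name, verticals, and keywords are all invented for illustration.
public class VerticalRouter {

    // Each vertical is defined by a handful of keywords its audience cares about.
    private final Map<String, List<String>> verticals = new HashMap<String, List<String>>();

    public VerticalRouter() {
        verticals.put("realEstate", Arrays.asList("mortgage", "listing", "open house"));
        verticals.put("health", Arrays.asList("clinic", "symptom", "screening"));
    }

    // Returns the first vertical whose keywords appear in the tweet,
    // or null if the tweet is not relevant to any of them.
    public String route(String tweetText) {
        String lower = tweetText.toLowerCase();
        for (Map.Entry<String, List<String>> entry : verticals.entrySet()) {
            for (String keyword : entry.getValue()) {
                if (lower.contains(keyword)) {
                    return entry.getKey();
                }
            }
        }
        return null;
    }
}

A real product would of course replace the keyword match with statistical classification, but the shape is the same: analysis happens once, centrally, and each vertical customer receives only the slice of the stream that matters to them.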
The future of curators
The second issue is what will happen to traditional authorities, be they academics, Criterion Collection editors, or book publishers. These are our traditional guides through the cultural landscape, helping to focus the public's attention on the films, music, and research papers that stand the best chance of not being a complete waste of time. This year, many SXSW panels (Nobody Wants to Watch Your Film, New Publishing and Web Content, Universities in the Free Era) placed a great deal of emphasis on the transition from authorities to humble curators: professors, for example, will give up exclusive claims to authority and instead assemble the information lying around and bring it to the attention of their students, without worrying about its institutional provenance.

Although dissolving traditional auras is a huge concession, there is another, more radical change at hand. Is it really possible for a single person to make a meaningful selection among the heaps of data available? Or will the sample he or she manages to access be so small that the selection is almost random? Curation can no longer be thought of as the work of a single individual, or even a team, leafing through whatever articles, pictures, or sound files can be found. There is just too much, and it is updated too often. Curation will increasingly mean the design and implementation of data analysis and presentation tools. Unlike the third-party Twitter businesses above, such systems will not all need to turn huge profits, nor will they all be able to. They might focus on long-tail niches (60s French film lovers, neurobiologists, electric guitar geeks). They will need to rely on individuals who are not always highly trained statisticians, and they will need to place a big emphasis on fostering and empowering a community around the data.
Fun ahead
The monetization techniques of popular web applications such as Twitter converge with answers to our need to keep building reference points and meaning in the data flood. We can expect disruption in this area to continue for a long time, as profits are pursued and answers to the navigation problem are formulated. Solutions will also have to be found that allow relatively inexperienced people to manipulate large data sets. This will probably mean building yet another abstraction layer on top of products such as Apache Mahout (itself an abstraction layer on top of Apache Hadoop) to make them usable by people with little programming background.
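As a rough illustration of what such a layer might look like, here is a sketch that hides the plumbing of Mahout's Taste recommender behind a single class with one question-answering method. The Mahout calls themselves are real (Taste's classic user-based recommender); the SimpleRecommender wrapper and the input file are my own assumptions, not anything shipped by the project.

import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

// Hypothetical facade: all of Mahout's setup is done once, here,
// so a tool built for non-programmers only ever sees one method.
public class SimpleRecommender {

    private final Recommender recommender;

    // prefsFile: a CSV of userID,itemID,preference triples
    // (the format FileDataModel expects).
    public SimpleRecommender(String prefsFile) throws Exception {
        DataModel model = new FileDataModel(new File(prefsFile));
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(10, similarity, model);
        this.recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);
    }

    // "What should this user look at next?" -- the only question the layer exposes.
    public List<RecommendedItem> topPicksFor(long userId, int howMany) throws Exception {
        return recommender.recommend(userId, howMany);
    }
}

The point of the design is that everything a non-programmer should never have to see (similarity metrics, neighborhood sizes, data models) is decided once, inside the layer, while the curation tool built on top only asks for a user's top picks.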