Tuesday, July 15, 2008

FOO Camp 2008

I spent the weekend at FOO Camp 2008, an annual event organized by publisher O'Reilly Media (hence the name, Friends Of O'Reilly). The event brought together 275 movers and shakers from the tech industry and related fields, and it was an incredible experience. It was as if someone had injected the latest and greatest ideas and thoughts into my brain with one joyful syringe, accompanied by a few good glasses of wine. Michael Arrington of TechCrunch captures the spirit of FOO Camp in his blog post (and you can even see me standing and looking busy behind Jimmy Wales, the founder of Wikipedia, in one of his photos).

The conference begins with no set agenda. The organizers put up an empty board listing the time slots and locations for sessions, and as the participants arrive, they fill the board with sessions. About 10 sessions run in parallel at any given time, most of them looking quite fascinating.

To give you a rough idea, within the span of a few hours, I attended sessions on:

-- aggregating meta-data on the web (i.e., all the data we create as we use services on the web), organized by Esther Dyson,
-- the future (or lack thereof) of journalism (organized by several NY Times and SeattlePI reporters),
-- "open education" (its tools, policies, and politics),
-- crowd-sourcing vs. curation (i.e., how to balance all the inputs one gets from the bloggers of the world with careful aggregation and analysis of information),
-- how computers can help the humanities (e.g., analyzing the Bible, helping archaeologists), organized by Martin Wattenberg, the creator of Many Eyes,
-- educational tools for virtual worlds, and
-- a very well attended session on small things one can do to become happier in life.

There was also a session on "big data", organized by Roger Magoulas, the director of research at O'Reilly. The point I took away from that session is that owners of big data sets are now more confused than ever. They face a much wider array of architectural choices for data management systems than they ever did. These include map-reduce based systems, column stores, real-time warehouses, streaming systems, and various systems built on top of MySQL. Each of these architectures has its advantages and limitations, but it's becoming increasingly hard for application builders to understand the tradeoffs (and it's not as if marketing departments are rewarded for making the choices clearer). It's no longer a world where you buy your favorite relational database system and you're done (and stuck). I think this situation presents some interesting research challenges for the database community (it's also interesting how little attention some of these architectures get in the community).

The idea of designing the conference program on the spot is very appealing, and I'd like to propose that we do a little bit of it at traditional scientific conferences. (There is the concept of a birds-of-a-feather session, but that's usually a grab bag of ideas.) We should set aside time slots in our conferences for sessions that participants organize as they arrive, to stimulate discussion on the spot. That's a much better way of getting up to speed on hot topics and people's current thinking, which is what conferences should be for!
