Hacker Chat: Max Ogden Talks About CouchDB, Open Data and Couchappspora (Part 1)

Max Ogden is a developer living in San Francisco. He’s a Code for America fellow and one of the founding developers of Couchappspora, an open source social network built with Apache CouchDB.

I recently talked to him about his Code for America fellowship, how he got started programming, CouchDB and much more. Tune in next week for part two of the interview.

Klint Finley: Can you start by telling us a little bit about who you are and what you do?

Max Ogden: Sure. I’m Max Ogden, originally from Oregon, most recently Portland. Now I’m talking to you over IRC from Harvard’s guest wi-fi and I technically live in San Francisco, although I’m in Boston for the month of February on assignment for Code for America where I’m a fellow this year.

I’m working out of Boston City Hall on a software project intended to give more educational resources to low income populations in the city of Boston. Outside of Code for America, I’ve previously worked at some startups in Portland, but my real passion lies in the communication and networking space. More recently I’ve been getting involved in different privacy-centric open source projects.

How did you become a Code for America fellow?

I was recruited by Tim O’Reilly. He worked his charismatic magic on me. The mayor of Portland, Sam Adams, was presenting me with an award for doing open source and open data work in Portland and Tim was at the awards ceremony. He said “We’d like to see this happen in more cities than just Portland.” That was a pretty exciting proposition.

How did you get started with Web development, and programming in general?

I started taking classes in high school, but I actually got hooked by playing Starcraft back in 1999. The custom level editor has elements of event-driven asynchronous programming and I started geeking out on custom Starcraft maps. And then I realized that programming is just Starcraft without the space aliens.

My programming career was largely self-directed, but then I got a great job at a marketing research company under some really talented engineers. It was a really good mentoring environment. Agile processes, pair programming, test driven development. A lot of experimentation with newer technologies and a lot of freedom to innovate. But the nature of the work was building human interfaces to technology.

The company’s purpose was to create ways for consumers to express qualitative information about consumer goods, a la focus groups, but using consumer technology as the capture mechanism. So I have a background in helping people be more human using technology. Which is an interesting field/challenge.

Did you go to college for computer science or did you go straight into the job market after high school?

I skipped college altogether actually and went straight into startup land.

You do a lot of work with CouchDB, notably GeoCouch and Couchappspora. What attracted you to Couch?

I got interested in CouchDB when Portland started releasing a lot of government generated data sets. Because Portland is so laid back, I would get off of work at 5pm and then have the rest of the evening to hop around town and code at cafes and bars with peers. Cost of rent is low so you can afford to not take your work home with you.

So the Mayor’s office started releasing raw data and it was in the native GIS format that the government created it in, which requires certain proprietary software in order to work with. A lot of major cities have $50,000+ annual licenses for these software suites. The barrier to entry there is high. So I tried decoding a single data set to see what the workflow was like and nine hours later I had a list of 2,900 bicycle parking racks in JSON format.

So the problem arose as: how do I host the other approximately 100 geographic datasets in the same nice JSON format for other developers to work with, so that they don’t have to go through all of the trouble that I originally had to go through to work with the data. So I was shopping around for databases and GeoCouch was brand new at the time and was the most elegant and simple solution that I’ve found since.

Basically I can dump a bunch of government data into it and with little to no special configuration it automatically becomes a publicly accessible API. I guess that’s the biggest “feature” of Couch.
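As a rough illustration of that workflow, the sketch below turns a few decoded records into JSON documents and builds the body for CouchDB's `_bulk_docs` endpoint. The record layout, the `bike-rack-` ID scheme and the sample coordinates are invented for the example and are not taken from the Portland data. Nothing is sent over the network.

```python
import json


def rack_to_doc(rack_id, name, lon, lat):
    """Turn one decoded record into a document with a GeoJSON-style point.

    The field names and the ID scheme are assumptions for illustration.
    """
    return {
        "_id": f"bike-rack-{rack_id}",
        "name": name,
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
    }


def bulk_docs_body(records):
    """Build the JSON body for a POST to /<db>/_bulk_docs."""
    docs = [rack_to_doc(*record) for record in records]
    return json.dumps({"docs": docs})


# Hypothetical sample rows: (id, name, longitude, latitude)
sample = [
    (1, "SW 5th & Oak", -122.6765, 45.5212),
    (2, "NE Alberta & 15th", -122.6502, 45.5590),
]
body = bulk_docs_body(sample)
```

Once a body like this has been posted to a database, each document can be fetched over plain HTTP at `/<db>/<_id>`, which is the sense in which the data "becomes" an API.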

The relevance to my software philosophy is the idea that Couch is by default open and replicable. So any data that is in Couch becomes fully replicable by anyone.

Right now we have the notion of a blog presented via HTML and then a separate syndicatable feed, usually via RSS. Couch is basically a more robust pub/sub mechanism that would replace RSS in that model. So for open government data it is really enticing because anyone can replicate the actual API itself, not just consume the API via one-off queries.
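To make "replicating the API itself" a little more concrete, here is a minimal sketch of the HTTP request that asks a local CouchDB server to pull a copy of a remote database through the `_replicate` endpoint. The host names and the database name are placeholders, and the request is only constructed, not sent.

```python
import json
import urllib.request


def replication_request(local_server, source_db_url, target_db):
    """Build (but do not send) a POST to <local_server>/_replicate."""
    payload = {
        "source": source_db_url,   # remote database to copy from
        "target": target_db,       # local database to copy into
        "create_target": True,     # create the local database if missing
    }
    return urllib.request.Request(
        url=f"{local_server}/_replicate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URLs, assumed for illustration only.
req = replication_request(
    "http://localhost:5984",
    "http://data.example.org/bike_racks",
    "bike_racks",
)
```

Passing `req` to `urllib.request.urlopen` against a running server would start the copy. Adding `"continuous": True` to the payload keeps the local copy following the source, which is closer to the feed-subscription model described above.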

Can you talk a little more about GeoCouch and what it is?

GeoCouch is just a light layer on top of Couch. There are a couple of notable CouchDB “plugins.” One is GeoCouch, the other is CouchDB-Lucene. GeoCouch lets you quickly look up spatial objects. So if you have, for instance, a list of all of the fire hydrants in Boston (there happen to be 13,000) and you wanted to show a firefighter the three fire hydrants on the block that he is standing on, you can filter those three out of the list of 13,000 in a very efficient manner. On the order of milliseconds.
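The hydrant lookup is a bounding-box query. The sketch below shows the question being asked, first as a naive scan over every record and then as the kind of URL a GeoCouch spatial index is queried with. The scan is only there to show the logic; a spatial index exists precisely so the server does not have to check every point. The hydrant records, the design document name and the index name are hypothetical, and the `_spatial/...?bbox=` path follows the form used in GeoCouch's documentation of that era, so treat it as an assumption for any given version.

```python
def in_bbox(lon, lat, bbox):
    """Return True if the point lies inside (west, south, east, north)."""
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


def hydrants_in_bbox(hydrants, bbox):
    """Naive filter: checks every hydrant, so cost grows with the list."""
    return [h for h in hydrants if in_bbox(h["lon"], h["lat"], bbox)]


def spatial_query_url(db_url, design_doc, index_name, bbox):
    """Build a bounding-box query URL for a spatial index (assumed path)."""
    west, south, east, north = bbox
    return (
        f"{db_url}/_design/{design_doc}/_spatial/{index_name}"
        f"?bbox={west},{south},{east},{north}"
    )


# Hypothetical data: two hydrants, one inside the block's bounding box.
hydrants = [
    {"id": "h1", "lon": -71.058, "lat": 42.355},
    {"id": "h2", "lon": -71.100, "lat": 42.340},
]
block = (-71.06, 42.35, -71.05, 42.36)
nearby = hydrants_in_bbox(hydrants, block)
url = spatial_query_url("http://localhost:5984/hydrants", "main", "points", block)
```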

CouchDB-Lucene is similar in nature, except instead of spatial queries it lets you do full text indexing. The normal out of the box Couch has a more familiar B-Tree index, which is a pretty general purpose indexing engine for lists of data. But spatial queries and full text indexes are more specialized around big lists of geolocated objects or large bodies of freeform text. The general workflow is: You put data into a database (“folder,” “bucket”) and then you hook up a Web application to pull that data out and display it. So I usually do a lot of the logic on the client side in JavaScript.

The nice side effect of that strict separation between data and presentation is that every site built that way automatically has a replicable API. So you can use the Couch “protocol” to subscribe to anyone’s API.
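One way to picture "subscribing" is CouchDB's `_changes` feed, which in continuous mode streams one JSON object per line as documents change. The sketch below parses lines in that shape. The sample lines are made up, and a real client would read them from `GET /<db>/_changes?feed=continuous` rather than from a list.

```python
import json


def parse_changes(lines):
    """Yield (seq, doc_id, deleted) for each change line in a feed.

    Blank lines (heartbeats) and lines without an "id", such as a
    trailing {"last_seq": ...} object, are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        change = json.loads(line)
        if "id" not in change:
            continue
        yield change["seq"], change["id"], change.get("deleted", False)


# Made-up feed content for illustration.
sample_feed = [
    '{"seq":1,"id":"bike-rack-1","changes":[{"rev":"1-abc"}]}',
    "",
    '{"seq":2,"id":"bike-rack-2","changes":[{"rev":"2-def"}],"deleted":true}',
    '{"last_seq":2}',
]
changes = list(parse_changes(sample_feed))
```

A subscriber that remembers the last `seq` it saw can reconnect with `since=<seq>` and pick up where it left off, which is roughly what replication does on its behalf.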

When Diaspora got released they tightly coupled the data layer and the presentation layer, so it was a big Rails app that had a lot of complex dependencies, fancy front-end user interfaces and an opaque backend that didn’t adhere to any notable syndicatable protocols. The OStatus stack was missing, for instance. OStatus is a set of six or seven different formats like RSS for creating decentralized subscription based social networks. But OStatus is front-end agnostic. OStatus doesn’t advocate for any specific user interface; it expects you to build your front end in new and interesting ways.

On some level I was bored with Facebook because every user is exposed to the same interface. It was too standardized. And on another level Facebook’s centralized data ownership was morally questionable. And then Diaspora came along and didn’t seem to promote any data standards but instead tried to make its front end the selling point. Which is great, but you need to pursue both the standards in the backend and the innovation in the user experience. At least that’s the way I look at it.

So I started hacking with RWW resident hacker Tyler Gillies on Couchappspora, which used the Couch protocol to replicate data but used Diaspora’s front end.

To be continued…
