I was working on a blog monitoring project for Sun Microsystems, building a web page that displayed the most recent and the most-talked-about blog posts from around the web about 12 different Sun technologies, for use during the company’s huge user conference.
As a part of that work, I was grabbing a feed from Google Blogsearch for long search queries like “Sun+Java-Indonesia….” etc. Google Blogsearch’s own RSS feeds were all full of cruft, though: HTML bolding the search terms in the description field, and more. Not being a developer myself, I couldn’t figure out how to strip that all out. I spent several nights pulling out my hair, worried I wouldn’t be able to create something that was production-ready for this big client.
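For the developers reading along, the cleanup I was stuck on looks roughly like the Python below. It is a minimal sketch, not what I ended up shipping: it assumes a standard RSS 2.0 feed whose description fields carry HTML-escaped markup, and the sample feed and the function names strip_html and clean_descriptions are made up for illustration rather than taken from Google Blogsearch.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects the text of an HTML fragment and ignores every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(fragment):
    """Return the fragment's text with tags removed and whitespace collapsed."""
    parser = _TextOnly()
    parser.feed(fragment)
    parser.close()
    return " ".join("".join(parser.parts).split())


def clean_descriptions(rss_xml):
    """Rewrite every <description> in an RSS document as plain text."""
    root = ET.fromstring(rss_xml)
    for desc in root.iter("description"):
        if desc.text:
            desc.text = strip_html(desc.text)
    return ET.tostring(root, encoding="unicode")


# Hypothetical feed: the kind of bolded-search-term markup described above.
SAMPLE_FEED = (
    '<rss version="2.0"><channel><title>Hypothetical search feed</title>'
    "<item><title>Example post</title>"
    "<description>&lt;b&gt;Sun&lt;/b&gt; ships new &lt;b&gt;Java&lt;/b&gt; "
    "tools &amp;amp; docs</description></item></channel></rss>"
)

if __name__ == "__main__":
    print(clean_descriptions(SAMPLE_FEED))
```

Calling clean_descriptions on a feed returns the same feed with plain-text descriptions. A real feed would likely need more care (namespaces, CDATA sections, malformed markup), which this sketch does not attempt.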
I tried Yahoo Pipes, I tried other blog search engines, but what I ended up doing was using Dapper to scrape a new feed from the search results pages. Those feeds were nice and clean to display on the project website.
This wasn’t an easy thing to figure out. I tried many different strategies before discovering that one, and even then I needed help from the guys at Dapper. As the project proceeded, my contact at Sun came to me and said (paraphrasing) – “Marshall, it looks like you’re going to be able to pull this off after all, but I wonder if you could add one more search query and module to the end product. It is very, very top secret though and you cannot tell anyone about it.”
I said of course I could do that, what was the search query?
“Neil Young,” she said.
Of course I was more than happy to do that. It turned out that the big splashy secret announcement at Sun’s conference was that Neil Young was going to make a surprise appearance on stage to unveil the first ever collection of his entire life’s work, including letters he’d written, scanned-in notes from studio recording sessions, video interviews and of course all his music. All those materials would be made available on Blu-ray, the media storage format whose players are all required to run Sun’s Java software.
I built a long search query that would automatically deliver the best feed of search results about Neil Young news I knew nothing about yet, and included it in my deliverables. The project was completed days before the big conference and it was exhausting.
Just before the conference began, my Sun contact called me and said, “Can we fly you down to the event for an interview with Neil Young as thanks for all your hard work?”
And that’s how Dapper made it possible for me to meet Neil Young. We talked about electric cars (his new passion), about MP3 audio quality, about DRM and more. It was great.
I used Dapper for many, many different things. I still use it regularly (I used it last night, in fact) and if I could stop time and geek out for an evening with no obligations, I’d still probably spend that time playing with Dapper or the similar new tool NeedleBase.
Isn’t That Just an Ad Network?
When Dapper was acquired by Yahoo last week, all the news coverage was brief and called the service a semantic advertising platform. How tragic! Co-founder Eran Shir wrote last week about the acquisition and said that the Dapper team always envisioned themselves making the display advertising world a more meaningful place. If that’s true I’m disappointed. That sure wasn’t what the service’s earliest adopters wanted to use it for.
In February 2008 Dapper announced at its DapperCamp event that it would be launching an advertising technology. The Dapp Factory, as it was called, would no longer just be used to extract data for an undetermined purpose – it would be used to target contextual relevance for ad placement.
A mere 35,000 “Dapps” to perform extraction had been built and the company was struggling to be financially viable. It was a confusing service with a challenging interface on top of a radically new user paradigm. The only clear solution was to become an ad network: to fund the semantic indexing of text fields around the web by turning some of them into advertisements.
It’s cool. I’m ad-supported. But Dapper had promised more than that. It had promised to be an easy and powerful tool that anyone, with no technical skills, could use to render any web page dynamic, to monitor particular fields in pages for updates automatically, to pull sets of data off of pages around the web. It’s magic.
It was beautiful, but people didn’t want it, they didn’t understand it. Because people are stupid. It’s maddening. You tell people: take this tool, use it to get real-time notifications of changes to the tiniest part of any web page, use it to pull down sets of data from the web with a snap of your fingers, use it to work fast and get first-mover advantage. Scrape, then grab the fruits of that scraping, then enjoy a fast-growing career and meet your childhood musical heroes! But no, if there’s an unclear step between a technology of empowerment and profit, a step that requires creativity and hard work, then the market at large throws a fit and demands that profit be instead put directly into its spoiled-child’s hand. “I want an ad network!” people say, effectively, “Give me the money directly!”
Dapper as Parable
A beautiful web technology is like a little fairy, whose light shines bright for a short time and then extinguishes. Enjoy it while you can, until an uncaring market starves it to death and it turns into an ad network, for lack of viable alternatives.
Dapper still lets you scrape feeds using its legacy product. Hopefully Yahoo won’t shut that down, if it allows any of the service to survive. But imagine how much more powerful (and stable) this beautiful service might have been if the company could have found a way to monetize its core feed scraping and publishing product, and if that had remained the top development priority.
The same thing happens time and time again. “Your technology is too wonderful,” I sometimes tell the most inspiring startups I interview before they launch. “No one will understand how to capture the incredible value you deliver. Your sales people will pound their heads against a wall for months. And then you will become an ad network.”
Companies laugh uneasily. Perhaps because they know how likely it is that I’m right. (Perhaps because they think I’m a creep who ought to be perfectly happy for them if they can manage to build a viable ad network.)
I told Factery Labs that when I saw its demo. That startup provides an API that you can throw any URL at and get in response a feed of “fact-type sentences” extracted from the text behind the submitted URL. It’s awesome. Twitter client Sobees, for example, uses it to offer text summary previews of any links shared by your friends on Twitter. It’s great – but what are the odds that Factery is going to turn into an ad network? I think they are pretty good.
I told the company that and they said, “What’s your shirt size?”
I told them, and a week later a package showed up at my door from Cafe Press. In it was a hooded sweatshirt with the Factery Labs robot logo screen printed on the back of it. Around the logo circled the words: “Factery Labs – Not an Ad Network Yet.”
It’s a cautionary tale – tell people that anyone can blog or Tweet, post a photo or a video, and you will change the world. Tell people that anyone can now extract text and data, process it automatically and treat web content like bowling pins, torches and knives in a capable juggler’s hands. Not enough people, at least so far, will care. You will likely become an ad network.
Maybe that will change someday. Or maybe these freaky little services will remain forever like short-lived fairies, destined to be extinguished before their time.
Either way, I had a lot of great times with Dapper. I hope that technology like it will never stop being born.