Developers Arena

Social Media Web Tips, Social Media News & Technology Updates


Data Mining and Taco Bell Programming

Programmer Ted Dziuba suggests an alternative to traditional programming that he calls “Taco Bell Programming.” The Taco Bell chain creates multiple menu items from about eight different ingredients. Dziuba wants to be able to create many applications from combinations of about eight different shell commands.


Here’s an example from Dziuba:

Here’s a concrete example: suppose you have millions of web pages that you want to download and save to disk for later processing. How do you do it? The cool-kids answer is to write a distributed crawler in Clojure and run it on EC2, handing out jobs with a message queue like SQS or ZeroMQ.

The Taco Bell answer? xargs and wget. In the rare case that you saturate the network connection, add some split and rsync. A “distributed crawler” is really only like 10 lines of shell script.
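As a rough illustration of the xargs-and-wget idea in the quote, here is a minimal sketch. It assumes a text file with one URL per line; the function name fetch_all, its arguments, and the default of 8 parallel jobs are hypothetical choices made for this example, not something taken from Dziuba's post.

```shell
# Hypothetical helper: download every URL listed in a file, several at a time.
# Usage: fetch_all urls.txt crawl_dir [parallel_jobs]
# Assumes URLs contain no whitespace or quote characters, which xargs
# would otherwise interpret as separators or quoting.
fetch_all() {
  url_file=$1
  out_dir=$2
  jobs=${3:-8}          # assumed default; tune to your network and CPU
  mkdir -p "$out_dir"
  # xargs -n1: pass one URL to each wget invocation
  # xargs -P:  run up to $jobs wget processes at once
  # wget -q:   quiet; wget -P: directory to save downloaded files into
  xargs -n1 -P"$jobs" wget -q -P "$out_dir" < "$url_file"
}
```

Calling fetch_all urls.txt crawl_dir 16 would save each page under crawl_dir using up to 16 concurrent downloads. The split and rsync steps the quote mentions for spreading work across machines are left out of this sketch.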

Dziuba gives another example. Instead of using Hadoop to process that data once you have it, you can use:

find crawl_dir/ -type f -print0 | xargs -n1 -0 -P32 ./process
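To unpack the flags in that one-liner: -print0 and -0 separate file names with NUL bytes so names containing spaces survive, -n1 passes one file per invocation, and -P32 runs up to 32 invocations at once. The source does not show what ./process does, so the sketch below swaps in a stand-in step that prints each file's line count. The function name process_tree and the inline sh -c command are assumptions made for illustration only.

```shell
# Hypothetical wrapper around the find | xargs pattern.
# Usage: process_tree crawl_dir [parallel_jobs]
# The sh -c body is a stand-in for ./process: it prints
# "<file><TAB><line count>" for the single file it receives as $1.
process_tree() {
  dir=$1
  jobs=${2:-32}         # same parallelism as the one-liner by default
  find "$dir" -type f -print0 |
    xargs -0 -n1 -P"$jobs" sh -c \
      'printf "%s\t%s\n" "$1" "$(wc -l < "$1" | tr -d " ")"' _
}
```

Because the jobs run in parallel, output lines arrive in no particular order; pipe the result through sort if a stable order matters.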

“It is a viable way to deal with massive data problems, at least for one-off jobs,” big data expert and ReadWriteWeb contributor Pete Warden says about Dziuba’s Taco Bell programming concept. “You’re trading off the ability to manage and tightly control the process against development speed.”

Do you have any favorite hacks like this?


Posted in General, Technology, Web.

Tagged with Tips.


By Klint Finley – January 23, 2011
