Friday, June 27, 2008

Grid versus Cloud Computing

From the end user's perspective, the short answer to the question "What is the difference between grid computing and cloud computing?" is that you work with the system differently. Grid computing follows the typical batch-oriented workflow of the old mainframe days. A user has a program to run, and a grid lets the user launch that program more or less the same way as on a local machine. The key point is that you look at the grid as a means to execute a program.

The typical use of a cloud is information-driven. Taking Google as the quintessential cloud computing environment, the user is looking for information, and Google's programs have already done their job by taking in raw data and organizing it so that the user can find contextual information. Inside Google, scripts schedule the launching of the programs that crawl the web, compute the index, and update the production index.
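To make the contrast concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the function names (grid_style_run, build_index, query) and the two toy documents are hypothetical and do not correspond to any real grid scheduler or to anything Google actually runs. The grid half hands a program to the system and waits for it to finish; the cloud half organizes the data ahead of time so the user only has to ask for information.

```python
from collections import defaultdict


def grid_style_run(program, data):
    """Grid workflow: the user submits a program, it runs once, and returns."""
    return program(data)


def build_index(documents):
    """Cloud workflow, step 1: a scheduled job organizes raw data ahead of time."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def query(index, word):
    """Cloud workflow, step 2: the user only asks for information."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical toy data standing in for "raw data".
documents = {
    "a": "grid computing runs batch jobs",
    "b": "cloud computing serves information",
}

# Grid: the user brings the program (here, a word count) and launches it.
word_count = grid_style_run(
    lambda docs: sum(len(text.split()) for text in docs.values()),
    documents,
)

# Cloud: the index was computed earlier; the user just looks something up.
production_index = build_index(documents)
hits = query(production_index, "computing")
```

In the grid half the user owns the program and its lifetime; in the cloud half the program ran before the user showed up, and the only interaction left is the query.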

I just reread Tom White's post, "Running Hadoop MapReduce on Amazon EC2 and Amazon S3", which is a great example of all the steps needed to get a service running on a cloud. Once Hadoop is running and we periodically pick up the web log from S3, we have a cloud for that particular task. The actual use case of analyzing a web log would be much simpler on a grid, because the grid would automatically start and stop the needed services on our behalf. However, keeping the services running 24/7 and interacting with them through a web interface is more of a cloud computing workflow, and that is the way start-ups are using AWS.
