Saturday, August 29, 2009

Scaling Hadoop for Multicore and Highly Threaded Systems

Sun Microsystems' latest systems webinar covers Scaling Hadoop for Multicore and Highly Threaded Systems.

Details of the webinar, as described on the web site:

During this webinar you will learn about:

- Scale: How to use Hadoop to store and process petabytes of data
- Performance: How to maximize parallelism per node, and the results of tests varying the number of nodes and integrating Flash memory drives
- Virtualization: How we created multiple virtual nodes using Solaris Containers
- Reliability: How Hadoop automatically maintains multiple copies of data and redeploys tasks based on failures
- Deployment options: How Hadoop can be run in the "cloud" on Amazon EC2/S3 services and in compute farms and high-performance computing (HPC) environments

Hadoop is typically scaled on a large pool of commodity system nodes. However, by using multicore, multithreaded processors, you can achieve the same scale with fewer machines. In this webinar, we will discuss how Sun's chip multithreading (CMT) technology-based UltraSPARC T2 Plus processor can process up to 256 tasks in parallel within a single node. We will also share with you how we evaluated CPU and I/O throughput, memory size, and task counts to extract maximal parallelism per single node.
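To make the 256-task figure concrete, here is a rough sketch of the arithmetic. It assumes a four-socket server with eight cores per UltraSPARC T2 Plus chip and eight hardware threads per core; the announcement does not state that configuration. The 75/25 split between map and reduce slots is a hypothetical ratio for illustration, not a result from the webinar.

```python
# Illustrative arithmetic only. The socket, core, and thread counts below are
# assumptions about a four-socket UltraSPARC T2 Plus server, not figures
# taken from the webinar announcement.
SOCKETS = 4
CORES_PER_SOCKET = 8
THREADS_PER_CORE = 8


def hardware_threads(sockets, cores_per_socket, threads_per_core):
    """Total hardware threads available on one node."""
    return sockets * cores_per_socket * threads_per_core


def split_slots(total_slots, map_fraction=0.75):
    """Divide task slots between map and reduce tasks.

    map_fraction is a hypothetical tuning knob; a real value would come
    from measuring the workload.
    """
    map_slots = int(total_slots * map_fraction)
    return map_slots, total_slots - map_slots


total = hardware_threads(SOCKETS, CORES_PER_SOCKET, THREADS_PER_CORE)
map_slots, reduce_slots = split_slots(total)
print(total, map_slots, reduce_slots)
```

In Hadoop releases of that period, per-node slot counts like these were set with the `mapred.tasktracker.map.tasks.maximum` and `mapred.tasktracker.reduce.tasks.maximum` properties. Whether one task per hardware thread is actually the best setting depends on memory and I/O, which is what the webinar says it evaluates.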
