Chi Hung Chan: Hadoop Summit and Data-Intensive Computing Symposium Videos and Slides

Lots of videos and slides on scalability and performance, all well worth your time.


Hadoop Summit - March 25, 2008 - The Hadoop Summit brought together leaders from the Hadoop developer and user communities for the first time. Apache Hadoop, an open-source distributed computing project of the Apache Software Foundation, is a distributed file system and parallel execution environment that enables its users to process massive amounts of data. Slides from the presentations are available below, and video will be coming soon.

Hadoop Overview: Doug Cutting / Eric Baldeschwieler, Yahoo! - Slides - Video
Pig: Chris Olston, Yahoo! - Slides
JAQL: Kevin Beyer, IBM - Slides - Video
DryadLINQ: Michael Isard, Microsoft - Slides - Video
Monitoring Hadoop using X-Trace: Andy Konwinski, UC Berkeley - Slides - Video
ZooKeeper: Ben Reed, Yahoo! - Slides - Video
HBase: Michael Stack, Powerset - Slides - Video
HBase at Rapleaf: Bryan Duxbury, Rapleaf - Slides - Video
Hive: Joydeep Sen Sarma / Ashish Thusoo, Facebook - Slides - Video
GrepTheWeb – Hadoop on AWS: Jinesh Varia, Amazon - Slides - Video
Building Ground Models of Southern California: Steve Schlosser / David O'Hallaron, Intel / CMU - Slides - Video
Online search for engineering design content: Mike Haley, Autodesk - Slides - Video
Yahoo – Webmap: Christian Kunz, Yahoo! - Slides - Video
Natural Language Processing: Jimmy Lin, U of Maryland / Christophe Bisciglia, Google - Slides - Video
Panel on future directions: Sameer Paranjpye, Sanjay Radia, Owen O’Malley (Yahoo), Chad Walters (Powerset), Jeff Eastman (Mahout) - Video

Data-Intensive Computing Symposium - March 26, 2008 - Hosted by Yahoo! and the CCC, the Data-Intensive Computing Symposium brought together experts in system design, programming, parallel algorithms, data management, scientific applications, and information-based applications to better understand existing capabilities in the development and application of large-scale computing systems, and to explore future opportunities. Slides from the presentations are available below, and video will be coming soon.

Data-Intensive Scalable Computing: Randy Bryant, Carnegie Mellon - Slides - Video
Text Information Management: Challenges and Opportunities: ChengXiang Zhai, University of Illinois at Urbana-Champaign - Slides - Video
Clouds and ManyCore: The Revolution: Dan Reed, Microsoft Research - Slides - Video
Computational Paradigms for Genomic Medicine: Jill Mesirov, Broad Institute of MIT and Harvard - Video
Simplicity and Complexity in Data Systems at Scale: Garth Gibson, Carnegie Mellon - Slides - Video
Handling Large Datasets at Google: Current Systems and Future Directions: Jeff Dean, Google - Slides - Video
Algorithmic Perspectives on Large-Scale Social Network Data: Jon Kleinberg, Cornell - Slides - Video
Mining the Web Graph: Marc Najork, Microsoft Research - Slides - Video
"What" Goes Around: Joe Hellerstein, U.C. Berkeley - Video
Sherpa: Hosted Data Serving: Raghu Ramakrishnan, Yahoo! Research - Slides - Video
Scientific Applications of Large Databases: Alex Szalay, Johns Hopkins - Slides - Video
Data-Rich Computing: Where It's At: Phil Gibbons, Intel Research - Slides - Video
NSF Plans for Supporting Data Intensive Computing: Jeannette Wing, NSF - Slides - Video
The Google/IBM data center: Christophe Bisciglia, Google - Video
The Computing Community Consortium: Stimulating Bigger Thinking: Ed Lazowska, University of Washington and CCC - Slides

Labels: performance

Chi Hung Chan

Thursday, April 24, 2008
