Thursday, April 24, 2008

Hadoop Summit and Data-Intensive Computing Symposium Videos and Slides

Lots of videos and slides on scalability and performance. Definitely worth your time.

Hadoop Summit - March 25, 2008 - The Hadoop Summit brought together leaders from the Hadoop developer and user communities for the first time. Apache Hadoop, an open-source distributed computing project of the Apache Software Foundation, is a distributed file system and parallel execution environment that enables its users to process massive amounts of data. Slides from the presentations are available below, and video will be coming soon.

  1. Hadoop Overview: Doug Cutting / Eric Baldeschwieler, Yahoo! - Slides - Video
  2. Pig: Chris Olston, Yahoo! - Slides
  3. JAQL: Kevin Beyer, IBM - Slides - Video
  4. DryadLINQ: Michael Isard, Microsoft - Slides - Video
  5. Monitoring Hadoop using X-Trace: Andy Konwinski, UC Berkeley - Slides - Video
  6. ZooKeeper: Ben Reed, Yahoo! - Slides - Video
  7. HBase: Michael Stack, Powerset - Slides - Video
  8. HBase at Rapleaf: Bryan Duxbury, Rapleaf - Slides - Video
  9. Hive: Joydeep Sen Sarma / Ashish Thusoo, Facebook - Slides - Video
  10. GrepTheWeb: Hadoop on AWS: Jinesh Varia, Amazon - Slides - Video
  11. Building Ground Models of Southern California: Steve Schlosser / David O'Hallaron, Intel / CMU - Slides - Video
  12. Online search for engineering design content: Mike Haley, Autodesk - Slides - Video
  13. Yahoo – Webmap: Christian Kunz, Yahoo! - Slides - Video
  14. Natural Language Processing: Jimmy Lin, U of Maryland / Christophe Bisciglia, Google - Slides - Video
  15. Panel on future directions: Sameer Paranjpye, Sanjay Radia, Owen O’Malley (Yahoo!), Chad Walters (Powerset), Jeff Eastman (Mahout) - Video

Data-Intensive Computing Symposium - March 26, 2008 - Hosted by Yahoo! and the CCC, the Data-Intensive Computing Symposium brought together experts in system design, programming, parallel algorithms, data management, scientific applications, and information-based applications to better understand existing capabilities in the development and application of large-scale computing systems, and to explore future opportunities. Slides from the presentations are available below, and video will be coming soon.

  1. Data-Intensive Scalable Computing: Randy Bryant, Carnegie Mellon - Slides - Video
  2. Text Information Management: Challenges and Opportunities: ChengXiang Zhai, University of Illinois at Urbana-Champaign - Slides - Video
  3. Clouds and ManyCore: The Revolution: Dan Reed, Microsoft Research - Slides - Video
  4. Computational Paradigms for Genomic Medicine: Jill Mesirov, Broad Institute of MIT and Harvard - Video
  5. Simplicity and Complexity in Data Systems at Scale: Garth Gibson, Carnegie Mellon - Slides - Video
  6. Handling Large Datasets at Google: Current Systems and Future Directions: Jeff Dean, Google - Slides - Video
  7. Algorithmic Perspectives on Large-Scale Social Network Data: Jon Kleinberg, Cornell - Slides - Video
  8. Mining the Web Graph: Marc Najork, Microsoft Research - Slides - Video
  9. "What" Goes Around: Joe Hellerstein, U.C. Berkeley - Video
  10. Sherpa: Hosted Data Serving: Raghu Ramakrishnan, Yahoo! Research - Slides - Video
  11. Scientific Applications of Large Databases: Alex Szalay, Johns Hopkins - Slides - Video
  12. Data-Rich Computing: Where It's At: Phil Gibbons, Intel Research - Slides - Video
  13. NSF Plans for Supporting Data Intensive Computing: Jeannette Wing, NSF - Slides - Video
  14. The Google/IBM data center: Christophe Bisciglia, Google - Video
  15. The Computing Community Consortium: Stimulating Bigger Thinking: Ed Lazowska, University of Washington and CCC - Slides


