Cascading and Common Big Data Problems

ÜberConf

Denver · July 12 - 15, 2011

About this Presentation

This session will briefly introduce the Cascading open-source project and show how it was used in various projects to overcome problems and bottlenecks particular to large-scale data analytics.

This presentation will cover five different high-profile Hadoop and Cascading projects and the lessons learned from them, then identify the architectural components they have in common. We will then present a summary of Hadoop and its architecture to show why Hadoop was a key technology for these projects, and to outline the design decisions architects should consider when beginning a new Hadoop project.

Author of Cascading Data Processing Open Source Project

Chris Wensel is the founder of Concurrent, Inc., and the author of the Cascading data processing open-source project, an alternative API to MapReduce for Apache Hadoop.

He also co-founded Scale Unlimited, the first professional services and training company focused on Hadoop and “Big Data”, where he mentored and trained companies like Sun Microsystems, Apple, and numerous startups in the Bay Area.

Chris bootstrapped his first Internet startup in the early '90s, creating an early Web server-side scripting language used in the real estate and insurance verticals. During the late '90s, Chris focused on distributed agent-based systems, for which he received several patents on distributed computing. From there he became Chief Architect for the fastest-growing business unit at Thomson Reuters. Just prior to Concurrent, Chris was a Consulting Architect to the TeleAtlas geo-content management group in Belgium.

Chris also advises several startups in the “Big Data” and “Big Audience” technology space.