
Rocky Mountain Software Symposium

Denver · November 19 - 21, 2010

Matthew McCullough

Training Innovator, GitHub

Matthew McCullough is an energetic 15-year veteran of enterprise software development and open source education, and a co-founder of Ambient Ideas, LLC, a Denver consultancy. He is currently VP of Training at GitHub.com, author of the Git Master Class series for O'Reilly, a speaker at over 30 national and international conferences, author of three of the top 10 DZone RefCards, and President of the Denver Open Source Users Group. His current research centers on project automation: build tools (Gradle), distributed version control (Git, GitHub), continuous integration (Jenkins, Travis), and quality metrics (Sonar). Matthew resides in Denver, Colorado with his beautiful wife and two young daughters, who are active in nearly every outdoor activity Colorado has to offer.

Presentations

Cryptography on the JVM: Boot Camp

Does your application transmit customer information? Are there fields of sensitive customer data stored in your DB? Can your application be used on insecure networks? If so, you need a working knowledge of encryption and how to leverage Open Source APIs and libraries to make securing your data as easy as possible. Cryptography is quickly becoming a developer's new frontier of responsibility in many data-centric applications.
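As a taste of what the boot camp covers, the JDK already ships with the Java Cryptography Extension (JCE), so a symmetric round trip takes only a few lines. A minimal sketch, assuming a throwaway 128-bit AES key; in a real application the key would come from a keystore rather than being generated per run:

    import javax.crypto.Cipher;
    import javax.crypto.KeyGenerator;
    import javax.crypto.SecretKey;
    import javax.crypto.spec.IvParameterSpec;

    public class AesRoundTrip {
        public static void main(String[] args) throws Exception {
            // Generate a one-off 128-bit AES key (stock JDKs permit 128-bit
            // keys without the unlimited-strength policy files)
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            SecretKey key = keyGen.generateKey();

            // Encrypt with CBC mode and PKCS5 padding
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);
            byte[] iv = cipher.getIV(); // random IV chosen by the provider
            byte[] cipherText =
                cipher.doFinal("card=4111-1111-1111-1111".getBytes("UTF-8"));

            // Decrypt with the same key and IV
            cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
            System.out.println(new String(cipher.doFinal(cipherText), "UTF-8"));
        }
    }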

Encryption on the JVM: Advanced Techniques

Now that you have the basics of encryption under your belt, we'll move on to where it is sensible and performant to add this level of security to your application. Symmetric-key and public-key encryption carry very different levels of processing overhead, so you can't just blindly use the “best” encryption out there. What about password hashes? Did you know they are vulnerable without “salt”?
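The fix the abstract alludes to is a unique random salt per password plus a slow, iterated hash; without a per-user salt, identical passwords hash identically and precomputed rainbow tables apply. A sketch using the JDK's built-in PBKDF2 implementation (the iteration count and salt length here are arbitrary illustrative values, not recommendations from the session):

    import java.security.SecureRandom;
    import javax.crypto.SecretKeyFactory;
    import javax.crypto.spec.PBEKeySpec;

    public class SaltedHash {
        // Hash a password with a fresh random salt using PBKDF2
        // (available in Java 6+). Store the salt alongside the hash.
        public static byte[][] hash(char[] password) throws Exception {
            byte[] salt = new byte[16];
            new SecureRandom().nextBytes(salt);
            PBEKeySpec spec = new PBEKeySpec(password, salt, 10000, 256);
            SecretKeyFactory f =
                SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
            byte[] hash = f.generateSecret(spec).getEncoded();
            return new byte[][] { salt, hash };
        }
    }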

Hadoop: Divide and Conquer Gigantic Datasets (Intro)

Moore's law has finally hit the wall, and CPU clock speeds have actually decreased in the last few years. The industry is reacting with hardware that carries an ever-growing number of cores and with software that can leverage “grids” of distributed, often commodity, computing resources. But how is a traditional Java developer supposed to easily take advantage of this revolution? The answer is the Apache Hadoop family of projects. Hadoop is a suite of Open Source APIs at the forefront of this grid computing revolution and is considered the absolute gold standard for the divide-and-conquer model of distributed problem crunching. The well-travelled Apache Hadoop framework is currently being leveraged in production by prominent names such as Yahoo, IBM, Amazon, Adobe, AOL, Facebook, and Hulu, just to name a few.

Hadoop: Divide and Conquer Gigantic Datasets (Advanced)

With the basics of Hadoop under your belt, we'll dig into the depths of this amazing framework by writing our own reducer in Java and deploying it to the cluster. Next, we'll explore DSLs like Pig and its log-file processing cousin, Chukwa. Since grid topology is intentionally very opaque in Hadoop, we'll look at the benefits of that design and at how to achieve a properly tuned cluster with replication. Specific to HDFS, we'll tune the configurable parameters for storage redundancy and block sizes.
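To ground the reducer-writing portion, here is a sketch of the reduce side of the word-count job from the intro abstract, again against the org.apache.hadoop.mapreduce API and not the session's own code:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    // The "conquer" half: sum every count emitted for a given word.
    public class WordCountReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text word, Iterable<IntWritable> counts,
                Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable count : counts) {
                sum += count.get();
            }
            context.write(word, new IntWritable(sum));
        }
    }

The HDFS knobs mentioned above correspond to configuration keys such as dfs.replication and dfs.block.size, normally set in hdfs-site.xml but also settable programmatically; the values below are illustrative only:

    import org.apache.hadoop.conf.Configuration;

    public class HdfsTuning {
        public static Configuration tuned() {
            Configuration conf = new Configuration();
            conf.setInt("dfs.replication", 3);    // copies kept of each block
            conf.setLong("dfs.block.size",
                    128L * 1024 * 1024);          // 128 MB block size
            return conf;
        }
    }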