Mark Johnson

Northern Virginia Software Symposium

Reston · April 17 - 19, 2015

Director Consulting @ Hortonworks

Mark Johnson is a Director of Consulting at Hortonworks, where he spends his day helping people achieve value from their large and complex data repositories. Mark has worked with a wide range of technologies during his career, and most recently he has focused on the Hadoop ecosystem. Mark is active in the software community as the President of the New England Java Users Group (NEJUG) and as a regular presenter at user groups and conferences. When not working, Mark can be found riding his mountain bike on local trails and playing with his family.

Presentations

Getting started with Hadoop

Apache Hadoop is a powerful and sometimes complex tool for dealing with Big Data and high-throughput data applications. It can enable some existing applications to finally run properly, and it can open doors to entirely new types of applications and analysis. So how does one get started with Hadoop? This presentation explores introductory aspects of the Hadoop infrastructure, data sources, query strategies, and planning so you can get started with Hadoop.

Applying Testing Techniques for Big Data and Hadoop

More and more companies are relying on timely and accurate analytics from their “Big Data” systems. Unfortunately, the testing toolsets and concepts common in other technical disciplines are lacking. A common and incorrect perception exists in “Big Data” that proper testing is impossible due to dataset size and tool complexity. This overview session will examine a sampling of the Hadoop testing tools and processes that you can use on your projects today.

Introduction to Apache Pig Latin Programming

Apache Pig Latin is a high-powered and easy-to-learn data flow language available for the Hadoop ecosystem. With Pig it is possible to perform various ETL-type data processing activities on traditional structured, hierarchically structured, and unstructured data. In this presentation we will examine some of the capabilities of Pig Latin so you can get started with your ETL activities in Hadoop.