Speaker Topics - No Fluff Just Stuff

Introduction to Apache Pig Latin Programming

Apache Pig Latin is a high-powered, easy-to-learn data flow language for the Hadoop ecosystem. With Pig you can perform a variety of ETL-type data processing activities on traditional structured, hierarchically structured, and unstructured data. In this presentation we will examine some of the capabilities of Pig Latin so you can get started with your ETL activities in Hadoop.

During the presentation we will illustrate many of the Pig Latin features with code examples. In addition, we will cover Pig debugging techniques and how to examine process logs, to help you make the most of your development time.


About Mark Johnson

Mark Johnson is a Director of Consulting at Hortonworks, where he spends his day helping people get value from their big and complex data repositories. Mark has worked with a wide range of technologies during his career. Most recently he has focused on the Hadoop ecosystem. Mark is active in the software community as the President of the New England Java Users Group (NEJUG) and as a regular presenter at user groups and conferences. When not working, Mark can be found riding his mountain bike on local trails and playing with his family.
