Hadoop Developer Inhouse Training

Over the past few years, Hadoop has established itself as the de facto standard for analysing large and very large quantities of data. However, Hadoop poses several challenges for developers. First, raw data is handled completely differently than it was in earlier systems. Second, developing MapReduce programs requires a rethink compared with familiar functional or object-oriented programming.

Furthermore, an entire ecosystem of technologies covering a wide range of application areas has grown up around the "simple" MapReduce tool Hadoop. These areas include distributed data storage, data exploration and analysis, and automated classification and forecasting.

This training course provides a detailed introduction to Hadoop MapReduce and HDFS. It gives participants the knowledge they need to design and implement complex MapReduce algorithms themselves, and to use the technologies of this ecosystem efficiently. Practical exercises are given priority: during the course, all participants develop their own data analysis applications using the techniques and tools presented.


  • Hadoop Basics (MapReduce and HDFS, APIs: Streaming and Java)
  • Architecture of Hadoop Clusters (NameNode and DataNodes, JobTracker and TaskTrackers, Future Prospects: YARN (MapReduce 2))
  • Advanced MapReduce Techniques: Sorting, Joins, Debugging Techniques
  • Use Case: e.g. Analysing Map Data
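
To give a flavour of the programming model behind the first topic, the following is a minimal sketch of a word count in the MapReduce style. It is written in plain Java and deliberately uses no Hadoop classes, so it only imitates the map, shuffle and reduce phases in a single process. The class name WordCountSketch and its method names are illustrative assumptions, not part of the Hadoop API or of the course material.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java imitation of the MapReduce phases (hypothetical names, no Hadoop API).
public class WordCountSketch {

    // Map phase: turn one input line into a list of (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by key, with keys kept sorted.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse all values for one key into a single count.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: run the three phases in sequence over all input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {data=2, hadoop=2, processes=1, stores=1}
        System.out.println(run(List.of("Hadoop stores data", "Hadoop processes data")));
    }
}
```

In a real Hadoop job, the map and reduce steps would run as separate tasks on different cluster nodes and the framework would perform the grouping between them. The sketch only shows how the logic is split into independent per-record and per-key functions.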

Target Audience: Java developers

Duration: 3 days