Apache Spark Training

The training sessions are usually held in German. Please contact us if you are interested in training sessions in English. 

Training on using the Apache Spark framework for real-time data analysis.

Apache Spark Training at inovex

Target audience: Analysts, software architects, software developers
Length: 2 days 
Dates: Available upon request
Times: 9 am – 5 pm 
Number of participants: min. 3, max. 12 
Price: 1,200 euros plus VAT

Thanks to its performance as a distributed in-memory technology, Apache Spark has quickly established itself in the big data ecosystem for both batch and stream processing.

This training course provides an introduction to Spark as a tool for analysing large volumes of data and covers both batch and stream processing. The course emphasises the formulation of analytical queries and the use of machine learning methods. Participants are given specific business requirements and introduced to the architecture, techniques and tools they will need to meet them.

This course is heavily practice-focused. It centres on a complex database on which the participants practise the methods, tools and techniques covered.

Agenda: 

  • Spark basics and architecture
  • Spark APIs and the RDD data structure
  • Formulating queries with Spark SQL
  • Transformations and actions in the Spark context
  • Zeppelin as a Spark frontend
  • Machine learning with Spark MLlib
  • Overview of the Apache Spark ecosystem
  • Designing Spark architectures for implementing specific use cases

Notes:

  • The course fee includes training materials, certificates of participation, lunches, drinks and snacks.
  • Participants must bring their own laptops to the training sessions.

Instructors (depending on dates): 

Hans-Peter Zorn is a big data scientist at inovex. He specialises in big data architectures, Hadoop security, machine learning, and data-driven products. Previously, he worked in the UKP Lab at the Technical University (TU) of Darmstadt, where he used Hadoop to analyse large volumes of text.

Dr Dominik Benz works for inovex as a big data engineer. His responsibilities include test-driven development of big data applications and the implementation of ETL processes based on Hadoop technologies (Hive, HBase), as well as their integration into traditional business intelligence environments.

Dr Robin Senge is a senior big data scientist at inovex. As a machine learning expert, he designs and implements ad-hoc data analyses and data-driven use cases based on Apache Spark (among other platforms).