
Apache Spark for Data Scientists
Training on the Apache Spark framework for real-time data analysis.
Training sessions are usually held in German. Please contact us if you are interested in sessions held in English.
Whether for batch or stream processing, Spark has quickly established itself in the big data ecosystem thanks to its performance as a distributed in-memory technology.
This training course is aimed primarily at data scientists. It explains Spark’s underlying structure and architecture, as well as how to use the Spark ecosystem’s powerful frontend tools to perform analyses.
The course also emphasises machine learning. After a general introduction, Spark MLlib is covered in detail. This library provides a number of powerful ‘out of the box’ machine-learning algorithms.
This course is heavily practice-focused. It centres on a complex database on which the participants use Python to practise methods, tools and techniques.
Day 1 — Spark
Day 2 — Machine Learning
Day 3 — Machine Learning in Practice
Note: