Apache Spark for Data Scientists

Target group: Data Scientists
* Course details may differ for on-site events.

Training on the Apache Spark framework for real-time data analysis.

Training sessions are usually held in German. Please contact us if you are interested in sessions held in English.

Whether for batch or stream processing, Spark has quickly and firmly established itself in the big data tools ecosystem, thanks to its performance as a distributed in-memory technology.

This training course is aimed primarily at Data Scientists. It explains Spark’s underlying structure and architecture, as well as how to use the Spark ecosystem’s powerful frontend tools to perform analyses.

The course also emphasises Machine Learning. After a general introduction, Spark MLlib is covered in detail. This library puts a number of powerful ‘out of the box’ machine-learning algorithms at the user’s disposal.

This course is heavily practice-focused. It centres on a complex database on which the participants use Python to practise methods, tools and techniques.

Agenda:

Day 1 — Spark

  • Introduction to Apache Spark
  • Introduction to Apache Zeppelin
  • Spark API and RDDs
  • Key/Value RDD and joins
  • Spark SQL and dataframes/datasets

Day 2 — Machine Learning

  • Introduction to Machine Learning
    • Supervised / unsupervised learning
    • Feature extraction
    • Validation

Day 3 — Machine Learning in Practice

  • Overview of models, algorithms and their areas of application
  • Data preparation and processing
  • Machine learning in practice
  • Using Spark ML on a large database

Note:

  • The course fee includes training materials, certificates of participation, lunches, drinks and snacks.
  • Participants must bring their own laptop to the training sessions.

Your trainers:

Hans-Peter Zorn

Big Data Scientist

Dr. Robin Senge

Senior Big Data Scientist

Dr. Dominik Benz

Head of Machine Learning Engineering

Get in touch!

Collin Rogowski

Head of inovex Academy