Data Processing with Spark (Batch & Stream) Training

In this hands-on course, participants will learn how modern lakehouse architectures can be built in the Databricks Cloud using Spark (processing) and Delta Lake (storage).


At a glance

General information

  • 2 days 
  • in Karlsruhe or remote
  • German or English

Target group

Software developers with basic knowledge of Python, Jupyter notebooks, and working with data (e.g., SQL, DataFrames, etc.)

Application examples

  • Providing scalable analyses and dashboards on top of large data volumes
  • Developing streaming-based data applications, e.g., for processing high-volume sensor or motion data

Description

This training course teaches the basics of the scalable data processing engine Apache Spark and the cloud platform Databricks. Combined, they enable the development of high-performance batch and stream-based applications for analyzing and transforming large amounts of data.

All concepts are introduced theoretically and then reinforced through exercises in a prepared Databricks environment. The focus is on both a good technical understanding and practical implementation, so that participants are immediately able to use the technologies covered in their own projects after the training.

Agenda

  • Introduction to the basics and architecture of Apache Spark
  • Data transformations with Spark SQL and Spark DataFrames
  • Databricks Lakehouse Architecture & Unity Catalog
  • Databricks Workspaces, Notebooks, Clusters, and Workflows
  • Delta Lake and optimized data storage
  • Spark Structured Streaming
  • Stateful Streaming with Watermarks

Typical questions we answer:

  • What advantages does Spark offer over other approaches?
  • For which use cases are streaming architectures useful?
  • How do data transformations work with Spark?
  • What is Delta Lake, and when is it best used?
  • How can stateful streaming be implemented in Spark?
  • How is Spark best used in the Databricks environment?
  • What is a lakehouse architecture?
  • Signed certificate of participation
  • In-house training
  • Customization available (agenda, tech stack, language, etc.)
  • Small training groups

Why inovex Academy?

Our offer

The inovex Academy is dedicated to passing on knowledge about methods and technologies that we already use successfully in our own projects.

Curated content

Our trainers create a customized training offer based on your requirements.

Customizable tech stack

In exclusive trainings, we can incorporate your tech stack into the training content.

Individual assistance

If needed, we can tailor the training to a specific use case of your company and work directly based on your data.

Trainers

Our trainers are field-tested experts in their fields. Through their project work, they expand their knowledge day by day and pass this know-how on in their trainings: application-oriented and hands-on.


Simon Bachstein

Databricks Certified Data Engineer Professional
Databricks Certified Associate Developer for Apache Spark
Professional Scrum Product Owner 1
Since 2019, Simon Bachstein, a data engineer with a background in mathematics, has not only been developing smart and innovative data products, but also designing data landscapes with a focus on quality, efficiency, security, and user-friendliness. As a trainer, Simon enjoys imparting a deep understanding of the technology, but never loses sight of practical applications and seeks to engage in dialogue about specific problems.

Our training approach

From the needs analysis to the awarding of certificates, we offer customized training courses, flexibly designed and carried out according to your requirements.

If you are interested in in-house training, we will start by identifying your needs and discussing your objectives. This discussion forms the basis for an initial offer.

Once the framework details have been agreed, our trainers begin adapting the training content. Many of our training courses are modular, so the agenda can be designed flexibly. Training courses that prepare for certifications are less flexible, but even there you can set the content focus according to your wishes.

You will receive all relevant information ahead of the training. The training then takes place at the venue of your choice and at the agreed time; our trainers will adapt to your requirements.

After completing the training, all participants receive a certificate confirming their participation. You will also have the opportunity to give us feedback on the content and the course. We are always happy to receive praise and suggestions for improvement.

Frequently Asked Questions

What do I need for this training?
All you need for the training is your own laptop with a web browser. This will be used to access the web-based Python development environment provided.
What types of exercises are there?
The exercises take place in a Databricks/Spark environment provided specifically for this purpose. The prepared exercises allow participants to practice the concepts discussed by performing realistic development tasks in the Databricks/Spark environment.
Do I need my own Databricks account for the training?
No, Databricks access is provided by the inovex Academy.

I look forward to your inquiry.

Collin Rogowski

We are your partner for successful trainings

We would be happy to talk to you personally about your needs. Get in touch now!

Collin Rogowski
Head of inovex Academy
  • Individual training offer for your company
  • Over 25 years of experience as inovex Academy