Data Engineering Challenges: Working with Personal Data

Target group: Data Engineers, Data Scientists, Product Owners

The General Data Protection Regulation (GDPR) took effect in May 2018 with the aim of safeguarding and standardizing the data protection rights of EU citizens. It provides companies and organizations with a legal framework that defines how they can collect, process, and store personal data.
To avoid having to make time-consuming corrections later, it is important to consider the legal data protection requirements when designing your own data platform.
Although the various provisions of the GDPR can be perceived as barriers to innovative solutions, current data engineering approaches and concepts offer excellent opportunities to extract added value from company data in a legally compliant way.

Based on the general requirements of the GDPR as they apply to personal data, this training course teaches participants methodological and technical concepts for handling anonymization, deletion requests, retention periods, and metadata management. The course takes a hands-on approach that lets participants try out concepts that have proven themselves in real-world production projects. In each case, the course focuses on a specific use case and its corresponding architecture.

After completing this course, participants will be able to assess their data products for potential GDPR-related issues and use targeted technologies to overcome these challenges on a cloud platform.


  • GDPR & Requirements for Data Engineers
  • Deletion Concepts & Architectures
    • Handling of personal data (PII, data privacy)
    • Encryption/anonymization (hashing, SHA-256, k-anonymity)
    • Privacy-aware table design (metadata, tagging, schemas)
    • Retention policies, archival processes
  • Hands-On Implementation
    • Big data updates & deletes using Spark & Delta Lake
    • Metadata management using data catalogues (data discovery)
  • Data Governance
    • Mandatory processes within your organization
    • IAM & rights concept
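
To make the hashing and k-anonymity topics in the outline more concrete, here is a minimal Python sketch; it is not taken from the course material. It assumes a keyed SHA-256 hash (HMAC) for replacing direct identifiers and a simple group-size check for k-anonymity. The key, the function names, the column names, and the sample records are all hypothetical. Note that hashing an identifier is generally regarded as pseudonymization rather than anonymization, since whoever holds the key can re-link the values.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key for illustration; in practice it would come from a secret manager.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest (HMAC) of a normalized identifier."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def k_anonymity(rows: list[dict], quasi_identifiers: list[str]) -> int:
    """Return the size of the smallest group of rows that share
    the same combination of quasi-identifier values."""
    groups = Counter(tuple(row[col] for col in quasi_identifiers) for row in rows)
    return min(groups.values()) if groups else 0


# Made-up sample records with already generalized quasi-identifiers.
records = [
    {"email": "anna@example.com", "zip": "761**", "age_band": "30-39"},
    {"email": "ben@example.com", "zip": "761**", "age_band": "30-39"},
    {"email": "cara@example.com", "zip": "801**", "age_band": "40-49"},
]

# Replace the direct identifier, then measure k over the quasi-identifiers.
pseudonymized = [{**r, "email": pseudonymize(r["email"])} for r in records]
k = k_anonymity(pseudonymized, ["zip", "age_band"])
```

In this sample, one combination of `zip` and `age_band` occurs only once, so the check reports that the data set would need further generalization or suppression before it satisfies a threshold of k = 2.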

The training environment will be set up on Google Cloud Platform (GCP). However, the technologies used can also be deployed on other public clouds, or comparable services are available there.

The training is not legal advice.

Kolja Maier

Data & Machine Learning Engineer

Marcel Spitzer

Big Data Scientist