A Case for Isolated Virtual Environments with PySpark

2020-09-16T15:00:21+00:00

This blog post motivates the use of virtual environments with Python and then shows how they can be a handy tool when deploying PySpark jobs to managed clusters.

Improving the Customer Journey with Behavioral Economics & Intelligent Technology: Your Questions Answered!

2020-09-08T14:12:15+00:00

At our online meetup on improving the customer journey with behavioral economics and intelligent technology, some questions remained unanswered. We collected them and had the speakers answer them.

Federated Learning: A Guide to Collaborative Training with Decentralized Sensitive Data – Part 1

2020-08-10T11:42:29+00:00

This blog post explains how Federated Learning works and what privacy techniques are necessary to ensure that sensitive data is protected.

Nowadays, access to high-quality real-world data has a major impact on the success of data-driven projects, as the quality of a Machine Learning solution strongly dep

Dive into Snorkel: Weak-Supervision on German Texts

2020-08-03T16:20:46+00:00

How do we proceed if we have almost no labeled data for a machine learning model? One answer may be: combining all the knowledge we have in one framework to get the best of each world. This blog post investigates the trending data programming framework Snorkel for the task of detecting bad language in German texts.

How do we proceed if we have almost no labeled data for a machine learning model? One answer may be: combining all the knowledge we have (labeled data, distant superv

Personalization with Recommender Systems FAQ: Your Questions Answered

2020-07-27T12:27:35+00:00

At our meetup on the role of recommender systems in omnichannel marketing, some questions remained open. We have collected them here and had our experts answer them.

Automated Feature Engineering with Open-Source Libraries

2020-07-20T16:56:10+00:00

This review of automated feature engineering with TPOT, auto-sklearn and autofeat is written in the hope of obtaining excellent features without requiring domain experts to spend days engineering them.

Causal Inference in Campaign Targeting

2020-05-13T16:18:46+00:00

In this article I will work through a synthetic example to show the efficacy of causal inference in marketing campaign targeting.

The following is one of two posts published alongside the JustCause framework, which we developed at inovex as a tool to foster good scientific practice in the field

Causal Inference: Introduction to Causal Effect Estimation

2020-03-23T12:55:32+00:00

Recently, there has been a surge in interest in Causal Inference. It is, however, not always clear what is meant by the term and what the respective methods can actually do. This post gives a high-level overview of the two major schools of Causal Inference and then dives deep into the basics of one of them.

Recently, there has been a surge in interest in what is called Causal Inference. It i

3D Deep Learning with TensorFlow 2

2020-03-09T17:36:48+00:00

In this blog post, we will first have a look at 3D deep learning with PointNet. Its creators provide a TensorFlow 1.x implementation of PointNet on GitHub, but since TensorFlow 2.0 has been released in the meantime, we will transform it into an idiomatic TensorFlow 2 implementation in the second part of this post.

The world that we interact with each and every day is three-dimensional, but the majority of deep learning models process visual data as 2D images. However, there are

Frameworks for Machine Learning Model Management

2019-04-04T10:02:24+00:00

This blog post compares three tools developed to support reproducible machine learning model development: MLflow, developed by Databricks (the company behind Apache Spark); DVC, a software product of the London-based startup iterative.ai; and Sacred, an academic project developed by several researchers.

In my previous blog post “How to Manage Machine Learning Models” I explained the difficulties within the process of developing a good machine learning mod

Machine Learning Interpretability: Do You Know What Your Model Is Doing?

2019-04-02T13:36:57+00:00

Unlike usual performance metrics, fairness, safety and transparency in machine learning models are much harder if not impossible to quantify. Here are some techniques (and examples) to provide interpretability, to make decision systems understandable not only for their creators, but also for their customers and users.

Machine learning has a great potential to improve data products and business processes. It is used to propose products and news articles that we might be interested i

Working efficiently with Jupyter Notebooks

2018-11-20T11:31:51+00:00

Having worked in the data science domain for quite some years, I have seen good Jupyter notebooks but also a lot of ugly ones. Follow these best practices to work more efficiently with your notebooks and strike the perfect balance between text, code and visualisations.

If you have ever done something analytical or anything closely related to data science in Python, there is just no way you have not heard of IPython or Jupyter not

From Exploration to Production—Bridging the Deployment Gap for Deep Learning (Part 2)

2019-04-02T13:47:25+00:00

In this blog post on deep learning model exploration, translation, and deployment, we expand on the previous article with two additional approaches for model deployment: TensorFlow Serving with Docker, and a rather hobbyist approach in which we build a simple web application that serves our model.

This is the second part of a series of two blogposts on deep learning model exploration, translation, and deployment. Both involve many technologies like PyTorch, Ten

How to Manage Machine Learning Models

2018-12-18T17:06:14+00:00

In the past few months, a slew of Machine Learning management platforms has emerged. In this article we have a look at ModelDB, which supports data scientists by keeping track of models, data sources and parameters. If you use scikit-learn or SparkML, it promises easy integration and offers additional visualisation tools.

Developing a good machine learning model is not straightforward, but rather an iterative process which involves many steps. Mostly Data Scientists start by building

From Exploration to Production — Bridging the Deployment Gap for Deep Learning

2018-10-01T14:49:13+00:00

This article introduces EMNIST. We develop and train models with PyTorch, translate them with the Open Neural Network eXchange format ONNX and serve them through GraphPipe. We will orchestrate these technologies to solve the task of image classification using the more challenging and less popular EMNIST dataset.

This is the first part of a series of two blogposts on deep learning model exploration, translation, and deployment. Both involve many technologies like PyTorch, Tens
