Florian Wilhelm

About Florian Wilhelm

My name is Florian Wilhelm and I am a Data Scientist living in Cologne, Germany. I currently work on innovative data science projects with experts every day at inovex. With more than five years of project experience in predictive and prescriptive analytics and big data, I have acquired in-depth knowledge of mathematical modelling, statistics, machine learning, high-performance computing and data mining. In recent years I have programmed mostly with the Python data science stack (NumPy, SciPy, Scikit-Learn, Pandas, Matplotlib, Jupyter, etc.), to which I have also contributed several extensions. Through my participation in many industry projects, I have also gained experience with the Hadoop stack, including Hive and Spark, as well as with R.

Multiplicative LSTM for sequence-based Recommenders

2018-08-21T21:57:45+00:00

Traditional user-item recommenders often neglect the dimension of time: they find a latent representation for each user based on the user’s historical item interactions, without any notion of the recency or order of those interactions. Sequence-based recommenders such as Multiplicative LSTMs tackle this issue.

Recommender systems support customers’ decision-making processes with personalized suggestions. They are widely used and influence the daily life of almost ever


Managing isolated Environments with PySpark

2018-04-10T13:30:43+00:00

In this article we present a simple solution for managing isolated environments with PySpark that we have been using in production for more than a year.

With the sustained success of the Spark data processing platform, even data scientists with a strong focus on the Python ecosystem can no longer ignore it. Fortunately


Data Science in Production: Packaging, Versioning and Continuous Integration

2018-02-07T14:53:36+00:00

Here's what changes when your data science project grows from a proof of concept. How do you deploy your model, how can updates be rolled out, ...?

A common pattern in most data science projects I participated in is that it’s all fun and games until someone wants to put it into production. From that point in time


Efficient UD(A)Fs with PySpark

2017-11-27T15:30:11+00:00

Nowadays, Spark surely is one of the most prevalent technologies in the fields of data science and big data. Luckily, even though it is developed in Scala and runs in


Causal Inference and Propensity Score Methods

2017-11-27T15:30:21+00:00

In supervised learning, correlation is crucial to predict the target variable with the help of the feature variables. But what good is causation?

In the field of machine learning and particularly in supervised learning, correlation is crucial to predict the target variable with the help of the feature variables


Hive UDFs and UDAFs with Python

2017-11-27T15:30:25+00:00

In this post we focus on how to write sophisticated User-Defined (Aggregate) Functions (UD(A)Fs) for Apache Hive in Python.

Sometimes the analytical power of built-in Hive functions is just not enough. In this case it is possible to write hand-tailored User-Defined Functions (UDFs) for tra
