A common pattern in most data science projects I have participated in is that it’s all fun and games until someone wants to put it into production. From that point in time [...]
About Florian Wilhelm
My name is Florian Wilhelm and I am a Data Scientist living in Cologne, Germany. Right now I enjoy working on innovative Data Science projects with experts every day at inovex. With more than five years of project experience in Predictive & Prescriptive Analytics and Big Data, I have acquired profound knowledge of mathematical modelling, statistics, machine learning, high-performance computing and data mining. For the last several years I have programmed mostly with the Python Data Science stack (NumPy, SciPy, Scikit-Learn, Pandas, Matplotlib, Jupyter, etc.), to which I have also contributed several extensions. Through my participation in many industry projects, I have also gained experience with the Hadoop stack, including Hive and Spark, as well as with R.
Nowadays, Spark surely is one of the most prevalent technologies in the fields of data science and big data. Luckily, even though it is developed in Scala and runs in [...]
Before we actually dive into this topic, imagine the following: You just moved to a new place and the time is ripe for a little house-warming dinner with your best fr [...]
In machine learning, and particularly in supervised learning, correlation is crucial for predicting the target variable with the help of the feature variables [...]
Sometimes the analytical power of built-in Hive functions is just not enough. In this case it is possible to write hand-tailored User-Defined Functions (UDFs) for tra [...]