The Analytics category covers both the classic data-driven business / BI topics (data warehouse, ETL, reporting, dashboards) and the newer trends in this field: big data, data science & deep learning, and search-based applications.

We see ourselves as specialists for demanding tasks in data management and analytics that have to be solved under time pressure and for which companies often have no in-house experts available:

  • modeling highly complex cubes,
  • integrating heterogeneous data sources,
  • handling very large data volumes efficiently (big data),
  • analyzing these data pools scientifically (data science), and
  • applying innovative search technologies in an enterprise context.

Transfer Learning for Text Classification with Siamese Networks

2019-11-21T12:28:53+00:00

In this blogpost, I want to present my master's thesis, which focused on transfer learning for text classification using Siamese Networks.

Text classification is a task in natural language processing (NLP) that assigns texts to given classes. With applications in sentiment analysis, spam detection or i

Uncertainty Quantification in Deep Learning

2019-10-09T14:17:31+00:00

Teach your Deep Neural Network to be aware of its epistemic and aleatory uncertainty. Get a quantified confidence measure for your Deep Learning predictions.

Artificial Intelligence—and machine learning in particular—have come a long way since their early beginnings. The widespread availability and affordability of powerfu

Multimodal Sequential Recommender Systems

2019-09-02T14:54:44+00:00

Sequential recommender systems are based on sequential user representations for a given user and sequence length. Each sequence consists of several items in temporal order. Sequential recommender systems aim at exploiting the temporal information that is hidden in the sequence of item interactions of the given user.

Since the invention of the internet, the availability and amount of information have increased steadily. Today we are facing problems of information overload and an ov

Turnilo: A Lightweight Frontend for Realtime Analytics Powered by Apache Druid

2019-08-20T09:20:21+00:00

In this post, we introduce Turnilo, explain its configuration and usage and share our evaluation outcome. For completeness, we also provide a list of current alternatives.

We frequently help our customers implement data platforms on a grand scale: as a backend for user-facing applications, for business analytics or data science and mach

Digitize your Receipts using Computer Vision

2019-08-08T13:00:31+00:00

In this article I describe the steps and approaches to image recognition for receipt digitalization using computer vision. This is the basic functionality behind apps such as Google Lens, Evernote, PaperScan and taggun.io.

“Would you like the receipt?”—It’s hard to say no to that. Not because you actually want it (you may even throw it in the trash before exiting the store), but because

Summarizing Long Texts with Seq2Seq Neural Networks

2019-07-08T09:24:24+00:00

We extend state-of-the-art sequence-to-sequence neural networks to summarize long texts across windows. By learning transitions, we are able to process arbitrarily long texts during inference.

This blog post describes my master's thesis "Abstractive Summarization for Long Texts". We’ve extended existing state-of-the-art sequence-to-sequence (Seq2Seq) neural net

Machine Learning Interpretability: Explaining Blackbox Models with LIME (Part II)

2019-06-04T16:14:33+00:00

The idea behind the model-agnostic technique LIME is to approximate a complex model locally by an interpretable model and to use that simple model to explain a prediction of a particular instance of interest.
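The local-approximation idea can be sketched in a few lines. This is a simplified one-dimensional illustration, not the API of the `lime` package: the function name `explain_locally`, the Gaussian sampling, the exponential proximity kernel and all parameter defaults are assumptions made for this example.

```python
import math
import random


def explain_locally(black_box, x0, n_samples=500, sample_std=0.5,
                    kernel_width=0.5, seed=0):
    """Fit a weighted straight line to `black_box` in the neighbourhood of x0.

    Returns (slope, intercept) of the local surrogate model.
    """
    rng = random.Random(seed)
    # 1. Perturb the instance of interest.
    xs = [x0 + rng.gauss(0.0, sample_std) for _ in range(n_samples)]
    # 2. Query the complex model on the perturbed points.
    ys = [black_box(x) for x in xs]
    # 3. Weight each sample by its proximity to x0 (assumed kernel).
    ws = [math.exp(-((x - x0) ** 2) / kernel_width ** 2) for x in xs]
    # 4. Weighted least squares for a line (closed form in one dimension).
    w_sum = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_mean = sum(w * y for w, y in zip(ws, ys)) / w_sum
    cov = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    intercept = y_mean - slope * x_mean
    return slope, intercept


# The surrogate's slope explains how the prediction reacts to the feature near x0.
slope, intercept = explain_locally(lambda x: x ** 2, x0=3.0)
```

For a black box such as `x ** 2`, the slope at `x0 = 3.0` should come out close to the local derivative; the surrogate says nothing about the model's behaviour far away from `x0`.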

This is the second part of our series about Machine Learning interpretability. We want to describe LIME (Local Interpretable Model-Agnostic Explanations), a popular t

Reinforcement Learning Walkthrough: Introduction (Part 1)

2019-05-15T14:00:26+00:00

This blog explains the basic concept of Reinforcement Learning, giving you an understanding of the closed loop system, in which an agent uses actions to change the state of the environment and thus receives rewards, with the goal of maximizing the return.
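The closed loop described here can be sketched as follows. The environment `CorridorEnv`, its reward values, the fixed policy and the discount factor `gamma` are hypothetical choices for illustration only; they are not taken from the post.

```python
class CorridorEnv:
    """Hypothetical toy environment: walk from position 0 to position `goal`."""

    def __init__(self, goal=3):
        self.goal = goal
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the position cannot drop below 0.
        self.state = max(0, self.state + action)
        done = self.state == self.goal
        reward = 1.0 if done else -0.1  # assumed reward scheme
        return self.state, reward, done


def run_episode(env, policy, max_steps=50):
    """Agent-environment loop: observe state, act, receive reward, repeat."""
    state = env.reset()
    rewards = []
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done = env.step(action)
        rewards.append(reward)
        if done:
            break
    return rewards


def discounted_return(rewards, gamma=0.9):
    """Return G = r_0 + gamma * r_1 + gamma^2 * r_2 + ..., computed backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


rewards = run_episode(CorridorEnv(), policy=lambda state: +1)
episode_return = discounted_return(rewards)
```

A learning agent would adjust its policy to increase this return; the sketch only uses a fixed "always go right" policy to show the loop itself.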

I would like to start this series about reinforcement learning by giving an overview of what reinforcement learning is, what it is used for and what terminology is ne

The Mystery of Entropy: How to Measure Unpredictability in Machine Learning

2019-05-13T11:34:47+00:00

Entropy is a significant, widely used and above all successful measure for quantifying, e.g., inhomogeneity, uncertainty or unpredictability. It is an integral part of the latest machine learning models deployed on real-world data sets. In this article, I want to highlight the simplicity, beauty and meaning of entropy.
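As a minimal sketch of the measure, assuming the Shannon entropy H = -Σ p_i log2(p_i) of an empirical class distribution is meant (the helper name `shannon_entropy` is chosen for this example):

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of `labels`."""
    counts = Counter(labels)
    total = len(labels)
    # Sum -p * log2(p) over the observed classes only, so p is never zero.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


balanced = shannon_entropy(["spam", "ham", "spam", "ham"])
pure = shannon_entropy(["spam", "spam", "spam", "spam"])
```

A perfectly balanced two-class sample is maximally unpredictable (one bit), whereas a sample with a single class carries no uncertainty at all.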

If you are dealing with Statistics, Data Science, Machine Learning, Artificial Intelligence or even general Computer Science, Mathematics, Engineering or Physics, you

What Does the Cloud Cost? (And Why Nobody Can Really Say Exactly)

2019-04-10T16:23:03+00:00

We are often asked this question in connection with cost estimates for projects whose infrastructure is based on public cloud services. Unfortunately, determining these costs is often a complex undertaking. In this blog article, I want to explain why.

We are often asked this question in connection with cost estimates for projects whose infrastructure is based on public cloud services. It is understandable,

Frameworks for Machine Learning Model Management

2019-04-04T10:02:24+00:00

This blog post compares three tools developed to support reproducible machine learning model development: MLflow, developed by Databricks (the company behind Apache Spark); DVC, a software product of the London-based startup iterative.ai; and Sacred, an academic project developed by various researchers.

In my previous blog post "how to manage machine learning models" I explained the difficulties within the process of developing a good machine learning mod

Deep Learning Fundamentals: Concepts & Methods of Artificial Neural Networks

2019-04-02T13:34:12+00:00

Everybody talks about AI and deep learning and everybody uses it, including you! But what exactly is deep learning and what are artificial neural networks? In this article I shine a light on some basic yet crucial concepts in an attempt to lift the veil.

Artificial intelligence or deep learning: Everybody talks about it and everybody uses it, including you! Of course you immediately have the evil terminator in mind wh

Price Prediction in Online Car Marketplaces using Natural Language Processing

2019-04-02T13:35:55+00:00

I use state-of-the-art NLP techniques to improve an existing pricing model in an online car market. Online car markets usually use technical car attributes for price prediction with sellers adding description texts to provide more details. In my thesis, I use these texts to improve the existing pricing model.

tl;dr: This blog post summarizes my master's thesis. I use state-of-the-art NLP techniques to improve an existing pricing model in an online car market. Online

Machine Learning Interpretability: Do You Know What Your Model Is Doing?

2019-04-02T13:36:57+00:00

Unlike usual performance metrics, fairness, safety and transparency in machine learning models are much harder if not impossible to quantify. Here are some techniques (and examples) to provide interpretability, to make decision systems understandable not only for their creators, but also for their customers and users.

Machine learning has a great potential to improve data products and business processes. It is used to propose products and news articles that we might be interested i

SeqPolicyNet: Querying Elasticsearch by Asking Questions about Movies

2019-04-02T13:39:46+00:00

This article presents SeqPolicyNet, our Deep Learning approach to accessing information stored in an Elasticsearch instance given natural language questions.

tl;dr (spoiler alert): We’ve trained an advanced neural network to query Elasticsearch based on natural language questions. Our model, called SeqPolicyNet, incorporat
