The Analytics category covers both the classic data-driven business / BI topics (data warehouse, ETL, reporting, dashboards) and the newer trends in this field: big data, data science & deep learning, and search-based applications.

We see ourselves as specialists for demanding tasks in data management and analytics that have to be solved under time pressure and for which companies often have no in-house experts available:

  • modeling highly complex cubes,
  • integrating heterogeneous data sources,
  • handling very large data volumes efficiently (big data),
  • analyzing these data pools scientifically (data science), and
  • applying innovative search technologies in an enterprise context.

Text Spotting using semi-supervised Generative Adversarial Networks

2019-04-02T17:41:50+00:00

Using semi-supervised Generative Adversarial Networks, we built a text spotting (OCR) pipeline that outperformed Google Cloud Vision.

Despite all advances in machine learning due to the advent of deep learning, the latter has one major shortcoming: It requires a lot of data during the learning proce


TensorFlow Mobile: Training and Deploying a Neural Network

2019-04-02T17:41:34+00:00

In this blog series we explain how you can train and deploy a convolutional neural network for image classification to a mobile app using TensorFlow Mobile.

Smart Assistants, fancy image filters in Snapchat and apps like Prisma all have one thing in common—they are powered by Machine Learning. The use of Machine Learning


Managing isolated Environments with PySpark

2018-04-10T13:30:43+00:00

In this article, we present a simple solution for managing isolated environments with PySpark that we have been using in production for more than a year.

With the sustained success of the Spark data processing platform even data scientists with a strong focus on the Python ecosystem can no longer ignore it. Fortunately


Application of Differential Privacy and Randomized Response in Big Data

2018-03-01T09:15:11+00:00

In this blog post, I’ll explain some of the basic concepts of differential privacy and talk about how I used it in my bachelor’s thesis.

Differential Privacy is a topic of growing interest in the world of Big Data. It is currently being deployed by tech giants like Google and Apple to gain knowledge ab


Writing a Hive UDF for lookups

2018-02-07T14:42:53+00:00

Let's use a Hive UDF to perform lookups against resources residing in the Hadoop file system (HDFS), an approach that allows non-equi joins.

In today’s blog I am going to take a look at a fairly mundane and unspectacular use of a Hive UDF (user-defined function), that of performing lookups against re


Data Science in Production: Packaging, Versioning and Continuous Integration

2018-02-07T14:53:36+00:00

Here's what changes when your data science project grows from a proof of concept. How do you deploy your model, how can updates be rolled out, ...?

A common pattern in most data science projects I participated in is that it’s all fun and games until someone wants to put it into production. From that point in time


Network Anomaly Detection: Online vs. Offline Machine Learning

2019-04-02T17:27:26+00:00

In this part of our network anomaly detection blog post series, we compare two fundamentally different styles of learning.

In this part of our network anomaly detection series, we compare two fundamentally different styles of learning. The very first post introduced the simple k-means 


Sports Tracking with Elasticsearch [Meetup]

2018-02-07T14:54:40+00:00

In this recording of our meetup, tracking fan Wolfgang shows how he analyzed the data from his Garmin watch himself using Elasticsearch.

In this recording of our meetup in Karlsruhe, Wolfgang, an enthusiastic triathlete and tracking fan, shows how he analyzed the data from his Garmin watch himself with Elasti


Neural Networks in the Browser

2019-04-02T17:40:53+00:00

Neural networks are the basis of some pretty impressive recent advances in machine learning. From greatly improved translation to automatic transfer of painting style


Real-time detection of anomalies in computer networks with methods of machine learning: Stop the (data)-thief!

2019-02-15T12:54:16+00:00

This blog post describes some basic concepts and shows a prototypical architecture for network anomaly detection in real time.

This blog post shows some results and concepts of a master’s thesis here at inovex. It describes some basic concepts and shows a prototypical architecture for detecti


Powering a Data Hub at Otto Group BI with Schedoscope

2017-11-27T15:30:20+00:00

In order to build data services or advanced machine learning models, organizations must integrate large amounts of information from diverse sources.

In order to build data services or advanced machine learning models, organizations must integrate large amounts of information from diverse sources. As a central plac
