Writing a Hive UDF for lookups


Let's use a Hive UDF to perform lookups against resources residing in the Hadoop Distributed File System (HDFS), an approach that allows non-equi joins.

In today’s blog I am going to take a look at a fairly mundane and unspectacular use of a Hive UDF (user-defined function), that of performing lookups against re

HBase and Phoenix on Azure: adventures in abstraction


Layers of abstraction have helped us accelerate our productivity – but if they fail we are confronted with all the nuts-and-bolts of the implementation.

One of my favourite essays by Joel Spolsky (he of Stack Overflow fame) is “The law of leaky abstractions”. In it he describes how the prevalence of layers of abstract

Storm in a Teacup


On a recent project, we used Apache Storm as the real-time component of a cloud-based environment for fraud detection. This article provides an overview.

I wanted to call this blog article something like “Storm in a Nutshell” but decided against it, as there is probably a book by that name out there somewher

Drastic Elastic [Part 1]: ElasticSearch as a Database


The idea of using a search engine as a database is driven mainly by cost and usability considerations. In this article I want to share my experience.

In an article for Java Magazin way back in 2012 (only a small section of it seems to have survived online(!), although it is still available from the inovex website a

