About Andrew Kenworthy

The author has not provided any details yet.
So far, Andrew Kenworthy has written 7 blog posts.

Writing a Hive UDF for lookups

2018-02-07T14:42:53+00:00

Let's use a Hive UDF to perform lookups against resources residing in the Hadoop file system (HDFS), an approach that allows non-equi joins.

In today’s blog I am going to take a look at a fairly mundane and unspectacular use of a Hive UDF (user-defined function), that of performing lookups against re


HBase and Phoenix on Azure: adventures in abstraction

2018-02-28T10:47:50+00:00

Layers of abstraction have helped us accelerate our productivity – but if they fail, we are confronted with all the nuts and bolts of the implementation.

One of my favourite essays by Joel Spolsky (he of Stack Overflow fame) is “The law of leaky abstractions”. In it he describes how the prevalence of layers of abstract


Storm in a Teacup

2018-02-28T10:46:59+00:00

On a recent project, we used Apache Storm as the real-time component of a cloud-based environment for fraud detection. This article provides an overview.

I wanted to call this blog article something like “Storm in a Nutshell” but decided against it as there is probably a book by that name out there somewher


Drastic Elastic [Part 4]: Aggregations & Plugins

2017-11-28T12:41:23+00:00

In this last article of our four-part series we describe how ElasticSearch plugins help us to address appropriate aggregation levels.

In an earlier post in this mini-series I mentioned that the aggregated data we persist in ElasticSearch has discrete retention times: 5 minute aggregation => (rete


Drastic Elastic [Part 3]: Cluster Setup

2017-11-28T12:48:39+00:00

In this article we describe how we set up an Elasticsearch cluster to best guard against network partitioning.

ElasticSearch does not offer support for clusters spanning data centres. However, on our project we had access to a network latency of 400 *micro*seconds (0.4 ms) bet


Drastic Elastic [Part 2]: The aggregation framework

2018-02-28T10:51:55+00:00

Following from my earlier article on elasticsearch-as-a-database, we will now take a look at the aggregation framework.



Drastic Elastic [Part 1]: ElasticSearch as a Database

2018-02-28T10:50:59+00:00

This article looks at the idea of using a search engine as a database, motivated mainly by cost and usability considerations, and shares my experience with it.

In an article for Java Magazin way back in 2012 (only a small section of it seems to have survived online(!), although it is still available from the inovex website a
