{"id":21031,"date":"2016-08-22T07:11:39","date_gmt":"2016-08-22T06:11:39","guid":{"rendered":"https:\/\/www.inovex.de\/?p=2094"},"modified":"2026-03-23T12:25:54","modified_gmt":"2026-03-23T11:25:54","slug":"storm-in-a-teacup","status":"publish","type":"post","link":"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/","title":{"rendered":"Storm in a Teacup"},"content":{"rendered":"<p>I wanted to call this blog article something like &#8222;Storm in a Nutshell&#8220; but decided against it as<\/p>\n<ol type=\"a\">\n<li>there is probably a book by that name out there somewhere, and I wanted to avoid any unannounced visits in the dead of night from shady-looking types from the copyright police, and<\/li>\n<li>I really wanted to use a corny pun.<\/li>\n<\/ol>\n<p>So think of a teacup as conceptually similar to a nutshell, but bigger.<\/p>\n<p>On a <a href=\"https:\/\/www.inovex.de\/de\/referenzen\/case-studies\/big-data-optimierte-betrugserkennung-auf-microsoft-azure\/\" target=\"_blank\" rel=\"noopener\">recent<\/a> project, we used Apache Storm as the real-time component of a complex, cloud-based environment used for fraud detection. 
In this article I would like to offer an introductory overview of Storm, showing how to define a simple spout and bolt, as well as highlighting some of the issues that are important when building Storm topologies.<!--more--><\/p>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_79_2 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\"><p class=\"ez-toc-title\" style=\"cursor:inherit\"><\/p>\n<\/div><nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#Batch-or-real-time\" >Batch or real-time?<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#Bolts-and-Spouts\" >Bolts and Spouts<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#A-simple-example\" >A simple example<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#Parallelism\" >Parallelism<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#Serialization\" >Serialization<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#Initialization\" >Initialization<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#Read-on-%E2%80%A6\" >Read on &#8230;<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-8\" 
href=\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#Join-us\" >Join us!<\/a><\/li><\/ul><\/nav><\/div>\n<h2><span class=\"ez-toc-section\" id=\"Batch-or-real-time\"><\/span>Batch or real-time?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>By way of introduction, let&#8217;s briefly describe which tools fit into which space in the real-time\/batch paradigm.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-2167\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2016\/08\/apache-storm-1024x454.png\" alt=\"Storm vs. Map Reduce vs. Flink vs. Spark\" width=\"800\" height=\"355\" \/><\/p>\n<p>Apache Storm is basically a streaming tool, but also offers mini-batch capabilities with its Trident abstraction layer. Map-reduce is firmly in the batch paradigm, and Apache Spark offers mini-batching (somewhat confusingly referred to as &#8222;Spark Streaming&#8220;) and batch processing (&#8222;Spark SQL&#8220;). Apache Flink, like Storm, covers streaming- and mini-batch use-cases, but at the time of writing is not yet bundled with any of the Hadoop distributions (Hortonworks, Cloudera and MapR).<\/p>\n<p>We chose Storm as we wanted the reliability of a distribution-backed (and tested) component that could deliver streaming capabilities.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Bolts-and-Spouts\"><\/span>Bolts and Spouts<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Storm uses three main types of object: spouts, bolts and topologies that combine these elements in a chain.<\/p>\n<p><strong>Spout:<\/strong> a spout acts as an input to a flow, or stream, of tuples through a process defined by a topology. There can be multiple spouts in a topology, but typically there will be just one, pulling data from a source such as Kafka or Eventhub (as we have in Azure). 
You can also define your own spouts for testing purposes that generate data and emit it to bolts.<\/p>\n<p><strong>Bolt:<\/strong> a bolt consumes tuples from an input stream. Storm comes with abstract classes that you can extend: the simplest of these is the BaseBasicBolt class, where only two methods have to be implemented: the execute() method (where any processing is done) and the declareOutputFields() method, which defines what the tuples emitted from the execute() method look like (specifically, what the fields are called and to which output stream they belong). Any &#8222;acking&#8220; (ack = acknowledgement notification) is implemented by this abstract class behind the scenes, as is tuple chaining, which Storm calls anchoring (linking each emitted tuple to the input tuple that triggered it, so that the topology can track which tuples have made it through all bolts successfully). The BaseRichBolt abstract class, on the other hand, requires that you implement any acking or chaining yourself.<\/p>\n<p><strong>Topology:<\/strong> a topology combines spouts and bolts, defining which output streams exist and by which bolts they are consumed. A topology can be started in local mode (for testing) or in cluster mode. A tuple passing through a topology can optionally be acked so that the spout can react to its success or failure, for example by replaying a failed tuple (this guarantees &#8222;at least once&#8220; processing).<\/p>\n<h2><span class=\"ez-toc-section\" id=\"A-simple-example\"><\/span>A simple example<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>A spout in its simplest form is listed below. In the open() method we store the collector object, which is then used to emit randomly generated, base64-encoded strings. We use a simple mechanism for limiting the number of tuples that can be active (i.e. not yet acked) in the topology: this is to avoid filling the internal queues to the point of overflow (which can lead to out-of-memory exceptions). 
We can bypass this simple queue mechanism by passing UNLIMITED_PENDING as maxPendingMsgs to the constructor. There is no replay (the class is intentionally simplified): failed tuples are simply removed from the pending queue. If a queue limit has been specified (i.e. maxPendingMsgs != UNLIMITED_PENDING) and this limit has been reached, then nothing new is emitted from the spout until there is space in the queue.<\/p>\n<pre class=\"lang:java decode:true \">public class SimpleSpout extends BaseRichSpout {\r\n\r\n\tprivate final List&lt;Object&gt; pending;\r\n\r\n\tprivate SpoutOutputCollector collector;\r\n\r\n\tprivate final int maxPendingMsgs;\r\n\r\n\tprivate final static Random rand = new Random();\r\n\r\n\tpublic final static int UNLIMITED_PENDING = -1;\r\n\r\n\tpublic SimpleSpout(int maxPendingMsgs) {\r\n\r\n\t\tthis.pending = new LinkedList&lt;Object&gt;();\r\n\r\n\t\tthis.maxPendingMsgs = maxPendingMsgs;\r\n\r\n\t}\r\n\r\n\t@Override\r\n\r\n\tpublic void open(@SuppressWarnings(\"rawtypes\") Map conf, TopologyContext context, SpoutOutputCollector collector) {\r\n\r\n\t\tthis.collector = collector;\r\n\r\n\t}\r\n\r\n\t@Override\r\n\r\n\tpublic void nextTuple() {\r\n\r\n\t\tif (UNLIMITED_PENDING == maxPendingMsgs || pending.size() &lt; maxPendingMsgs) {\r\n\r\n\t\t\tString s = String.valueOf(rand.nextLong());\r\n\r\n\t\t\tString b64 = Base64.encodeBase64String(s.getBytes(Charsets.UTF_8));\r\n\r\n\t\t\tif (UNLIMITED_PENDING == maxPendingMsgs) {\r\n\r\n\t\t\t\t\/* turn off acking if we are not worried about queue overflow *\/\r\n\r\n\t\t\t\tcollector.emit(new Values(b64));\r\n\r\n\t\t\t} else {\r\n\r\n\t\t\t\tUUID uuid = UUID.randomUUID();\r\n\r\n\t\t\t\tcollector.emit(new Values(b64), uuid);\r\n\r\n\t\t\t\tpending.add(uuid);\r\n\r\n\t\t\t}\r\n\r\n\t\t}\r\n\r\n\t}\r\n\r\n\t@Override\r\n\r\n\tpublic void declareOutputFields(OutputFieldsDeclarer declarer) {\r\n\r\n\t\tdeclarer.declare(new 
Fields(\"emitted\"));\r\n\r\n\t}\r\n\r\n\t@Override\r\n\r\n\tpublic void ack(Object uuid) {\r\n\r\n\t\tsuper.ack(uuid);\r\n\r\n\t\tpending.remove(uuid);\r\n\r\n\t}\r\n\r\n\t@Override\r\n\r\n\tpublic void fail(Object uuid) {\r\n\r\n\t\tsuper.fail(uuid);\r\n\r\n\t\tpending.remove(uuid);\r\n\r\n\t}\r\n\r\n}<\/pre>\n<p>A simple bolt that consumes data from this spout is listed below.<\/p>\n<pre class=\"lang:java decode:true\">public class ReadMapBolt extends BaseBasicBolt {\r\n\r\n    @Override\r\n\r\n    public void execute(Tuple tuple, BasicOutputCollector collector) {\r\n\r\n        \/\/ read the single field emitted by the spout, by position\r\n\r\n        String b64 = tuple.getValues().get(0).toString();\r\n\r\n        \/\/ decode it (the result is not used further in this simplified example)\r\n\r\n        byte[] bb = Base64.decodeBase64(b64);\r\n\r\n        collector.emit(new Values(\"ok\"));\r\n\r\n    }\r\n\r\n    @Override\r\n\r\n    public void declareOutputFields(OutputFieldsDeclarer ofd) {\r\n\r\n        ofd.declare(new Fields(\"parseoutput\"));\r\n\r\n    }\r\n\r\n}<\/pre>\n<p>Note that the information listed in the declareOutputFields() methods must be consistent across the topology (i.e. 
either explicitly when retrieving tuple fields by name, or implicitly when doing so by position, as above) otherwise the topology will throw an exception on deployment.<\/p>\n<p>Lastly, our topology links the spout and bolt together:<\/p>\n<pre class=\"lang:java decode:true\">public class SimpleWriterAndReader {\r\n\r\n    public static void main(String[] args) throws Exception {\r\n\r\n        TopologyBuilder builder = new TopologyBuilder();\r\n\r\n        builder.setSpout(\"spout\", new SimpleSpout(64), 1);\r\n\r\n        builder.setBolt(\"bolt\", new ReadMapBolt(), 4).shuffleGrouping(\"spout\").setNumTasks(4);\r\n\r\n        Config conf = new Config();\r\n\r\n        conf.setDebug(false);\r\n\r\n        conf.setNumWorkers(1);\r\n\r\n        LocalCluster cluster = new LocalCluster();\r\n\r\n        cluster.submitTopology(\"test_topology\", conf, builder.createTopology());\r\n\r\n        Thread.sleep(1000 * 60 * 15);\r\n\r\n        cluster.shutdown();\r\n\r\n    }\r\n\r\n}<\/pre>\n<p>This topology launches in local mode and shuts down after 15 minutes.<\/p>\n<p>We now move on to some less trivial aspects of a Storm topology that may be of interest.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Parallelism\"><\/span>Parallelism<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>In the topology above, we defined our parallelism in these two lines:<\/p>\n<pre class=\"lang:java decode:true\">builder.setSpout(\"spout\", new SimpleSpout(64), 1);\r\n\r\nbuilder.setBolt(\"bolt\", new ReadMapBolt(), 4).shuffleGrouping(\"spout\").setNumTasks(4);\r\n\r\n<\/pre>\n<p>We defined our spout as having a parallelism hint of 1, but the bolt was defined with a hint of 4 (<span class=\"lang:java decode:true crayon-inline \">.setBolt(\"bolt\", new ReadMapBolt(), 4)<\/span>) and also 4 tasks: <span class=\"lang:java decode:true crayon-inline \">.setNumTasks(4)<\/span>.<\/p>\n<p>What is the difference?<\/p>\n<p>The first hint \u2013 in setSpout() and 
setBolt() \u2013 is actually the number of executors, where an executor is a thread of execution within the JVM. The second hint is the number of tasks, or instances, of a spout or bolt that have been created.<\/p>\n<p>So Storm parallelism is defined by stating how many actual threads should be applied to a spout\/bolt, as well as how many instances of this spout\/bolt should be initialised on topology deployment. By default, (number of tasks\/instances) = (number of executors\/threads), but if we set the number of running tasks\/instances to a value higher than what we expect to need, then we can adjust the number of threads up (or down again) without having to stop the topology.<\/p>\n<p>At cluster-level we can also set the number of workers (=JVMs): <span class=\"lang:java decode:true crayon-inline \">conf.setNumWorkers(1);<\/span>. A good rule of thumb is:<\/p>\n<p>(number of workers) = (number of worker nodes in cluster) = (number of spout partitions)<\/p>\n<p>e.g. if we have a cluster made up of 4 worker nodes and we are using a spout-source that uses partitions (such as EventHub), then we should set up our source to have 4 partitions, too. 
In this way we can have one instance of the spout running in the single JVM on each node, reading from one partition (either exclusively or in round-robin fashion).<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Serialization\"><\/span>Serialization<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>When linking our bolt to our spout, we defined a shuffleGrouping distribution:<\/p>\n<pre class=\"lang:java decode:true \">builder.setBolt(\"bolt\", new ReadMapBolt(), 4).shuffleGrouping(\"spout\").setNumTasks(4);<\/pre>\n<p>This means, according to the javadoc comment, that &#8222;tuples are randomly distributed across the bolt&#8217;s tasks in a way such that each bolt is guaranteed to get an equal number of tuples.&#8220; This makes perfect sense as it goes a long way to guaranteeing a balanced topology, but it incurs the overhead of object serialization, which takes place whenever tuples are passed across a JVM boundary:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-2096\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2016\/08\/storm-2-1024x576.png\" alt=\"storm-2\" width=\"800\" height=\"450\" \/><\/p>\n<p>Therefore, each object we emit has to be serializable. This is fine for primitives and simple objects, but for more complex ones we may have to implement this ourselves. One approach is to always emit complex objects as a byte array. 
We make use of Avro classes at certain stages of the topology, and the serialization can be achieved in just a few lines:<\/p>\n<pre class=\"lang:java decode:true\">public static byte[] getMyObjectAsByteArray(MyObject o) throws IOException {\r\n\r\n    ByteArrayOutputStream baos = new ByteArrayOutputStream();\r\n\r\n    BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(baos, null);\r\n\r\n    DatumWriter&lt;MyObject&gt; writer = new SpecificDatumWriter&lt;&gt;(MyObject.SCHEMA$);\r\n\r\n    try {\r\n\r\n        writer.write(o, encoder);\r\n\r\n        encoder.flush();\r\n\r\n        baos.close();\r\n\r\n        return baos.toByteArray();\r\n\r\n    } finally {\r\n\r\n        Closeables.close(baos, true);\r\n\r\n    }\r\n\r\n}<\/pre>\n<p>However, this means that we are serializing and deserializing even when we <em>don&#8217;t<\/em> cross a JVM boundary (since with shuffleGrouping all bolts emit to all instances of the next bolt in the chain, including ones in the same JVM). A better approach is to make use of the Kryo classes within Storm that take care of the serialization (but which only serialize when needed). 
We can define our serialization code as above, but wrap this in a class that we register with Storm, like this:<\/p>\n<pre class=\"lang:java decode:true\">public class MyObjectSerializer extends Serializer&lt;MyObject&gt; {\r\n\r\n    private static final Schema SCHEMA = MyObject.getClassSchema();\r\n\r\n    public void write(Kryo kryo, Output output, MyObject object) {\r\n\r\n        DatumWriter&lt;MyObject&gt; writer = new SpecificDatumWriter&lt;&gt;(SCHEMA);\r\n\r\n        ByteArrayOutputStream out = new ByteArrayOutputStream();\r\n\r\n        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);\r\n\r\n        try {\r\n\r\n            writer.write(object, encoder);\r\n\r\n            encoder.flush();\r\n\r\n        } catch (IOException e) {\r\n\r\n             \/\/ perform exception handling here...\r\n\r\n        }\r\n\r\n        IOUtils.closeQuietly(out);\r\n\r\n        byte[] outBytes = out.toByteArray();\r\n\r\n        output.writeInt(outBytes.length);\r\n\r\n        output.write(outBytes);\r\n\r\n    }\r\n\r\n    public MyObject read(Kryo kryo, Input input, Class&lt;MyObject&gt; type) {\r\n\r\n        int byteCount = input.readInt();\r\n\r\n        byte[] value = input.readBytes(byteCount);\r\n\r\n        SpecificDatumReader&lt;MyObject&gt; reader = new SpecificDatumReader&lt;&gt;(SCHEMA);\r\n\r\n        MyObject record = null;\r\n\r\n        try {\r\n\r\n            record = reader.read(null, DecoderFactory.get().binaryDecoder(value, null));\r\n\r\n        } catch (IOException e) {\r\n\r\n             \/\/ perform exception handling here...\r\n\r\n        }\r\n\r\n        return record;\r\n\r\n    }\r\n\r\n}<\/pre>\n<p>and<\/p>\n<pre class=\"lang:java decode:true \">Config conf = new Config();\r\n\r\nconf.registerSerialization(MyObject.class, MyObjectSerializer.class); \/\/ register\r\n\r\n...\r\n\r\n<\/pre>\n<p>In this way we only incur the serialization overhead when it is needed.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Initialization\"><\/span>Initialization<span 
class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>A topology is deployed by using the storm command-line tool. Certain checks &#8211; e.g. that the topology chain is consistent (i.e. that all defined inputs actually exist), that local resources and remote systems referenced in the topology set-up (i.e. spout\/bolt constructors) are available &#8211; are carried out before deployment to the cluster. The instances of spout and bolt are then created on the nodes of the cluster. In terms of the spout\/bolt code, this means that any objects instantiated in the constructor have to be serializable: any objects that are not, have to be declared transient in the class and instantiated once the prepare() method is called on the node, like this:<\/p>\n<pre class=\"lang:java decode:true\">public class FeatureComputationBolt extends BaseBasicBolt {\r\n\r\n    private transient MyObject myObject;\r\n\r\n    @Override\r\n\r\n    public void prepare(@SuppressWarnings(\"rawtypes\") Map stormConf, TopologyContext context) {\r\n\r\n        myObject = new MyObject();\r\n\r\n    }\r\n\r\n    ...\r\n\r\n}<\/pre>\n<p>We&#8217;re now set \u2013\u00a0we have looked at a simple topology as well as a couple of issues that may crop up when dealing with more complex use cases.<\/p>\n<p>Happy storming!<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Read-on-%E2%80%A6\"><\/span>Read on &#8230;<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>So you&#8217;re interested in processing heaps of data? Have a look at <a href=\"https:\/\/www.inovex.de\/en\/our-services\/big-data\/\" target=\"_blank\" rel=\"noopener\">our website<\/a> and read about the services we offer to our customers.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Join-us\"><\/span>Join us!<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Are you looking for a job in big data processing or analytics? 
We&#8217;re currently hiring!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I wanted to call this blog article something like &#8222;Storm in a Nutshell&#8220; but decided against it as there is probably a book by that name out there somewhere, and I wanted to avoid any unannounced visits in the dead of night from shady-looking types from the copyright police, and I really wanted to use [&hellip;]<\/p>\n","protected":false},"author":49,"featured_media":12693,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"ep_exclude_from_search":false,"footnotes":""},"tags":[77],"service":[411],"coauthors":[{"id":49,"display_name":"Andrew Kenworthy","user_nicename":"akenworthy"}],"class_list":["post-21031","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","tag-big-data","service-data-engineering"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Storm in a Teacup - inovex GmbH<\/title>\n<meta name=\"description\" content=\"On a recent project, we used Apache Storm as the real-time component of a cloud-based environment for fraud detection. This article provides an overview.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/\" \/>\n<meta property=\"og:locale\" content=\"de_DE\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Storm in a Teacup - inovex GmbH\" \/>\n<meta property=\"og:description\" content=\"On a recent project, we used Apache Storm as the real-time component of a cloud-based environment for fraud detection. 
This article provides an overview.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/\" \/>\n<meta property=\"og:site_name\" content=\"inovex GmbH\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/inovexde\" \/>\n<meta property=\"article:published_time\" content=\"2016-08-22T06:11:39+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-03-23T11:25:54+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2016\/08\/storm-teacup-white.png\" \/>\n\t<meta property=\"og:image:width\" content=\"2300\" \/>\n\t<meta property=\"og:image:height\" content=\"876\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Andrew Kenworthy\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2016\/08\/storm-teacup-white-1024x390.png\" \/>\n<meta name=\"twitter:creator\" content=\"@inovexgmbh\" \/>\n<meta name=\"twitter:site\" content=\"@inovexgmbh\" \/>\n<meta name=\"twitter:label1\" content=\"Verfasst von\" \/>\n\t<meta name=\"twitter:data1\" content=\"Andrew Kenworthy\" \/>\n\t<meta name=\"twitter:label2\" content=\"Gesch\u00e4tzte Lesezeit\" \/>\n\t<meta name=\"twitter:data2\" content=\"10\u00a0Minuten\" \/>\n\t<meta name=\"twitter:label3\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data3\" content=\"Andrew Kenworthy\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/\"},\"author\":{\"name\":\"Andrew Kenworthy\",\"@id\":\"https:\/\/www.inovex.de\/de\/#\/schema\/person\/0519169c755e15b1478ccf638f16f06c\"},\"headline\":\"Storm in a 
Teacup\",\"datePublished\":\"2016-08-22T06:11:39+00:00\",\"dateModified\":\"2026-03-23T11:25:54+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/\"},\"wordCount\":1467,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.inovex.de\/de\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.inovex.de\/wp-content\/uploads\/2016\/08\/storm-teacup-white.png\",\"keywords\":[\"Big Data\"],\"articleSection\":[\"Analytics\",\"English Content\",\"General\"],\"inLanguage\":\"de\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/\",\"url\":\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/\",\"name\":\"Storm in a Teacup - inovex GmbH\",\"isPartOf\":{\"@id\":\"https:\/\/www.inovex.de\/de\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.inovex.de\/wp-content\/uploads\/2016\/08\/storm-teacup-white.png\",\"datePublished\":\"2016-08-22T06:11:39+00:00\",\"dateModified\":\"2026-03-23T11:25:54+00:00\",\"description\":\"On a recent project, we used Apache Storm as the real-time component of a cloud-based environment for fraud detection. 
This article provides an overview.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#breadcrumb\"},\"inLanguage\":\"de\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#primaryimage\",\"url\":\"https:\/\/www.inovex.de\/wp-content\/uploads\/2016\/08\/storm-teacup-white.png\",\"contentUrl\":\"https:\/\/www.inovex.de\/wp-content\/uploads\/2016\/08\/storm-teacup-white.png\",\"width\":2300,\"height\":876,\"caption\":\"storm-teacup-white\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.inovex.de\/de\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Storm in a Teacup\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.inovex.de\/de\/#website\",\"url\":\"https:\/\/www.inovex.de\/de\/\",\"name\":\"inovex GmbH\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/www.inovex.de\/de\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.inovex.de\/de\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"de\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.inovex.de\/de\/#organization\",\"name\":\"inovex 
GmbH\",\"url\":\"https:\/\/www.inovex.de\/de\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\/\/www.inovex.de\/de\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/03\/inovex-logo-16-9-1.png\",\"contentUrl\":\"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/03\/inovex-logo-16-9-1.png\",\"width\":1921,\"height\":1081,\"caption\":\"inovex GmbH\"},\"image\":{\"@id\":\"https:\/\/www.inovex.de\/de\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/inovexde\",\"https:\/\/x.com\/inovexgmbh\",\"https:\/\/www.instagram.com\/inovexlife\/\",\"https:\/\/www.linkedin.com\/company\/inovex\",\"https:\/\/www.youtube.com\/channel\/UC7r66GT14hROB_RQsQBAQUQ\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.inovex.de\/de\/#\/schema\/person\/0519169c755e15b1478ccf638f16f06c\",\"name\":\"Andrew Kenworthy\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\/\/www.inovex.de\/de\/#\/schema\/person\/image\/7397755342ed757eeb6b1d51f16a4044\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/c7a29df25f27010b3c581f97c66a52694571cfa2f9c9b79049542969194fbdd3?s=96&d=retro&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/c7a29df25f27010b3c581f97c66a52694571cfa2f9c9b79049542969194fbdd3?s=96&d=retro&r=g\",\"caption\":\"Andrew Kenworthy\"},\"url\":\"https:\/\/www.inovex.de\/de\/blog\/author\/akenworthy\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Storm in a Teacup - inovex GmbH","description":"On a recent project, we used Apache Storm as the real-time component of a cloud-based environment for fraud detection. 
This article provides an overview.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/","og_locale":"de_DE","og_type":"article","og_title":"Storm in a Teacup - inovex GmbH","og_description":"On a recent project, we used Apache Storm as the real-time component of a cloud-based environment for fraud detection. This article provides an overview.","og_url":"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/","og_site_name":"inovex GmbH","article_publisher":"https:\/\/www.facebook.com\/inovexde","article_published_time":"2016-08-22T06:11:39+00:00","article_modified_time":"2026-03-23T11:25:54+00:00","og_image":[{"width":2300,"height":876,"url":"https:\/\/www.inovex.de\/wp-content\/uploads\/2016\/08\/storm-teacup-white.png","type":"image\/png"}],"author":"Andrew Kenworthy","twitter_card":"summary_large_image","twitter_image":"https:\/\/www.inovex.de\/wp-content\/uploads\/2016\/08\/storm-teacup-white-1024x390.png","twitter_creator":"@inovexgmbh","twitter_site":"@inovexgmbh","twitter_misc":{"Verfasst von":"Andrew Kenworthy","Gesch\u00e4tzte Lesezeit":"10\u00a0Minuten","Written by":"Andrew Kenworthy"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#article","isPartOf":{"@id":"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/"},"author":{"name":"Andrew Kenworthy","@id":"https:\/\/www.inovex.de\/de\/#\/schema\/person\/0519169c755e15b1478ccf638f16f06c"},"headline":"Storm in a 
Teacup","datePublished":"2016-08-22T06:11:39+00:00","dateModified":"2026-03-23T11:25:54+00:00","mainEntityOfPage":{"@id":"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/"},"wordCount":1467,"commentCount":0,"publisher":{"@id":"https:\/\/www.inovex.de\/de\/#organization"},"image":{"@id":"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#primaryimage"},"thumbnailUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/2016\/08\/storm-teacup-white.png","keywords":["Big Data"],"articleSection":["Analytics","English Content","General"],"inLanguage":"de","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/","url":"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/","name":"Storm in a Teacup - inovex GmbH","isPartOf":{"@id":"https:\/\/www.inovex.de\/de\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#primaryimage"},"image":{"@id":"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#primaryimage"},"thumbnailUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/2016\/08\/storm-teacup-white.png","datePublished":"2016-08-22T06:11:39+00:00","dateModified":"2026-03-23T11:25:54+00:00","description":"On a recent project, we used Apache Storm as the real-time component of a cloud-based environment for fraud detection. 
This article provides an overview.","breadcrumb":{"@id":"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#breadcrumb"},"inLanguage":"de","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/"]}]},{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#primaryimage","url":"https:\/\/www.inovex.de\/wp-content\/uploads\/2016\/08\/storm-teacup-white.png","contentUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/2016\/08\/storm-teacup-white.png","width":2300,"height":876,"caption":"storm-teacup-white"},{"@type":"BreadcrumbList","@id":"https:\/\/www.inovex.de\/de\/blog\/storm-in-a-teacup\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.inovex.de\/de\/"},{"@type":"ListItem","position":2,"name":"Storm in a Teacup"}]},{"@type":"WebSite","@id":"https:\/\/www.inovex.de\/de\/#website","url":"https:\/\/www.inovex.de\/de\/","name":"inovex GmbH","description":"","publisher":{"@id":"https:\/\/www.inovex.de\/de\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.inovex.de\/de\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"de"},{"@type":"Organization","@id":"https:\/\/www.inovex.de\/de\/#organization","name":"inovex GmbH","url":"https:\/\/www.inovex.de\/de\/","logo":{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/www.inovex.de\/de\/#\/schema\/logo\/image\/","url":"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/03\/inovex-logo-16-9-1.png","contentUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/03\/inovex-logo-16-9-1.png","width":1921,"height":1081,"caption":"inovex 
GmbH"},"image":{"@id":"https:\/\/www.inovex.de\/de\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/inovexde","https:\/\/x.com\/inovexgmbh","https:\/\/www.instagram.com\/inovexlife\/","https:\/\/www.linkedin.com\/company\/inovex","https:\/\/www.youtube.com\/channel\/UC7r66GT14hROB_RQsQBAQUQ"]},{"@type":"Person","@id":"https:\/\/www.inovex.de\/de\/#\/schema\/person\/0519169c755e15b1478ccf638f16f06c","name":"Andrew Kenworthy","image":{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/www.inovex.de\/de\/#\/schema\/person\/image\/7397755342ed757eeb6b1d51f16a4044","url":"https:\/\/secure.gravatar.com\/avatar\/c7a29df25f27010b3c581f97c66a52694571cfa2f9c9b79049542969194fbdd3?s=96&d=retro&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c7a29df25f27010b3c581f97c66a52694571cfa2f9c9b79049542969194fbdd3?s=96&d=retro&r=g","caption":"Andrew Kenworthy"},"url":"https:\/\/www.inovex.de\/de\/blog\/author\/akenworthy\/"}]}},"_links":{"self":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/21031","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/users\/49"}],"replies":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/comments?post=21031"}],"version-history":[{"count":3,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/21031\/revisions"}],"predecessor-version":[{"id":66669,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/21031\/revisions\/66669"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/media\/12693"}],"wp:attachment":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/media?parent=21031"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/tags?post=21031"},{"ta
xonomy":"service","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/service?post=21031"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/coauthors?post=21031"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}