WebAssembly in Production

Posted on: April 30, 2018

WebAssembly (or wasm for short) is a relatively new way to efficiently execute code in a browser. Since late 2017, it has been supported in all major modern browsers (Chrome, Firefox, Safari and Edge), which makes it worth a look for real-world applications. It is the successor of Mozilla’s asm.js project, in which C or C++ code was compiled to a restricted, easily optimizable subset of JavaScript. WebAssembly goes a step further and defines a new binary format for a low-level bytecode, which the execution engine can easily translate to executable machine instructions while maintaining a safe execution context for untrusted code. This skips the computation-intensive parsing and JIT compiling of JavaScript, which has several advantages shown in the Evaluation chapter.

Browser vendors did an amazing job of running code in a language as dynamic as JavaScript as fast as possible, but in doing so the JS engines became very complicated, with multiple stages and completely separate optimizing compilers, multiple implementations of the same thing for different scenarios and generally a big portion of black magic. This causes JS programs to have highly varying performance in different browsers. The WebAssembly representation, on the other hand, is straightforward to compile to native machine instructions and produces much more consistent performance characteristics across different browsers. Keep in mind that this probably doesn’t matter for a simple React app showing cute kitten images, but it comes into play in larger and/or computation-heavy applications executed in the browser. Besides advantages in runtime behavior, WebAssembly can be a compile target for a lot of different languages, which opens up the web platform to many more developers. While compiler support is still evolving, C, C++ and Rust can already be compiled to WebAssembly.

Wrapping up, there are three main reasons to use WebAssembly:

  • Delivering existing C/C++ applications over the web (for example games, 3D graphics and more).
  • Developing in your language of choice (for example .NET or Java).
  • Accelerating hot code portions of ordinary JavaScript apps.

WebAssembly hasn’t become "mainstream" yet, which means tools and frameworks are in most cases not stable and still under active development; .NET and Java support is still highly experimental. For simple programs which mainly do computation and don’t rely much on external APIs, however, the available tooling is completely sufficient even in real-world environments.

Implementing AES with WebAssembly

This article describes how the third use case can be covered by tools available today. Client-side encryption and decryption using the widespread AES algorithm serves as an example to show how such code packages can be made accessible from ordinary JavaScript applications. Since AES is a standard algorithm and not something application specific, the resulting code will be published as a standalone npm package which can be used in arbitrary web applications. The goal here is to make WebAssembly-accelerated modules as easy to consume as conventional JavaScript modules, without the developer having to be aware of wasm-specific quirks. Of course the advantages of WebAssembly (a small file size and high performance) shouldn’t get lost in the process. A project following a related approach, which is more complex due to the size of the project, is libsodium.js, which delivers the whole sodium crypto library as a WebAssembly module.

There are various implementations of AES in plain JavaScript which can serve as a baseline to compare the WebAssembly approach from a performance perspective.

For the library a slightly simplified version of this implementation will be used for the actual computation. It doesn’t have external dependencies which promises a small build size and doesn’t use features not available yet in WebAssembly like SIMD or multithreading.

Compiling

To turn the C code into WebAssembly, a working compiler infrastructure is necessary. This is provided by the binaryen project, which in turn depends on LLVM (the compiler infrastructure behind the Clang C/C++ compiler). Binaryen provides tools to transform the output of LLVM into WebAssembly. It is possible to manually invoke LLVM and Binaryen with the correct parameters, but fortunately the emscripten compiler, originally built for WebAssembly’s predecessor asm.js, also supports WebAssembly and neatly integrates the different tools behind a simple command line utility.

If you are using a Mac and Homebrew, installing emscripten is as simple as running brew install emscripten; for other environments please refer to the emscripten website.

To trigger the compile process, run emcc aes.c -s WASM=1 -o aes.js. This tells emscripten to compile the aes.c file using the binaryen wasm backend. The -o flag specifies the output file. Besides the aes.js file, a wasm file with the same name is created containing the actual WebAssembly byte code. The JavaScript file contains a generic wrapper generated by emscripten to initialize the module and call the provided WebAssembly functions from JavaScript. This wrapper offers far more than the use case at hand requires (its file is more than four times the size of the WebAssembly module it wraps), so we won’t use it but write our own in the next section.

By default, emscripten creates an unoptimized debug build. To optimize for runtime performance, we have to pass the -O3 flag. We also have to specify which functions of the C program have to be included in the WebAssembly build. A look into aes.c shows that multiple functions are necessary to control the encryption/decryption process:

aes_setkey_dec and aes_setkey_enc initialize an encryption/decryption context, while aes_crypt_cfb and aes_crypt_cbc perform the actual encryption/decryption in cipher feedback (CFB) mode and cipher block chaining (CBC) mode, respectively. For simplicity of the example, we will focus on CBC mode in this implementation. Emscripten can be instructed to only export the necessary functions by providing the EXPORTED_FUNCTIONS array as an additional flag: -s "EXPORTED_FUNCTIONS=['_aes_setkey_dec', '_aes_setkey_enc', '_aes_crypt_cbc']". Note the leading underscores in the function names; emscripten automatically adds these to all C functions.

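Combining the flags described above, the final compilation command looks roughly like this: emcc aes.c -O3 -s WASM=1 -s "EXPORTED_FUNCTIONS=['_aes_setkey_dec', '_aes_setkey_enc', '_aes_crypt_cbc']" -o aes.js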

Initializing

To execute the compiled program, the WebAssembly byte code has to be loaded into the browser and initialized. Generally the WebAssembly.instantiateStreaming API is the best choice for this, as it starts compiling the module while it is still downloading, which can significantly reduce startup times for large modules. For small modules as in this case, however, the benefit isn’t big enough to justify the additional setup overhead: the web server has to serve the WebAssembly code with the correct Content-Type header (application/wasm), which isn’t part of the default configuration yet in most environments, and widely used bundling setups (e.g. create-react-app) don’t support WebAssembly yet. This could change in the next few months, as e.g. webpack 4 already supports WebAssembly modules and its maintainers are actively working on improving the integration. For now, though, the best results for small modules like this one can be achieved by inlining the WebAssembly byte code into the regular JavaScript using the wasm-loader for webpack. The WebAssembly module thereby becomes a regular JavaScript module, which ensures compatibility with existing JavaScript tools like older webpack versions, alternative bundlers like RollupJS and gulp workflows.

By using webpack and the wasm-loader, the compiled module can be initialized like this:
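A minimal, self-contained sketch of that initialization pattern. Since the compiled AES binary is not at hand here, a tiny hand-assembled module stands in for it (it only exports a peek function that reads one byte from the imported memory). createInstance is a hypothetical stand-in for the factory function the loader provides; the assumption is that it resolves to an object with module and instance, just as WebAssembly.instantiate does.

```javascript
// Hand-assembled stand-in module (NOT the AES build): it imports `env.memory`
// and exports `peek(ptr)`, which returns the byte stored at address `ptr`.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f,             // type: (i32) -> i32
  0x02, 0x0f, 0x01, 0x03, 0x65, 0x6e, 0x76,                   // import from "env" ...
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00, 0x01, // ... "memory", min 1 page
  0x03, 0x02, 0x01, 0x00,                                     // one function of type 0
  0x07, 0x08, 0x01, 0x04, 0x70, 0x65, 0x65, 0x6b, 0x00, 0x00, // export "peek"
  0x0a, 0x09, 0x01, 0x07, 0x00,                               // code section, no locals
  0x20, 0x00, 0x2d, 0x00, 0x00, 0x0b,                         // local.get 0; i32.load8_u; end
]);

// Hypothetical equivalent of `import createInstance from './aes.wasm'`.
function createInstance(imports) {
  // Resolves to { module, instance }.
  return WebAssembly.instantiate(wasmBytes, imports);
}

async function demo() {
  const memory = new WebAssembly.Memory({ initial: 1 });
  const { module, instance } = await createInstance({ env: { memory } });

  // JavaScript writes into the shared memory, the module reads it back.
  new Uint8Array(memory.buffer)[42] = 7;

  return {
    // Lists what the module expects to find in the imports object.
    requiredImports: WebAssembly.Module.imports(module),
    value: instance.exports.peek(42),
  };
}
```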

The wasm module returns a function as its default export, which initializes a new instance of the WebAssembly module. The parameter of this function defines the imports that are passed to the WebAssembly module. Like an ES6 module, a WebAssembly module can declare things like APIs and constants it depends on. Unlike regular ES6 modules, however, WebAssembly code has no direct access to browser APIs for network fetching, DOM manipulation etc. It runs in a completely separate sandbox, and its only communication channels to the outside world are the things passed in via the imports object and the things exported via the instance.exports object. In this setup, even the memory the module operates on is passed in as an import. A WebAssembly module can be thought of as a separate VM running inside JavaScript, which means the JavaScript environment has to simulate the context of this VM. If, for instance, the C code compiled to WebAssembly tries to access the file system, the resulting WebAssembly module will require the respective syscall implementations to be passed in via the imports. In this case the file system would have to be emulated by the JavaScript context.

In our case the WebAssembly module is fairly simple and doesn’t call any system APIs which results in a simple imports object. The AES module needs a local memory to buffer intermediate results of the encryption and a pointer to know where in this local memory to store these.

The resulting code looks like this:
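A minimal sketch of such an imports object. The env namespace and the scratchPointer field are assumptions made for illustration; the names a concrete build actually expects can be listed with WebAssembly.Module.imports(module).

```javascript
const PAGE_SIZE = 64 * 1024; // a WebAssembly memory page is 64 KiB

function createImports() {
  // Fixed-size memory: 256 pages initially, and never more than 256 pages.
  const memory = new WebAssembly.Memory({ initial: 256, maximum: 256 });

  return {
    env: {
      // Linear memory shared between JavaScript and the module.
      memory,
      // Hypothetical name: address where the module may buffer
      // intermediate results inside that memory.
      scratchPointer: 0,
    },
  };
}

const imports = createImports();
// 256 pages * 65536 bytes per page
const totalBytes = imports.env.memory.buffer.byteLength;
```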

new WebAssembly.Memory({ initial: 256, maximum: 256 }) allocates a new portion of memory for the WebAssembly module (256 being the number of 64 KiB pages, i.e. 16 MiB in total). The memory object can also be used to transfer data in and out of the WebAssembly VM.

In our case, for instance, the data to be encrypted has to be made accessible to the WebAssembly code. But you can’t pass a JS array into a WebAssembly function, because in a C program arrays are just pointers to an address in memory. To hand the data array over to WebAssembly, we have to copy it into the memory object. memory.buffer is an ArrayBuffer reference to the WebAssembly memory which can be used to transfer data. With const byteView = new Uint8Array(memory.buffer), the ArrayBuffer is made accessible as a TypedArray of bytes. By calling byteView.set(myData, pointerToTheData), the myData array is copied to address pointerToTheData inside the WebAssembly-controlled memory. pointerToTheData in this case is a simple integer value representing the address in the memory. Now this memory address can be given to an exported WebAssembly function, which in turn can use it to access the data placed at this position.

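To extract the decrypted data, the same trick can be applied in reverse: const extractedData = byteView.subarray(pointerToTheData, pointerToTheData + dataLength) makes the result accessible from JavaScript. Note that subarray only creates a view onto the WebAssembly memory; to get an independent copy in ordinary JavaScript memory, byteView.slice can be called with the same arguments.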
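The copy-in and copy-out steps can be tried without any compiled module, since they only involve the memory object. The following sketch uses a hypothetical address of 1024 and simulates a later write by the module to show the difference between a view (subarray) and a copy (slice).

```javascript
const memory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page
const byteView = new Uint8Array(memory.buffer);

const pointerToTheData = 1024; // hypothetical address chosen for this sketch
const myData = Uint8Array.from([1, 2, 3, 4]);

// Copy the JavaScript array into WebAssembly memory.
byteView.set(myData, pointerToTheData);

// subarray: a view that still points into WebAssembly memory.
const view = byteView.subarray(pointerToTheData, pointerToTheData + myData.length);
// slice: an independent copy living in ordinary JavaScript memory.
const copy = byteView.slice(pointerToTheData, pointerToTheData + myData.length);

// Simulate the module overwriting the first byte afterwards.
byteView[pointerToTheData] = 99;

// The view reflects the change, the copy does not.
const viewSeesChange = view[0] === 99;
const copyUnchanged = copy[0] === 1;
```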

Of course this method of copying data around is annoying to use for a JavaScript-accustomed developer. To make the module more convenient to use, some glue code is necessary to hide these implementation details behind a clean, JavaScript-like API.

JS-ify

Writing to arbitrary addresses in the WebAssembly memory is not safe, because the WebAssembly code itself doesn’t expect it. If pointerToTheData points to an address where WebAssembly stores internal data, that data would be overwritten, which could lead to errors in the calculation. To prevent this, it would be possible to also export the functions malloc and free of the standard library. By calling these exported functions we could reserve memory ranges and then use them safely. However, this dynamic memory management comes at a cost because the implementations of malloc and free would also have to be included in the WebAssembly module.

In simple modules like the one at hand we can do without fully fledged memory management to keep the bundle small and just lay out our memory statically. The AES implementation used here requires the following portions of memory:

  • The encryption key (32 bytes in the case of AES-256)
  • The IV (16 bytes)
  • An encryption context holding various internal information (a struct containing an int, a pointer and an array of 68 longs, which adds up to 276 bytes)
  • A decryption context of the same size as the encryption context

The rest of the memory can be used to hold the data which has to be encrypted or decrypted. The pointers to these memory ranges can be created like this:
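A minimal sketch of such a static layout, placing the regions back to back starting at address 16384. The constant and variable names are hypothetical, and the context size is the figure quoted in the list above.

```javascript
const KEY_SIZE = 32; // AES-256 key
const IV_SIZE = 16;  // one AES block
// Figure quoted in the text. With a 4-byte int, pointer and long the struct
// would come to 4 + 4 + 68 * 4 = 280 bytes, so it is worth verifying this
// with sizeof on the context struct in the actual build.
const CONTEXT_SIZE = 276;

const BASE_ADDRESS = 16384; // 2^14, start of the statically laid out area

const keyPointer = BASE_ADDRESS;
const ivPointer = keyPointer + KEY_SIZE;
const encryptionContextPointer = ivPointer + IV_SIZE;
const decryptionContextPointer = encryptionContextPointer + CONTEXT_SIZE;
// Everything from here to the end of the memory can hold payload data.
const dataPointer = decryptionContextPointer + CONTEXT_SIZE;
```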

The static memory allocation starts at memory address 16384 (2^14), which leaves the first 16 KiB of memory to WebAssembly for intermediate results saved on the stack.

Our module will export three functions: init, to initialize the WebAssembly instance with key and iv, and encrypt and decrypt for the actual data handling. As expected in a JavaScript context, the data, key and iv arrays can be passed directly as parameters of the three functions, which handle the WebAssembly memory management internally:
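A sketch of what such glue code could look like, not the published implementation. It assumes the C functions have the signatures noted in the comments, that the cipher may update the IV in place, and that input and output buffers may overlap; createAes and the pointer constants are hypothetical names. The exports object is passed in as a parameter so the wrapper can be exercised with any object providing the three functions.

```javascript
// Assumed C signatures (not confirmed by the text):
//   aes_setkey_enc(ctx, key, keyBits) / aes_setkey_dec(ctx, key, keyBits)
//   aes_crypt_cbc(ctx, mode, length, iv, input, output), mode 1 = encrypt, 0 = decrypt
const KEY_PTR = 16384;
const IV_PTR = KEY_PTR + 32;
const ENC_CTX_PTR = IV_PTR + 16;
const DEC_CTX_PTR = ENC_CTX_PTR + 276; // context size as quoted in the text
const DATA_PTR = DEC_CTX_PTR + 276;

function createAes(exports, memory) {
  const byteView = new Uint8Array(memory.buffer);
  let storedIv = null;

  function init(key, iv) {
    byteView.set(key, KEY_PTR);
    storedIv = Uint8Array.from(iv); // keep a private copy of the IV
    exports._aes_setkey_enc(ENC_CTX_PTR, KEY_PTR, key.length * 8);
    exports._aes_setkey_dec(DEC_CTX_PTR, KEY_PTR, key.length * 8);
  }

  function run(contextPointer, mode, data) {
    if (storedIv === null) throw new Error('call init(key, iv) first');
    if (data.length % 16 !== 0) throw new Error('CBC input must be a multiple of 16 bytes');
    if (DATA_PTR + data.length > byteView.length) throw new Error('data does not fit into memory');

    // Restore the IV before every call in case the C code modifies it.
    byteView.set(storedIv, IV_PTR);
    byteView.set(data, DATA_PTR);
    // Assumption: operating in place (input === output) is supported.
    exports._aes_crypt_cbc(contextPointer, mode, data.length, IV_PTR, DATA_PTR, DATA_PTR);
    // slice copies the result out of WebAssembly memory.
    return byteView.slice(DATA_PTR, DATA_PTR + data.length);
  }

  return {
    init,
    encrypt: (data) => run(ENC_CTX_PTR, 1, data),
    decrypt: (data) => run(DEC_CTX_PTR, 0, data),
  };
}
```

With a real instance, the call would presumably be createAes(instance.exports, memory), followed by init(key, iv) and then encrypt(data) or decrypt(data).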

And that’s it: by putting the parts together and compiling the JavaScript glue module with webpack, the former C implementation of AES is transparently wrapped into a regular JavaScript package which can be used in any web application as long as the executing browser supports WebAssembly.
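Because the package only works where WebAssembly is available, a consuming application may want to check for support before loading it. A minimal sketch; supportsWebAssembly is a hypothetical helper name, not part of the package.

```javascript
// Returns true if the runtime exposes the WebAssembly JavaScript API.
function supportsWebAssembly() {
  return typeof WebAssembly === 'object' &&
    typeof WebAssembly.instantiate === 'function';
}

// Example decision: fall back to a plain JavaScript implementation otherwise.
const implementation = supportsWebAssembly() ? 'wasm' : 'js';
```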

The code shown here, with a few extra bells and whistles, is available on GitHub and npm.

Evaluation

To evaluate whether the performance gain is worth the development overhead of the more complicated implementation, the average computation time for different sizes of input data has to be measured. The benchmark test can be found here with the source code on GitHub.

To get an idea of the performance of a comparable plain JavaScript library, the AES implementation of the Stanford JavaScript Crypto Library was used. Because of its inefficient handling of arrays, the aes-js library is significantly slower and not included in the benchmark.

The following charts show the processing time in ms/MB (lower is better) for decreasing payload sizes in different browsers. In all browsers, the WebAssembly implementation beats the JavaScript implementation by a wide margin. It should be noted that JavaScript performance varies significantly across browsers, while WebAssembly performance is very similar except for small payloads. The difference for small payloads indicates that the bridging mechanism between the JavaScript and WebAssembly execution contexts varies in performance between browsers. Firefox in particular seems to have an efficient communication mechanism between the two.
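How a ms/MB figure can be obtained is sketched below. This is not the benchmark used for the charts; measureMsPerMB is a hypothetical helper, and the XOR transform in the usage line merely stands in for an encryption function.

```javascript
// Runs `transform` on a dummy payload several times and returns the
// average processing time in milliseconds per megabyte (lower is better).
function measureMsPerMB(transform, payloadBytes, runs = 5) {
  const payload = new Uint8Array(payloadBytes); // zero-filled dummy data

  const start = performance.now();
  for (let i = 0; i < runs; i++) {
    transform(payload);
  }
  const elapsedMs = performance.now() - start;

  const processedMB = (payloadBytes * runs) / (1024 * 1024);
  return elapsedMs / processedMB;
}

// Usage with a stand-in transform that XORs every byte with a constant.
const msPerMB = measureMsPerMB((data) => data.map((byte) => byte ^ 0x5a), 64 * 1024);
```

Since the first runs of a JavaScript implementation are typically slower than later ones, the number of runs noticeably affects such an average.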

Comparison Performance WebAssembly and JavaScript in Firefox

Comparison Performance WebAssembly and JavaScript in Chrome

Comparison Performance WebAssembly and JavaScript in Safari

When plotting the individual runtimes in ms of consecutive runs with small payloads (16 kB in this measurement), the JavaScript implementation gets faster over time, which shows the optimizing compiler at work: it optimizes the AES loop the more often it runs. The WebAssembly implementation, on the other hand, shows steady high performance from the first run because the machine code generated from the WebAssembly instructions is already well optimized.

Comparison Performance WebAssembly and JavaScript on Consecutive Runs

The aes-wasm dependency adds ~25 kB (~14 kB after gzip) to the benchmark bundle, while the Stanford Crypto Library grows the bundle by ~365 kB (~109 kB after gzip). This is not due to an inefficient implementation but because the whole library with lots of different algorithms is delivered as a single JavaScript module which can’t be split up by the bundler. Unfortunately this makes a direct size comparison infeasible. To put this into perspective, the HTTP Archive project measures a median of 367 kB of JavaScript per webpage.

Summary

WebAssembly can execute small, computation-intensive programs reliably fast across different browsers. The approach is definitely worth a shot if a relatively small portion of your code takes the majority of the computation time (encryption, compression, parsing, neural networks, physics, audio/video processing, …). However, it is important to take into account the transfer of data in and out of the WebAssembly VM. In the case of encrypting and decrypting byte arrays, the transformation is relatively simple, but it can become complicated and costly for more complex data structures.

Even though the support for WebAssembly and the accompanying tooling have come a long way, there are still major hurdles to using native modules from JavaScript. But the finalization of WebAssembly support in webpack and the widespread adoption of version 4, together with projects aiming to automate the binding process like wasm-bindgen for Rust, are steadily wearing them down. This promises a future of faster and richer web apps using JavaScript and WebAssembly modules interchangeably.

2018-04-30T13:33:31+00:00