ART in Practice

This is the third part of our series about significant changes inside the latest Android version 4.4. If you have checked out part 1 and part 2 from Matthias about the new ART runtime, you might want to play with it a bit. We’ll cover some of the implications in this article.

Make the switch

Switching to ART looks easy and tempting at first glance, but there are some implications you should know about up front. The most straightforward way is to use a device that runs the latest Android version with ART enabled. According to Google, the following devices have ART enabled in the stock ROM:

  • Nexus 7 (2013)
  • Nexus 4
  • Nexus 5

Notably missing are the Nexus 7 (2012) and Nexus 10; unfortunately, the ART option is not available by default on those devices. However, you can try to build a custom AOSP ROM for them with ART enabled. Just make sure to include the following lines in your device config:

This means that classic Dalvik is selected as the default runtime and ART is included in the build as an option.

If you have no supported hardware, you could try the emulator. You’ll be out of luck if you just take the default 4.4 AVD and switch to ART, though: it will end up in an endless boot loop. This is a known issue, and a fix is already on its way. In the meantime you can search the web for already patched emulator images (at your own risk, of course).

Once you have passed these obstacles and switched to ART, the first thing you will notice is a significantly longer boot time. As Matthias mentioned in his earlier posts, this step is needed to convert all apps and the framework (!) to OAT files, which is what runs on ART.
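To confirm from inside an app which runtime is actually active after the switch, one option is the java.vm.version system property: according to the Android documentation it reports 2.0.0 or higher under ART, while Dalvik reports a 1.x version. The following is a minimal sketch and was not part of the tests described here; the class name RuntimeCheck and its methods are hypothetical helpers introduced only for illustration.

```java
/**
 * Hypothetical helper (not an Android API): guesses the active runtime
 * from the "java.vm.version" system property.
 * Assumption: the value starts with a numeric major version, and a
 * major version of 2 or higher means ART.
 */
final class RuntimeCheck {
    private RuntimeCheck() {}

    /** Returns true if the given version string has a major version >= 2. */
    static boolean isArt(String vmVersion) {
        if (vmVersion == null || vmVersion.isEmpty()) {
            return false;
        }
        int dot = vmVersion.indexOf('.');
        String major = dot < 0 ? vmVersion : vmVersion.substring(0, dot);
        try {
            return Integer.parseInt(major) >= 2;
        } catch (NumberFormatException e) {
            // Unexpected format: be conservative and report "not ART".
            return false;
        }
    }

    /**
     * Checks the runtime the code is currently executing on.
     * Only meaningful on Android; a desktop JVM also reports a high
     * version number here and would be misclassified.
     */
    static boolean isArtInUse() {
        return isArt(System.getProperty("java.vm.version"));
    }
}
```

Calling RuntimeCheck.isArtInUse() early in your app and logging the result makes it easy to tell afterwards which runtime a given log file came from.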

Developing on ART

I had the chance to run some tests on two Nexus 7 (2013) devices in parallel, one running Dalvik and the other, of course, running ART. Both devices are on Android 4.4.2.

So in case you don’t care much about all the details: Is there any difference for you as a developer? In a nutshell: Not really. If you haven’t dealt with details of the dalvik VM in the past you will barely notice the difference.

You connect your ART-enabled device via ADB as usual; nothing has changed here. When you deploy your shiny application, let’s say via Eclipse, it all works as expected. The log output looks a bit different, though, most noticeably some error messages during the install and startup phase:

And the already known conversion which takes place instead of the dexopt-step when running on Dalvik:

Fortunately, most of the development tools also work as expected. I haven’t noticed any significant difference using tools like Traceview or the allocation tracker. The debugger also behaves almost exactly as it does with Dalvik, which suggests that ART implements the JDWP spec. This suspicion can be confirmed by checking the runtime/jdwp/README.txt file in the AOSP ART directory, which states that ART has an incomplete but working JDWP implementation.

One thing you might notice when inspecting objects is that every object has an additional field called shadow$_klass_. Now what is that? Going through the sources, you’ll find that ART has a different implementation of java.lang.Object than Dalvik. You can find both inside the AOSP source tree:

The ART implementation contains this additional field, which holds a reference to the class of the created object. I couldn’t figure out in detail why this is needed; feel free to dig deeper into ART and find the reasons for it.

Heap dumps

ART does support heap dumps via the hprof file format, which means you can analyze a heap snapshot with tools like jhat or MAT in Eclipse. The results are similar to Dalvik dumps, but there are some additional objects specific to ART. As you can see in the screenshot, it seems that ART needs wrapper or proxy objects for method calls (my guess is for calls into the framework). These objects are visible in your heap and, as far as I can tell, count toward your memory consumption. Whether this will stay in future versions of ART or is just a result of the current implementation status, I don’t know.

Garbage collection

One thing that you’ll quickly notice is the new garbage collector. It initially shows up with its new log-entries, for example:

Compared to the classic Dalvik:

You’ll get similar information in a slightly different format, including the cause, the collector that did the job, heap stats and the pause time.

You’ll notice significantly fewer log-lines regarding the garbage collection. This seems to have 2 root causes:

  • Not every GC is logged
  • The GC-behaviour is different

In order to analyze the new GC, I ran a test app which I have used before to demonstrate GC pauses. It runs a loop to animate a rectangle using a custom View and creates objects inside the onDraw method (which you shouldn’t do, and this shows why). Since the onDraw method is called 60 times per second, the app creates a lot of objects and keeps the garbage collector busy. When the GC runs, you’ll usually notice a short frame drop in the animation. You can visualize those hiccups by using the developer tools and profiling the GPU activity, as shown in the screenshots below. Every time a line crosses the green bar, you have missed a frame.
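The allocation pattern behind such a test can be sketched in plain Java, without the Android View classes. This is not the actual test app: FrameLoopSketch, Box and both run methods are hypothetical names, and the loop merely stands in for repeated onDraw calls. It only shows why allocating per frame produces so much more garbage than reusing a preallocated object.

```java
/**
 * Hypothetical sketch: contrasts allocating a helper object on every
 * frame with reusing a single preallocated instance.
 */
final class FrameLoopSketch {
    private FrameLoopSketch() {}

    /** Minimal stand-in for a rectangle; counts how many were constructed. */
    static final class Box {
        static int created = 0;
        int left, top, right, bottom;

        Box() {
            created++;
        }

        void set(int l, int t, int r, int b) {
            left = l;
            top = t;
            right = r;
            bottom = b;
        }
    }

    /** The anti-pattern: a fresh object per frame, each one garbage right away. */
    static int runAllocating(int frames) {
        int before = Box.created;
        for (int f = 0; f < frames; f++) {
            Box box = new Box();
            box.set(f, 0, f + 50, 50);
        }
        return Box.created - before;
    }

    /** The usual fix: allocate once up front and only update the fields per frame. */
    static int runReusing(int frames) {
        int before = Box.created;
        Box box = new Box();
        for (int f = 0; f < frames; f++) {
            box.set(f, 0, f + 50, 50);
        }
        return Box.created - before;
    }
}
```

At 60 frames per second, the first variant leaves 60 short-lived objects behind every second for each allocation in the draw path, while the second creates a single object for the whole animation.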

When running the same app on Dalvik and on ART, you’ll notice that Dalvik prints more GC messages than ART. It seems that ART is less aggressive about running the GC, which leads to fewer hiccups. On the downside, the longer you wait, the more objects you have to scan and clean up. Still, the pause times are more or less the same on ART and Dalvik. You can see the results by comparing the two screenshots: the Dalvik device produced two hiccups within a short timeframe, while the ART device shows only one. For completeness’ sake: the other delay you see is caused by the screenshot process.

Running the GC less often is probably a good thing, as long as pause times don’t get longer. It increases the chance that your animation or transition will not be disrupted by a GC run. So ART seems to run the GC less frequently. But that is not the whole truth: it also prints fewer GC messages. When you dig into the sources, there is a place where ART decides whether a collection was “slow”. Check runtime/gc/heap.cc for details. If it was a slow GC run, ART collects statistics and logs them. The thresholds for being slow are defined in runtime/gc/heap.h and are currently set to 5ms pause time or 100ms total GC run time. Faster GC runs will not be logged.
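The logging decision described above can be modeled in a few lines. This is a simplified Java sketch based only on the two thresholds mentioned in the text, not a translation of the ART source (which is C++ and considers more than this); the class name GcLogFilter, the constant names and the strict "greater than" comparison are assumptions.

```java
/**
 * Hypothetical model of the "slow GC" check: a collection is logged
 * only if its pause or its total duration exceeds a threshold.
 */
final class GcLogFilter {
    private GcLogFilter() {}

    /** Pause threshold from the article: 5 ms. */
    static final long LONG_PAUSE_THRESHOLD_MS = 5;

    /** Total run-time threshold from the article: 100 ms. */
    static final long LONG_GC_THRESHOLD_MS = 100;

    /**
     * Returns true if a GC run with the given pause and total duration
     * counts as slow and would therefore show up in the log.
     * Assumption: values exactly at a threshold are not logged.
     */
    static boolean isSlow(long pauseMs, long totalMs) {
        return pauseMs > LONG_PAUSE_THRESHOLD_MS
                || totalMs > LONG_GC_THRESHOLD_MS;
    }
}
```

Under this model a collection with a 3ms pause and 40ms total run time stays silent, which is why the absence of log lines on ART does not prove that no GC took place.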

At first glance, the GC situation seems to be improving compared to Dalvik. In my limited tests the GC ran less often, with similar or shorter pause times. The decision not to log every GC run (even if there was a short pause) is a bit strange, though.

Stability

In general use, ART seems quite stable. Using development tools like the allocation tracker increases the chance of crashing it, though. I sometimes managed to trigger a device reboot while playing with the tools. But that’s okay for now, as ART is still experimental.

You might also encounter issues with installed apps from the Play Store. The most popular example is WhatsApp which crashes using ART. There is a really good description of this particular case by Ian Rogers.

Summary

So what about speed? Everyone is talking about the performance improvements of ART. I skipped that part intentionally. As Matthias pointed out in earlier posts, ART is running in a very defensive mode. It’s beta. And performance measurements are hard to get right anyhow. I have already said too much about GC performance, and even that might not be accurate. I’ll leave this field to others.

From a developer’s perspective it’s good to see that Google cares about our tools. We’ll most likely not lose anything, and there are already more ATRACE tags inside ART than in Dalvik, which show up in systrace. A probably better GC will help reduce UI glitches, so I’m looking forward to getting an even better ART as the default runtime. One caveat to keep in mind: I haven’t touched RenderScript or JNI in my tests so far.
