The OAT of ART

This is the second part of a small series of articles about the newest Android version, KitKat, that I wrote at inovex GmbH. In this part we will take a closer look at OAT, the new file format of the ART runtime, and have a brief look at the garbage collection.

Digging deeper: OAT file analysis

So far we found out that the system runs a kind of compilation on the device itself. It converts not only the apps but also huge parts of the Android framework to OAT files. In this post we’ll try to figure out what these OAT thingies are and how they are executed.

As mentioned before, all installed apps run through the dex2oat compilation. Now let’s have a closer look at the resulting files.

So let’s adb pull one of the dex2oat results, for example the converted result of the SystemUI.apk:

The handy file command returns:

Wow... that escalated quickly! With ART we go from Java to class to dex to oat, which is a shared object!

Further analysis with objdump shows the following:

Only three symbols are defined: metadata, execution start and execution end.
Obviously the new runtime handles apps as shared objects (!) which are dynamically loaded into the VM context (which is very likely to be the previously explained boot image). A look into the sources reveals that they actually use dlopen() to load the libraries at runtime.

Now let’s use the new oatdump to gain some more knowledge about the OAT file format. My first attempt was to run oatdump on the boot-image file in /data/dalvik-cache/, but it turned out that a whole dump of this file is about 1.6 GB in size, which is quite unwieldy for the kind of analysis I’m trying to do.

Therefore, I wrote a small app with almost no functionality, simple enough to understand how this OAT thing works. Here is the source code:

After installation, we can pull its compiled version to our host and run objdump on it:

No surprise so far… it’s obviously an App with very little functionality that fits into 0x238 bytes. So let’s use oatdump on that file:

The header tells us something about the content of that file: the architecture, some integrity-check values, and some addresses, which are presumably used to relocate the library correctly. But the interesting part is in the body of the dump output: method names, dex code and the disassembled ARM code of each method.
For example, the oatdump output of our foo method is the following:

The DEX CODE part is quite obvious: int a in our Java source code maps to the virtual register v2, we add the constant 4711, store the result in v0 and return it. The OAT DATA part is not yet fully understood, but apparently the core_spill_mask describes the registers that were used in that method on the ARM to pass the data, and the vmap_table shows which virtual registers map to which real ones.
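To get a feeling for such a mask value, here is a minimal sketch that turns a mask into register names. It rests on an assumption of mine that the dump itself does not confirm: that core_spill_mask is a plain bitmask in which bit n stands for the ARM core register rN. The class SpillMaskDecoder and the example value 0x40e0 are made up for illustration and are not taken from ART or from the dump.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of oatdump or ART.
class SpillMaskDecoder {

    // Assumption: bit n of the mask stands for ARM core register rN (r0..r15).
    static List<String> decode(int mask) {
        List<String> registers = new ArrayList<>();
        for (int bit = 0; bit < 16; bit++) {
            if ((mask & (1 << bit)) != 0) {
                registers.add("r" + bit);
            }
        }
        return registers;
    }

    public static void main(String[] args) {
        // 0x40e0 is an invented example value with bits 5, 6, 7 and 14 set.
        System.out.println(decode(0x40e0)); // prints [r5, r6, r7, r14]
    }
}
```

If the assumption holds, the same decoding could be applied to whatever mask value oatdump prints for a method.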

The CODE section shows what the processor is actually going to execute: r2 obviously holds our int a at the beginning. After the creation of a new stack frame, the constant 4711 is added to our int a, and the result is passed back. No surprise, but impressive to see!
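In Java terms, the method behind this dump does nothing more than the following. This is only a sketch that matches the description here (an int parameter a, the constant 4711, the sum returned); the class name FooSketch and the choice of an instance method are my assumptions, not the original listing.

```java
// Hypothetical reconstruction; class name and method modifiers are assumptions.
class FooSketch {

    // Assumed to be an instance method, which would fit int a landing in v2 / r2.
    int foo(int a) {
        return a + 4711;
    }

    public static void main(String[] args) {
        System.out.println(new FooSketch().foo(1)); // prints 4712
    }
}
```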

It also reveals that there is almost no optimisation. It is more like a gcc -O0. Obviously the whole computation can be done with one single instruction:
adds.w r0, r2, #4711. Not even a new stack frame is needed to do this.

So let’s summarize what OAT files are: precompiled versions of APK files that are loaded into the running process like a shared-object library. They contain every method of every class in the APK, along with the method names and descriptions and an offset table to locate the methods within the binary.

Take care of your heap: GC on the ART

The ART garbage collection is quite similar to Dalvik’s. Both use a mark-and-sweep approach to keep the heap clean. That is surprising at first glance, but actually quite understandable.

We never lose the traceability of our allocations on the way from Java to class to dex to machine code. Although the way the code is executed has changed, the data structures and the references between the objects are still the same, and therefore the GC process can be performed in the same way as on Dalvik.

A quick look into the sources under art/runtime/gc reveals that they use 4 different types of GC runs. All of them may run in parallel, and they are listed with an increasing chance of freeing heap space:

The GC loops through them until enough space is available to allocate the desired memory:

If this procedure fails, the system tries harder, for example by enlarging the heap space. Overall, though, this is a standard procedure and nothing new or exceptionally different from Dalvik’s GC, or at least I didn’t find anything.
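The escalation described here can be sketched as a toy model. This is not the ART implementation: the class ToyHeap, the three level names and the byte counts are all hypothetical, and a real runtime also has growth limits and can still end in an OutOfMemoryError. The sketch only shows the control flow: try to allocate, run ever more thorough collections until the request fits, and grow the heap as a last resort.

```java
// Toy model of the allocation path; every name and number here is hypothetical.
class ToyHeap {

    // Placeholder GC levels, ordered from cheapest to most thorough.
    enum GcLevel { LIGHT, MEDIUM, THOROUGH }

    private int capacity;
    private int used;
    private final int[] reclaimable; // bytes each level can free, indexed by ordinal

    ToyHeap(int capacity, int used, int[] reclaimable) {
        this.capacity = capacity;
        this.used = used;
        this.reclaimable = reclaimable.clone();
    }

    private boolean fits(int size) {
        return used + size <= capacity;
    }

    // Simulates one GC run: frees whatever this level can still reclaim.
    private void collect(GcLevel level) {
        used -= reclaimable[level.ordinal()];
        reclaimable[level.ordinal()] = 0;
    }

    // Returns the step that finally made the allocation succeed.
    String allocate(int size) {
        if (fits(size)) {
            used += size;
            return "NO_GC";
        }
        for (GcLevel level : GcLevel.values()) {
            collect(level);
            if (fits(size)) {
                used += size;
                return level.name();
            }
        }
        // Last resort in this toy model: enlarge the heap until the request fits.
        capacity = used + size;
        used += size;
        return "GROW";
    }

    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap(100, 90, new int[] {5, 20, 50});
        System.out.println(heap.allocate(10));  // prints NO_GC
        System.out.println(heap.allocate(10));  // prints MEDIUM
        System.out.println(heap.allocate(100)); // prints GROW
    }
}
```

Ordering the levels from cheap to thorough means that a small allocation rarely pays for a full collection, which is presumably why the runs are tried with an increasing chance of freeing heap space.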
