Migrating an embedded Android setup: HAL (Part 3)

In the previous two articles we covered the Linux kernel issues that arise when porting our embedded setup to a new Android version. Now we finally arrive in the mystical world of Android. As that part is straightforward, we’ll continue right to the hardware abstraction layer (HAL).

Am I dreaming or did everything really go as intended?

I started by integrating the sensor, because I wrote that part myself and knew how to do it. For the most part it was just like what I had done for the Pandaboard.

I think the most complicated part of the sensor integration was to find the right directory to add the HAL changes …

So here I was with a fully functional ultrasonic distance sensor, used as a common proximity device. How complicated could it be to port the Android implementations for the Grove-LCD? At least it looked like there was a lot of documentation available …

As you will see, more than I could imagine.

Ultrasonic Sensor

Indeed, I was.

In contrast to the sensor integration, there was no existing infrastructure we could use for our custom device. So we needed to implement everything on our own, starting at the driver level you already read about and going all the way up to an interface that lets application developers use our LCD as easily as they use the sensor.

That means we needed to create a Hardware Abstraction Layer (HAL) as a layer between the driver and a custom system service. This system service can be used to control the display. But because the service is not directly reachable for app developers, we needed a way to distribute our custom hardware interface to them. Fortunately, Android provides a mechanism to do just that: SDK add-ons. Reusing more or less of the old source code, I began.

First, I added the HAL as the layer between the driver and Android. Due to some bigger changes in the driver’s interface, there were a lot of details to pay attention to. Because we wanted to publish our work anyway, there was no need to put nearly all of the driver logic into the HAL (in contrast to the kernel’s license, the HAL’s license allows closed-source blobs).

So we could benefit from implementing the driver logic in kernel space, for example through better performance and much more comfortable and powerful interfaces. The principles of how this layer works, however, did not change compared to the old one.

It sounds very simple so far, but when I tried to test it all after porting the other parts, nothing worked. And if you know a bit about Android, you may agree that debugging a problem you cannot really localize is much more tedious than debugging in the kernel.

As I said, I had made further changes, namely implementing a whole custom system service, before I was able to test anything in Android. That meant the problem could lie in any one of many modified files, and I didn’t even know whether I was looking for one issue or several. A very long, frustrating period of debugging began until we identified the HAL, more precisely our lcd1313M1hw.imx6.so, as the faulty part. Android was not able to load this shared library, so there was no communication with the driver.
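In hindsight, a small standalone check could have narrowed this down earlier. Android’s libhardware loads a HAL library with dlopen() and then looks up the module symbol named "HMI" (HAL_MODULE_INFO_SYM_AS_STR in hardware/hardware.h). The following is only a minimal sketch of that idea, not the tool we actually used: the function name check_hal_library and its return codes are made up for illustration, and you would have to build it for the target and pass the real on-device path of the library.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Symbol name that libhardware looks up in every HAL library
 * (HAL_MODULE_INFO_SYM_AS_STR in hardware/hardware.h). */
#define HAL_SYMBOL "HMI"

/* Hypothetical helper: tries to load a HAL library the way a loader would.
 * Returns 0 on success, -1 if dlopen() fails, -2 if the symbol is missing. */
int check_hal_library(const char *path)
{
    /* RTLD_NOW forces all relocations to be resolved immediately,
     * so relocation problems show up here and not later. */
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlopen(%s) failed: %s\n", path,
                err ? err : "unknown error");
        return -1;
    }

    dlerror(); /* clear any stale error state before dlsym() */
    void *module = dlsym(handle, HAL_SYMBOL);
    if (module == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlsym(%s) failed: %s\n", HAL_SYMBOL,
                err ? err : "symbol resolved to NULL");
        dlclose(handle);
        return -2;
    }

    dlclose(handle);
    return 0;
}
```

Calling check_hal_library() with the path of the HAL library prints the dynamic linker’s own error message, which is usually more specific than a service that silently does nothing. On older glibc systems the program has to be linked with -ldl.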

But what had gone wrong? We scanned the source code looking for errors but did not find a single one. We compared the code to that of other loadable .so files, but everything looked as it should. So we finally inspected various .so files with “objdump” to see if something had failed at build time. Everything looked correct at first, but when we compared its output to the output of other, working .so files, we noticed this line:

Whereas in the working .so file it looked like this:

This led us to a small, hard-to-find line of code. The struct hw_module_methods_t was marked as const, which caused the difference in the .so. In the old showcase the “const” had supposedly worked, my colleagues assured me. But with Android 6 and the Wandboard it didn’t. I inspected some old and new Android source code of similar .so files loaded at the same time as ours, but I never saw a “const” in this place.
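To make the difference concrete, here is a minimal sketch of such a module definition. It is not our actual HAL source: the structs are heavily simplified stand-ins for the ones in hardware/hardware.h (the real hw_module_t has more fields), and lcd_open, lcd_module_methods and the id string are hypothetical names chosen for illustration. The only point is where the “const” would sit.

```c
#include <stddef.h>

/* Simplified stand-ins for the types declared in hardware/hardware.h. */
struct hw_module_t;
struct hw_device_t;

typedef struct hw_module_methods_t {
    int (*open)(const struct hw_module_t *module, const char *id,
                struct hw_device_t **device);
} hw_module_methods_t;

typedef struct hw_module_t {
    const char *id;
    const char *name;
    struct hw_module_methods_t *methods; /* non-const pointer, as in AOSP */
} hw_module_t;

/* Hypothetical open callback; a real HAL would allocate a device here. */
static int lcd_open(const struct hw_module_t *module, const char *id,
                    struct hw_device_t **device)
{
    (void)module;
    (void)id;
    *device = NULL;
    return 0;
}

/* The variant that did not load for us was declared like this:
 *
 *     static const struct hw_module_methods_t lcd_module_methods = { ... };
 *
 * Dropping the const keeps the table in ordinary writable data. */
static struct hw_module_methods_t lcd_module_methods = {
    .open = lcd_open,
};

/* In a real HAL this symbol is spelled HAL_MODULE_INFO_SYM,
 * which expands to HMI. */
struct hw_module_t HMI = {
    .id = "lcd1313M1hw",            /* assumed id, derived from the .so name */
    .name = "Grove-LCD HAL (sketch)",
    .methods = &lcd_module_methods,
};
```

Because the methods pointer in hw_module_t is itself non-const, a const methods table also needs a cast or produces a compiler warning, which may be another reason why the AOSP modules I looked at did not use it.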

Android no longer compiles for the old Marsboard (and I do not want to touch this horrible board ever again), so I cannot determine whether the “const” ever worked there, but it is possible. If it did work correctly, we suspect that the behavior changed due to improvements in address space layout randomization, which do not cooperate well with “const” or, more precisely, with “.data.rel.ro.local”.

Read on

Now that we have debugged the HAL, we’re ready to tackle the Android framework in our next article. In the meantime, have a look at our Smart Devices division and learn how you can join us as a software developer.
