10/08/2015

Full TrustZone exploit for MSM8974

In this blog post, we'll cover the complete process of exploiting the TrustZone vulnerability described in the previous post. If you haven't read it already, please do!

Responsible Disclosure

First of all, I'd like to point out that I've responsibly disclosed this vulnerability to Qualcomm, and the issue has already been fixed (see "Timeline" below).

I'd also like to take this opportunity to point out that Qualcomm did an amazing job in both responding to the disclosure amazingly fast and by being very keen to fix the issue as soon as possible.

They've also gifted me a brand new (at the time) Moto X 2014, which will be the subject of many posts later on (going much more in depth into TrustZone's architecture and other security components on the device).


Patient Zero

While developing this exploit, I only had my trusty (personal) Nexus 5 device to work with. This means that all memory addresses and other specific information written below is taken from that device.

In case anyone wants to recreate the exact research described below, or for any other reason, the exact version of my device at the time was:

google/hammerhead/hammerhead:4.4.4/KTU84P/1227136:user/release-keys

With that out of the way, let's get right to it!

The vulnerability primitive

If you read the previous post, you already know that the vulnerability allows the attacker to cause the TrustZone kernel to write a zero DWORD to any address in the TrustZone kernel's virtual address space.

Zero write primitives are, drawing on personal experience, not very fun to work with. They are generally quite limited, and don't always lead to exploitable conditions. In order to create a robust exploit using such a primitive, the first course of action would be to attempt to leverage this weak primitive into a stronger one.

Crafting an arbitrary write primitive

Since the TrustZone kernel is loaded at a known physical address, this means that all of the addresses are already known in advance, and do not need to be discovered upon execution.

However, the internal data structures and state of the TrustZone kernel are largely unknown and subject to change due to the many different processes interacting with the TrustZone kernel (from external interrupts, to "Secure World" applications, etc.).

Moreover, the TrustZone code segments are mapped with read-only access permissions, and are verified during the secure boot process. This means that once TrustZone's code is loaded into memory, it theoretically cannot (and should not) be subject to any change.

TrustZone memory mappings and permissions

So that said - how can we leverage a zero write primitive to enable full code execution?

We could try and edit any modifiable data (such as the heap, the stack or perhaps globals) within the TrustZone kernel, which might allow us to create a stepping stone for a better primitive.

As we've mentioned in the previous blog post, normally, when an SCM command is called, any argument which is a pointer to memory, is validated by the TrustZone kernel. The validation is done in order to make sure the physical address is within an "allowed" range, and isn't for example, within the TrustZone kernel's used memory ranges.

These validations sound like a prime candidate for us to look into, since if we were able to disable their operation, we'd be able to leverage other SCM calls in order to create different kinds of primitives.

TrustZone memory validation

Let's start by giving the memory validation function a name - from now on, we'll call it "tzbsp_validate_memory".

Here's a decompilation of the function:

The function actually calls two internal functions to perform the validation, which we'll call "is_disallowed_range" and "is_allowed_range", respectively.

is_disallowed_range


As you can see, the function actually uses the first 12 bits of the given address in the following way:
  • The upper 7 bits are used as an index into a table, containing 128 values, each 32-bit wide.
  • The lower 5 bits are used as the bit index to be checked within the 32-bit entry which is present at the previously indexed location.


In other words, for each 1MB chunk that intersects the region of memory to be validated, there exists a bit in the aforementioned table which is used to denote whether or not this region of data is "disallowed" or not. If any chunk within the given region is disallowed, the function returns a value indicating as such. Otherwise, the function treats the given memory region as valid.

is_allowed_range


Although a little longer, this function is also quite simple. Essentially, it simply goes over a static array containing entries with the following structure:



The function iterates over each of the entries in the table which resides at the given memory address, stopping when the "end_marker" field for the current entry is 0xFFFFFFFF.

Each range specified by such an entry, is validated against to make sure that the memory range is allowed. However, as evidenced in the decompilation above, entries in which the "flags" fields' second bit is set, are skipped!


Attacking the validation functions

Now that we understand how the validation functions operate, let's see how we can use the zero write primitive in order to disable their operation.

First, as described above, the "is_disallowed_range" function uses a table of 32-bit entries, where each bit corresponds to a 1MB block of memory. Bits which are set to one represent disallowed blocks, and zero bits represent allowed blocks.

This means that we can easily neutralise this function by simply using the zero write primitive to set all the entries in the table to zero. In doing so, all blocks of memory will now be marked as allowed.

Moving on to the next function; "is_allowed_range". This one is a little tricky - as mentioned above, blocks in which the second bit in the flags field is set, are validated against the given address. However, for each block in which this bit is not set, no validation is performed, and the block is skipped over.

Since in the block table present in the device, only the first range is relevant to the memory ranges which reside within the TrustZone kernel's memory range, we only need to zero out this field. Doing so will cause it to be skipped over by the validation function, and, as a result, the validation function will accept memory addresses within the TrustZone kernel as valid.

Back to crafting a write primitive

So now that we've gotten rid of the bounds check functions, we can freely supply any
memory address as an argument for an SCM call, and it will be operated upon without any obstacle.

But are we any closer to creating a write primitive? Ideally, had there been an SCM call where we could control a chunk of data which is written to a controlled location, that would have sufficed.

Unfortunately, after going over all of the SCM calls, it appears that there are no candidates which match this description.

Nevertheless, there's no need to worry! What cannot be achieved with a single SCM call, may be possible to achieve by stringing a few calls together. Logically, we can split the creation of an arbitrary write primitive into the following steps:
  • Create an uncontrolled piece of data at a controlled location
  • Control the created piece of data so that it actually contains the wanted content
  • Copy the created data to the target location
Create

Although none of the SCM calls seem to be good candidates in order to create a controlled piece of data, there is one call which can be used to create an uncontrolled piece of data at a controlled location - "tzbsp_prng_getdata_syscall".

This function, as its name implies, can be used to generate a buffer of random bytes at a given location. It is generally used by Android is order to harness the hardware PRNG which is present in Snapdragon SoCs.

In any case, the SCM call receives two arguments; the output address, and the output length (in bytes).

On the one hand, this is great - if we (somewhat) trust the hardware RNG, we can be pretty sure that for each byte we generate using this call, the entire range of byte values is possible as an output. On the other hand, this means that we have no control whatsoever on what data is actually going to be generated.

Control

Even though any output is possible when using the PRNG, perhaps there is some way in which we could be able to verify that the generated data is actually the data that we wish to write.

In order to do so, let's think of the following game - imagine you have a slot machine with four slots, each with 256 possible values. Each time you pull the lever, all the slots rotate simultaneously, and a random output is presented. How many times would you need to pull the lever in order for the outcome to perfectly match a value that you picked beforehand? Well, there are 4294967296 (2^32) possible values, so each time you pull the lever, there's a chance of about 10^(-10) that the result would match your wanted outcome. Sounds like you're going to be here for a while...


But what if you could cheat? For example, what if you had a different lever for each slot? That way you can only change the value of a single slot with each pull. This means that now for each time the lever is pulled, there's a chance of 1/256 that the outcome will match the desired value for that slot.


Sounds like the game is much easier now, right? But how much easier? In probability theory this kind of distribution for a single "game" is called a Bernoulli Distribution, and is actually just a fancy way of saying that each experiment has a set probability of success, denoted p, and all other outcomes are marked all failures, and have a probability of 1-p of occurring.

Assuming we would like a 90% chance of success, it turns out that the in original version of the game we would require approximately 10^8 attempts (!), but if we cheat, instead, we would only require approximately 590 attempts per slot, which is several orders of magnitude less.

So have you figured out how this all relates to our write primitive yet? Here it goes:

First, we need to find an SCM call which returns a value from a writeable memory location within the TrustZone kernel's memory, to the caller.

There are many such functions. One such candidate is the "tzbsp_fver_get_version" call. This function can be used by the "Normal World" in order to retrieve internal version numbers of different TrustZone components. It does so by receiving an integer denoting the component whose version should be retrieved, and an address to which the version code should be written. Then, the function simply goes over a static array of pairs containing the component ID, and the version code. When a component with the given ID is found, the version code is written to the output address.


tzbsp_fver_get_version internal array

Now, using the "tzbsp_prng_getdata_syscall" function, we can start manipulating any version code's value, one byte at a time. In order to know the value of the byte that we've generated at each iteration, we can simply call the aforementioned SCM, while passing in the component ID matching the component whose version code we are modifying, and supplying a return address which points to a readable (that is, not in TrustZone) memory location.

We can repeat these first two steps until we are satisfied with the generated byte, before moving on to generate the next byte. This means that after a few iterations, we can be certain that the value of a specific version code matches our wanted DWORD.
 
Copy 

Finally, we would like to write the generated value to a controlled location. Luckily, this step is pretty straight-forward. All we need to do is simply call the "tzbsp_fver_get_version" SCM call, but now we can simply supply the target address as the return address argument. This will cause the function to write our generated DWORD to a controlled location, thus completing our write gadget.

Phew... What now?

From here on, things get a little easier. First, although we have a write primitive, it is still quite cumbersome to use. Perhaps it would be a little easier if we were able to create a simpler gadget using the previous one.

We can do this by creating our own SCM call, which is simply a write-what-where gadget. This may sound tricky, but it's actually pretty straight-forward.

In the previous blog post, we mentioned that all SCM calls are called indirectly via a large array containing, among other things, pointers to each of the SCM calls (along with the number of arguments they are provided, their name, etc.).

This means that we can use the write gadget we created previously in order to change the address of some SCM call which we deem to be "unimportant", to an address at which a write gadget already exists. Quickly going over the TrustZone kernel's code reveals that there are many such gadgets. Here's one example of such a gadget:


This piece of code will simply write the value in R0 to the address in R1, and return. Great.

Finally, it might also be handy to be able to read any memory location which is within the TrustZone kernel's virtual address space. This can be achieved by creating a read gadget, using the exact same method described above, in place of another "unimportant" SCM call. This gadget is actually quite a bit rarer than the write gadget. However, one such gadget was found within the TrustZone kernel:


This gadget returns the value read from the address in R0, with the offset R1. Awesome.

Writing new code

At this stage, we have full read-write access to the TrustZone kernel's memory. What we don't yet have, is the ability to execute arbitrary code within the TrustZone kernel. Of course, one might argue the we could find different gadgets within the kernel, and string those together to create any wanted effect. But this is quite tiring if done manually (we would need to find quite a few gadgets), and quite difficult to do automatically.

There are a few possible way to tackle this problem.

One possible angle of approach might be to write a piece of code in the "Normal World", and branch to it from the "Secure World". This sounds like an easy enough approach, but is actually much easier said than done.

As mentioned in the first blog post, when the processor in operating in secure mode, meaning the NS (Non-Secure) bit in the SCR (Secure Configuration Register) is turned off, it can only execute pages which are marked as "secure" in the translation table used by the MMU (that is, the NS bit is off).



This means that in order to execute our code chunk residing in the "Normal World" we would first have to modify the TrustZone kernel's translation table in order to map the address in which we've written our piece of code as secure.

While all this is possible, it is a little tiresome.

A different approach might be to write new code within the TrustZone kernel's code segments, or overwrite existing code. This also has the advantage of allowing us to modify existing behaviour in the kernel, which can also come in handy later on.

However, upon first glance this doesn't sound easier to accomplish than the previous approach. After all, the TrustZone kernel's code segments are mapped as read-only, and are certainly not writeable.

However, this is only a minor setback! This can actually be solved without modifying the translation table after all, by using a convenient feature of the ARM MMU called "domains".

In the ARM translation table, each entry has a field which lists its permissions, as well as a field denoting the "domain" to which the translation belongs. There are 16 domains, and each translation belongs to a single one of them.

Within the ARM MMU, there is a register called the DACR (Domain Access Control Register). This 32-bit register has 16 pairs of bits, one pair for each domain, which are used to specify whether faults for read access, write access, both, or neither, should be generated for translations of the given domain.


Whenever the processor attempts to access a given memory address, the MMU first checks if the access is possible using the access permissions of the given translation for that address. If the access is allowed, no fault is generated.

Otherwise, the MMU checks if the bits corresponding to the given domain in the DACR are set. If so, the fault is suppressed and the access is allowed.

This means that simply setting the DACR's value to 0xFFFFFFFF will actually cause the MMU to enable access to any mapped memory address, for both read and write access, without generating a fault (and more importantly, without having to modify the translation table).

But how can we set the DACR? Apparently, during the TrustZone kernel's initialization, it also explicitly sets the DACRs value to a predetermined value (0x55555555), like so:


However, we can simply branch to the next opcode in the initialization function, while supplying our own value in R0, thus causing the DACR to be set to our controlled value.

Now that the DACR is set, the path is all clear - we can simply write or overwrite code within the TrustZone kernel.

In order to make things a little easier (and less disruptive), it's probably better to write code at a location which is unused by the TrustZone kernel. One such candidate is a "code cave".

Code caves are simply areas (typically at the end allocated memory regions) which are unused (i.e., do not contain code), but are nonetheless mapped and valid. They are usually caused by the fact that memory mappings have a granularity, and therefore quite frequently there is internal fragmentation at the end of a mapped segment.

Within the TrustZone kernel there are several such code caves, which enable us to write small pieces of code within them and execute them, with minimal hassle.

Putting it all together

So this exploit was a little complex. Here's a run-down of  all the stages we had to complete:
  • Disable the memory validation functions using the zero write primitive
  • Craft a wanted DWORD at a controlled location using the TrustZone PRNG
  • Verify the crafted DWORD by reading the corresponding version code
  • Write the crafted version code to the location of a function pointer to an existing SCM call (by doing so creating a fast write gadget)
  • Use the fast write gadget to create a read gadget
  • Use the fast write gadget to write a function pointer to a gadget which enables us to modify the DACR
  • Modify the DACR to be fully enabled (0xFFFFFFFF)
  • Write code to a code cave within the TrustZone kernel
  • Execute! :)

The Code

I've written an exploit for this vulnerability, including all the needed symbols for the Nexus 5 (with the fingerprint stated beforehand).

First of all, in order to enable the exploit to send the needed crafted SCM calls to the TrustZone kernel, I've created a patched version of the msm-hammerhead kernel which adds such functionality and exposes it to user-space Android.

I've chosen to do this by adding some new IOCTLs to an existing driver, QSEECOM (mentioned in the first blog post), which is a Qualcomm driver used to interface with the TrustZone kernel. These IOCTLs enable the caller to send a "raw" SCM call (either regular, or atomic) to the TrustZone kernel, containing any arbitrary data.

You can find the needed kernel modifications here.

For those of you using a Nexus 5 device, I personally recommend following Marcin Jabrzyk's great tutorial - here (it's a full tutorial describing how to compile and boot a custom kernel without flashing it to the device).

After booting the device with a modified kernel, you'll need a user-space application which can use the newly added IOCTLs in order to send SCMs to the kernel.

I've written such an application which you can get it here.

Finally, the exploit itself is written in python. It uses the user-space application to send SCM calls via the custom kernel directly to the TrustZone kernel, and allows execution of any arbitrary code within the kernel.

You can find the full exploit's code here.

Using the exploit

Using the exploit is pretty straight forward. Here's what you have to do:
  • Boot the device using the modified kernel (see Marcin's tutorial)
  • Compile the FuzzZone binary and place it under /data/local/tmp/
  • Write any ARM code within the shellcode.S file
  • Execute the build_shellcode.sh script in order to create a shellcode binary
  • Execute exploit.py to run your code within the TrustZone kernel

Affected Devices

At the time of disclosure, this vulnerability affected all devices with the MSM8974 SoC. I created a script to statically check the ROMs of many such devices before reporting the vulnerability, and found that the following devices were vulnerable:

Note: This vulnerability has since been fixed by Qualcomm, and therefore should not affect updated devices currently. Also, please note that the following is not an exhaustive list, by any measure. It's simply the result of my static analysis at the time.

 -Samsung Galaxy S5
 -Samsung Galaxy S5
 -Samsung Galaxy Note III
 -Samsung Galaxy S4 
 -Samsung Galaxy Tab Pro 10.1
 -Samsung Galaxy Note Pro 12.2
 -HTC One
 -LG G3
 -LG G2
 -LG G Flex 
 -Sony Xperia Z3 Compact 
 -Sony Xperia Z2 
 -Sony Xperia Z Ultra 
 -Samsung Galaxy S5 Active
 -Samsung Galaxy S5 TD-LTE
 -Samsung Galaxy S5 Sport
 -HTC One (E8)
 -Oneplus One
 -Acer Liquid S2
 -Asus PadFone Infinity
 -Gionee ELIFE E7
 -Sony Xperia Z1 Compact
 -Sony Xperia Z1s
 -ZTE Nubia Z5s
 -Sharp Aquos Xx 302SH
 -Sharp Aquos Xx mini 303SH
 -LG G Pro 2
 -Samsung Galaxy J
 -Samsung Galaxy Note 10.1 2014 Edition (LTE variant)
 -Samsung Galaxy Note 3 (LTE variant)
 -Pantech Vega Secret UP
 -Pantech Vega Secret Note
 -Pantech Vega LTE-A
 -LG Optimus Vu 3
 -Lenovo Vibe Z LTE
 -Samsung Galaxy Tab Pro 8.4
 -Samsung Galaxy Round
 -ZTE Grand S II LTE
 -Samsung Galaxy Tab S 8.4 LTE
 -Samsung Galaxy Tab S 10.5 LTE
 -Samsung Galaxy Tab Pro 10.1 LTE
 -Oppo Find 7 Qing Zhuang Ban
 -Vivo Xshoot Elite
 -IUNI U3
 -Hisense X1
 -Hisense X9T Pantech Vega Iron 2 (A910)
 -Vivo Xplay 3S
 -ZTE Nubia Z5S LTE
 -Sony Xperia Z2 Tablet (LTE variant)
 -Oppo Find 7a International Edition
 -Sharp Aquos Xx304SH
 -Sony Xperia ZL2 SOL25
 -Sony Xperia Z2a
 -Coolpad 8971
 -Sharp Aquos Zeta SH-04F
 -Asus PadFone S
 -Lenovo K920 TD-LTE (China Mobile version)
 -Gionee ELIFE E7L
 -Oppo Find 7
 -ZTE Nubia X6 TD-LTE 128 GB
 -Vivo Xshot Ultimate
 -LG Isai FL
 -ZTE Nubia Z7
 -ZTE Nubia Z7 Max
 -Xiaomi Mi 4
 -InFocus M810

Timeline
  • 19.09.14 - Vulnerability disclosed
  • 19.09.14 - Initial response from QC
  • 22.09.14 - Issue confirmed by QC
  • 01.10.14 - QC issues notice to customers
  • 16.10.14 - QC issues notice to carriers, request for 14 days of embargo
  • 30.10.14 - Embargo expires
I'd like to also point out that after reporting this issue to Qualcomm, I was informed that it has already been internally identified by them prior to my disclosure. However, these kinds of issues require quite a long period of time in order to push a fix, and therefore at the time of my research, the fix had not yet been deployed (at least, not to the best of my knowledge).

Last Words

I'd really like to hear some feedback from you, so please leave a comment below! Feel free to ask about anything.

20 comments:

  1. Great post, sightly hard to understand as i'm spanish but it denotes a deep knowledge of exploting. Contrats and keep on it!

    P.d. When releasing stagefright exploit? :P

    ReplyDelete
    Replies
    1. Thanks! Happy you enjoyed the post.

      The stagefright exploits (I've found quite a few) are coming soon, but the next post will be about a linux kernel exploit.

      Delete
  2. So by reading this (http://blog.azimuthsecurity.com/2013/04/unlocking-motorola-bootloader.html) and this post, to my understanding, if I just want to unlock the bootloader I can do that without going in to so much hassle by simply overwriting the "golbal_flag" value by your zero write primitive right?
    By the way can you publish anything you learned about qfuse structure on msm8974, and the script you wrote to check the bootloader for the exploit?
    Great post. Learned to much since I'm still getting in to this bootloader stuff :)

    Ps. I wish you would've published this a week ago. I bought a amazon fire phone and upgraded to the new firmware last week. I'm sure the old bootloader had the flow and new one doesn't :(

    ReplyDelete
    Replies
    1. I'm planning a much more detailed post about MSM8974 internals, but it'll have to wait a while since I want to first cover a few Android and Linux kernel vulnerabilities...

      Looking at the post you linked to, it refers to the MSM8960 architecture. That version of Snapdragon had a massive documentation leak, that lead to a pretty wide understanding of the QFuses structure. Unfortunately, there's been no such leak for MSM8974, so while some comparisons might be drawn between them, there's no guarantee that other things haven't changed. Since I only have my personal device, I'm not willing to blow QFuses until I'm 100% about their purpose.

      I have done some research into unlocking the Moto X 2014 (which is an MSM8974 device), which I will definitely write about soon, but it's still incomplete.

      Anyway, glad you enjoyed the post! Best of luck with the Fire phone :) Hope the next TrustZone posts help.

      Delete
    2. Great.. Keep up the good work. By the way in the post I linked, he detects the QFuse by analysing aboot image. Not sure the same applies here though.

      Delete
  3. This comment has been removed by the author.

    ReplyDelete
  4. Hey,
    This was an interesting read for sure. You mentioned that the S4 was vulnerable but when examining the tz image for the s4, the address don't match. I was wondering if you had figured out the addresses for the s4 as well. I gained a fair bit of knowledge reading the posts before and the post following this post as well.

    Keep up the great work!

    ReplyDelete
  5. 0x12 -> 0xC in your memory object struct

    ReplyDelete
  6. I thourougly enjoyed this article. i wish the apq8084 (droid turbo)were researched and exploited in this manner. Thank you for your wor and especially for posting so others can learn from it.

    ReplyDelete
    Replies
    1. Thank you! As for APQ8084, wait three weeks, I have a surprise :)

      Delete
  7. AWESOME!!! sounds like Christmas will be good this year.

    ReplyDelete
  8. This comment has been removed by the author.

    ReplyDelete
  9. Why is responsible disclosure necessary here? From what I can tell, this exploit only enables the owner of a device to get full access to the device, which is their right anyway. Why would you tell the manufacturer?

    ReplyDelete
    Replies
    1. Because this can also be used to extract credit card details, create a completely silent RAT, extract fingerprint information and the list goes on...

      Delete
  10. This comment has been removed by the author.

    ReplyDelete
  11. laginimaineb also on MSM 8916 there is a function called

    tz_service <0x3F802, aTzbsp_oem_svc, 0xF, 0x86500ECB, 1> ; "tzbsp_oem_svc"

    which doesnt check the ranges and write 3 dwords to an arbitrary address

    int __fastcall tzbsp_oem_svc(int a1, int a2)
    {
    int v2; // r4@1

    v2 = a2;
    *(_DWORD *)a2 = 0xC;
    *(_DWORD *)(a2 + 4) = get_tzbsp_params(); =>returns 0x0F
    *(_DWORD *)(v2 + 8) = sub_865164FE(); =>returns 0x0
    return 0;
    }

    this can be used to neutralise range check. Contact me on hack3r2k@hack3r2k.com if interested

    ReplyDelete
    Replies
    1. Interesting! MSM8916 is the SoC in the very cheap Moto E 2nd gen., right?

      https://wiki.cyanogenmod.org/w/Surnia_Info

      Delete
    2. Hi Daniel,

      Not sure for the Moto but still this function could be manufacturer related. The phone im researching is an Alcatel 5042

      Delete
  12. Hi,

    Im having an issue after exploiting the trustzone on my phone. Im not able to read the modem ram. It should be mapped somewhere at 0xC0XXXXXX but when i attempt to read the region simply everything hangs. Also the display stops working when exploiting the tz. It still unclear why. Everything else seems to work fine thru.

    ReplyDelete