INSIGHTS | May 16, 2017

#WannaCry: Examining Weaponized Malware

Attribution: You Keep Using That Word, I Do Not Think It Means What You Think It Means…

In internal discussions in the virtual halls of IOActive this morning, there was much talk about the collective industry’s rush to blame or attribution over the recent WanaCry/WannaCrypt ransomware outbreaks. Twitter lit up on #WannaCry and #WannaCrypt, and even Microsoft got into the action, stating, “We need governments to consider the damage to civilians that comes from hoarding these vulnerabilities and the use of these exploits.”

Opinions for blame and attribution spanned the entire spectrum of response, from the relatively sane…

…to the sublimely sarcastic.

As a community, we can talk and debate who did what, and why, but in the end it really doesn’t matter. None (well, almost none) of us outside the government or intelligence communities have any impact on the discussion of attribution. Even for governments, reliable attribution ranges from hard to nearly impossible, and worse, it is now politicized and drawn out in the court of public opinion. The digital ink on malware or Internet attacks is hardly even dry, yet experts are already calling out “Colonel Mustard, lead pipe, in the study” before we even know what the malware is fully capable of doing. We insist on having hyperbolic discussions where we wax poetic about the virtues of the NSA vs. Microsoft vs. state actors.
 

It’s more important to focus on the facts and what can be observed in the behavioral characteristics of the malware and what organizations need to do to prevent infection now and in the future. 

How people classify them varies, but there are essentially three different classes of weaponized malware:

  • Semi-automatic/automatic kits that exploit “all the things.” These are the Confickers, Code Reds, SQL Slammers, and Melissas of the world
  • Manual/point targeted kits that exploit “one thing at a time.” These are the types of kit that Shadow Brokers dropped. Think of these as black market, crew-served weapons, such as MANPADS
  • Automatic point target exploit kits that exploit based on specific target telemetry AND are remotely controllable in flight. This includes Stuxnet. Think of these as the modern cyber equivalent of cruise missiles

Nation-state toolkits are typically elegant. As we know, ETERNALBLUE was part of a greater framework/toolkit. Whoever made WannaCrypt/Cry deconstructed a well-written (by all accounts thus far), complex mechanism built for point-target use and made a blunt-force weapon out of part of it. Of the three types above, nation states moved on from the first over a decade ago, because such kits are not controllable and do not meet the clandestine requirements of today’s operators. Nation states typically prefer type two; however, this requires bi-directional, fully routed IP connectivity to function correctly. When you cannot reach the network or asset in question, type three is your only option. In that instance, you build the targeting telemetry into the weapon for the mission and send it on its way. This requires a massive amount of upfront HUMINT and SIGINT for targeting. As you can imagine, type-three weaponized malware is massive both in size and in sunk cost.

WannaCry/WanaCrypt is certainly NOT type two or three, and it appears that corners were cut in creating the malware. The community was very quick to actively reverse the package, and it doesn’t appear that any major anti-reversing or anti-tampering methods were used. Toss in the well-publicized and rudimentary “kill switch” component, and this appears almost sloppy and lacking in conviction. I can think of at least a dozen more elegant command-and-control functions it could have implemented to leave control in the hands of the malware author. Anyone with reverse engineering skills would eventually find this “kill switch” and disable it by using a hex editor to modify a JMP instruction. Compare this to Conficker, which had password brute-forcing capabilities as well as the ability to pivot after installation and infect other hosts without the use of exploits, simply by logging in with the passwords it had identified.
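The kill-switch logic itself is simple enough to sketch. The sketch below is a hypothetical reconstruction of the control flow only; the domain is a placeholder, not the actual hardcoded beacon, and `probe` stands in for whatever HTTP check the sample performs:

```python
import urllib.request

# Placeholder domain for illustration; the real sample hardcoded a specific one.
KILL_SWITCH_DOMAIN = "http://example-kill-switch-domain.test/"

def kill_switch_active(probe=urllib.request.urlopen):
    """Return True if the beacon domain answers, i.e. the malware stands down."""
    try:
        probe(KILL_SWITCH_DOMAIN, timeout=5)
        return True      # domain resolves and responds -> halt
    except Exception:
        return False     # unreachable -> payload would proceed
```

The defensive takeaway is that registering (sinkholing) the domain flips this check for every unpatched sample still carrying the original logic, which is exactly why the mechanism reads as sloppy: one JMP patch, or one NXDOMAIN becoming a live host, changes the malware’s behavior globally.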

This doesn’t mean that WannaCry/WanaCrypt is not dangerous. On the contrary, depending upon the data impacted, its consequences could be devastating. For example, impacting the safety builder controlling Safety Instrumented Systems, or locking operators out of Human Machine Interfaces (HMIs, the computers used in industrial control environments), could lead to dangerous process failures. Likewise, loss of regulatory data that exists in environmental control systems, quality systems, historians, or other critical ICS assets could open a facility up to regulatory action. Critical infrastructure asset owners typically have horrific patch cycles with equally appalling backup and disaster recovery strategies. And if businesses are hit with this attack and lose critical data, it may open the door to legal action for failure to exercise due care and diligence in protecting these systems. It’s clear this ransomware is going to be a major pain for quite some time. Asset owners everywhere should take due care and adopt preventative strategies to keep their operations up and running in the safest and most secure manner possible.

It really doesn’t do much good to philosophically discuss attribution, or to play what a recent hashtag calls the #smbBlameGame. It’s relatively clear that this is amateur hour in the cybercrime space. With a lot of people panicking about this being the “next cyber cruise missile” or equivalent, I submit that this is more akin to digital malaria.

RESEARCH | September 22, 2015

Is Stegomalware in Google Play a Real Threat?

For several decades, the science of steganography has been used to hide malicious code (useful in intrusions) or to create covert channels (useful in information leakage). Nowadays, steganography can be applied to almost any logical/physical medium (file formats, images, audio, video, text, protocols, programming languages, file systems, BIOS, etc.). If the steganographic algorithms are well designed, the hidden information is really difficult to detect. Detecting hidden information, malicious or not, is so complex that the study of steganalytic algorithms (detection) has been growing. You can see the growth in scientific publications (source: Google Scholar) and research investment by governments and institutions.
In fact, since the attacks of September 11, 2001, there has been a lot of discussion about the possibility of terrorists using this technology.
In this post, I would like to illustrate steganography’s ability to hide data in Android applications. In this experiment, I focus on Android applications published in Google Play, leaving aside alternative markets with lower security measures, where it is easier to introduce malicious code.
 
 
Is it possible to hide information on Google Play or in the Android apps released in it?
 
The answer is easy: YES! Simple techniques have been documented, from hiding malware by renaming the file extension (Android/tr DroidCoupon.A – 2011, Android/tr SmsZombie.A – 2012, Android/tr Gamex.A – 2013) to more sophisticated procedures (AngeCryption – Black Hat Europe, October 2014).
 
Let me show some examples in more depth:
 
 
Google Play Web (https://play.google.com)
 
Google Play includes a webpage for each app with information such as a title, images, and a text description. Each piece of information could conceal data using steganography (linguistic steganography, image steganography, etc.). In fact, I am going to “work” with digital images and demonstrate how Google “works” when there is hidden information inside of files.
 
To do this, I will use two known steganographic techniques: adding information to the end of file (EOF) and hiding information in the least significant bit (LSB) of each pixel of the image.
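Both techniques fit in a few lines of code. The sketch below (plain Python, no real image parsing beyond the PNG IEND trailer) appends a payload after the end of a PNG and hides a message bit-by-bit in a byte buffer standing in for pixel data; the payloads and buffers are illustrative, not taken from the uploaded images:

```python
# --- EOF technique: append data after the PNG IEND trailer ---
IEND = b"IEND\xaeB`\x82"  # final 8 bytes of every valid PNG

def eof_embed(png: bytes, payload: bytes) -> bytes:
    # Viewers stop rendering at IEND, so trailing bytes are ignored.
    return png + payload

def eof_extract(stego: bytes) -> bytes:
    return stego[stego.rindex(IEND) + len(IEND):]

# --- LSB technique: one payload bit per "pixel" byte ---
def lsb_embed(pixels: bytearray, payload: bytes) -> bytearray:
    bits = [(b >> i) & 1 for b in payload for i in range(7, -1, -1)]
    out = bytearray(pixels)
    assert len(bits) <= len(out), "payload too large for cover"
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit   # overwrite the least significant bit
    return out

def lsb_extract(pixels: bytearray, nbytes: int) -> bytes:
    bits = [p & 1 for p in pixels[:nbytes * 8]]
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, nbytes * 8, 8)
    )
```

The EOF variant survives any pipeline that copies the file verbatim; the LSB variant survives only if the pixels themselves are untouched, which is why image reprocessing (as Google does for JPEGs, below) destroys it.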
      
 
PNG Images
 
You can upload PNG images to play.google.com that hide information using EOF or LSB techniques. Google does not remove this information.
 
For example, I created a sample app (automatically generated – https://play.google.com/store/apps/details?id=com.wMyfirstbaskeballgame) and uploaded several images (which you can see on the web) with hidden messages. In one case, I used the OpenStego steganographic tool (http://www.openstego.com/) and in another, I added the information at the end of an image with a hex editor.
 
The results can be seen by performing the following steps (analyzing the current images “released” on the website):
Example 1: PNG with EOF
 
 
Step 2: Look at the end of the file 🙂
Example 2: PNG with LSB
 
 
Step 2: Recover the hidden information using Openstego (key=alfonso)
 
JPEG Images
If you try to upload a steganographic JPEG image (EOF or LSB) to Google Play, the hidden information will be removed. Google reprocesses the image before publishing it. This does not necessarily mean that it is not possible to hide information in this format. In fact, in social networks such as Facebook, we can “avoid” a similar problem with Secret Book or similar browser extensions. I’m working on it…
 
https://chrome.google.com/webstore/detail/secretbook/plglafijddgpenmohgiemalpcfgjjbph?hl=en-GB
 
In summary, based on the previous proofs of concept, I can say that Google Play allows information to be hidden in the images of each app. Is this useful? It could be used to exchange hidden information (a covert channel using Google Play). The main question is whether an attacker could use the masked information for some evil purpose. Perhaps they could use images to encode executable code that “will exploit” when you are visiting the web (using, for example, polyglots + stego exploits), or another idea. Time will tell…
 
 
 
APK Steganography
 
Applications uploaded to Google Play are not modified by the market. That is, an attacker can use any of the existing resources in an APK to hide information, and Google does not remove that information. For example, machine code (DEX), PNG, JPEG, XML, and so on.
 
 
Could it be useful to hide information on those resources?
 
An attacker might want to conceal malicious code on these resources and hinder automatic detection (static and dynamic analysis) that focuses on the code (DEX). A simple example would be an application that hides a specific phone number in an image (APT?).
 
The app verifies the phone number, and after a few clicks in a specific screen on a mobile phone, checks if the number is equal to that stored in the picture. If the numbers match, you can start leaking information (depending on the permissions allowed in the application).
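The trigger described above reduces to a simple comparison. The sketch below is a hypothetical reconstruction of that gating logic; `extract` stands in for whatever stego-recovery routine the app embeds (LSB, EOF, F5, …), and the function names are my own:

```python
def should_exfiltrate(entered_number: str, image_bytes: bytes, extract) -> bool:
    """Gate malicious behavior on a value hidden in an app resource.

    `extract` is a stand-in for the app's stego-recovery routine.
    Nothing malicious happens unless the hidden value matches user input,
    which is exactly what makes dynamic analysis miss it.
    """
    hidden = extract(image_bytes)
    return hidden == entered_number.encode()
```

Because the trigger value lives in an image rather than in the DEX code, static analysis of the code alone sees only an innocuous comparison, and dynamic analysis never observes the malicious path unless the magic number happens to be entered.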
 
I want to demonstrate the potential of Android stegomalware with a PoC. Instead of developing it, I will analyze an active sample that has been on Google Play since June 9, 2014. This stegomalware was developed by researchers at Universidad Carlos III de Madrid, Spain (http://www.uc3m.es). This PoC hides a DEX file (executable code) in an image (resource) of the main app. When the app is running, and the user performs a series of actions, the image recovers the “new” DEX file. This code runs and connects to a URL with a payload (in this case harmless). The “bad” behavior of this application can only be detected if we analyze the resources of the app in detail or simulate the interaction the app used for triggering the connection to the URL.
 
Let me show how this app works (static manual analysis):
Step 1: Download the APK to our local store. This requires a tool, such as an APK downloader extension or a dedicated website such as http://apps.evozi.com/apk-downloader/
Step 2. Unzip the APK (es.uc3m.cosec.likeimage.apk)
Step 3. Using the Stegdetect steganalytic tool (https://github.com/abeluck/stegdetect) we can detect hidden information in the image “likeimage.jpg”. The author used the F5 steganographic tool (https://code.google.com/p/f5-steganography/).
 
es.uc3m.cosec.likeimage/res/drawable-hdpi/likeimage.jpg
likeimage.jpg : f5(***)
Step 4. To analyze (reverse engineer) what the app is doing with this image, I use the dex2jar and jd tools.
Step 5. Analyzing the code, we can observe the key used to hide information in the image. We can recover the hidden content to a file (bicho.dex).
java -jar f5.jar x -p cosec -e bicho.dex likeimage.jpg
 
Step 6. Analyzing the new file (bicho.dex), we can observe the connection to http://cosec-uc3m.appspot.com/likeimage for downloading a payload.
 
Step 7. Analyzing the code and payload, we can demonstrate that it is inoffensive.
ZGV4CjAzNQDyUt1DKdvkkcxqN4zxwc7ERfT4LxRA695kAgAAcAAAAHhWNBIAAAAAAAAAANwBAAAKAAAAcAAAAAQAAACYAAAAAgAAAKgAAAAAAAAAAAAAAAMAAADAAAAAAQAAANgAAABsAQAA+AAAACgBAAAwAQAAMwEAAD0BAABRAQAAZQEAAKUBAACyAQAAtQEAALsBAAACAAAAAwAAAAQAAAAHAAAAAQAAAAIAAAAAAAAABwAAAAMAAAAAAAAAAAABAAAAAAAAAAAACAAAAAEAAQAAAAAAAAAAAAEAAAABAAAAAAAAAAYAAAAAAAAAywEAAAAAAAABAAEAAQAAAMEBAAAEAAAAcBACAAAADgACAAEAAAAAAMYBAAADAAAAGgAFABEAAAAGPGluaXQ+AAFMAAhMTUNsYXNzOwASTGphdmEvbGFuZy9PYmplY3Q7ABJMamF2YS9sYW5nL1N0cmluZzsAPk1BTElDSU9VUyBQQVlMT0FEIEZST00gVEhFIE5FVDogVGhpcyBpcyBhIHByb29mIG9mIGNvbmNlcHQuLi4gAAtNQ2xhc3MuamF2YQABVgAEZ2V0UAAEdGhpcwACAAcOAAQABw4AAAABAQCBgAT4AQEBkAIAAAALAAAAAAAAAAEAAAAAAAAAAQAAAAoAAABwAAAAAgAAAAQAAACYAAAAAwAAAAIAAACoAAAABQAAAAMAAADAAAAABgAAAAEAAADYAAAAASAAAAIAAAD4AAAAAiAAAAoAAAAoAQAAAyAAAAIAAADBAQAAACAAAAEAAADLAQAAABAAAAEAAADcAQAA
 
The code that runs the payload:
 
Is Google detecting this “stegomalware”?
 
Well, I don’t have the answer. Clearly, steganalysis science has its limitations, but there are other ways to monitor strange behaviors in each app. Does Google do it? It is difficult to know, especially if we focus on “mutant” applications. Mutant applications are applications whose behavior could easily change. Detection would require continuous monitoring by the market. For example, for a few months I have analyzed a special application, including its different versions and the modifications that have been published, to observe if Google does anything with it. I will show the details:
Step 1. The mutant app is “Holy Quran video and MP3” (tr.com.holy.quran.free.apk). Currently at https://play.google.com/store/apps/details?id=tr.com.holy.quran.free

Step 2. Analyzing the current and previous versions of this app, I discovered connections to specific URLs (image files). Are these truly images? Not all of them.

Step 3. Two URLs that the app connects to are very interesting. In fact, they aren’t images but SQLite databases (with messages in Turkish). This is the trivial steganography technique of simply renaming the file extension. The author changed the content of these files:

Step 4. If we analyze these databases, it is possible to find curious messages. For example, recipes with drugs.

Is Google aware of the information exchanged using their applications? This example is little more than a curiosity, but such procedures might violate the market’s publication policies for certain applications, or enable more serious things.
 
Figure: Recipes inside the file io.png (SQLite database)

In summary, this small experiment shows that we can hide information on Google Play and Android apps in general. This feature can be used to conceal data or implement specific actions, malicious or not. Only the imagination of an attacker will determine how this feature will be used…


Disclaimer: part of this research is based on previous research by the author at ElevenPaths
INSIGHTS | February 6, 2013

Hackers Unmasked: Detecting, Analyzing, And Taking Action Against Current Threats

Tomorrow morning I’ll be delivering the opening keynote to InformationWeek & Dark Reading’s virtual security event – Hackers Unmasked — Detecting, Analyzing, And Taking Action Against Current Threats.

You can catch my live session at 11:00am Eastern, “Portrait of a Malware Author,” where I’ll discuss how today’s malware is more sophisticated – and more targeted – than ever before. Who are the people who write these next-generation attacks, and what are their motivations? What are their methods, and how do they choose their targets? How do they execute their craft, and what can you do to protect your organization?

The day’s event will have a bunch of additional interesting speakers too – including Dave Aitel and our very own Iftach Ian Amit.

Please come and join the event. I promise not to stumble over my lines too many times, and you’ll learn new things.

You’ll need to quickly subscribe in order to get all the virtual event connection information, so visit the InformationWeek & DarkReading event subscription page HERE.

— Gunter Ollmann, CTO — IOActive, Inc.

INSIGHTS | January 17, 2013

Offensive Defense

I presented before the holiday break at Seattle B-Sides on a topic I called “Offensive Defense.” This blog will summarize the talk. I feel it’s relevant to share due to the recent discussions on desktop antivirus software (AV).

What is Offensive Defense?

The basic premise of the talk is that a good defense is a “smart” layered defense. My “Offensive Defense” presentation title might be interpreted as fighting back against your adversaries, much like the Sexy Defense talk my co-worker Ian Amit has been presenting.

My view of “Offensive Defense” is about being educated on existing technology and creating a well-thought-out plan and security framework for your network. The “Offensive” word in the presentation title relates to being as educated as any attacker who is going to study common security technology and know its weaknesses and boundaries. You should be as educated as that attacker to properly build a defensive and reactionary security posture. The second part of an “Offensive Defense” is security architecture. It is my opinion that too many organizations buy a product either to meet minimal regulatory requirements, to apply “band-aid” protection (thinking in a point manner instead of a systematic manner), or because the organization thinks it makes sense even though they have not actually made a plan for it. Many larger enterprise companies have not stepped back and designed a threat model for their network or defined the critical assets they want to protect.

At the end of the day, a persistent attacker will stop at nothing to obtain access to your network and to steal critical information from it. Your overall goal in protecting your network should be to protect your critical assets. If you are targeted, you want to be able to slow down your attacker and the attack tools they are using, forcing them to customize their attack. In doing so, your goal is to make them give away their position, resources, capabilities, and details. Ultimately, you want to be alerted before any real damage has occurred and have the ability to halt their ability to exfiltrate any critical data.

Conduct a Threat Assessment, Create a Threat Model, Have a Plan!

This process involves either having a security architect in-house or hiring a security consulting firm to help you design a threat model tailored to your network and assess the solutions you have put in place. Security solutions are not one-size-fits-all. Do not rely on marketing material or sales teams, as these typically oversell the capabilities of their own products. In many ways, overselling products is how, as an industry, we have come to rely too heavily on security technologies, assuming they address all threats.

There are many quarterly reports and resources that technical practitioners turn to for advice, such as Gartner reports (including the Magic Quadrant) or testing houses such as AV-Comparatives, ICSA Labs, NSS Labs, EICAR, and AV-Test. AV-Test, in fact, reported this year that Microsoft Security Essentials failed to recognize enough zero-day threats, with detection rates of only 69%, where the average is 89%. These are great resources to turn to once you know what technology you need, but you won’t know that unless you have first designed a plan.

Once you have implemented a plan, the next step is to actually run exercises and, if possible, simulations to assess the real-time ability of your network and the technology you have chosen to integrate. I rarely see this done, and, in my opinion, large enterprises with critical assets have no excuse not to conduct these assessments.

Perform a Self-assessment of the Technology

AV-Comparatives has published a good quote on their product page that states my point:

“If you plan to buy an Anti-Virus, please visit the vendor’s site and evaluate their software by downloading a trial version, as there are also many other features and important things for an Anti-Virus that you should evaluate by yourself. Even if quite important, the data provided in the test reports on this site are just some aspects that you should consider when buying Anti-Virus software.”

This statement proves my point in stating that companies should familiarize themselves with a security technology to make sure it is right for their own network and security posture.

There are many security technologies that exist today that are designed to detect threats against or within your network. These include (but are not limited to):

  • Firewalls
  • Intrusion Prevention Systems (IPS)
  • Intrusion Detection Systems (IDS)
  • Host-based Intrusion Prevention Systems (HIPS)
  • Desktop Antivirus
  • Gateway Filtering
  • Web Application Firewalls
  • Cloud-Based Antivirus and Cloud-based Security Solutions

Such security technologies exist to protect against threats that include (but are not limited to):

  • File-based malware (such as malicious Windows executables, Java files, image files, mobile applications, and so on)
  • Network-based exploits
  • Content-based exploits (such as web pages)
  • Malicious email messages (such as email messages containing malicious links or phishing attacks)
  • Network addresses and domains with a bad reputation

These security technologies deploy various techniques that include (but are not limited to):

  • Hash-detection
  • Signature-detection
  • Heuristic-detection
  • Semantic-detection
There are of course other techniques (that I won’t go into in great detail in this blog), for example:
  • Reputation-based
  • Behavioral-based
It is important to realize that there is no silver bullet defense out there, and given enough expertise, motivation, and persistence, each technique can be defeated. It is essential to understand the limitations and benefits of a particular product so that you can create a realistic, layered framework that has been architected to fit your network structure and threat model. The following are a few example attack techniques against each protection technique and technology (these have been widely publicized):

 

 

For the techniques that I have not listed in this table such as reputation, refer to my CanSecWest 2008 presentation “Wreck-utation”, which explains how reputation detection can be circumvented. One major example of this is a rising trend in hosting malicious code on a compromised legitimate website or running a C&C on a legitimate compromised business server. Behavioral sandboxes can also be defeated with methods such as time-lock puzzles and anti-VM detection or environment-aware code. In many cases, behavioral-based solutions allow the binary or exploit to pass through and in parallel run the sample in a sandbox. This allows what is referred to as a 1-victim-approach in which the user receiving the sample is infected because the malware was allowed to pass through. However, if it is determined in the sandbox to be malicious, all other users are protected. My point here is that all methods can be defeated given enough expertise, time, and resources.
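Hash detection is the easiest of these techniques to illustrate, and its brittleness along with it. The sketch below uses toy byte strings rather than real samples; the point is only that a single appended byte produces an entirely new digest that no blocklist entry matches:

```python
import hashlib

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# A toy "known-bad" blocklist keyed by file hash.
sample = b"MZ\x90\x00...toy malicious payload..."
blocklist = {sha256(sample)}

def hash_detect(data: bytes) -> bool:
    """Flag a file only if its exact hash is already catalogued."""
    return sha256(data) in blocklist

# Trivial evasion: any one-byte change defeats a pure hash match.
mutated = sample + b"\x00"
```

This is why hash matching is always paired with signatures (pattern matches that survive small edits) and heuristics (behavioral or structural scoring), each of which in turn has the evasions discussed above.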

Big Data, Machine Learning, and Natural Language Processing

My presentation also mentioned something we hear a lot about today… BIG DATA. Big Data plus Machine Learning coupled with Natural Language Processing allows a clustering or classification algorithm to make predictive decisions based on statistical and mathematical models. This technology is not a replacement for what exists. Instead, it incorporates what already exists (such as hashes, signatures, heuristics, semantic detection) and adds more correlation in a scientific and statistic manner. The growing number of threats combined with a limited number of malware analysts makes this next step virtually inevitable.

While machine learning, natural language processing, and other artificial intelligence sciences will hopefully help in supplementing the detection capabilities, keep in mind this technology is nothing new. However, the context in which it is being used is new. It has already been used in multiple existing technologies such as anti-spam engines and Google translation technology. It is only recently that it has been applied to Data Leakage Prevention (DLP), web filtering, and malware/exploit content analysis. Have no doubt, however, that like most technologies, it can still be broken.

Hopefully most of you have read Imperva’s report, which found that less than 5% of antivirus solutions are able to initially detect previously non-cataloged viruses. Regardless of your opinion on Imperva’s testing methodologies, you might have also read the less-scrutinized Cisco 2011 Global Threat report that claimed 33% of web malware encountered was zero-day malware not detectable by traditional signature-based methodologies at the time of encounter. This, in my experience, has been a more widely accepted industry statistic.

What these numbers are telling us is that the technology, if looked at individually, is failing us, but what I am saying is that it is all about context. Penetrating defenses and gaining access to a non-critical machine is never desirable. However, a “smart” defense, if architected correctly, would incorporate a number of technologies, situated on your network, to protect the critical assets you care about most.

 

 

 

The Cloud versus the End-Point

If you were to attend any major conference in the last few years, most vendors would claim “the cloud” is where protection technology is headed. Even though there is evidence to show that this might be somewhat true, the majority of protection techniques (such as hash detection, signature detection, reputation, and similar technologies) were simply moved from the desktop to the gateway or “cloud.” The technology and techniques, however, are the same. Of course, there are benefits to the gateway or cloud, such as consolidated updates and possibly a more responsive feedback and correlation loop from other cloud nodes or the company’s lab headquarters. I am of the opinion that there is nothing wrong with having antivirus software on the desktop. In fact, from my graduate studies in Computer Science at UCSD, I remember a number of discussions of the end-to-end arguments of system design, which hold that it is best to place functionality at the end points and at the highest level unless doing otherwise improves performance.

The desktop/server is the end point where the most amount of information can be disseminated. The desktop/server is where context can be added to malware, allowing you to ask questions such as:

 

  • Was it downloaded by the user and from which site?
  • Is that site historically common for that particular user to visit?
  • What happened after it was downloaded?
  • Did the user choose to execute the downloaded binary?
  • What actions did the downloaded binary take?

Hashes, signatures, heuristics, semantic-detection, and reputation can all be applied at this level. However, at a gateway or in the cloud, generally only static analysis is performed due to latency and performance requirements.

This is not to say that gateway or cloud security solutions cannot observe malicious patterns at the gateway, but constraints on state, and the fact that this is a network bottleneck, generally make any analysis node before the end point less thorough. I would argue that both desktop and cloud or gateway security solutions have their benefits, though, and if used in conjunction, they add even more visibility into the network. As a result, they supplement what a desktop antivirus program cannot accomplish and add collective analysis.

 

Conclusion

My main point is that to have a secure network you have to think offensively, architecting security to fit your organization’s needs. Antivirus software on the desktop is not the problem. The problem is the lack of planning that goes into deployment, as well as the lack of understanding of the capabilities of desktop, gateway, network, and cloud security solutions. What must change is the haste with which network teams deploy security technologies without having a plan, a threat model, or a holistic organizational security framework in place that takes into account how all security products work together to protect critical assets.

With regard to the cloud, make no mistake that most of the same security technology has simply moved from the desktop to the cloud. Because it operates at the network level, analysis of files and network streams is constrained by latency, and fewer checks are performed for the sake of user performance. People want to feel safe relying on the cloud’s security and feel assured knowing that a third party is handling all security threats, and this might be the case. However, companies need to make sure a plan is in place and that they fully understand the capabilities of the security products they have chosen, whether they be desktop, network, gateway, or cloud based.

If you found this topic interesting, Chris Valasek and I are working on a related project that Chris will be presenting at An Evening with IOActive on Thursday, January 17, 2013. We also plan to talk about this at the IOAsis at the RSA Conference. Look for details!

INSIGHTS | January 7, 2013

The Demise of Desktop Antivirus

Are you old enough to remember the demise of the ubiquitous CompuServe and AOL CDs that used to be attached to every computer magazine you ever bought between the mid-’80s and mid-’90s? If you missed that annoying period of Internet history, maybe you’ll be able to watch the death of desktop antivirus instead.

65,000 AOL CDs as art

Just as dial-up subscription portals and proprietary “web browsers” represent a yesteryear view of the Internet, desktop antivirus is similarly being confined to the annals of Internet history. It may still be flapping vigorously like a freshly landed fish, but we all know how those last gasps end.

To be perfectly honest, it’s amazing that desktop antivirus has lasted this long. To be fair, though, the product you may have installed on your computer (desktop or laptop) bears little resemblance to the antivirus products of just three years ago. Most vendors have even done away with the “antivirus” term – instead they’ve renamed their products “protection suites” and “prevention technology” and thrown in a bunch of additional threat detection engines for good measure.

I have a vision of a hunchbacked Igor working behind the scenes stitching on some new appendage or bolting on an iron plate for reinforcement to the Frankenstein corpse of each antivirus product as he tries to keep it alive for just a little bit longer…

That’s not to say that a lot of effort doesn’t go in to maintaining an antivirus product. However, with the millions upon millions of new threats each month it’s hardly surprising that the technology (and approach) falls further and further behind. Despite that, the researchers and engineers that maintain these products try their best to keep the technology as relevant as possible… and certainly don’t like it when anyone points out the gap between the threat and the capability of desktop antivirus to deal with it.

For example, the New York Times ran a piece on the last day of 2012 titled “Outmaneuvered at Their Own Game, Antivirus Makers Struggle to Adapt” that managed to get many of the antivirus vendors riled up – interestingly enough not because of the claims of the antivirus industry falling behind, but because some of the statistics came from unfair and unscientific tests. In particular there was great annoyance that a security vendor (representing an alternative technology) used VirusTotal coverage as their basis for whether or not new malware could be detected – claiming that initial detection was only 5%.

I’ve discussed the topic of declining desktop antivirus detection rates (and evasion) many, many times in the past. From my own experience, within corporate/enterprise networks, desktop antivirus detection typically hovers at 1-2% for the threats that make it through the various network defenses. For newly minted malware that is designed to target corporate victims, the rate is pretty much 0% and can remain that way for hundreds of days after the malware has been released into the wild.

You’ll note that I typically differentiate between desktop and network antivirus. The reason for this is because I’m a firm advocate that the battle is already over if the malware makes it down to the host. If you’re going to do anything on the malware prevention side of things, then you need to do it before it gets to the desktop – ideally filtering the threat at the network level, but gateway prevention (e.g. at the mail gateway or proxy server) will be good enough for the bulk of non-targeted Internet threats. Antivirus operations at the desktop are best confined to cleanup, and even then I wouldn’t trust any of the products to be particularly good at that… all too often reimaging of the computer isn’t even enough in the face of malware threats such as TDL.

So, does an antivirus product still have what it takes to earn the real estate it takes up on your computer? As a standalone security technology – no, I don’t believe so. If it’s free, never ever bothers me with popups, and I never need to know it’s there, then it’s not worth the effort of uninstalling it and I guess it can stay… other than that, I’m inclined to look at other technologies that operate at the network layer or within the cloud; stop what you can before it gets to the desktop. Many of the bloated “improvements” to desktop antivirus products over recent years seem analogous to improving the hearing of a soldier so he can more clearly hear the ‘click’ of the mine he’s just stood on as it arms itself.

I’m all in favor of retraining any hunchbacked Igor we may come across. Perhaps he can make artwork out of discarded antivirus DVDs – just as kids did in the 1990s with AOL CDs?

— Gunter Ollmann, CTO — IOActive, Inc.
INSIGHTS | November 21, 2012

The Future of Automated Malware Generation

This year I gave a series of presentations on “The Future of Automated Malware Generation”. This past week the presentation made its final appearance, in Tokyo at the 10th anniversary of PacSec.

Hopefully you were able to attend one of the following conferences where it was presented:

  • IOAsis (Las Vegas, USA)
  • SOURCE (Seattle, USA)
  • EkoParty (Buenos Aires, Argentina)
  • PacSec (Tokyo, Japan)

Motivation / Intro

Much of this presentation was inspired by a number of key motivations:
  1. Greg Hoglund’s talk at Blackhat 2010 on malware attribution and fingerprinting
  2. The undeniable steady year by year increase in malware, exploits and exploit kits
  3. My unfinished attempt in adding automatic classification to the cuckoo sandbox
  4. An attempt to clear up the perception by many consumers and corporations that many security products are resistant to simple evasion techniques and contain some “secret sauce” that sets them apart from their competition
  5. The desire to educate consumers and corporations on past, present and future defense and offense techniques
  6. Lastly to help reemphasize the philosophy that when building or deploying defensive technology it’s wise to think offensively…and basically try to break what you build
Since the point of the talk is the future of automated malware generation, I’ll start with explaining the current state of automated malware generation, and then I’ll move to reviewing current defenses found in most products today.
Given enough time, resources, and skill-set, every defense technique can be defeated. To prove this, I’ll share some of the associated offensive techniques. I will then discuss new defense technologies that you’ll start to hear more about, and then, as has been the cycle in any war, to each defense will come a new offensive technique. So I will then discuss the future of automated malware generation. This is a long blog, but I hope you find it interesting!

Current State of Automated Malware Generation

Automated Malware Generation centers on Malware Distribution Networks (MDNs).

MDNs are organized, distributed networks that are responsible for the entire exploit and infection vector.

There are several players involved:

  • Pay-per-install client – organizations that write malware and gain a profit from having it installed on as many machines as possible
  • Pay-per-install services – organizations that get paid to exploit and infect user machines and in many cases use pay-per-install affiliates to accomplish this
  • Pay-per-install affiliates – organizations that own much of the infrastructure and processes necessary to compromise legitimate web pages, redirect users through traffic direction services (TDSs), infect users with exploits (in some cases exploit kits) and finally, if successful, download malware from a malware repository.
Figure: Blackhole exploit kit download chain
Source: Manufacturing Compromise: The Emergence of Exploit-as-a-Service 
There are a number of different types of malware repositories, some that contain the same binary for the life-time of a particular attack campaign, some that periodically update or repackage the binary to avoid and evade simple detection techniques, and polymorphic/metamorphic repositories that produce a unique sample for each user request. More complex attacks generally involve the latter.


Figure: Basic Break-down of Malware Repository Types

Current State of Malware Defense

Most desktop and network security products on the market today use the following techniques to detect malware:
  • hashes – cryptographic checksums of either the entire malware file or sections of the file; in some cases these may feed blacklists and whitelists
  • signatures – syntactical pattern matching using conditional expressions (in some cases format-aware/contextual)
  • heuristics – An expression of characteristics and actions using emulation, API hooking, sand-boxing, file anomalies and/or other analysis techniques
  • semantics – transformation of specific syntax into a single abstract / intermediate representation to match from using more abstract signatures and heuristics
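As a concrete (and deliberately simplistic) sketch of the signature approach, here is a toy Python matcher where a “signature” is a conditional expression over byte patterns. The family name and patterns are invented for illustration; real engines are format-aware and far more sophisticated.

```python
# Toy signature: a conditional expression over byte patterns.
# The name and patterns below are invented for illustration only.
SIGNATURE = {
    "name": "Example.Downloader",
    "all_of": [b"MZ", b"URLDownloadToFileA"],   # every pattern must appear
    "any_of": [b"cmd.exe /c ", b"WinExec"],     # at least one must appear
}

def matches(sample: bytes, sig: dict) -> bool:
    """Evaluate the conditional expression against the raw bytes."""
    return (all(p in sample for p in sig["all_of"])
            and any(p in sample for p in sig["any_of"]))

benign  = b"MZ...just a normal program..."
dropper = b"MZ...URLDownloadToFileA...WinExec..."

print(matches(benign, SIGNATURE))   # False
print(matches(dropper, SIGNATURE))  # True
```

Note how brittle even this conditional matching is: lightly rewriting any one pattern in the binary defeats the match, which is exactly what the syntax-mutation techniques below exploit.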

EVERY defense technique can be broken – with enough time, skill and resources.

In the above defensive techniques:

  • hash-based detection can be broken by changing the binary by a single byte
  • signature-based detection can be broken using syntax mutation
    e.g.
    • Garbage Code Insertion e.g. NOP, “MOV ax, ax”, “SUB ax 0”
    • Register Renaming e.g. using EAX instead of EBX (as long as EBX isn’t already being used)
    • Subroutine Permutation – e.g. changing the order in which subroutines or functions are called, as long as this doesn’t affect the overall behavior
    • Code Reordering through Jumps e.g. inserting test instructions and conditional and unconditional branching instructions in order to change the control flow
    • Equivalent instruction substitution e.g. MOV EAX, EBX <-> PUSH EBX, POP EAX
  • heuristics-based detection can be broken by avoiding the characteristics the heuristics engine is using, or by using uncommon instructions that the heuristics engine might be unable to understand in its emulator (if an emulator is being used)
  • semantics-based detection can be broken by using techniques such as time-lock puzzles (semantics-based detection is unlikely to be used at a higher level, such as network defenses, due to performance issues); also, because implementation requires extensive scope, there is a high likelihood that not all cases have been covered. Semantics-based detection is extremely difficult to get right given the performance requirements of a security product.
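The fragility of hash-based detection in the first bullet is easy to demonstrate. The sketch below is illustrative only; the “malware” is a placeholder byte string standing in for a real binary:

```python
import hashlib

# Placeholder bytes standing in for a known-bad binary (illustrative only).
original = b"\x4d\x5a\x90\x00...rest of the malware body..."

# A hypothetical blacklist of SHA-256 hashes of known samples.
known_bad = {hashlib.sha256(original).hexdigest()}

def hash_detect(sample: bytes) -> bool:
    return hashlib.sha256(sample).hexdigest() in known_bad

mutated = original + b"\x00"   # change the binary by a single byte

print(hash_detect(original))   # True  - exact match against the blacklist
print(hash_detect(mutated))    # False - an entirely new hash
```

One appended byte produces a completely different digest, which is why polymorphic repositories that re-emit a unique sample per request trivially stay ahead of hash blacklists.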

There are a number of other examples where defense techniques were easily defeated by properly targeted research (generally speaking). A recent post by Trail of Bits [Trail of Bits Blog] analyzes ExploitShield’s exploitation prevention technology. In my opinion the response from Zero Vulnerability Labs was appropriate (no longer available), but it does show that a defense technique can be broken by an attacker if that technology is studied and understood (which isn’t that complicated to figure out).

Malware Trends

Check any number of reports and you can see that malware is on the rise (keep in mind these are vendor reports and the vendors have a stake in the results, but since there really is no other source for the information, we’ll use them as the accepted experts on the subject) [Symantec] [Trend] [McAfee] [IBM X-Force] [Microsoft] [RSA]

Source: McAfee Global Q1 2012 Threat Report
The same increase in samples has also been reported for mobile malware [F-Secure Mobile Threat Report].
Since the rise in malware can’t be matched by continually hiring more analysts (this process has its limitations), security companies deploy high-interaction and low-interaction sandboxes. These sandboxes run the malware, analyze its behavior, and attempt to trigger various heuristics that will auto-classify the malware by hash. If it’s not able to auto-classify, the malware is typically added to a suspicious bucket for a malware analyst to manually review… thus malware analysts are the bottleneck in preemptive malware classification.
In addition, a report from Cisco last year found that 33% of web malware encountered was zero-day malware, not detectable by traditional signature-based methodologies at the time of encounter [Cisco 2011 4Q Global Threat Report].
33%! That obviously means there is work to be done on the detection/defense side of the fence.

So how can the security industry use automatic classification? Well, in the last few years a data-driven approach has been the obvious step in the process.

The Future of Malware Defense

With the increase in malware, exploits, exploit kits, campaign-based attacks, and targeted attacks, reliance on automation will have to be the future. The overall goal of malware defense has been, to a larger degree, classification and, to a smaller degree, clustering and attribution.

Thus statistics and data-driven decisions have been an obvious direction that many of the security companies have started to introduce, either by heavily relying on this process or as a supplemental layer to existing defensive technologies to help in predictive pattern-based analysis and classification.

Where statistics is a discipline that helps you understand data and forces decisions based on data, machine learning is where we train computers to make statistical decisions on real-time data based on the data they were trained with.
While machine learning as a concept has been around for decades, it’s only more recently that it’s being used in web filtering, data-leakage prevention (DLP), and malware content analysis.

Training machine learning classifiers involves breaking down whatever content you want to analyze e.g. a network stream or an executable file into “features” (basically characteristics).

For example historically certain malware has:

  • No icon
  • No description or company in resource section
  • Is packed
  • Lives in windows directory or user profile

Each of the above qualities/characteristics can be considered “features”. Once the defensive technology creates a list of features, it then builds a parser capable of breaking down the content to find those features. For example, if the content is a Win32 PE executable, a PE parser will be necessary. The features would include anything you can think of that is characteristic of a PE file.

The process then involves training a classifier on a positive (malicious) and negative (benign) sample set. Once the classifier is trained it can be used to determine if a future unknown sample is benign or malicious and classify it accordingly.
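As an illustration of that training loop, here is a toy naive Bayes classifier over the four binary features listed above. The feature vectors and labels are invented for this sketch; a real corpus would contain thousands of samples and far richer features.

```python
import math
from collections import Counter

# Feature order: [no_icon, no_version_info, is_packed, in_windows_dir]

def train(samples):
    """Naive Bayes with Laplace smoothing over binary feature vectors.
    samples: list of (features, label), label in {"malicious", "benign"}."""
    counts = {"malicious": Counter(), "benign": Counter()}
    totals = {"malicious": 0, "benign": 0}
    for feats, label in samples:
        totals[label] += 1
        for i, present in enumerate(feats):
            if present:
                counts[label][i] += 1
    return counts, totals

def classify(model, feats):
    counts, totals = model
    n = sum(totals.values())
    best_label, best_score = None, -math.inf
    for label in counts:
        score = math.log(totals[label] / n)          # class prior
        for i, present in enumerate(feats):
            p = (counts[label][i] + 1) / (totals[label] + 2)  # Laplace
            score += math.log(p if present else 1 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

training = [
    ([1, 1, 1, 1], "malicious"), ([1, 1, 0, 1], "malicious"),
    ([0, 1, 1, 1], "malicious"), ([0, 0, 0, 0], "benign"),
    ([0, 0, 1, 0], "benign"),    ([1, 0, 0, 0], "benign"),
]
model = train(training)

print(classify(model, [1, 1, 1, 0]))  # "malicious"
print(classify(model, [0, 0, 0, 0]))  # "benign"
```

The same shape of pipeline (extract features, train on labeled sets, score unknowns) holds whether the classifier is naive Bayes, a decision tree, or something fancier.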

Let me give you a more detailed example: If you’ve ever played around with malicious PDFs you know there are differences between the structure of a benign PDF and a malicious PDF.
Here are some noteworthy characteristics in the structure of a PDF (FireEye Blog/Presentation – Julia Wolf):
  • Compressed JavaScript
  • PDF header location, e.g. %PDF within the first 1024 bytes
  • Does it contain an embedded file (e.g. Flash, sound file)?
  • Signed by a trusted certificate?
  • Encoded/encrypted streams, e.g. FlateDecode is used quite a lot in malicious PDFs
  • Hex-escaped names
  • Bogus xref table
All the above are features that can be used to feed the classifier during training against benign and malicious sample sets (check out “Scoring PDF structure to detect malicious file” from my friend Rodrigo Montoro (YouTube)).
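A hedged sketch of what scoring PDF structure might look like; the byte patterns and uniform weights below are illustrative guesses, not the rules used by any of the cited tools:

```python
def score_pdf(data: bytes) -> int:
    """Toy structural scorer: counts suspicious traits from the list above.
    Patterns and weights are illustrative, not from any real product."""
    score = 0
    if b"%PDF" not in data[:1024]:                    # header missing or pushed deep
        score += 1
    if b"/JavaScript" in data or b"/JS" in data:      # embedded JavaScript
        score += 1
    if b"/EmbeddedFile" in data:                      # embedded file
        score += 1
    if b"#" in data:                                  # hex-escaped names use '#'
        score += 1
    if b"xref" not in data:                           # absent/bogus xref table
        score += 1
    return score

benign = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\nxref\ntrailer"
evil   = b"junk" * 300 + b"%PDF-1.4 << /Ti#74le (x) /JS (eval(...)) >>"

print(score_pdf(benign))  # 0
print(score_pdf(evil))    # 4
```

A real classifier wouldn’t hand-pick thresholds like this; it would feed each trait as a feature into training, exactly as described above.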

There are two open-source projects that I want to mention using machine learning to determine if a file is malicious:

PDF-XRay from Brandon Dixon:

An explanation of how it works from the pdf-xray site is as follows:

Adobe Open Source Malware Classification Tool by Karthik Raman/Adobe

Details (from website): Perform quick, easy classification of binaries for malware analysis.
Published results: 98.21% accuracy, 6.7% false positive rate
7 features = DebugSize, ImageVersion, IatRVA, ExportSize, ResourceSize, VirtualSize2, NumberOfSections
Personal remarks: This tool is a great proof of concept, but my results weren’t as successful as Karthik’s, which I’m told were based only on binaries that were not packed; my sample set included packed, unpacked, and files that had never been packed.


Shifting away from analysis of files, we can also attempt to distinguish shellcode on the wire from normal traffic. Using Markov chains, a technique from probability theory widely used in natural language processing, we can analyze a network stream of instructions to see whether the sequence of instructions is likely to be exploit code.

The example below attempts to show that most exploit code (shellcode) follows a basic skeleton, be it a decoder loop decoding a payload and then jumping to that payload, or finding the delta, getting the kernel32 image base, resolving the addresses of GetProcAddress and LoadLibraryA, calling various functions, and finally executing the rest of the payload.
There is a finite set of published methods to do this, and if you can use semantics, you can further limit the possible sequences and determine whether the network stream contains instructions and, further, whether those instructions are shellcode.
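As a rough illustration of the idea (not the actual implementation discussed in the talk), the sketch below trains a first-order Markov chain on mnemonic sequences resembling decoder loops and scores new sequences by their average transition log-likelihood:

```python
import math
from collections import defaultdict

def train_chain(sequences):
    """Estimate transition probabilities between instruction mnemonics."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    model = {}
    for a, nxt in counts.items():
        total = sum(nxt.values())
        model[a] = {b: n / total for b, n in nxt.items()}
    return model

def log_likelihood(model, seq, floor=1e-6):
    """Average log-probability per transition; higher = more shellcode-like."""
    ll = 0.0
    for a, b in zip(seq, seq[1:]):
        ll += math.log(model.get(a, {}).get(b, floor))
    return ll / max(len(seq) - 1, 1)

# Tiny corpus of decoder-loop-style mnemonic sequences (illustrative only).
shellcode_corpus = [
    ["xor", "mov", "loop", "jmp", "call", "pop"],
    ["xor", "mov", "loop", "jmp", "call", "pop"],
    ["call", "pop", "xor", "mov", "loop", "jmp"],
]
model = train_chain(shellcode_corpus)

decoder_like = ["xor", "mov", "loop", "jmp"]
random_like  = ["add", "ret", "nop", "push"]

print(log_likelihood(model, decoder_like) > log_likelihood(model, random_like))  # True
```

In practice the stream would first be disassembled, and a much larger corpus of published shellcode skeletons would feed the chain; the scoring step stays the same.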

The Future of Automated Malware Generation

In many cases the path of attack and defense techniques follows the same story of cat and mouse. Just like Tom and Jerry, the chase continues forever: in the context of security, new technology is introduced, new attacks then emerge, and in response new countermeasures are brought in to detect those attacks… an attacker’s game can come to an end IF they make a mistake, but whereas cyber-criminal organizations can claim a binary 0-or-1 success or failure, defense can never really claim a victory over all its attackers. It’s a “game” that must always continue.

That being said, you’ll hear more and more products and security technologies talk about machine learning like it’s this unbeatable new move in the game… granted, you’ll hear it mostly from savvy marketing, product managers, or sales folks. In reality it’s another useful layer to slow down an attacker trying to get to their end goal, but it’s by no means invincible.

Machine learning can be circumvented by an attacker in several possible ways:

  • Likelihood of false positives / false negatives due to weak training corpus 
  • Circumvention of classification features
  • Inability to parse/extract features from content
  • Ability to poison training corpus
Let’s break down each of those points, because if the next stage of defense will increasingly include machine learning, then attackers will be attempting to include various evasion techniques to avoid this new detection technique.
Likelihood of false positives / false negatives due to weak training corpus
If the defense side creates models based on a small sample set, or a sample set that isn’t diverse enough, then the model will be too restrictive and thus produce false negatives. If a product has too many false positives, users won’t trust it and, if given the choice, will ignore the results. Products that have too many false positives will typically be discontinued. Attackers can benefit from a weak training corpus by using less popular techniques/vulnerabilities that most likely haven’t been used in training and won’t be caught by the classifier.
If the defense creates models based only on malicious files and not enough benign files, then there will be tons of false positives. Thus, if attackers model their files to look more representative of good files, there will be a higher likelihood that the acceptable threshold to mitigate false positives will allow the malicious files through.
Circumvention of classification features
At the start of this blog I mentioned that I’m currently attempting to add automatic classification to the cuckoo sandbox, which is an open source behavioral analysis framework. If I were to add such code, it would be open source and any techniques including features would be exposed. Thus, all an attacker would have to do is read my source code, and avoid the features; this is also true for any product that an attacker can buy or demo. They could either read the source code or reverse engineer the product and see which features are being used and attempt to trick the classification algorithm if the threshold/weights/characteristics can be determined.
Inability to parse/extract features from content
Classification using machine learning is 100% reliant on the features being extracted from the content and fed to the classification algorithm. But what if the executable is a .NET binary (Japanese Remote Control Virus) and the engine can’t interpret .NET binaries, or the format changes or gets updated, e.g. PDF 2.0? For each of these changes a parser must be built, updated, and shipped out. Attackers have the advantage of a window of time between product updates, or again, with proper research, an understanding that certain products simply can’t handle a particular format in order to extract features.
Ability to poison training corpus
Training a machine learning classifier involves training the algorithm against a known malicious set and a known benign set. If an attacker were able to poison either set, the results and final classification determination would be flawed. This can occur in numerous ways. For example: the attacker releases a massive set of files onto the Internet in the off chance that a security product company will use it as its main source of samples, or they poison a number of known malware analysis services, such as VirusTotal or malwr, that share samples with security companies, with bogus malware. This scenario is unlikely, because most companies wouldn’t rely on one major source for all their testing, but it is still worth mentioning.

Conclusion

In reality, we haven’t yet seen malware that contains anti-machine-learning classification or anti-clustering techniques. What we have seen is more extensive use of on-the-fly symmetric-key encryption where the key isn’t hard-coded in the binary itself, but is derived from something unique about the target machine being infected. Take Zeus, for example, which downloads an encrypted binary once the machine has been infected, where the key is unique to that machine; or Gauss, which had a DLL encrypted with a key only found on the targeted user’s machine.

What this accomplishes is that the binary can only work on the intended target machine. It’s possible that an emulator would break, but certainly sending it off to home base or the cloud for behavioral and static analysis will fail, because it simply won’t be able to be decrypted and run.
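Conceptually, this environment-keyed encryption can be sketched as follows. This is purely illustrative: the fingerprint source and the XOR stream cipher are stand-ins for whatever the real malware families used.

```python
import hashlib

def derive_key(machine_fingerprint: str) -> bytes:
    """Key derived from something unique to the victim (a path, hostname, etc.)."""
    return hashlib.sha256(machine_fingerprint.encode()).digest()

def xor_stream(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR against a repeating key stream."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

payload = b"malicious payload only meaningful on one machine"
key     = derive_key(r"C:\Program Files\TargetApp")   # hypothetical fingerprint

blob = xor_stream(payload, key)   # the encrypted form analysts capture

# On the intended target the fingerprint matches and the payload decrypts:
print(xor_stream(blob, derive_key(r"C:\Program Files\TargetApp")) == payload)  # True
# In a sandbox with a different environment, it stays opaque:
print(xor_stream(blob, derive_key(r"C:\sandbox")) == payload)                  # False
```

An analyst who captures only the blob, without the victim environment, has nothing to decrypt against, which is exactly why cloud or sandbox analysis of such samples fails.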

Most defensive techniques, if studied, targeted, and analyzed, can be evaded; all it takes is time, skill, and resources. Using machine learning to detect malicious executables, exploits, and/or network traffic is no exception. At the end of the day it’s important that you at least understand that your defenses are penetrable, and that a smart layered defense is key, where every layer forces the attackers to take their time, forces them to learn new skills, and slowly gives away their resources, position, and possibly intent, hopefully giving you enough time to be notified of the attack and stop it before exfiltration of data occurs. What a smart layered defense looks like is different for each network, depending on where your assets are and how your network is set up, so there is no way for me to share a one-size-fits-all diagram; I’ll leave that to you to think about.

Useful Links:
Coursera – Machine Learning Course
CalTech – Machine Learning Course
MLPY (https://mlpy.fbk.eu/)
PyML (http://pyml.sourceforge.net/)
Milk (http://pypi.python.org/pypi/milk/)
Shogun (http://raetschlab.org/suppl/shogun) Code is in C++ but it has a python wrapper.
MDP (http://mdp-toolkit.sourceforge.net) Python library for data mining
PyBrain (http://pybrain.org/)
Orange (http://www.ailab.si/orange/) Statistical computing and data mining
PYMVPA (http://www.pymvpa.org/)
scikit-learn (http://scikit-learn.org): Numpy / Scipy / Cython implementations for major algorithms + efficient C/C++ wrappers
Monte (http://montepython.sourceforge.net) a software for gradient-based learning in Python
Rpy2 (http://rpy.sourceforge.net/): Python wrapper for R


About Stephan
Stephan Chenette has been involved in computer security professionally since the mid-90s, working on vulnerability research, reverse engineering, and development of next-generation defense and attack techniques. As a researcher he has published papers, security advisories, and tools. His past work includes the script fragmentation exploit delivery attack and work on the open source web security tool Fireshark.

Stephan is currently the Director of Security Research and Development at IOActive, Inc.
Twitter: @StephanChenette

INSIGHTS | October 11, 2012

SexyDefense Gets Real

As some of you know by now, the recent focus of my research has been defense. After years of dealing almost exclusively with offensive research, I realized that we have been doing an injustice to ourselves as professionals. After all, we eventually get to help organizations protect themselves (having the mindset that the best way to learn defense is to study the offensive techniques), but nevertheless, when examining how organizations practice defense one has a feeling of missing something.
For far too long the practice (and art?) of defense has been entrusted to bureaucrats and reduced to a technical element that is a burden on an organization. We can see it in the way that companies have positioned defensive roles: “firewall admin,” “IT security manager,” “incident handler,” and even the famous “CISO.” CISOs have been getting less and less responsibility over time, basically watered down to dealing with the network/software elements of the organization’s security. No process, no physical, no human/social. These are all handled by different roles in the company (audit, physical security, and HR, respectively).
This has led to the creation of the marketing term “APT”: Advanced Persistent Threat. The main reason why non-sophisticated attackers are able to deploy an APT is the fact that organizations are focusing on dealing with extremely narrow threat vectors; any threat that encompasses multiple attack vectors that affect different departments in an organization automatically escalates into an APT since it is “hard” to deal with such threats. I call bullshit on that.
As an industry, we have not really been supportive of the defensive front. We have been pushing out products that deal mainly with past threats and are focused on post-mortem detection of attacks. Anti-virus systems, firewalls, IDS, IPS, and DLP – these are all products that are really effective against attacks from yesteryears. We ignore a large chunk of the defense spectrum nowadays, and attackers are happily using this against us, the defenders.
When we started SexyDefense, the main goal was to open the eyes of defensive practitioners, from the hands-on people to executive management. The reason for this is that this syndrome needs to be fixed throughout the ranks. I already mentioned that the way we deal with security in terms of job titles is wrong. It’s also true for the way we approach it on Day 1. We make sure that we have all the products that industry best practices tell us to have (which are from the same vendors that have been pushing less-than-effective products for years), and then we wait for the alert telling us that we have been compromised for days or weeks.
What we should be doing is first understanding what are we protecting! How much is it worth to the organization? What kind of processes, people, and technologies “touch” those assets, and how do they affect it? What kind of controls are there to protect such assets? And ultimately, what are the vulnerabilities in processes, people, and technologies related to said assets?
These are tough questions – especially if you are dealing with an “old school” practice of security in a large organization. Now try asking the harder question: who is your threat? No, don’t say hackers! Ask the business line owners, the business development people, sales, marketing, and finance. These are the people who probably know best what are the threats to the business, and who is out there to get it. Now align that information with the asset related ones, and you get a more complete picture of what you are protecting, and from whom. In addition, you can already see which controls are more or less effective against such threats, as it’s relatively easy to figure out the capabilities, intent, and accessibility of each adversary to your assets.
Now, get to work! But don’t open that firewall console or that IPS dashboard. “Work” means gathering intelligence on your threat communities, keeping track of organizational information and changes, and owning up to your home-field advantage. You control the information and resources used by the organization. Use them to your advantage to thwart threats, to detect intelligence gathering against your organization, to set traps for attackers, and yes, even to go the whole 9 yards and deal with counterintelligence. Whatever works within the confines of the law and ethics.
If this sounds logical to you, I invite you to read my whitepaper covering this approach [sexydefense.com] and participate in one of the SexyDefense talks at a conference near you (or watch the one given at DerbyCon online: [http://www.youtube.com/watch?v=djsdZOY1kLM]).
If you have not yet run away, think about contributing to the community effort to build a framework for this, much like we did for penetration testing with PTES. Call it SDES for now: Strategic Defense Execution Standard. A lot of you have already been raising interest in it, and I’m really excited to see the community coming up with great ideas and initiatives after preaching this notion for a fairly short time.
Who knows what this will turn into?
INSIGHTS | June 28, 2012

Inside Flame: You Say Shell32, I Say MSSECMGR

When I was reading the CrySyS report on Flame (sKyWIper)[1], one paragraph, in particular, caught my attention:
 
In case of sKyWIper, the code injection mechanism is stealthier such that the presence of the code injection cannot be determined by conventional methods such as listing the modules of the corresponding system processes (winlogon, services, explorer). The only trace we found at the first sight is that certain memory regions are mapped with the suspicious READ, WRITE and EXECUTE protection flags, and they can only be grasped via the Virtual Address Descriptor (VAD) kernel data structure
 
So I decided to take a look and see what kind of methods Flame was using.
Flame is conceived to gather as much information as possible within heterogeneous environments that can be protected by different solutions, isolated at certain levels, and operated upon by different profiles. This means that, from the developers’ point of view, you can’t assume anything and should be prepared for everything.
Some of the tricks implemented in Flame seem focused on bypassing AV products, specifically their heuristics. A distributed “setup” functionality spread across three different processes (winlogon, explorer, and services) is way more confusing than letting a single, trusted process do the job; i.e., it’s less suspicious to see Internet Explorer launched from explorer.exe than from winlogon.
In essence, the injection method seems to pivot around the following three key features:
  • Disguise the malicious module as a legitimate one; Shell32.dll in this case.
  • Bypass common registration methods supplied by the operating system, such as LoadLibrary, to avoid being detected as an active module.
  • Achieve the same functionality as a correctly-registered module.
 
So, let’s see how Flame implements it.
During the initial infection when DDEnumCallback is called, Flame injects a blob and creates a remote thread in Services.exe. The blob has the following structure:
 
The loader stub is a function that performs the functionality previously described: basically a custom PE loader that’s similar to the CryptoPP dllloader.cpp[2] with some additional tricks.
 

The injection context is a defined structure that contains all the information the loader stub may need, including API addresses or names, DLL names, and files—in fact, the overall idea reminded me of Didier Stevens’ approach to generating shellcode directly from a C compiler[3].

Injection Context: Blob + 0x710


API Addresses:

esi+00h   OpenMutexW
esi+04h   VirtualAlloc
esi+08h   VirtualFree
esi+0Ch   VirtualProtect
esi+10h   LoadLibraryA
esi+14h   LoadLibraryW
esi+18h   GetModuleHandleA
esi+1Ch   GetProcAddress
esi+20h   memcpy
esi+24h   memset
esi+28h   CreateFileMappingW
esi+2Ch   OpenFileMappingW
esi+30h   MapViewOfFile
esi+34h   UnmapViewOfFile
esi+38h   ReleaseMutex
esi+3Ch   NtQueryInformationProcess
esi+40h   GetLastError
esi+44h   CreateMutexW
esi+48h   WaitForSingleObject
esi+4Ch   CloseHandle
esi+50h   CreateFileW
esi+54h   FreeLibrary
esi+58h   Sleep
esi+5Ch   LocalFree
The loader stub also contains some interesting tricks.

Shell32.dll:  A matter of VAD

To conceal its own module, Flame hides itself behind Shell32.dll, which is one of the largest DLLs you can find on any Windows system, meaning it’s large enough to hold Flame across different versions.
 
Once shell32.dll has been mapped, a VAD node is created that contains a reference to the FILE_OBJECT, which points to Shell32.dll. Flame then zeroes that memory and loads its malicious module through the custom PE loader, copying sections, adjusting permissions, and fixing relocations.
 
As a result, those forensics/AntiMalware/AV engines walking the VAD tree to discover hidden DLLs (and not checking images) would be bypassed since they assume that memory belongs to Shell32.dll, a trusted module, when it’s actually mssecmgr.ocx.
The stub then calls to DllEntryPoint, passing in DLL_PROCESS_ATTACH to initialize the DLL.
 
The malicious DLL has now been initialized, but remember that it isn’t registered properly, so it cannot receive the remaining events such as DLL_THREAD_ATTACH, DLL_THREAD_DETACH, and DLL_PROCESS_DETACH.
And here comes the final trick:
 
msvcrt.dll is loaded up to five times, which is a little bit weird, no?
Then the PEB InLoadOrder structure is traversed to find the entry that corresponds to msvcrt.dll by comparing the DLL base addresses:
 
Once found, Flame hooks this entry point:
 
InjectedBlock1 (0x101C36A1) is a small piece of code that basically dispatches the events received to both the malicious DLL and the original module.
The system uses this entry point to dispatch events to all the DLLs loaded in the process; as a result, by hooking into it, Flame’s main module achieves the goal of receiving all the events other DLLs receive. Therefore, it can complete synchronization tasks and behaves like any other DLL. Neat.
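The dispatch trick can be modeled in a few lines of Python. This is a conceptual analogy only: the real mechanism patches the PEB’s InLoadOrder entry point in process memory, and the names below (including the stand-in for InjectedBlock1) are hypothetical.

```python
# Conceptual analogy: forward every loader event to both the original
# module and the injected one, as Flame's dispatcher stub does.
events_seen = []

def original_entry(event):           # stands in for msvcrt.dll's entry point
    events_seen.append(("msvcrt", event))

def malicious_entry(event):          # stands in for mssecmgr.ocx
    events_seen.append(("mssecmgr", event))

def hook(original, injected):
    """Return a stub that forwards each event to both modules,
    mirroring the role of Flame's InjectedBlock1."""
    def stub(event):
        injected(event)
        original(event)
    return stub

entry_point = hook(original_entry, malicious_entry)  # "PEB entry" now hooked
entry_point("DLL_THREAD_ATTACH")

print(events_seen)
# [('mssecmgr', 'DLL_THREAD_ATTACH'), ('msvcrt', 'DLL_THREAD_ATTACH')]
```

The key property is that the original module still sees every event, so nothing observably breaks, while the unregistered module gets the notifications it could never receive through normal loader bookkeeping.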
I assume that Flame loads msvcrt.dll several times to increase its reference count to prevent msvcrt.dll from being unloaded, since this hook would then become useless.
See you in the next post!