INSIGHTS | March 12, 2012

3 Metal 350nm teardown explanation

A real quick image, as posted on the Facebook profile for tech .at. flylogic.net: a total of four overlaid images of a small section of an NEC upd78F9210 MCU.

A flip-flop and a few ANDs were quickly spotted. Can you find them?

INSIGHTS | March 6, 2012

Enter the Dragon(Book), Part 1

This is a fairly large topic; I’ve summarized it here in a somewhat narrative, blog-friendly way.
 
A few years ago I was reading a blog post about STL memory allocators (http://blogs.msdn.com/b/vcblog/archive/2008/08/28/the-mallocator.aspx). Memory allocators being a source of extreme security risk, I took the author’s statement, “I’ve carefully implemented all of the integer overflow checks and so forth that would be required in real production code,” as a bit of a challenge.

After playing with permutations of the code, I was able to get this STL allocator to fail. What was interesting to me is that I wasn’t only causing failures in my test code; I was also able to crash the compiler and linker.
 
Exploiting a compiler is nothing new; Trusting Trust by Ken Thompson is of course the preeminent work on this topic. In a nutshell, a compiler can be built that compiles other applications with known, subtle backdoors: even when perfectly valid, flawless code is compiled, the backdoor is included. Very interesting and tricky.
 
David A. Wheeler has a page dedicated to his PhD dissertation (http://www.dwheeler.com/trusting-trust/), which proposes a fairly simple technique known as Diverse Double-Compiling (DDC): compile all code with a safe/trusted compiler to validate the output of your possibly-untrusted compiler. Sounds simple and effective enough, right?
Enter the dragon(book), or rather the C specification. I am not a language lawyer (and I do not even play one on T.V.), but what’s interesting about the C specification is that significant portions of behavior are left to the imagination of the compiler writer (i.e., undefined operations). What if you could exploit this behavior in a deterministic way? What if you could exploit it in a cross-compiler-deterministic way?
It would seem then that you would have the perfect backdoor, undetectable by DDC techniques or even manual inspection.
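To make the idea concrete, here is a tiny illustration of the kind of latitude the C/C++ standards give compiler writers. This is only a generic sketch of the problem class, not a hint about the backdoor discussed below: because signed integer overflow is undefined, a compiler is allowed to assume it never happens and quietly delete the very overflow check a reviewer thinks is protecting them.

#include <climits>
#include <cstdio>

// Because signed overflow is undefined behavior, an optimizing compiler may
// assume "len + 1" can never wrap and remove this check entirely; the binary
// that ships is not the code that was reviewed.
static bool will_wrap(int len)
{
    return len + 1 < len;   // relies on undefined behavior
}

int main()
{
    // Depending on compiler and optimization level, this may print 1 or 0.
    printf("%d\n", will_wrap(INT_MAX) ? 1 : 0);
    return 0;
}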
 
After some time spent checking with vendors on the security-sensitive nature of this class of problems, I found out that a “fix” was unlikely (unless the C specification is altered). This gave me a clear conscience about publishing the method.

The attached code is the code I used to win the backdoor hiding contest @ DEFCON (http://defcon.org).  It is a library class written in C++/CLI that exposes a number of methods that allow for the loading/saving of data to a disk file.
 

See if you can find the backdoor; I’ll post the explanation and details of the flaw soon.


// eBookLib.cpp : main project file.
// Project requirements
// Add/Remove/Query eBooks
// One code file (KISS in effect)
//

//
// **** Mostly generated from Visual Studio project templates ****
//
#define WIN32_LEAN_AND_MEAN
#define _WIN32_WINNT 0x501

#include <windows.h>
#include <stdio.h>
#include <wchar.h>

#include <msclrmarshal.h>

#using <Microsoft.VisualC.dll>
#using <System.dll>
#using <System.Core.dll>

using namespace System;
using namespace System::IO;
using namespace System::Threading;
using namespace System::Threading::Tasks;
using namespace System::Reflection;
using namespace System::Diagnostics;
using namespace System::Globalization;
using namespace System::Collections::Generic;
using namespace System::Security::Permissions;
using namespace System::Runtime::InteropServices;
using namespace System::IO::MemoryMappedFiles;
using namespace System::IO;
using namespace System::Runtime::CompilerServices;

using namespace msclr;
using namespace msclr::interop;

//
// General Information about an assembly is controlled through the following
// set of attributes. Change these attribute values to modify the information
// associated with an assembly.
//
[assembly:AssemblyTitleAttribute("eBookLib")];
[assembly:AssemblyDescriptionAttribute("")];
[assembly:AssemblyConfigurationAttribute("")];
[assembly:AssemblyCompanyAttribute("Microsoft")];
[assembly:AssemblyProductAttribute("eBookLib")];
[assembly:AssemblyCopyrightAttribute("Copyright (c) Microsoft 2010")];
[assembly:AssemblyTrademarkAttribute("")];
[assembly:AssemblyCultureAttribute("")];
//
// Version information for an assembly consists of the following four values:
//
//      Major Version
//      Minor Version
//      Build Number
//      Revision
//
// You can specify all the values or you can default the Revision and Build Numbers
// by using the '*' as shown below:

[assembly:AssemblyVersionAttribute("1.0.*")];

[assembly:ComVisible(false)];

[assembly:CLSCompliantAttribute(true)];

[assembly:SecurityPermission(SecurityAction::RequestMinimum, UnmanagedCode = true)];

////////////////////////////////////////////////////////////////////////////////////////////////////////////////
// Native structures used from legacy system,
// define the disk storage for our ebook,
// 
// The file specified by the constructor is read from and loaded automatically, it is also auto saved when closed.
////////////////////////////////////////////////////////////////////////////////////////////////////////////////
enum eBookFlag
{
NOFLAG = 0,
ACTIVE = 1,
PENDING_REMOVE = 2
};

typedef struct _eBookAccountingData
{
// Binary Data, may include nulls
char PurchaseOrder[ACCOUNTING_SIZE];
char RecieptData[ACCOUNTING_SIZE];
size_t PurchaseOrderLength;
size_t RecieptDataLength;
} eBookAccountingData, *PeBookAccountingData;

typedef struct _eBookPublicData
{
wchar_t ISBN[BUFSIZ];
wchar_t MISC[BUFSIZ];
wchar_t ShortName[BUFSIZ];
wchar_t Author[BUFSIZ];
wchar_t LongName[BUFSIZ];
wchar_t PathToFile[MAX_PATH];
int Rating;
int SerialNumber;
} eBookPublicData, *PeBookPublicData;

typedef struct _eBook
{
eBookFlag Flag;
eBookAccountingData Priv;
eBookPublicData Pub;
} eBook, *PeBook;

// define managed analogues for native/serialized types
namespace Client {
namespace ManagedEbookLib {
[System::FlagsAttribute]
public enum class ManagedeBookFlag : int 
{
NOFLAG = 0x0,
ACTIVE = 0x1,
PENDING_REMOVE = 0x2,
};

public ref class ManagedEbookPublic 
{
public:
__clrcall ManagedEbookPublic()
{
ISBN = MISC = ShortName = Author = LongName = PathToFile = String::Empty;
}
Int32 Rating;
String^ ISBN;
String^ MISC;
Int32 SerialNumber;
String^ ShortName;
String^ Author;
String^ LongName;
String^ PathToFile;
};

public ref class ManagedEbookAccounting 
{
public:
__clrcall ManagedEbookAccounting()
{
PurchaseOrder = gcnew array<Byte>(0);
RecieptData = gcnew array<Byte>(0);
}
array<Byte>^ PurchaseOrder;
array<Byte>^ RecieptData;
};

public ref class ManagedEbook 
{
public:
__clrcall ManagedEbook()
{
Pub = gcnew ManagedEbookPublic();
Priv = gcnew ManagedEbookAccounting();
}
ManagedeBookFlag Flag;
ManagedEbookPublic^ Pub;
ManagedEbookAccounting^ Priv;
array<Byte^>^ BookData;
};
}
}

using namespace Client::ManagedEbookLib;

// extend marshal library for native/managed inter-op
namespace msclr {
   namespace interop {
template<>
inline ManagedEbookAccounting^ marshal_as<ManagedEbookAccounting^, eBookAccountingData> (const eBookAccountingData& Src) 
{
ManagedEbookAccounting^ Dest = gcnew ManagedEbookAccounting;

if(Src.PurchaseOrderLength > 0 && Src.PurchaseOrderLength < sizeof(Src.PurchaseOrder))
{
Dest->PurchaseOrder = gcnew array<Byte>((int) Src.PurchaseOrderLength);
Marshal::Copy(static_cast<IntPtr>(Src.PurchaseOrder[0]), Dest->PurchaseOrder, 0, (int) Src.PurchaseOrderLength); 
}

if(Src.RecieptDataLength > 0 && Src.RecieptDataLength < sizeof(Src.RecieptData))
{
Dest->RecieptData = gcnew array<Byte>((int) Src.RecieptDataLength);
Marshal::Copy(static_cast<IntPtr>(Src.RecieptData[0]), Dest->RecieptData, 0, (int) Src.RecieptDataLength); 
}

return Dest;
};
template<>
inline ManagedEbookPublic^ marshal_as<ManagedEbookPublic^, eBookPublicData> (const eBookPublicData& Src) {
ManagedEbookPublic^ Dest = gcnew ManagedEbookPublic;
Dest->Rating = Src.Rating;
Dest->ISBN = gcnew String(Src.ISBN);
Dest->MISC = gcnew String(Src.MISC);
Dest->SerialNumber = Src.SerialNumber;
Dest->ShortName = gcnew String(Src.ShortName);
Dest->Author = gcnew String(Src.Author);
Dest->LongName = gcnew String(Src.LongName);
Dest->PathToFile = gcnew String(Src.PathToFile);
return Dest;
};
template<>
inline ManagedEbook^ marshal_as<ManagedEbook^, eBook> (const eBook& Src) {
ManagedEbook^ Dest = gcnew ManagedEbook;

Dest->Priv = marshal_as<ManagedEbookAccounting^>(Src.Priv);
Dest->Pub = marshal_as<ManagedEbookPublic^>(Src.Pub);
Dest->Flag = static_cast<ManagedeBookFlag>(Src.Flag);

return Dest;
};
   }
}

// Primary user namespace
namespace Client
{
namespace ManagedEbooks
{
// "Store" is Client::ManagedEbooks::Store()
public ref class Store
{
private:
String^ DataStore;
List<ManagedEbook^>^ Books;
HANDLE hFile;

// serialization from disk
void __clrcall LoadDB()
{
Books = gcnew List<ManagedEbook^>();
eBook AeBook;
DWORD red = 0;

marshal_context^ x = gcnew marshal_context();
hFile = CreateFileW(x->marshal_as<const wchar_t*>(DataStore), GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, 0, OPEN_ALWAYS,  0,  0);

if(hFile == INVALID_HANDLE_VALUE) 
return;

do {
ReadFile(hFile, &AeBook, sizeof(eBook), &red, NULL);

if(red == sizeof(eBook))
Books->Add(marshal_as<ManagedEbook^>(AeBook));

} while(red == sizeof(eBook));
}

// scan hay for anything that matches needle
bool __clrcall MatchBook(ManagedEbook ^hay, ManagedEbook^ needle)
{
// check numeric values first
if(hay->Pub->Rating != 0 && hay->Pub->Rating == needle->Pub->Rating)
return true;
if(hay->Pub->SerialNumber != 0 && hay->Pub->SerialNumber == needle->Pub->SerialNumber)
return true;

// scan each string
if(!String::IsNullOrEmpty(hay->Pub->ISBN) && hay->Pub->ISBN->Contains(needle->Pub->ISBN))
return true;
if(!String::IsNullOrEmpty(hay->Pub->MISC) && hay->Pub->MISC->Contains(needle->Pub->MISC))
return true;
if(!String::IsNullOrEmpty(hay->Pub->ShortName) && hay->Pub->ShortName->Contains(needle->Pub->ShortName))
return true;
if(!String::IsNullOrEmpty(hay->Pub->Author) && hay->Pub->Author->Contains(needle->Pub->Author))
return true;
if(!String::IsNullOrEmpty(hay->Pub->LongName) && hay->Pub->LongName->Contains(needle->Pub->LongName))
return true;
if(!String::IsNullOrEmpty(hay->Pub->PathToFile) && hay->Pub->PathToFile->Contains(needle->Pub->PathToFile))
return true;
return false;
}

// destructor
__clrcall !Store()
{
Close();
}

// serialization to disk happens here
void __clrcall _Close()
{
if(hFile == INVALID_HANDLE_VALUE) 
return;

SetFilePointer(hFile, 0, NULL, FILE_BEGIN);
for each(ManagedEbook^ book in Books)
{
eBook save;
DWORD wrote=0;
marshal_context^ x = gcnew marshal_context();
ZeroMemory(&save, sizeof(save));

save.Pub.Rating = book->Pub->Rating;
save.Pub.SerialNumber = book->Pub->SerialNumber;
save.Flag = static_cast<eBookFlag>(book->Flag);

swprintf_s(save.Pub.ISBN, sizeof(save.Pub.ISBN), L"%s", x->marshal_as<const wchar_t*>(book->Pub->ISBN));
swprintf_s(save.Pub.MISC, sizeof(save.Pub.MISC), L"%s", x->marshal_as<const wchar_t*>(book->Pub->MISC));
swprintf_s(save.Pub.ShortName, sizeof(save.Pub.ShortName), L"%s", x->marshal_as<const wchar_t*>(book->Pub->ShortName));
swprintf_s(save.Pub.Author, sizeof(save.Pub.Author), L"%s", x->marshal_as<const wchar_t*>(book->Pub->Author));
swprintf_s(save.Pub.LongName, sizeof(save.Pub.LongName), L"%s", x->marshal_as<const wchar_t*>(book->Pub->LongName));
swprintf_s(save.Pub.PathToFile, sizeof(save.Pub.PathToFile), L"%s", x->marshal_as<const wchar_t*>(book->Pub->PathToFile));

if(book->Priv->PurchaseOrder->Length > 0)
{
pin_ptr<Byte> pin = &book->Priv->PurchaseOrder[0];

save.Priv.PurchaseOrderLength = min(sizeof(save.Priv.PurchaseOrder), book->Priv->PurchaseOrder->Length);
memcpy(save.Priv.PurchaseOrder, pin, save.Priv.PurchaseOrderLength);
pin = nullptr;
}

if(book->Priv->RecieptData->Length > 0)
{
pin_ptr<Byte> pin = &book->Priv->RecieptData[0];

save.Priv.RecieptDataLength = min(sizeof(save.Priv.RecieptData), book->Priv->RecieptData->Length);
memcpy(save.Priv.RecieptData, pin, save.Priv.RecieptDataLength);
pin = nullptr;
}

WriteFile(hFile, &save, sizeof(save), &wrote, NULL);
if(wrote != sizeof(save))
return;
}
CloseHandle(hFile);
hFile = INVALID_HANDLE_VALUE;
}

protected:

// destructor forwards to the disposable interface
virtual __clrcall ~Store()
{
this->!Store(); 
}

public:

// possibly hide this
void __clrcall Close()
{
_Close();
}

// constructor
__clrcall Store(String^ DataStoreDB)
{
DataStore = DataStoreDB;
LoadDB();
}

// add ebook
void __clrcall Add(ManagedEbook^ eBook)
{
Books->Add(eBook);
}

// remove ebook
void __clrcall Remove(ManagedEbook^ eBook)
{
Books->Remove(eBook);
}

// get query list
List<ManagedEbook^>^ __clrcall Query(ManagedEbook^ eBook)
{
List<ManagedEbook^>^ rv = gcnew List<ManagedEbook^>();

for each(ManagedEbook^ book in Books)
{
if(MatchBook(book, eBook))
rv->Add(book);
}
return rv;
}
};
}
}

INSIGHTS | February 24, 2012

IOActive’s IOAsis at RSA 2012

 

This is not a technical post, as usual. Rather, it’s an invitation to an important event if you are going to RSA 2012 and want to escape the chaos and experience the luxury of IOAsis while enjoying great technical talks and meeting with industry experts. If you want to feel like a VIP and have a great time, don’t miss this opportunity!

 

We have scheduled some really interesting talks such as:
  • Firmware analysis of Industrial Devices with IOActive researcher Ruben Santamarta
  • Mobile Security in the Enterprise with IOActive VP, David Baker and IOActive Principal Consultant, Ilja van Sprundel
  • The Social Aspect of Pen Testing with IOActive Managing Consultant, Ryan O’Horo
  • Battling Compliance in the Cloud with IOActive Principal Compliance Consultant, Robert Zigweid
We hope to see you there!

 

INSIGHTS | February 17, 2012

Estimating Password and Token Entropy (Randomness) in Web Applications

Entropy

“In information theory, entropy is a measure of the uncertainty associated with a random variable. In this context, the term usually refers to the Shannon entropy, which quantifies the expected value of the information contained in a message, usually in units such as bits. In this context, a ‘message’ means a specific realization of the random variable.” [1]

1. http://en.wikipedia.org/wiki/Entropy_%28information_theory%29

I find myself analyzing password and token entropy quite frequently and I’ve come to rely upon Wolfram Alpha and Burp Suite Pro to get my estimates for these values. It’s understandable why we’d want to check a password’s entropy. It gives us an indication of how long it would take an attacker to brute force it, whether in a login form or a stolen database of hashes. However, an overlooked concern is the entropy contained in tokens for session and object identifiers. These values can also be brute forced to steal active sessions and gain access to objects to which we do not have permission. Not only are these tokens sometimes too short, they sometimes also contain much less entropy than appears.

Estimating Password Entropy
Wolfram Alpha has a keyword specifically for analyzing passwords.
http://www.wolframalpha.com/input/?i=password+strength+f00b4r^LYFE

 

 

 

Estimating Token Entropy
Solving [ characters ^ length = 2 ^ x ] for x converts an arbitrary string value into bits of entropy (x = length × log2(characters)). Rather than work this out by hand, I use Wolfram Alpha to estimate the solution.

 

e.g. 1tdrtahp4y8201att8i414a7km has the formula:
http://www.wolframalpha.com/input/?i=36^26+%3D+2^x

 

Click “Approximate Form” under the “Real solution”:

 

The password strength calculator also works okay on tokens, and we’ll see a similar result:

 

http://www.wolframalpha.com/input/?i=password+strength+1tdrtahp4y8201att8i414a7km
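If you would rather skip Wolfram Alpha, the same estimate is a one-line calculation; here is a minimal sketch, assuming every token position is drawn uniformly from the full character set (which is exactly the assumption the effective-entropy analysis below challenges):

#include <cmath>
#include <cstdio>

// Solve charset^length = 2^x for x:  x = length * log2(charset).
static double token_entropy_bits(double charset, double length)
{
    return length * (std::log(charset) / std::log(2.0));
}

int main()
{
    // The 26-character, base-36 token from the example above.
    printf("%.1f bits\n", token_entropy_bits(36, 26));   // roughly 134.4
    return 0;
}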
BUT! Analysis of a single token is not enough to measure /effective/ entropy. Burp Suite Sequencer will run the proper entropy analysis tests on batches of session identifiers to estimate this value. Send your application login request (or whatever request generates a new token value) to the Sequencer and configure the Sequencer to collect the target token value. Start collecting and set the “Auto-Analyze” box to watch as Burp runs its tests.

 

A sample token “1tdrtahp4y8201att8i414a7km” from this application has an estimated entropy of 134.4 bits, but FIPS analysis of a batch of 2000 of these identifiers shows an effective entropy of less than 45 bits!

 

Not only that, but the tokens range in length from 21 to 26 characters; some are much shorter than we originally thought.
Burp will show you many charts, but these bit-level analysis charts will give you an idea of where the tokens are failing to meet expected entropy.

 

 

You can spot a highly non-random value near the middle of the token (higher is better), and the varying length of the tokens drags down entropy near the end. The ASCII-based character set used in the token has one or more unused or underused bits, as seen in the interspersed areas of very low entropy.

 

In the case illustrated above I would ask the client to change the way randomness is supplied to the token and/or increase the token complexity with a hashing function, which should increase attack resistance.

 

Remember, for session or object identifiers, you want to get close to 128 bits of /effective/ entropy to prevent brute forcing. This is a guideline set by OWASP and is in line with most modern web application frameworks.

 

If objects persist for long periods or are very numerous (in the millions) you’ll want more entropy to maintain the same level of safety as a session identifier, which is more ephemeral. An example of persistent objects (on the order of years) which rely on high entropy tokens would be Facebook photo URLs. Photos marked private are still publicly accessible, but Facebook counts on the fact that their photo URLs have high entropy.

 

The following URL has at least 160 bits of entropy:

https://fbcdn-sphotos-a.akamaihd.net/hphotos-ak-ash4/398297_10140657048323225_750784224_11609676_1712639207_n.jpg

 

For passwords, the analysis is a little more subjective, but Wolfram Alpha gives you a good estimate. You can use this password analysis for encryption keys or passphrases as well, e.g. if they are provided as part of a source code audit.

 

Happy Hacking!

 

INSIGHTS | February 8, 2012

I can still see your actions on Google Maps over SSL

A while ago, yours truly gave two talks on SSL traffic analysis: one at 44Con and one at RuxCon. A demonstration of the tool was also given at last year’s BlackHat Arsenal by two of my co-workers. The presented research and tool may not have been as groundbreaking as some of the other talks at those conferences, but attendees seemed to like it, so I figured it might make some good blog content. 

Traffic analysis is definitely not a new field, neither in general nor when applied to SSL; a lot of great work has been done by reputable research outlets, such as Microsoft Research with researchers like George Danezis. What recent traffic analysis research has tried to show is that there are enormous amounts of useful information to be obtained by an attacker who can monitor the encrypted communication stream. 

A great example of this can be found in the paper with the slightly cheesy title Side-Channel Leaks in Web Applications: a Reality Today, a Challenge Tomorrow. The paper discusses some approaches to traffic analysis on SSL-encrypted web applications and applies them to real-world systems. One of the approaches enables an attacker to build a database that contains traffic patterns of the AutoComplete function in drop-down form fields (like Google’s Auto Complete). Another great example is the ability to—for a specific type of stock management web application—reconstruct pie charts in a couple of days and figure out the contents of someone’s stock portfolio.

After discussing these attack types with some of our customers, I noticed that most of them seemed to have some difficulty grasping the potential impact of traffic analysis on their web applications. The research papers I referred them to are quite dry and they’re also written in dense, scientific language that does nothing to ease understanding. So, I decided to just throw some of my dedicated research time out there and come up with a proof of concept tool using a web application that everyone knows and understands: Google Maps.

Since ignorance is bliss, I decided to just jump in and try to build something without even running the numbers on whether it would make any sense to try. I started by running Firefox and Firebug in an effort to make sense of all the JavaScript voodoo going on there. I quickly figured out that Google Maps works by using a grid system in which PNG images (referred to as tiles) are laid out. Latitude and longitude coordinates are converted to x and y values depending on the selected zoom level; this gives a three dimensional coordinate system in which each separate (x, y, z)-triplet represents two PNG images. The first image is called the overlay image and contains the town, river, highway names and so forth; the second image contains the actual satellite data. 
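For reference, here is my reconstruction of that mapping using the standard Web Mercator tile math; this is a sketch of the publicly documented scheme, not code lifted from the tool:

#include <cmath>
#include <cstdio>

// Convert latitude/longitude plus a zoom level into the (x, y, z) triplet that
// names a tile; each triplet corresponds to an overlay PNG and a satellite PNG.
static void latlon_to_tile(double lat, double lon, int zoom, int &x, int &y)
{
    const double pi = 3.14159265358979323846;
    double n = std::pow(2.0, zoom);                    // tiles per axis at this zoom level
    double lat_rad = lat * pi / 180.0;
    x = (int)std::floor((lon + 180.0) / 360.0 * n);
    y = (int)std::floor((1.0 - std::log(std::tan(lat_rad) + 1.0 / std::cos(lat_rad)) / pi) / 2.0 * n);
}

int main()
{
    int x = 0, y = 0;
    latlon_to_tile(48.8566, 2.3522, 12, x, y);         // central Paris at zoom level 12
    printf("z=12 x=%d y=%d\n", x, y);
    return 0;
}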

Once I had this figured out the approach became simple: scrape a lot of satellite tiles and build a database of the image sizes using the tool GMapCatcher. I then built a tool that uses libpcap to approximate the image sizes by monitoring the SSL encrypted traffic on the wire. The tool tries to match the image sizes to the recorded (x,y,z)-triplets in the database and then tries to cluster the results into a specific region. This is notoriously difficult to do since one gets so many false positives if the database is big enough. Add to this the fact that it is next to impossible to scrape the entire Google Maps database since, first, they will ban you for generating so much traffic and, second, you will have to store many petabytes of image data. 
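Conceptually, the matching step is a reverse lookup from observed transfer sizes to candidate tiles, followed by clustering of the hits. The sketch below is purely illustrative: the structures and the tile sizes are made up, and this is not how the tool actually stores its database.

#include <cstdio>
#include <map>
#include <utility>

// A tile is identified by its (x, y, z) triplet.
struct Tile { int x, y, z; };

int main()
{
    // Database built offline by scraping tiles: PNG byte size -> candidate tiles.
    std::multimap<size_t, Tile> db;
    Tile a = { 2074, 1409, 12 };
    Tile b = { 2200, 1343, 12 };
    Tile c = { 2075, 1409, 12 };
    db.insert(std::make_pair((size_t)14872, a));   // sizes are invented for illustration
    db.insert(std::make_pair((size_t)14872, b));
    db.insert(std::make_pair((size_t)20311, c));

    // Sizes approximated from the encrypted stream; every match is a candidate,
    // and clustering many candidates into one area points at the region viewed.
    size_t observed[] = { 14872, 20311 };
    for (size_t i = 0; i < sizeof(observed) / sizeof(observed[0]); ++i)
    {
        std::pair<std::multimap<size_t, Tile>::iterator,
                  std::multimap<size_t, Tile>::iterator> range = db.equal_range(observed[i]);
        for (std::multimap<size_t, Tile>::iterator it = range.first; it != range.second; ++it)
            printf("size %lu -> candidate tile (%d, %d, %d)\n",
                   (unsigned long)observed[i], it->second.x, it->second.y, it->second.z);
    }
    return 0;
}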

With a little bit of cheating—I used a large browser screen so I would have more data to work with—I managed to make the movie Proof of Concept – SSL Traffic Analysis of Google Maps. 

As shown in the movie, the tool has a database that contains city profiles including Paris, Berlin, Amsterdam, Brussels, and Geneva. The tool runs on the right and on the left is the browser accessing Google Maps over SSL. In the first attempt, I load the city of Paris and zoom in a couple of times. On the second attempt I navigate to Berlin and zoom in a few times. On both occasions the tool manages to correctly guess the locations that the browser is accessing. 

Please note that it is a shoddy proof of concept, but it shows the concept of SSL traffic analysis pretty well. It also might be easier to understand for less technically inclined people, as in “An attacker can still figure out what you’re looking at on Google Maps” (with the addendum that it’s never going to be 100% perfect and that my shoddy proof of concept has lots of room for improvement).

For more specific details on this please refer to the IOActive white paper Traffic Analysis on Google Maps with GMaps-Trafficker or send me a tweet at @santaragolabs.

INSIGHTS | February 3, 2012

Solving a Little Mystery

Firmware analysis is a fascinating area within the vast world of reverse engineering, although not a very widespread one. Sometimes you end up at an impasse until you notice a minor (or major) detail you initially overlooked. That’s why sharing methods and findings is a great way to advance this field.

While looking for certain information during a session of reversing, I came across this great post. There is little to add except for solving the ‘mystery’ behind that simple filesystem and mentioning a couple of technical details.
This file system is part of Wind River’s web server architecture for embedded devices, so you will likely find it inside firmware based on VxWorks. It is known as MemFS (watch out, not the common MemFS) or the Wind River management file system, and it basically allows devices to serve files via the embedded web server without needing an ‘actual’ file system, since everything lies in non-volatile memory.
VxWorks provides pagepack, a tool used to transform any file intended to be served by a WindWeb server into C code. Therefore, a developer just compiles everything into the same firmware image.
 From a reverser’s point of view, what we should find is the following structure:
 
 

There are a few things here worth mentioning (a rough struct sketch follows the list):

  • The header is not necessarily 12 bytes; it can be 8, so the third field appears to be optional.
  • The first 4 bytes look like a flag field that may indicate, among other things, whether the file data is compressed or not (1 = compressed, 2 = plain).
  • The signature can vary between firmware images since it is defined by the constant ‘HTTP_UNIQUE_SIGNATURE’. In fact, we may find this signature twice inside a firmware image: the first occurrence comes from the .h file where it is defined (close to other strings such as the web server banner), and the second is already part of the MemFS.
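Putting those observations together, an entry can be sketched roughly as the structure below. The field names, and the meaning of everything beyond the first word, are my own guesses; only the flag values and the optional third header word come from the observations above.

/* Rough, speculative sketch of a MemFS entry header. */
struct memfs_entry_header {
    unsigned int flags;      /* 1 = compressed file data, 2 = plain                 */
    unsigned int field2;     /* second header word, meaning not documented here     */
    unsigned int field3;     /* optional: the header is 8 bytes when it is absent   */
    /* followed by the HTTP_UNIQUE_SIGNATURE string (varies per firmware),          */
    /* then the file data itself (possibly compressed)                              */
};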
Hope these additional details help you in your future research.
INSIGHTS | January 17, 2012

A free Windows Vulnerability for the NSA

Some months ago at Black Hat USA 2011 I presented this interesting issue in the workshop “Easy and Quick Vulnerability Hunting in Windows,” and now I’m sharing a more detailed explanation with everyone in this blog post.

In Windows 7 or Windows 2008, in the folder C:\Windows\Installer there are many installer files (from already installed applications) with what appear to be random names. When run, some of these installer files (like Microsoft Office Publisher MUI (English) 2007) will automatically elevate privileges and try to install when any Windows user executes them. Since the applications are already installed, there’s no problem, at least in theory.

 

However, an interesting issue arises during the installation process when running this kind of installer: a temporary file is created in C:\Users\username\AppData\Local\Temp, which is the temporary folder for the current user. The created file is named Hx????.tmp (where ???? seem to be random hex numbers), and it seems to be a COM DLL from the Microsoft Help Data Services Module, whose original name is HXDS.dll. This DLL is later loaded by the msiexec.exe process running under the System account that is launched by the Windows Installer service during the installation process.

 

When the DLL file is loaded, the code in the DLL file runs as the System user with full privileges. At first sight this seems to be an elevation of privileges vulnerability since the folder where the DLL file is created is controlled by the current user, and the DLL is then loaded and run under the System account, meaning any user could run code as the System user by replacing the DLL file with a specially-crafted one before the DLL is loaded and executed.

 

Analysis reveals that the issue is not easily exploitable since the msiexec.exe process generates an MD5 hash of the DLL file and compares it with a known-good MD5 hash value that is read from a file located in C:\Windows\Installer, which is only readable and writable by System and Administrators accounts.
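For readers less familiar with how such a check typically looks, the sketch below hashes a loaded file with MD5 via the Win32 CryptoAPI and compares it against a stored digest. It is my illustration of the concept only, not the actual code inside msiexec.exe.

#include <windows.h>
#include <wincrypt.h>
#include <string.h>

#pragma comment(lib, "advapi32.lib")

// Hash `data` with MD5 and compare against a known-good 16-byte digest.
static BOOL Md5Matches(const BYTE *data, DWORD len, const BYTE expected[16])
{
    HCRYPTPROV prov = 0;
    HCRYPTHASH hash = 0;
    BYTE digest[16];
    DWORD cb = sizeof(digest);
    BOOL ok = FALSE;

    if (CryptAcquireContext(&prov, NULL, NULL, PROV_RSA_FULL, CRYPT_VERIFYCONTEXT) &&
        CryptCreateHash(prov, CALG_MD5, 0, 0, &hash) &&
        CryptHashData(hash, data, len, 0) &&
        CryptGetHashParam(hash, HP_HASHVAL, digest, &cb, 0))
    {
        ok = (memcmp(digest, expected, sizeof(digest)) == 0);
    }

    if (hash) CryptDestroyHash(hash);
    if (prov) CryptReleaseContext(prov, 0);
    return ok;
}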

 

In order to exploit this issue, an attacker needs to replace the DLL file with a modified DLL containing exploit code that still matches the valid MD5 hash. The attacker’s DLL would then be run under the System account, allowing privilege elevation and operating system compromise. The problem is that this is not a simple attack: it requires what is known as a second-preimage attack on the MD5 hashing algorithm, for which there are no practical techniques that I know of, so it’s effectively impossible for a regular attacker to generate a file with the same MD5 hash as the existing DLL file.

 

The reason for the title of this post comes from the fact that intelligence agencies, which are known for their cracking technologies and power, probably could perform this attack and build a local elevation of privileges 0day exploit for Windows.

 

I don’t know why Microsoft continues using MD5; it has been banned by the Microsoft SDL since 2005, so it seems there has been some oversight, or these components were built without following SDL guidance. Who knows in what other functionality Microsoft continues to use MD5, allowing abuse by intelligence agencies.

 

Note: When installing some Windows updates, the Windows Installer service also creates the same DLL file in the C:\Windows\Temp folder, possibly allowing the same attack.

 

The following YouTube links provide more technical details and video demonstrations about this vulnerability.

References.

INSIGHTS | January 9, 2012

Common Coding Mistakes – Wide Character Arrays

This post contains a few of my thoughts on common coding mistakes we see during code reviews when developers deal with wide character arrays. Manipulating wide character strings is reasonably easy to get right, but plenty of “gotchas” still pop up. Coders should take care, because a few things can slip your mind when dealing with these strings and result in mistakes.

A little bit of background:
The term wide character generally refers to character data types with a width larger than a byte (the width of a normal char). The actual size of a wide character varies between implementations, but the most common sizes are 2 bytes (i.e. Windows) and 4 bytes (i.e. Unix-like OSes). Wide characters usually represent a particular character using one of the Unicode character sets: in Windows this will be UTF-16 and for Unix-like systems, whose wide characters are twice the size, this will usually be UTF-32.
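A one-liner makes the size difference concrete; on Windows this typically prints 2, and on most Unix-like systems it prints 4:

#include <cstdio>

int main()
{
    // 2 on Windows (UTF-16 code units), typically 4 on Unix-like systems (UTF-32).
    printf("sizeof(wchar_t) = %u\n", (unsigned)sizeof(wchar_t));
    return 0;
}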

 

Windows seems to love wide character strings and has made them standard. As a result, many Windows APIs have two versions: functionNameA and functionNameW, an ANSI version and a wide char string version, respectively. If you’ve done any development on Windows systems, you’ll definitely be no stranger to wide character strings.

 

There are definite advantages to representing strings as wide char arrays, but there are a lot of mistakes to make, especially if you’re used to developing on Unix-like systems or you forget to consider the fact that one character does not equal one byte.

 

For example, consider the following scenario, in which a Windows developer begins to unsuspectingly parse a packet that follows their proprietary network protocol. The code shown takes a UTF-16 string length (unsigned int) from the packet and performs a bounds check. If the check passes, a string of the specified length (assumed to be a UTF-16 string) is copied from the packet buffer to a fresh wide char array on the heap.

 

[ … ]
if(packet->dataLen > 34 || packet->dataLen < sizeof(wchar_t)) bailout_and_exit();
size_t bufLen = packet->dataLen / sizeof(wchar_t);

wchar_t *appData = new wchar_t[bufLen];
memcpy(appData, packet->payload, packet->dataLen);
[ … ]
This might look okay at first glance; after all, we’re just copying a chunk of data to a new wide char array. But consider what would happen if packet->dataLen was an odd number. For example, if packet->dataLen = 11, we end up with size_t bufLen = 11 / 2 = 5 since the remainder of the division will be discarded.

 

So, a five-element–long wide character buffer is allocated into which the memcpy() copies 11 bytes. Since five wide chars on Windows is 10 bytes (and 11 bytes are copied), we have an off-by-one overflow. To avoid this, the modulo operator should be used to check that packet->dataLen is even to begin with; that is:

 

 

if(packet->dataLen % 2) bailout();

 

Another common occurrence is to forget that the NULL terminator on the end of a wide character buffer is not a single NULL byte: it’s two NULL bytes (or 4, on a UNIX-like box). This can lead to problems when the usual len + 1 is used instead of the len + 2 that is required to account for the extra NULL byte(s) needed to terminate wide char arrays, for example:

 

int alloc_len = len + 1;
wchar_t *buf = (wchar_t *)malloc(alloc_len);
memset(buf, 0x00, len);
wcsncpy(buf, srcBuf, len);
If srcBuf had len wide chars in it, all of these would be copied into buf, but wcsncpy() would not NULL terminate buf. With normal character arrays, the added byte (which will be a NULL because of the memset) would be the NULL terminator and everything would be fine. But since wide char strings need either a two- or four-byte NULL terminator (Windows and UNIX, respectively), we now have a non-terminated string that could cause problems later on.
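A safer version of the fragment above, reusing its len and srcBuf, sizes the allocation in wide characters rather than bytes and terminates the string explicitly:

// Allocate room for len wide chars plus a terminator; sizeof(wchar_t) keeps
// this correct whether wide chars are two or four bytes.
wchar_t *buf = (wchar_t *)malloc((len + 1) * sizeof(wchar_t));
if (buf != NULL)
{
    memset(buf, 0x00, (len + 1) * sizeof(wchar_t));
    wcsncpy(buf, srcBuf, len);
    buf[len] = L'\0';   // explicit terminator, two or four NULL bytes as appropriate
}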

 

Some developers also slip up when they wrongly interchange the number of bytes and the number of characters. That is, they use the number of bytes as a copy length when what the function was asking for was the number of characters to copy; for example, something like the following is pretty common:
int destLen = (stringLen * sizeof(wchar_t)) + sizeof(wchar_t);
wchar_t *destBuf = (wchar_t *)malloc(destLen);
MultiByteToWideChar(CP_UTF8, 0, srcBuf, stringLen, destBuf, destLen);
[ do something ]

 

The problem with the sample shown above is that the sixth parameter to MultiByteToWideChar is the length of the destination buffer in wide characters, not in bytes, which is what was passed in the call above. Our destination length is out by a factor of two here (or four on UNIX-like systems, generally) and ultimately we can end up overrunning the buffer. These sorts of mistakes result in overflows, and they’re surprisingly common.
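A corrected sketch of the same call, reusing srcBuf and stringLen from the example above, passes the destination size as an element count:

// One UTF-8 byte never expands to more than one UTF-16 code unit, so
// stringLen elements plus one for the terminator is sufficient.
int destChars = stringLen + 1;
wchar_t *destBuf = (wchar_t *)malloc(destChars * sizeof(wchar_t));
if (destBuf != NULL)
{
    int written = MultiByteToWideChar(CP_UTF8, 0, srcBuf, stringLen, destBuf, destChars);
    destBuf[written] = L'\0';   // cbMultiByte did not include a NUL, so terminate manually
}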

 

The same sort of mistake can also be made when using “safe” wide char string functions, like wcsncpy(), for example:
unsigned int destLen = (stringLen * sizeof(wchar_t)) + sizeof(wchar_t);
wchar_t destBuf[destLen];
memset(destBuf, 0x00, destLen);
wcsncpy(destBuf, srcBuf, sizeof(destBuf));
Although using sizeof(destBuf) for the maximum destination size would be fine if we were dealing with normal characters, this doesn’t work for wide character buffers. Instead, sizeof(destBuf) will return the number of bytes in destBuf, which means the wcsncpy() call above can end up copying twice as many bytes to destBuf as intended: again, an overflow.
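The fix is to pass the destination size in elements rather than bytes and to terminate explicitly. A quick sketch, using an arbitrary 64-element buffer for illustration and srcBuf as the source string from the example above:

wchar_t destBuf[64];
size_t destChars = sizeof(destBuf) / sizeof(destBuf[0]);   // element count, not bytes
memset(destBuf, 0x00, sizeof(destBuf));
wcsncpy(destBuf, srcBuf, destChars - 1);                   // leave room for the terminator
destBuf[destChars - 1] = L'\0';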

 

The other wide char equivalent string manipulation functions are also prone to misuse in the same ways as their normal char counterparts—look for all the wide char equivalents when auditing such functions as swprintf, wcscpy, wcsncpy, etc. There are also a few wide char-specific APIs that are easily misused; take, for example, wcstombs(), which converts a wide char string to a multi-byte string. The prototype looks like this:
size_t wcstombs(char *restrict s, const wchar_t *restrict pwcs, size_t n);

 

It does bounds checking, so the conversion stops when n bytes have been written to s or when a NULL terminator is encountered in pwcs (the source buffer). If an error occurs, i.e. a wide char in pwcs can’t be converted, the conversion stops and the function returns (size_t)-1; otherwise the number of bytes written is returned. MSDN considers wcstombs() to be deprecated, but there are still a few common ways to mess up when using it, and they all revolve around not checking return values.

 

If a bad wide character is encountered in the conversion and you’re not expecting a negative number to be returned, you could end up under-indexing your array; for example:
int i;
i = wcstombs( … );  // wcstombs() can return -1
buf[i] = L'\0';
If a bad wide character is found during conversion, the destination buffer will not be NULL terminated and may contain uninitialized data if you didn’t zero it or otherwise initialize it beforehand.
Additionally, if the return value is n, the destination buffer won’t be NULL terminated, so any string operations later carried out on or using the destination buffer could run past the end of the buffer. Two possible consequences are a potential page fault if an operation runs off the end of a page or potential memory corruption bugs, depending on how destbuf is used later. Developers should avoid wcstombs() and use wcstombs_s() or another, safer alternative. Bottom line: always read the docs before using a new function since APIs don’t always do what you’d expect (or want) them to do.

 

Another thing to watch out for is accidentally interchanging wide char and normal char functions. A good example would be incorrectly using strlen() on a wide character string instead of wcslen()—since wchar strings are chock full of NULL bytes, strlen() isn’t going to return the length you were after. It’s easy to see how this can end up causing security problems if a memory allocation is done based on a strlen() that was incorrectly performed on a wide char array.

 

Mistakes can also be made when trying to develop cross-platform or portable code—don’t hardcode the presumed length of wchars. In the examples above, I have assumed sizeof(wchar_t) = 2; however, as I’ve said a few times, this is NOT necessarily the case at all, since many UNIX-like systems have sizeof(wchar_t) = 4.

 

Making these assumptions about width could easily result in overflows when they are violated. Let’s say someone runs your code on a platform where wide characters aren’t two bytes in length, but are four; consider what would happen here:
wchar_t *destBuf = (wchar_t *)malloc(32 * 2 + 2);
wcsncpy(destBuf, srcBuf, 32);
On Windows, this would be fine since there’s enough room in destBuf for 32 wide chars + NULL terminator (66 bytes). But as soon as you run this on a Linux box—where wide chars are four bytes—you’re going to get wcsncpy() writing 4 * 32 = 128 bytes, resulting in a pretty obvious overflow.

 

So don’t make assumptions about how large wide characters are supposed to be since it can and does vary. Always use sizeof(wchar_t) to find out.

 

When you’re reviewing code, keep your eye out for the use of unsafe wide char functions, and ensure the math is right when allocating and copying memory. Make sure you check your return values properly and, most obviously, read the docs to make absolutely sure you’re not making any mistakes or missing something important.
INSIGHTS | December 7, 2011

Automating Social Engineering: Part Three

 

PHASE 2: Ruses

 

Once we have enough information about the employees and company in question, we can begin to make sense of the information and start crafting our ruses. It is worth noting that this stage currently does not have a lot of automation, since it requires a lot of human intuition and information processing. Certainly, as we continue developing the tool we will be able to automate more and create decision-making systems capable of producing useful ruses, but for now a key factor in this phase is to look for key ideas and useful information that help us make our attack as realistic and trustworthy as possible.

 

As previously mentioned, this stage is not fully automated. Still, EMaily provides several examples and template emails that can help automate some ruses. Having said this, if we want high success rates with our attacks, it is still clear that testers will need to craft their own custom-built email ruses based on the data gathered in the previous phase.

 

Before we move on, there are a few things worth discussing before we dig into the technical tools themselves. Social engineering is about abusing people and abusing their needs, senses, ideas, likes, dislikes, fears, etc. It is basically about selling a good lie, one good enough that even you would believe it.
Depending on the feeling or idea we are trying to generate or trigger, we need to use a different ruse or combination of ruses. For example, the list below shows a short list of ruses grouped by the specific feeling they try to trigger.

 

Ruses Examples:

 

General Templates:
  • Facebook invite
  • Twitter invite
  • LinkedIn invite
  • Mail cannot be delivered.
  • Etc.
Fear Oriented Templates:
  • Virus found
  • Compliance updates
  • Popular thread in the news.
  • Etc.

Needs or Likes Templates:

  • Win an iPad.
  • Internships or new internal available positions.
  • General company party.
  • New corporate discounts.
  • Etc.
Gossip or Need to Know Templates:
  • Latest financial data
  • Layoffs for next month
  • Internal memorandum.
  • Etc.
Let’s look at a couple of real-life examples of such ruses:

 

Source:

 

In this case, the person copied a friendship request from Facebook. We can easily change the links and redirect them to a fake Facebook website, among many other things. Furthermore, we could use other techniques, which we will discuss later, to actually scan internal egress firewall rules by adding fake images to the email.

 

PHASE 3: Internal Information gathering: Software and Physical networks.

Once we have a target list (emails, names, etc.) and a ruse, we need to begin gathering information about the internal network and infrastructure from within. One possible way of doing this is by sending one or more rounds of emails using specially crafted HTML templates consisting of several image tags pointing to different ports, as shown in the figure below.

 

 
EMaily is a command-line tool created to send multiple template emails using several servers at the same time. It contains many templates, but users can create their own templates and populate them as needed. It is worth noting that EMaily is also an extensible Ruby library that can be called from any other Ruby script or application.

Once we have a list of ports we want to scan from an internal perspective, we do not need to generate the entire list by hand. EMaily will automatically generate the list and populate it into the corresponding email using the template system, as shown in the following code snippet, simply by using %%payload[port 1, … ,port n]%%.

You may ask, “But what is there to gain from generating this random set of images?” The short answer is: lots of information. Not only will we be able to confirm that the firewall at the company we are trying to attack is not properly filtering a particular port, but whenever an application makes an HTTP request, it also sends lots of useful information back to the attacker.

 

As we can see from the output generated by EMaily, this will test egress rules and obtain information such as the operating system, the email client used, IP addresses, and more.
Basically, EMaily works as a reverse scanner: testers send their “payloads” to victims, whose clients render the images and generate a list of requests that are served by the EMaily web server process, allowing us to gather all kinds of information, as shown in the figure below.

 

Now we have emails and, hopefully, tons of information tied to each address, such as egress rules, operating system name and version, browser or mail client name and version, mobile phone information, etc. We have enough to start using the more interesting ruses and correctly targeting the victims with the correct payloads (since we know the OS and mail client versions) and the open ports that allow connections back to us. In the next phase we will discuss how to use all that information to successfully compromise the company being tested.

This is part three of a four-part social engineering post. The next and final entry will discuss compromising machines.

 

INSIGHTS | November 8, 2011

Automating Social Engineering: Part Two

 

As with any other type of penetration test, we need to gather information. The only difference here is that instead of looking for operating system types, software versions, and vulnerabilities, we’re searching for information about the company, their employees, their social networking presence, et cetera.

Given that we’re performing an assessment from a corporate perspective, there are some limitations with regard to privacy and employees’ private lives, but the truth is that real attackers won’t abide by such limitations. So, you should assume that any information made public or available on the Internet will be considered usable. (Disclosure: consultants/employees should talk to their client/employer and lawyers to define the scope for any penetration test prior to information gathering.)

 

As stated in the comic, information gathering is really simple and there’s only one rule: there is never enough information; the more you have the better. Everything is relevant in some way or another—everything from company icons, images, and documents all the way down to where an employee went to dinner last week and with whom.

 

Luckily for us, Mark Zuckerberg (creator of Facebook) and corporate America have made people’s lives public and easy to follow by convincing them that they’re supposed to forget about privacy and share as much information as they can with as many people and services as they can, because it is “good” for them.

 

The type of data we need depends on the type of attack we’re performing. Given that we are currently discussing social engineering assessments in a corporate context, we will surely need to gather corporate email accounts and plenty of names. There are many tools capable of performing Open Source Intelligence (OSINT) gathering, including theHarvester, Maltego, and, of course, ESearchy.

 

ESearchy is a project that I began a few years ago as a small Ruby library with a proof-of-concept CLI tool capable of searching the Internet for email addresses and people from a specific domain or company. Currently, the supported search plug-ins include but are not limited to:

 

Search Engines
– Google
– Bing
– Yahoo
– AltaVista
Social Engines
– LinkedIn
– Google Profiles
– Naymz
– Classmates
– Spoke
– Google+
Other Engines
– PGP servers
– Usenets
– GoogleGroups Search
– Spider
– LDAP
In addition to that, ESearchy is capable of downloading—upon request—several types of files and searching their contents for emails. File types supported include but are not limited to:
PDF
DOC
DOCX
ODP
ODS
ODB
XLSX
PPTX
TXT
ODT
ASN

 

With this simple introduction, we’re now going to install the tool and test a few of the information gathering concepts described above. ESearchy is currently hosted as a Ruby gem at https://rubygems.org, so fetching the gem in any Linux, OSX, or Windows environment will install all the necessary dependencies and binaries.
Note: Ubuntu users will need to add the Ruby path to their $PATH in order to run esearchy.

 

$> sudo gem install esearchy

 

Once ESearchy is installed, we are ready to start gathering information. As previously mentioned, the application supports several types of searches using the esearchy CLI command and/or custom scripts built on the ESearchy library—that is, scripts that require ‘esearchy’.
Using the tool is straightforward; for example:

 

$> esearchy -q @company.com --enable-google --enable-pgp
$> esearchy -q @company.com -c "Company Inc" --enable-linkedin
For a full description of the supported engines and all other ESearchy features, please refer to the help command in the ESearchy tool itself:
esearchy -h
Although we now have a list of email addresses related to the company in question, it’s a good idea to continue gathering as much data as possible. We should continue performing searches; we may need to find information regarding the DNS servers and mail servers, as well as other information that is usually collected as part of a standard penetration test. ESearchy currently does not perform these search types, but that functionality will be supported in future versions as a separate, standalone tool.
Last but not least, a good way to confirm (and possibly obtain more) email addresses involves checking the SMTP server for vulnerabilities (such as information disclosures via VRFY or EXPN, et cetera). If present, these should allow us to confirm our email addresses and possibly even acquire more.

 

This is part two of a four-part social engineering post. The next entry will discuss using ruses to gather more intrusive information about the internal network.