From C to shellcode (simple way)
TL;DR: c-to-shellcode.py (GitHub)
Shellcode (in terms of malware development) is an independent piece of machine code that can be injected anywhere and executed without worrying about dependencies, DLLs, stack layout, or other such variables.
The most obvious way to create shellcode is to use Assembly language, which is very predictable. Dependency-free Assembly guarantees the conditions required for valid shellcode. However, writing extensive code in Assembly can be quite complicated. Mankind noticed this problem a long time ago and created the C language in the 1970s. I write about the history of the C language here: C standard vs implementation.
C is much easier to use than Assembly but has drawbacks nonetheless. We do not have full control over the stack, and the machine code produced is larger and less predictable. Compilers add a lot of their own functions and compile everything into a complex PE or ELF file structure.
So is it actually possible to write code in C that could be used as standalone shellcode? Yes, it can be done, although it requires special steps.
How to write a shellcode in C?
When writing shellcode in C, we need to be careful about a few things. First, after compilation, the only section we will have access to is the .text section. So we cannot use global constants or string literals directly in the code, because they are placed in the .rodata or .data section.
1. We have to put all constants on the stack, string literals too. To get stack-based strings, we have to turn the string into an array of chars and put it in a local variable:
IMPORTANT: Remember about the null-terminator character at the end of the array!
In the case of wide character strings (often used in WinAPI), the notation is as follows:
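A minimal sketch of both notations, assuming GCC-style C. The function names are hypothetical and exist only so the arrays can be exercised; what matters is the char-by-char initializer and the explicit terminator. Whether the compiler really keeps the bytes out of .rodata depends on the compiler and flags, so it is worth checking the generated code.

```c
#include <stddef.h>

/* Narrow stack-based string: every character is a separate initializer,
   so there is no string literal for the compiler to place in .rodata. */
size_t narrow_stack_string_len(void) {
    char cmd[] = {'c', 'a', 'l', 'c', '.', 'e', 'x', 'e', '\0'};
    size_t n = 0;
    /* The loop relies on the terminator being present in the array. */
    while (cmd[n] != '\0') {
        n++;
    }
    return n;
}

/* Wide stack-based string: same idea, with wchar_t elements and L'' literals. */
size_t wide_stack_string_len(void) {
    wchar_t dll[] = {L'k', L'e', L'r', L'n', L'e', L'l', L'3', L'2',
                     L'.', L'd', L'l', L'l', L'\0'};
    size_t n = 0;
    while (dll[n] != L'\0') {
        n++;
    }
    return n;
}
```

Without the trailing '\0' both loops would run past the end of the array, which is exactly the bug the warning above is about.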
Here's a convenient one-liner in Python for converting strings to stack-based strings:
2. We don't have access to any external libraries so no libc. Independent shellcode can't rely on dependencies it won't load itself. To interact with the operating system, you need to use the Windows API via indirect API calling (manually parsing kernel32.dll and using WinAPI functions).
3. All our local functions must be placed in a separate section by the compiler, so that we can move them to the end of the shellcode at the linking stage. I write more about linking below. To achieve this we will use a special directive of the GCC compiler:
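A sketch of what such a directive can look like, assuming GCC's section attribute and the .func section name used later in this post. The FUNC macro name and the add_numbers helper are hypothetical; start matches the entry point name passed to the linker further down.

```c
/* FUNC is a hypothetical macro name. It asks GCC to emit the function
   into a section called .func instead of the default .text. */
#define FUNC __attribute__((section(".func")))

/* Local helper: lives in .func, so the linker can place it after the entry point. */
FUNC int add_numbers(int a, int b) {
    return a + b;
}

/* Entry point: left in .text, which is assumed to come first in the final binary. */
int start(void) {
    return add_numbers(2, 3);
}
```

Every helper gets the attribute; only the entry point is left without it.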
Compilation and linking
For compilation I use MinGW (x86_64-w64-mingw32-gcc-win32), which is a port of GCC for Windows. Trying to do the same from MSVC on Windows can lead to a mental breakdown. MinGW implements all necessary GCC flags to generate shellcode:
-Os - optimize generated machine code for size rather than speed;
-fPIC - generate position-independent code (don't hardcode specific memory addresses);
-nostdlib - don't link with libc;
-nostartfiles - don't link with standard startup files, don't include standard initialization code that runs before the main function;
-ffreestanding - generate code for a freestanding environment (no dependencies or runtime assumptions);
-fno-asynchronous-unwind-tables - don't generate stack unwind tables (reduce binary size);
-fno-ident - don't generate compiler identification string (reduce binary size);
-s - strip all symbols and debugging information (reduce binary size);
-e start - specify the entry point to the program (instead of the default main).
Then we take the generated file and link it using a special linker script:
I will not describe here how the linker script works exactly. The most important information is that it generates a flat binary file with our entry point at the beginning of the shellcode, followed only by local functions. This way, by injecting the shellcode, we can start execution at the beginning of the buffer:
The .text and .func sections (where we keep our functions) have been merged into one continuous block of raw machine code in payload.bin. This is our independent shellcode! We can embed the binary file prepared this way in the shellcode loader and execute it.
Indirect API calling in C
I described indirect API calling, i.e. manually parsing the kernel32.dll library structures from process memory, in detail in this blog post: Shellcode x64: Find and execute WinAPI functions with Assembly. Now I'm just going to demonstrate how much faster and more convenient it is to get the same effect using C.
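The core of that lookup is a walk over three parallel tables of the export directory: names, name ordinals and function addresses. The sketch below shows only this walk on a simplified, portable mock. The struct, the demo data and all function names are hypothetical; the real export directory stores RVAs relative to the module base rather than plain pointers.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*export_fn)(void);

/* Hypothetical stand-in for the PE export directory (pointers instead of RVAs). */
typedef struct {
    size_t number_of_names;
    const char *const *names;      /* role of AddressOfNames */
    const uint16_t *name_ordinals; /* role of AddressOfNameOrdinals */
    const export_fn *functions;    /* role of AddressOfFunctions */
} mock_export_dir;

/* Tiny string comparison, since libc is not available in shellcode. */
static int names_equal(const char *a, const char *b) {
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return *a == *b;
}

/* Finds a function by name: name index -> ordinal -> function address. */
export_fn find_export(const mock_export_dir *dir, const char *wanted) {
    for (size_t i = 0; i < dir->number_of_names; i++) {
        if (names_equal(dir->names[i], wanted)) {
            uint16_t ordinal = dir->name_ordinals[i];
            return dir->functions[ordinal];
        }
    }
    return NULL; /* name not exported */
}

/* Demo data: the ordinals deliberately differ from the name order. */
int demo_alpha(void) { return 1; }
int demo_beta(void) { return 2; }
const char *const demo_names[] = {"Alpha", "Beta"};
const uint16_t demo_ordinals[] = {1, 0};
const export_fn demo_functions[] = {demo_beta, demo_alpha};
const mock_export_dir demo_dir = {2, demo_names, demo_ordinals, demo_functions};
```

The extra hop through the ordinal table is the part that is easy to forget: the index of a name is not necessarily the index of its function.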
I implemented two functions from the standard libc: wcscmp and strcmp. The function that retrieves the address of the PEB structure had to be implemented using GCC's disgusting inline assembly syntax, since we are using the GS segment register here:
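A minimal sketch of these three helpers, not the original implementation. The names sc_strcmp, sc_wcscmp and get_peb are hypothetical. The PEB getter relies on the fact that on x64 Windows the GS register points at the TEB, which holds the PEB pointer at offset 0x60; it is guarded so that the rest of the block still compiles on other targets.

```c
#include <stddef.h>

/* strcmp replacement: returns <0, 0 or >0 like the libc version. */
int sc_strcmp(const char *a, const char *b) {
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (int)(unsigned char)*a - (int)(unsigned char)*b;
}

/* wcscmp replacement for wide strings. */
int sc_wcscmp(const wchar_t *a, const wchar_t *b) {
    while (*a != L'\0' && *a == *b) {
        a++;
        b++;
    }
    if (*a == *b) {
        return 0;
    }
    return (*a < *b) ? -1 : 1;
}

#if defined(_WIN64) && defined(__GNUC__)
/* Reads the PEB pointer from gs:[0x60] using GCC inline assembly (AT&T syntax). */
void *get_peb(void) {
    void *peb;
    __asm__ volatile("movq %%gs:0x60, %0" : "=r"(peb));
    return peb;
}
#endif
```

In the actual shellcode these helpers would also be placed in the separate function section described earlier.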
The following program runs calc.exe without using any libraries directly (note the stack-based strings):
A more unusual thing in this program is the ALIGN_STACK() macro before calling the WinExec function. WinAPI (the Windows x64 calling convention) requires the stack to be 16-byte aligned at the point of the call. Since the compiler does not know that we are calling a WinAPI function (indirect API calling), we have to take care of stack alignment ourselves before each call. Not gonna lie, this mess is generated by AI. I'm disgusted by AT&T syntax with GCC inline Assembly.
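The alignment itself boils down to clearing the low four bits of the stack pointer. The sketch below shows only that arithmetic in plain C on an ordinary integer; the helper name align_down_16 is hypothetical, and the real macro has to apply the same operation to RSP through inline assembly.

```c
#include <stdint.h>

/* Rounds an address down to the nearest multiple of 16.
   On RSP the same effect would come from an instruction like: and $-16, %rsp */
uintptr_t align_down_16(uintptr_t addr) {
    return addr & ~(uintptr_t)0xF;
}
```

Rounding down is the safe direction here: the stack grows toward lower addresses, so the adjusted pointer only moves further into unused space.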
It is worth noting that in the code you can use types belonging to external header files:
Types are used only at the compilation stage and do not affect the “no dependency” principle, as long as you do not use a specific function. The types and macros themselves are harmless.
Full source code available at: payload.c
Here's the result:
Assembly vs C shellcode
The entire C program executing calc.exe, compiled into shellcode, takes 480 bytes. A program doing the same thing written in pure Assembly takes about 200 bytes, and my Assembly isn't the most concise code in the world. So the C version is more than twice as large (roughly 2.4 times). With larger programs this difference will probably increase, but the benefits of using C (in my opinion) are more important than the bytes saved.
Code written in C is just readable, easy to expand and maintain. In fact, it is rather obvious at first glance.
Automation script
I wouldn't be myself if I didn't automate the entire process. A Python script that compiles C to shellcode and injects it right into the example loader can be found here: c-to-shellcode.py
The script generates the following files:
bin/payload.exe - compiled C program (without shellcode conversion), so you can use libc and WinAPI functions directly, e.g. printf(). Great for debugging and fast development.
bin/loader.exe - sample loader with compiled shellcode. It really injects shellcode into memory and executes it just like real malware.
bin/payload.bin - raw shellcode binary file.
The Python script allows for rapid prototyping and debugging. It returns all necessary file formats for effective malware development.
~ Print3M