6.2. Armored Viruses
The term armored virus was coined by the computer crime unit of New Scotland Yard to describe computer viruses that make it even more difficult to detect and analyze their functions quickly.
A computer virus's primary goal is to spread as far as possible without being noticed. In previous chapters, you learned about advanced techniques, such as stealth, that can hide the virus code and prevent it from being detected too quickly. The authors of armored viruses want to be sure that the virus code is even more difficult for scanners to detect, even if the scanners use techniques such as heuristics that can pinpoint previously unknown computer viruses. Furthermore, if a virus sample is obtained by any means, its author wants to make the analysis of the virus code as difficult as possible to further delay rapid response to the virus attack.
The following section describes basic methods of armored viruses:
Computer viruses written in Assembly language are challenging to understand because they often use tricks that normal programs never or very rarely use. There are several good disassemblers on the market. I can strongly recommend Data Rescue's IDA for computer virus analysis, which allows you to redefine the location of code and data on the fly.
Several disassemblers have achieved great success in the past, such as the almost automated disassembler called Sourcer. To prevent such analysis, virus writers deploy techniques to trick such solutions. The greatest attack on disassemblers are simple variations of code obfuscation techniques such as encryption, polymorphism, and especially metamorphism. Because these techniques are rather important to discuss in detail, Chapter 7, "Advanced Code Evolution Techniques and Computer Virus Generator Kits" is dedicated to them.
6.2.2. Encrypted Data
One of the most obvious ways to avoid disassembling is to use encrypted data in the virus code. All constant data in the virus can be encrypted. When the virus is loaded into the disassembler, the virus code will reference encrypted snippets, which you need to decrypt one by one to understand what the virus does with themmaking virus analysis an even more tedious process.
For example, the W95/Fix2001 worm sends stolen account information to the attacker via e-mail. The author of the worm does not want you to discover quickly the e-mail address to which it sends the information, so examining the code of the worm in a file view or disassembler will not reveal the e-mail addresses to which the worm sends the stolen information. Can you guess what the encrypted block shown in Figure 6.1 hides?
Figure 6.1. An encrypted block of data in the W95/Fix2001 worm.
When you analyze the worm in a disassembler such as IDA (the interactive disassembler), you will find a decryption routine that uses a simple XOR decryption. The decryption loop will decrypt 87h bytes, with the constant value 19h coming backwards, as shown in Figure 6.2.
Figure 6.2. A data decryptor of the W95/Fix2001 worm.
The "xor byte ptr encrypted[ecx], 19h" instruction could be also shown as "xor byte ptr [ecx+004048DB],19".
To continue your analysis, you need to decrypt the data. For example, you might prefer to use a debugger, which will obviously take some time and make your analysis more difficult. The decrypted data will look like that shown in Figure 6.3.
Figure 6.3. Decrypted data in the W95/Fix2001 worm.
On many occasions, incorrect information is published by teams of incompetent virus analysts. This happens even more often when a virus is encrypted. Such misleading information often reaches the media, causing damage by misinforming users about the threat. The only reliable way to analyze virus code is with comprehensive care. Anything else is unprofessional and must be avoided.
6.2.3. Code Confusion to Avoid Analysis
Consider a simple file-writing function under DOS:
MOV CX, 100h ; this many bytes MOV AH, 40h ; to write INT 21h ; use main DOS handler
Such code is very easy to read in a disassemblerbut the attacker knows this. In fact, heuristic analyzers that use disassembling also will find similar code sequences.
When the code is written in an obfuscated way, as shown in Listing 6.1, you need to make your own calculations to figure which function will be called. Thus you might prefer to use a debugger; however, the attacker also might present antidebugging techniques to prevent you from using the debugger.
Listing 6.1. Slightly Obfuscated Code
MOV CX,003Fh ; CX=003Fh INC CX ; CX=CX+1 (CX=0040h) XCHG CH,CL ; swap CH and CL (CX=4000h) XCHG AX,CX ; swap AX and CX (AX=4000h) MOV CX,0100h ; CX=100h INT 21h
As a result, when the code reaches INT 21h, the registers will be set accordingly. The attacker also might prefer to introduce many jump instructions into the code flow. This can make the use of a disassembler hard, especially when several kilobytes of code are used to jump back and forth to confuse you. The preceding code can easily be dealt with using a code emulator in the scanner. The attacker, however, might introduce antiemulation techniques to prevent you from using one.
6.2.4. Opcode MixingBased Code Confusion
Consider the example shown in Figure 6.4, selected from the W98/Yobe virus. Notice that the CALL instruction has an offset of 4013E6+1. The +1 is a side effect of a B8h (MOV) opcode, which was inserted into the code flow to confuse the disassembling.
Figure 6.4. Opcoding mixing code confusion in W98/Yobe.
Fortunately, a modern disassembler such as IDA allows the quick redefinition of an instruction as data. When the B8h is redefined as a single byte of data, the code flow is cleared. Notice the proper disassembling of the code at loc_4013e7 in the disassembly shown in Figure 6.5.
Figure 6.5. Correctly disassembled code by defining a B8h data byte.
In fact, a CALL to a POP instruction is a common sequence in computer viruses to adjust for their location in the file. Note that CALL pushes the offset where the code would need to return. This is immediately picked from the top of the stack to get the location of the virus body. Consequently the preceding trick might confuse a heuristic analyzer that uses disassembling because it will easily miss the CALL to a POP pair.
6.2.5. Using Checksum
Many viruses use some sort of checksum, such as the CRC32 algorithm or something simpler, to avoid using any string matching in code. The code is straightforward to read, but you can get confused about its meaning. In some cases, this kind of checksum becomes an extremely tricky puzzle. The W95/Drill virus3, written by Metal Driller, contains such a trick. The virus picks API names according to the name's checksum, which is stored in the virus. When I analyzed the virus, I could not resolve a few API checksums to anything meaningful as a string. In fact, a few APIs were only used by an antiemulation trick in the virus, so they were not essential to the proper functioning of the virus code.
In my article about W95/Drill for Virus Bulletin, I could never explain exactly to what some API checksums corresponded. A couple of months later when we received the source of the virus code in a zine, I found the following minor typo:
;; DWORD GetCurrentProcessID(void) dd 8D91AE5Fh, 0
The proper way to write the name of the API is GetCurrentProcessId(). Because the last character in the API name is incorrect, the precalculated checksum will never match any API string. Another mystery solved!
6.2.6. Compressed, Obfuscated Code
Many successful computer worms, such as W32/Blaster (further discussed in Chapter 10, "Exploits, Vulnerabilities, and Buffer Overflow Attacks"), are written in a high-level language. The attacker often uses compression because it has several advantages, including making the relatively large virus code more compact. The analysis of the code is more difficult because the executable needs to be unpacked first. Furthermore, scanners and heuristics analyzers both are challenged unless appropriate support is available.
There are well over 500 packer versions that attackers can use. Fortunately, many of these packers are buggy and do not run on all Win32 platforms. Several solutions do exist, such as ASPack and ASProtect, that support polymorphism for individual compressions as well as forms of antidebugging. In fact, it is known that ASPack uses the polymorphic engine of the W95/Marburg virus4.
Consider the example from W32/Blaster, shown in Figure 6.6. The header of the worm's file will not reveal the usual section names, but it allows you to guess the name of the packer, which is UPX.
Figure 6.6. The PE header area of the W32/Blaster worm.
Later in the worm's body, you can only find readable snippets of strings, as shown in Figure 6.7.
Figure 6.7. Some string snippets in the packed content of the W32/Blaster worm.
To analyze the code effectively in a disassembler, it must be unpacked first. You might be able to use a Win32 debugger, such as Turbo Debugger, OllyDBG, or SoftICE, to unpack the content of the program in memory. However, you might first need to deal with antidebugging tricks. Antidebugging is explained in the next section.
Attackers can use a number of tricks as antidebugging features. The attacker's goal is to prevent you from using a debugger easily. Because hardware supports debugging, the antidebug features can be rather platform-specific. In this section, you will learn about a few common techniques on more than one platform.
Several of these methods are incompatible with certain systems. For example, DESQView/DOS was a multitasking DOS operating system that simply did not like such tricks. I was disappointed to learn that I could not protect my antivirus, Pasteur, by several such techniques in this environment. Indeed, antidebugging techniques are useful to protect the code of the antivirus program and DRM (Digital Rights Management) software from novice attackers.
22.214.171.124 Hooking INT 1 and INT 3 on x86
A common technique is to hook INT 1 and INT 3 interrupts. This is harmless during the normal execution of the virus code, but the debuggers will lose their context as a result. Typically the hook routine is set to an IRET, as it is supposed to be without a debugger installed. Some viruses, such as V2Px, use hooks to decrypt their body with INT 1 and INT 3.
126.96.36.199 Calculating in the Interrupt Vectors of INT 1 and INT 3
Instead of hooking INT 1 and INT 3, the armoring code can simply make essential calculations in the interrupt vector table (IVT). Attackers learned this trick from common copy protection wrappers5, similar to that of the powerful EltGuard.
The idea is to use the vectors of INT 1 and INT 3 continuously during the execution of the code, such as calculating a decryption key to decrypt the next encrypted layer in the code. By itself, this kind of code is easy to bypass using a more powerful debugger, such as Turbo Debugger in Virtual 86 mode. In that mode, the debugger uses a virtual machine, so the vectors of INT 1 and INT 3 can be modified without harming code execution.
188.8.131.52 Calculating Checksum of the Code to Detect Break Points
Because a debugger uses a 0xCC opcode byte (INT 3) inserted into the code flow as a break point, the code changes when the break point is placed. The debugger mimics the proper code by displaying correct code in a dump or disassembly wherever a break point was put. When the running code checks that location, however, the debugger no longer provides the original code, and thus the change can be detected easily. Several viruses checksum their running code to see changes in their code and halt the execution of the virus code if the debugger is detected in this way.
184.108.40.206 Checking the State of the Stack During Execution of Code
A very simple way to detect a debugger is to check the state of the stack during execution. The debugger saves a trace record to the stack during single-stepping, which includes CS:IP and the flags. So by simply comparing the states of the stack for a given value, single-stepping can be detected, as shown in Listing 6.2.
Listing 6.2. Detecting Single-Stepping Using Stack State
MOV BP,SP ; Let's pick the Stack Pointer PUSH AX ; Let's store any AX mark on the stack POP AX ; Pick the value from the stack CMP WORD PTR [BP-2], AX ; Compare against the stack JNE DEBUG ; Debugger detected!
The memory-resident monitor portion of some antivirus products also contains such routines to avoid tunneling techniques that use tracing. Needless to say, emulation-based tunneling will not be detectable with this method.
220.127.116.11 Using INT 1 or INT 3 to Execute Another Interrupt
This is similar to the previously mentioned hooking method, but instead of simply hooking INT 1 and INT 3 with do-nothing routines, the attacker can easily call another interrupt, such as the original INT 21h. Whenever the virus uses an INT 21h, it can now use INT 1 instead.
Consider the following example:
MOV AH, 40 ; write function INT 3 ; call original INT 21h via previous hook
18.104.22.168 Using INT 3 to Enter Kernel Mode on Windows 9x
Some computer viruses use a trick for the transition from user mode to kernel mode on Windows 9x by manipulating the entries in the IDT for a particular interrupt. Unfortunately, this is rather easy to achieve because the IDT is writeable for user-mode code on Windows 9x. This is, of course, a huge problem: Any interrupt will be associated with kernel-level access, so the moment that the virus code executes such a "hook," the virus code switches to kernel mode.
On Intel 386 and higher processors, the IDT holds the offsets and selectors for each interrupt with special flags. The trick is that the offset of the handler is stored as two words split on the "sides" of a QWORD (8 bytes) IDT entry. There are 256 interrupts. The first 32 entries are reserved in the table for processor exceptions; the rest are software interrupts. The address of the IDT is available via the SIDT instruction, which can read the content of the processor's IDTR register that holds the pointer to the IDT.
An antidebug technique can be developed by hooking INT 3 in the IDT and executing code via the handler. Unfortunately, however, this confuses debuggers such as SoftICEin case you wanted to use a break point to trace and analyze the virus code.
This is exactly how the W95/CIH virus jumps to kernel mode and is antidebugging at the same time. Consider Listing 6.3 as an example.
Listing 6.3. "My Precious!"Ring 0
PUSH EAX SIDT FWORD PTR [ESP-2] ; Get IDT Address POP EBX ; and move it to EBX ADD EBX, 1CH ; Points into INT 3's slot CLI ; (3*8+4 = 1Ch) MOV EBP, [EBX] ; Get high half of current MOV BP, [EBX-4] ; Rest of INT 3 handler LEA ESI, NEW_HANDLER ; Offset of new handler in ESI MOV [EBX-4], SI ; Set low half in IDT SHR ESI, 10H MOV [EBX+2], SI ; Set high half in IDT INT 3 ; Run the new handler in Ring0
Evidently, CIH cannot easily be traced in SoftICE; however, debug register break points can be used to step into the virus code without using an INT 3based break-point condition. Obviously, the preceding code violates security and generates an exception on systems such as Windows NT/2000/XP/2003 (which is handled by CIH using an exception handler). However, the virus can use the same trick if appropriate rights are granted.
22.214.171.124 Using INT 0 to Generate a Divide-by-Zero Exception
I mentioned earlier that the Virtual 86 mode of Turbo Debugger is an excellent way to deal with malicious armored code, but there are several counterattacks possible against Turbo Debugger. For instance, the attacker can hook INT 0 (the division-by-zero interrupt) to generate a division by zero and run the next instruction in the code flow as the "handler." This will confuse Turbo Debugger. Examples of such a virus are Velvet and W95/SST.951.
126.96.36.199 Using INT 3 to Generate an Exception
A variation of the preceding method is to generate an exception on 32-bit Windows systems with a handler to catch it. In this way, the virus code can recover easily, but an application-level debugger like Turbo Debugger will lose the context because the exception handling gets into the picture, which runs kernel-mode code.
Some viruses on Win32 systems use INT 3 to generate the exceptions. In such cases, the exception handler functions as a general API caller routine with function IDs and parameters passed in on the stack. W32/Infynca uses this method.
188.8.131.52 Using Win32 with IsDebuggerPresent() API
184.108.40.206 Detecting a Debugger via Registry Keys Look-up
220.127.116.11 Detecting a Debugger via Driver-List or Memory Scanning
There are, of course, other means for the attacker to find out that you are running a debugger. For instance, the device list can be queried for driver names and the list checked to see whether it contains the name of a debugger driver. An even simpler technique is to use memory scanning and simply detect the debugger's code in memory.
18.104.22.168 Decryption Using the SP, ESP (Stack Pointer)
The Cascade virus was the first known virus to use an SP (stack pointer) in its decryption routine. Because the stack is used by INT 1 during tracing, the decryption fails. Other viruses simply decrypt or build their code on the stack, such as the W95/SK virus. Not surprisingly, it is difficult to use debuggers against such code-armoring attacks.
22.214.171.124 Backward Decryption of the Virus Body
A standard decryption routine might be used by the virus, but the direction of the decryption loop can be turned around by the attacker. Typically you can trace the virus code until the first byte of the encrypted virus body is decrypted and put a break point to the location of the decrypted byte. This will not work, however, if the decryption routine comes backward because the break-point opcode (0xCC INT 3) will be overwritten by the decryption loop. Several viruses use backward decryption. For example, the W95/Marburg virus can generate both forward and backward decryption loops.
126.96.36.199 Prefetch-Queue Attacks (and When They Backfire)
Whale was dubbed "the mother of all viruses." In fact, it used so much armoring that the virus could hardly replicate because of its high complexity. The original variant of Whale was only able to infect early XT (8088) systems because its self-modification routine backfired. This is an interesting hardware dependency in the code, which was fixed in later variants of the virus. Many researchers, however, could not replicate the virus at the time because it was incompatible with 8086, AT, 386, and 486 systems. Thus they believed the virus was headed for complete extinction. Surprisingly, Whale "reincarnated" on Pentium CPU series, at least so in theory, because these processors flush the prefetch queue. Consider the following code of Whale to understand this discovery more thoroughly.
The Whale virus hooks INT 3 to force a handler execution. Apparently, the handler executes rubbish, most likely as the result of a bug in calculating the handler's address to a nearby RET instruction. The code is obfuscated, as you can see in Listing 6.4.
Listing 6.4. The Obfuscated Trick of Whale
pop ax ; POP 0xE9CF into AX register xor ax,020C ; decrypt 0xEBC3 in AX (0xc3 RET) cs: mov [trap],al ; try to overwrite INT 3 with RET add ax,020C ; fill the prefetch queue trap: INT 3 ; Will change to RET ; Only if the prefetch queue is ; already full (on 8088 only) or ; flushed (Pentium+) INT3: ; Points to Rubbish Invd ; Random Rubbish (2 bytes) ret
The virus writer expected that the INT 3 would be successfully replaced with a RET instruction to take the control flow to the proper place. His computer was an XT (8088), which has a 4-byte processor prefetch queue size (later replaced with 6 bytes on 8086). This is why the preceding code worked on his computer.
Other viruses use prefetch queue attacks to mislead debuggers and emulators. In single-stepping (or emulation not supporting the prefetch queue), such self-modification always takes place. Therefore the attacker can detect tracing easily by checking that the modified code is running instead of the instructions in the prefetch queue.
188.8.131.52 Disabling the Keyboard
When you are tracing code, you are using the keyboard. The attacker knows that and disables the keyboard to prevent you from continuing to trace, perhaps by reconfiguring the content of the I/O PORT 21h or PORT 61h. Such ports might be different according to how modern operating systems map them.
IN AL,21 OR AL,02 OUT 21,AL
Other methods hook INT 9 (keyboard handler) early on. This prevents some debuggers from popping up for some time after the code is executed. A common technique to break into armored code is to run the virus and break in on the fly using the debugger's hot key. If the hot key is disabled, the debugger has no chance to break into the code.
Another interesting attack is performed by the Cryptor virus, which stores its encryption keys in the keyboard buffer, which is destroyed by a debugger6.
184.108.40.206 Using Exception Handlers
Many Win32 viruses use exception handlers to challenge you while debugging them. The W32/Cabanas7 was the first to use this trick. Pay close attention to the FS: access in malicious code, which is used as a reference to get and set exception handler records. Also pay attention to anything loading from FS: because that can be used to access the FS: data without using the FS: override. Both of these data are in the TIB (Thread Information Block).
220.127.116.11 Clearing the Content of Debug (DRn) Registers
Some viruses, such as CIH, use the debug registers for the purpose of "Are you there?" calls. Running SoftICE on a CIH-infected system can easily confuse the virus, which might install itself a second time in such a situation. But it is also true that debug registers can be cleared to confuse even advanced debugging tricks, such as those that use memory break points.
18.104.22.168 Checking the Content of Video Memory
Although I have not seen this dangerous trick in too many viruses, a few viruses do use it. The idea is based on the hooking of the timer interrupt (INT 8 or INT 1Ch, which is triggered by it). The routine continuously checks the contents of the video memory for a search string. When the virus is running and you attempt to locate the virus on the system by dumping the memory in a debugger, the search string is detected in the video memory, and the virus can notice this and attack the system, possibly by trashing the disk's content.
22.214.171.124 Checking the Content of the TIB (Thread Information Block)
Some Windows viruses check whether the TIB has a nonzero value stored at FS:[20h]. Application-level debuggers can be detected with this method. This can easily be ignored during your analysis, however. The trick is to know what to look for.
126.96.36.199 Using the CreateFile() API (the "Billy Belcebu" Method)
Kernel-mode drivers usually communicate with a Win32 user-mode component. To gain a handle to the driver, the Win32 applications can simply use the CreateFile() API with the name of a device. For example, SoftICE's name is \\.\SICE on Windows 9x and \\.\NTICE on Windows NT.
This technique is easily noticed during analysis, but many malicious programs simply refuse to work when they detect SoftICE. This can make test replications more difficult if you happened to load the debugger by default on the "dirty" PC that you use for analysis.
188.8.131.52 Using Hamming Code to Attack Break Points
Some viruses such as variants in the Yankee_Doodle family, written by the Bulgarian virus writer "T.P," used error correction with Hamming code to self-heal their code. Hamming code is an invention of Richard Hamming from the 1940s, which allows programs not only to detect errors in transmitted data, but also to correct them (with some limitations).
In fact, Yankee_Doodle had a few bugs and could really get corrupted when infecting EXE files that had ES:SP (program stack) header fields pointing into the virus body. As a result, when incorrectly infected samples started to run, the initial PUSH instructions in the virus code could easily corrupt the main part of the virus body because the program stack (ES:SP) accidentally pointed into the virus. The error checking code in the virus could potentially detect these corruptions and repair them to some extentor at least stop the execution of the incorrect samples. However, the real intention of the author of the virus was to eliminate break-points inserted by debuggers8. Consequently, when a debugger inserted a break-point 0xCC byte (INT 3) into the virus code (and when lucky enough), the virus could find out about this and fix itself, removing the break-point. The virus code fixes up to 16 break-points, or corrupted bytes in itself using the hamming data.
184.108.40.206 Obfuscated File Formats and Entry Points
Many debuggers fail if the file format containing the virus code is obfuscated or trickily constructed. For example, it is completely legal to overlap some areas of a Portable Executable file via manipulations of internal file structures. Such manipulations, however, might confuse the debugger, which is prepared to work with "standard" file structures. Not surprisingly, debuggers often fail to pick the correct entry-point of the application in case the entry point is obscure. For example, as discussed in Chapter 3, "Malicious Code Environments," the Thread Local Storage (TLS) of Portable Executable files offers an alternate entry point, on NT-based systems, before the main entry point is executed. Some debuggers such as SoftICE will break only on the main entry point, and thus, the virus that loads via the TLS is already running by the time the debugger offers a prompt to a virus analyst. Run-time packers expose similar problems to debuggers because they often cause unexpected file structure changes.
In 1998, Windows virus development was in a relatively early stage9. This is why a wide variety of different infection methods were introduced, making it possible to consider heuristic analysis against 32-bit Windows viruses. Heuristic analysis can detect unknown viruses and closely related variants of existing viruses using static and dynamic methods. Static heuristics rely on file format and common code fragment analysis. Dynamic heuristics use code emulation to mimic the processor and the operating system environment and detect suspicious operations as the code is "running" in the virtual machine of the scanner. Virus writers developed antiheuristic and antiemulation techniques to fight back against heuristic analyzers.
Careful analysis of different infection types led to the development of first-generation Win32 heuristic detectors, which used static heuristics. Static heuristics are capable of pinpointing suspicious portable executable (PE) file structures and therefore can catch first-generation 32-bit Windows viruses with a very high detection rate. The idea was based on DOS virus detection that uses similar methods.
By the end of 1999, many new virus replication methods had already been developed. Moreover, a major part of 32-bit Windows viruses used some sort of encryption, polymorphic, or metamorphic technique. Encrypted viruses are more difficult to detectand very dangerous when we look ahead to the future of scanners. This is because scanners can become very slow in attempting decryption for too many clean files on your system.
First-generation heuristics were extremely successful against PE file viruses. Even virus writers were surprised by their success, but, as always, we did not have to wait long before they came up with attacks against heuristic detectors. Several antiheuristic infection types have been introduced during the last couple of years.
Some virus writers also introduced antiemulation techniques against the strongest component of the antivirus product: the emulator.
220.127.116.11 Attacks Against First-Generation Win32 Heuristics
This section describes attacks against first-generation Win32 heuristics recently introduced by virus writers. Examination of these new methods enables us to enhance heuristic scanners to deal with these new virus-writing tricks. (The heuristics detection techniques are the subject of Chapter 11, "Antivirus Defense Techniques".)
18.104.22.168.1 New PE File-Infection Techniques
Many PE file viruses add a new section or append to the last section of PE files. Even today, many viruses can be detected with the simplest possible PE file heuristic. This heuristic has the benefit that it can easily be performed by endusers. The heuristic checks to see if the entry point of the PE file points into the last section of the application.
This heuristic can cause false positives, but additional checks can confirm whether the file is likely compressed. Compressed PE files cause the most significant false alarm rate, because the actual extractor code is very often patched into the last section of the application.
A large number of commercial PE file on-the-fly packers were introduced for Win32 platforms, such as UPX, Neolite, Petite, Shrink32, ASPack, and so on. Such packers are commonly used by virus writers and Trojan code creators to hide their creations. Many people can perform heuristic examination, and a file can be extremely suspicious if it contains certain words or phrases. Consequently, virus writers use packers to hide any suspicious material from the heuristics of scanners and humans. Unfortunately, certain worm or Trojan code can be changed with packers. The W32/ExploreZip worm was the first family of computer worms that were packed. The worm had several variants that were created by using various 32-bit PE packers, such as UPX.
Modern antivirus software must deal with new packers. It should not matter to the scanner if a Word document is placed in a ZIP archive that contains another embedded document that also has an embedded executable object packed with a 32-bit packer. The scanner needs to get the last possible decomposed object and perform its scanning task on the decompressed content.
Obviously, the decomposer needs to have a recognizer module that can easily sort out the packed object. This means that the false positive rate of 32-bit Windows virus heuristic will decrease significantly, while the detection rate of regular scanning will be much better. Virus writers have realized that infecting the last section of PE files is a far too obvious method and have come up with new infection techniques to avoid heuristic detection that raised the alarm on such objects.
22.214.171.124.2 More Than One Virus Section
Several Win32 viruses append not only to one section but to many sections at a time. For instance, the W32/Resure virus appends four sections (.text, .rdata, .data, and .reloc) to each PE host. Because the entry point of the application will be changed to point into the new .text section of the virus, the heuristic is fooled. This is because the last section does not receive any control.
Furthermore, the virus will maintain a very high-compatibility factor with most Win32 systems. This is because the virus is written in C instead of Assembly. The file can be suspect heuristicallya human might easily notice the same four sections added to the section tables of a few PE files, but a heuristic scanner has more difficulty. This is because some linkers create suspicious structures in some circumstances, so a heuristic developed for this virus to detect multiple section names would have to be considered a low-value heuristic.
126.96.36.199.3 Prepending Viruses that Encrypt the Host File Header
First-generation heuristics can look for a PE file header at the purported end of the PE file, according to the prepended virus file header. Again, a human can perform such a heuristic easily, even if the end of the file is somewhat encrypted (as with W32/HLLP.Cramb).
188.8.131.52.4 First-Section Infection of Slack Area
Some viruses, such as W95/Invir10, do not modify the entry point of the application to point to the last section. Rather, the virus overwrites the slack area of the first section and jumps from there to the start of virus code that is placed somewhere else, such as the last section.
As we will see later, this technique is often combined with antiemulation tricks. This is because second-generation, dynamic heuristics use code emulation to see whether the actual application jumps from one section to another.
184.108.40.206.5 First-Section Infection by Shifting the File Sections
The first known virus utilizing this method was the W32/IKX virus, by the virus writer Murkry. This virus is also known as "Mole," referring to a special cavity infection style used by the virus. The virus body of IKX is longer than would fit into a single section's slack area, so the virus makes a "hole" for itself by shifting each section of the PE host after its body. The virus appends itself to the code section and adjusts the raw data offset of each subsequent section by 512 bytes. Clearly, dynamic heuristics are necessary in such a casethat is, heuristics performed on the code instead of on the structure of the application.
220.127.116.11.6 First-Section Infection with Packing
One of the most interesting antiheuristic viruses was W95/Aldabera. W95/Aldabera infects the last section of PE files. The virus cannot infect small files; rather, it looks for larger PE files and tries to pack their code section.
If the code section of the application can be packed such that the actual packed code and the virus code fit in the very same place, the virus will shrink the code section and place the results together in the code section. After that, it will unpack the code section on execution, similar to an on-the-fly packer.
Obviously, W95/Aldabera causes problems for heuristic scanners and regular scanners alike. This is because the virus is rather big and decrypts/unpacks itself and the code section very slowly during emulation, taking a lot of emulator iterations. The virus also uses the RDA (random decryption algorithm) decryption technique11. RDA viruses do not need to store decryption keys to decrypt their code. Instead, the virus decryptor generates keys and decryption methods applying brute-force decryption. (This technique is further discussed in Chapter 7.)
Moreover, the virus accesses and modifies too much virtual memory. More than 64KB of "dirty pages" must be kept by the emulator to decrypt the code properly, and this exceeds the maximum of some of the early 32-bit code emulatorsthose that still try to support the XT platform (oh, well).
18.104.22.168.7 Entry-Point Obscuring Techniques
Various 32-bit Windows viruses apply a very effective antiheuristic infection method known as the entry-point obscuring or inserting technique. Many old DOS viruses used this technique, several of which were introduced in Chapter 4, "Classification of Infection Strategies."
As expected, virus writers implemented inserting polymorphic viruses to evade detection by heuristic scanners. To date, the entry-point obscuring method is the most advanced technique of virus writers.
I strongly suspect that future entry-point obscuring polymorphic viruses will use internal mass-mailing capabilities. A further indication of this tendency is the W32/Perenast virus family, which also spreads quickly on networks. W32/Perenast is a highly obfuscated virus that is very difficult to detect.
22.214.171.124.8 Selecting a Random Entry Point in the Code Section
The W95/Padania was one of the first 32-bit Windows viruses that occasionally did not modify the entry point of the application (in the PE header) to point to the start of virus code. Sometimes the virus searches the relocation section of the application to find a safe position to perform code replacement in the code section of PE files. The virus does change certain CALL or PUSH instructions to a JMP instruction that points to the start of the virus. The virus selects this position near to the original entry point of the application; therefore, the virus code will likely get control when the original application is executedthough there is no guarantee it will.
Whenever the virus is not encrypted, the virus is detected easily by those scanning engines that do not just apply an entry pointbased scanning engine model, but that also scan the top and bottom of each PE file for virus strings. Scanners that do not support this kind of scanning or some sort of algorithmic scanning (discussed in Chapter 11) will be more difficult to update to detect such viruses.
Moreover, heuristic scanning needs to deal with the problem of PE emulation in a different manner. Emulators typically stop when encountering an "unknown" API call. Thus the actual position where the jump (JMP) is made to the virus entry point will not be reached. The issue here is that different APIs have a variable number of parameters, and the APIs are responsible for their own stack cleaning. It is simply impossible to know all Win32 APIs and the number of parameters passed in to the function, which is the minimum needed to follow an accurate program code chain. Someone could implement the standard APIs, but what about the thousands of other DLLs?
Even dynamic heuristic file scanning is not effective against such creations. Some files might be detected heuristically when the jump to the virus start is placed near the actual entry point of the PE application.
126.96.36.199.9 Recycling Compiler Alignment Areas
W95/Orez fragments its decryptor into these "islands" of the PE files. W95/SK uses a relatively short but efficiently permuted decryptor that replaces such an area in the PE files. Control is given from a randomly selected place via a CALL.
188.8.131.52.10 Viruses That Do Not Change Any Section to Writeable
It is a very useful heuristic to check whether a code section is also marked writeable. In particular, a writeable entry-point section is suspicious. Some viruses do not need to mark any section to be writeable or do not necessarily change any section to writeable. For instance, the W95/SK virus decrypts itself on the stack and not at the place of the actual virus body in the last section. Therefore, SK does not need to set the writeable flag on itself. Most viruses that decrypt on the stack will not need to rely on the writeable flag at all, so heuristics scanners have less chance to detect such viruses.
184.108.40.206.11 Accurate File Infection
First-generation heuristics can check file structures for accuracy. Several first- generation viruses modify the file structure in a way that corrupts the file. When someone tries to execute an infected file on a Windows NT/2000 system, the system loader will refuse to execute such an application.
Unfortunately, the majority of 32-bit Windows viruses are already classified in the Win32 class, meaning that the infection strategy is properly done on more than one major Win32 system. Therefore, this heuristic detection is declining in usefulness against new viruses.
220.127.116.11.12 Checksum Recalculation
A possible heuristic is the recalculation of the system DLL's checksum, such as KERNEL32.DLL. When the KERNEL32.DLL's checksum is incorrect, Windows 95 still loads the DLL for execution (unless major corruption is identified, such as a too-short image). Win32 viruses that access KERNEL32.DLL will try to recalculate the checksum of the DLL by using Win32 APIs. Others implement the checksum calculation.
The actual checksum API does the following12:
W32/Kriz was the first virus that implemented the DLL checksum recalculation algorithm without calling any system APIs. It is more difficult to see an application inconsistency when the checksum is properly calculated. Other viruses also recalculate the checksum for PE files whenever the checksum is a nonzero value in the PE header. Therefore, this first-generation heuristic will fail to identify modern Win32 viruses.
18.104.22.168.13 Renaming Existing Sections
Some of the first-generation heuristic scanners try to check whether a known noncode section gets control during emulation. Under normal circumstances, several sections (such as ".reloc," ".data," and so on) do not contain any code (unless an on-the-fly packer patches itself to an existing section similarly to a virus). Some viruses change the section name to a random string. For instance, the W95/SK virus renames the ".reloc" section name to a five-character-long random name with a one-in-eight probability. As a result, heuristic scanners cannot pinpoint the virus easily based on the section name and its characteristics.
22.214.171.124.14 Avoiding Header Infection
First-generation heuristics can check whether the entry point is located directly after the section headers. W95/Murkry, W95/CIH, and a large number of W95/SillyWR variants use this header-infection strategy and easily can be detected heuristically or generically. Some of the modern 32-bit Windows viruses avoid infecting the PE header area for this reason.
126.96.36.199.15 Avoiding Imports by Ordinal
A few early Windows 95 and Win32 viruses patched the host program's import table to include extra imports by ordinal. This is because many hosts programs do not import GetProcAddress() and GetModuleHandle() APIs. Because imports can be resolved using ordinal numbers (with some limitations) instead of function names, a few viruses insert ordinal-based imports into the executables import directory to resolve their need to call Win32 APIs. Most of the new Win32 viruses do not use such logic, so this heuristic became useless against them.
188.8.131.52.16 No CALL-to-POP Trick
Most 32-bit Windows appending viruses use the CALL-to-POP trick to locate the start address for their data relocations. Because first-generation heuristics could easily look for that, virus writers tried to implement new ways to get the base address of the code.
Normally, the trick looks like this:
0040601A E800000000 call 0040601F ; pick current offset to ESI 0040601F 5E pop esi ; ESI=0040601F
In normal circumstances, code similar to the preceding is not generated by compilers, so the use of E800000000 opcode is suspicious.
Listing 6.5 shows the trick implemented in the W32/Kriz virus.
Listing 6.5. Hiding CALL-to-POP trick
Offset Opcode Instruction 0040601A E807000000 call 00406026h 0040601F 34F4 xor al,F4 00406021 F0A4 lock movsb 00406023 288C085EB934AC sub [eax+ecx-53CB46A2],cl 0040602A 0200. add al,[eax]
184.108.40.206.17 Fixing the Code Size in the Header
A possible heuristic logic is the recalculation of the code section sizes that are actually used. The PE header holds the size of all code sections. Most linkers will calculate a proper code size for the header according to the raw data size of the actual code sections in the file. Some viruses set their own section to be executable but forget about the recalculation in the PE header. Static heuristics can take advantage of this.
220.127.116.11.18 No API String Usage
A very effective antiheuristic/antidisassembly trick appears in various modern viruses. An example is the W32/Dengue virus, which uses no API strings to access particular APIs from the Win32 set. Normally, an API import happens by using the name of the API, such as FindFirstFileA(), OpenFile(), ReadFile(), WriteFile(), and so on, used by many first-generation viruses. A set of suspicious API strings will appear in nonencrypted Win32 viruses. For instance, if we use the strings command to see the strings in the virus body of the W32/IKX virus, we will get something like the following:
Murkry\IKX EL32 CreateFileA *.EXE CreateFileMappingA MapViewOfFile CloseHandle FindFirstFileA FindNextFileA FindClose UnmapViewOfFile SetEndOfFile
The name of infamous virus writer Murkry appears on the list. (Actually, the names of certain virus writers or certain vulgar words are useful for heuristic detection.) Moreover, we see the *.EXE string, as well as almost a dozen APIs that search for files and make file modifications. This can make the disassembly of the virus much easier and is potentially useful for heuristic scanning.
Modern viruses, instead, use a checksum list of the actual strings. The checksums are recalculated via the export address table of certain DLLs, such as KERNEL32.DLL, and the address of the API is found. It is more difficult to understand a virus that uses such logic because the virus researcher needs to identify the APIs used. By examining the checksum algorithm of the actual virus, the virus researcher can write a program that will list the API strings. The virus researcher, Eugene Kaspersky, employs this technique to identify the used API names relatively quickly. Moreover, Kaspersky creates a reference for IDA that can be used directly by the disassembler13.
6.2.9. Antiemulation Techniques
First-generation heuristics used to be virus-infection-specific. The heuristics checked for possible infection methods used on a particular application. Virus writers realized that some scanners use emulation for detecting 32-bit Windows viruses and started to create attacks specifically made against the strongest component of the scanner: the emulator. In this section, I explain some of the antiemulation techniques that have been used by virus writers over the years.
18.104.22.168 Using the Coprocessor (FPU) Instructions
Some virus writers realized the power of emulators and looked for weaknesses and quickly realized that coprocessor emulation was not implemented. In fact, most emulators skipped coprocessor instructions until recently, whereas most processors that are currently used support coprocessor instructions by default.
Viruses do not limit themselves any longer when they rely on using coprocessor instructions. In the early years of virus development (back in the '80s), virus writers could not utilize coprocessor techniques much because early processors did not include a coprocessor unit. With the introduction of 486 processors, Intel placed the unit in the actual processor. (Even 486 SX systems were built with an FPU. However, their FPU got disabled for two possible reasons: to be able to sell 486DX processors with defective FPUs as SX and to reduce production costs to market "different" processors with lower prices.) Nowadays, the majority of systems have the capability to execute coprocessor code.
Some of the nonpolymorphic but encrypted viruses already used coprocessor instructions to decrypt themselves. The Prizzy polymorphic engine (PPE) was first to use coprocessor instructions. The PPE is capable of generating 43 different coprocessor instructions for the use of its polymorphic decryptor.
22.214.171.124 Using MMX Instruction
Other virus writers went so far as to implement a virus that used the MMX (multimedia extension) instructions of Pentium processors. The first of this kind was W95/Prizzy as well. The virus was too buggy to work, and no antivirus researchers managed to replicate the distributed sample of W95/Prizzy, regardless of the extra care taken to replicate it. The Prizzy polymorphic engine was capable of generating as many as 46 different MMX instructions.
Other more successful viruses, such as W32/Legacy14 and W32/Thorin, were considerably better written than their predecessor.
Most MMX-capable polymorphic engines do not assume that MMX is available in the processor. Some viruses try to create two polymorphic decryption loops and check the existence of MMX support by using the CPUID instruction. The virus might attempt to replicate on those Pentium systems that do not have MMX support, but it will fail unless it somehow checks the processor type. The CPUID instruction is not available in older processors, whereas a 386 processor is enough to execute a PE image. Not surprisingly, some of the MMX viruses fail to determine MMX support properly and execute MMX instructions on non-MMX systems. As a result, they fail and generate a processor exception.
Virus writers might remove the CPUID check from polymorphic engines of the future because the number of Pentium systems that support MMX is growing very quickly.
126.96.36.199 Using Structured Exception Handling
The use of structured exception handling in 32-bit Windows viruses is as old as the first real Win32 virus. W32/Cabanas used a structured exception handler (SEH) for antidebug purposes and to protect itself from potential crash situations. Many 32-bit Windows viruses set up an exception handler at their start and automatically return to the host program's entry point in case of a bug condition.
Viruses often set up an exception handler to create a trap for the emulators used in antivirus products. Such a trick was introduced in the W95/Champ.5447.B virus. In addition, some polymorphic engines generate random code as part of their decryptors. After setting up an exception handler, the virus executes the garbage block, forcing an exception to occur. During actual code execution, the virus indirectly executes its own handler, which gives control to another part of the polymorphic decryptor. When the AV product's emulator cannot handle the exception, the actual virus code will not be reached during code emulation.
Unfortunately, solving this problem is not a matter of implementing an exception handler recognizer module in AV products. Although some exception conditions can be determined easily, the emulated environment (what we can build into an AV product) cannot handle all exceptions perfectly. In a real-world situation, it is impossible to be 100% sure that a particular instruction will cause an exception to be generated15.
188.8.131.52 Executing Random Virus Code: Is This a Virus Today?
One of the first viruses to use random execution logic was W95/Invir. This virus either transfers control to the host program (original entry point) or gives control to the virus body. In other words, executing an infected program does not guarantee that the virus will be loaded.
W95/Invir uses the FS:[0Ch] value as the seed of randomness. On Win32 systems under Intel machines, the data block at FS:0 is known as the thread information block (TIB). The WORD value FS:[0Ch] is called W16TDB and only has meaning under Windows 9x. Windows NT specifies this value as 0.
When the value is 0, the virus will execute the host program. How elegantthe virus will not try to load itself under Windows NT because it is not compatible with it. W16TDB is basically random under Windows 95. The TIB is directly accessible without calling any particular API, which is one of the simplest ways to get a random number.
The basic scheme of the first tricky code block is the following:
MOV reg, FS:[0C] AND reg, 8 ADD reg, jumptable JMP [reg]
The problem for emulators is obvious. Without having the proper value at FS:[0Ch], the virus decryptor will not be reached at all. The issue is a matter of complexity, and the detection of such viruses could be extremely difficult even with virus-specific detection methodsheuristics can easily fail to detect such a tricky virus.
184.108.40.206 Using Undocumented CPU Instructions
Although there are not too many undocumented Intel processor instructions, there are a few. For example, W95/Vulcano uses the undocumented SALC instruction in its polymorphic decryptor as garbage to stop the processor emulators of certain antivirus engines that cannot handle it. Intel claims that SALC can be emulated as a NOP (a no-operation instruction). Hardly a NOP, this instruction sets AL=FFh if the carry flag is set (CF=1), or resets AL to zero if the carry flag is clear (CF=0)16. Some emulators' implementation of these instructions might differ subtly from the processor's so that a virus could detect when it is executing under emulation.
Obviously, emulators should not stop when encountering unknown instructions. However, if the size of the actual opcode is not perfectly calculated, the dynamic heuristic scanner misses the virus.
220.127.116.11 Using Brute-Force Decryption of Virus Code
Some viruses, use a brute-force algorithm also known as RDA (Random Decryption Algorithm) to decrypt themselves. This method was used by DOS viruses, such as Spanska virus variants and some older Russian viruses, such as the RDA family. All of these old tricks of virus writers have been recycled in Win32 viruses. Brute-force decryption does not use fixed keys but tries to determine the actual method and the proper keys by trial and error. This logic is relatively fast in the case of real-time execution, but it generates very long loops causing zillions of emulation iterations, ensuring that the actual virus body will not be reached easily. The decryption itself can be suspicious activity, and modern dynamic heuristics can make good use of these tricks.
RDA is a very effective attack against code emulators built into antivirus software.
18.104.22.168 Using Multithreaded Virus Functionality
Many viruses tried to use threads to give emulators a hard time. Emulators were first used to emulate DOS applications. DOS only supported single-threaded execution, a much simpler model for emulators than the multithreaded model. Emulation of multithreaded Windows applications is challenging because the synchronization of various threads is crucial but rather difficult. As emulators become stronger, virus writers will certainly try to use this antiemulation trick. Recently, there are some sophisticated Win32 emulators, such as the one developed in the Norwegian Norman Antivirus, which can deal with multithreaded Win32 code effectively.
22.214.171.124 Using Interrupts in Polymorphic Decryptors
Just like some of the old DOS viruses, a few Windows 95 viruses used random INT instruction insertion in their polymorphic decryptors. Some old generic decryptors might stop emulating the program when the first interrupt call is reached. This is because most encrypted viruses do not use any interrupts before they are completely decrypted. The very same attack was built into the W95/Darkmil virus, which uses INT calls such as INT 2Bh, INT 08, INT 72h, and so on.
126.96.36.199 Using an API to Transfer Control to the Virus Code
The W95/Kala.7620 virus was one of the first to use an API to transfer control to its decryptor. The virus writes a short code segment into the code section of the host application. This short code makes a call to the CreateThread() API via the import address table of the host program. The actual thread start address is specified in the last section of the host program that is created by the virus. Obviously, the virus code will not be reached with emulation if the CreateThread() API is not emulated by the scanner. Modern antivirus software needs to address this problem by adding support for certain APIs whenever necessary. It is crucial to have API emulation as part of the arsenal of the AV software.
During the last couple of years, virus researchers expected to see new creations that use random API calls in their polymorphic decryptors, similarly to the technique described in the previous section. The W95/Drill virus was the first to implement it in practice.
188.8.131.52 Long Loops
Computer viruses also attempt to use long loops to defeat generic decryptors that use emulation. The loop is often polymorphic, do-nothing code, but in some cases an overly long loop takes place to calculate a decryption key that is passed to the polymorphic decryptor. An example of such a virus is W32/Gobi. Gobi is an EPO virus. It uses long key-generator loops that run as many as 40 million iterations before passing a decryption key to the polymorphic decryptor. Thus the emulation of the virus code becomes extremely slow.
6.2.10. Antigoat Viruses
Computer virus researchers typically build goat files to better understand the infection strategy of a particular virus (see Chapter 15, "Malicious Code Analysis Techniques" for more details). The infection of goat files helps virus analysis because it visually separates known file content from the virus body. The goat files typically contain do-nothing instructions (such as NOPs) and return to the operating system without any special functionality. Goat files are created with various file formats and internal structures for different kinds of viruses. For example, a set of goat files is created that are 4, 8, 16, or 32KB in size.
Antigoat viruses use heuristic rules to detect possible goat files. For example, a virus might not infect a file if it is too small or if it contains a large number of do-nothing instructions, or if the filename contains numbers. Obviously, antigoat viruses are a little bit more time-consuming to analyze because the special taste of the virus needs to be fulfilled first.