Chapter 3. Malicious Code Environments
One of the most important steps toward understanding computer viruses is learning about the particular execution environments in which they operate. In theory, for any given sequence of symbols we could define an environment in which that sequence could replicate itself. In practice, we need to be able to find the environment in which the sequence of symbols operates and prove that it uses code explicitly to make copies of itself and does so recursively1.
A successful penetration of the system by viral code occurs only if the various dependencies of malicious code match a potential environment. Figure 3.1 is an imperfect illustration of common environments for malicious code. A perfect diagram like this is difficult to draw in 2D form.
Figure 3.1. Common environments of malicious code.
The figure shows that Microsoft Office itself creates a homogeneous environment for malicious code across Mac and the PC. However, not all macro viruses2 that can multiply on the PC will be able to multiply on the Mac because of further dependencies. Each layer might create new dependencies (such as vulnerabilities) for malicious code. It is also interesting to see how possible developments of .NET on further operating systems, such as Linux, might change these dependency points and allow computer viruses to jump across operating systems easily. Imagine that each ring in Figure 3.1 has tiny penetration holes in it. When the holes on all the rings match the viral code and all the dependencies are resolved, the viral code successfully infects the system.
Figure 3.1 suggests how difficult virus research has become over the years. With many platforms already invaded by viruses, the fight against malicious code gets more and more difficult.
Please note that I am not suggesting that viruses would need to exploit systems. An exploitable vulnerability is just one possible dependency out of many examples.
Automation of malicious code analysis has also become increasingly more difficult because of diverse environment dependency issues. It is not uncommon to spend many hours with a virus in a lab environment, attempting natural replication, but without success, while the virus is being reported from hundreds or perhaps even thousands of systems around the world.
Another set of viruses could be so unsuccessful that a researcher could never manage to replicate them. Steve White of IBM Research once said that he could give a copy of the Whale virus ("the mother of all viruses") to everybody in the audience, and it would still not replicate3. However, it turns out that Whale has an interesting dependency on early 8088 architectures4 on which it works perfectly. Even more interestingly, this dependency disappears on Pentium and above processors5. Thus Whale, "the dinosaur heading for extinction,"6 is able to return, theoretically, in a Jurassic Parklike fashion.
One of the greatest challenges facing virus researchers is the need to be able to recognize the types, formats, and sequences of code and to find its environment. A researcher can only analyze the code according to the rules of its en-vironment and prove that the sequence of code is malicious in that environment.
Over the years, viruses have appeared on many platforms, including Apple II, C64, Atari ST, Amiga, PC, and Macintosh, as well as mainframe systems and handheld systems such as the PalmPilot7, Symbian phones, and the Pocket PC. However, the largest set of computer viruses exists on the IBM PC and its clones.
In this chapter, I will discuss the most important dependency factors that computer viruses rely on to replicate. I will also demonstrate how computer viruses unexpectedly evolve, devolve, and mutate, caused by the interaction of virus code with its environment.