11.11. Integrity Checking
The wide variety of scanning techniques clearly shows how difficult computer virus detection based on the identification of known viruses can be. Thus the problem of virus detection appears to be better controlled using more generic methods that detect and prevent changes to the file and other executable objects based on their integrity.
Frederick Cohen demonstrated that integrity checking37 of computer files is the best generic method. For example, on-demand integrity checkers can calculate the checksums of each file using some known algorithm, such as MD4, MD538, or a simple CRC32. Indeed, even simple CRC algorithms work effectively by changing the generator polynomial39.
On-demand integrity checkers use a checksum database that is created on the protected system or elsewhere, such as an online location. This database is used each time the integrity checker is run to see whether any object is new on the system or whether any objects have changed checksums. The detection of new and changed objects is clearly the easiest way to find out about possible virus infections and other system compromises. There are, however, a number of disadvantages of this method, which are discussed in the sections that follow.
11.11.1. False Positives
In general, integrity checkers produce too many false positives. For example, many applications change their own code. On creating my first integrity checker, I was surprised to learn that applications such as Turbo Pascal changed their own code. Programs typically change their code to store configuration information together with the executable. This is clearly a bad idea from the viewpoint of integrity. Nonetheless, it is used by many applications.
Another set of false positives appears because users prefer to use run-time packers. Tools such as PKLITE, LZEXE, UPX, ASPACK, or Petite (to name just a few) are used to pack applications on the disk. Users can decide to compress an application at any time. Thus an integrity checker will sound the alarm when the packed program is used because it no longer has the same checksum as the unpacked one. Typically, a packed file is considerably smaller than its original. Thus an integrity checker might be able to reduce the false positives by storing extra information about the file, such as the file size. When the file size of a changed file is smaller than the original, the integrity checker might reduce false positives by not displaying a warning. However, this will allow file-compressing viruses to infect the system successfully.
In addition, a typical source of false positives is caused by updates. Many security updates (including Windows Update) are often obscure, thus you do not get a good idea which files will be changed by the updates. As a result, you do not know easily when you should accept a changed file on a system and when you should not. This is exactly why patch management and integrity checking are likely to be merged in security solutions in the future.
11.11.2. Clean Initial State
Integrity checkers need to assume that the system has a clean initial state. However, this is not necessarily the case. Unfortunately, many users will resort to an antivirus program only after they suspect that their system is infected. If the system is already infected, the checksum of the virus might be taken, making the integrity checking ineffective. The development of integrity checkers resulted in a large set of counterattacks. For example, stealth viruses are difficult for integrity checkers to handle. Another problem appears if a newly infected application is trusted by the user for execution. After the virus has executed, it can delete the checksum database of the integrity checker. As a result, the integrity checker is either completely removed or needs to be executed again from scratch to create a new database. Thus its effectiveness is reduced. Even more importantly, integrity checking systems must trust systems that are not trustworthy by their design because of the lack of security built into the hardware40.
Integrity checkers are typically slow. Executable objects can be large and require a lot of I/O to recalculate. This slowdown might be disturbing to the user. For that very reason, integrity checkers are typically optimized to take a checksum of the areas of file objects that are likely to change with an infection. For example, the front (header area), the entry point, and the file size are stored, and sometimes the attributes and last access information fields. Such tricky integrity checking can enhance the performance of the integrity checker, but at the same time it reduces effectiveness because random overwriting viruses or entry point- obscuring viruses (discussed in Chapter 4) will not always change the file at the expected places.
11.11.4. Special Objects
Integrity checkers need to know about special objects such as Microsoft Word documents with macros in them41. It is not good enough to report a change to a document every time the user edits it. There would be so many reports that the user would be annoyed and would probably turn off the protection entirelyand there is nothing less secure than a system with an unused protection. The actual objects, such as documents that can store malicious macros, need to be parsed; instead of the entire document, the stored macros inside the document must be checked. This means that integrity checking, just like antivirus software, is affected by unknown file formats, so the approach becomes less generic than it first appears.
11.11.5. Necessity of Changed Objects
11.11.6. Possible Solutions
Some of the integrity checkers' most common problems can be reduced42, for example, by using them in combination with other protection solutions, such as antivirus software. The antivirus can search for known viruses, and the integrity checking can raise the bar of protection. Such solutions can be truly adequate and are expected to be more and more popular in the future as the number of computer viruses continues to grow. In fact, as the number of entry pointobscuring viruses increases, so will the I/O ratio of the antivirus software. At one point, the I/O ratio of the antivirus will be similar to that of an integrity checker that calculates a complete checksum of the entire file. Thus it will cost at least as much to calculate the checksum of a file as to scan it against all known viruses. At that point, integrity checking becomes more acceptable in terms of performance. This means that complete file integrity checking can be used to speed up scanning of files, by only checking changed files for possible infections.
It is also expected that in the future more applications will be released in signed form. Thus it is also likely that the number of self-modifying applications will continue to decrease.
Integrity checking methods can be further enhanced when implemented as an on-access solution. Frederick Cohen called such systems "integrity shells."43 As discussed, a typical PC environment cannot be trusted by software installed on it because the hardware does not implement a secure booting system. In the future, the PC architecture will contain a security chip that stores the user's secret key. This will make it possible to load the operating system in such a way that the individual integrity of each component can be checked and trusted. Furthermore, such systems will offer enhanced memory protection, making it more difficult for malicious code to interfere with the protection itself. The administrator of the system will be able to create policies to trust applications based on several factors, such as whether or not the actual application is signed. The result is a better integrity system that can significantly reduce the impact of computer viruses.
The only drawback is that users expect to install new software on their systems. It is impossible to achieve perfect integrity of the system when users are tricked into executing almost anything. Some of these problems can be addressed using extra policy management and defining trusted and untrusted sources. For example, some integrity shells utilize a white list of known clean files and their names. Such solutions are very practical on mission-critical environments that are under the control of centralized system administration.