### 10.3. Types of Vulnerabilities

#### 10.3.1. Buffer Overflows

Buffers are data storage areas that generally hold a predefined amount of finite data. A buffer overflow occurs when a program attempts to store data in a buffer, and the data is larger than the size of the buffer.

When the data exceeds the size of the buffer, the extra data can overflow into adjacent memory locations, corrupting valid data and possibly changing the execution path and instructions. Exploiting a buffer overflow makes it possible to inject arbitrary code into the execution path. This arbitrary code could allow remote system-level access, giving unauthorized access not only to malicious hackers, but also to replicating malware.

Buffer overflows are generally broken into multiple categories, based both on ease of exploitation and historical discovery of the technique. Although there is no formal definition, buffer overflows are, by consensus, broken into three generations:

• First-generation buffer overflows involve overwriting stack memory9.

• Second-generation overflows involve heaps, function pointers, and off-by-one exploits.

• Third-generation overflows involve format string attacks10 and vulnerabilities in heap structure management. Third-generation attacks often overwrite data just like "anywhere" in memory to force a change in execution flow indirectly according to what the attacker wants11.

For simplicity, the following explanations will assume an Intel CPU architecture, but the concepts can be applied to other processor designs.

#### 10.3.2. First-Generation Attacks

First-generation buffer overflows involve overflowing a buffer that is located on the stack.

##### 10.3.2.1 Overflowing a Stack Buffer

Listing 10.1 declares a buffer that is 256 bytes long. However, the program attempts to fill it with 512 bytes of the letter A (0x41).

##### Listing 10.1. A Bogus Function
int i;
void function(void)
{
char buffer[256];     //create a buffer

for(i=0;i<512;i++)        //iterate 512 times
buffer[i]='A';        //copy the letter A
}


Figure 10.1 illustrates how the EIP (where to execute next) is modified due to the program's overflowing the small 256-byte buffer.

##### Figure 10.1. An overflow of a return address.

When data exceeds the size of the buffer, the data overwrites adjacent areas of data stored on the stack, including critical values such as the instruction pointer (EIP), which defines the execution path. By overwriting the return EIP, an attacker can change what the program should execute next.

##### 10.3.2.2 Exploiting a Stack Buffer Overflow

Instead of filling the buffer full of As, a classic exploit will fill the buffer with its own malicious code. Also instead of overwriting the return EIP (where the program will execute next) with random bytes, the exploit overwrites EIP with the address to the buffer, which is now filled with malicious code. This causes the execution path to change and the program to execute injected malicious code.

Figure 10.2 demonstrates a classic stack-based (first-generation) buffer overflow, but there are variations. The exploit utilized by CodeRed was a first-generation buffer overflow that is more complex and is described later.

##### 10.3.2.3 Causes of Stack-Based Overflow Vulnerabilities

Stack-based buffer overflows are caused by programs that do not verify the length of data being copied into a buffer. This is often caused by using common functions that do not limit the amount of data copied from one location to another.

For example, strcpy is a C programming language function that copies a string from one buffer to another; however, the function does not verify that the receiving buffer is large enough, so an overflow can occur. Many such functions have safer counterparts, such as strncpy, which takes an additional count parameter, specifying the number of bytes that should be copied. On BSD systems, even safer versions, such as strlcpy, are available.

Of course, if the count is larger than the receiving buffer, an overflow can still occur. Programmers often make counting errors and utilize counts that are off by one byte. This can result in a type of second-generation overflow called off-by-one.

#### 10.3.3. Second-Generation Attacks

##### 10.3.3.1 Off-By-One Overflows

Programmers who attempt to use relatively safe functions such as strncat do not necessarily make their programs much more secure from overflows. In fact, the strncat function often leads to overflows because of its rather unintuitive behavior, which is inconsistent with strncpy's behavior27. Errors in counting the size of the buffer can occur, usually resulting in a single-byte overflow, or off-by-one. This is because indexes starting with "0" are not intuitive to novice programmers.

Consider the program shown in Listing 10.2. The programmer has mistakenly utilized "less than or equal to" instead of simply "less than."

##### Listing 10.2. Another Bogus Program
#include <stdio.h>

int i;
void vuln(char *foobar)
{
char buffer[512];

for(i=0;i<512;i++)
buffer[i]=foobar[i]
}

void main(int argc, char *argv[])
{
if (argc==2)
vuln(argv[1]);
}


In this case, the stack will appear as shown in Table 10.1.

##### Table 10.1. The Resulting Stack State of the Aforementioned Bogus Program

Value

0x0012FD74

buffer[512]

...

...

...

...

0x0012FF74

Old EBP=0x0012FF80

Ret EIP=0x00401048

Although one is unable to overwrite EIP because the overflow is only one byte, one can overwrite the least significant byte of EBP because it is a little-endian system. For example, if the overflow byte is set to 0x00, the old EBP is set to 0x0012FF00 as shown in Figure 10.3. Depending on the code, the fake EBP might be used in different ways. Often, a mov esp, ebp will follow, causing the stack frame pointer to be in the location of buffer. Thus buffer can be constructed to hold local variables, an EBP, and more importantly, an EIP that redirects to the malicious code.

##### Figure 10.3. An off-by-one overflow.

The local variables of the fake stack frame are processed, and the program continues execution at the fake return EIP, which has been set to injected code located in the stack. Thus a single-byte overflow could allow arbitrary code execution.

##### 10.3.3.2 Heap Overflows

A common misconception of programmers is that by dynamically allocating memory (utilizing heaps), they can avoid using the stack, reducing the likelihood of exploitation. Although stack overflows might be the "low-hanging fruit," utilizing heaps does not eliminate the possibility of exploitation.

##### 10.3.3.3 The Heap

A heap is memory that has been dynamically allocated. This memory is logically separate from the memory allocated for the stack and code. Heaps are dynamically created (for instance, "new, malloc") and removed (for instance, "delete, free").

Heaps are generally used because the amount of memory needed by the program is not known ahead of time or is larger than the stack.

The memory where heaps are located generally does not contain return addresses such as the stack. Thus without the ability to overwrite saved return addresses, redirecting execution is potentially more difficult. However, that does not mean utilizing heaps makes one secure from buffer overflows and exploitation.

##### 10.3.3.4 Vulnerable Code

Listing 10.3 shows a sample program with a heap overflow. The program dynamically allocates memory for two buffers. One buffer is filled with As. The other is taken in from the command line. If too many characters are typed on the command line, an overflow will occur.

##### Listing 10.3. A Heap Overflow Example
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void main(int argc, char **argv)
{
char *buffer = (char *) malloc(16);
char *input = (char *) malloc(16);

strcpy(buffer,"AAAAAAAAAAAAAAA");

// Use a non-bounds checked function
strcpy(input,argv[1]);
printf("%s",buffer);
}


With a normal amount of input, memory will appear as shown in Table 10.2.

##### Table 10.2. Memory Layout with Normal Input

Variable

Value

00300350

Input

BBBBBBBBBBBBBBBB

00300360

?????

?????????????????

00300370

Buffer

AAAAAAAAAAAAAAAA

However, if one inputs a large amount of data to overflow the heap, one can potentially overwrite the adjacent heap, as shown in Table 10.3.

Variable

Value

00300350

Input

BBBBBBBBBBBBBBBB

00300360

?????

BBBBBBBBBBBBBBBB

00300370

Buffer

BBBBBBBBAAAAAAAA

##### 10.3.3.5 Exploiting the Overflow

In a stack overflow, one could overflow a buffer and change EIP. This allows an attacker to change EIP to point to exploit code, usually in the stack.

Overflowing a heap does not typically affect EIP. However, overflowing heaps can potentially overwrite data or modify pointers to data or functions. For example, on a locked-down system one might not be able to write to C:\AUTOEXEC.BAT. However, if a program with system rights had a heap buffer overflow, instead of writing to some temporary file, someone could change the pointer to the temporary filename to point instead to the string C:\AUTOEXEC.BAT, inducing the program with system rights to write to C:\AUTOEXEC.BAT. This results in a user-rights elevation.

In addition, heap buffer overflows (just like stack overflows) also can result in a denial of service, allowing an attacker to crash an application.

Consider the vulnerable program shown in Listing 10.4, which writes characters to C:\HARMLESS.TXT.

##### Listing 10.4. A Vulnerable Program
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void main(int argc, char **argv)
{
int i=0,ch;
FILE *f;
static char buffer[16], *szFilename;
szFilename = "C:\\harmless.txt";

ch = getchar();
while (ch != EOF)
{
buffer[i] = ch;
ch = getchar();
i++;
}
f = fopen(szFilename, "w+b");
fputs(buffer, f);
fclose(f);
}


Memory will appear as shown in Table 10.4.

##### Table 10.4. Memory Layout of the Program Shown in Listing 10.4

Variable

Value

0x00300ECB

argv[1]

0x00407034

*szFilename

C:\harmless.txt

0x00407680

Buffer

0x00407690

szFilename

0x00407034

Notice that the buffer is close to szFilename. Both variables are placed to static heap (global data section), which is typically merged into the ".data" section of a PE file on Windows. If the attacker can overflow the buffer, the attacker can also overwrite the szFilename pointer and change it from 0xx00407034 to another address. For example, changing it to 0x00300ECB, which is argv[1], allows one to change the filename to any arbitrary filename passed in on the command line.

For example, if the buffer is equal to XXXXXXXXXXXXXXXX00300ECB and argv[1] is C:\AUTOEXEC.BAT, memory appears as shown in Table 10.5.

##### Table 10.5. Memory Layout During Attack

Variable

Value

0x00300ECB

argv[1]

C:\AUTOEXEC.BAT

0x00407034

*szFilename

C:\harmless.txt

0x00407680

Buffer

XXXXXXXXXXXXXXX

0x00407690

szFilename

0x00300ECB

Notice that szFilename has changed and now points to argv[1], which is C:\AUTOEXEC.BAT. Although heap overflows might be more difficult to exploit than the average stack overflow, the increased difficulty does not stop dedicated, intelligent attackers from exploiting them. The exploit code of Linux/Slapper worm (discussed in this chapter) makes this very clear.

##### 10.3.3.6 Function Pointers

Another second-generation overflow involves function pointers. A function pointer occurs mainly when callbacks occur. If in memory, a function pointer follows a buffer, there is the possibility to overwrite the function pointer if the buffer is unchecked. Listing 10.5 is a simple example of such code.

##### Listing 10.5. An Example Application Using Function Pointers
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int CallBack(const char *szTemp)
{
printf("CallBack(%s)\n", szTemp);
return 0;
}

void main(int argc, char **argv)
{
static char buffer[16];
static int (*funcptr)(const char *szTemp);

funcptr = (int (*)(const char *szTemp))CallBack;
strcpy(buffer, argv[1]); // unchecked buffer

(int)(*funcptr)(argv[2]);
}


Table 10.6 shows what memory looks like when this code is executed.

##### Table 10.6. Normal Memory Layout of the Application Shown in Listing 10.5

Variable

Value

00401005

CallBack()

004013B0

system()

004255D8

buffer

????????

004255DC

funcptr

00401005

For the exploit, one passes in the string ABCDEFGHIJKLMNOP004013B0 as argv[1], and the program will call system() instead of CallBack(). In this case, usage of a NULL (0x00) byte renders this example inert. Avoiding a NULL byte is left to the reader, to avoid exact exploit code (see Table 10.7).

##### Table 10.7. Memory Layout with Attack Parameters

Variable

Value

00401005

CallBack()

004013B0

system()

004255D8

buffer

ABCDEFGHIJKLMNOP

004255EE

funcptr

004013B0

This demonstrates another nonstack overflow that allows the attacker to run any command by spawning system() with arbitrary arguments.

#### 10.3.4. Third-Generation Attacks

##### 10.3.4.1 Format String Attacks

Formatstring vulnerabilities occur due to sloppy coding by software engineers. A variety of C language functions allow printing of characters to files, buffers, and the screen. Not only do these functions place values on the screen, they also can format them. Table 10.8 shows common ANSI format functions:

##### Table 10.8. ANSI Format Functions

printf

Print formatted output to the standard output stream

wprintf

Wide-character version of printf

fprintf

Print formatted data to a stream (usually a file)

fwprintf

Wide-character version of fprintf

sprintf

Write formatted data to a string

swprintf

Wide-character version of sprintf

vprintf

Write formatted output using a pointer to a list of arguments

vwprintf

Wide-character version of vprintf

vfprintf

Write formatted output to a stream using a pointer to a list of arguments

vwfprintf

Wide-character version of vfprintf

The formatting ability of these functions allows programmers to control how their output is written. For example, a program could print the same value in both decimal and hexadecimal.

##### Listing 10.6. An Application Using Format Strings
#include <stdio.h>
void main(void)
{
int foo =1234;
printf("Foo is equal to: %d (decimal), %X (hexadecimal)", foo, foo);
}


The program shown in Listing 10.6 would display the following:

Foo is equal to: 1234 (decimal), 4D2 (hexadecimal)


The percent sign (%) is an escape character signifying that the next character(s) represent the form the value should be displayed. Percent-d (%d), for example, means to "display the value in decimal format," and %X means to "display the value in hexadecimal with uppercase letters." These are known as format specifiers.

The format function family specification requires the format control and then an optional number of arguments to be formatted (as shown in Figure 10.4).

##### Figure 10.4. The format function specification.

Sloppy programmers, however, often do not follow this specification. The program in Listing 10.7 will display Hello World! on the screen, but it does not strictly follow the specification. The commented line demonstrates the proper way the program should be written.

##### Listing 10.7. A Bogus Program Using Incorrect Formatting Syntax
#include <stdio.h>
void main(void)
{
A Bogus Program Using Incorrect Formatting Syntax
char buffer[13]="Hello World!";
printf(buffer);        // using argument as format control!
// printf("%s",buffer); this is the proper way
}


Such sloppy programming allows an attacker the potential to control the stack and inject and execute arbitrary code. The program in Listing 10.8 takes in one command-line parameter and writes the parameter back to the screen. Notice that the printf statement is used incorrectly by using the argument directly instead of a format control.

##### Listing 10.8. Another Sloppy Program Subject to Code Injection Attacks
int vuln(char buffer[256])
{
int nReturn=0
printf(buffer);    //print out command line
// printf("%s",buffer); // correct-way
return(nReturn);
}

void main(int argc,char *argv[])
{
char buffer[256]=""; // allocate buffer
if (argc == 2)
{
strncpy(buffer,argv[1],255);  // copy command line
}
vuln(buffer); // pass buffer to bad function
}


This program copies the first parameter on the command line into a buffer, and then the buffer is passed to the vulnerable function. Because the buffer is being used as the format control, however, instead of feeding in a simple string, one can attempt to feed in a format specifier.

For example, running this program with the argument %X will return some value on the stack instead of %X:

C:\>myprog.exe %X
401064


The program interprets the input as the format specifier and returns the value on the stack that should be the passed-in argument. To understand why this is the case, examine the stack just after printf is called. Normally, if one uses a format control and the proper number of arguments, the stack will look similar to what is shown in Figure 10.5.

##### Figure 10.5. The state of the stack with the proper number of arguments.

However, by using printf() incorrectly, the stack will look different, as shown in Figure 10.6.

##### Figure 10.6. Incorrect use of print().

In this case, the program believes that the stack space after the format control is the first argument. When %X is fed to our sample program, printf displays the return address of the previous function call instead of the expected missing argument. In this example we display arbitrary memory, but the key to exploitation from the point of view of a malicious hacker or virus writer would be the ability to write to memory for the purpose of injecting arbitrary code.

The format function families allow for the %n format specifier, which will store (write to memory) the total number of bytes written. For example, the following line will store 6 in nBytesWritten because foobar consists of six characters:

printf("foobar%n",&nBytesWritten);


Consider the following:

C:\>myprog.exe foobar%n


When executed, instead of displaying the value on the stack after the format control, the program attempts to write to that location. So instead of displaying 401064 as demonstrated previously, the program attempts to write the number 6 (the total number of characters in foobar) to the address 401064. The result is an application error message, as shown in Figure 10.7.

##### Figure 10.7. Myprog.exe crashes and proves exploitability.

This does demonstrate, however, that one can write to memory. With this ability, one would actually wish to overwrite a return pointer (as in buffer overflows), redirecting the execution path to injected code. Examining the stack more fully, the stack appears as shown in Figure 10.8.

##### Figure 10.8. The stack state of myprog.exe before exploitation.

Knowing how the stack appears, consider the following exploit string:

C:>myprog.exe AAAA%x%x%n


This results in the exploited stack state, as shown in Figure 10.9.

##### Figure 10.9. An example exploitation of stack using tricky format strings.

The first format specifier, %x, is considered the return address (0x401064); the next format specifier, %x, is 0x12FE84. Finally, %n will attempt to write to the address specified by the next DWORD on the stack, which is 0x41414141 (AAAA). This allows an attacker to write to arbitrary memory addresses.

Instead of writing to address 0x41414141, an exploit would attempt to write to a memory location that contains a saved return address (as in buffer overflows). In this example, 0x12FE7C is where a return address is stored. By overwriting the address in 0x12FE7C, one can redirect the execution path. So instead of using a string of As in the exploit string, one would replace them with the value 0x12FE7C, as shown in Figure 10.10.

##### Figure 10.10. Exploitation of a return address.

The return address should be overwritten by the address that contains the exploit code. In this case, that would be at 0x12FE84, which is where the input buffer is located. Fortunately, the format specifiers can include how many bytes to write using the syntax, %.<bytestowrite>x. Consider the following exploit string:

<exploitcode>%x%x%x%x%x%x%x%x%x%.622404x%.622400x%n\x7C\xFEx12


This will cause printf to write the value 0x12FE84 (622404+622400=0x12FE84) to 0x12FE7C if the exploit code is two bytes long. This overwrites a saved return address, causing the execution path to proceed to 0x12FE84, which is where an attacker would place exploit code, as shown in Figure 10.11.

##### Figure 10.11. Exploitation of a return address to exploit the buffer.

For the sake of completeness, I need to mention that most library routines disallow the use of large values as a length specifier, and others will crash with large values behind a "%" specifier12. Thus the usual attacks write a full DWORD (32-bit value) in four overlapping writes similar to the following sequence:

      000000CC (1st write)
000001BB   (2nd write)
000002AA     (3rd write)
000003CC       (4th write)
CCAABBCC (complete DWORD)


By not following the specification exactly, programmers can allow an attack to overwrite values in memory and execute arbitrary code. While the improper usage of the format function family is widespread, finding vulnerable programs is also relatively easy.

Most popular applications have been tested by security researchers for such vulnerabilities. Nevertheless, new applications are developed constantly and, unfortunately, developers continue to use the format functions improperlyleaving them vulnerable.

##### 10.3.4.2 Heap Management

Heap management implementations vary widely. For example, the GNU C malloc is different from the System V routine. However, malloc implementations generally store management information within the heap area itself. This information includes such data as the size of memory blocks and is usually stored right before or after the actual data.

Thus by overflowing heap data, one can modify values within a management information structure (or control block). Depending on the memory management functions' actions (for instance, malloc and free) and the specific implementation, one can cause the memory management function to write arbitrary data at arbitrary memory addresses when it utilizes the overwritten control block.

##### 10.3.4.3 Input Validation

Input validation exploits take advantage of programs that do not properly validate user-supplied data. For example, a Web page form that asks for an e-mail address and other personal information should validate that the e-mail address is in the proper form and does not contain special escape or reserved characters.

Many applications, such as Web servers and e-mail clients, do not properly validate input. This allows hackers to inject specially crafted input that causes the application to perform in an unexpected manner.

Although there are many types of input validation vulnerabilities, URL canonicalization and MIME header parsing are specifically discussed here due to their widespread usage in recent blended attacks.

##### 10.3.4.4 URL Encoding and Canonicalization

Canonicalization is the process of converting something that can exist in multiple different but equally valid forms into a single, common, standard form. For example, C:\test\foo.txt and C:\test\bar\..\foo.txt are different full pathnames that represent the same file. Canonicalization of URLs occurs in a similar manner, where http://doman.tld/user/foo.gif and http://domain.tld/user/bar/../foo.gif would represent the same image file.

A URL canonicalization vulnerability results when a security decision is based on the URL and not all of the URL representations are taken into account. For example, a Web server might allow access only to the /user directory and subdirectories by examining the URL for the string /user immediately after the domain name. For example, the URL http://domain.tld/user/../../autoexec.bat would pass the security check but actually provide access to the root directory.

After more widespread awareness of URL canonicalization issues due to such an issue in Microsoft Internet Information Server, many applications added security checks for the double-dot string (..) in the URL. However, canonicalization attacks were still possible due to encoding.

For example, Microsoft IIS supports UTF-8 encoding such that %2F represents a forward slash (/). UTF-8 translates US-ASCII characters (7 bits) into a single octet (8 bits) and other characters into multioctets. Translation occurs as follows:

0-7 bits 0xxxxxxx
8-11 bits 110xxxxx 10xxxxxx
12-16 bits 1110xxxx 10xxxxxx 10xxxxxx
17-21 bits 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx


For example, a slash (0x2F, 0101111) would be 00101111 in UTF-8. A character with the hexadecimal representation of 0x10A (binary 100001010) has 9 bits and thus a UTF-8 representation of 11000100 10001010.

Although standard encoding did not defeat the aforementioned input validation security checks, Microsoft IIS still provides decoding, if one encodes a 7-bit character using the 811 bit rule format.

For example, a slash (0x2F, 101111) in 811 bit UTF-8 encoding would be 11000000 10101111 (hexadecimal 0xC0 0xAF). Thus, instead of using the URL

http://domain.tld/user/../../autoexec.bat


one could substitute the slash with the improper UTF-8 representation:

http://domain.tld/user/..%co%af../autoexec.bat.


The input validation allows this URL because it does not recognize the UTF-8-encoded forward slash, which gives access outside the Web root directory. Microsoft fixed this vulnerability with Security Bulletin MS00-78.

In addition, Microsoft IIS performs UTF-8 decoding on two separate occasions. This allows characters to be double-encoded. For example, a backslash (0x5C) can be represented as %5C. However, one also can encode the percent sign (0x25) itself. Thus, %5C can be encoded as %255c. On the first decoding pass, %255c is decoded to %5c, and on the second decoding pass, %5C is decoded to a backslash.

Thus, a URL such as http://domain.tld/user/..%5c../autoexec.bat will not pass the input validation check, but http://domain.tld/user/..%255c../autoexec.bat would pass, allowing access outside the Web root directory. Microsoft fixed this vulnerability with Security Bulletin MS01-26.

The inability of Web servers to provide proper input validation can lead to attacks. For example, in IIS, one can utilize the encoding vulnerabilities to break out of the Web root directory and execute CMD.EXE from the Windows system directory, allowing remote execution. W32/Nimda utilized such an attack to copy itself to the remote Web server and then execute itself.

When Internet Explorer parses a file, it can contain embedded MIME-encoded files. Handling of these files occurs by examining a header, which defines the MIME type. Using a look-up table, these MIME types are associated with a local application. For example, the MIME type audio/basic is generally associated with Windows Media Player. Thus, MIME-encoded files designated as audio/basic are passed to Windows Media Player.

MIME types are defined by a Content-Type header. In addition to the associated application, each type has a variety of associated settings, including the iconindicating whether to show the extension and whether to pass the file automatically to the associated application when the file is being downloaded.

When an HTML e-mail is received with Microsoft Outlook and some other e-mail clients, code within Internet Explorer actually renders the e-mail. If the e-mail contains a MIME-embedded file, Internet Explorer parses the e-mail and attempts to handle embedded MIME files. Vulnerable versions of Internet Explorer would check whether the application should be opened automatically (passed to the associated application without prompting) by examining the Content-Type header. For example, audio/x-wav files are automatically passed to Windows Media Player for playing.

A bug exists in vulnerable versions of Internet Explorer, however, where files are passed to the incorrect application. For example, a MIME header might appear as follows:

Content-Type: audio/x-wav;
name="foobar.exe"
Content-Transfer-Encoding: base64
Content-ID: <CID>


In this case, Internet Explorer determines that the file should be passed automatically to the associated application (no prompting) because the content type is audio/x-wav. However, when determining what the associated application is, instead of utilizing the Content-Type header (and the file header itself), Internet Explorer incorrectly relies on a default association, which will be made according to the extension. In this case, the extension is .EXE, and the file is passed to the operating system for executioninstead of passing the audio file to an associated application to be played.

This bug allows for the automatic execution of arbitrary code. Several Win32 mass-mailers send themselves via an e-mail with a MIME-encoded malicious executable and a malformed header, and the executable silently executes unbeknownst to the user. This occurs whenever Internet Explorer parses the mail and can happen when a user simply reads or previews e-mail. Such e-mail worms can spread themselves without any user actually executing or detaching a file.

Any properly associated MIME file type that has not set the "Confirm open after download" flag can be utilized for this exploit. Thus a definitive list is unavailable because developers can register their own MIME types.

Such an exploit was utilized by both W32/Badtrans and W32/Klez, allowing them to execute themselves upon a user's reading or previewing an infected e-mail.

##### 10.3.4.6 Application Rights Verification

Although improper input validation can give applications increased access, as with URL canonicalization, other models simply give applications increased rights due to improper designation of code as "safe." Such a design is employed by ActiveX. As a result, numerous blended attacks have also used ActiveX control rights verification exploits13.

##### 10.3.4.7 Safe-for-Scripting ActiveX Controls

By design, ActiveX controls are generally scriptable. They expose a set of methods and properties that can potentially be invoked in an unforeseen and malicious manner, often via Internet Explorer.

The security framework for ActiveX controls requires developers to determine whether their ActiveX control could potentially be used in a malicious manner. If a developer determines a control is safe, the control can be marked "safe for scripting."

Microsoft notes that ActiveX controls that have any of the following characteristics must not be marked safe for scripting:

• Accessing information about the local computer or user

• Exposing private information on the local computer or network

• Modifying or destroying information on the local computer or network

• Faulting of the control and potentially crashing the browser

• Consuming excessive time or resources, such as memory

• Executing potentially damaging system calls, including executing files

• Using the control in a deceptive manner and causing unexpected results

However, despite these simple guidelines, some ActiveX controls with these characteristics have been marked safe for scriptingand used maliciously.

For example, VBS/Bubbleboy used the Scriptlet.Typelib14 ActiveX control to write out a file to the Windows Startup directory. The Scriplet.Typelib contained properties to define the path and contents of the file. Because this ActiveX control was incorrectly marked safe for scripting, one could invoke a method to write a local file via a remote Web page or HTML e-mail without triggering any ActiveX warning dialog.

ActiveX controls that have been marked safe for scripting can easily be determined by examining the Registry. If the safe-for-scripting CLSID key exists under the Implemented Categories key for the ActiveX control, the ActiveX control is marked safe for scripting.

For example, the Scriptlet.Typelib control has a class ID of { 06290BD5-48AA-11D2-8432-006008C3FBFC} , and the safe-for-scripting CLSID is { 7DD95801-9882-11CF-9FA0-00AA006C42C4} . In the Registry, an unpatched system would contain the key

[View full width]
HKCR/CLSID/{06290BD5-48AA-11D2-8432-006008C3FBFC}/Implemented Categories
/{7DD95801-9882-11CF-9FA0-00AA006C42C4}


This allows any remote Web page or incoming HTML e-mail to create malicious files on the local system. Clearly, leaving such security decisions to the developer is far from foolproof.

##### 10.3.4.8 System Modification

After malicious software has gained access to the system, the system is often modified to disable application or user rights verification. Such modifications can be as simple as eliminating a root password or modifying the kernel, allowing user rights elevation or previously unauthorized access.

For example, CodeRed creates virtual Web roots, allowing general access to the compromised Web server, and W32/Bolzano15 patches the kernel, disabling user rights verification on Windows NT systems. System modificationbased exploitation techniques, however, were also possible on older systems such as Novell NetWare environments.

##### 10.3.4.8.1 NetWare ExecuteOnly Attribute: Consider Harmful

For many years, the Novell NetWare ExecuteOnly attribute has been recommended as a good way to secure programs against virus infections and to disallow illegal copying. Unfortunately, this attribute is not safe and, in some cases, can be harmful. A small bug in old versions of NetWare Workstation Shell (for instance, NET3.COM from 1989) made it possible for some viruses to write to ExecuteOnly files, even though these viruses had no NetWare-specific code at all16.

The same buggy shell still works with newer Novell NetWare versions, which means that the problem exists even today. When running on top of a buggy network shell, fast infectors can write into ExecuteOnly files on any NetWare version. In addition, there are other ways to exploit the same vulnerability on any version of the NetWare shell.

This happens because the ExecuteOnly attribute is not a NetWare right but a plain NetWare attribute. The workstation shell is supposed to handle it, but in an unsecured DOS workstation, this is impossible. Because there is no write protection at the server side for ExecuteOnly files, any program capable of subverting the shell checks can not only read and copy ExecuteOnly files, but can also write to them.

Because ExecuteOnly files cannot normally be read by any program or even by the Administrator, an infected ExecuteOnly file cannot be detected by standard virus scanners. This makes recovery from such an infection a major problem.

##### 10.3.4.8.2 Execute, Only?

Under Novell NetWare it is possible to mark program files as ExecuteOnly. Only the supervisor has the rights to do this. Because these files are not available for read by anyone, including the supervisor, this is supposed to make them unreachable for any kind of access except execution or deletion (by the supervisor only). Unfortunately, this method is not perfect, and the problem is fundamental.

Before the workstation can access files from the server, it must load the NetWare client software into memory. In the case of a DOS workstation, the client software consists of two TSR programs. The client must first start IPX (low-level communication protocol driver) and then NETx, the Workstation Shell. NETx adds several new subfunctions to the DOS INT 21h handler.

When IPX and NETx are running, the user of the workstation can execute LOGIN to connect to the server. When the client requests a file, the shell checks the location of the file. There are no special functionalities in NETx for requests for local files because it redirects all requests that need replies from the server. Thus all program execution requests, EXEC (INT 21h/AH=4Bh) calls, are redirected to the file server if the path of the executable is mapped from the server.

Every EXEC function (implicitly) must read the program code into the workstation memory first. To do this, it must be able to open the file for reading. ExecuteOnly files are not available to open by any user (including the supervisor), so the shell has to have a "backdoor." The workstation shell whispers, "I am the shell, and I want to execute this file; give me access to this file." And its open/read/close requests for this specific ExecuteOnly file succeed.

Note

I am not going to document here how the backdoor works. That's because I do not like to give ideas to virus writers.

In the buggy network shell, the operation continues with a simple "open file" function call:

1AA3:62EF B8003D      MOV      AX,3D00  ; Open for READ
1AA3:62F2 CD21        INT      21


Because the shell needs to open the file, the vulnerable code uses an Open function (3Dh) of INT 21h interrupt. Unfortunately, every resident program can monitor this function call since it goes through via the interrupt vector table. Thus the interrupt call is visible to fast infectors (viruses that hook INT 21h and wait for the Open subfunction). Without any efforts, this vulnerability allows viruses to infect files marked ExecuteOnly when executed on a workstation running the bogus shell.

To demonstrate this, I executed a test using the Burglar.1150.A virus, a typical fast infector. (Burglar was in the wild at the time of this experience.) I loaded the preceding vulnerable Workstation Shell version and Novell NetWare 4.10. I also created several test files in one directory by using supervisor rights and flagged them as ExecuteOnly.

I executed the virus on a workstation where I was logged in with normal user rights (no supervisor rights), but I did have directory modification rights to the test directory. The virus could infect ExecuteOnly files perfectly as long as I had directory modification rights on the test directory. That means there is no write protection on ExecuteOnly files at the server side on a file granularity level. If a workstation is able to write to an ExecuteOnly file, the NetWare server will allow it!

For another simple test, I executed the DOS internal command ECHO and redirected its output to an ExecuteOnly file. The command was the following:

ECHO X >> test.com.


To my surprise, I was able to modify ExecuteOnly files with this method as well.

Now, let's see what Novell did to fix the vulnerability. Apparently, there is a minor, but crucial change to how the shell opens a file. The shell calls its own INT 21h entry point directlynot through the interrupt vector table. This means that the call is no longer visible to other TSR programs (including viruses). The shell's code is fixed the following way:

1AB3:501D B8003D        MOV     AX,3D00
1AB3:5020 9C            PUSHF
1AB3:5021 0E            PUSH    CS
1AB3:5022 E893B7        CALL    07B8; Shell entry point


I used INTRSPY (Interrupt Spy)17 to check the OPEN and EXEC interrupt sequences. Interrupt Spy is a very useful, small TSR that can monitor all interrupts.

Interrupt Spy could see as the file "TEST.COM" was opened by the bogus network shell, on the file server:

EXEC: INT 21h AX=4B00 BX=0D03 CX=0D56 DX=41B9           Timer: 880420
File:         F:\VIRUS!\TEST.COM

OPEN: INT 21h AX=3D00 BX=008C CX=0012 DX=41B9
Timer: 880420
File:         F:\VIRUS!\TEST.COM


When I used the fixed network shell, (NETX.EXE), Interrupt Spy could no longer notice that "TEST.COM" was opened on the file server:

EXEC: INT 21h AX=4B00 BX=0D03 CX=0D56 DX=41B9              Timer: 893702
File:         F:\VIRUS!\TEST.COM

OPEN: INT 21h AX=3D00 BX=0001 CX=0000 DX=001F              Timer: 893810
File          A:\REPORT.TXT


##### 10.3.4.8.3 Conclusions from This Experience

The ExecuteOnly flag alone cannot stop illegal copying or viruses. An attacker has too many methods to bypass this particular protection:

Attack 1: The attacker can save an image of the executed program from the memory of the workstation. Sometimes this can be difficult (as with relocated code) but generally is feasible.

Attack 2: The attacker uses a buggy shell with a resident program, which hooks INT 21h and waits for the file to open. Then the program copies the file to some other location without the ExecuteOnly flag.

Attack 3: The attacker modifies the shell code in memory (or in the NETx program file itself) by patching a few bytes. This can remove the ExecuteOnly protection from correct shells as well. From that workstation, every user (or program) can copy or modify ExecuteOnly flagged files. I have demonstrated this attack with my proof of concept exploit code on the EICAR conference in 1996.

You should not think that ExecuteOnly is a secure solution in NetWareespecially not since fast infector viruses could infect files marked ExecuteOnly when running on the vulnerable shell. This vulnerability caused major problems when cleaning the network from fast infector viruses.

##### 10.3.4.9 Network Enumeration

Several 32-bit computer viruses enumerate the Windows networks using standard Win32 browsing APIs, such as WNetOpenEnum(), WNetEnumResourceA() of MPR.DLL. This attack first appeared in W32/ExploreZip. Figure 10.12 shows the structure of a typical Windows network.

##### Figure 10.12. A typical Windows network with domains.

Resources that do not contain other resources are called objects. In Figure 10.12, Sharepoint #1 and Sharepoint #2 are objects. A sharepoint is an object that is accessible across the network. Examples of sharepoints include printers and shared directories.

The W32/Funlove18 virus was the first file infector to infect files on network shares using network enumeration. W32/Funlove caused major headaches by infecting large corporate networks worldwide. This is because of the network-aware nature of the virus. People often share directories without any security restrictions in place and share in more directories (such as a drive C:) than they need to, and often without any passwords. This enhances the effectiveness of network-aware viruses.

Certain viruses, such as W32/HLLW.Bymer, use the form \\nnn.nnn.nnn.nnn\c\windows\ (where the nnn's describe an IP address) to access the shared C: drive of remote systems that have a Windows folder on them. Such an attack can be particularly painful for many home users running a home PC that is not behind a firewall. Windows makes the sharing of network resources very simple. However, the user must pay extra attention to security, using strong passwords, limiting access to only necessary resources, and using other means of security, such as personal firewall software.

Even more importantly, the real exploitable vulnerability comes into the picture with the appearance of W32/Opaserv worm19. Opaserv exploits the vulnerability described in MS00-072. Using the exploit Opaserv can easily attack shared drives on Windows 9x/Me systems even if there was a shared drive password set. The vulnerability allows the worm to send a one-character "long" password. Thus the worm can quickly brute-force the first character of the real password in the range between the 0x21 ("!") and 0xFF characters and be able to copy itself to the shared drive of the "protected" remote systems in the blink of an eye.