Vulnerability Case Study: Format String Vulnerabilities

Format string attacks

  • The goal of a format string exploit is to take control of the execution of a program by overwriting data. It depends on programming mistakes or shortcuts made when using format functions such as printf().
  • If the software being tested is written in C, C++, or Perl, even in part, there is a possibility that it may use one or more format functions.
  • The root cause of this vulnerability is that these languages allow functions to accept any number of arguments by “popping” as many arguments as they wish off the stack and trusting the early arguments to show how many additional arguments are to be popped and what data types they are. If this is combined with unfiltered user input as the format string, the vulnerability is created.
  • The following are the basic ANSI C functions that may be vulnerable to format string attacks:
    • fprintf — prints to a file stream
    • printf — prints to the stdout stream
    • sprintf — prints into a string
    • snprintf — prints into a string with length checking
    • vfprintf — prints to a file stream from a va_list structure
    • vprintf — prints to the stdout stream from a va_list structure
    • vsprintf — prints into a string from a va_list structure
    • vsnprintf — prints into a string with length checking from a va_list structure
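The practical difference that length checking makes can be sketched as follows. This is a minimal illustration, not code from any of the programs discussed here; the helper name copy_label and the "label=" prefix are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats `name` into `dst` without overflowing it.
 * Returns 1 if the whole string fit, 0 if it was truncated or on error. */
int copy_label(char *dst, size_t dst_size, const char *name)
{
    /* snprintf stores at most dst_size - 1 characters plus a terminating
     * null byte, and returns the length the full output would have had. */
    int needed = snprintf(dst, dst_size, "label=%s", name);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

With sprintf there is no size argument at all, so the same call would simply write past the end of a buffer that is too small.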
  • The format string itself contains the format parameters, similar to how inline formatting is done in HTML. The format parameters for printf() are:
    • %c — character represented by an integer - passed as a value
    • %d — decimal (int) - passed as a value
    • %e — scientific representation of a floating point number - passed as a value
    • %f — signed decimal string of the form xx.yyy - passed as a value
    • %i — signed decimal string - passed as a value
    • %u — unsigned decimal (unsigned int) - passed as a value
    • %n — number of bytes written so far (int *) - passed as a reference
    • %o — unsigned octal string - passed as a value
    • %p — formats a pointer to an address - passed as a reference
    • %s — string ((const) (unsigned) char *) - passed as a reference
    • %x — hexadecimal (unsigned int) - passed as a value
    • %% — used to print the escape character “%”
  • When a string is passed to a format function, the function evaluates the format string one character at a time. If the character is not a percentage sign (%), it is copied to the output. If a percentage sign is encountered, the next character specifies the type of parameter that requires evaluation.
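The character-by-character evaluation can be sketched with a toy formatter. This is a heavily simplified illustration under stated assumptions: the names mini_format and advance are hypothetical, only %d, %s, and %% are handled, and real printf() implementations are far more involved. The point is that the format string alone decides how many arguments are fetched and what type each one is assumed to have.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Advances `used` by the length snprintf reported, clamped to the buffer. */
static size_t advance(size_t used, size_t cap, int n)
{
    size_t room = cap - used - 1;           /* bytes left before the null */
    if (n < 0) return used;
    return used + ((size_t)n < room ? (size_t)n : room);
}

/* Hypothetical toy formatter supporting only %d, %s and %%.
 * Writes into out (capacity cap) and returns the number of bytes stored. */
size_t mini_format(char *out, size_t cap, const char *fmt, ...)
{
    va_list args;
    size_t used = 0;

    if (cap == 0) return 0;
    va_start(args, fmt);
    for (const char *p = fmt; *p != '\0' && used + 1 < cap; p++) {
        if (*p != '%') {                    /* ordinary character: copy it */
            out[used++] = *p;
            continue;
        }
        p++;                                /* look at the conversion character */
        if (*p == '\0') break;              /* lone '%' at the end: stop */
        if (*p == 'd') {
            /* The format string says "an int comes next", so one is fetched. */
            int n = snprintf(out + used, cap - used, "%d", va_arg(args, int));
            used = advance(used, cap, n);
        } else if (*p == 's') {
            /* Here a pointer is fetched and dereferenced, on trust. */
            int n = snprintf(out + used, cap - used, "%s",
                             va_arg(args, const char *));
            used = advance(used, cap, n);
        } else if (*p == '%') {
            out[used++] = '%';
        }                                   /* anything else is dropped */
    }
    va_end(args);
    out[used] = '\0';
    return used;
}
```

Nothing in mini_format can verify that the caller really supplied an int or a valid pointer; va_arg simply reads whatever occupies the next argument slot.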
  • The following Figure illustrates the stack when a format string has been pushed onto it.

  • For every format parameter found in the string, the function expects an additional argument to be passed in. So if four format parameters exist in a format string, there should be a total of five arguments passed to the function: the format string itself and one additional argument for each of the parameters. The problems occur when developers call a format function with the input itself as the format string, instead of explicitly stating that the input must be treated as a string.
  • Example:
    • Correct: printf("%s", inputString);
    • Incorrect: printf(inputString);
    • If both calls are made with inputString = “Testing%s”, the output is:
      • Correct: Testing%s
      • Incorrect: Testing?? (where the ?? is whatever string is being pointed to by whatever memory address was at the top of the stack).
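The correct form can be sketched as a small helper. The name render_untrusted is hypothetical, and snprintf() is used instead of printf() only so that the result is easy to inspect; the unsafe form appears in a comment because actually running it is undefined behavior.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: renders untrusted text into buf as plain data.
 * Returns the length of the rendered text (as snprintf reports it). */
int render_untrusted(char *buf, size_t size, const char *input)
{
    /* Safe: the format string is a constant and input is only an argument,
     * so any '%' characters inside it are copied, not interpreted. */
    return snprintf(buf, size, "%s", input);

    /* Unsafe (do not do this): snprintf(buf, size, input);
     * With input = "Testing%s" the function would fetch a pointer argument
     * that was never passed. */
}
```

When no formatting is needed at all, fputs(input, stdout) is another option, since it never interprets its argument as a format string.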
  • For a simple string that contains no format parameters, both calls produce the same output. But if the string provided to the function has one or more format parameters embedded in it, the function will treat the string (which was meant to be merely a string) as a format string.
    • The danger in this is that some of the format parameters take a pointer to memory, specifically %s (string) and %n (number of bytes written so far).
    • Example:
      • %s expects a memory address and then will print the data at that address until a null byte terminator is encountered to signal the end of the string.
      • %n expects a memory address and will write the number of bytes written so far into memory at that address.
  • So if an attacker can create an input string that is passed to a vulnerable call to a format function, the attacker can do anything from crashing processes or systems, to showing the contents of parts of the stack, to writing their own data to the stack.
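The write performed by %n can be seen in its legitimate, intended use. This is a minimal sketch; the helper name bytes_before_marker is hypothetical, and some C libraries disable or restrict %n by default, so behavior may vary by platform.

```c
#include <stdio.h>

/* Hypothetical helper showing the intended use of %n: returns how many
 * bytes had been produced when the %n directive was reached. */
int bytes_before_marker(void)
{
    char buf[32];
    int count = -1;

    /* %n prints nothing; it stores the running byte count (5 here, for
     * "hello") through the int pointer supplied as its argument. */
    snprintf(buf, sizeof buf, "hello%n world", &count);
    return count;
}
```

In an attack, the format string comes from the attacker and no matching pointer argument was ever passed, so the store goes through whatever value happens to sit in the expected argument position.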

Anatomy of an Exploit

  • Most exploits of format string vulnerabilities start with knowing that a format string vulnerability exists. This can be discovered in multiple ways, including personal experimentation and seeing it posted on one of the many boards frequented by hackers.
  • It also helps to know that the software is written in a language where this vulnerability can occur (C, C++, and Perl, in this case).
  • Once the vulnerability is known, the attacker needs to carefully craft a string to exploit the format string vulnerability.
  • The exact string or strings will depend on what the attacker wishes to accomplish.
  • Once the attackers determine what they want to accomplish, they can craft one or more strings to exploit the vulnerability.
  • Now they can combine this exploit with other code to create malicious software to distribute or install.

Real-World Example

  • Although considerable numbers of format string vulnerabilities have been reported, few actual exploits are listed among the reports of real incidents.
  • But a very interesting toolkit that did use format string vulnerabilities in some of its functionality was reported “in the wild” in January 2001.

Ramen Worm Toolkit

  • This intruder toolkit was recovered from several compromised systems and analyzed, and it was found to contain several tools for attempting to exploit vulnerabilities in common software.
  • Ramen is self-propagating and restarting, and once it has completed modifications of the compromised host system, Ramen starts to scan and attempts exploits against external systems. Similar to all root compromises, Ramen is painful and time consuming (and therefore expensive) to recover from.
  • The vulnerabilities Ramen attempts to exploit will cause a root compromise if even one of them is successful. These vulnerabilities are described as follows:

wu-ftpd (port 21/tcp)

  • The initial advisory for this vulnerability was reported in July 2000 — five months prior to the Ramen worm. There are actually two separate vulnerabilities under this advisory:
    • The “site exec” functionality allows a user logged into the ftp server to execute a limited number of commands on the server itself. However, this functionality uses the printf() function and, if the user passes carefully crafted strings when calling that functionality, the daemon can be tricked into executing arbitrary code as root. In other words, the developers didn't use a format specifier when they used printf().
    • The “setproctitle()” call sets a string used to display process identifier information. This functionality eventually calls vsnprintf() and passes the buffer created from setproctitle() as the format string. The developers didn't use a format specifier when they used vsnprintf().
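The shape of a printf-style wrapper around vsnprintf(), and the safe way to call it, can be sketched as follows. This is not the wu-ftpd code; set_title, set_title_from_user, get_title, and the proc_title buffer are hypothetical stand-ins.

```c
#include <stdarg.h>
#include <stdio.h>

static char proc_title[128];    /* hypothetical stand-in for the process title */

/* Hypothetical printf-style wrapper, similar in shape to a setproctitle()
 * implementation: it forwards its arguments to vsnprintf(). */
void set_title(const char *fmt, ...)
{
    va_list args;

    va_start(args, fmt);
    vsnprintf(proc_title, sizeof proc_title, fmt, args);
    va_end(args);
}

/* Safe caller: untrusted text is passed as data for a constant "%s".
 * The vulnerable pattern would be set_title(user_text). */
void set_title_from_user(const char *user_text)
{
    set_title("%s", user_text);
}

/* Accessor so the result can be inspected. */
const char *get_title(void)
{
    return proc_title;
}
```

The wrapper itself is fine; the flaw appears only when a caller hands it untrusted text in the format position.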

rpc.statd (port 111/udp)

  • The initial advisory for this vulnerability was reported in August 2000 — four months prior to the Ramen worm.
  • The rpc.statd program passes user input data to the syslog() function as a format string. This means a user can create a string that puts executable code into the process stack and overwrites the return address, forcing the code to be executed. The developers didn't use a format specifier when they called syslog().
  • This was made worse by the fact that rpc.statd maintains root access privileges even though those privileges are only needed for it to initially open its network socket. This means that whatever code the attacker injects is run with root (administrator) privileges.

LPRng (port 515/tcp)

  • The initial advisory for this vulnerability was released on December 12, 2000, only one month prior to the Ramen worm.
  • The LPRng software accepts user input that is later passed to syslog() as the format string for a function call to snprintf(). This particular instance of the format string vulnerability can allow users who have remote access to the printer port to overwrite addresses in the printing service's address space. This can cause segmentation violations that lead to denial of printing services, or it can be used to execute arbitrary code injected by other methods into the memory stack of the printer service.
  • All the incidents were caused by the lack of a format specifier when syslog() was called.
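The fix for the syslog() cases is a one-line change, sketched below. The helper name log_untrusted is hypothetical, and syslog() is assumed to be available through the POSIX <syslog.h> header.

```c
#include <syslog.h>

/* Hypothetical helper: records untrusted text in the system log. */
void log_untrusted(const char *text)
{
    /* Constant format string: text is only ever treated as data, so any
     * '%' characters inside it are logged literally. */
    syslog(LOG_INFO, "%s", text);

    /* The pattern behind the advisories above would be:
     *     syslog(LOG_INFO, text);
     * which lets the logged text act as the format string. */
}
```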

Test Techniques

  • Out of the list of formatting functions, sprintf and vsprintf deserve special care and attention because they print formatted data to a buffer without any length checking. The fix for this vulnerability is to always use a format specifier to format data.

Black Box

  • In Black-Box testing, it becomes important to include some of the formatting parameters in all input fields. These should include the following, but more can be used:
    • %x
    • %s
    • %n
  • Then watch all output and the program itself for unusual output or behavior. You can also try testing with several %s parameters embedded in the input string. If a format string vulnerability exists, the resulting behavior is likely to be an access violation or another error that causes the application to crash.
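Generating such probe inputs and checking the response can be sketched as follows. The helper names build_probe and probe_was_interpreted and the "FMT:" marker are hypothetical, and the check only applies to programs that echo their input back; a crash has to be observed separately.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical probe builder: fills buf with a marker followed by `count`
 * copies of the given directive, e.g. "FMT:%x%x%x".
 * Returns 0 on success, -1 if buf is too small. */
int build_probe(char *buf, size_t size, const char *directive, int count)
{
    size_t used;
    int n = snprintf(buf, size, "FMT:");

    if (n < 0 || (size_t)n >= size) return -1;
    used = (size_t)n;
    for (int i = 0; i < count; i++) {
        /* The directive is appended as data, so it is copied verbatim. */
        n = snprintf(buf + used, size - used, "%s", directive);
        if (n < 0 || (size_t)n >= size - used) return -1;
        used += (size_t)n;
    }
    return 0;
}

/* Returns 1 if the program's output does not contain the probe verbatim,
 * which suggests the directives were interpreted rather than echoed. */
int probe_was_interpreted(const char *probe, const char *output)
{
    return strstr(output, probe) == NULL;
}
```

A probe built from %x is the gentlest starting point, since it only reads values; probes built from %s or %n are far more likely to crash a vulnerable target.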

White Box

  • White-Box testers have a relatively easy time finding these vulnerabilities. The code base needs to be searched for any use of the C format string functions, and each time they are used, there must be an explicit format string in place to ensure that the data is correctly interpreted.
  • There are some additional functions that use formatted output, which must also be scrutinized, and are fairly OS specific. These include functions such as syslog().
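Project-specific wrappers around the format functions are easy to miss in such a search. With GCC or Clang, one option is to mark them with the format attribute so the compiler checks their callers the same way it checks printf() calls. The wrapper name log_line below is hypothetical.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical logging wrapper. The GCC/Clang format attribute states that
 * argument 3 is a printf-style format string and that the variadic
 * arguments start at position 4, so calls are checked like printf calls. */
#if defined(__GNUC__)
__attribute__((format(printf, 3, 4)))
#endif
int log_line(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return n;
}
```

When compiled with -Wformat -Wformat-security, a call such as log_line(buf, size, user_input) should then draw a warning that the format is not a string literal, while log_line(buf, size, "%s", user_input) compiles cleanly.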


  • There are a number of potentially useful free tools that can provide additional checks of the C code. Although automated tools can add significant value to test efforts, they do not take the place of security reviews and dedicated security testing. They also have a tendency to generate a large number of false positives and thus a great deal of noise.
  1. Flawfinder: This is an open source tool that does security scans of C code. It can be obtained from
  2. ITS4 Security Scanner: This is a freeware tool from Cigital, Inc., that scans for potentially dangerous function calls in C code (including format string calls). It can be obtained from
  3. Pscan: This is an open source tool that scans C code for potentially dangerous function calls. It can be obtained from
  4. Rough Auditing Tool for Security (RATS): This is an open source code analysis tool designed to check C source code for potentially dangerous function calls (including format string calls). It can be obtained from
  5. Smatch: This is an open source tool that scans C and C++ code for known bugs and potential security defects but is mostly focused on checking the Linux kernel code. It can be obtained from http://smatch•
  6. Splint: This is an open source code tool that scans C code for potential vulnerabilities and dangerous programming practices and can be obtained from