Linux4Dummies: Preventing HTML form tampering

Introduction

Most web applications rely on HTML forms to receive input from the user.

However, HTML forms have one large weakness: users can save the form to a file, edit it, then use the edited version to submit data back to the server.

This security problem is made worse by the "stateless" nature of web applications. HTTP transactions are connectionless, one-time transmissions. For applications that lead a user through a series of input forms, the application must temporarily store field data entered on previous pages.

Developers have two choices for storing this "state" information: on the server side or on the client side.

Most developers find it easiest to store this "state" information on the client side and have it sent back with each transaction. State information can be stored in a browser in three ways:

Browser "cookies"
Values added to the URL
HTML form "hidden" fields

Each of these methods has advantages and disadvantages but, as we shall demonstrate, all of them can be changed by the end user.

This paper describes the problem in detail and provides example code to protect web application state data against tampering. We demonstrate how easy it is to tamper with hidden field values and show one method a web application can use to detect it.

The example web applications shown here are written in Perl using the popular CGI.pm library (http://search.cpan.org/dist/CGI.pm/). However, the issues and solution presented apply to any language used for web development: Java, PHP, ASP, Dot Net and so on.

Hidden fields

Arguably the most common method of storing state information is to use HTML "hidden" fields... fields embedded in HTML that the browser doesn't display.

"Hidden" fields are an easy choice for preserving state information in a web application. As the user proceeds through each input screen, you can use "hidden" fields to store the information already collected so it is sent back to the server as part of the next transaction.

Example: log in and change address

Here's a simple web application that allows a user to update their mailing address.

The application displays two pages:

A form for mailing address information
A page indicating the result of the transaction

unsafe-form.pl Perl 5 CGI script (3k)

(We're cheating a little here by omitting a user login page, plus the script doesn't actually do anything with the form data, but you get the idea).

Run the above script in your browser and view the HTML source code. The form is using "hidden" fields to store information from the login process:

<input type="hidden" name="userid" value="ktrout">
<input type="hidden" name="credit_ok" value="1">
<input type="hidden" name="form_expires" value="20001001:12:45:20">

When you fill in some information and press the submit button, a confirmation screen uses the value of the "userid" hidden field, read directly from the HTML form.

Tampering with the form

Use your browser's "Save as" feature to save the HTML of the change address form to your computer. The complete HTML, including the values in the "hidden" fields, is saved.

Open the HTML file with a text editor and change the userid field and save the file. You can alter the other "hidden" fields as well.

Open the file in your web browser and submit the form. Because the application gives complete trust to the contents of the "hidden" fields, any user knowledgeable enough to save and load the HTML is able to alter the user ID field and change the address of any user they like.
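To make the problem concrete, here is a minimal sketch of a handler that trusts hidden fields. It is written in Python purely for illustration and is not the code of unsafe-form.pl; the function name handle_change_address, the "address" field and the user ID "someone_else" are all assumptions.

```python
from urllib.parse import parse_qs, urlencode

def handle_change_address(post_body: str) -> str:
    """Hypothetical handler that trusts every submitted field, hidden or not."""
    fields = {name: values[0] for name, values in parse_qs(post_body).items()}
    # Nothing here checks that 'userid' is the value the server sent out.
    return f"Address for {fields['userid']} changed to {fields['address']}"

# What the unmodified form would submit.
honest = urlencode({"userid": "ktrout", "credit_ok": "1", "address": "1 Main St"})

# What the saved-and-edited form submits: same fields, different userid.
forged = urlencode({"userid": "someone_else", "credit_ok": "1", "address": "1 Main St"})

print(handle_change_address(forged))
# prints: Address for someone_else changed to 1 Main St
```

The server cannot tell the two submissions apart, because both are just name/value pairs chosen by the client.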

What about HTTP_REFERER?

Experienced web programmers may be thinking this type of tampering can be prevented by checking the HTTP_REFERER variable.

Most browsers send an HTTP request header named "Referer" (yes, it's really spelled that way), which CGI programs see as the HTTP_REFERER environment variable. It contains the URL of the page the user viewed before the current one (in the CGI.pm module this value is available from the referer() function).

For self-referring web applications such as our demo app, HTTP_REFERER will contain the URL of the application itself. If the user saves the form to their computer and resubmits it, HTTP_REFERER is blank or contains a different URL.

This is not a safe method to validate a form against tampering. Just like form fields, the value of HTTP_REFERER is set by the web browser. A user with only a little knowledge can write a script (in Perl or another language) to spoof this header along with the contents of the form.

Checking HTTP_REFERER will catch trivial attempts to tamper with forms, but cannot be relied on for serious web applications.
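As a rough illustration of how little effort spoofing takes, the Python sketch below builds (but does not send) a POST request carrying a forged Referer header. The URL and the field values are made up for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical address of the application being targeted.
APP_URL = "http://www.example.com/cgi-bin/unsafe-form.pl"

# Tampered form data, encoded the same way a browser would encode it.
body = urlencode({"userid": "someone_else", "credit_ok": "1"}).encode("ascii")

# The client picks the Referer value, so it can simply claim to come
# from the application's own page.
request = Request(APP_URL, data=body, headers={"Referer": APP_URL})

print(request.get_method())           # POST, because a body is attached
print(request.get_header("Referer"))  # the forged value

# urllib.request.urlopen(request) would send it; left out on purpose.
```

From the server's side this request looks exactly like one submitted from the genuine form.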

A general rule when developing web applications is that anything sent back by a web browser (form fields, HTTP headers, and even cookies) can be tampered with and must be treated as untrustworthy information.

Smashing cookies for fun and profit

Some languages such as PHP and Java make it easy to store session data on the web server (e.g. in temporary files, in memory, or temporary SQL tables). Since the data is not held by the web browser, altering form field data is not as easy.

However, the server-side data must be referenced by a client-side identifier, usually in the form of a random session ID stored in a cookie, appended to the application URL, or (rarely) stored in a hidden form field.
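A minimal sketch of this arrangement in Python, assuming an in-memory dictionary as the session store (a real application would more likely use files or a database). The names SESSIONS, create_session and load_session are hypothetical.

```python
import secrets

# Server-side store: session ID -> state. Only the ID goes to the browser.
SESSIONS = {}

def create_session(userid: str) -> str:
    """Store state on the server and return a hard-to-guess identifier."""
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = {"userid": userid, "credit_ok": True}
    return session_id

def load_session(session_id: str):
    """Return the stored state, or None for an unknown identifier."""
    return SESSIONS.get(session_id)

sid = create_session("ktrout")
print(load_session(sid)["userid"])    # ktrout
print(load_session("made-up-value"))  # None
```

The state itself never leaves the server, but the identifier still travels through the client and can be altered or replaced there.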

URLs and hidden form fields are easy to alter. Cookies are less so, but still cannot be considered secure in any way.

Cookies are simply small chunks of data stored by the browser and come in two flavours:

Transient (or session) cookies: Ones without an expiry date.

Persistent cookies: Ones with an expiry date set in the future.

Persistent cookies are stored by the web browser on disk. MS Internet Explorer, for example, stores each persistent cookie as a unique file in the user's "Application Data\cookies" folder. Netscape and Mozilla-based browsers store all cookies in one file named "cookies.txt". Since they are stored in files, persistent cookies can be altered by the user using a text editor.

Transient cookies are usually kept only in memory and erased when the browser window closes, making them more difficult to alter. However, several HTTP proxies are available that allow users to view and modify transient cookies (for example "Burp Proxy" http://portswigger.net/proxy/), so they must not be considered secure.

Detecting tampering with digest algorithms

If all session input from a web browser can be altered and is untrustworthy, how can a web application detect tampering?

One way is to use message digest algorithms. Digest algorithms compute a short, fixed-length "signature" string from any given input data, such that it is practically impossible to alter the original data so that it still produces the same signature. Digests are used in SSL browser connections, Virtual Private Networks (VPNs) and Public Key Infrastructure (PKI) systems to "sign" data. The same technique can be used in a web app to "sign" hidden fields.

The most common digest algorithm is Message Digest 5 (MD5), but that is now considered weak. Better algorithms exist such as SHA1 and RIPEMD.
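A quick way to see a digest at work is to hash two nearly identical strings. This sketch uses SHA1 from Python's hashlib module; the input strings are just illustrative field data.

```python
import hashlib

original = b"userid=ktrout&credit_ok=1"
altered = b"userid=ktrout&credit_ok=0"   # one character changed

digest_original = hashlib.sha1(original).hexdigest()
digest_altered = hashlib.sha1(altered).hexdigest()

print(digest_original)
print(digest_altered)

# The same input always gives the same digest; a one-character change
# gives a different one.
print(digest_original == hashlib.sha1(original).hexdigest())  # True
print(digest_original == digest_altered)                      # False
```

Both digests are 40 hexadecimal characters long regardless of the input size.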

Digests and HTML forms

Applied to HTML forms, you can use MD5 or another algorithm to create a fingerprint of hidden field data. You could simply concatenate the values of all the hidden fields into a string, run it through a digest algorithm to get a fingerprint and send it out in another hidden field. When the form is submitted by the user, the hidden fields returned could be fingerprinted again and compared to the original fingerprint.

But wait! If the user knows you're using MD5, it's possible for them to alter the hidden fields and generate a new fingerprint from the form data.
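The weakness is easy to show. In this Python sketch the "server" fingerprints the hidden fields with plain MD5 and no secret; the way the fields are joined, and the names fingerprint and server_accepts, are assumptions made for the example.

```python
import hashlib

def fingerprint(fields: dict) -> str:
    """Unkeyed fingerprint: anyone who knows the recipe can compute it."""
    joined = "&".join(f"{name}={fields[name]}" for name in sorted(fields))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

def server_accepts(fields: dict, submitted_fingerprint: str) -> bool:
    return fingerprint(fields) == submitted_fingerprint

sent_by_server = {"userid": "ktrout", "credit_ok": "1"}

# The attacker edits a field, then simply recomputes the fingerprint.
forged = dict(sent_by_server, userid="someone_else")
forged_fingerprint = fingerprint(forged)

print(server_accepts(forged, forged_fingerprint))  # True: tampering goes unnoticed
```

A fingerprint only helps if the user cannot recompute it, which is where the secret component comes in.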

Fortunately, digest algorithms are designed to make it infeasible to determine the contents of a message from its fingerprint. We can use this to also add a secret component to the fingerprint that the user never sees. The user may be able to compute the fingerprint of the "hidden" form fields, but without also knowing the secret component, they cannot compute the correct fingerprint.

By adding a secret component to a digest, we're actually constructing what cryptographers call a "message authentication code" (MAC). The standard way to do this is via a function called HMAC.

The HMAC standard

The standard way to use a digest for message authentication is HMAC (RFC 2104). This algorithm combines the text to be authenticated with two padded variants of a single secret key, in two nested passes through a digest such as MD5 or SHA1.
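For the curious, the RFC 2104 construction can be sketched by hand in a few lines of Python and checked against the standard library's hmac module. This is for illustration only; in real code use the library function. The key and message values are made up.

```python
import hashlib
import hmac

def hmac_sha1_by_hand(key: bytes, message: bytes) -> bytes:
    """HMAC as described in RFC 2104, using SHA1 as the digest."""
    block_size = 64                       # SHA1 works on 64-byte blocks
    if len(key) > block_size:
        key = hashlib.sha1(key).digest()  # long keys are hashed first
    key = key.ljust(block_size, b"\x00")  # then padded with zero bytes

    inner_key = bytes(b ^ 0x36 for b in key)   # key XOR ipad
    outer_key = bytes(b ^ 0x5C for b in key)   # key XOR opad

    inner = hashlib.sha1(inner_key + message).digest()   # first pass
    return hashlib.sha1(outer_key + inner).digest()      # second pass

key = b"server-side secret"    # hypothetical key
message = b"userid=ktrout"

by_hand = hmac_sha1_by_hand(key, message)
from_library = hmac.new(key, message, hashlib.sha1).digest()
print(by_hand == from_library)  # True
```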

HMAC functions are either built into or available as an add-on for most programming languages:

Language	Function or class	Source
Perl	Digest::HMAC	http://search.cpan.org/
PHP	mhash()	http://mhash.sourceforge.net/
Java	javax.crypto.Mac (keys from KeyGenerator)	http://java.sun.com/products/jce/
MS Dot Net	HMACSHA1	Dot Net Framework 1.1

Example: a tamper-resistant form

Here's our simple "change address" application again but this time we add a SHA1 HMAC "signature" to the form.

safe-form.pl Perl 5 CGI script (3k)

Run the above script in your browser and view the HTML source code. The form has a new "hidden" field named "signature" to store a SHA1 HMAC:

<input type="hidden" name="userid" value="ktrout">
<input type="hidden" name="credit_ok" value="1">
<input type="hidden" name="form_expires" value="20051001:12:45:20">
<input type="hidden" name="signature" value="YJSG2/fXQRSsvLdDXJpjF/xLLYo">

The fingerprint was generated using the names and values of the three hidden fields, plus a secret key stored only on the server.

When you submit the form, the contents of the three hidden fields are combined and an HMAC using the secret key is generated. If it doesn't match the "signature" field sent with the original form, the fields have been tampered with.

Try saving the form to your computer and editing the "userid" field again. This time the tampering is detected.
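The sign-and-verify steps can be sketched in Python as follows. This is not the code of safe-form.pl: the key, the way field names and values are joined, and the function names sign_fields and verify_fields are assumptions, and a real application would have to make sure field values cannot contain the separator.

```python
import base64
import hashlib
import hmac

# Hypothetical secret; it lives only on the server and is never sent out.
SECRET_KEY = b"replace-with-a-long-random-secret"

def sign_fields(fields: dict) -> str:
    """Return a base64 SHA1 HMAC over the hidden field names and values."""
    message = "\n".join(f"{name}={fields[name]}" for name in sorted(fields))
    mac = hmac.new(SECRET_KEY, message.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(mac).decode("ascii").rstrip("=")

def verify_fields(fields: dict, signature: str) -> bool:
    """Recompute the HMAC from the returned fields and compare."""
    expected = sign_fields(fields).encode("utf-8")
    # compare_digest avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature.encode("utf-8"))

hidden = {"userid": "ktrout", "credit_ok": "1", "form_expires": "20051001:12:45:20"}
signature = sign_fields(hidden)          # sent out in the "signature" field

print(verify_fields(hidden, signature))  # True: fields came back unchanged

tampered = dict(hidden, userid="someone_else")
print(verify_fields(tampered, signature))  # False: tampering detected
```

Without the secret key, the user has no way to produce a signature that matches the edited fields.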

This technique can also be used with web applications that store the actual data on the server and use a client-side cookie to store a session key.
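For the cookie case, one possible approach is to append an HMAC to the session key and check it before looking anything up. The sketch below is in Python; the cookie layout (session key, a dot, then the HMAC in hex) and all names are assumptions made for the example.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-long-random-secret"   # hypothetical server secret

def _mac(session_id: str) -> str:
    return hmac.new(SECRET_KEY, session_id.encode("utf-8"), hashlib.sha1).hexdigest()

def make_cookie_value(session_id: str) -> str:
    """Value to store in the cookie: the session key plus its HMAC."""
    return f"{session_id}.{_mac(session_id)}"

def read_cookie_value(cookie_value: str):
    """Return the session key if the HMAC checks out, otherwise None."""
    session_id, separator, mac = cookie_value.rpartition(".")
    if not separator:
        return None
    expected = _mac(session_id).encode("utf-8")
    if hmac.compare_digest(expected, mac.encode("utf-8")):
        return session_id
    return None

cookie = make_cookie_value("a1b2c3d4")
print(read_cookie_value(cookie))                                   # a1b2c3d4
print(read_cookie_value(cookie.replace("a1b2c3d4", "ffffffff")))   # None
```

An altered session key is rejected before the server-side store is ever consulted.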

The solution to tampering?

This method can detect changes to fields and cookies, but it is not foolproof.

The biggest weakness of this method is that it relies on a "secret key" stored on the web server. If the server is broken into or someone manages to view the source code of the application, the method is no longer secure.

This is a weakness of all "shared key" encryption methods and is difficult to solve. However, assuming your web server is reasonably secure from break-ins and you change the key regularly, it may be secure enough. Certainly this method is an improvement over trusting "hidden" fields or using HTTP_REFERER checks.
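Changing the key regularly raises a practical question: forms signed just before the change would suddenly fail. One possible way around this, sketched here in Python under the assumption that the server keeps a short list of recent keys, is to sign with the newest key and accept signatures made with any key still on the list. The key values and function names are hypothetical.

```python
import hashlib
import hmac

# Newest key first. Old keys are dropped from the end once forms signed
# with them have expired.
KEYS = [b"key-in-use-this-month", b"key-used-last-month"]

def sign(message: bytes) -> str:
    """Always sign with the newest key."""
    return hmac.new(KEYS[0], message, hashlib.sha1).hexdigest()

def verify(message: bytes, signature: str) -> bool:
    """Accept a signature made with any key still on the list."""
    candidate = signature.encode("utf-8")
    for key in KEYS:
        expected = hmac.new(key, message, hashlib.sha1).hexdigest().encode("utf-8")
        if hmac.compare_digest(expected, candidate):
            return True
    return False

message = b"userid=ktrout"
old_signature = hmac.new(KEYS[1], message, hashlib.sha1).hexdigest()

print(verify(message, sign(message)))   # True: signed with the current key
print(verify(message, old_signature))   # True: signed before the key change
print(verify(b"userid=someone_else", old_signature))  # False
```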


Tuesday, May 8, 2012
