Validation

From Hashphp.org
Revision as of 21:56, 27 July 2011 by Caffinated (Talk | contribs) (What Is an SQL Injection Attack?)

Jump to: navigation, search

Validating User Input

As many of you might be aware, one of the staples of any web application security is to make sure that data passed to you from the user won't break your application or otherwise damage your data. Just like you'd check to see who is at the door of your house before you let them in, you'd check user supplied data to see if it is acceptable for whatever you intend to use it for.

Another reason to validate input is simply because you can provide more intuative responses to the user. They may hit a letter by accident while typing in a number. It is far more clear to the user if you inform them that they've made a mistake and allow them to correct it, rather than have the application break, or worse, have the application continue with an unexpected result.

What Could Possibly Happen If I Don't?

Well, it largely depends on what your application is doing. It could be as simple as throwing a big ugly error, but it could also be a lot worse. Take the following highly insecure code example:

 

This piece of code seems simple enough. Open a connection to 'myserver.com', and select a row from 'mytable'. What could possibly go wrong? As it turns out, quite a lot. This particular piece of code has two very distinct problems which we'll look at.

First off, we're not validating the user input at all. This is a problem in cases where the program expects a certain kind of input and we don't explicitly check for that input. I've purposefully labeled the columns "int_col" and "string_col" to let you know what types they are in the table. Notice that one is specifically an integer. Now, if the user enters a non-integer value in to that column, the query will error.

So, how do I fix it?

The best way to fix this sort of validation issue is to use php's filter library. This library is designed to help you check and sanitize user input. For example, we could change our application to use filter_var() and it might look like this:

 

Now if we try to enter anything but a number for $_GET['int_col'] it won't get to the point where it asks MySQL for data. These kinds of validations are perfect for form data, and can allow you to decide to render the form again with errors to inform the user that they have made a mistake, and to try again. However, we're not quite finished here just yet. There is a remaining problem. The dreaded SQL injection.

What Is an SQL Injection Attack?

An SQL injection attack is when a user injects SQL commands in to an unprotected SQL query. This can lead to a number of issues, including modifying rows you didn't intend the user to modify, dropped tables, deleted rows, and access to possibly sensitive data. For example, if I send: "some_data' AND user_id=5 -- " in $_GET['string_col'] and 1 in $_GET['int_col'] the query ends up looking something like this:

 

Notice the double dash. This is a mysql comment token, and it will cause everything after it to be ignored. To mysql, the query now looks like this:

 

Which is clearly now ignoring our $user_id variable in favor of what the attacker has chosen.

How to Deal With Injection Attacks

So, we started off protecting our integer value, but what about the string value? Technically, a string could be any sequence of characters. If your program needs to allow them to be anything, we still have to protect our code. So how do we do this?

As with many things, there are several ways to deal with this problem depending on what database library you're using. To start off, we'll look at the standard mysql library we've been using in our previous examples. The best way to avoid injection attacks when dealing with the standard mysql library, is to escape your parameters. We can do this with mysql_real_escape_string().

For example:

 

This stops the injection attack! Hurray! But what if there were an easier, better way to do all this? The good news is, there is! PHP comes with another database library to deal with MySQL. It is called PHP Data Objects or PDO for short. This library can use drivers for many different database types, and supports a very important feature known as prepared statements, sometimes also known as parameterized queries.

So what do these prepared statements do? Well, quite a lot. They allow us to design our query ahead of time and to put placeholders in the areas where our user supplied data will go. Then when we ask the library to inject the values, it automatically escapes them for us. Consider this example:

 

Not only is this method much cleaner, but we can't forget to escape our parameters because PDO is kind enough to do it for us. Another advantage of prepared statements is that you can use the prepared statement over and over by simply replacing the parameters and executing it again. Many servers detect that you're doing this, and even make the query faster by performing server-side optimizations to get your data faster.