Sanitizing user's data in GET by PHP

Solution 1

How do you sanitize data in $_GET -variables by PHP?

You do not sanitize data in $_GET. This is a common approach in PHP scripts, but it's completely wrong*.

All your variables should stay in plain text form until the point when you embed them in another type of string. There is no one form of escaping or ‘sanitization’ that can cover all possible types of string you might be embedding your values into.

So if you're embedding a string into an SQL query, you need to escape it on the way out:

$sql= "SELECT * FROM accounts WHERE username='".pg_escape_string($_GET['username'])."'";

And if you're spitting the string out into HTML, you need to escape it then:

Cannot log in as <?php echo(htmlspecialchars($_GET['username'], ENT_QUOTES)) ?>.

If you did both of these escaping steps on the $_GET array at the start, as recommended by people who don't know what they're doing:

$_GET['username']= htmlspecialchars(pg_escape_string($_GET['username']));

Then when you had a ‘&’ in your username, it would mysteriously turn into ‘&amp;’ in your database, and if you had an apostrophe in your username, it would turn into two apostrophes on the page. And when you have a form with these characters in it, it is easy to end up double-escaping things when they're edited, which is why so many bad PHP CMSs end up with broken article titles like “New books from O\\\\\\\\\\\\\\\\\\\'Reilly”.
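
To make the pile-up concrete, here is a minimal sketch that runs without a database. The sample values are made up, and addslashes() is only a stand-in for any backslash-style escaper; it is not what the original example calls.

```php
<?php
// Hypothetical sample values, not from the original answer.
$name = "Tom & Jerry's";

// Escaping once, at output time, is correct.
$once = htmlspecialchars($name, ENT_QUOTES);   // Tom &amp; Jerry&#039;s

// Escaping a value that was already escaped (say, on input) corrupts it.
$twice = htmlspecialchars($once, ENT_QUOTES);  // Tom &amp;amp; Jerry&amp;#039;s

// Backslash-style escapers compound the same way every time a value is
// re-submitted through an edit form and escaped again.
$title = "O'Reilly";
$slashedOnce  = addslashes($title);            // O\'Reilly
$slashedTwice = addslashes($slashedOnce);      // O\\\'Reilly

echo $once, "\n", $twice, "\n", $slashedOnce, "\n", $slashedTwice, "\n";
```

Each extra round of escaping treats the previous round's output as plain text, which is why the damage grows with every edit.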

Naturally, remembering to pg_escape_string or mysql_real_escape_string, and htmlspecialchars every time you send a variable out is a bit tedious, which is why everyone wants to do it (incorrectly) in one place at the start of the script. For HTML output, you can at least save some typing by defining a function with a short name that does echo(htmlspecialchars(...)).
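
Such a helper could look roughly like this. The name h() and the UTF-8 charset are assumptions for illustration, not something the original specifies.

```php
<?php
// Hypothetical helper: the name h() is an assumption, not from the original.
// Echoes a value escaped for HTML text and quoted attribute values.
function h($value)
{
    echo htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// Usage inside a template:
//   Cannot log in as <?php h($_GET['username']) ?>.
```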

For SQL, you're better off using parameterised queries. For Postgres there's pg_query_params. Or indeed, prepared statements as you mentioned (though I personally find them less manageable). Either way, you can then forget about ‘sanitizing’ or escaping for SQL, but you must still escape if you embed in other types of string including HTML.
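
A minimal sketch of the idea follows. So that it runs on its own, it uses PDO with an in-memory SQLite database as a stand-in rather than Postgres; the table, the row and the variable names are hypothetical. The rough pg_query_params form is shown in a comment and assumes an already open connection in $conn.

```php
<?php
// Stand-in setup: in-memory SQLite via PDO, so the sketch is self-contained.
// The table and its single row are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE accounts (username TEXT)");
$db->exec("INSERT INTO accounts (username) VALUES ('O''Reilly')");

// In a real script this would be $_GET['username'], left in plain form.
$username = "O'Reilly";

// The value travels separately from the SQL text, so no escaping is needed.
$stmt = $db->prepare('SELECT username FROM accounts WHERE username = ?');
$stmt->execute([$username]);
$rows = $stmt->fetchAll(PDO::FETCH_COLUMN);

// Rough Postgres equivalent (assumes an open connection in $conn):
// $result = pg_query_params($conn,
//     'SELECT * FROM accounts WHERE username = $1', array($username));
```

The apostrophe in the value needs no special handling here, and the value is still plain text when you later pass it to htmlspecialchars() for output.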

strip_tags() is not a good way of treating input for HTML display. In the past it has had security problems, as browser parsers are actually much more complicated in their interpretation of what a tag can be than you might think. htmlspecialchars() is almost always the right thing to use instead, so that if someone types a less-than sign they'll actually get a literal less-than sign and not find half their text mysteriously vanishing.

(*: as a general approach to solving injection problems, anyway. Naturally there are domain-specific checks it is worth doing on particular fields, and there are useful cleanup tasks you can do like removing all control characters from submitted values. But this is not what most PHP coders mean by sanitization.)
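
As an illustration of that kind of cleanup, here is a small sketch that strips control characters. The function name is hypothetical, and keeping tab, line feed and carriage return is an assumption about what a typical text field should allow. This is tidying, not injection protection; the value still needs escaping wherever it is embedded.

```php
<?php
// Hypothetical helper name. Removes ASCII control characters but keeps
// tab (\x09), line feed (\x0A) and carriage return (\x0D).
function strip_control_chars($value)
{
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/', '', $value);
}
```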

Solution 2

If you're talking about sanitizing output, I would recommend storing content in your database in its full, unescaped form, and then escaping it (htmlspecialchars or something) when you are echoing out the data. That way you have more options for outputting. See this question for a discussion of sanitising/escaping database content.

In terms of storing in Postgres, use pg_escape_string on each variable in the query to escape quotes and generally protect against SQL injection.

Edit:

My usual steps for storing data in a database, and then retrieving it, are:

  1. Call the database's escaping function (pg_escape_string, mysql_escape_string, etc.) on each incoming $_GET variable used in your query. Note that using these functions instead of addslashes means no extra slashes end up in the text stored in the database.

  2. When you get the data back out of the database, you can just use htmlspecialchars on any outputted data. There is no need to use stripslashes, since there should be no extra slashes.

Solution 3

You must sanitize all requests, GET as well as POST.

You can use the function htmlentities(), the function preg_replace() with regex, or filter by cast:

<?php
// Casting forces the value to an integer; non-numeric input becomes 0.
$id = (int)$_GET['id'];
?>
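
One thing to keep in mind is that a cast never rejects anything; it silently coerces. If you would rather detect bad input, a sketch along these lines may help. The function name parse_id() is hypothetical, and FILTER_VALIDATE_INT is used here as an alternative to the cast, not something the answer above prescribes.

```php
<?php
// Hypothetical helper: returns the integer, or null when the whole
// string is not a valid integer.
function parse_id($raw)
{
    $id = filter_var($raw, FILTER_VALIDATE_INT);
    return $id === false ? null : $id;
}

// For comparison, with inputs standing in for $_GET['id']:
//   (int) '42abc'      gives 42   (leading digits are kept)
//   (int) 'abc'        gives 0
//   parse_id('42abc')  gives null
//   parse_id('42')     gives 42
```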

Solution 4

Sanitize your inputs according to where they are going.

  • If you display it (on a page or as an input field's value), use htmlspecialchars and/or str_replace.
  • If you use it as another type, cast it.
  • If you include it in an SQL query, escape it using the appropriate function. You can also strip HTML tags if you want them removed entirely (which is not the same as escaping them).

Same for POST or even data from your DB, since the data inside your DB should generally not be escaped.

Two things you should check:

  1. Encoding of your input vs. your PHP scripts / output / DB table
  2. If you have magic_quotes_gpc enabled, you should either disable it (whenever you can) or call stripslashes() on GET, POST and COOKIE values. magic_quotes_gpc is deprecated (and was removed entirely in PHP 5.4), so sanitize the data you manipulate yourself, depending on how that data is used.

Solution 5

Use PHP's native filter_var() function with the FILTER_SANITIZE_STRING filter. Note that this filter is deprecated as of PHP 8.1.

Example: https://www.w3schools.com/php/filter_sanitize_string.asp

Author: Michal aka Miki

Updated on November 22, 2021

Comments

  • Michal aka Miki, over 2 years ago

    How do you sanitize data in $_GET -variables by PHP?

    I sanitize only one variable in GET with strip_tags. I am not sure whether I should sanitize everything or not, because the last time I put data into Postgres, the problem was most easily solved by using pg_prepare.