What are the best practices for avoiding XSS attacks in a PHP site?


Solution 1

Escaping input is not the best way to prevent XSS; output must be escaped as well. If you use the Smarty template engine, you can use the |escape:'htmlall' modifier to convert all sensitive characters to HTML entities (I use my own |e modifier, which is an alias for the above).

My approach to input/output security is:

  • store user input unmodified (no HTML escaping on input, only DB-aware escaping done via PDO prepared statements)
  • escape on output, depending on which output format you use (e.g. HTML and JSON need different escaping rules)
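
As a rough illustration of "escape on output, per format", the sketch below takes one raw value and renders it two ways. The helper names `escape_html` and `escape_json_for_script` are made up for this example; the flags are the standard ones for PHP's `htmlspecialchars` and `json_encode` (`JSON_THROW_ON_ERROR` assumes PHP 7.3 or later).

```php
<?php
// Hypothetical helpers, named for this example only.

// Escape a raw string for use in HTML text or a quoted attribute value.
function escape_html(string $raw): string
{
    return htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Encode a value as JSON that can be embedded inside a <script> block:
// the JSON_HEX_* flags turn <, >, &, ' and " into \uXXXX sequences.
function escape_json_for_script($value): string
{
    return json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );
}

// The value as it would be stored: exactly what the user typed.
$raw = '</script><script>alert("xss")</script>';

$html = escape_html($raw);            // for an HTML page
$json = escape_json_for_script($raw); // for a JSON response or inline script

echo $html, "\n";
echo $json, "\n";
```

The same stored string yields two different safe representations, which is why escaping it once at input time cannot serve both.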

Solution 2

I'm of the opinion that one shouldn't escape anything during input, only on output, since (most of the time) you cannot assume that you know where that data is going. For example, if you have a form that takes data that later appears in an email you send out, you need different escaping (otherwise a malicious user could rewrite your email headers).

In other words, you can only escape at the very last moment the data is "leaving" your application:

  • Write to XML file, escape for XML
  • Write to DB, escape (for that particular DBMS)
  • Write email, escape for emails
  • etc
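
For the email case, one possible guard is to refuse any header value that contains a line break, since a CR or LF is what lets an attacker start a new header. This is only a sketch: `safe_header_value` is a hypothetical helper, not part of PHP, and a real mailer library may already do this for you.

```php
<?php
// Hypothetical helper: refuse values that could start a new header line.
function safe_header_value(string $raw): string
{
    if (preg_match('/[\r\n]/', $raw)) {
        throw new InvalidArgumentException('Header value must not contain line breaks');
    }
    return $raw;
}

// Attacker-supplied form field trying to smuggle in a Bcc header.
$subject = "Hello\r\nBcc: victim@example.com";

try {
    $headerLine = 'Subject: ' . safe_header_value($subject);
} catch (InvalidArgumentException $e) {
    $headerLine = null; // reject the submission instead of sending the mail
}
```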

In short:

  1. You don't know where your data is going
  2. Data might actually end up in more than one place, each needing a different escaping mechanism, but never both at once
  3. Data escaped for the wrong target is really not nice. (E.g. get an email with the subject "Go to Tommy\'s bar".)

Point 3 in particular will occur if you escape data at the input layer (or you will need to de-escape it again, etc.).
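
The "Tommy\'s bar" problem is easy to reproduce. In this sketch `addslashes` stands in for whatever escaping happens at the input layer (it is roughly what magic_quotes did); the variable names are just for illustration.

```php
<?php
$raw = "Go to Tommy's bar";

// Escaping at the input layer: the stored value is already altered.
$storedEscaped = addslashes($raw);

// Reusing that value as an email subject leaks the backslash to the reader.
$badSubject = $storedEscaped;

// Storing the raw value and escaping per destination avoids this.
$goodSubject = $raw; // plain-text subject: quotes need no escaping
$goodHtml = htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

echo $badSubject, "\n";
echo $goodSubject, "\n";
echo $goodHtml, "\n";
```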

PS: I'll second the advice for not using magic_quotes, those are pure evil!

Solution 3

There are a lot of ways to do XSS (see http://ha.ckers.org/xss.html), and they are very hard to catch.

I personally delegate this to the framework I'm currently using (CodeIgniter, for example). While not perfect, it might catch more than my hand-made routines ever would.

Solution 4

This is a great question.

First, don't escape text on input except to make it safe for storage (such as being put into a database). The reason for this is you want to keep what was input so you can contextually present it in different ways and places. Making changes here can compromise your later presentation.

When you go to present your data, filter out what shouldn't be there. For example, if there is no reason for JavaScript to be there, search for it and remove it. An easy way to do that is to use the strip_tags function and only let through the HTML tags you are allowing.
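
A small sketch of what `strip_tags` with an allow-list does, and one thing it does not do: it leaves attributes on allowed tags untouched, so it should not be the only line of defence. The sample strings are made up.

```php
<?php
$raw = '<b>Hello</b> <script>alert(1)</script><i>world</i>';

// Keep only <b> and <i>. Other tags are removed, but their text content stays.
$filtered = strip_tags($raw, '<b><i>');

// Attributes on an allowed tag pass through unchanged, event handlers included.
$withHandler = strip_tags('<b onclick="alert(1)">Hello</b>', '<b>');

echo $filtered, "\n";
echo $withHandler, "\n";
```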

Next, take what you have and pass it through htmlentities or htmlspecialchars to convert the characters that are special in HTML into entities. Do this based on context and what you want to get out.
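
To show how the two functions differ, here is a minimal comparison on one made-up string (the file is assumed to be saved as UTF-8). Both neutralise the markup; `htmlentities` additionally converts every character that has a named entity, such as the accented letter.

```php
<?php
$raw = '<a href="x">café & bar</a>';

// Escapes only &, <, > and quotes.
$special = htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// Escapes the same characters plus anything else with a named entity.
$entities = htmlentities($raw, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

echo $special, "\n";
echo $entities, "\n";
```

Passing the encoding explicitly matters for both calls, since a mismatch between the declared and actual encoding changes what gets converted.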

I'd also suggest turning off Magic Quotes. It was deprecated in PHP 5.3 and removed in PHP 5.4, and using it is considered bad practice. Details at http://us3.php.net/magic_quotes

For more details check out http://ha.ckers.org/xss.html

This isn't a complete answer, but hopefully it's enough to help you get started.

Solution 5

rikh writes:

I do my best to always call htmlentities() for anything I am outputting that is derived from user input.

See Joel's essay on Making Wrong Code Look Wrong for help with this
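
The essay's idea, loosely adapted to PHP, is to encode safety in names: `us` for an unsafe string straight from the user, `s` for one already escaped for HTML. The helper `s_from_us` and the variable names below are invented for this sketch.

```php
<?php
// Naming convention (an adaptation, not a PHP feature):
//   $us... = unsafe string, straight from the user
//   $s...  = safe string, already escaped for HTML
function s_from_us(string $usValue): string
{
    return htmlentities($usValue, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$usName = '<img src=x onerror=alert(1)>'; // e.g. taken from $_POST['name']
$sName = s_from_us($usName);

echo "<p>Hello, {$sName}</p>\n";   // looks right: only $s... values reach echo
// echo "<p>Hello, {$usName}</p>"; // looks wrong at a glance: a $us... value in output
```

With the convention in place, a missed escape shows up in code review as a `$us` variable next to `echo`, without anyone having to trace where the value came from.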

Author: Rik Heywood

web developer

Updated on July 05, 2022

Comments

  • Rik Heywood
    Rik Heywood almost 2 years

    I have PHP configured so that magic quotes are on and register globals are off.

    I do my best to always call htmlentities() for anything I am outputting that is derived from user input.

    I also occasionally search my database for common things used in XSS attacks, such as...

    <script
    

    What else should I be doing, and how can I make sure that the things I am trying to do are always done?

  • Kornel
    Kornel over 15 years
    htmlentities() is overkill and it's encoding-sensitive. htmlspecialchars() protects just as well.
  • Kornel
    Kornel over 15 years
    SQL does not execute JavaScript. Transforming data to a safe subset common to HTML, SQL, mail, etc. is too limiting and doesn't eliminate risk completely. Proper escaping of HTML output is bulletproof for HTML. For proper SQL escaping use SQL tools!
  • Cheekysoft
    Cheekysoft over 14 years
    htmlspecialchars may not be your friend : stackoverflow.com/questions/110575/…
  • Josiah
    Josiah about 14 years
    I agree wholeheartedly, and I would say that the best template library is xsl.
  • James
    James over 13 years
    Don't use register globals. They make it easy to write insecure code and have been deprecated in the time since this was posted.
  • Alexey Feldgendler
    Alexey Feldgendler over 13 years
    I meant disabling register globals, of course, not enabling. Typo.
  • Casebash
    Casebash over 12 years
    @Cheekysoft: Just set the appropriate flags
  • Airy
    Airy about 10 years
    I think it would be better to escape first and then save to the database, because that way you only have to escape once; if you just store it in the DB and escape every time a user visits the site, it can put a bit more load on the server. And most of the escaping is the same for PHP and Node.js. So better to escape first and then save.
  • Michał Niedźwiedzki
    Michał Niedźwiedzki about 10 years
    @AbdulJabbarWebBestow absolutely not. The database is a place where you store data in an output-agnostic format. Different output devices require different escaping rules, so by escaping for HTML output before hitting the database you lock yourself out of writing APIs, PDF exports, etc. Don't worry about server load. It's their job to be loaded.
  • Airy
    Airy about 10 years
    @MichałRudnicki I am not being ironic, but can you give some examples where we need different escaping? As far as I know, escaping is the same for all. As for database load, it's probably our duty to decrease load as much as possible; otherwise it is going to take down your server. And most of the time, about 75%, it's HTML, which requires the same escaping, and when the time comes for PDF or something else you can use decode functions to reverse it.
  • Scott Arciszewski
    Scott Arciszewski over 8 years
    Bonus round: WordPress got owned by XSS via MySQL column truncation in 2015 thanks to filtering on input, rather than output.
  • Mr Lister
    Mr Lister over 8 years
    @AbdulJabbarWebBestow Quotes " need to be escaped as &quot; for use in HTML, but \" for use in most other languages.
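
Two of the comments above turn on how quotes are handled: which flags `htmlspecialchars` is given, and the fact that HTML and other languages escape quotes differently. A minimal sketch with a made-up value, using only documented flags:

```php
<?php
$raw = "It's a \"quoted\" value";

// ENT_COMPAT escapes double quotes only, which is not enough inside a
// single-quoted HTML attribute.
$compat = htmlspecialchars($raw, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');

// ENT_QUOTES escapes single quotes as well.
$quotes = htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// JSON, like most programming languages, uses a backslash instead of an entity.
$json = json_encode($raw);

echo $compat, "\n";
echo $quotes, "\n";
echo $json, "\n";
```

Reversing HTML escaping to produce the JSON form (or the other way round) adds a decode step on every non-HTML output, which is the cost of escaping before storage.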