Regular expressions to prevent XSS or something else?

Solution 1

Please read the OWASP XSS (Cross Site Scripting) Prevention Cheat Sheet, which covers a broad range of techniques. Blacklisting tags is not an effective approach and will leave gaps. Instead, validate input, sanitize it before it is output to the browser, HTML-encode it, and apply the other techniques the cheat sheet describes.

Solution 2

    // Requires: using System.Text; using System.Text.RegularExpressions;
    // and a reference to System.Web for HttpUtility.UrlDecode.
    public static bool ValidateAntiXSS(string inputParameter)
    {
        // Null or empty input contains nothing to reject.
        if (string.IsNullOrEmpty(inputParameter))
            return true;

        // The following regex covers the JS events and HTML tags mentioned in these links:
        // https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet
        // https://msdn.microsoft.com/en-us/library/ff649310.aspx

        var pattern = new StringBuilder();

        // Checks for JS event handlers and calls, e.g. onKeyUp(), onBlur(), alert() and custom JS functions.
        pattern.Append(@"((alert|on\w+|function\s+\w+)\s*\(\s*(['+\d\w](,?\s*['+\d\w]*)*)*\s*\))");

        // Checks for opening HTML tags, e.g. <script, <embed, <object.
        pattern.Append(@"|(<(script|iframe|embed|frame|frameset|object|img|applet|body|html|style|layer|link|ilayer|meta|bgsound))");

        // URL-decode first so that percent-encoded payloads are also checked; returns true when nothing matches.
        return !Regex.IsMatch(System.Web.HttpUtility.UrlDecode(inputParameter), pattern.ToString(), RegexOptions.IgnoreCase | RegexOptions.Compiled);
    }

Solution 3

You should HTML-encode the string. Use the .NET method

HttpUtility.HtmlEncode(string text)

More details: http://msdn.microsoft.com/en-us/library/73z22y6h.aspx
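As a rough illustration of the encoding approach, here is a minimal sketch. The `EncodingDemo` class and the `RenderComment` helper are hypothetical names made up for this example; only `HttpUtility.HtmlEncode` comes from the answer. It assumes `System.Web` is available (on .NET Framework this means referencing System.Web.dll).

```csharp
using System;
using System.Web; // HttpUtility lives here

public static class EncodingDemo
{
    // Hypothetical helper: places untrusted text inside a paragraph element,
    // encoding it first so the browser treats it as text rather than markup.
    public static string RenderComment(string untrusted)
    {
        return "<p>" + HttpUtility.HtmlEncode(untrusted) + "</p>";
    }

    public static void Main()
    {
        // Prints: <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
        Console.WriteLine(RenderComment("<script>alert(1)</script>"));
    }
}
```

Because the angle brackets are emitted as `&lt;` and `&gt;`, the input is displayed literally instead of being parsed as a tag, so no list of dangerous tags is needed for this output context.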

Solution 4

Blacklisting as sanitization is not effective, as has already been discussed. Think about what happens to your blacklist when someone submits crafted input:

<SCRIPT>
<ScRiPt>
< S C R I P T >
<scr&#00ipt>
<scr<script>ipt> (did you apply the blacklist recursively ;-) )

This is not an exhaustive list of possible attacks, just a few examples of how a blacklist can be defeated. Depending on the browser and on how the filter processes the input, variants like these can still end up being interpreted as a script tag.
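To make the last example concrete, here is a small sketch of a naive single-pass filter. `BlacklistDemo` and `StripScriptOnce` are hypothetical names invented for this illustration, not code from any answer here; the point is only to show why removing a blacklisted tag once is not enough.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BlacklistDemo
{
    // Hypothetical naive filter: deletes every "<script>" it finds,
    // case-insensitively, in a single pass over the input.
    public static string StripScriptOnce(string input)
    {
        return Regex.Replace(input, "<script>", "", RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        // Mixed case is handled by IgnoreCase, so this prints an empty line.
        Console.WriteLine(StripScriptOnce("<ScRiPt>"));

        // Removing the inner tag joins "<scr" and "ipt>", so this prints: <script>
        Console.WriteLine(StripScriptOnce("<scr<script>ipt>"));
    }
}
```

Repeating the replacement until the string stops changing would close this particular gap, but the other variants above would still need their own special cases, which is why encoding on output is the more dependable approach.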

Author by

Andrey

Updated on June 16, 2022

Comments

  • Andrey (almost 2 years)

    I am trying to protect my website from Cross-Site Scripting (XSS) and I'm thinking of using regular expressions to validate user inputs.

    Here is my question: I have a list of dangerous HTML tags...

    <applet>
    <body>
    <embed>
    <frame>
    <script>
    <frameset>
    <html>
    <iframe>
    <img>
    <style>
    <layer>
    <link>
    <ilayer>
    <meta>
    <object>
    

    ...and I want to include them in regular expressions - is this possible? If not, what should I use? Do you have any ideas how to implement something like that?