How do I convert a string into safe SQL String?

66,877

Solution 1

Don't sanitize your strings. Use parameterized queries instead, as they handle all sanitization.

You don't specify which database you are using, so I assume it is MS SQL Server. Microsoft has an article on the official ASP.net website about this. Also see MSDN for SqlCommand.Parameters and the AddWithValue method.

Solution 2

There is only a single character you have to escape: ansi 0x27, aka the single quote:

safeString = unsafeString.Replace("'","''");
Share:
66,877
Diskdrive
Author by

Diskdrive

Updated on September 22, 2020

Comments

  • Diskdrive
    Diskdrive over 3 years

    I'm generating some sql insert statements from a bunch of text files.

    These text files are generally user input data. I would like to sanitize this data so that it's not going to break the insert statement.

    For example, some of the input data, people have used the word Don't. The "'" in don't will lead the sql statement to think the string has ended and therefore cause an error.

    Is there any .NET method I can call to kind of convert all of these characters to either escape codes or safe characters?

  • marc_s
    marc_s about 13 years
    +1 there is no "safe" way to concatenate together SQL statements - using parametrized queries (ALWAYS!) is the only viable way to go
  • Andomar
    Andomar about 13 years
    @marc_s: Parameterized queries are only a partial solution. They don't protect against SQL injection into a dynamic SQL query, javascript injection into a varchar field, parameter injection into an html label, etc. There is no "silver bullet" solution to injection.
  • Martin Smith
    Martin Smith about 13 years
    This assumes that the OP is quoting all values in their insert statement though. If they are generating a string like this INSERT INTO T1(numeric,string) values (1,'some string') then it won't help if they have some unexpected bad data in the first column. (e.g. '1,2); DROP TABLE ...'
  • Michael Stum
    Michael Stum about 13 years
    @Andomar That's a different problem though: Parameterized Queries offer 100% protection against SQL Injection, that is "Injection of unwanted SQL into another SQL Statement". The other side is Output Encoding, aka. HTML Encoding all user data that is output into a website to prevent stuff like XSS. That is an entirely different problem, one that is much harder. I'm not sure what you mean by "dynamic SQL query" - even when building a query with StringBuilders you can easily add parameters? (Parameterized Queries are NOT stored procedures after all, they can be dynamically generated)
  • Andomar
    Andomar about 13 years
    See this example for how dynamic SQL enables SQL Injection, even though you use a .NET parameterized query
  • Michael Stum
    Michael Stum about 13 years
    @Andomar I see. That seems to be a rather unusual way of doing it though: Why would anyone use T-SQL and EXEC if they are already coding in .net? In .net I would just do if(!string.IsNullOrEmpty(custName)){ sb.Append(" AND customerName = @custName"); cmd.Parameters.AddWithValue("@custName",custName); } cmd.CommandText = sb.ToString(); and be done with it.
  • Michael Stum
    Michael Stum about 13 years
    @Andomar I think I understand it now. Indeed, you are right. If your Stored Procedure is dynamically concatenating strings, then parameterized queries won't do much.
  • jhsowter
    jhsowter almost 10 years
    DO sanitise untrusted input because of persistent XSS for one.
  • Michael Stum
    Michael Stum almost 10 years
    @jhsowter Not on the database side. XSS is a much bigger problem and should be handled in an appropriate layer - don't ever put it in the database to begin with, and never render out anything from the database without encoding it.
  • jhsowter
    jhsowter almost 10 years
    @MichaelStum I have thought a bit more and I agree you should validate the string for XSS and then store it if it's safe. That is, validate, but do not sanitise. Let a framework sanitise the string for you.
  • C. Tewalt
    C. Tewalt almost 9 years
    @MichaelStum Unfortunately, you can't use a parameterized query with the "ORDER BY" clause. (At least in Sql Server)
  • Sellorio
    Sellorio over 8 years
    @MartinSmith Then use int.TryParse for the first parameter to ensure its a clean integer.
  • Johan Aspeling
    Johan Aspeling over 7 years
    What about commas in an insert statement?
  • ruffin
    ruffin about 7 years
    @marc_s (& Mr. Stum) I understand that parameterized queries are always preferred, but that's also just offloading the problem -- somebody's writing the code that does that cleanup within the parameterization ;). If for no other reason than for a thought experiment, can you explain why Andormar's answer wouldn't work for string params in, eg, the {0} in SELECT * FROM Table1 WHERE Name='{0}'? Or TryParse for an int? Etc. What makes conventional use cases too complicated for the general dev public?
  • O. R. Mapper
    O. R. Mapper almost 7 years
    The OP says they want to write the INSERT statements into SQL script files. It's not clear from this answer how parametrized queries would work in those. Once the parameter values are written into the script files, they still have to be escaped in some way.
  • Anders Lindén
    Anders Lindén over 6 years
    How do you use parametrerized queries when you want to use thecol like '%foobar%'?
  • Jesús López
    Jesús López about 6 years
    This is enough for me. I included it in my micro ORM (EntityLite) for very large IN predicates, to workaround the 2100 SQL Server max number of parameters. I hope there is no more characters to take care of. I think I was too much paranoid with SQL injection.
  • Jesús López
    Jesús López about 6 years
    @marc_s, There is always (almost always) an exception to every rule. When you use SQL Server and you have more than 2100 elements in an IN or NOT IN predicate, you cannot use parameters, you need to resort to literals.
  • marc_s
    marc_s about 6 years
    @JesúsLópez: if you have more than 2100 elements for an IN clause - you have quite a few other problems to deal with first......
  • Jesús López
    Jesús López about 6 years
    @marc_s. Most of the time we can use a subquery. But in some extreme cases we have no better alternative.
  • marc_s
    marc_s about 6 years
    @JesúsLópez: if you really need 2100+ elements - why aren't those stored in a separate table (possibly a temp table) so you can just easily join - instead of having such a monster of an IN clause?? There's always a better alternative than 2100+ elements in an IN!
  • Jesús López
    Jesús López about 6 years
    @marc_s. Yes, I see this sound weird.But trust me, I saw rare scenarios where It make sense.
  • Jesús López
    Jesús López about 6 years
    @marc_s. The large IN scenario is not the only one you need literals. Think about a bunch of insert, update and delete statements in a single batch. It's very easy to have more than 2100 values is this scenario, in this case you need literals.
  • StayOnTarget
    StayOnTarget about 5 years
    @JohanAspeling a comma within single quotations should not cause a problem. But if it occurred where you did not expect a string then the other comment about TryParse for a numeric value would take care of it
  • Alexandre Daubricourt
    Alexandre Daubricourt almost 5 years
    Lol is this really the only think to take care of ? In this case people really are too paranoid x) Parametrized queries are painful to maintain and don't look nice
  • Chris HG
    Chris HG about 4 years
    This doesn't sanitize the data as per the OP if the sproc is using dynamic SQL. Consider stackoverflow.com/questions/4102387/…