Java - escape string to prevent SQL injection

java sql regex escaping sql-injection

293,271

Solution 1

PreparedStatements are the way to go, because they make SQL injection impossible. Here's a simple example taking the user's input as the parameters:

public insertUser(String name, String email) {
   Connection conn = null;
   PreparedStatement stmt = null;
   try {
      conn = setupTheDatabaseConnectionSomehow();
      stmt = conn.prepareStatement("INSERT INTO person (name, email) values (?, ?)");
      stmt.setString(1, name);
      stmt.setString(2, email);
      stmt.executeUpdate();
   }
   finally {
      try {
         if (stmt != null) { stmt.close(); }
      }
      catch (Exception e) {
         // log this error
      }
      try {
         if (conn != null) { conn.close(); }
      }
      catch (Exception e) {
         // log this error
      }
   }
}

No matter what characters are in name and email, those characters will be placed directly in the database. They won't affect the INSERT statement in any way.

There are different set methods for different data types -- which one you use depends on what your database fields are. For example, if you have an INTEGER column in the database, you should use a setInt method. The PreparedStatement documentation lists all the different methods available for setting and getting data.

Solution 2

The only way to prevent SQL injection is with parameterized SQL. It simply isn't possible to build a filter that's smarter than the people who hack SQL for a living.

So use parameters for all input, updates, and where clauses. Dynamic SQL is simply an open door for hackers, and that includes dynamic SQL in stored procedures. Parameterize, parameterize, parameterize.

Solution 3

If really you can't use Defense Option 1: Prepared Statements (Parameterized Queries) or Defense Option 2: Stored Procedures, don't build your own tool, use the OWASP Enterprise Security API. From the OWASP ESAPI hosted on Google Code:

Don’t write your own security controls! Reinventing the wheel when it comes to developing security controls for every web application or web service leads to wasted time and massive security holes. The OWASP Enterprise Security API (ESAPI) Toolkits help software developers guard against security‐related design and implementation flaws.

For more details, see Preventing SQL Injection in Java and SQL Injection Prevention Cheat Sheet.

Pay a special attention to Defense Option 3: Escaping All User Supplied Input that introduces the OWASP ESAPI project).

Solution 4

(This is in answer to the OP's comment under the original question; I agree completely that PreparedStatement is the tool for this job, not regexes.)

When you say \n, do you mean the sequence \+n or an actual linefeed character? If it's \+n, the task is pretty straightforward:

s = s.replaceAll("['\"\\\\]", "\\\\$0");

To match one backslash in the input, you put four of them in the regex string. To put one backslash in the output, you put four of them in the replacement string. This is assuming you're creating the regexes and replacements in the form of Java String literals. If you create them any other way (e.g., by reading them from a file), you don't have to do all that double-escaping.

If you have a linefeed character in the input and you want to replace it with an escape sequence, you can make a second pass over the input with this:

s = s.replaceAll("\n", "\\\\n");

Or maybe you want two backslashes (I'm not too clear on that):

s = s.replaceAll("\n", "\\\\\\\\n");

Solution 5

PreparedStatements are the way to go in most, but not all cases. Sometimes you will find yourself in a situation where a query, or a part of it, has to be built and stored as a string for later use. Check out the SQL Injection Prevention Cheat Sheet on the OWASP Site for more details and APIs in different programming languages.

View more solutions

293,271

Scott Bonner

I am a computer programmer and graphics designer with an extensive history. to view my exploits visit www.scottbonner.com

Updated on May 29, 2021

Comments

Scott Bonner almost 3 years

I'm trying to put some anti sql injection in place in java and am finding it very difficult to work with the the "replaceAll" string function. Ultimately I need a function that will convert any existing \ to \\, any " to \", any ' to \', and any \n to \\n so that when the string is evaluated by MySQL SQL injections will be blocked.

I've jacked up some code I was working with and all the \\\\\\\\\\\ in the function are making my eyes go nuts. If anyone happens to have an example of this I would greatly appreciate it.
- Scott Bonner over 14 years
  
  Okay, I've come to the conclussion that PreparedStatements are the way to go, however based off current objecctives I need to proceed as was originally planned and just put a filter in place for the time being and once the current milestone is reached I can go back and refactor the database for preparedstatement. In the mean time to maintain momentum, does someone have a solution to effectively escape the above characters for MySQL given the Java and it's regular expression system are an absolute pain to work out the number of escapes needed....
- Neil McGuigan over 8 years
  
  Not all SQL statements are parameterizable, for example "SET ROLE role_name" or "LISTEN channel_name"
- Seldom 'Where's Monica' Needy about 7 years
  
  @NeilMcGuigan Yep. Most drivers will also refuse to parameterize something like CREATE VIEW myview AS SELECT * FROM mytable WHERE col = ? since the main statement is a DDL-statement, even though the part you're trying to parameterize is actually DML.
Scott Bonner over 14 years

via this method can you treat every parameter as a string and still be safe? I'm trying to figure out a way to update my existing architecture to be safe without having to rebuild the whole database layer...
duffymo over 14 years

And even parameterized SQL isn't a 100% guarantee. But it's a very good start.
Cylon Cat over 14 years

@duffymo, I agree that nothing is ever 100% safe. Do you have an example of SQL injection that will work even with parameterized SQL?
Cylon Cat over 14 years

All dynqmic SQL is just strings, so that isn't the question to ask. I'm not familiar with PrepareStatement, so the real question is does it generate a parameterized query that can then be executed with ExecuteUpdate. If yes, that's good. If no, then it's simply hiding the problem, and you may not have any secure option except redesigning the database layer. Dealing with SQL injection is one of those things you have to design in from the beginning; it's not something you can add easily later on.
Kaleb Brasee over 14 years

If you're inserting into an INTEGER field, you'll want to use a 'setInt'. Likewise, other numerical database fields would use other setters. I posted a link to the PreparedStatement docs that list all the setter types.
Kaleb Brasee over 14 years

Yes Cylon, PreparedStatements generate parameterized queries.
Cylon Cat over 14 years

@Kaleb Brasee, thanks. That's good to know. The tools are different in every environment, but getting down to parameterized queries is the fundamental answer.
Steve Kass over 14 years

@Cylon Cat: Sure, when a chunk of SQL (like @WhereClause or @tableName) is passed as the parameter, concatenated into the SQL, and executed dynamically. SQL injection occurs when you let users write your code. It doesn't matter whether you capture their code as a parameter or not.
Cylon Cat over 14 years

@Steve, thanks. I don't think I've seen a database that would let you pass keywords or whole SQL phrases as a parameter; that sounds dangerous. Where clauses can be parameterized just like input or update values, though. SQL Server is really good about that; Oracle is iffy with it on update statements, but fine on queries. As for dynamic table names.... no thanks.
Newbie over 14 years

BTW, I don't know why this isn't mentioned more, but working with PreparedStatements is also much easier and much more readable. That alone probably makes them the default for every programmer who knows about them.
Cylon Cat over 14 years

@Edan, it really does depend on the environment. In .NET, I know of nothing like PreparedStatements, but the combination of LINQ and an ORM provides even more safety, productivity, and maintainability. Because LINQ is integrated into C# and VB, everything about your query and results are strongly named, strongly typed, and compiler-checked. All SQL statements are generated automatically, as needed, fully parameterized. So "PreparedStatement" would feel like a step backwards. But it all depends on the environment and tools that you're working with; use whatever works best.
Adam Robinson over 14 years

@Cylon: That's the argument of dynamically-generated SQL (typically by an ORM) and manually-constructed procedures. The idea of a "prepared statement" is fully present in .NET, though; the DbCommand (and associated DbParameter) abstract class provide full support for parameterizing your SQL, depending on the provider.
Scott Bonner over 14 years

I've been using a homebrew mysql/java solution where a statement such as "select * from sometable where somefield = '[somefield]'" then when I execute the query My code first escapes characters in each of the argument values... [somefield] and then does a replace of somefield with the escape string. So yeah I'm just trying to find a good hook in so under the hood it uses PreparedStatements but everything else stays the same... Definitely didn't make my day to think of all the impending work of this change... So it's on tomorrow's plate
Cylon Cat over 14 years

@Adam, I see the parallel; DBCommand is an aggregate of SQL command, parameters, connection reference, and methods to execute the command. However, "PreparedStatement" suggested the idea of preparing an execution plan; I don't know if any modern databases still require this as an explicit step in client code. I know in SQL Server, it's handled automatically on the server side.
matbrgz over 14 years

Please note that PreparedStatements for some databases are VERY expensive to create, so if you need to do a lot of them, measure both types.
Scott Bonner over 14 years

Thanks for the comment, I like the way you did all the characters in one, I was going about it the less regular expression way of a replace all for each... I'm not sure how to assign the answer on this question now. Ultimately PreparedStatements is the answer, but for my current objective your answer is the answer I need, would you be upset if I gave the answer to one of the earlier prepared statement's answers, or is there a way to share the answer between a couple?
Alan Moore over 14 years

Since this is just a temporary kludge, go ahead and accept one of the PreparedStatement answers.
Eduardo over 13 years

Could you please properly close JDBC resources on error? Like: try { ... } finally { try { pstmt.close(); } catch (java.sql.SQLException ignore) {} } I think examples should be correct as many people will literally copy them. If you do not like direct use of JDBC API you can use static.springsource.org/spring/docs/3.0.x/javadoc-api/org/…
Eduardo almost 13 years

@Kaleb Brasee, I see you finally fixed the code but did not like my edit where I define variables as close where they are used as suggested by most programmers stackoverflow.com/questions/1411463/…
Excrubulent almost 11 years

@Steve Katt "SQL injection occurs when you let users write your code. It doesn't matter whether you capture their code as a parameter or not." I wouldn't call letting the user write their own code an injection vulnerability. It's a vulnerability, but it's caused by allowing someone to write arbitrary code and execute it on your system directly. Of COURSE that can be abused; you've removed the requirement to perform an injection attack by inserting an IV feed into your system and giving the other end to a stranger.
siddhusingh over 9 years

If the query is made at run-time not only in program but also in Stored procedure, then it becomes tricky. So better to use org.apache.commons.lang.StringEscapeUtils.escapeSql(<VALUE>) before even passing that value to SP. Sometimes user is interested in generating dynamic SQL queries. So you can't use a static prepared query.
SantiBailors almost 9 years

Although I always prefer using prepared statements, it might be misleading to say that PreparedStatements are the way to go, because they make SQL injection impossible. In my understanding they just make injection take more effort. As noted in several questions with nobody arguing against it, f.ex. point 1 here and by FindBugs SQL_PREPARED_STATEMENT_GENERATED_FROM_NONCONSTANT_STRING: ...SQL injection could be used to make the prepared statement do something unexpected and undesirable.
zhy about 8 years

I think this code is the decompiled version of the source code in the above link. Now in newer mysql-connector-java-xxx, the case '\u00a5' and case '\u20a9' statements seem having been removed
shareef over 7 years

i tried sqlmap with your code and it did not protect me from the frist attack ` Type: boolean-based blind Title: AND boolean-based blind - WHERE or HAVING clause Payload: q=1%' AND 5430=5430 AND '%'='`
shareef over 7 years

Sorry its working but was viewing the last stored session results .. i kept the comment for future similar ..
Mohamed Ennahdi El Idrissi over 7 years

You can use org.ostermiller.utils.StringHelper.escapeSQL() or com.aoindustries.sql.SQLUtility.escapeSQL().
dan carter over 7 years

@SteveKass that's not an example of an unsafe parameterised query. WIth parameterised queries you don't concatenate chunks of strings into the query but rather use parameter placeholders and then bind the user input using the JDBC/JPA API.
Steve Kass over 7 years

Unfortunately, some people do “concatenate chunks of strings into the query” (e.g., a whole WHERE clause, or a table name - the examples I mentioned) that have been passed as parameters and then execute the query. You might not call this a “parameterized query,” but that phrase is sufficiently imprecise that others might. For example, someone suggested it here: stackoverflow.com/a/18524793/139164 Yes, parameterized queries are safe when the parameters are bound to values and not concatenated into the SQL, but I don’t think I suggested otherwise.
amdev about 7 years

Why you just don't answer to the question ? PreparedStatement are terrible with IN clause. I still wait for a real answer to the question.
Nick Spacek about 7 years

Important to note the GPLv2 license on the original code this was copied from for anyone coming across this. I'm not a lawyer but I would highly recommend not using this answer in your project unless you are fully aware of the implications of including this licensed code.
Paco Abato about 7 years

For reference: commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/‌… Anyway, this method only escapes quotes and does not seem to prevent SQL Injection attacks.
Pini Cheyni almost 7 years

This was removed from the latest versions because it was only escaping single quotes
ChrisOdney almost 7 years

The ESAPI seems defunct as of today. On AWS there is WAF which can help against SQL injection, XSS etc. are there any other alternatives at this point?
drorw over 5 years

Prepared Statements definitely help in preventing SQL Injection attacks. Here's a short video that demonstrates using Prepared Statements with Java JDBC.
Javan R. about 5 years

This answer should be deleted because it does not prevent sql injection.
Javan R. about 5 years

@ChrisOdney A WAF can be easily bypassed. Most Frameworks already implement their own SQL-Injection prevention in which they escape parameters automatically by their own. Alternatives for legacy projects: owasp.org/index.php/…
simon about 5 years

That's not always true. For example, if the parameter is allowed only to be alphanumeric (business requirement), all non-alphanumeric characters can be removed.
theferrit32 almost 5 years

OWASP cheatsheets have been moved to GitHub. The SQL Injection cheat sheet is now here: github.com/OWASP/CheatSheetSeries/blob/master/cheatsheets/…
caot almost 4 years

Is it good the java-security-cross-site-scripting-xss-and-sql-injection topic ? I am trying to find a solution for a legacy application.
izogfif over 3 years

Warning: it returns null for empty strings so you might get unexpected results: "SELECT ITEM.id FROM ITEM WHERE ITEM.value = '" + MysqlRealScapeString(text) + "'" will show you items where ITEM.value = 'null' instead of ITEM.value = ''!
tgdavies about 3 years

Suggesting a transformation which doesn't preserve the input is a bad idea.
Your Common Sense about 3 years

Just think, what would your answer looks like if such an approach was used right here on Stack Overflow.
guidop21 almost 3 years

Sorry, I have a question...After you have created this class how do you connect it to the rest of the Java application?