SQL SHA1 inside WHERE

12,029

Solution 1

I'd store the SHA1 of the IP in the database along with the raw IP, so that the query would become

SELECT * FROM records WHERE ip_sha1 = "..."

Then I'd make sure that the SHA1 calculation happens exactly one place in code, so that there's no opportunity for it be be done slightly differently in multiple places. That also gives you the opportunity to mix a salt into the calculation, so that someone can't simply compute the SHA1 on an IP address they're interested in and pass that in by hand.

Storing the SHA1 hash the database also gives you the opportunity to add a secondary index on ip_sha1 to speed up that SELECT. If you have a very large data set, doing the SHA1 in the WHERE clauses forces the database to do a complete table scan, along with redoing a calculation for every record on every scan.

Solution 2

Don't know if it matters, but your SHA1 hash da39a3ee5e6b4b0d3255bfef95601890afd80709 is a well-known hash of an empty string.

Is it just an example or you forgot to provide an actual IP address to the hash calculation function?

Update:

Does your webpage code generate SHA1 hashes in lowercase?

This check will fail in MySQL:

SELECT  SHA1('') = 'DA39A3EE5E6B4B0D3255BFEF95601890AFD80709'

In this case, use this:

SELECT  SHA1('') = LOWER('DA39A3EE5E6B4B0D3255BFEF95601890AFD80709')

, which will succeed.

Also, you can precalculate the SHA1 hash when you insert the records into the table:

INSERT
INTO    ip_records (ip, ip_sha)
VALUES  (@ip, SHA1(CONCAT('my_secret_salt', @ip))

SELECT  *
FROM    ip_records
WHERE   ip_sha = @my_salted_sha1_from_webpage

This will return you the original IP and allow indexing of ip_sha, so that this query will work fast.

Solution 3

Every time I've had an unexpected hashing mismatch, it was because I accidentally hashed a string that included some whitespace, such as "\n".

Solution 4

Just a quick thought: that's a very simple obfuscation. There are only 232 possible IP addresses, so if somebody with technical knowledge wanted to figure it out, they could do that by calculating all 4 billion hashes, which wouldn't take very long. Depending on the sensitivity of those ip addresses, you may want to consider a private lookup table.

Share:
12,029
Isaac Waller
Author by

Isaac Waller

currently ----

Updated on June 04, 2022

Comments

  • Isaac Waller
    Isaac Waller almost 2 years

    In my program, we store a user's IP address in a record. When we display a list of records to a user, we don't want to give away the other user's IP, so we SHA1 hash it. Then, when the user clicks on a record, it goes to a URL like this:

    http://www.example.com/allrecordsbyipaddress.php?ipaddress=SHA1HASHOFTHEIPADDRESS
    

    Now, I need to list all the records by the IP address specified in the SHA1 hash. I tried this:

    SELECT * FROM records 
    WHERE SHA1(IPADDRESS)="da39a3ee5e6b4b0d3255bfef95601890afd80709"
    

    but this does not work. How would I do this?
    Thanks, Isaac Waller

  • Alex Martelli
    Alex Martelli almost 15 years
    +1: DRY (Don't Repeat Yourself) is a key principle, and ensuring that each key piece of data or code exists only once, i.e. in "exactly one place" as you put it, a crucial part of DRY.
  • Andomar
    Andomar almost 15 years
    @Isaac Waller: You can store both the IP and its hash in the table. That would make searches faster, and troubleshooting easier.
  • Dave W. Smith
    Dave W. Smith almost 15 years
    @Andomar That's what I meant, but I see it's not what I wrote. Thanks for the catch.