Why is this code vulnerable to buffer overflow attacks?

28,143

Solution 1

On most compilers the maximum value of an unsigned short is 65535.

Any value above that gets wrapped around, so 65536 becomes 0, and 65600 becomes 65.

This means that long strings of the right length (e.g. 65600) will pass the check, and overflow the buffer.


Use size_t to store the result of strlen(), not unsigned short, and compare len to an expression that directly encodes the size of buffer. So for example:

char buffer[100];
size_t len = strlen(str);
if (len >= sizeof(buffer) / sizeof(buffer[0]))  return -1;
memcpy(buffer, str, len + 1);

Solution 2

The problem is here:

strncpy(buffer,str,strlen(str));
                   ^^^^^^^^^^^

If the string is greater than the length of the target buffer, strncpy will still copy it over. You are basing the number of characters of the string as the number to copy instead of the size of the buffer. The correct way to do this is as follows:

strncpy(buffer,str, sizeof(buff) - 1);
buffer[sizeof(buff) - 1] = '\0';

What this does is limit the amount of data copied to the actual size of the buffer minus one for the null terminating character. Then we set the last byte in the buffer to the null character as an added safeguard. The reason for this is because strncpy will copy upto n bytes, including the terminating null, if strlen(str) < len - 1. If not, then the null is not copied and you have a crash scenario because now your buffer has a unterminated string.

Hope this helps.

EDIT: Upon further examination and input from others, a possible coding for the function follows:

int func (char *str)
  {
    char buffer[100];
    unsigned short size = sizeof(buffer);
    unsigned short len = strlen(str);

    if (len > size - 1) return(-1);
    memcpy(buffer, str, len + 1);
    buffer[size - 1] = '\0';
    return(0);
  }

Since we already know the length of the string, we can use memcpy to copy the string from the location that is referenced by str into the buffer. Note that per the manual page for strlen(3) (on a FreeBSD 9.3 system), the following is stated:

 The strlen() function returns the number of characters that precede the
 terminating NUL character.  The strnlen() function returns either the
 same result as strlen() or maxlen, whichever is smaller.

Which I interpret to be that the length of the string does not include the null. That is why I copy len + 1 bytes to include the null, and the test checks to make sure that the length < size of buffer - 2. Minus one because the buffer starts at position 0, and minus another one to make sure there's room for the null.

EDIT: Turns out, the size of something starts with 1 while access starts with 0, so the -2 before was incorrect because it would return an error for anything > 98 bytes but it should be > 99 bytes.

EDIT: Although the answer about a unsigned short is generally correct as the maximum length that can be represented is 65,535 characters, it doesn't really matter because if the string is longer than that, the value will wrap around. It's like taking 75,231 (which is 0x000125DF) and masking off the top 16 bits giving you 9695 (0x000025DF). The only problem that I see with this is the first 100 chars past 65,535 as the length check will allow the copy, but it will only copy up to the first 100 characters of the string in all cases and null terminate the string. So even with the wraparound issue, the buffer still will not be overflowed.

This may or may not in itself pose a security risk depending on the content of the string and what you are using it for. If it's just straight text that is human readable, then there is generally no problem. You just get a truncated string. However, if it's something like a URL or even a SQL command sequence, you could have a problem.

Solution 3

Even though you're using strncpy, the length of the cutoff is still dependent on the passed string pointer. You have no idea how long that string is (the location of the null terminator relative to the pointer, that is). So calling strlen alone opens you up to vulnerability. If you want to be more secure, use strnlen(str, 100).

Full code corrected would be:

int func(char *str) {
   char buffer[100];
   unsigned short len = strnlen(str, 100); // sizeof buffer

   if (len >= 100) {
     return -1;
   }

   strcpy(buffer, str); // this is safe since null terminator is less than 100th index
   return 0;
}

Solution 4

The answer with the wrapping is right. But there is a problem I think was not mentioned if(len >= 100)

Well if Len would be 100 we'd copy 100 elements an we'd not have trailing \0. That clearly would mean any other function depending on proper ended string would walk way beyond the original array.

The string problematic from C is IMHO unsolvable. You'd alway better have some limits before the call, but even that won't help. There is no bounds checking and so buffer overflows always can and unfortunately will happen....

Solution 5

Beyond the security issues involved with calling strlen more than once, one should generally not use string methods on strings whose length is precisely known [for most string functions, there's only a really narrow case where they should be used--on strings for which a maximum length can be guaranteed, but the precise length isn't known]. Once the length of the input string is known and the length of the output buffer is known, one should figure out how big a region should be copied and then use memcpy() to actually perform the copy in question. Although it's possible that strcpy might outperform memcpy() when copying a string of only 1-3 bytes or so, on many platforms memcpy() is likely to be more than twice as fast when dealing with larger strings.

Although there are some situations where security would come at the cost of performance, this is a situation where the secure approach is also the faster one. In some cases, it may be reasonable to write code which is not secure against weirdly-behaving inputs, if code supplying the inputs can ensure they will be well-behaved, and if guarding against ill-behaved inputs would impede performance. Ensuring that string lengths are only checked once improves both performance and security, though one extra thing can be done to help guard security even when tracking string length manually: for every string which is expected to have a trailing null, write the trailing null explicitly rather than expecting the source string to have it. Thus, if one were writing an strdup equivalent:

char *strdupe(char const *src)
{
  size_t len = strlen(src);
  char *dest = malloc(len+1);
  // Calculation can't wrap if string is in valid-size memory block
  if (!dest) return (OUT_OF_MEMORY(),(char*)0); 
  // OUT_OF_MEMORY is expected to halt; the return guards if it doesn't
  memcpy(dest, src, len);      
  dest[len]=0;
  return dest;
}

Note that the last statement could generally be omitted if the memcpy had processed len+1 bytes, but it another thread were to modify the source string the result could be a non-NUL-terminated destination string.

Share:
28,143

Related videos on Youtube

Jason
Author by

Jason

Updated on July 08, 2022

Comments

  • Jason
    Jason almost 2 years
    int func(char* str)
    {
       char buffer[100];
       unsigned short len = strlen(str);
    
       if(len >= 100)
       {
            return (-1);
       }
    
       strncpy(buffer,str,strlen(str));
       return 0;
    }
    

    This code is vulnerable to a buffer overflow attack, and I'm trying to figure out why. I'm thinking it has to do with len being declared a short instead of an int, but I'm not really sure.

    Any ideas?

    • Dmitri Chubarov
      Dmitri Chubarov about 9 years
      There are multiple issues with this code. Recall that C strings are null-terminated.
    • R Sahu
      R Sahu about 9 years
      @DmitriChubarov, not null terminating the string will be a problem only if the string is used after the call to strncpy. In this case, it is not.
    • Jim Balter
      Jim Balter about 9 years
      The problems in this code flow directly from the fact that strlen is calculated, used for the validity check, and then it is absurdly calculated again -- it's a DRY failure. If the second strlen(str) were replaced with len, there would be no possibility of buffer overflow, regardless of the type of len. The answers don't address this point, they just manage to avoid it.
    • CiaPan
      CiaPan about 9 years
      @RSahu What happens to strlen if the string is not NUL-terminated?
    • Kaiserludi
      Kaiserludi about 9 years
      @CiaPan: Wenn passing a not null-terminated string to it, strlen will show undefined behavior.
    • Jim Balter
      Jim Balter about 9 years
      It's weird that people get hung up on NUL termination here when that's completely irrelevant to this question about a buffer overflow attack. They might as well concern themselves with str being NULL.
    • Asad Saeeduddin
      Asad Saeeduddin about 9 years
      @JimBalter Wouldn't the buffer overflow still occur if you used len in the strncpy and changed nothing else?
    • Asad Saeeduddin
      Asad Saeeduddin about 9 years
      @JimBalter Yes, I'm very unfamiliar with C. The reason I thought an overflow would still occur is because of orlp's answer below, which indicates that if strlen(str) > 65535, len will wrap around and the sentry if clause will pass, allowing strings that exceed the length of the buffer to be copied into the buffer.
    • Asad Saeeduddin
      Asad Saeeduddin about 9 years
      @JimBalter Ah, nevermind. Just looked at the strncpy docs and found out what the third arg is.
    • Asad Saeeduddin
      Asad Saeeduddin about 9 years
      @JimBalter Nah, I think I'll leave them there. Maybe someone else will have the same foolish misconception and learn from it. Feel free to flag them if they irk you, someone might come around and delete them.
    • LeeNeverGup
      LeeNeverGup about 9 years
      Also, it might address a time of check to time of use issue. If an attacker change the length of the sring in another thread, he might eventually change the content of str right after the check, but before the copy.
    • Dmitry Poroh
      Dmitry Poroh about 9 years
      Just curious how do you recognise that this code is "vulnerable to a buffer overflow attack"?
  • Patrick Roberts
    Patrick Roberts about 9 years
    If using strnlen guarantees it terminates before overflowing, then wouldn't it be safe (and faster) at that point to just use strcpy when you know the null terminator is less than 100 indices?
  • Patrick Roberts
    Patrick Roberts about 9 years
    @user3386109 Wouldn't strlen also then access past the end of the buffer?
  • orlp
    orlp about 9 years
    @PatrickRoberts Theoretically, yes. But you have to keep in mind that 10% of the code is responsible for 90% of the runtime, so you shouldn't let performance go before security. And keep in mind that over time the code changes, which can suddenly mean that the previous check is gone.
  • David Grayson
    David Grayson about 9 years
    orlp's answer is right. I don't think this answer adds anything, and could be removed. This answer overlooks the fact that the OP's code is attempting to check the length of the string. A complete answer should explain why that check does not work.
  • Patrick Roberts
    Patrick Roberts about 9 years
    @user3386109 what you're pointing out makes orlp's answer just as invalid as mine. I fail to see why strnlen doesn't solve the problem if what orlp is suggesting is supposedly correct anyway.
  • Jim Balter
    Jim Balter about 9 years
    To prevent buffer overflow, simply use len as the third argument of strncpy. Using strlen again is dumb in any case.
  • cmaster - reinstate monica
    cmaster - reinstate monica about 9 years
    The string problematic is solvable: just use the appropriate functions. I. e. not strncpy() and friends, but the memory allocating functions like strdup() and friends. They are in the POSIX-2008 standard, so they are fairly portable, though not available on some proprietary systems.
  • Jim Balter
    Jim Balter about 9 years
    "strncpy(buf, str, strlen(str)) does exactly the same as strcpy(buf, str)" -- no it doesn't; strncpy doesn't NUL-terminate (and does NUL-pad).
  • orlp
    orlp about 9 years
    @JimBalter Fair enough.
  • Jim Balter
    Jim Balter about 9 years
    "I don't think strnlen solves anything here" -- of course it does; it prevents overflowing buffer. "since str could point to a buffer of 2 bytes, neither of which is NUL." -- that's irrelevant, as it is true of any implementation of func. The question here is about buffer overflow, not UB because the input isn't NUL-terminated.
  • Jim Balter
    Jim Balter about 9 years
    "The second parameter passed to strnlen must be the size of the object that the first parameter points to, or strnlen is worthless" -- this is complete and utter nonsense. If the second argument to strnlen is the length of the input string, then strnlen is equivalent to strlen. How would you even obtain that number, and if you had it, why would you need to call str[n]len? That's not what strnlen is for at all.
  • Jim Balter
    Jim Balter about 9 years
    +1 Although this answer is imperfect because it's not equivalent to the OP's code -- strncpy NUL-pads and doesn't NUL terminate, whereas strcpy NUL-terminates and doesn't NUL-pad, it does solve the problem, contrary to the ridiculous, ignorant comments above.
  • Jim Balter
    Jim Balter about 9 years
    And what about my other point? Using the sizeof(buffer) is excessive; len is always less and is sufficient. And it works regardless of the type of len -- the real problem here is that the test is done against len but the actual copy uses strlen(str) ... this violation of the DRY principle is bad practice in numerous ways, and here we see one of them.
  • Daniel Rudy
    Daniel Rudy about 9 years
    True, but that is beyond of scope of the question. The code clearly shows the function being passed a char pointer. Outside of the scope of the function, we don't care.
  • Jim Balter
    Jim Balter about 9 years
    "any other function depending on proper ended string" -- buffer is local to this function and isn't used elsewhere. In a real program we would have to examine how it is used ... sometimes not NUL-terminating is correct (the original use of strncpy was to create UNIX's 14 byte directory entries -- NUL-padded and not NUL-terminated). "The string problematic from C is IMHO unsolvable" -- while C is a gawdawful language that has been surpassed by far better technology, safe code can be written in it if enough discipline is used.
  • Jim Balter
    Jim Balter about 9 years
    " the buffer in which str is stored" -- that's not a buffer overflow, which is the issue here. And every answer has that "problem", which is unavoidable given the signature of func ... and every other C function ever written that takes NUL-terminated strings as arguments. Bringing up the possibility of the input not being NUL-terminated is completely clueless.
  • Jim Balter
    Jim Balter about 9 years
    "that is beyond of scope of the question" -- which sadly is beyond the ability of some people to comprehend.
  • Jim Balter
    Jim Balter about 9 years
    "The problem is here" -- you're right, but you're still missing the key issue, which is that the test (len >= 100) was done against one value but the length of the copy was given a different value ... this is a violation of the DRY principle. Simply calling strncpy(buffer, str, len) avoids the possibility of buffer overflow, and does less work than strncpy(buffer,str,sizeof(buffer) - 1) ... although here it is just a slower equivalent to memcpy(buffer, str, len).
  • Jim Balter
    Jim Balter about 9 years
    / sizeof(buffer[0]) -- note that sizeof(char) in C is always 1 (even if a char contains a gazillion bits) so that's superfluous when there's no possibility of using a different data type. Still ... kudos for a complete answer (and thanks for being responsive to comments).
  • Jim Balter
    Jim Balter about 9 years
    One last comment: since the copy always stops short of the NUL, strncpy(buffer, str, len) could be replaced with the equivalent but generally faster memcpy(buffer, str, len).
  • rr-
    rr- about 9 years
    Why does sizeof(buffer) work? Wasn't it supposed to be the same as sizeof(void*) due to const char[] being just a char*?
  • Dietrich Epp
    Dietrich Epp about 9 years
    @rr-: char[] and char* are not the same thing. There are many situations in which a char[] will get implicitly converted into a char*. For example, char[] is exactly the same as char* when used as the type for function arguments. However, the conversion does not happen for sizeof().
  • Patrick Roberts
    Patrick Roberts about 9 years
    Your observation seems to me misguided. if (len >= 100) is the condition for when the check fails, not when it passes, which means there is not a case where exactly 100 bytes with no NUL terminator is copied over, as that length is included in the fail condition.
  • Daniel Rudy
    Daniel Rudy about 9 years
    @JimBalter It is beyond the scope of the question, but I digress. I understand that the values used by the test and what is used in strncpy are two different ones. However, general coding practice says that the copy limit should be sizeof(buffer) - 1 so it doesn't matter what the length of str is on the copy. strncpy will stop copying bytes when it either hits a null or copies n bytes. The next line guarantees that the last byte in the buffer is a null char. The code is safe, I stand by my previous statement.
  • Friedrich
    Friedrich about 9 years
    @ cmaster. In this case you are wrong. It is not solvable, because one always can write beyoned the bounds. Yes it's undefiened behaviour but there is not way to prevent it completely.
  • Friedrich
    Friedrich about 9 years
    @Jim Balter. It does not matter. I potentially can write over the bounds of this local buffer and so it always will be possible to corrupt some other datastructures.
  • Jim Balter
    Jim Balter about 9 years
    Non sequitur. You seem to have forgotten what you wrote and what I was responding to.
  • Jim Balter
    Jim Balter about 9 years
    "general coding practice says" -- no it doesn't; there are other cases and this is one of them. "The code is safe" -- I didn't say otherwise. "I stand by my previous statement" -- I didn't challenge any previous statement. You, OTOH, paid no attention to what I wrote.
  • Friedrich
    Friedrich about 9 years
    I don't think so. Yes the buffer is local but it's still possible to write over the bounds and so we an invoke undefined behaviour. There might be some data which can be overwritten. You sometimes even can find out what is overwrittern, if e.g there would be another buffer this overwritting might take place...
  • Daniel Rudy
    Daniel Rudy about 9 years
    I will admit, I had no idea what the DRY or WET principles were. I had to look them up. With that in mind, I can see why you were saying what you were saying. I'm going to put some more code up to illustrate this.
  • Jim Balter
    Jim Balter about 9 years
    You're just repeating yourself. Again, you have forgotten what you wrote and what I responded to ... it was about strncpy not NUL-terminating the data, which is a completely different subject from overflowing the buffer. Now go repeat yourself again ... I won't respond further.
  • Bogdan Alexandru
    Bogdan Alexandru about 9 years
    Can you please explain the security issues involved with calling strlen more than once?
  • Patrick Roberts
    Patrick Roberts about 9 years
    While this is safe, since a NUL character is explicitly added to the end, buffer overflow can still technically occur since strlen is being assigned to an unsigned short. I find it strange how this implementation arbitrarily allows strings of length [N * 65536, N * 65536 + 98] (N being any non-negative integer) and terminates them at len - N * 65536. That problem could be resolved by changing unsigned short to unsigned size_t or even just size_t.
  • supercat
    supercat about 9 years
    @BogdanAlexandru: Once one has called strlen and taken some action based upon the value returned (which was presumably the reason for calling it in the first place), then a repeated call either (1) will always yield the same answer as the first one, in which case it's simply wasted work, or (2) may sometimes (because something else--perhaps another thread--modified the string in the meantime) yield a different answer, in which case code which does some things with the length (e.g. allocating a buffer) may assume a different size than code which does other things (copying to the buffer).
  • ThunderPhoenix
    ThunderPhoenix about 9 years
    @orlp I'm not agree with len >= sizeof(buffer) / sizeof(buffer[0]) because buffer is just a pointer and sizeof(buffer) will return the size of a char pointer. You should use directly the size with which it was declared in char buffer[100]; that is to say 100.
  • aldeb
    aldeb about 9 years
    @orlp: Could you please explain why comparing len to the expression is prefered to comparing the expression to 100?
  • orlp
    orlp about 9 years
    @Controll Because if you change the size of buffer at some point the expression automatically updates. This is critical for security, because the declaration of buffer might be quite some lines away from the check in actual code. So it's easy to change the size of buffer, but forget to update in every location where the size is used.
  • aldeb
    aldeb about 9 years
    Thanks! I understood.There is another point i'm missing. How the code you proposed is supposed to prevent the buffer overflow? If it wasn't supposed to, then how can I prevent that?
  • orlp
    orlp about 9 years
    @Controll The code above is assumed to be inside a function. The buffer overflow is prevented because if the check fails the function returns before it writes to the buffer.
  • aldeb
    aldeb about 9 years
    Thanks for you answer! But how the fact of using size_t prevents len from being wrapped around, since afaik size_t is a typedef of an unsigned (perhaps short)?
  • orlp
    orlp about 9 years
    @Controll That's wrong. strlen itself returns a size_t.
  • orlp
    orlp about 9 years
    @JimBalter Because arrays in C(++) are confusing in that they decoy to pointers when passed into a function, even if the function argument looks like it's declared as an array. This confuses people into thinking that arrays are always just pointers, instead of only converting when passed into a function.
  • orlp
    orlp about 9 years
    @JimBalter It's not stupidity - it's a lack of knowledge. Knowledge that is counter-intuitive.
  • Jim Balter
    Jim Balter about 9 years
    @Controll "size_t is a typedef of an unsigned (perhaps short)?" -- size_t is defined to be a type that is big enough to hold the length of any object, so it can only be unsigned short on machines with only 64K of memory.
  • Jim Balter
    Jim Balter about 9 years
    @PatrickRoberts "since a NUL character is explicitly added to the end" -- NUL termination is irrelevant; the issue is buffer overflow. "buffer overflow can still technically occur" -- Wrong. It is completely impossible for this code to store outside of buffer -- that's what "buffer overflow" means. This is true for any value of len; it doesn't matter whether len accurately reflects the length of the string -- that's an entirely different matter. (I don't endorse this code, which has several flaws, but it is safe.)
  • aldeb
    aldeb about 9 years
    @JimBalter: I've nothing more to say than: yes, you're right :) It's really not likely that size_t would be unsigned short
  • orlp
    orlp about 9 years
    After 130+ upvotes I'm surprised no one found the critical bug in my original code. The string would never be null-terminated!