Why is this code vulnerable to buffer overflow attacks?
Solution 1
On most compilers the maximum value of an unsigned short
is 65535.
Any value above that gets wrapped around, so 65536 becomes 0, and 65600 becomes 65.
This means that long strings of the right length (e.g. 65600) will pass the check, and overflow the buffer.
Use size_t
to store the result of strlen()
, not unsigned short
, and compare len
to an expression that directly encodes the size of buffer
. So for example:
char buffer[100];
size_t len = strlen(str);
if (len >= sizeof(buffer) / sizeof(buffer[0])) return -1;
memcpy(buffer, str, len + 1);
Solution 2
The problem is here:
strncpy(buffer,str,strlen(str));
^^^^^^^^^^^
If the string is greater than the length of the target buffer, strncpy will still copy it over. You are basing the number of characters of the string as the number to copy instead of the size of the buffer. The correct way to do this is as follows:
strncpy(buffer,str, sizeof(buff) - 1);
buffer[sizeof(buff) - 1] = '\0';
What this does is limit the amount of data copied to the actual size of the buffer minus one for the null terminating character. Then we set the last byte in the buffer to the null character as an added safeguard. The reason for this is because strncpy will copy upto n bytes, including the terminating null, if strlen(str) < len - 1. If not, then the null is not copied and you have a crash scenario because now your buffer has a unterminated string.
Hope this helps.
EDIT: Upon further examination and input from others, a possible coding for the function follows:
int func (char *str)
{
char buffer[100];
unsigned short size = sizeof(buffer);
unsigned short len = strlen(str);
if (len > size - 1) return(-1);
memcpy(buffer, str, len + 1);
buffer[size - 1] = '\0';
return(0);
}
Since we already know the length of the string, we can use memcpy to copy the string from the location that is referenced by str into the buffer. Note that per the manual page for strlen(3) (on a FreeBSD 9.3 system), the following is stated:
The strlen() function returns the number of characters that precede the terminating NUL character. The strnlen() function returns either the same result as strlen() or maxlen, whichever is smaller.
Which I interpret to be that the length of the string does not include the null. That is why I copy len + 1 bytes to include the null, and the test checks to make sure that the length < size of buffer - 2. Minus one because the buffer starts at position 0, and minus another one to make sure there's room for the null.
EDIT: Turns out, the size of something starts with 1 while access starts with 0, so the -2 before was incorrect because it would return an error for anything > 98 bytes but it should be > 99 bytes.
EDIT: Although the answer about a unsigned short is generally correct as the maximum length that can be represented is 65,535 characters, it doesn't really matter because if the string is longer than that, the value will wrap around. It's like taking 75,231 (which is 0x000125DF) and masking off the top 16 bits giving you 9695 (0x000025DF). The only problem that I see with this is the first 100 chars past 65,535 as the length check will allow the copy, but it will only copy up to the first 100 characters of the string in all cases and null terminate the string. So even with the wraparound issue, the buffer still will not be overflowed.
This may or may not in itself pose a security risk depending on the content of the string and what you are using it for. If it's just straight text that is human readable, then there is generally no problem. You just get a truncated string. However, if it's something like a URL or even a SQL command sequence, you could have a problem.
Solution 3
Even though you're using strncpy
, the length of the cutoff is still dependent on the passed string pointer. You have no idea how long that string is (the location of the null terminator relative to the pointer, that is). So calling strlen
alone opens you up to vulnerability. If you want to be more secure, use strnlen(str, 100)
.
Full code corrected would be:
int func(char *str) {
char buffer[100];
unsigned short len = strnlen(str, 100); // sizeof buffer
if (len >= 100) {
return -1;
}
strcpy(buffer, str); // this is safe since null terminator is less than 100th index
return 0;
}
Solution 4
The answer with the wrapping is right. But there is a problem I think was not mentioned if(len >= 100)
Well if Len would be 100 we'd copy 100 elements an we'd not have trailing \0. That clearly would mean any other function depending on proper ended string would walk way beyond the original array.
The string problematic from C is IMHO unsolvable. You'd alway better have some limits before the call, but even that won't help. There is no bounds checking and so buffer overflows always can and unfortunately will happen....
Solution 5
Beyond the security issues involved with calling strlen
more than once, one should generally not use string methods on strings whose length is precisely known [for most string functions, there's only a really narrow case where they should be used--on strings for which a maximum length can be guaranteed, but the precise length isn't known]. Once the length of the input string is known and the length of the output buffer is known, one should figure out how big a region should be copied and then use memcpy()
to actually perform the copy in question. Although it's possible that strcpy
might outperform memcpy()
when copying a string of only 1-3 bytes or so, on many platforms memcpy()
is likely to be more than twice as fast when dealing with larger strings.
Although there are some situations where security would come at the cost of performance, this is a situation where the secure approach is also the faster one. In some cases, it may be reasonable to write code which is not secure against weirdly-behaving inputs, if code supplying the inputs can ensure they will be well-behaved, and if guarding against ill-behaved inputs would impede performance. Ensuring that string lengths are only checked once improves both performance and security, though one extra thing can be done to help guard security even when tracking string length manually: for every string which is expected to have a trailing null, write the trailing null explicitly rather than expecting the source string to have it. Thus, if one were writing an strdup
equivalent:
char *strdupe(char const *src)
{
size_t len = strlen(src);
char *dest = malloc(len+1);
// Calculation can't wrap if string is in valid-size memory block
if (!dest) return (OUT_OF_MEMORY(),(char*)0);
// OUT_OF_MEMORY is expected to halt; the return guards if it doesn't
memcpy(dest, src, len);
dest[len]=0;
return dest;
}
Note that the last statement could generally be omitted if the memcpy had processed len+1
bytes, but it another thread were to modify the source string the result could be a non-NUL-terminated destination string.
Related videos on Youtube
Jason
Updated on July 08, 2022Comments
-
Jason almost 2 years
int func(char* str) { char buffer[100]; unsigned short len = strlen(str); if(len >= 100) { return (-1); } strncpy(buffer,str,strlen(str)); return 0; }
This code is vulnerable to a buffer overflow attack, and I'm trying to figure out why. I'm thinking it has to do with
len
being declared ashort
instead of anint
, but I'm not really sure.Any ideas?
-
Dmitri Chubarov about 9 yearsThere are multiple issues with this code. Recall that C strings are null-terminated.
-
R Sahu about 9 years@DmitriChubarov, not null terminating the string will be a problem only if the string is used after the call to
strncpy
. In this case, it is not. -
Jim Balter about 9 yearsThe problems in this code flow directly from the fact that
strlen
is calculated, used for the validity check, and then it is absurdly calculated again -- it's a DRY failure. If the secondstrlen(str)
were replaced withlen
, there would be no possibility of buffer overflow, regardless of the type oflen
. The answers don't address this point, they just manage to avoid it. -
CiaPan about 9 years@RSahu What happens to
strlen
if the string is not NUL-terminated? -
Kaiserludi about 9 years@CiaPan: Wenn passing a not null-terminated string to it, strlen will show undefined behavior.
-
Jim Balter about 9 yearsIt's weird that people get hung up on NUL termination here when that's completely irrelevant to this question about a buffer overflow attack. They might as well concern themselves with
str
being NULL. -
Asad Saeeduddin about 9 years@JimBalter Wouldn't the buffer overflow still occur if you used
len
in thestrncpy
and changed nothing else? -
Asad Saeeduddin about 9 years@JimBalter Yes, I'm very unfamiliar with C. The reason I thought an overflow would still occur is because of orlp's answer below, which indicates that if
strlen(str) > 65535
,len
will wrap around and the sentry if clause will pass, allowing strings that exceed the length of the buffer to be copied into the buffer. -
Asad Saeeduddin about 9 years@JimBalter Ah, nevermind. Just looked at the strncpy docs and found out what the third arg is.
-
Asad Saeeduddin about 9 years@JimBalter Nah, I think I'll leave them there. Maybe someone else will have the same foolish misconception and learn from it. Feel free to flag them if they irk you, someone might come around and delete them.
-
LeeNeverGup about 9 yearsAlso, it might address a time of check to time of use issue. If an attacker change the length of the sring in another thread, he might eventually change the content of
str
right after the check, but before the copy. -
Dmitry Poroh about 9 yearsJust curious how do you recognise that this code is "vulnerable to a buffer overflow attack"?
-
-
Patrick Roberts about 9 yearsIf using
strnlen
guarantees it terminates before overflowing, then wouldn't it be safe (and faster) at that point to just usestrcpy
when you know the null terminator is less than 100 indices? -
Patrick Roberts about 9 years@user3386109 Wouldn't
strlen
also then access past the end of the buffer? -
orlp about 9 years@PatrickRoberts Theoretically, yes. But you have to keep in mind that 10% of the code is responsible for 90% of the runtime, so you shouldn't let performance go before security. And keep in mind that over time the code changes, which can suddenly mean that the previous check is gone.
-
David Grayson about 9 yearsorlp's answer is right. I don't think this answer adds anything, and could be removed. This answer overlooks the fact that the OP's code is attempting to check the length of the string. A complete answer should explain why that check does not work.
-
Patrick Roberts about 9 years@user3386109 what you're pointing out makes orlp's answer just as invalid as mine. I fail to see why
strnlen
doesn't solve the problem if what orlp is suggesting is supposedly correct anyway. -
Jim Balter about 9 yearsTo prevent buffer overflow, simply use
len
as the third argument of strncpy. Using strlen again is dumb in any case. -
cmaster - reinstate monica about 9 yearsThe string problematic is solvable: just use the appropriate functions. I. e. not
strncpy()
and friends, but the memory allocating functions likestrdup()
and friends. They are in the POSIX-2008 standard, so they are fairly portable, though not available on some proprietary systems. -
Jim Balter about 9 years"
strncpy(buf, str, strlen(str))
does exactly the same asstrcpy(buf, str)
" -- no it doesn't; strncpy doesn't NUL-terminate (and does NUL-pad). -
orlp about 9 years@JimBalter Fair enough.
-
Jim Balter about 9 years"I don't think strnlen solves anything here" -- of course it does; it prevents overflowing
buffer
. "since str could point to a buffer of 2 bytes, neither of which is NUL." -- that's irrelevant, as it is true of any implementation offunc
. The question here is about buffer overflow, not UB because the input isn't NUL-terminated. -
Jim Balter about 9 years"The second parameter passed to strnlen must be the size of the object that the first parameter points to, or strnlen is worthless" -- this is complete and utter nonsense. If the second argument to strnlen is the length of the input string, then strnlen is equivalent to strlen. How would you even obtain that number, and if you had it, why would you need to call str[n]len? That's not what strnlen is for at all.
-
Jim Balter about 9 years+1 Although this answer is imperfect because it's not equivalent to the OP's code -- strncpy NUL-pads and doesn't NUL terminate, whereas strcpy NUL-terminates and doesn't NUL-pad, it does solve the problem, contrary to the ridiculous, ignorant comments above.
-
Jim Balter about 9 yearsAnd what about my other point? Using the sizeof(buffer) is excessive;
len
is always less and is sufficient. And it works regardless of the type oflen
-- the real problem here is that the test is done againstlen
but the actual copy usesstrlen(str)
... this violation of the DRY principle is bad practice in numerous ways, and here we see one of them. -
Daniel Rudy about 9 yearsTrue, but that is beyond of scope of the question. The code clearly shows the function being passed a char pointer. Outside of the scope of the function, we don't care.
-
Jim Balter about 9 years"any other function depending on proper ended string" --
buffer
is local to this function and isn't used elsewhere. In a real program we would have to examine how it is used ... sometimes not NUL-terminating is correct (the original use of strncpy was to create UNIX's 14 byte directory entries -- NUL-padded and not NUL-terminated). "The string problematic from C is IMHO unsolvable" -- while C is a gawdawful language that has been surpassed by far better technology, safe code can be written in it if enough discipline is used. -
Jim Balter about 9 years" the buffer in which str is stored" -- that's not a buffer overflow, which is the issue here. And every answer has that "problem", which is unavoidable given the signature of
func
... and every other C function ever written that takes NUL-terminated strings as arguments. Bringing up the possibility of the input not being NUL-terminated is completely clueless. -
Jim Balter about 9 years"that is beyond of scope of the question" -- which sadly is beyond the ability of some people to comprehend.
-
Jim Balter about 9 years"The problem is here" -- you're right, but you're still missing the key issue, which is that the test (
len >= 100
) was done against one value but the length of the copy was given a different value ... this is a violation of the DRY principle. Simply callingstrncpy(buffer, str, len)
avoids the possibility of buffer overflow, and does less work thanstrncpy(buffer,str,sizeof(buffer) - 1)
... although here it is just a slower equivalent tomemcpy(buffer, str, len)
. -
Jim Balter about 9 years
/ sizeof(buffer[0])
-- note thatsizeof(char)
in C is always 1 (even if a char contains a gazillion bits) so that's superfluous when there's no possibility of using a different data type. Still ... kudos for a complete answer (and thanks for being responsive to comments). -
Jim Balter about 9 yearsOne last comment: since the copy always stops short of the NUL,
strncpy(buffer, str, len)
could be replaced with the equivalent but generally fastermemcpy(buffer, str, len)
. -
rr- about 9 yearsWhy does
sizeof(buffer)
work? Wasn't it supposed to be the same assizeof(void*)
due toconst char[]
being just achar*
? -
Dietrich Epp about 9 years@rr-:
char[]
andchar*
are not the same thing. There are many situations in which achar[]
will get implicitly converted into achar*
. For example,char[]
is exactly the same aschar*
when used as the type for function arguments. However, the conversion does not happen forsizeof()
. -
Patrick Roberts about 9 yearsYour observation seems to me misguided.
if (len >= 100)
is the condition for when the check fails, not when it passes, which means there is not a case where exactly 100 bytes with no NUL terminator is copied over, as that length is included in the fail condition. -
Daniel Rudy about 9 years@JimBalter It is beyond the scope of the question, but I digress. I understand that the values used by the test and what is used in strncpy are two different ones. However, general coding practice says that the copy limit should be sizeof(buffer) - 1 so it doesn't matter what the length of str is on the copy. strncpy will stop copying bytes when it either hits a null or copies n bytes. The next line guarantees that the last byte in the buffer is a null char. The code is safe, I stand by my previous statement.
-
Friedrich about 9 years@ cmaster. In this case you are wrong. It is not solvable, because one always can write beyoned the bounds. Yes it's undefiened behaviour but there is not way to prevent it completely.
-
Friedrich about 9 years@Jim Balter. It does not matter. I potentially can write over the bounds of this local buffer and so it always will be possible to corrupt some other datastructures.
-
Jim Balter about 9 yearsNon sequitur. You seem to have forgotten what you wrote and what I was responding to.
-
Jim Balter about 9 years"general coding practice says" -- no it doesn't; there are other cases and this is one of them. "The code is safe" -- I didn't say otherwise. "I stand by my previous statement" -- I didn't challenge any previous statement. You, OTOH, paid no attention to what I wrote.
-
Friedrich about 9 yearsI don't think so. Yes the buffer is local but it's still possible to write over the bounds and so we an invoke undefined behaviour. There might be some data which can be overwritten. You sometimes even can find out what is overwrittern, if e.g there would be another buffer this overwritting might take place...
-
Daniel Rudy about 9 yearsI will admit, I had no idea what the DRY or WET principles were. I had to look them up. With that in mind, I can see why you were saying what you were saying. I'm going to put some more code up to illustrate this.
-
Jim Balter about 9 yearsYou're just repeating yourself. Again, you have forgotten what you wrote and what I responded to ... it was about
strncpy
not NUL-terminating the data, which is a completely different subject from overflowing the buffer. Now go repeat yourself again ... I won't respond further. -
Bogdan Alexandru about 9 yearsCan you please explain the security issues involved with calling
strlen
more than once? -
Patrick Roberts about 9 yearsWhile this is safe, since a NUL character is explicitly added to the end, buffer overflow can still technically occur since
strlen
is being assigned to anunsigned short
. I find it strange how this implementation arbitrarily allows strings of length[N * 65536, N * 65536 + 98]
(N being any non-negative integer) and terminates them atlen - N * 65536
. That problem could be resolved by changingunsigned short
tounsigned size_t
or even justsize_t
. -
supercat about 9 years@BogdanAlexandru: Once one has called
strlen
and taken some action based upon the value returned (which was presumably the reason for calling it in the first place), then a repeated call either (1) will always yield the same answer as the first one, in which case it's simply wasted work, or (2) may sometimes (because something else--perhaps another thread--modified the string in the meantime) yield a different answer, in which case code which does some things with the length (e.g. allocating a buffer) may assume a different size than code which does other things (copying to the buffer). -
ThunderPhoenix about 9 years@orlp I'm not agree with
len >= sizeof(buffer) / sizeof(buffer[0])
because buffer is just a pointer andsizeof(buffer)
will return the size of a char pointer. You should use directly the size with which it was declared inchar buffer[100];
that is to say 100. -
aldeb about 9 years@orlp: Could you please explain why comparing
len
to the expression is prefered to comparing the expression to100
? -
orlp about 9 years@Controll Because if you change the size of
buffer
at some point the expression automatically updates. This is critical for security, because the declaration ofbuffer
might be quite some lines away from the check in actual code. So it's easy to change the size of buffer, but forget to update in every location where the size is used. -
aldeb about 9 yearsThanks! I understood.There is another point i'm missing. How the code you proposed is supposed to prevent the buffer overflow? If it wasn't supposed to, then how can I prevent that?
-
orlp about 9 years@Controll The code above is assumed to be inside a function. The buffer overflow is prevented because if the check fails the function returns before it writes to the buffer.
-
aldeb about 9 yearsThanks for you answer! But how the fact of using
size_t
preventslen
from being wrapped around, since afaiksize_t
is atypedef
of anunsigned
(perhapsshort
)? -
orlp about 9 years
-
orlp about 9 years@JimBalter Because arrays in C(++) are confusing in that they decoy to pointers when passed into a function, even if the function argument looks like it's declared as an array. This confuses people into thinking that arrays are always just pointers, instead of only converting when passed into a function.
-
orlp about 9 years@JimBalter It's not stupidity - it's a lack of knowledge. Knowledge that is counter-intuitive.
-
Jim Balter about 9 years@Controll "size_t is a typedef of an unsigned (perhaps short)?" --
size_t
is defined to be a type that is big enough to hold the length of any object, so it can only beunsigned short
on machines with only 64K of memory. -
Jim Balter about 9 years@PatrickRoberts "since a NUL character is explicitly added to the end" -- NUL termination is irrelevant; the issue is buffer overflow. "buffer overflow can still technically occur" -- Wrong. It is completely impossible for this code to store outside of
buffer
-- that's what "buffer overflow" means. This is true for any value oflen
; it doesn't matter whetherlen
accurately reflects the length of the string -- that's an entirely different matter. (I don't endorse this code, which has several flaws, but it is safe.) -
aldeb about 9 years@JimBalter: I've nothing more to say than: yes, you're right :) It's really not likely that
size_t
would beunsigned short
-
orlp about 9 yearsAfter 130+ upvotes I'm surprised no one found the critical bug in my original code. The string would never be null-terminated!