Wrong format specifier in scanf("%d", unsigned short int) after gets(pointer) changes the char pointer's value, but why?


Solution 1

Your program has undefined behavior.

You need to tell scanf() that there's only room for a short integer. It receives nothing but an address, so the format string is the only way it can know how many bytes to store the number in.

Change to:

scanf("%hu", &temp);

Where h means "half", i.e. short, and u is for unsigned. Your failure to use the proper format conversion specifier caused undefined behavior, in which scanf() overwrote a neighboring variable in memory.
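As a minimal sketch of the corrected conversion, the helper below parses an unsigned short with "%hu" and checks the return value. It uses sscanf() on a string instead of scanf() on stdin only so that it is easy to try in isolation; the name parse_ushort is hypothetical and not part of the original program.

```c
#include <stdio.h>

/* Hypothetical helper: parses an unsigned short from text.
 * Returns 1 on success, 0 if no number could be converted. */
int parse_ushort(const char *text, unsigned short *out)
{
    /* "%hu" tells sscanf that *out is only as wide as an unsigned short,
     * so it stores exactly sizeof(unsigned short) bytes. */
    return sscanf(text, "%hu", out) == 1;
}
```

Checking that the call converted exactly one item also catches non-numeric input, which the original scanf() call silently ignored.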

Also, please note that gets() is very dangerous: it was deprecated and then removed from the language in C11, so please don't use it. Use the much better-behaved fgets() instead. And never scale an allocation by sizeof (char); it is always 1, so that's just a very hard-to-read way of writing * 1 which adds no value.
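A possible fgets()-based replacement for the gets() call might look like the sketch below. The helper name read_line is an assumption made for illustration; the key point is that fgets() is told the buffer size and therefore cannot write past it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into `buf`, storing at most
 * size - 1 characters, and strips the trailing newline if one was read.
 * Returns 1 on success, 0 on end of file or read error. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';  /* cut the string at the newline, if any */
    return 1;
}
```

In the question's program this would be called as read_line(stdin, pointer, 12), matching the 12 bytes that were allocated.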

Solution 2

Because in C nothing prevents you from writing beyond a particular variable's memory. A function like scanf only receives an address; knowing how many bytes you can safely write at that address is up to you, and not something the compiler is required to check.

A short int typically uses fewer bytes of memory than a regular int (commonly 2 bytes versus 4). You declared a short int, then asked scanf to write a normal int. scanf wrote beyond the memory reserved for temp and overwrote part of char *pointer, which happened to be located just after your short int. This is called undefined behavior because there is no knowing what you could be overwriting. The fact that pointer is located in memory right after temp is a coincidence.

pointer now points to an invalid memory address, and you get a segmentation fault when you try to access it.
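To make the overwrite concrete without actually invoking undefined behavior, the sketch below models memory as a plain byte array: the first sizeof(unsigned short) bytes stand in for temp, and the bytes after it stand in for whatever the compiler placed next (pointer, in the question). The function name and the layout are assumptions made for illustration; a real compiler is free to arrange variables differently.

```c
#include <stddef.h>
#include <string.h>

/* Counts how many bytes past a short-sized slot are modified when an
 * int-sized value is stored at the start of that slot. */
size_t bytes_clobbered_past_short(void)
{
    unsigned char memory[sizeof(int) + sizeof(void *)];
    int value = -1;  /* all bits set on two's complement machines */
    size_t i, clobbered = 0;

    memset(memory, 0, sizeof memory);

    /* Roughly what scanf("%d", ...) does: store sizeof(int) bytes at the
     * address it was given, regardless of how large the variable there is. */
    memcpy(memory, &value, sizeof value);

    /* Every non-zero byte beyond the short-sized slot was overwritten. */
    for (i = sizeof(unsigned short); i < sizeof memory; i++)
        if (memory[i] != 0)
            clobbered++;
    return clobbered;
}
```

On a platform where short is 2 bytes and int is 4, this returns 2: those are the two bytes that, in the question's program, belonged to pointer.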

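A pointer is, in practice, stored much like an integer that holds a memory address (its size depends on the platform and is not necessarily that of a long), so changing some of its bytes makes it refer to a different address.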

Author: clancy688

Updated on November 07, 2020

Comments

  • clancy688
    clancy688 over 3 years

    I've recently tried some C-programming and stumbled upon the following problem. I'm using NetBeans 7.4 64 IDE with MinGW 32 Bit. This is a short example code which highlights my problem:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {

        unsigned short int temp;
        char *pointer;
        pointer = malloc(12 * sizeof(char));

        printf("The pointers value is %d \n", (int)pointer);
        printf("Type a short string:\n");

        gets(pointer);

        printf("The pointers value is %d \n", (int)pointer);
        printf("Type an int: \n");

        //This line changes the char pointer to an apparently random value
        scanf("%d", &temp);

        //Segmentation fault upon this point
        printf("The pointers value is %d \n", (int)pointer);

        //And here as well
        free(pointer);

        return (EXIT_SUCCESS);
    }
    

    Until scanf everything is fine. The string read by gets is written into the memory that pointer points to. But AFTER scanf has been processed, pointer's value has changed so that it points at some arbitrary location. So not only is my string lost, but I also get segmentation faults when trying to access / free memory which doesn't belong to my program.

    The value change is apparently random. Each time I debug this program, the pointer changes to a different value.

    I've already deduced that the unsigned short int is at fault, or rather the wrong format specifier (%d instead of %hu) in my scanf. If I either change unsigned short int to int or use %hu as specifier, everything works fine. So there's the solution.

    But I'm still curious why and how the pointer's affected by this mistake. Can anyone help me there?

    • Sunil Bojanapally
      Sunil Bojanapally over 10 years
      Never use gets. It offers no protections against a buffer overflow vulnerability.
  • clancy688
    clancy688 over 10 years
    Thank you for your answer, but I already knew the %hu bit. What I'm interested in is how leaving out this specifier leads to a seemingly random change in a pointer that was used earlier.
  • clancy688
    clancy688 over 10 years
    Thank you very much! That's exactly what I wanted to know. :)
  • unwind
    unwind over 10 years
    @clancy688 Ah, I'm sorry for not making that clearer. I've edited now.
  • phenompbg
    phenompbg over 10 years
    @some It does, as does the size of a long. A pointer would be the same size as the memory address size of the CPU (so 64 bits for a 64-bit CPU). Usually a long is the same size, because a long is the same size as the "native" integer data type of the CPU. Usually.
  • greggo
    greggo about 10 years
    Note that it's not necessary to use 'h' when passing a short to printf, since the short will be converted to int as it is passed. For scanf, however, a pointer is passed to the location to write, so the function needs to be told exactly how large a value to write.
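The asymmetry described in the last comment can be sketched as follows. Both helper names are hypothetical, and the string-based snprintf()/sscanf() variants are used only so the example does not depend on console input.

```c
#include <stdio.h>

/* Output side: the short is promoted to int when passed through the
 * variadic argument list, so plain "%d" matches what actually arrives. */
int format_short(char *out, size_t size, short value)
{
    return snprintf(out, size, "%d", value);
}

/* Input side: sscanf receives only an address, so the 'h' modifier is
 * required to make it store a short-sized value there.
 * Returns 1 on success, 0 otherwise. */
int parse_short(const char *text, short *out)
{
    return sscanf(text, "%hd", out) == 1;
}
```

Formatting a short and parsing the result back with these two helpers round-trips the value, whereas using "%d" in parse_short would reintroduce the overwrite discussed in the answers.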