Right shift with zeros at the beginning

18,607

Solution 1

This is how C and binary arithmetic both work:

If you left shift 0xff << 3, you get binary: 00000000 11111111 << 3 = 00000111 11111000

If you right shift 0xff >> 3, you get binary: 00000000 11111111 >> 3 = 00000000 00011111

0xff is a (signed) int with the positive value 255. Since it is positive, the outcome of shifting it is well-defined behavior in both C and C++. It will not do any arithmetic shifts nor any kind or poorly-defined behavior.

#include <stdio.h>

int main()
{

  printf("%.4X %d\n", 0xff << 3, 0xff << 3);
  printf("%.4X %d\n", 0xff >> 3, 0xff >> 3);

}

Output:

07F8 2040
001F 31

So you are doing something strange in your program because it doesn't work as expected. Perhaps you are using char variables or C++ character literals.


Source: ISO 9899:2011 6.5.7.


EDIT after question update

int number = ~0; gives you a negative number equivalent to -1, assuming two's complement.

number = number << 4; invokes undefined behavior, since you left shift a negative number. The program implements undefined behavior correctly, since it either does something or nothing at all. It may print fffffff0 or it may print a pink elephant, or it may format the hard drive.

number = number >> 4; invokes implementation-defined behavior. In your case, your compiler preserves the sign bit. This is known as arithmetic shift, and arithmetic right shift works in such a way that the MSB is filled with whatever bit value it had before the shift. So if you have a negative number, you will experience that the program is "shifting in ones".

In 99% of all real world cases, it doesn't make sense to use bitwise operators on signed numbers. Therefore, always ensure that you are using unsigned numbers, and that none of the dangerous implicit conversion rules in C/C++ transforms them into signed numbers (for more info about dangerous conversions, see "the integer promotion rules" and "the usual arithmetic conversions", plenty of good info about those on SO).

EDIT 2, some info from the C99 standard's rationale document V5.10:

6.5.7 Bitwise shift operators

The description of shift operators in K&R suggests that shifting by a long count should force the left operand to be widened to long before being shifted. A more intuitive practice, endorsed by the C89 Committee, is that the type of the shift count has no bearing on the type of the result.

QUIET CHANGE IN C89

Shifting by a long count no longer coerces the shifted operand to long. The C89 Committee affirmed the freedom in implementation granted by K&R in not requiring the signed right shift operation to sign extend, since such a requirement might slow down fast code and since the usefulness of sign extended shifts is marginal. (Shifting a negative two’s complement integer arithmetically right one place is not the same as dividing by two!)

Solution 2

If you explicitly shift 0xff it works as you expected

cout << (0xff >> 3) << endl; // 31

It should be possible only if 0xff is in type of signed width 8 (char and signed char on popular platforms).


So, in common case:

You need to use unsigned ints

(unsigned type)0xff

right shift works as division by 2(with rounding down, if I understand correctly).

So when you have 1 as first bit, you have negative value and after division it's negative again.

Solution 3

The two kinds of right shift you're talking about are called Logical Shift and Arithmetic Shift. C and C++ use logical shift for unsigned integers and most compilers will use arithmetic shift for a signed integer but this is not guaranteed by the standard meaning that the value of right shifting a negative signed int is implementation defined.

Since you want a logical shift you need to switch to using an unsigned integer. You can do this by replacing your constant with 0xffU.

Solution 4

To explain your real code you just need the C++ versions of the quotes from the C standard that Lundin gave in comments:

int number = ~0;
number = number << 4;

Undefined behavior. [expr.shift] says

The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned type, the value of the result is E1 × 2E2, reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1×2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

number = ~0;
number = number >> 4;

Implementation-defined result, in this case your implementation gave you an arithmetic shift:

The value of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a non-negative value, the value of the result is the integral part of the quotient of E1/2E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined

You should use an unsigned type:

unsigned int number = -1;
number = number >> 4;
std::cout << std::hex << number << std::endl;

Output:

0x0fffffff

Solution 5

To add my 5 cents worth here... I'm facing exactly the same problem as this.lau! I've done some perfunctory research on this and these are my results:

typedef unsigned int Uint;
#define U31 0x7FFFFFFF
#define U32 0xFFFFFFFF

printf ("U31 right shifted: 0x%08x\n", (U31 >> 30));
printf ("U32 right shifted: 0x%08x\n", (U32 >> 30));

Output:
U31 right shifted: 0x00000001 (expected)
U32 right shifted: 0xffffffff (not expected)

It would appear (in the absence of anyone with detailed knowledge) that the C compiler in XCode for Mac OS X v5.0.1 reserves the MSB as a carry bit that gets pulled along with each shift.

Somewhat annoyingly, the converse is NOT true:-

#define ST00 0x00000001
#define ST01 0x00000002

printf ("ST00 left shifted: 0x%08x\n", (ST00 << 30));
printf ("ST01 left shifted: 0x%08x\n", (ST01 << 30));

Output:
ST00 left shifted: 0x40000000
ST01 left shifted: 0x80000000

I concur completely with the people above that assert that the sign of the operand has no bearing on the behaviour of the shift operator.

Can anyone shed any light on the specification for the Posix4 implementation of C? I feel a definitive answer may rest there.

In the meantime, it appears that the only workaround is a construct along the following lines;-

#define CARD2UNIVERSE(c) (((c) == 32) ? 0xFFFFFFFF : (U31 >> (31 - (c))))

This works - exasperating but necessary.

Share:
18,607
laurent
Author by

laurent

Updated on June 04, 2022

Comments

  • laurent
    laurent almost 2 years

    I'm trying to do a kind of left shift that would add zeros at the beginning instead of ones. For example, if I left shift 0xff, I get this:

    0xff << 3 = 11111000
    

    However, if I right shift it, I get this:

    0xff >> 3 = 11111111
    

    Is there any operation I could use to get the equivalent of a left shift? i.e. I would like to get this:

    00011111
    

    Any suggestion?

    Edit

    To answer the comments, here is the code I'm using:

    int number = ~0;
    number = number << 4;   
    std::cout << std::hex << number << std::endl;
    
    number = ~0;
    number = number >> 4;
    std::cout << std::hex << number << std::endl;
    

    output:

    fffffff0
    ffffffff
    

    Since it seems that in general it should work, I'm interested as to why this specific code doesn't. Any idea?

  • UmNyobe
    UmNyobe over 11 years
    I upvoted, but really just put a little more text. like why it make the difference.
  • Lundin
    Lundin over 11 years
    I don't understand this answer at all. 0xff is equivalent to writing (signed int)255. There is nothing in the original post setting any sign bits. 0xFF >> 3 is 0x1F, the positive number 31.
  • Pubby
    Pubby over 11 years
    Right shift does not work as division! The rounding is different.
  • Michael Burr
    Michael Burr over 11 years
    Note that you only have to do this for values of 0xff only if it's in a signed char type - otherwise 0xff has a non-negative value (255). I think this should be explicitly noted because I've seen people confused by thinking that 0xff has a negative value in contexts where it doesn't.
  • Lundin
    Lundin over 11 years
    No, C and C++ invokes implementation-defined behavior when you right shift a negative number. There are no guarantees that the result makes any sense or that it can be represented as a correctly formatted negative number. If you left shift a negative number, you invoke undefined behavior.
  • Jack Aidley
    Jack Aidley over 11 years
    Checks standard. Yes, you're correct, the type of shift is implementation defined. It's just usually an arithmetic shift. I've updated my answer.
  • Lundin
    Lundin over 11 years
    @Chethan Yes, I've posted an answer of my own. As a rule of thumb, when you use bit-wise operators, C does not know or care of the programmer's intended representation of the contents. It just manipulates the bits. For example, whether the result of right shift of a signed number happens to end up in a way that makes sense or not, depends on the compiler.
  • Lundin
    Lundin over 11 years
    The C11 and C++11 standards seem 100% identical in their description of the shift operators.
  • Steve Jessop
    Steve Jessop over 11 years
    @Lundin: yes, I didn't mean to imply the rules (or even the text) are different. It's just that the questioner has since explained in a comment that they're using C++, previously it was unclear.
  • Chethan
    Chethan over 11 years
    @Lundin, Yes, the lang says so. But why? a. Why are they UB/ID? and b. Why are they not both UB / ID? What is it about signed numbers with nagative values?
  • Lundin
    Lundin over 11 years
    @Chethan I can just speculate, but it is likely because of the underlying hardware instructions on assembler level. Right shift on a negative number will default to the assembler instruction for arithmetic right shift. This can be executed as expected, because it only makes the number smaller. The data loss is expected and predicted. But if you left shift, the assembler instruction for arithmetic left shift will "shift out" bits into a carry bit in a CPU register, which you can't reach from C.
  • Lundin
    Lundin over 11 years
    @Chethan So the C compiler could either just discard this carry bit, or it could do something fancy with it. What the programmer expects isn't obvious. On two's complement systems, the number would change into something completely different, data would get shifted into the sign bit, it just doesn't make any sense.
  • Lundin
    Lundin over 11 years
    @Chethan I found a bit of info from the standard's rationale document, post updated above. I know too little of the initial K&R de facto standard to comment it, however. Perhaps someone else reading it could enlighten us both.