What's the purpose of the rotate instructions (ROL, RCL on x86)?
Rotates are required for bit shifts across multiple words. When you SHL the lower word, the high-order bit spills out into the carry. To complete the operation, you need to shift the higher word(s) while bringing in the carry to the low-order bit. RCL is the instruction that accomplishes this.
                High word             Low word              CF
Initial         0110 1001 1011 1001   1100 0010 0000 1101   ?
SHL low word    0110 1001 1011 1001   1000 0100 0001 1010   1
RCL high word   1101 0011 0111 0011   1000 0100 0001 1010   0
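For readers who think in C rather than assembly, here is a minimal sketch of the same two-step shift. The `wide16` struct, its `cf` field (standing in for the carry flag), and the function name `shl1_wide` are all made up for illustration; real code would simply use a wider integer type where one is available.

```c
#include <stdint.h>

/* Hypothetical model: a 32-bit value held in two 16-bit words,
   plus a field that plays the role of the carry flag. */
typedef struct {
    uint16_t hi;
    uint16_t lo;
    unsigned cf;
} wide16;

/* Shift the whole value left by one bit, the way SHL + RCL would. */
static wide16 shl1_wide(wide16 v)
{
    /* SHL low word: the top bit spills into the carry. */
    v.cf = (v.lo >> 15) & 1u;
    v.lo = (uint16_t)(v.lo << 1);

    /* RCL high word: the old carry enters bit 0,
       and the high word's top bit becomes the new carry. */
    unsigned carry_in = v.cf;
    v.cf = (v.hi >> 15) & 1u;
    v.hi = (uint16_t)((v.hi << 1) | carry_in);

    return v;
}
```

Feeding in the initial values from the example (high word 0x69B9, low word 0xC20D) should give 0xD373 and 0x841A with the carry clear, matching the last row.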
ROL and ROR are useful for examining a value bit-by-bit in a way that is (ultimately) non-destructive. They can also be used to shunt a bitmask around without bringing in garbage bits.
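A rough C sketch of the "examine bit-by-bit, ultimately non-destructive" idea: rotate an 8-bit value left eight times, look at the bit that wrapped around each time, and the value ends up back where it started. The helper names `rol8` and `count_bits_by_rotation` are invented for this example.

```c
#include <stdint.h>

/* Rotate an 8-bit value left by one bit (what ROL r8, 1 does). */
static uint8_t rol8(uint8_t x)
{
    return (uint8_t)((x << 1) | (x >> 7));
}

/* Visit every bit, most significant first, by rotating.
   After 8 rotations *x holds its original value again. */
static unsigned count_bits_by_rotation(uint8_t *x)
{
    unsigned count = 0;
    for (int i = 0; i < 8; i++) {
        *x = rol8(*x);
        /* The bit that just wrapped around is now bit 0
           (in assembly it would also be in the carry flag). */
        count += *x & 1u;
    }
    return count;
}
```

With a plain shift instead of the rotate, the loop would still see every bit but would leave zero behind.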
The rotate opcodes (ROL, RCL, ROR, RCR) are used almost exclusively for hashing and CRC computations. They are pretty arcane and very rarely used.
The shift opcodes (SHL, SHR) are used for fast multiplication by powers of 2, or to move a low byte into a high byte of a large register.
The difference between ROL and SHL is ROL takes the high bit and rolls it around into the low bit position. SHL throws the high bit away and fills the low bit position with zero.
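To make the difference concrete, here is a small sketch on 8-bit values. The function names `shl8_by1` and `rol8_by1` are just illustrative stand-ins for a one-bit SHL and ROL.

```c
#include <stdint.h>

/* SHL by 1: the high bit is discarded, a zero enters at the bottom. */
static uint8_t shl8_by1(uint8_t x)
{
    return (uint8_t)(x << 1);
}

/* ROL by 1: the high bit wraps around into the low bit position. */
static uint8_t rol8_by1(uint8_t x)
{
    return (uint8_t)((x << 1) | (x >> 7));
}
```

For the input 1000 0001 (0x81), the shift yields 0000 0010 (0x02) while the rotate yields 0000 0011 (0x03).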
ROR and ROL are "historic" but still useful in a number of ways.
Before the 80386 (and opcode BT), ROL would be used a lot to test a bit: the bit rotated out lands in the carry flag, and unlike SHL the value is not destroyed. Actually, in the 8088, ROR/ROL would only shift by 1 bit at a time!
Also, if you want to shift one way and then the other way without losing the bits that have been shifted out of scope, you'd use ROR/ROL instead of SHR/SHL.
If I understand you correctly, your question is this:
"Given the fact that rotation instructions seem to be very special-purpose and not emitted by compilers, when are they actually used and why are they included in CPUs?".
The answer is twofold:
CPUs are not designed specifically to execute C programs. Rather, they are designed as general purpose machines, intended to solve a wide array of problems using a wide variety of different tools and languages.
The designers of a language are under no obligation to use every opcode in the CPU. In fact, most of the time, they do not, because some CPU instructions are highly specialized, and the language designer has no pressing need to use them.
More information about bitwise operators (and how they relate to C programming) can be found here: http://en.wikipedia.org/wiki/Bitwise_operation
Back when microprocessors were first created, most programs were written in assembly, not compiled. The majority of CPU instructions are probably not emitted by compilers (which is the impetus for creating RISC), but are often relatively easy to implement in hardware.
Many algorithms in graphics and cryptography use rotation, and their inclusion in CPUs makes it possible to write very fast algorithms in assembly.
Gratian Lup almost 2 years
I always wondered what's the purpose of the rotate instructions some CPUs have (ROL, RCL on x86, for example). What kind of software makes use of these instructions?
I first thought they may be used for encryption/computing hash codes, but these libraries are usually written in C, which doesn't have operators that map to these instructions. (Editor's note: see Best practices for circular shift (rotate) operations in C++ for how to write C or C++ that will compile to a rotate instruction. Also, optimized crypto libraries often do have asm for specific platforms.)
Has anybody found a use for them? Why were they added to the instruction set?
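Regarding the editor's note above: C has no rotate operator, but a commonly used idiom is sketched below. The function name `rotl32` is an assumption made for this example. Compilers such as GCC and Clang typically recognize this pattern and emit a single rotate instruction, though that depends on the compiler and optimization level, so it is worth checking the generated assembly.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits.
   Masking the counts keeps both shifts in the range 0..31, which
   avoids the undefined behavior of shifting by the full width. */
static inline uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> (-n & 31));
}
```

For example, `rotl32(0x12345678u, 8)` evaluates to `0x34567812`, and a count of 0 or 32 returns the input unchanged.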
Gabe over 12 years: I don't see how you answered the question.
Gabe over 12 years: When would you use rotation to test bits instead of
Rahul Das over 12 years: When you want to test them all and, perhaps, in order.
Rahul Das over 12 years: Or alternatively, when you don't have BT to begin with.
Gabe over 12 years: And the 8080 didn't even have shift instructions -- rotate was all you got!
Jeroen Wiert Pluimers over 12 years: maybe you can add the difference to ROL/RCL and ROR/RCR in your answer too.
phuclv about 10 years: Rotates are only effective when shifting only 1 bit
Assad Ebrahim over 9 years: Wouldn't CF be 0 after the third step? (the bit that goes off is set to CF and previous value of CF is inserted to the right-most position)
phuclv almost 9 years: "In assembly languages these instructions are represented by mnemonics such as ADD/SUB, ADC/SBC (ADD/SUB including carry), SHL/SHR (bit shifts), ROL/ROR (bit rotates), RCR/RCL (rotate through carry), and so on. The use of the carry flag in this manner enables multi-word add, subtract, shift, and rotate operations." en.wikipedia.org/wiki/Carry_flag
The Welder almost 8 years: Pretty arcane and rarely used? Really? There are many places where rotation is useful, especially in, as you say, hashing and cryptography. On many CPUs where the amount shifted affects time, it's actually faster to rotate and bitwise AND rather than doing a shift.
dthorpe almost 8 years: Yes, very rarely used. Hashing and crypto are things to be used from libraries, not something every developer should write for themselves.
ecm over 2 years: It is right that on the 8088 you can only rotate by one, if you use an immediate rotate count. However, the 8088 does support rotating by a count given in the register cl too. (Immediate byte shift/rotate counts other than 1 were added in the 186 instruction set.)
Peter Cordes over 2 years: Note that from a CPU design perspective (which instructions to provide), the relevant measure is how frequently it is (or would be) executed, not how many different pieces of software will contain the instruction. It's not that hard to emulate (unlike some special-purpose instructions like psadbw, which was added basically for video-encode motion-search), but OTOH it doesn't take much extra hardware to make a barrel shifter capable of rotating.
Peter Cordes almost 2 years: Many CPUs have a rotate-through-carry flag which you could use for equivalent purposes, if you're limited to shifting 1 bit at a time. That also enabled variable-count shifts across register boundaries using a loop, which wouldn't be possible with ROL without another shift (and NOT) to create those masks. Still, yes, interesting point for constant shift-counts on machines which have multi-bit shifts that are faster than looping but still slow. (Like 8086). However, you'd optimize to src_hi << 3 instead of ROL + mask, since the bits shifted out there aren't shifted into anything.
Aki Suihkonen almost 2 years: Yes, it's definitely worth it to have the destructive variants (arithmetic and logical shifting right / logical shift left) for those cases that need it; and indeed the last word benefits from those instructions. I suppose I wanted to extend the concept to really-multi-word shifting before editing.