Can someone explain ARM bitwise operations to me?

arm bit-manipulation bit-shift

22,627

Solution 1

Truth tables, two inputs, the two numbers on the left and one output, the number on the right:

the left two inputs a and b represent the four possible combinations of inputs, no more no less that is the list.

Consider a 1 to mean true and 0 to mean false. And the word OR in this case means if a OR b is true then c is true. And as you see in the table, horizontally if either a or b is true then c is true.

AND

And means they both have to be true if a AND b are both true then c is true. There is only one case where that exists above.

Now take two bytes 0x12 and 0x34 which in decimal are 18 and 52 but we dont really care much about decimal. we care about binary 0x12 is 0b00010010 and 0x34 is 0b00110100. The bitwise operators like AND and OR and XOR in assembly language mean you take one bit from each operand and that gives the result in the same bit location. Its not like add where you have things like this plus that equals blah carry the one.

so we line up the bits

0b00010010    0x12
0b00110100    0x34

So tilt your head sidways like you are going to take a bite out of a taco held in your left hand and visualize the truth table above. If we look at the two bits on the right they are 0 and 0, the next two bits are 1 and 0 and so on. So if we wanted to do an OR operation, the rule is if either a or b is true then c, the result, is true

   0b00010010
   0b00110100
OR ==========
   0b00110110

Head tilted to the right, least significant bit (the bit in the ones column in the number) 0 or 0 = 0, neither one is set. next column (the twos column) 1 or 0 = 1 at least one is true. and so on so

0x12 OR 0x34 = 0x36

In arm assembly that would be

mov r0,#0x12
mov r1,#0x34
orr r2,r0,r1

after the or operation r2 would hold the value 0x36.

Now lets and those numbers

    0b00010010
    0b00110100
AND ==========
    0b00010000

Remembering our truth table and the rule both a and b have to be true (a 1) we tilt our head to the right, 0 and 0 is 0, both are not true. and by inspection only one column has both inputs with a 1, the 16s column. this leaves us with 0x12 AND 0x34 = 0x10

In arm assembly that would be

mov r0,#0x12
mov r1,#0x34
and r2,r0,r1

Now we get to the BIC instruction. Which stands for bitwise clear, which hopefully will make sense in a bit. Bic on the arm is a anded with not b. Not is another truth table, but only one input and one output

NOT

a  c
0  1
1  0

With only one input we have only two choices 0 and 1, 1 is true 0 is false. NOT means if not a then c is true. when a is not true c is true, when a is true c is not true. Basically it inverts.

What the bic does is have two inputs a and b, the operation is c = a AND (NOT b) so the truth table for that would be:

a AND (NOT b)

I started with the AND truth table then then NOTted the b bits, where b was a 0 in the AND truth table I made it a 1 where b was a 1 in the AND truth table I made it a 0.

So the bic operation on 0x12 and 0x34 is

    0b00010010
    0b00110100
BIC ==========
    0b00000010

Why is it called bit clear? Understanding that makes it much easier to use. If you look at the truth table and think of the first and second inputs. Where the second, b, input is a 1 the output is 0. where the second input, b, is a 0, the output is a itself unmodified. So what that truth table or operation is doing is saying anywhere b is set clear or zero those bits in A. So if I have the number 0x1234 and I want to zero the lower 8 bits, I would BIC that with 0x00FF. And your next question is why not AND that with 0xFF00? (analyze the AND truth table and see that wherever b is a 1 you keep the a value as is, and wherever b is a 0 you zero the output). The ARM uses 32 bit registers, and a fixed 32 bit instruction set, at least traditionally. The immediate instructions

mov r0,#0x12

In arm are limited to 8 non-zero bits shifted anywhere within the number, will get to shifting in a bit. So if I had the value 0x12345678 and wanted to zero out the lower 8 bits I could do this

; assume r0 already has 0x12345678
bic r0,r0,#0xFF

; assume r0 already has 0x12345678
mov r1,#0xFF000000
orr r1,r1,#0x00FF0000
orr r1,r1,#0x0000FF00
;r1 now contains the value 0xFFFFFF00
and r0,r0,r1

; assume r0 already contains 0x12345678
ldr r1,my_byte_mask
and r0,r0,r1
my_byte_mask: .word 0xFFFFFF00

which is not horrible, compared to using a move and two orrs, but still burns more clock cycles than the bic solution because you burn the extra memory cycle reading my_byte_mask from ram, which can take a while.

; assume r0 already contains 0x12345678
mvn r1,#0xFF
and r0,r0,r1

This last one being not a bad compromize. note that mvn in the arm documentation is bitwise not immediate, that means rx = NOT(immediate). The immediate here is 0xFF. NOT(0xFF) means invert all the bits, it is a 32 bit register we are going to so that means 0xFFFFFF00 is the result of NOT(0xFF) and that is what the register r1 gets, before doing the and.

So that is why bic has a place in the ARM instruction set, because sometimes it takes fewer instructions or clock cycles to mask (mask = AND used to make some bits zeros) using the bic instruction instead of the and instruction.

I used the word mask as a concept to make bits in a number zero leaving the others alone. orring can be thought of as making bits in a number one while leaving the others alone, if you look at the OR truth table any time b is a 1 then c is a 1. So 0x12345678 OR 0x000000FF results in 0x123456FF the bits in the second operand are set. Yes it is also true that anytime a is set in the OR truth table then the output is set, but a lot of the time when you use these bitwise operations you have one operand you want to do something to, set a certain number of bits to one without modifying the rest or set a certain number of bits to zero without modifying the rest or you want to zero all the bits except for a certain number of bits. When used that way you have one operand coming in which is what you want to operate on and you create the second operand based on what you want the overall effect to be, for example in C if we wanted to keep only the lower byte we could have a one parameter in, one parameter out function:

unsigned int keep_lower_byte ( unsigned int a )
{
    return(a&(~0xFF));
}

~ means NOT so ~0xFF, for 32 bit numbers means 0xFFFFFF00 then & means AND, so we return a & 0xFFFFFF00. a was the only real operand coming in and we invented the second one based on the operation we wanted to do...Most bitwise operations you can swap the operands in the instruction and everything turns out okay, instructions like ARM's bic though the operands are in a certain order, just like a subtract you have to use the correct order of operands.

Shifting...there are two kinds, logical, and arithmetic. logical is easiest and is what you get when you use >> or << in C.

Start with 0x12 which is 0b00010010. Shifting that three locations to the left (0x12<<3) means

00010010 < our original number 0x12
0010010x < shift left one bit location
010010xx < shift left another bit location
10010xxx < shift left a third bit location

What bits get "shifted in" to the empty locations, the x'es above, varies based on the operation. For C programming it is always zeros:

00010010 < our original number 0x12
00100100 < shift left one bit location
01001000 < shift left another bit location
10010000 < shift left a third bit location

But sometimes (usually every instruction set supports a rotate as well as a shift) there are other ways to shift and the differences have to do with what bit you shift into the empty spot, and also sometimes the bit you shifted off the end doesnt always just disappear sometimes you save that in a special bit holder location.

Some instruction sets only have a single bit shift meaning for each instruction you program you can only shift one bit, so the above would be 3 instructions, one bit at a time. Other instruction sets, like arm, allow you to have a single instruction and you specify in the instruction how many bits you want to shift in that direction. so a shift left of three

mov r0,#0x12
mov r3,r0,lsl#3 ; shift the contents of r0 3 bits to the left and store in r3

This varying of what you shift in is demonstrated between lsr and asr, logical shift right and arithmetic shift right (you will see that there is no asl, arithmetic shift left because that makes no sense, some assemblers will allow you to use an asl instruction but encode it as a lsl).

A LOGICAL shift right:

00010010 - our original number 0x12
x0001001 - shifted right one bit
xx000100 - shifted right another bit
xxx00010 - shifted right another bit

As with C there is a version that shifts in zeros, that is the logical shift right, shifting in zeros

00010010 - our original number 0x12
00001001 - shifted right one bit
00000100 - shifted right another bit
00000010 - shifted right another bit

ARITHMETIC shift right means preserve the "sign bit" what is the sign bit? that gets into twos complement numbers which you also need to learn if you have not. Basically if you consider the bit pattern/value to be a twos complement number then the most significant bit, the one on the left, is the sign bit. if it is 0 the number is positive and 1 the number is negative. You may have noticed that a shift left by one bit is the same as multiplying by 2 and a shift right is the same as dividing by 2. 0x12 >> 1 = 0x9, 18 >> 1 = 9 but what if we were to shift a minus 2 to the right one, minus two is 0xFE using bytes or 0b11111110. using the C style logical shift right 0xFE >> 1 = 0x7F, or in decimal -2 >> 1 = 0x127. We cannot solve that in C in a single operation, unfortunately, but in assembly we can using an arithmetic shift, assuming your instruction set has one, which the arm does

ARITHMETIC shift right

s1100100 - our starting value s is the sign bit whatever that is 0 or 1
ss110010 - one shift right
sss11001 - another shift right
ssss1100 - another shift right

So if the sign bit s was a 0 when we started, if the number was 01100100 then

01100100 - our starting value
00110010 - one shift right
00011001 - another shift right
00001100 - another shift right

but if that sign bit had been a one

11100100 - our starting value
11110010 - one shift right
11111001 - another shift right
11111100 - another shift right

And we can solve the 0xFE shifted right one:

11111110 - 0xFE a minus 2 in twos complement for a byte
11111111 - shifted right one

so in pseudo code 0xFE ASR 1 = 0xFF, -2 ASR 1 = -1. -2 divided by 2 = -1

The last thing you need to read up on your own has to do with rotates and/or what happens to the bit that is shifted off the end. a shift right the lsbit is shifted "off the end" of the number like blocks being slid of a table and the one that falls off might just go into the "bit bucket" (ether, heaven or hell, one of these places where bits go to die when they disappear from this world). But some instructions in some instruction sets will take that bit being shifted off and put it in the Carry flag (read up on add and subtract), not because it is a carry necessarily but because there are status bits in the alu and the Carry bit is one that kinda makes sense. Now what a rotate is, is lets say you had an 8 bit processor and you rotated one bit, the bit falling off the end lands in the Carry bit, AND the bit shifting in the other side is what was in the carry bit before the operation. Basically it is musical chairs, the bits are walking around the chairs with one person left standing, the person standing is the carry bit, the people in chairs are the bits in the register. Why is this useful at all? lets say we had an 8 bit processor like the Atmel AVR for example but wanted to do a 64 bit shift. 64 bits takes 8, 8 bit, registers, say I have my 64 bit number in those 8 registers and I want to do a 64 bit shift left one bit. I would start with the least significant byte and do an lsl which shifts a zero in but the bit shifting out goes into the carry bit. then the next most significant byte I do a rol, rotate left one bit, the bit coming in is the bit going out of the prior byte and the bit going out goes to the carry bit. I repeat the rol instruction for the other bytes, looking at a 16 bit shift:

00100010 z0001000 - our original number
00100010 z 0001000 - lsl the least significant byte, the ms bit z is in carry
0100010z 00010000 - rotate left the most significant byte pulling the z bit from carry

00100010z0001000 - if it had been a 16 bit register
0100010z00010000 - a logical shift left on a 16 bit with a zero coming in on the left

that is what the rotates are for and that is why the assembly manual bothers to tell you what flags are modified when you perform a logical operation.

Solution 2

I'll do the first one and then maybe you can try and work out the rest using a similar approach:

/** LSL **/
mov r0, #1            ; r0 = 0000 0000 0000 0000 0000 0000 0000 0001
mov r3, r0, LSL#10    ; r3 = r0 logically shifted left by 10 bit positions 
                           = 0000 0000 0000 0000 0000 0100 0000 0000
                                                       ^           ^
                                                       +<<<<<<<<<<<+
                                                     shift left 10 bits

Note however that if you don't yet understand boolean operations such as OR (|), AND (&), etc, then you will have a hard time understanding the corresponding ARM instructions (ORR, AND, etc).

22,627

Author by

Kristina Brooks

I do stuff. Mostly interested in low level things. Some notable ones: Open source firmware for Raspberry Pi's VPU (VideoCore4): https://github.com/christinaa/rpi-open-firmware Writing an LLVM backend for VideoCore4: https://github.com/christinaa/LLVM-VideoCore4 Porting XNU/Darwin to various ARMv7 boards (for example, the OMAP3 base BeagleBoard) Implementing a Mach-like IPC system inside the Linux kernel

Updated on February 07, 2020

Comments

Kristina Brooks about 4 years
Can someone explain ARM bit-shifts to me like I'm five? I have a very poor understanding of anything that involves non-decimal number systems so understanding the concepts of bit shifts and bitwise operators is difficult for me.

What would each of the following cases do and why (what would end up in R3 and what happens on behind the scenes on the bit level)?
```
/** LSL **/
mov r0, #1
mov r3, r0, LSL#10

/** LSR **/
mov r0, #1
mov r3, r0, LSR#10

/** ORR **/
mov r0, #1
mov r1, #4
orr r3, r1, r0

/** AND **/
mov r0, #1
mov r1, #4
and r3, r1, r0

/** BIC **/
mov r0, #1
mov r1, #4
bic r3, r1, r0
```
PS. Do not explain it in terms of C bitwise operators. I don't know what they do either (the >>, <<, |, & ones).
Mikhail Kalashnikov over 8 years

Wow, your answer impressed me!
71GA about 6 years

This is a really good answer! Could we ask admin to save this one?
71GA about 6 years

@old_timer Would you maybe know why I get unshifted register required -- bic r0,r0,#0x3 during compilation for Thumb when I use your BIC syntax? Is this still a bug from 2007? gcc.gnu.org/bugzilla/show_bug.cgi?id=34436
old_timer about 6 years

@71GA if you look at the armv7-m ARM ARM you see that the only bic immediate is a thumb2 extension. There is a 16 bit bic register encoding (as well as a thumb2 extension). So if you specify the unified syntax (.syntax unified) then it will be happy. I personally dont want to switch to the unified, and used to be able to get it to generate stuff like this, but have to specify it if/when I need that syntax. Its because you are set for traditional thumb and asking for a thumb2 instruction.
71GA about 6 years

So I use this directive right before the BIC assembly command? Would this also solve for example immediate: 1024 is out of range?I thought of this because it compiles to 32-bit Thumb2 encoding and more space is then left for immediate values right?
old_timer about 6 years

use it up at the top along with .cpu if you want to use .cpu and with .thumb, yes it gives you a little more flexibility on immediates over stock thumb.