How can I use bit shifting to replace integer division?
Solution 1
The best approach is to let the compiler do it for you. You simply write a/b in your language of choice, and the compiler generates the bit twiddling.
EDIT (I hope you don't mind, I'm adding reinforcement to your answer):
#include <stdio.h>
int main(int argc, char **argv) {
printf("%d\n", argc/4);
}
Obviously, the fastest thing to do is argc>>2. Let's see what happens:
.file "so3.c"
.section .rodata
.LC0:
.string "%d\n"
.text
.globl main
.type main, @function
main:
pushl %ebp
movl %esp, %ebp
andl $-16, %esp
subl $16, %esp
movl 8(%ebp), %eax
movl %eax, %edx
sarl $31, %edx
shrl $30, %edx
leal (%edx,%eax), %eax
sarl $2, %eax
movl %eax, %edx
movl $.LC0, %eax
movl %edx, 4(%esp)
movl %eax, (%esp)
call printf
leave
ret
.size main, .-main
.ident "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
.section .note.GNU-stack,"",@progbits
yup, there it is, sarl $2, %eax
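The instructions around that sarl matter too: for a negative dividend a plain shift rounds toward negative infinity, while C division rounds toward zero, so the compiler first adds a bias of 3. A minimal sketch of the same sequence in C, assuming that >> on a negative int32_t is an arithmetic shift (true for GCC and Clang, but implementation-defined in standard C); the name div4_shift is made up for illustration:

```c
#include <stdint.h>

/* Hypothetical helper mirroring the listing: signed x / 4 without a divide.
   Assumes >> on a negative int32_t is an arithmetic shift. */
int32_t div4_shift(int32_t x) {
    int32_t sign = x >> 31;                          /* sarl $31: 0 or -1 */
    int32_t bias = (int32_t)((uint32_t)sign >> 30);  /* shrl $30: 0 or 3 */
    return (x + bias) >> 2;                          /* leal, then sarl $2 */
}
```

With this, div4_shift(-5) gives -1, the same as -5 / 4 in C, whereas -5 >> 2 alone would give -2.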
EDIT 2 (Sorry to pile on, but 20/19 is a bit more complicated…)
I just substituted argc*20/19 for argc/4 and this is the math that comes out:
0000000100000f07 shll $0x02,%edi
0000000100000f0a movl $0x6bca1af3,%edx
0000000100000f0f movl %edi,%eax
0000000100000f11 imull %edx
0000000100000f13 sarl $0x03,%edx
0000000100000f16 sarl $0x1f,%edi
0000000100000f19 subl %edi,%edx
So, the process is
- Multiply input by 4 (shll)
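- Load (movl 0x...) and multiply by (imull) a fixed-point fraction (0x6bca1af3 is 2^35/19 rounded up), obtaining a 64-bit result from 32-bit operands: high half in %edx, low half in %eax
- Divide the high-order 32 bits of the result by 8 (sarl $0x03); the arithmetic shift is what handles negative numbers
- Shift the multiplied input in %edi right by 31 bits (sarl $0x1f) to obtain 0 for a non-negative input or -1 for a negative one
- Correct the high-order result by adding 1 (subtracting -1) when the input is negative, so the quotient rounds toward zero.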
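The divide-by-19 part of that sequence can be sketched in C as follows. This is only an illustration of the steps in the listing, not compiler source: the function name is made up, the constant and the shift counts are copied from the disassembly, and it assumes arithmetic right shift of negative signed values (GCC and Clang behave this way).

```c
#include <stdint.h>

/* Hypothetical helper: signed x / 19 using the constant from the listing.
   Assumes >> on negative signed values is an arithmetic shift. */
int32_t div19_magic(int32_t x) {
    int64_t product = (int64_t)x * 0x6bca1af3;  /* imull: full 64-bit product */
    int32_t high = (int32_t)(product >> 32);    /* high half, as left in %edx */
    int32_t q = high >> 3;                      /* sarl $0x03 */
    return q - (x >> 31);                       /* subtract 0 or -1 */
}
```

For example, div19_magic(-19) works out to -1 and div19_magic(18) to 0, matching C's truncating division.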
Solution 2
It makes no sense because what you are trying to do does not optimise the resulting process!!!
Hey, I did not read anywhere in your question that you had any intention to optimise.
Electrical Engg people never stop being curious regardless of "usefulness". We are like the obsessive-compulsive hoarders you read about in the news, who stack their attics, cellars, bedrooms and living rooms with junk which they believe will come in handy one day. At least that was the case when I was in Engg school a little less than 30 years ago. I encourage you to continue in your quest to hoard up "useless" knowledge that appears to have little possibility of optimising your life or lifestyle. Why depend on the compiler when you can do it with a hand-coded algorithm?! Yah? Be a little adventurous, you know. OK, enough dissing people who express disdain at your pursuit of knowledge.
Recall the way you were taught to do division in middle school? For example, 437/24:
    _____
 24|437

     018
    -----
 24|437
    24
    -----
    197
    192
    -----
      5
The number which is subject to division, 437, is called the dividend. 24 is the divisor, the result 18 is the quotient, and 5 is the remainder. Like when you file your taxes, you need to fill in profits you had gained from stock "dividends", which is a misnomer. What you fill into the tax form is a multiple of the quotient of a single huge chunk of dividend. You did not receive the dividend, but portions of dividend - otherwise, it would mean you owned 100% of the stock.
       ___________
 11000|110110101

       000010010
       -----------
 11000|110110101
       11000
       ----------
       000110101   remainder=subtract divisor from dividend
        11000000   shift divisor right and append 0 to quotient until
         1100000   divisor is not greater than remainder.
          110000   Yihaa!
       ----------
          000101   remainder=subtract shifted divisor from remainder
           11000   shift divisor right and append 0 to quotient until
            1100   divisor is not greater than remainder.
       ----------
                   oops, cannot shift anymore.
The above, as you might already know, is TRUE division, which is achieved by repeatedly subtracting a shifted divisor.
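The shift-and-subtract procedure worked by hand above can be sketched in C like this. It is a minimal illustration for 32-bit unsigned values, and the function name and the out-parameter are assumptions of the sketch, not anything standard:

```c
#include <stdint.h>

/* Hypothetical helper: long division by subtracting a shifted divisor.
   The divisor must be nonzero; the remainder is returned via *remainder. */
uint32_t shift_subtract_divide(uint32_t dividend, uint32_t divisor,
                               uint32_t *remainder) {
    uint64_t shifted = (uint64_t)divisor << 31; /* divisor aligned to top quotient bit */
    uint64_t rem = dividend;
    uint32_t quotient = 0;
    for (int bit = 31; bit >= 0; bit--) {
        quotient <<= 1;            /* append a 0 to the quotient */
        if (shifted <= rem) {      /* shifted divisor fits: subtract it */
            rem -= shifted;
            quotient |= 1u;        /* and turn the appended 0 into a 1 */
        }
        shifted >>= 1;             /* shift the divisor right */
    }
    *remainder = (uint32_t)rem;
    return quotient;
}
```

Calling it with 437 and 24 yields quotient 18 and remainder 5, as in the worked example.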
What you want is to achieve the same thing by simply shifting the dividend. That, unfortunately, cannot be done unless the divisor is a power of 2 (2, 4, 8, 16), which is an obvious fact of binary arithmetic. Or, at least, I am not aware of any method that can do it without approximation and interpolative techniques.
Therefore, you have to use a combination of dividend shift and true division. e.g.
24 = 2 x 2 x 2 x 3
First, divide 437 by 8 using a binary shift to get 110110 (54), and then use true division to divide by 3:
    010010
   --------
11|110110
   11
   -------
     011
      11
   -----
       0
which works out to 010010 = 18.
Voila.
How do you determine 24 = 2^3 x 3?
By shifting 11000 rightwards until the lowest bit is a 1.
Which means, you could shift the dividend the same number of times as you would shift the divisor until the divisor hits a 1.
Therefore, obviously, this method would not work if a divisor is odd. e.g., it will not work for divisor 25, but it will work a little for divisor 50.
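A small sketch of that combination in C: strip the factors of 2 from the divisor with shifts, apply the same shifts to the dividend, and leave only the odd part for true division. The function name is made up, and returning 0 for a zero divisor is just a placeholder choice for the sketch.

```c
#include <stdint.h>

/* Hypothetical helper: shift out the power-of-2 part of the divisor,
   then do one true division by the remaining odd factor. */
uint32_t divide_shift_then_true(uint32_t dividend, uint32_t divisor) {
    if (divisor == 0) {
        return 0;                  /* placeholder: division by zero is undefined */
    }
    while ((divisor & 1u) == 0) {  /* until the divisor's lowest bit is a 1 */
        divisor >>= 1;
        dividend >>= 1;            /* shift the dividend the same number of times */
    }
    return dividend / divisor;     /* true division by the odd part (3 for 24) */
}
```

For 437 and 24 this shifts three times (437 becomes 54, 24 becomes 3) and then computes 54 / 3 = 18. For an odd divisor such as 25 the loop never runs and the whole job falls to the true division.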
Maybe there are predictive methods that could interpolate a divisor like 13 to be between 2^3=8 and 2^4=16. If there are, I am not familiar with them.
What you need to explore is using a number series. For example dividing by 25:
1/25 = 1/16 - 1/64 - 1/128 + 1/256 - ... until the precision you require.
where the general form of the series is
1/D = 1/2^k + b1/2^(k+1) + ... + bn/2^(k+n)
where bn is either -1, 0 or +1.
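As a rough sketch of how the first four terms of the 1/25 series turn into shifts (the function name is made up, and this is an approximation only, since the series is truncated and every shift drops fractional bits):

```c
#include <stdint.h>

/* Hypothetical helper: approximate n / 25 with the first four series terms,
   n/16 - n/64 - n/128 + n/256. Not exact; more terms give more precision. */
uint32_t approx_div25(uint32_t n) {
    return (n >> 4) - (n >> 6) - (n >> 7) + (n >> 8);
}
```

For n = 1000 this gives 62 - 15 - 7 + 3 = 43, against an exact quotient of 40, which shows how far four terms are from "the precision you require".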
I hope my binary manipulation above has no errors or typos. If it does, a thousand apologies.
Solution 3
Suppose you have the expression a = b / c. As hroptatyr mentioned, multiplication is quite fast (and much faster than division). So the basic idea is to transform the division into a multiplication: a = b * (1/c).
Now, we still need a division to compute the reciprocal 1/c, so this works only if c is known a priori. While that is enough for floating-point computation, for integers we have to use another trick: in place of the reciprocal of c we use the value some_big_number / c, so that we finally compute a2 = b * (some_big_number / c), which is equal to some_big_number * b/c (up to integer rounding). Because we're interested in the value of b/c, we have to divide the final result by some_big_number. If it's chosen to be a power of 2, then the final division is fast.
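ex:
// we'll compute 1/20 of the input
// (with this 16-bit reciprocal the result is exact only for n up to 16398,
// and n * reciprocal overflows a 32-bit unsigned for n above about 1.3 million)
unsigned divide_by_20(unsigned n){
    unsigned reciprocal = (0x10000 + 20 - 1) / 20; // computed at compile time, but you can precompute it manually, just to be sure
    return (n * reciprocal) >> 16;
}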
EDIT: a good part of this method is that you can choose any rounding method for the division by choosing the correction (in this case it was 20 - 1 for rounding towards zero).
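With a 16-bit some_big_number the reciprocal is too coarse for large inputs. A sketch of the same idea with a bigger power of 2 and a 64-bit intermediate product, which should cover every 32-bit unsigned input; the function name is made up, and the constant 0xCCCCCCCD (2^34 / 5 rounded up) together with the shift of 36 is my choice for the sketch, not part of the answer above:

```c
#include <stdint.h>

/* Hypothetical variant: n / 20 for any 32-bit unsigned n.
   0xCCCCCCCD is 2^34 / 5 rounded up; shifting by 34 divides by 5,
   and two more bits divide by the remaining factor of 4. */
uint32_t divide_by_20_wide(uint32_t n) {
    uint64_t product = (uint64_t)n * 0xCCCCCCCDu; /* cannot overflow 64 bits */
    return (uint32_t)(product >> 36);
}
```

The design trade-off is the usual one: a larger power of 2 buys precision but needs a wider multiply.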
Solution 4
If you are interested in the math behind it, read Hacker's Delight by Henry S. Warren.
If you are interested in optimized code, just write what is easiest for humans to read. For example:
int five_percent(int x) {
return x / 20;
}
When you compile this function using g++ -O2, it will not do an actual division but some magic multiplication, bit-shifting and correction instead.
Solution 5
You can't do everything with shifts; you will instead need to use 'magic' divisors (see Hacker's Delight). Magic division works by multiplying a number by another suitably large number and keeping the high part of the product, which yields the quotient (mul/imul is faster than div/idiv). The magic constant depends on the odd part of the divisor; multiplying the divisor by a power of 2 only requires an extra shift. E.g., unsigned division by 3 can be computed (on 32 bit) as the high 32 bits of x * 0xAAAAAAAB shifted right by 1; division by 6 shifts by one more bit, division by 12 by two more, 24 by three more, etc. (it's the geometric series 3 * (2 ^ x), where 0 <= x < 32).
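A sketch of the 0xAAAAAAAB trick in C, written with an explicit 64-bit product so the high half is not lost (taking the high 32 bits is a shift by 32, so the total shift is 33 for a divisor of 3 and one more for each extra factor of 2). The function names are made up for illustration:

```c
#include <stdint.h>

/* 0xAAAAAAAB is 2^33 / 3 rounded up. */
uint32_t magic_div3(uint32_t x) {
    return (uint32_t)(((uint64_t)x * 0xAAAAAAABu) >> 33);
}

/* Same constant, one extra shift per extra factor of 2 in the divisor. */
uint32_t magic_div6(uint32_t x) {
    return (uint32_t)(((uint64_t)x * 0xAAAAAAABu) >> 34);
}

uint32_t magic_div12(uint32_t x) {
    return (uint32_t)(((uint64_t)x * 0xAAAAAAABu) >> 35);
}
```

The cast to uint64_t is the important part: a plain 32-bit x * 0xAAAAAAAB keeps only the low half of the product, which equals x / 3 only when x is an exact multiple of 3.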
glutz78
Updated on July 07, 2020
Comments
-
glutz78 almost 4 years
I understand how to do it for powers of 2 so that's not my question.
For example, if I want to find 5% of a number using a bit shift instead of an integer divide, how would I calculate that?
So instead of (x * 20 / 19), I could do (x * 100 >> 11). Now this isn't right but it's close and I arrived at it using trial and error. How would I determine the most precise possible shift to use?
-
Potatoswatter over 13 years
That makes no sense. Multiplication by 5 includes any cost of shifting << 2. The object here is to multiply by any rational number in just one or two instructions with no division, not to decompose the number and use an indefinite number of insns.
-
Potatoswatter over 13 years
+1 - working out the bits by hand is a chore, and the best way to learn the process is to look at compiled output.
-
hroptatyr over 13 years
Who said that? The OP wants to know how to turn integer multiplication into bit-shifting, I've just described the general procedure.
-
hroptatyr over 13 years
Oh and btw, never judge before you've measured, I've just found that an imul would be 3 cycles on my CPU whereas my solution with a shl and an add takes 2 cycles.
-
Potatoswatter over 13 years
A shl and an add only accomplishes multiplication by 5. You still need another insn to shift again. The compiler should be smart enough to figure it out and not produce the imul if it's really inferior, although for portability it might not be specialized to your chip, and the higher instruction count might cause other congestion.
Potatoswatter over 13 years
Anyway, the question isn't so much about replacing the multiplication as the division, which you don't address at all. That requires getting the high-order result of multiplication, which can't be represented using C operators. (At least, not obtaining the full width of an integer register.) It's a fixed-point math trick.
-
SingleNegationElimination over 13 years
I added compiler output to demonstrate just how right you are!
-
ergosys over 13 years
For signed values, divide by 65536 instead of shifting by 16; the compiler will convert it to a shift and fix-up.
-
High Performance Mark over 13 years
@Potatoswatter: I feel baaaddd earning so much rep from your efforts. Not very bad, it won't keep me awake at nights, but a little bit bad :-)
-
Potatoswatter over 13 years
@Mark: Meh, if I'd gone to the trouble of describing it in general terms, that would be more helpful. No point in letting rep actually decide anything.