Any way faster than pow() to compute an integer power of 10 in C++?
Solution 1
Something like this:
int quick_pow10(int n)
{
    static int pow10[10] = {
        1, 10, 100, 1000, 10000,
        100000, 1000000, 10000000, 100000000, 1000000000
    };
    return pow10[n];
}
Obviously, you can do the same thing for long long.
This should be several times faster than any competing method. However, it is quite limited if you have lots of bases (although the number of table entries goes down quite dramatically with larger bases), so if there isn't a huge number of base/exponent combinations, it's still doable.
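Since int tops out at 10^9, here is a hedged sketch of the long long variant mentioned above (the function name quick_pow10_ll is mine). The table has 19 entries because 10^18 is the largest power of ten that fits in a signed 64-bit integer:

```cpp
// Lookup-table version for 64-bit results: 10^0 .. 10^18 all fit in a
// signed 64-bit long long; 10^19 would overflow, so the table stops at 18.
long long quick_pow10_ll(int n)
{
    static const long long pow10[19] = {
        1LL,                   10LL,                  100LL,
        1000LL,                10000LL,               100000LL,
        1000000LL,             10000000LL,            100000000LL,
        1000000000LL,          10000000000LL,         100000000000LL,
        1000000000000LL,       10000000000000LL,      100000000000000LL,
        1000000000000000LL,    10000000000000000LL,   1000000000000000000LL / 10,
        1000000000000000000LL
    };
    return pow10[n];
}
```

As with the int version, there is no bounds check here; callers must guarantee 0 <= n <= 18.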
As a comparison:
#include <iostream>
#include <cstdlib>
#include <cmath>

static int quick_pow10(int n)
{
    static int pow10[10] = {
        1, 10, 100, 1000, 10000,
        100000, 1000000, 10000000, 100000000, 1000000000
    };
    return pow10[n];
}

static int integer_pow(int x, int n)
{
    int r = 1;
    while (n--)
        r *= x;
    return r;
}

static int opt_int_pow(int n)
{
    int r = 1;
    const int x = 10;
    while (n)
    {
        if (n & 1)
        {
            r *= x;
            n--;
        }
        else
        {
            r *= x * x;
            n -= 2;
        }
    }
    return r;
}

int main(int argc, char **argv)
{
    long long sum = 0;
    int n = strtol(argv[1], 0, 0);
    const long outer_loops = 1000000000;

    if (argv[2][0] == 'a')
    {
        for (long i = 0; i < outer_loops / n; i++)
        {
            for (int j = 1; j < n + 1; j++)
            {
                sum += quick_pow10(n);
            }
        }
    }
    if (argv[2][0] == 'b')
    {
        for (long i = 0; i < outer_loops / n; i++)
        {
            for (int j = 1; j < n + 1; j++)
            {
                sum += integer_pow(10, n);
            }
        }
    }
    if (argv[2][0] == 'c')
    {
        for (long i = 0; i < outer_loops / n; i++)
        {
            for (int j = 1; j < n + 1; j++)
            {
                sum += opt_int_pow(n);
            }
        }
    }
    std::cout << "sum=" << sum << std::endl;
    return 0;
}
Compiled with g++ 4.6.3, using -Wall -O2 -std=c++0x, this gives the following results:
$ g++ -Wall -O2 -std=c++0x pow.cpp
$ time ./a.out 8 a
sum=100000000000000000
real 0m0.124s
user 0m0.119s
sys 0m0.004s
$ time ./a.out 8 b
sum=100000000000000000
real 0m7.502s
user 0m7.482s
sys 0m0.003s
$ time ./a.out 8 c
sum=100000000000000000
real 0m6.098s
user 0m6.077s
sys 0m0.002s
(I did have an option for using pow as well, but it took 1m22.56s when I first tried it, so I removed it when I decided to add the optimised loop variant.)
Solution 2
There are certainly ways to compute integral powers of 10 faster than using std::pow()! The first realization is that pow(x, n) can be implemented in O(log n) time. The next realization is that multiplying by 10 is the same as (x << 3) + (x << 1), i.e., 8x + 2x. Of course, the compiler knows the latter: when you are multiplying an integer by the integer constant 10, the compiler will do whatever is fastest to multiply by 10. Based on these two rules it is easy to create fast computations, even if x is a big integer type.
In case you are interested in games like this:
- A generic O(log n) version of power is discussed in Elements of Programming.
- Lots of interesting "tricks" with integers are discussed in Hacker's Delight.
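For reference, the O(log n) idea mentioned above is exponentiation by squaring; a minimal sketch follows (the function name ipow is mine, and no overflow checking is done):

```cpp
// Exponentiation by squaring: O(log n) multiplications instead of n.
// At each step, the base is squared and the current low bit of the
// exponent decides whether it is multiplied into the result.
long long ipow(long long base, unsigned int exp)
{
    long long result = 1;
    while (exp) {
        if (exp & 1)     // low bit set: fold the current base power in
            result *= base;
        base *= base;    // square the base for the next bit
        exp >>= 1;
    }
    return result;
}
```

For example, ipow(10, 5) performs three squarings and two multiplications rather than five multiplications; the gap widens as the exponent grows.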
Solution 3
A solution for any base using template meta-programming:
template<int E, int N>
struct pow {
    enum { value = E * pow<E, N - 1>::value };
};

template<int E>
struct pow<E, 0> {
    enum { value = 1 };
};
Then it can be used to generate a lookup table that can be used at runtime:
template<int E>
long long quick_pow(unsigned int n) {
    static long long lookupTable[] = {
        pow<E, 0>::value, pow<E, 1>::value, pow<E, 2>::value,
        pow<E, 3>::value, pow<E, 4>::value, pow<E, 5>::value,
        pow<E, 6>::value, pow<E, 7>::value, pow<E, 8>::value,
        pow<E, 9>::value
    };
    return lookupTable[n];
}
This must be compiled with the appropriate warning flags so that possible overflows in the table are detected.
Usage example:
for (unsigned int n = 0; n < 10; ++n) {
    std::cout << quick_pow<10>(n) << std::endl;
}
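As an aside, on C++11 and later the same compile-time evaluation can be had without the recursive template. This constexpr sketch is mine, not part of the answer above:

```cpp
// C++11 constexpr recursion: evaluated at compile time whenever the
// arguments are constant expressions (e.g. inside a static_assert).
constexpr long long ct_pow(long long base, unsigned int n)
{
    return n == 0 ? 1 : base * ct_pow(base, n - 1);
}

// Forces compile-time evaluation; a wrong value would be a compile error.
static_assert(ct_pow(10, 9) == 1000000000LL, "10^9");
```

Unlike the template version, this also works when the exponent is only known at runtime, falling back to an ordinary function call.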
Solution 4
A version with no multiplication and no table:

// N * 10^n (assumes N >= 0 and no overflow)
int Npow10(int N, int n) {
    N <<= n;                  // N * 2^n
    while (n--) N += N << 2;  // N + 4N = 5N, applied n times
    return N;
}
Solution 5
Here is a stab at it:
#include <type_traits>

// specialize if you have a bignum integer like type you want to work with:
template<typename T> struct is_integer_like : std::is_integral<T> {};
template<typename T> struct make_unsigned_like : std::make_unsigned<T> {};

template<typename T, typename U>
T powT( T base, U exponent ) {
    static_assert( is_integer_like<U>::value, "exponent must be integer-like" );
    static_assert( std::is_same< U, typename make_unsigned_like<U>::type >::value, "exponent must be unsigned" );
    T retval = 1;
    T& multiplicand = base;
    if (exponent) {
        while (true) {
            // branch prediction will be awful here, you may have to micro-optimize:
            retval *= (exponent & 1) ? multiplicand : 1;
            // or /2, whatever -- `>>1` is probably faster, esp for bignums:
            exponent = exponent >> 1;
            if (!exponent)
                break;
            multiplicand *= multiplicand;
        }
    }
    return retval;
}
What is going on above is a few things.

First, so that BigNum support is cheap, it is templatized. Out of the box, it supports any base type that supports *= own_type and that either can be implicitly converted to int, or that int can be implicitly converted to (if both are true, problems will occur), and you need to specialize some templates to indicate that the exponent type involved is both unsigned and integer-like.

In this case, integer-like and unsigned means that it supports &1 returning bool and >>1 returning something it can be constructed from, and that eventually (after repeated >>1s) it reaches a point where evaluating it in a bool context returns false. I used traits classes to express the restriction, because naive use with a value like -1 would compile and (on some platforms) loop forever, while (on others) it would not.

Execution time for this algorithm, assuming multiplication is O(1), is O(lg(exponent)), where lg(exponent) is the number of times it takes to >>1 the exponent before it evaluates as false in a boolean context. For traditional integer types, this would be the binary log of the exponent's value: so no more than 32.

I also eliminated all branches within the loop (or, more precisely, made it obvious to existing compilers that no branch is needed), with just the control branch (which is true uniformly until it is false once). Possibly eliminating even that branch might be worth it for high bases and low exponents...
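A usage sketch follows (the exponent literal must be an unsigned type, e.g. 9u, or the static_asserts fire); powT is reproduced from above so the snippet compiles on its own:

```cpp
#include <type_traits>

// powT and its traits, as defined in the answer above.
template<typename T> struct is_integer_like : std::is_integral<T> {};
template<typename T> struct make_unsigned_like : std::make_unsigned<T> {};

template<typename T, typename U>
T powT(T base, U exponent)
{
    static_assert(is_integer_like<U>::value, "exponent must be integer-like");
    static_assert(std::is_same<U, typename make_unsigned_like<U>::type>::value,
                  "exponent must be unsigned");
    T retval = 1;
    T& multiplicand = base;
    if (exponent) {
        while (true) {
            retval *= (exponent & 1) ? multiplicand : 1;  // branch-free select
            exponent = exponent >> 1;
            if (!exponent)
                break;
            multiplicand *= multiplicand;  // square for the next bit
        }
    }
    return retval;
}

// Usage: base type is deduced, so the result type follows the base:
//   powT(10LL, 9u)  -- long long result
//   powT(2, 16u)    -- int result
// powT(10, -1) would fail to compile: int is not unsigned.
```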
szli
Updated on November 22, 2020

Comments
-
szli, over 3 years: I know power of 2 can be implemented using the << operator. What about power of 10? Like 10^5? Is there any way faster than pow(10,5) in C++? It is a pretty straight-forward computation by hand. But it seems not easy for computers due to the binary representation of the numbers... Let us assume I am only interested in integer powers, 10^n, where n is an integer.
-
user541686, almost 11 years: @OliCharlesworth: Wouldn't it be 20 entries max? log(2^64) < 20
-
Oliver Charlesworth, almost 11 years: @Mehrdad: ah, yes, it looks like the OP may only be interested in integer powers. (I was considering the general case, where a large table would screw cache performance...)
-
Rahul Tripathi, almost 11 years: @OliCharlesworth: Exponentiation by squaring is also a good option. Correct me if I am wrong!
-
Admin, almost 11 years: For some bigint type, probably. I don't think the question is about that, though. For fixed-size integer types, a lookup table makes it an O(1) operation. (Edit: that's not really fair, though. For fixed-size integer types, n has an upper limit, and if n has an upper limit, even something like O(2^n) is equivalent to O(1).) For floating point types, pow is already normally implemented as an O(1) operation.
-
Admin, almost 11 years: @hvd O(1) isn't necessarily faster than O(n), though.
-
Admin, almost 11 years: @H2CO3 Yeah, you're right. O(1) will be faster than O(n) for large enough n, but that isn't saying much when n cannot be more than 20 or so.
-
brian beuning, almost 11 years: Not to be confused with caf pow.
-
FreelanceConsultant, almost 11 years: You might be able to initialize your data with a loop. Compiler might be able to optimize that by computing the values before runtime?
-
Mats Petersson, almost 11 years: @EdwardBird: For this purpose, static is faster than a loop, for sure, since it's probably initialized at compile-time, or at least only once. A loop will initialize every time.
-
FreelanceConsultant, almost 11 years: @MatsPetersson I read something recently about compiler optimizations which compute values at compile time... For example, if you have x = x * 10 * 5, some compilers will change that to x = x * 50. Would a compiler not detect that the loop initializes some values and therefore compute them when compiling, so any executable program wouldn't have to?
-
user541686, almost 11 years: Yes, software can calculate pow10() in O(log n) sequential time, but hardware is inherently parallel. It doesn't have to calculate things sequentially (never mind the fact that it could do things sequentially just like software, if it wanted to). So the time complexity point is irrelevant. You're comparing apples and oranges in your sentence about time complexity.
-
Dietmar Kühl, almost 11 years: @Mehrdad: Sure, for fixed sized integers you don't even need a parallel version because you can use a look-up table. However, once the size of your integers isn't bounded (other than by the size of total available memory), the parallel algorithm the hardware can use won't help you much to get below O(log n).
-
user541686, almost 11 years: @DietmarKühl: On the one hand you're talking about humongous numbers when you refer to O(log n), but on the other hand you're using them to talk about the performance of std::pow, which is obviously designed for small numbers that can be implemented in hardware. Are you talking about small numbers or large numbers? The fact that exponentiation is O(log n) has nothing to do with the performance of std::pow...
-
Mats Petersson, almost 11 years: I have just confirmed that g++ at least just makes a global table that is initialized at build-time.
-
szli, almost 11 years: I thought std::pow() is O(log n) time, isn't it? The bit operation idea is quite interesting and very hard to come up with by myself :)
-
MSalters, almost 11 years: @szli: The problem is that std::pow is technically O(1) because its inputs are O(1), aka finite. And when you get to variable-length integers, just reading them becomes O(log N). That complicates things.
-
cmaster - reinstate monica, almost 11 years: Nice, optimal answer, but I think you really should add a bounds check on n.
-
cmaster - reinstate monica, almost 11 years: Why do you needlessly pollute your code with these hideous templates?!? It hurts. I almost downvoted you for it...
-
Yakk - Adam Nevraumont, almost 11 years: @cmaster Because doing this code mainly makes sense if you are using bignums or other types that are not simple integers. If your base is an integer > 1, then using 64 bit ints the naive "multiply self exponent times" is going to be a loop of no more than length 64 before it overflows, which will be faster than my code above. The above code handles bases that are floating point, bignum, rationals, or anything else; outside of those cases, the above code is not worth running. In the integer base case, it won't be much slower in the worst case, and it may be faster in common cases.
-
Vincent, almost 11 years: @cmaster Ok. I tried to improve my answer... I will delete my answer if it is still not useful or incorrect.
-
Toby, over 9 years: Should there be an n--; after r *= x; in opt_int_pow()? Also note that this only works for a positive power.
-
Mats Petersson, over 9 years: @Toby: Yes, there should.
-
dmjalund, almost 6 years: aren't we asking for pow(10,n) and not pow(n,10)?
-
einpoklum, about 4 years: Just used this solution for this question. Thank you!
-
Konrad, over 3 years: Is that actually faster on any common processor/compiler combination? It's certainly not very readable.
-
Gregory Morse, over 3 years: (x<<3)*(x<<1) = 8x*2x = 16x^2... this answer is pretty problematic. Probably meant exponentiation by repeated squaring, e.g. x^(1<<3)*x^(1<<1), but obviously this type of mistake is unacceptable.
-
Björn Sundin, about 3 years: Please trust the compiler with these kinds of optimizations. It does a better job than you think.
-
sampathsris, over 2 years: @GregoryMorse I'm betting they meant (x<<3)+(x<<1), which is 8x+2x = 10x.
-
Pete, about 2 years: This is neat but only works on unsigned numbers - the function signature ought to convey that.