C++ most efficient way to convert string to int (faster than atoi)

c++ string performance int type-conversion

48,555

Solution 1

I experimented with solutions using lookup tables, but found them fraught with issues, and actually not very fast. The fastest solution turned out to be the least imaginitive:

int fast_atoi( const char * str )
{
    int val = 0;
    while( *str ) {
        val = val*10 + (*str++ - '0');
    }
    return val;
}

Running a benchmark with a million randomly generated strings:

fast_atoi : 0.0097 seconds
atoi      : 0.0414 seconds

To be fair, I also tested this function by forcing the compiler not to inline it. The results were still good:

fast_atoi : 0.0104 seconds
atoi      : 0.0426 seconds

Provided your data conforms to the requirements of the fast_atoi function, that is pretty reasonable performance. The requirements are:

Input string contains only numeric characters, or is empty
Input string represents a number from 0 up to INT_MAX

Solution 2

atoi can be improved upon significantly, given certain assumptions. This was demonstrated powerfully in a presentation by Andrei Alexandrescu at the C++ and Beyond 2012 conference. Hi s replacement used loop unrolling and ALU parallelism to achieve orders of magnitude in perf improvement. I don't have his materials, but this link uses a similar technique: http://tombarta.wordpress.com/2008/04/23/specializing-atoi/

Solution 3

This page compares conversion speed between different string->int functions using different compilers. The naive function, which offers no error checking, offers speeds roughly twice as fast as atoi(), according to the results presented.

// Taken from http://tinodidriksen.com/uploads/code/cpp/speed-string-to-int.cpp
int naive(const char *p) {
    int x = 0;
    bool neg = false;
    if (*p == '-') {
        neg = true;
        ++p;
    }
    while (*p >= '0' && *p <= '9') {
        x = (x*10) + (*p - '0');
        ++p;
    }
    if (neg) {
        x = -x;
    }
    return x;
}

it is always positive

Remove the negative checks in the above code for a micro optimization.

If you can guarantee the string will not have anything but numeric characters, you can micro optimize further by changing the loop

while (*p >= '0' && *p <= '9') {

while (*p != '\0' ) {

Which leaves you with

unsigned int naive(const char *p) {
    unsigned int x = 0;
    while (*p != '\0') {
        x = (x*10) + (*p - '0');
        ++p;
    }
    return x;
}

Solution 4

Quite a few of the code examples here are quite complex and do unnecessary work, meaning the code could be slimmer and faster.

Conversion loops are often written to do three different things with each character:

bail out if it is the end-of-string character
bail out if it is not a digit
convert it from its code point to the actual digit value

First observation: there is no need to check for the end-of-string character separately, since it is not a digit. Hence the check for 'digitness' covers the EOS condition implicitly.

Second observation: double conditions for range testing as in (c >= '0' && c <= '9') can be converted to a single test condition by using an unsigned type and anchoring the range at zero; that way there can be no unwanted values below the beginning of the range, all unwanted values are mapped to the range above the upper limit: (uint8_t(c - '0') <= 9)

It just so happens that c - '0' needs to be computed here anyway...

Hence the inner conversion loop can be slimmed to

uint64_t n = digit_value(*p);
unsigned d;

while ((d = digit_value(*++p)) <= 9)
{
   n = n * 10 + d;
}

The code here is called with the precondition that p be pointing at a digit, which is why the first digit is extracted without further ado (which also avoids a superfluous MUL).

That precondition is less outlandish than might appear at first, since p pointing at a digit is the reason why this code is called by the parser in the first place. In my code the whole shebang looks like this (assertions and other production-quality noise elided):

unsigned digit_value (char c)
{
   return unsigned(c - '0');
}

bool is_digit (char c)
{
   return digit_value(c) <= 9;
}

uint64_t extract_uint64 (char const **read_ptr)
{
   char const *p = *read_ptr;
   uint64_t n = digit_value(*p);
   unsigned d;

   while ((d = digit_value(*++p)) <= 9)
   {
      n = n * 10 + d;
   }

   *read_ptr = p;

   return n;
}

The first call to digit_value() is often elided by the compiler, if the code gets inlined and the calling code has already computed that value by calling is_digit().

n * 10 happens to be faster than manual shifting (e.g. n = (n << 3) + (n << 1) + d), at least on my machine with gcc 4.8.1 and VC++ 2013. My guess is that both compilers use LEA with index scaling for adding up to three values in one go and scaling one of them by 2, 4, or 8.

In any case that's exactly how it should be: we write nice clean code in separate functions and express the desired logic (n * 10, x % CHAR_BIT, whatever) and the compiler converts it to shifting, masking, LEAing and so on, inlines everything into the big bad parser loop and takes care of all the required messiness under the hood to make things fast. We don't even have to stick inline in front of everything anymore. If anything then we have to do the opposite, by using __declspec(noinline) judiciously when compilers get over-eager.

I'm using the above code in a program that reads billions of numbers from text files and pipes; it converts 115 million uints per second if the length is 9..10 digits, and 60 million/s for length 19..20 digits (gcc 4.8.1). That's more than ten times as fast as strtoull() (and just barely enough for my purposes, but I digress...). That's the timing for converting text blobs containing 10 million numbers each (100..200 MB), meaning that memory timings make these numbers appear a bit worse than they would be in a synthetic benchmark running from cache.

Solution 5

Paddy's implementation of fast_atoi is faster than atoi - without the shadow of the doubt - however it works only for unsigned integers.

Below, I put evaluated version of Paddy's fast_atoi that also allows only unsigned integers but speeds conversion up even more by replacing costly operation * with +

unsigned int fast_atou(const char *str)
{
    unsigned int val = 0;
    while(*str) {
        val = (val << 1) + (val << 3) + *(str++) - 48;
    }
    return val;
}

Here, I put complete version of fast_atoi() that i'm using sometimes which converts singed integers as well:

int fast_atoi(const char *buff)
{
    int c = 0, sign = 0, x = 0;
    const char *p = buff;

    for(c = *(p++); (c < 48 || c > 57); c = *(p++)) {if (c == 45) {sign = 1; c = *(p++); break;}}; // eat whitespaces and check sign
    for(; c > 47 && c < 58; c = *(p++)) x = (x << 1) + (x << 3) + c - 48;

    return sign ? -x : x;
}

View more solutions

48,555

Author by

user788171

Updated on January 21, 2020

Comments

user788171 over 4 years
As mentioned in the title, I'm looking for something that can give me more performance than atoi. Presently, the fastest way I know is
```
atoi(mystring.c_str())
```
Finally, I would prefer a solution that doesn't rely on Boost. Does anybody have good performance tricks for doing this?

Additional Information: int will not exceed 2 billion, it is always positive, the string has no decimal places in it.
dyp almost 11 years

Yet the naive implementation doesn't comply to the Standard: it doesn't discard leading whitespaces. If you don't need the guarantees of the Standard..
jogojapan almost 11 years

I think I have also seen that. Is this the presentation you refer to? It's not C++ and Beyond, though. And I think it's mostly about int-to-string rather than reverse. But there is a lot to learn from that anyway.
Tony Delroy almost 11 years

@DyP: '\0' will break out of the loop... it's < '0'. Anyway, the linked page doesn't list any functions other than this naive loop which are serious candidates to outperform atoi - they're not utilising any efficiencies coming from the insights above re an expectation of valid characters, always being positive, known maximum size, no need for any error checking....
dyp almost 11 years

Oops you're right, brain comparison operator broken, sry.. But then you could change it to while(*p != '\0')..
Mike Dunlavey almost 11 years

+ That is basically the way I do it, although in the loop I tend to write {x *= 10; x += (*p++ - '0');}. It probably compiles to about the same thing.
johnnycrash almost 10 years

(*p++)&15 is probably faster than *p++ - '0'
DarthGizka over 9 years

That's the canonical C++ way but it is several orders of magnitude slower than a slimmed 'naive' conversion loop.
Eric Pauley about 7 years

@johnnycrash not sure why it would be. Bitwise & and constant subtraction are both one instruction.
Eric Pauley about 7 years

Not sure if the bit shift solution is actually faster, since x86 truncated multiplication is one instruction and gcc will know that the top bits don't matter.
Caleb Reister over 6 years

I was able to use this to create a function template for different integer types and run it on an AVR.
Ben Voigt about 5 years

While you can decompose 10 into 16 - 4 - 2, a simpler decomposition is 8 + 2.
Ben Voigt about 5 years

EQ1 is clearly buggy, and when code isn't even tested it casts doubt on the benchmark.
chux - Reinstate Monica about 4 years

Linked int atoi(const char *str) fails to detect all overflow.
chux - Reinstate Monica about 4 years

UV for unsigned int naive(const char *p)
chux - Reinstate Monica about 4 years

I like the use of unsigned for a swift compare. Not a fan of the UB when p[0]=='\0'.
chux - Reinstate Monica about 4 years

"I really don't see room for improvement here." 10*num + (*str - '0') is UB when final result s/b LONG_MIN. isspace(*str), isdigit(*str) UB when *str < 0 - not likely of concern for OP though.
chux - Reinstate Monica about 4 years

"Multiplication is always slower that sum and shift" --> Not always.
hamSh about 4 years

can you specify an example?