Unexpected results when working with very big integers on interpreted languages

28,517

Solution 1

Python works:

>>> sum(x for x in xrange(1000000000 + 1))
500000000500000000

Or:

>>> sum(xrange(1000000000+1))
500000000500000000

Python's int auto promotes to a Python long which supports arbitrary precision. It will produce the correct answer on 32 or 64 bit platforms.

This can be seen by raising 2 to a power far greater than the bit width of the platform:

>>> 2**99
633825300114114700748351602688L

You can demonstrate (with Python) that the erroneous values you are getting in PHP is because PHP is promoting to a float when the values are greater than 2**32-1:

>>> int(sum(float(x) for x in xrange(1000000000+1)))
500000000067108992

Solution 2

Your Go code uses integer arithmetic with enough bits to give an exact answer. Never touched PHP or Node.js, but from the results I suspect the math is done using floating point numbers and should be thus expected not to be exact for numbers of this magnitude.

Solution 3

The reason is that the value of your integer variable sum exceeds the maximum value. And the sum you get is result of float-point arithmetic which involves rounding off. Since other answers did not mention the exact limits, I decided to post it.

The max integer value for PHP for:

  • 32-bit version is 2147483647
  • 64-bit version is 9223372036854775807

So it means either you are using 32 bit CPU or 32 bit OS or 32 bit compiled version of PHP. It can be found using PHP_INT_MAX. The sum would be calculated correctly if you do it on a 64 bit machine.

The max integer value in JavaScript is 9007199254740992. The largest exact integral value you can work with is 253 (taken from this question). The sum exceeds this limit.

If the integer value does not exceed these limits, then you are good. Otherwise you will have to look for arbitrary precision integer libraries.

Solution 4

Here is the answer in C, for completeness:

#include <stdio.h>

int main(void)
{
    unsigned long long sum = 0, i;

    for (i = 0; i <= 1000000000; i++)    //one billion
        sum += i;

    printf("%llu\n", sum);  //500000000500000000

    return 0;
}

The key in this case is using C99's long long data type. It provides the biggest primitive storage C can manage and it runs really, really fast. The long long type will also work on most any 32 or 64-bit machine.

There is one caveat: compilers provided by Microsoft explicitly do not support the 14 year-old C99 standard, so getting this to run in Visual Studio is a crapshot.

Solution 5

My guess is that when the sum exceeds the capacity of a native int (231-1 = 2,147,483,647), Node.js and PHP switch to a floating point representation and you start getting round-off errors. A language like Go will probably try to stick with an integer form (e.g., 64-bit integers) as long as possible (if, indeed, it didn't start with that). Since the answer fits in a 64-bit integer, the computation is exact.

Share:
28,517

Related videos on Youtube

Baba
Author by

Baba

--- START --- Am basically the type that ask questions like How to Block 100,000+ Individual IP addresses and answers questions like User recognition without cookies or local storage --- END ----

Updated on September 21, 2020

Comments

  • Baba
    Baba almost 4 years

    I am trying to get the sum of 1 + 2 + ... + 1000000000, but I'm getting funny results in PHP and Node.js.

    PHP

    $sum = 0;
    for($i = 0; $i <= 1000000000 ; $i++) {
        $sum += $i;
    }
    printf("%s", number_format($sum, 0, "", ""));   // 500000000067108992
    

    Node.js

    var sum = 0;
    for (i = 0; i <= 1000000000; i++) {
        sum += i ;
    }
    console.log(sum); // 500000000067109000
    

    The correct answer can be calculated using

    1 + 2 + ... + n = n(n+1)/2
    

    Correct answer = 500000000500000000, so I decided to try another language.

    GO

    var sum , i int64
    for i = 0 ; i <= 1000000000; i++ {
        sum += i
    }
    fmt.Println(sum) // 500000000500000000
    

    But it works fine! So what is wrong with my PHP and Node.js code?

    Perhaps this a problem of interpreted languages, and that's why it works in a compiled language like Go? If so, would other interpreted languages such as Python and Perl have the same problem?

    • tereško
      tereško almost 11 years
      you need this: php.net/manual/en/book.bc.php , or else wou will bash your head against IEEE 754 till hell freezes over.
    • Baba
      Baba almost 11 years
      bcadd almost took forever .. I had to give up
    • Ali
      Ali almost 11 years
      Not only NodeJS, but client-side JavaScript also result the same. 500000000067109000
    • Jeffrey
      Jeffrey almost 11 years
      For handling large numbers in PHP (i.e. 64-bit), use the GMP functions, in this case gmp_add().
    • Danger Code master
      Danger Code master almost 11 years
      I've been playing with this, and I'm a bit mystified as to what's happening under the hood in those two languages to get 500000000067108992. Curiously, I can get exactly 499999999067108992 in C++ by switching from int to double just prior to overflow when summing the series in a loop, and I'm not clear on why I lose precision in that case either.
    • Graham Borland
      Graham Borland almost 11 years
      For super efficiency, your loops should really start at 1 instead of 0. :P
    • Phong
      Phong almost 11 years
      sum(1 to N) = (N/2)*(N+1)
    • Droidman
      Droidman almost 11 years
      hmm... I'm getting 499999999500000000 in java (long). Kinda strange
    • Admin
      Admin almost 11 years
    • Baba
      Baba almost 11 years
      @GrahamBorland how does start with 1 improve efficiency ?
    • Brian Warshaw
      Brian Warshaw almost 11 years
      @Baba 0 is superflous for your calculation, so there's no need to have an extra iteration of the loop to add 0 to 0.
    • Cristian Diaconescu
      Cristian Diaconescu almost 11 years
      @Baba you'll only run the for loop 10^9 times instead of (1 + 10^9) times :-p
    • Jonathan Dickinson
      Jonathan Dickinson almost 11 years
      You could use int-hinting to use integer ops instead of float ops, unfortunately the result doesn't fit in a 32-bit integer. Still, it's worth sharing: gist.github.com/jcdickinson/6155436
    • Abhinav
      Abhinav almost 11 years
      @Maver1ck : in java use big integer(docs.oracle.com/javase/1.4.2/docs/api/java/math/BigI‌​nteger.html) for these sort of calculations :)
    • Matt
      Matt almost 11 years
      Great question. By the way, you could make your Go code even faster using concurrency!
    • Gustav Larsson
      Gustav Larsson almost 11 years
      @Maver1ck You probably just didn't do an inclusive loop.
    • Raen Herron
      Raen Herron almost 11 years
      Hmm... I'm getting -5340232216128654848 from the Go code (go1.1.1)
    • Baba
      Baba almost 11 years
      How is that possible ? are you using 32 or 63 bits ?
    • Raen Herron
      Raen Herron almost 11 years
      Same exact as the one above, so 64 bit and on a 64 bit system
    • Baba
      Baba almost 11 years
      can anyone explain why go gives -5340232216128654848 on some certain systems ?
    • Raen Herron
      Raen Herron almost 11 years
      Ah, the code above has an extra 0 in the long number.
    • jn1kk
      jn1kk almost 11 years
      Why is this a community wiki?
    • flornquake
      flornquake almost 11 years
      @jsn It happens automatically when a question has more than 30 answers. (See revisions of this question.)
    • Andrew Barber
      Andrew Barber almost 11 years
      As @flornquake says, the question was made community wiki because it has 30 answers. From everything I see from the question and answers, this post is pretty much a poster-child for why that particular auto-trigger for Community Wiki was put in.
    • ShrekOverflow
      ShrekOverflow almost 11 years
      @Baba for such cases use maths instead of loops.. math aint hard brah
    • user602525
      user602525 almost 11 years
      PHP 5.3.10-1ubuntu3.7 with Suhosin-Patch (cli) on 64-bit Ubuntu 12.04 Precise and came out with: 500000000500000000
    • Kai Sellgren
      Kai Sellgren almost 11 years
      Why are people posting the same code sample converted to their favorite language? Are they not supposed to answer this question rather than post more useless snippets?
    • user568109
      user568109 almost 11 years
      @R.id.pandacoder Not to be selfish but did you see my answer.
  • bitWorking
    bitWorking almost 11 years
    not true for PHP. look here
  • Baba
    Baba almost 11 years
    Did you run this on 32 or 64 bit system ?
  • Baba
    Baba almost 11 years
    Did you run this on 32 or 64 bit system ?
  • mob
    mob almost 11 years
    true for 32-bit PHP. Anything larger than PHP_INT_MAX is coerced to a floating-point representation.
  • dawg
    dawg almost 11 years
    64 bit build on OS X. Takes about 25 seconds.
  • Baba
    Baba almost 11 years
    Thanks for the info ... Let see if some can run this on 32 bit and see if the result is the same .. also test on 64 bit windows .. it works +1
  • dawg
    dawg almost 11 years
    It should work regardless (32 vs 64 bit) since Python ints auto promote to arbitrary precision rather than overflow. Might take a while longer tho.
  • Alok Singhal
    Alok Singhal almost 11 years
    Python on any system will work in this case, since Python switches to long integers automatically if needed. And if that's not enough, it will switch to big integers as well.
  • dawg
    dawg almost 11 years
    @0x499602D2: That is kinda harsh. The OP himself voted it up. He asked specifically if this was similar problem on Python. Answer, no it is not. Code to show that it is not. WTH?
  • Nate
    Nate almost 11 years
    Yep. If PHP encounters a number beyond the bounds of the integer type, it will be interpreted as a float instead. Also, an operation which results in a number beyond the bounds of the integer type will return a float instead. - php.net/manual/en/language.types.integer.php
  • Jay M
    Jay M almost 11 years
    The Python example is overly long, just use sum(xrange(int(1e9)+1)) (....sum works on iterables)
  • Miguel Prz
    Miguel Prz almost 11 years
    it was executed on 64 bits system
  • Peter Olson
    Peter Olson almost 11 years
    And in NodeJS (and JavaScript in general) all arithmetic operations (except bit operations) behave as if they were done with floating point numbers. Whether or not they actually are is an under-the-hood distinction subject to the decisions of individual JavaScript engines.
  • toasted_flakes
    toasted_flakes almost 11 years
    In javascript's specification, there's no integer types. All numbers are floating points.
  • Qtax
    Qtax almost 11 years
    4.99999999067109e+017 on Perl v5.16.1 MSWin32-x86.
  • shawnhcorey
    shawnhcorey almost 11 years
    If you really need big numbers, use the bignum or bigint. Both are core modules, that is, they are install with Perl v5.8.0 or higher. See http://perldoc.perl.org/bignum.html and http://perldoc.perl.org/bigint.html
  • CyberSkull
    CyberSkull almost 11 years
    I got 500000000500000000 running this on a 32-bit PPC Mac, running Perl 5.12.4.
  • Peter Olson
    Peter Olson almost 11 years
    @grasGendarme There are. The ES5 spec specifies various integer conversions and mandates that they be called in bitwise shifts, for example. That is to say behind the scenes, integers types are used in Javascript, but all the arithmetic operators convert their operands to floating point numbers before doing anything with them (barring compiler optimizations).
  • Esailija
    Esailija almost 11 years
    Both SpiderMonkey (Firefox) and V8 (Chrome) also provide the extension Math.imul because even ((x|0)*(y|0))|0) is not enough to provide integer semantics. The fact that integers are used is leaking pretty hard.
  • MSalters
    MSalters almost 11 years
    MSVC++ is a C++ compiler, and C++ got long long in the C++11 standard. It's been a MSVC++ and g++ extension for a few years, though.
  • jwg
    jwg almost 11 years
    Do you think it might be possible to walk across every bridge across the Pregel in Kaliningrad, without crossing any bridge twice? Many people have tried and failed, but no-one has yet established that it is impossible. This seems like a challenge you would be uniquely qualified to solve.
  • linac
    linac almost 11 years
    @Baba It's the same as with Node.js/JavaScript in the OP. As to why 500000000067109000 vs. 500000000067108992 … no idea.
  • CyberSkull
    CyberSkull almost 11 years
    @MSalters Which edition of Visual Studio got this? I think the last version I used was the current one from 3 or 4 years ago.
  • CyberSkull
    CyberSkull almost 11 years
    @MSalters So being a C++ feature it won't really help anyone compiling a straight C program. I never tried switching from C to C++, so I don't know if that workaround would actually work.
  • Tor Klingberg
    Tor Klingberg almost 11 years
    And nicely, GCC or Clang with optimizations turn the whole loop into movabsq $500000000500000000, %rsi
  • CyberSkull
    CyberSkull almost 11 years
    @TorKlingberg Which optimizations?
  • MSalters
    MSalters almost 11 years
    @CyberSkull: VS2010, I think. Might have been VS2008, though. After all, they already had __int64 so it was just a rename.
  • Tor Klingberg
    Tor Klingberg almost 11 years
    Just gcc -O3 or clang -O3. I don't know the name of the specific optimization. Basically the compiler notices that the result of the loop does not depend on any argument, and calculates it at compile time.
  • Scotty Bauer
    Scotty Bauer almost 11 years
    At any optimization level -O through -O3 it will movabs $0x6f05b59f17f6500,%rsi
  • Magnuss
    Magnuss almost 11 years
    1.upto(1000000000).inject(:+)
  • Devin Lane
    Devin Lane almost 11 years
    C99 long long has a minimum size of 64 bits and as far as I know is 64 bit on both 32-bit and 64-bit platforms. I haven't seen general support for quad or octo ints.
  • cgenco
    cgenco almost 11 years
    @Magnuss : That's what I thought I tried at first, but it caused a massive memory leak. Yours seems to work...
  • Baba
    Baba almost 11 years
    bitrayne claims to have -5340232216128654848 on 64 bits go1.1.1 ... Any ideas ?
  • Stephan T. Lavavej
    Stephan T. Lavavej almost 11 years
    VC has supported long long for a very long time - at least 2003 according to MSDN and the "ll" printf syntax was supported by 2005. Additionally, note that unsigned long long should be printed with %llu .
  • Greg Hendershott
    Greg Hendershott almost 11 years
    Deleted my answer posted 6 minutes after you, once I noticed yours. :)
  • zzzz
    zzzz almost 11 years
    @Baba: Good catch, there's one zero to many in the Go version for i = 0 ; i <= 10000000000; i++ { to produce the correct number fmt.Println(sum) // 500000000500000000. All of the other language examples have the correct 1000000000.
  • The Mask
    The Mask almost 11 years
    Compilers provide by Microsoft doesn't support 14 years-old C99 std because C is too few used on Windows's applications. C++ is instead of.
  • didierc
    didierc almost 11 years
    Perhaps Baba is intrigued by the use of dots for thousand separators: english usually expects commas. You could also use underscores as a more neutral mean.
  • huaiyuan
    huaiyuan almost 11 years
    Can be simplified as (time (loop for x from 1 to 1000000000 sum x)). I got ~5x speed by adding declaration: (time (locally (declare (optimize (speed 3) (safety 0))) (loop for i of-type fixnum from 1 to 1000000000 sum i of-type fixnum)))
  • S.D.
    S.D. almost 11 years
    Isn't that supposed to depend on the machine's word length etc ? Why would a particular language want to do this in its own design ?
  • jwg
    jwg almost 11 years
    The only tiny bit of useful that the $MY_FAVOURITE_LANGUAGE answers have is if they provide the result...
  • jwg
    jwg almost 11 years
    Can anyone confirm that that's really how long adding that many bigints takes?
  • informatimago
    informatimago almost 11 years
    This is erroneous. Don't let you be blinded by the other languages! The correct way to write it in lisp is: (defun sum-from-1-to-n (n) (/ (* n (1+ n)) 2)) (time (sum-from-1-to-n 1000000000)) took 14 microseconds (0.000014 seconds) to run. During that period, and with 4 available CPU cores, 0 microseconds (0.000000 seconds) were spent in user mode 0 microseconds (0.000000 seconds) were spent in system mode --> 500000000500000000
  • Blacksad
    Blacksad almost 11 years
    @jwg yeah sorry I missed the end of the line -- updated answer.
  • postfuturist
    postfuturist almost 11 years
    @informatimago : It's not erroneous. I was copying the imperative loop style of the question and as many have pointed out, the question itself mentions there is an easier way to calculate. Chillax.
  • Gelldur
    Gelldur almost 11 years
    1000000000 isn't treated as int? you dont must write 1000000000LL or 1000000000L?
  • CyberSkull
    CyberSkull almost 11 years
    @Gelldur I checked the assembly file, in this case (on Intel 64) it doesn't matter since one billion fits into an int just as well as a long.
  • greyfade
    greyfade almost 11 years
    Node.js explicitly does not have an int type. It's working in a float type.
  • Ted Hopp
    Ted Hopp almost 11 years
    @greyfade - Yeah, I guess that's true of all EcmaScript-compliant environments.
  • Baba
    Baba almost 11 years
    Was was able to get a 32 bit system surprisingly go gave me 500000000067108992 ?? same as PHP
  • zzzz
    zzzz almost 11 years
    @Baba: I cannot find any reference from 'Was'. Can you please provide some pointer or link?
  • Baba
    Baba almost 11 years
    I can it on my system .. i can give you screen shorts if you want
  • zzzz
    zzzz almost 11 years
    @Baba: Please use e.g. play.golang.org to share the code. It doesn't matter that it may not run there.
  • Baba
    Baba almost 11 years
    @jnml play is 64 bits .. it does not have that issue .... try ruing my code on a 32 bit windows and see what i mean
  • zzzz
    zzzz almost 11 years
    @Baba: Use play only as a code pastie (in this case). Ad "try runing my code": paste somewhere (eg. to play.golang.org) a full compilable code. Don't care about if it runs or not in play.golang.org. I'll try to run it locall of course (if I cannot see the error w/o running the code).
  • Baba
    Baba almost 11 years
    here is the code i guess its messed up because i used float64 and not int64 .. Just confirmed it has nothing to do with 32 or 64 bits
  • Zach Hunter
    Zach Hunter almost 5 years
    Isn’t that (2**31 - 1)?
  • Ted Hopp
    Ted Hopp almost 5 years
    @ZacharyHunter - Indeed it is. Thanks for catching that error.