Why does the floating-point value of 4*0.1 look nice in Python 3 but 3*0.1 doesn't?
Solution 1
The simple answer is that 3*0.1 != 0.3 due to quantization (roundoff) error, whereas 4*0.1 == 0.4 because multiplying by a power of two is usually an "exact" operation. Python tries to find the shortest string that would round to the desired value, so it can display 4*0.1 as 0.4 because these are equal, but it cannot display 3*0.1 as 0.3 because these are not equal.
You can use the .hex method in Python to view the internal representation of a number (basically, the exact binary floating point value, rather than the base-10 approximation). This can help to explain what's going on under the hood.
>>> (0.1).hex()
'0x1.999999999999ap-4'
>>> (0.3).hex()
'0x1.3333333333333p-2'
>>> (0.1*3).hex()
'0x1.3333333333334p-2'
>>> (0.4).hex()
'0x1.999999999999ap-2'
>>> (0.1*4).hex()
'0x1.999999999999ap-2'
0.1 is 0x1.999999999999a times 2^-4. The "a" at the end means the digit 10 - in other words, 0.1 in binary floating point is very slightly larger than the "exact" value of 0.1 (because the final 0x0.99 is rounded up to 0x0.a). When you multiply this by 4, a power of two, the exponent shifts up (from 2^-4 to 2^-2) but the number is otherwise unchanged, so 4*0.1 == 0.4.
However, when you multiply by 3, the tiny little difference between 0x0.99 and 0x0.a0 (0x0.07) magnifies into a 0x0.15 error, which shows up as a one-digit error in the last position. This causes 0.1*3 to be very slightly larger than the rounded value of 0.3.
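The same argument can be checked with exact rational arithmetic. The sketch below assumes IEEE 754 double-precision floats (the usual case for CPython) and uses fractions.Fraction, which converts a float to its exact stored value; the variable names are only illustrative.

```python
from fractions import Fraction

# Fraction(float) converts the stored binary value exactly, with no rounding.
tenth = Fraction(0.1)

# Multiplying by 4 (a power of two) is exact: the true product is itself a float.
exact_x4 = tenth * 4
assert exact_x4 == Fraction(0.1 * 4) == Fraction(0.4)

# Multiplying by 3 is not: the true product falls strictly between two
# adjacent floats, so the result has to be rounded to one of them.
exact_x3 = tenth * 3
below = Fraction(0.3)        # 0x1.3333333333333p-2
above = Fraction(0.1 * 3)    # 0x1.3333333333334p-2
assert below < exact_x3 < above

print(float(above - below))  # size of the gap between the two neighbours
```

The exact product is not representable, so 0.1*3 has to land on a neighbouring float, and the one it lands on is not the float nearest to 0.3.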
Python 3's float repr is designed to be round-trippable, that is, the value shown should be exactly convertible into the original value (float(repr(f)) == f for all floats f). Therefore, it cannot display 0.3 and 0.1*3 exactly the same way, or the two different numbers would end up the same after round-tripping. Consequently, Python 3's repr engine chooses to display one with a slight apparent error.
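As a quick sanity check of the round-trip property, the following sketch tests it on a batch of pseudo-random floats plus the values discussed here. The seed, range, and sample size are arbitrary choices, and a passing check is evidence rather than a proof.

```python
import random

rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
samples = [rng.uniform(-1e6, 1e6) for _ in range(10_000)]
samples += [0.1, 0.3, 0.1 * 3, 0.4, 0.1 * 4]

# Every sampled float should survive float -> repr -> float unchanged.
round_trips = all(float(repr(x)) == x for x in samples)
print(round_trips)

# Different floats must get different strings; equal floats get the same one.
distinct_reprs = repr(0.3) != repr(0.1 * 3)
same_reprs = repr(0.4) == repr(0.1 * 4)
print(distinct_reprs, same_reprs)
```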
Solution 2
repr (and str in Python 3) will put out as many digits as required to make the value unambiguous. In this case the result of the multiplication 3*0.1 isn't the closest value to 0.3 (0x1.3333333333333p-2 in hex); it's actually one LSB higher (0x1.3333333333334p-2), so it needs more digits to distinguish it from 0.3.
On the other hand, the multiplication 4*0.1 does get the closest value to 0.4 (0x1.999999999999ap-2 in hex), so it doesn't need any additional digits.
You can verify this quite easily:
>>> 3*0.1 == 0.3
False
>>> 4*0.1 == 0.4
True
I used hex notation above because it's nice and compact and shows the bit difference between the two values. You can do this yourself using e.g. (3*0.1).hex(). If you'd rather see them in all their decimal glory, here you go:
>>> from decimal import Decimal
>>> Decimal(3*0.1)
Decimal('0.3000000000000000444089209850062616169452667236328125')
>>> Decimal(0.3)
Decimal('0.299999999999999988897769753748434595763683319091796875')
>>> Decimal(4*0.1)
Decimal('0.40000000000000002220446049250313080847263336181640625')
>>> Decimal(0.4)
Decimal('0.40000000000000002220446049250313080847263336181640625')
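The "one LSB higher" claim can also be checked by comparing the raw bit patterns. This is a minimal sketch assuming Python floats are IEEE 754 binary64; float_bits is a hypothetical helper name, not a standard library function.

```python
import struct

def float_bits(x: float) -> int:
    """Return the 64-bit IEEE 754 pattern of x as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

# For positive floats, adjacent values have bit patterns that differ by 1.
lsb_gap_3 = float_bits(3 * 0.1) - float_bits(0.3)
lsb_gap_4 = float_bits(4 * 0.1) - float_bits(0.4)
print(lsb_gap_3, lsb_gap_4)
```

A gap of 1 means 3*0.1 is the very next float above 0.3, and a gap of 0 means 4*0.1 and 0.4 are the same float.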
Solution 3
Here's a simplified conclusion from other answers.
If you check a float on Python's command line or print it, it goes through the function repr, which creates its string representation.
Starting with version 3.2, Python's str and repr use a complex rounding scheme, which prefers nice-looking decimals if possible, but uses more digits where necessary to guarantee a bijective (one-to-one) mapping between floats and their string representations.
This scheme guarantees that the value of repr(float(s)) looks nice for simple decimals, even if they can't be represented precisely as floats (e.g. when s = "0.1").
At the same time it guarantees that float(repr(x)) == x holds for every float x.
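To make the "shortest string that still round-trips" idea concrete, here is a brute-force sketch. It is not how CPython does it (the real code is far more involved), and shortest_round_trip is a hypothetical helper that simply tries increasing precision until the string reads back as the same float.

```python
def shortest_round_trip(x: float) -> str:
    """Hypothetical helper: fewest significant digits that still read back as x."""
    for digits in range(1, 18):          # 17 digits always suffice for a double
        s = format(x, f".{digits}g")
        if float(s) == x:
            return s
    raise AssertionError("unreachable for finite doubles")

# Print the brute-force result next to Python's own repr for comparison.
for value in (0.3, 3 * 0.1, 0.4, 4 * 0.1):
    print(shortest_round_trip(value), repr(value))
```

For 0.3 and 4*0.1 a single digit is enough, while 3*0.1 needs all 17 digits before the string stops reading back as the float 0.3.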
Solution 4
This is not really specific to Python's implementation; it should apply to any float-to-decimal-string function.
A floating point number is essentially a binary number, but in scientific notation with a fixed limit of significant figures.
The inverse of any number that has a prime number factor that is not shared with the base will always result in a recurring dot point representation. For example 1/7 has a prime factor, 7, that is not shared with 10, and therefore has a recurring decimal representation, and the same is true for 1/10 with prime factors 2 and 5, the latter not being shared with 2; this means that 0.1 cannot be exactly represented by a finite number of bits after the dot point.
Since 0.1 has no exact representation, a function that converts the approximation to a decimal point string will usually try to approximate certain values so that they don't get unintuitive results like 0.1000000000004121.
Since the floating point is in scientific notation, any multiplication by a power of the base only affects the exponent part of the number. For example 1.231e+2 * 100 = 1.231e+4 for decimal notation, and likewise, 1.00101010e11 * 100 = 1.00101010e101 in binary notation. If I multiply by a non-power of the base, the significant digits will also be affected. For example 1.2e1 * 3 = 3.6e1
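The exponent-only effect of multiplying by a power of two can be observed with math.frexp, which splits a float into a significand and a binary exponent. A small sketch, assuming ordinary binary doubles; the variable names are illustrative only.

```python
import math

# math.frexp(x) returns (m, e) with x == m * 2**e and 0.5 <= abs(m) < 1.
m1, e1 = math.frexp(0.1)
m4, e4 = math.frexp(4 * 0.1)
same_significand = (m1 == m4)   # multiplying by 4 leaves the significand alone
exponent_shift = e4 - e1        # only the exponent moves

# Multiplying by 3 changes the significand, and not to the one that 0.3 has.
m3, e3 = math.frexp(3 * 0.1)
m03, e03 = math.frexp(0.3)
print(same_significand, exponent_shift, m3 == m03, e3 == e03)
```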
Depending on the algorithm used, it may try to guess common decimals based on the significant figures only. Both 0.1 and 0.4 have the same significant figures in binary, because their floats are essentially truncations of (8/5)(2^-4) and (8/5)(2^-6) respectively. If the algorithm identifies the 8/5 sigfig pattern as the decimal 1.6, then it will work on 0.1, 0.2, 0.4, 0.8, etc. It may also have magic sigfig patterns for other combinations, such as the float 3 divided by float 10 and other magic patterns statistically likely to be formed by division by 10.
In the case of 3*0.1, the last few significant figures will likely be different from dividing a float 3 by float 10, causing the algorithm to fail to recognize the magic number for the 0.3 constant depending on its tolerance for precision loss.
Edit: https://docs.python.org/3.1/tutorial/floatingpoint.html
Interestingly, there are many different decimal numbers that share the same nearest approximate binary fraction. For example, the numbers 0.1 and 0.10000000000000001 and 0.1000000000000000055511151231257827021181583404541015625 are all approximated by 3602879701896397 / 2 ** 55. Since all of these decimal values share the same approximation, any one of them could be displayed while still preserving the invariant eval(repr(x)) == x.
There is no tolerance for precision loss: if float x (0.3) is not exactly equal to float y (0.1*3), then repr(x) is not exactly equal to repr(y).
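The quoted claim from the tutorial is easy to reproduce. The sketch below parses the three decimal strings mentioned in the quote and compares the resulting floats against the fraction 3602879701896397 / 2 ** 55; it assumes IEEE 754 doubles.

```python
from fractions import Fraction

candidates = [
    "0.1",
    "0.10000000000000001",
    "0.1000000000000000055511151231257827021181583404541015625",
]
floats = [float(s) for s in candidates]

# All three decimal strings round to one and the same double.
all_same = len(set(floats)) == 1

# That double is exactly the binary fraction quoted in the tutorial.
ratio_matches = Fraction(floats[0]) == Fraction(3602879701896397, 2 ** 55)
print(all_same, ratio_matches, repr(floats[0]))
```

Any of the three strings would preserve the value, and repr picks the shortest one.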
Aivar
Teaching assistant in University of Tartu, Department of CS
Updated on September 07, 2020
Comments
-
Aivar over 3 years
I know that most decimals don't have an exact floating point representation (Is floating point math broken?).
But I don't see why 4*0.1 is printed nicely as 0.4, but 3*0.1 isn't, when both values actually have ugly decimal representations:
>>> 3*0.1
0.30000000000000004
>>> 4*0.1
0.4
>>> from decimal import Decimal
>>> Decimal(3*0.1)
Decimal('0.3000000000000000444089209850062616169452667236328125')
>>> Decimal(4*0.1)
Decimal('0.40000000000000002220446049250313080847263336181640625')
-
Bathsheba over 7 years
@MorganThrapp: no it isn't. The OP is asking about the rather arbitrary-looking formatting choice. Neither 0.3 nor 0.4 can be represented exactly in binary floating point.
-
Morgan Thrapp over 7 years
It's not arbitrary at all, it's showing any significant digits.
-
BartoszKP over 7 years
Obligatory link under every floating point related question: docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
-
Mooing Duck over 7 years
@BartoszKP: Having read the document several times, it doesn't explain why Python is displaying 0.3000000000000000444089209850062616169452667236328125 as 0.30000000000000004 and 0.40000000000000002220446049250313080847263336181640625 as .4 even though they appear to have the same accuracy, and thus doesn't answer the question.
-
Random832 over 7 years
See also stackoverflow.com/questions/28935257/… - I'm somewhat irritated that it got closed as a duplicate but this one hasn't.
-
Bakuriu over 7 years
@Gilles No this is not a duplicate of that question. This is a question about string representation of floating points in python.
-
coteyr over 7 years
Good ole 2 + 2 = 5 for extremely large values of 2
-
ShadowRanger over 7 years
The What's new in Python 3.1 docs (scroll to end of linked section, just before "New, Improved and Deprecated Modules") are a useful explanation for why/when Python 2.7/3.1+ have much shorter float reprs for some values. Straight from the horse's mouth, so to speak.
-
Antti Haapala -- Слава Україні over 7 years
Reopened, please do not close this as a duplicate of "is floating point math broken".
-
NPE over 7 years
This is an amazingly comprehensive answer, thank you. (In particular, thanks for showing .hex(); I didn't know it existed.)
-
Mark Ransom over 7 years
@NPE then you might be interested in float.fromhex() too, it does the reverse.
-
supercat over 7 years
I wonder if it would be worth noting the precise decimal values of the nearest "doubles" to 0.1, 0.3, and 0.4, since a lot of people can't read floating-point hex.
-
Mark Ransom over 7 years
@supercat you make a good point. Putting those super large doubles into the text would be distracting, but I thought of a way to add them.
-
supercat over 7 years
Out of curiosity, does Python always try to use the shortest string that is within 0.50 ulp of the given value, or does it use the shortest string that is within e.g. 0.47 ulp of the given value? Some floating-point libraries, if given a decimal string which is almost exactly halfway between two values that are representable as "double", may not always return the value which is closer to the exact value represented by the string, but printing one more decimal digit would solve that problem.
-
nneonneo over 7 years
@supercat: Python tries to find the shortest string that would round to the desired value, whatever that happens to be. Obviously the evaluated value must be within 0.5ulp (or it would round to something else), but it may require more digits in ambiguous cases. The code is very gnarly, but if you want to take a peek: hg.python.org/cpython/file/03f2c8fc24ea/Python/dtoa.c#l2345
-
Aivar over 7 years
Can we then say that Python's repr uses selective rounding (meaning it doesn't use the same simple rounding rule for all floats)?
-
Mark Dickinson over 7 years
Your answer is accurate for Python versions >= 3.2, where str and repr are identical for floats. For Python 2.7, repr has the properties you identify, but str is much simpler - it simply computes 12 significant digits and produces an output string based on those. For Python <= 2.6, both repr and str are based on a fixed number of significant digits (17 for repr, 12 for str). (And nobody cares about Python 3.0 or Python 3.1 :-)
-
Aivar over 7 years
Thanks @MarkDickinson! I included your comment in the answer.
-
Bakuriu over 7 years
@supercat This has changed in python3.1, see the issue with the patch. In any case: the default representation is designed to produce the more readable result that completely preserves the value of the float. This means that eval(repr(f)) == f for all floats f (and eval(s) does the same as float(s)). However float('0.100000000000000012') == 0.1 even though it is actually closer to 0.10000000000000002 (which is the next representable double).
-
Mark Dickinson over 7 years
@Bakuriu: I'm not sure what you're saying. The float constructor always does correct rounding. The nearest representable float to 0.100000000000000012 is 0.1000000000000000055511151231257827021181583404541015625, which Python displays as 0.1.
-
Mark Dickinson over 7 years
@supercat: Always the shortest string that's within 0.5 ulp. (Strictly within if we're looking at a float with odd LSB; i.e., the shortest string that makes it work with round-ties-to-even). Any exceptions to this are a bug, and should be reported.
-
Bergi over 7 years
What does the p stand for in that hex representation? And are these actually valid number literals (like hex integers), or are they only a custom formatting?
-
Mark Ransom over 7 years
@Bergi the p takes the place of the e in scientific notation, but I don't know the rationale for choosing a different letter. They are not valid literals, you need to use the float.fromhex() function with a string as I mentioned earlier.
-
Bergi over 7 years
@MarkRansom Surely they did use something else than e because that's already a hex digit. Maybe p for power instead of exponent.
-
Mark Dickinson over 7 years
@Bergi: The use of p in this context goes back (at least) to C99, and also appears in IEEE 754 and in various other languages (including Java). When float.hex and float.fromhex were implemented (by me :-), Python was merely copying what was by then established practice. I don't know whether the intention was 'p' for "Power", but it seems like a nice way to think about it.
-
Antti Haapala -- Слава Україні over 7 years
Note that the rounding from shell comes from repr thus the Python 2.7 behaviour would be identical...
-
Antti Haapala -- Слава Україні over 7 years
This does not really add much to the existing answers.
-
Mark Dickinson over 7 years
"Depending on the algorithm used, it may try to guess common decimals based on the significant figures only." <- This seems like pure speculation. Other answers have described what Python actually does.
-
Aleksandr Dubinsky over 7 years
@nneonneo "Python tries to find the shortest string that would round to the desired value." That should be the first line of your answer.