Convert float to string in positional format (without scientific notation and false precision)
Solution 1
Unfortunately it seems that not even the new-style formatting with float.__format__ supports this. The default formatting of floats is the same as with repr; and with the f flag there are 6 fractional digits by default:
>>> format(0.0000000005, 'f')
'0.000000'
However there is a hack to get the desired result - not the fastest one, but relatively simple:
- first the float is converted to a string using str() or repr()
- then a new Decimal instance is created from that string.
- Decimal.__format__ supports the f flag which gives the desired result, and, unlike floats, it prints the actual precision instead of default precision.
Thus we can make a simple utility function float_to_str:
import decimal

# create a new context for this task
ctx = decimal.Context()

# 20 digits should be enough for everyone :D
ctx.prec = 20

def float_to_str(f):
    """
    Convert the given float to a string,
    without resorting to scientific notation
    """
    d1 = ctx.create_decimal(repr(f))
    return format(d1, 'f')
Care must be taken to not use the global decimal context, so a new context is constructed for this function. This is the fastest way; another way would be to use decimal.localcontext, but it would be slower, creating a new thread-local context and a context manager for each conversion.
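For comparison, the slower alternative mentioned above could look roughly like this. It is only a sketch: the name float_to_str_localcontext and the prec parameter are made up for illustration and are not part of the answer's code.

```python
import decimal

def float_to_str_localcontext(f, prec=20):
    """Hypothetical variant of float_to_str built on decimal.localcontext()."""
    # localcontext() copies the current thread's context and restores the
    # previous one when the with-block exits, so the global precision
    # setting is left untouched
    with decimal.localcontext() as ctx:
        ctx.prec = prec
        # the Decimal constructor is exact; unary plus applies the
        # precision of the active context
        d1 = +decimal.Decimal(repr(f))
    return format(d1, 'f')

print(float_to_str_localcontext(0.00000005))
```

The result should match float_to_str; the difference is only the per-call cost of setting up the context manager.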
This function now returns the string with all possible digits from mantissa, rounded to the shortest equivalent representation:
>>> float_to_str(0.1)
'0.1'
>>> float_to_str(0.00000005)
'0.00000005'
>>> float_to_str(420000000000000000.0)
'420000000000000000'
>>> float_to_str(0.000000000123123123123123123123)
'0.00000000012312312312312313'
The last result is rounded at the last digit.
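One way to convince yourself that nothing is lost is a round-trip check: converting the string back with float() should give the original value. A minimal sketch, repeating the function so the snippet is self-contained (the sample values are arbitrary):

```python
import decimal

ctx = decimal.Context()
ctx.prec = 20

def float_to_str(f):
    return format(ctx.create_decimal(repr(f)), 'f')

samples = [0.1, 0.00000005, 420000000000000000.0,
           0.000000000123123123123123123123, -4.5678e-5]

for x in samples:
    s = float_to_str(x)
    # the output must be positional and must parse back to the same float
    assert 'e' not in s.lower()
    assert float(s) == x
```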
As @Karin noted, float_to_str(420000000000000000.0) does not strictly match the format expected; it returns 420000000000000000 without trailing .0.
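If the trailing .0 matters, a thin wrapper can add it back. This is a sketch under my own naming (float_to_str_with_point and _ctx are hypothetical, not from the answer):

```python
import decimal

_ctx = decimal.Context(prec=20)

def float_to_str_with_point(f):
    """Hypothetical wrapper: like float_to_str, but whole numbers get '.0'."""
    s = format(_ctx.create_decimal(repr(f)), 'f')
    # only plain digit strings get the suffix; 'Infinity' and 'NaN'
    # are left as they are
    if '.' not in s and s.lstrip('+-').isdigit():
        s += '.0'
    return s

print(float_to_str_with_point(420000000000000000.0))
```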
Solution 2
If you are satisfied with the precision in scientific notation, then could we just take a simple string manipulation approach? Maybe it's not terribly clever, but it seems to work (passes all of the use cases you've presented), and I think it's fairly understandable:
def float_to_str(f):
    float_string = repr(f)
    if 'e' in float_string:  # detect scientific notation
        digits, exp = float_string.split('e')
        digits = digits.replace('.', '').replace('-', '')
        exp = int(exp)
        zero_padding = '0' * (abs(int(exp)) - 1)  # minus 1 for decimal point in the sci notation
        sign = '-' if f < 0 else ''
        if exp > 0:
            float_string = '{}{}{}.0'.format(sign, digits, zero_padding)
        else:
            float_string = '{}0.{}{}'.format(sign, zero_padding, digits)
    return float_string
n = 0.000000054321654321
assert(float_to_str(n) == '0.000000054321654321')
n = 0.00000005
assert(float_to_str(n) == '0.00000005')
n = 420000000000000000.0
assert(float_to_str(n) == '420000000000000000.0')
n = 4.5678e-5
assert(float_to_str(n) == '0.000045678')
n = 1.1
assert(float_to_str(n) == '1.1')
n = -4.5678e-5
assert(float_to_str(n) == '-0.000045678')
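The test cases above only cover a large number with a single mantissa digit after the point (4.2e+17). For positive exponents the padding also depends on how many fractional digits the mantissa has, so a value such as 421000000000000000.0 needs one zero fewer. A possible adjustment is sketched below; the name float_to_str_fixed is hypothetical, and the code assumes repr only switches to scientific notation for large values when the exponent is at least 16, which is never smaller than the number of fractional mantissa digits of a double.

```python
def float_to_str_fixed(f):
    """Hypothetical variant of the string-manipulation approach."""
    float_string = repr(f)
    if 'e' not in float_string:  # already positional (or inf/nan)
        return float_string
    mantissa, exp = float_string.split('e')
    exp = int(exp)
    sign = '-' if mantissa.startswith('-') else ''
    int_part, _, frac_part = mantissa.lstrip('-').partition('.')
    digits = int_part + frac_part
    if exp > 0:
        # fractional mantissa digits already use up part of the shift
        zero_padding = '0' * (exp - len(frac_part))
        return '{}{}{}.0'.format(sign, digits, zero_padding)
    # negative exponent: one leading digit sits right of the new point
    zero_padding = '0' * (-exp - 1)
    return '{}0.{}{}'.format(sign, zero_padding, digits)

assert float_to_str_fixed(421000000000000000.0) == '421000000000000000.0'
assert float_to_str_fixed(420000000000000000.0) == '420000000000000000.0'
assert float_to_str_fixed(-4.5678e-5) == '-0.000045678'
```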
Performance:
I was worried this approach may be too slow, so I ran timeit and compared with the OP's solution of decimal contexts. It appears the string manipulation is actually quite a bit faster. Edit: It appears to only be much faster in Python 2. In Python 3, the results were similar, but with the decimal approach slightly faster.
Result:
Python 2: using ctx.create_decimal(): 2.43655490875
Python 2: using string manipulation: 0.305557966232
Python 3: using ctx.create_decimal(): 0.19519368198234588
Python 3: using string manipulation: 0.2661344590014778
Here is the timing code:
from timeit import timeit
CODE_TO_TIME = '''
float_to_str(0.000000054321654321)
float_to_str(0.00000005)
float_to_str(420000000000000000.0)
float_to_str(4.5678e-5)
float_to_str(1.1)
float_to_str(-0.000045678)
'''
SETUP_1 = '''
import decimal

# create a new context for this task
ctx = decimal.Context()

# 20 digits should be enough for everyone :D
ctx.prec = 20

def float_to_str(f):
    """
    Convert the given float to a string,
    without resorting to scientific notation
    """
    d1 = ctx.create_decimal(repr(f))
    return format(d1, 'f')
'''
SETUP_2 = '''
def float_to_str(f):
    float_string = repr(f)
    if 'e' in float_string:  # detect scientific notation
        digits, exp = float_string.split('e')
        digits = digits.replace('.', '').replace('-', '')
        exp = int(exp)
        zero_padding = '0' * (abs(int(exp)) - 1)  # minus 1 for decimal point in the sci notation
        sign = '-' if f < 0 else ''
        if exp > 0:
            float_string = '{}{}{}.0'.format(sign, digits, zero_padding)
        else:
            float_string = '{}0.{}{}'.format(sign, zero_padding, digits)
    return float_string
'''
print(timeit(CODE_TO_TIME, setup=SETUP_1, number=10000))
print(timeit(CODE_TO_TIME, setup=SETUP_2, number=10000))
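If you would rather not keep the code under test in strings, timeit also accepts a callable. The following is only a sketch of that style, shown for the decimal version; run_all and SAMPLES are names invented here, and the numbers it prints are not comparable to the results quoted above.

```python
import decimal
from timeit import repeat

ctx = decimal.Context()
ctx.prec = 20

def float_to_str(f):
    return format(ctx.create_decimal(repr(f)), 'f')

SAMPLES = [0.000000054321654321, 0.00000005, 420000000000000000.0,
           4.5678e-5, 1.1, -0.000045678]

def run_all():
    # one pass over the same inputs as CODE_TO_TIME
    for x in SAMPLES:
        float_to_str(x)

# repeat() returns one total per run; the minimum is the least noisy figure
timings = repeat(run_all, number=10000, repeat=5)
print(min(timings))
```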
Solution 3
As of NumPy 1.14.0, you can just use numpy.format_float_positional. For example, running against the inputs from your question:
>>> numpy.format_float_positional(0.000000054321654321)
'0.000000054321654321'
>>> numpy.format_float_positional(0.00000005)
'0.00000005'
>>> numpy.format_float_positional(0.1)
'0.1'
>>> numpy.format_float_positional(4.5678e-20)
'0.000000000000000000045678'
numpy.format_float_positional uses the Dragon4 algorithm to produce the shortest decimal representation in positional format that round-trips back to the original float input. There's also numpy.format_float_scientific for scientific notation, and both functions offer optional arguments to customize things like rounding and trimming of zeros.
Solution 4
If you are willing to lose some precision by calling str() on the float, then this is the way to go:
import decimal

def float_to_string(number, precision=20):
    return '{0:.{prec}f}'.format(
        decimal.Context(prec=100).create_decimal(str(number)),
        prec=precision,
    ).rstrip('0').rstrip('.') or '0'
It doesn't include global variables and allows you to choose the precision yourself. Decimal precision 100 is chosen as an upper bound for str(float) length. The actual supremum is much lower. The or '0' part is for the situation with small numbers and zero precision.
Note that it still has its consequences (the output below is from Python 2, where str() keeps only 12 significant digits):
>>> float_to_string(0.10101010101010101010101010101)
'0.10101010101'
Otherwise, if the precision is important, format is just fine:
import decimal

def float_to_string(number, precision=20):
    return '{0:.{prec}f}'.format(
        number, prec=precision,
    ).rstrip('0').rstrip('.') or '0'
This version does not suffer from the precision lost when calling str(f):
>>> float_to_string(0.1, precision=10)
'0.1'
>>> float_to_string(0.1)
'0.10000000000000000555'
>>> float_to_string(0.1, precision=40)
'0.1000000000000000055511151231257827021182'
>>> float_to_string(4.5678e-5)
'0.000045678'
>>> float_to_string(4.5678e-5, precision=1)
'0'
Anyway, maximum decimal places are limited, since the float type itself has its limits and cannot express really long floats:
>>> float_to_string(0.1, precision=10000)
'0.1000000000000000055511151231257827021181583404541015625'
Also, whole numbers are being formatted as-is.
>>> float_to_string(100)
'100'
Solution 5
I think rstrip can get the job done.
a=5.4321654321e-08
'{0:.40f}'.format(a).rstrip("0") # float number and delete the zeros on the right
# '0.0000000543216543210000004442039220863003' # there's roundoff error though
Let me know if that works for you.
Antti Haapala -- Слава Україні
Updated on July 05, 2022
Comments
-
Antti Haapala -- Слава Україні almost 2 years
I want to print some floating point numbers so that they're always written in decimal form (e.g. 12345000000000000000000.0 or 0.000000000000012345, not in scientific notation), yet I'd want the result to have up to ~15.7 significant figures of an IEEE 754 double, and no more.
Ideally, the result should be the shortest string in positional decimal format that still results in the same value when converted to a float.
It is well-known that the repr of a float is written in scientific notation if the exponent is greater than 15, or less than -4:
>>> n = 0.000000054321654321
>>> n
5.4321654321e-08 # scientific notation
If str is used, the resulting string again is in scientific notation:
>>> str(n)
'5.4321654321e-08'
It has been suggested that I can use format with the f flag and sufficient precision to get rid of the scientific notation:
>>> format(0.00000005, '.20f')
'0.00000005000000000000'
It works for that number, though it has some extra trailing zeroes. But then the same format fails for .1, which gives decimal digits beyond the actual machine precision of float:
>>> format(0.1, '.20f')
'0.10000000000000000555'
And if my number is 4.5678e-20, using .20f would still lose relative precision:
>>> format(4.5678e-20, '.20f')
'0.00000000000000000005'
Thus these approaches do not match my requirements.
This leads to the question: what is the easiest and also well-performing way to print an arbitrary floating point number in decimal format, having the same digits as in repr(n) (or str(n) on Python 3), but always using the decimal format, not the scientific notation?
That is, a function or operation that for example converts the float value 0.00000005 to string '0.00000005'; 0.1 to '0.1'; 420000000000000000.0 to '420000000000000000.0' or 420000000000000000; and formats the float value -4.5678e-5 as '-0.000045678'.
After the bounty period: It seems that there are at least 2 viable approaches, as Karin demonstrated that using string manipulation one can achieve significant speed boost compared to my initial algorithm on Python 2.
Thus,
- If performance is important and Python 2 compatibility is required; or if the decimal module cannot be used for some reason, then Karin's approach using string manipulation is the way to do it.
- On Python 3, my somewhat shorter code will also be faster.
Since I am primarily developing on Python 3, I will accept my own answer, and shall award Karin the bounty.
-
Bakuriu almost 8 years
Why don't you use decimal.localcontext? with localcontext() as ctx: ctx.prec = 20; d1 = Decimal(str(f))
-
Antti Haapala -- Слава Україні almost 8 years
@Bakuriu why would I, it can only be slower
-
Antti Haapala -- Слава Україні almost 8 years
There is no need to create that Decimal at all, your approach works for floats already, but the result of false precision was rejected in the question. These are rounded measurement results, not some arbitrary binary fractions.
-
Antti Haapala -- Слава Україні almost 8 years
You could actually specify the initialization (def format_float; import decimal; ctx = ...) as the second argument to timeit; that way it doesn't get included in the measurements.
-
Karin almost 8 years
Ahh that seems obvious from the docs now. Great to know! I've updated my timing code and it looks much cleaner now thanks to you :)
-
Antti Haapala -- Слава Україні almost 8 years
I need to add one more case to test, though. The number can be negative, yours still calculates n = -4.5678e-5; assert(format_float(n) == '-0.000045678') incorrectly :D
-
Antti Haapala -- Слава Україні almost 8 years
And one more gotcha: This is way faster on Python 2 than my code, but slower on Python 3; seems that in Python 3 the decimal constructor performs much better than in Python 2.
-
Wayne Werner almost 8 years
I'm consistently surprised how often the naive "just stringify it" approach works, and sometimes works even better than other cases.
-
Karin almost 8 years
@Antti Fascinating! I can confirm your approach is much faster in Python 3 than Python 2. Another weirdness though, is that the 420000000000000000.0 use case actually fails for me for your decimal approach in Python 2 and 3. Very strange =\
-
Antti Haapala -- Слава Україні almost 8 years
@Karin it is because it seems that if decimal has more than 16 places, there is no .0 any longer.
-
Antti Haapala -- Слава Україні almost 8 years
Why are you adjusting the precision in my approach? I fixed it to 20 to get all of the 15.7 decimal digits of precision of IEEE-754 doubles.
-
Karin almost 8 years
@Antti But then how did it work for you in your answer's example usage?
-
Antti Haapala -- Слава Україні almost 8 years
Frankly, I didn't remember that the returned string was without .0, I didn't copy-paste my example output from Python shell, instead writing it here. Good catch :D I fixed my answer.
-
Martijn Pieters almost 8 years
decimal has received several speed improvements in Python 3.3 (switch to libmpdec, caching, etc.) leading to 10x - 100x performance gains depending on what you are trying to make it do.
-
Antti Haapala -- Слава Україні over 7 years
Unfortunately as I stated in my question, I do not want any residual fractional part from the fact that these happened to be stored as binary.
-
user2357112 over 7 years
I see precision loss in the output for 0.000000000123123123123123123123 - the float_to_str output cuts off at only 12 digits of precision, not enough to reconstruct the original float.
-
Antti Haapala -- Слава Україні over 7 years
Karin, not only were you the only answerer to understand what I sought for, but you also found a clever approach to achieve it using string manipulation that performs very well on Python 2. :D Thus I awarded the bounty to you. However, I chose to accept my self-answer in this case since the project for which we needed this uses Python 3, and we're already successfully using my approach.
-
Antti Haapala -- Слава Україні over 7 years
@user2357112 good catch. You're using Python 2; in Python 2 str only has 12 digits of precision while repr uses the Python 3 compatible algorithm. In Python 3, both forms are similar, thus the confusion. I changed my code to use repr.
-
Antti Haapala -- Слава Україні over 7 years
(Ah one more thing, this should be using repr instead of str to get consistent results Python 2 vs 3.)
-
Karin over 7 years
@Antti Thanks! This was a fun use case :) Also updated my code to use repr as suggested.
-
Antti Haapala -- Слава Україні about 5 years
Hey, that's nice. Not practical if NumPy is not needed otherwise, but if it is this is definitely what one should be using.
-
Marses over 4 years
Good answer, but to be honest, I feel like this should be implemented in python directly and doable through .format. I don't see why .format doesn't include this use case. Printing a number in non-scientific notation with significant figures for example requires a hack like this. Yet I imagine it's an extremely common use case for plotting scientific figures with short logarithmic scales.
-
Marses over 4 years
Even better answer. Though my opinion is that this functionality should be included directly as an option in the .format method for strings. Decimal representations with a significant figure limit are an extremely common use case in scientific graphs with logarithmic scales.
-
recolic almost 4 years
still not working for float_to_str(333333333333333333333333333333333333333333333333333333333333333333333333333333.333333333333333333333333333333333333333333333333333333333333333)
-
fivelements almost 3 years
this only works when exp in float_to_str() is <0. The 3rd test case happens to work because there is only one decimal digit in the scientific notation. It won't work if there are more than 1. (n = 421000000000000000.0 will not work)
-
T. de Jong almost 3 years
This helped me a lot! Thanks for the clear explanation
-
N4v about 2 years
Why can't the global decimal context be used here?
-
Antti Haapala -- Слава Україні about 2 years
@N4v who knows what the setting is :D