Extract fractional part of double efficiently in C

c floating-point double micro-optimization bit-manipulation

39,626

Solution 1

How about something simple?

double fraction = whole - ((long)whole);

This just subtracts the integer portion of the double from the value itself, the remainder should be the fractional component. It's possible, of course, this could have some representation issues.

Solution 2

#include <math.h>
double fraction = fmod(d, 1.0);

Solution 3

The optimal implementation depends entirely on the target architecture.

On recent Intel processors, this can be achieved with two instructions: roundsd and subsd, but that can't be expressed in portable C code.

On some processors, the fastest way to do this is with integer operations on the floating point representation. Early Atom and many ARM CPUs come to mind.

On some other processors, the fastest thing is to cast to integer and back, then subtract, branching to protect large values.

If you're going to be handling lots of values, you can set the rounding mode to round-to-zero, then add and subtract +/-2^52 to the number truncated to integer, then subtract from the original value to get the fraction. If you don't have SSE4.1, but do have an otherwise modern Intel CPU and want to vectorize, this is typically the best you can do. It only makes sense if you have many values to process, however, because changing the rounding mode is somewhat expensive.

On other architectures, other implementations are optimal. In general, it doesn't make sense to talk about "efficiency" of C programs; only the efficiency of a specific implementation on a specific architecture.

Solution 4

Proposal

The function remainder computes the remainder, but not the integer part like modf does:

#include <math.h>

double fracpart(double input)
{
    return remainder(input, 1.);
}

This is the most efficient (and portable) way, as it doesn't compute unnecessary values to do the job (cf. modf, (long), fmod, etc.)

Benchmark

As Mattew suggested in the comments, I wrote some benchmark code to compare this solution to all the other ones offered on this page.

Please find below the time measurements for 65536 computations (compiled with Clang with optimizations turned off):

method 1 took 0.002389 seconds (using remainder)
method 2 took 0.000193 seconds (casting to long)
method 3 took 0.000209 seconds (using floor)
method 4 took 0.000257 seconds (using modf)
method 5 took 0.010178 seconds (using fmod)

Again with Clang, this time using the -O3 flag:

method 1 took 0.002222 seconds (using remainder)
method 2 took 0.000000 seconds (casting to long)
method 3 took 0.000000 seconds (using floor)
method 4 took 0.000223 seconds (using modf)
method 5 took 0.010131 seconds (using fmod)

Turns out the simplest solution seems to give the best results on most platforms, and the specific methods to perform that task (fmod, modf, remainder) are actually super-slow!

Solution 5

Some profiling and experimentation using C++ in Microsoft Visual Studio 2015 indicates that the best method for positive numbers is:

double n;
// ...
double fractional_part = n - floor(n);

It's faster than modf, and, as has already been mentioned, the remainder function rounds to the nearest integer, and therefore isn't of use.

View more solutions

39,626

Author by

Jeremy Salwen

Updated on December 24, 2021

Comments

Jeremy Salwen over 2 years
I'm looking to take an IEEE double and remove any integer part of it in the most efficient manner possible.

I want
```
1035 ->0
1045.23->0.23
253e-23=253e-23
```
I do not care about properly handling denormals, infinities, or NaNs. I do not mind bit twiddling, as I know I am working with IEEE doubles, so it should work across machines.

Branchless code would be much preferred.

My first thought is (in pseudo code)
```
char exp=d.exponent;
(set the last bit of the exponent to 1)
d<<=exp*(exp>0);
(& mask the last 52 bits of d)
(shift d left until the last bit of the exponent is zero, decrementing exp each time)
d.exponent=exp;
```
But the problem is that I can't think of an efficient way to shift d left until the last bit of the exponent is zero, plus it seems it would need to output zero if all of the last bits weren't set. This seems to be related to the base 2 logarithm problem.

Help with this algorithm or any better ones would be much appreciated.

I should probably note that the reason I want branchless code is because I want it to efficiently vectorize.
jberg about 13 years

Not sure how the speed compares, but you can also do double fraction = whole%1;
Mark Elliot about 13 years

@jberg: you can do fmod (% is integer only), and I'm almost sure it's slower since % will involve a division.
Jeremy Salwen about 13 years

The problem is that it won't necessarily fit into a float. My upper bound. I'm looking to handle two different cases, one with numbers |x|<2^60, which this will work for, but others 2^60<|x|<2^128. A long can only hold up to 2^65. So perhaps I will use this one for the first case, but I still need something efficient for the second.
Mark Elliot about 13 years

@Jeremy: in that case fmod is the way to go.
Jeremy Salwen about 13 years

I'm specifically asking because this is not efficient enough for my purposes. Obviously I could strip the NaN, infinity handling, and integer part calculation from some implementation of fmod, but it still has more branching in it than I like.
user541686 about 13 years

@Jeremy: Ah, I see... how about subtracting the floor? That shouldn't require any branching.
Jeremy Salwen about 13 years

That's a good point. Floor may be vectorizable. I will look into that.
Jeremy Salwen about 13 years

@Mark, I'm not sure that's a vectorizable call, but I'll look into it.
user541686 about 13 years

@Jeremy: _mm_floor_pd might help.
Jeremy Salwen about 13 years

Hmm... It seems GCC won't vectorize it.
user541686 about 13 years

@Jeremy: It's an SSE4 instruction unfortunately, so your CPU might not support it... did you import smmintrin.h and compile with the SSE flag?
Jeremy Salwen about 13 years

I compiled with gcc -march=native -mtune=native -ftree-vectorizer-verbose=4 -O3 -std=c99 -lm main.c. I think none of my computers support SSE4, although one does support SSE4a. Is there a way to do something similar without SSE4?
Jeremy Salwen about 13 years

Actually, duh, this is the answer. If it's too big for a long, I can detect that and just set the fractional part to zero. Accepted.
user541686 about 13 years

@Jeremy: Compile with -msse4. But if your CPU doesn't support it then it won't work; in that case, you can't really use SIMD to vectorize this particular calculation. :\
R.. GitHub STOP HELPING ICE about 13 years

Indeed, any value that big cannot have a fractional part. But you should use int64_t, not long. long might be only 32 bits, in which case the values you need won't fit.
Jeremy Salwen about 13 years

Perhaps I should have been more specific about efficiency. I mean efficient portable C99 code which GCC will vectorize. In the sense of mean squared running time on processors weighted by popularity of PC model ownership.
aka.nice over 11 years

I'm found of add/subtract 2^52 (with proper copysign) and round to zero mode, original and neat.
shadow_map almost 9 years

Won't this be a problem since remainder() rounds to nearest integral value? fmod() would be ok though.
Matthew D. Scholefield almost 8 years

I'd be interested to see how this efficiency compares to others.
Mathieu Rodic almost 8 years

@MatthewD.Scholefield excellent question, I'm benchmarking right now :)
Peter Cordes almost 8 years

Can't you express the desired semantics with floor() and the - operator? Then a compiler targeting SSE4 can use roundsd / roundpd to implement those semantics. (But I guess you might need -fno-math-errno or maybe the full -ffast-math to let it actually vectorize floor() to roundpd.)
Peter Cordes almost 8 years

compiled with Clang with optimizations turned off. Then your results are meaningless. Optimization doesn't speed everything up by the same percent. See my answer on this question about an assignment where they had to optimize for -O0.
Peter Cordes almost 8 years

@R..: floor would be a better choice than a cast to integer here, I think, at least for modern CPUs (x86 with SSE4 roundpd). Implementing this way requires the compiler to make code that will fail if the double is outside the range of representable long values. A round-trip from FP to integer and back can be slower than rounding. I tried both versions on the Godbolt compiler explorer. floor didn't inline with just -fno-math-errno, unfortunately, so I used -ffast-math. If you can't do that, and can't assume SSE4, casting looks good.
R.. GitHub STOP HELPING ICE almost 8 years

A conversion from double to long has UB for values outside the range of long, so the compiler should be able to optimize it assuming the value is in range (and thereby avoiding any conversion to integer type).
Mathieu Rodic almost 8 years

The results are extremely similar with optimizations turned on (see edited answer).
Peter Cordes almost 8 years

0.000000 seconds. Sounds like your benchmark optimized away. I do expect that floor or casting to long are going to be very efficient, based on the fact that they only take a couple of fast asm instructions on x86 (vs. slow FP division), but this doesn't demonstrate anything.
Xofo over 5 years

If there was a case for merging answers this would be it.
jorgbrown over 2 years

And here's proof that it vectorizes well: <godbolt.org/z/738GfxvYc>
Andrew Henle over 2 years

The question is tagged C, not C++.
Peter Cordes over 2 years

std::min is not fmin / std::fmin - they have different semantics for NaN and signed zero. See What is the instruction that gives branchless FP min and max on x86? So you should probably tweak that comment to say min/max, not fmin/fmax. (And continue using std::min/max so it can inline efficiently to minps/maxps without extra work for NaN propagation.)
jorgbrown over 2 years

Thanks for catching that Andrew; I've switched the code to C now.
jorgbrown over 2 years

Peter: Yeah, oddly, clang does the right thing with either max or fmax.... but gcc only does the right thing when max is used. So I switched the code and all but one comment to use max rather than fmax, and you caught the one place that I hadn't changed.... but now it's fixed.
jorgbrown over 2 years

I updated your benchmark and added some checking of answers; the checks make it impossible for the compiler to completely optimize-away any of the methods, and also show that some of the solutions are buggy, like floor, remainder, and casting to long. See ideone.com/TAE8PD
jorgbrown over 2 years

Also, I updated Matthieu's benchmark and ran it; see ideone.com/TAE8PD (Summary: There are only 3 correct answers here: fmod, modf, and this answer. And this answer is faster than any other answer other than cast-to-long, but cast-to-long gives the wrong answer when the input floating-point number is larger than "long" can represent.