C - Serialization of the floating point numbers (floats, doubles)

Solution 1

Assuming you're using mainstream compilers on mainstream hardware, floating point values in C and C++ follow the IEEE 754 standard in practice, and when written in binary form to a file they can be recovered on any other such platform, provided that you write and read using the same byte endianness. So my suggestion is: pick an endianness, and before writing or after reading, check whether it matches that of the current platform; if not, just swap the bytes.
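A minimal sketch of that idea, assuming a 64-bit IEEE 754 double whose byte order matches the platform's integer byte order (true on common hardware, not guaranteed everywhere). Little endian is picked as the file format here; the helper names (host_is_little_endian, swap_bytes, double_to_le_bytes, double_from_le_bytes) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if this machine stores the least significant byte first. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Reverse the order of n bytes in place. */
static void swap_bytes(unsigned char *b, size_t n)
{
    size_t i;

    for (i = 0; i < n / 2; i++) {
        unsigned char t = b[i];
        b[i] = b[n - 1 - i];
        b[n - 1 - i] = t;
    }
}

/* Copy x into out[0..7] in little-endian order. */
void double_to_le_bytes(double x, unsigned char out[8])
{
    assert(sizeof(double) == 8);    /* assumption: 64-bit double */
    memcpy(out, &x, 8);
    if (!host_is_little_endian())
        swap_bytes(out, 8);
}

/* Rebuild a double from 8 little-endian bytes. */
double double_from_le_bytes(const unsigned char in[8])
{
    unsigned char tmp[8];
    double x;

    assert(sizeof(double) == 8);    /* assumption: 64-bit double */
    memcpy(tmp, in, 8);
    if (!host_is_little_endian())
        swap_bytes(tmp, 8);
    memcpy(&x, tmp, 8);
    return x;
}
```

The 8-byte buffer can then go through fwrite and fread unchanged. Using memcpy rather than a pointer cast avoids aliasing and alignment problems.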

Solution 2

You could always convert to IEEE-754 format in a fixed byte order (either little endian or big endian). For most machines, that would require either nothing at all or a simple byte swap to serialize and deserialize. A machine that doesn't support IEEE-754 natively will need a converter written, but doing that with ldexp and frexp (standard C library functions) and some bit shuffling is not too tough.
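Here is a rough sketch of such a converter, built only on frexp and ldexp so that it does not depend on how the host stores a double. The function names encode_binary64 and decode_binary64 are hypothetical, and it assumes a C99 <math.h> that provides signbit, isnan, isinf, NAN and INFINITY. On a host whose native format carries more than 53 significant bits, the extra bits are simply truncated.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Build the IEEE-754 binary64 bit pattern of x using arithmetic only. */
uint64_t encode_binary64(double x)
{
    uint64_t sign = signbit(x) ? 1 : 0;
    uint64_t biased, mant;
    double m;
    int e;

    if (isnan(x))
        return 0x7FF8000000000000ULL;                /* one quiet NaN */
    if (isinf(x))
        return (sign << 63) | 0x7FF0000000000000ULL;
    if (x == 0.0)
        return sign << 63;

    m = frexp(fabs(x), &e);          /* fabs(x) == m * 2^e */
    assert(m >= 0.5 && m < 1.0);     /* frexp contract for finite nonzero x */

    /* binary64 stores 1.f * 2^(E - 1023); here 1.f == 2m and E == e + 1022 */
    if (e + 1022 >= 2047)
        return (sign << 63) | 0x7FF0000000000000ULL; /* too large: infinity */
    if (e + 1022 <= 0) {
        biased = 0;                                  /* subnormal: mant * 2^-1074 */
        mant = (uint64_t)ldexp(m, e + 1074);
    } else {
        biased = (uint64_t)(e + 1022);
        mant = (uint64_t)ldexp(m, 53) & 0xFFFFFFFFFFFFFULL; /* drop implicit 1 */
    }
    return (sign << 63) | (biased << 52) | mant;
}

/* Rebuild a native double from an IEEE-754 binary64 bit pattern. */
double decode_binary64(uint64_t bits)
{
    int biased = (int)((bits >> 52) & 0x7FF);
    uint64_t mant = bits & 0xFFFFFFFFFFFFFULL;
    double x;

    if (biased == 0x7FF)
        x = mant ? NAN : INFINITY;
    else if (biased == 0)
        x = ldexp((double)mant, -1074);
    else
        x = ldexp((double)(mant | (1ULL << 52)), biased - 1075);

    return (bits >> 63) ? -x : x;
}
```

The resulting uint64_t can then be written out one byte at a time with shifts, which fixes the byte order regardless of the host.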

Solution 3

This might give you a good start - it packs a floating point value into an int and long long pair, which you can then serialise in the usual way.

#include <math.h>   /* frexp, ldexp, fabs */
#include <stdlib.h> /* llabs */

#define FRAC_MAX 9223372036854775807LL /* 2**63 - 1 */

struct dbl_packed
{
    int exp;
    long long frac;
};

void pack(double x, struct dbl_packed *r)
{
    double xf = fabs(frexp(x, &r->exp)) - 0.5;

    if (xf < 0.0)
    {
        r->frac = 0;
        return;
    }

    r->frac = 1 + (long long)(xf * 2.0 * (FRAC_MAX - 1));

    if (x < 0.0)
        r->frac = -r->frac;
}

double unpack(const struct dbl_packed *p)
{
    double xf, x;

    if (p->frac == 0)
        return 0.0;

    xf = ((double)(llabs(p->frac) - 1) / (FRAC_MAX - 1)) / 2.0;

    x = ldexp(xf + 0.5, p->exp);

    if (p->frac < 0)
        x = -x;

    return x;
}
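As for "the usual way" of serialising the two integers, one possible sketch is below. It repeats the struct so that it compiles on its own, and it assumes the exponent fits in 32 bits and long long is exactly 64 bits. The 12-byte big-endian layout and the names put_be, get_be, write_packed and read_packed are illustrative choices, not part of the code above.

```c
#include <assert.h>
#include <stdint.h>

struct dbl_packed       /* same members as the struct above */
{
    int exp;
    long long frac;
};

/* Store the low n bytes of v, most significant byte first. */
static void put_be(uint64_t v, unsigned char *out, int n)
{
    int i;

    for (i = 0; i < n; i++)
        out[i] = (unsigned char)(v >> (8 * (n - 1 - i)));
}

/* Read n bytes, most significant byte first. */
static uint64_t get_be(const unsigned char *in, int n)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < n; i++)
        v = (v << 8) | in[i];
    return v;
}

/* Two's-complement decode without an implementation-defined cast. */
static int64_t u64_to_i64(uint64_t u)
{
    if (u <= (uint64_t)INT64_MAX)
        return (int64_t)u;
    return -(int64_t)(~u) - 1;
}

/* Layout: 4 bytes of exponent, then 8 bytes of fraction, all big endian. */
void write_packed(const struct dbl_packed *p, unsigned char out[12])
{
    assert(sizeof(long long) == 8);     /* assumption: 64-bit long long */
    put_be((uint32_t)p->exp, out, 4);
    put_be((uint64_t)p->frac, out + 4, 8);
}

void read_packed(const unsigned char in[12], struct dbl_packed *p)
{
    uint64_t e = get_be(in, 4);

    if (e & 0x80000000u)
        e |= 0xFFFFFFFF00000000ULL;     /* sign-extend 32 -> 64 bits */
    p->exp = (int)u64_to_i64(e);
    p->frac = u64_to_i64(get_be(in + 4, 8));
}
```

Because every byte is produced by a shift, the output is the same on little-endian and big-endian hosts.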

Solution 4

What do you mean, "portable"?

For portability, remember to keep the numbers within the limits defined in the Standard: use a single number outside these limits, and there goes all portability down the drain.

double planck_time = 5.39124E-44; /* seconds; smaller than 1E-37, the smallest magnitude the Standard guarantees */

5.2.4.2.2 Characteristics of floating types <float.h>

[...]
10   The values given in the following list shall be replaced by constant
     expressions with implementation-defined values [...]
11   The values given in the following list shall be replaced by constant
     expressions with implementation-defined values [...]
12   The values given in the following list shall be replaced by constant
     expressions with implementation-defined (positive) values [...]
[...]

Note the implementation-defined in all these clauses.
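To make that concrete, here is a small sketch of a range check. The only limits every conforming implementation must meet are DBL_MIN no larger than 1E-37 and DBL_MAX no smaller than 1E+37. The macro names PORTABLE_DBL_MIN and PORTABLE_DBL_MAX and the function in_portable_range are invented for this example.

```c
#include <assert.h>
#include <float.h>

/* Magnitude range that every conforming implementation must cover. */
#define PORTABLE_DBL_MIN 1E-37
#define PORTABLE_DBL_MAX 1E+37

/* Returns 1 if x is zero or its magnitude lies inside the guaranteed range. */
int in_portable_range(double x)
{
    double a = x < 0.0 ? -x : x;

    /* The actual limits of this implementation may be wider, never narrower. */
    assert(DBL_MIN <= PORTABLE_DBL_MIN && DBL_MAX >= PORTABLE_DBL_MAX);

    return a == 0.0 || (a >= PORTABLE_DBL_MIN && a <= PORTABLE_DBL_MAX);
}
```

With this check, in_portable_range(5.39124E-44) returns 0, even though the value is perfectly representable on an IEEE 754 machine.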

Solution 5

Converting to an ASCII representation would be the simplest, but if you need to deal with a colossal number of floats, then of course you should go binary. That can be a tricky issue if you care about portability, though: floating point numbers are represented differently on different machines.
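For the ASCII route, a minimal sketch could look like the following. It assumes an IEEE 754 double, for which 17 significant digits are enough to round-trip exactly, and it assumes the "C" locale so that the decimal separator is a period. The function names double_to_text and text_to_double are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Format x into buf. Returns the length written, or -1 if buf is too small. */
int double_to_text(double x, char *buf, size_t size)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, "%.17g", x);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Parse a whole string as a double. Returns 0 on success, -1 otherwise. */
int text_to_double(const char *s, double *out)
{
    char *end;

    *out = strtod(s, &end);
    return (end == s || *end != '\0') ? -1 : 0;
}
```

A 32-byte buffer is enough for any finite double in this format. The cost is size: up to 24 characters per value instead of 8 bytes.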

If you don't want to use a canned library, then your float-binary serializer/deserializer will simply have to have "a contract" on where each bit lands and what it represents.

Here's a fun website to help with that: link.

Author: psihodelia, Software Engineer

Updated on April 16, 2022

Comments

  • psihodelia
    psihodelia about 2 years

    How can a floating point number be converted into a sequence of bytes so that it can be persisted in a file? The algorithm must be fast and highly portable. It must also allow the opposite operation, deserialization. It would be nice if it required only a very small excess of bits per value (persistent space).

    • yeoman
      yeoman about 7 years
      Some questions are interesting even if they happen to have emerged from homework. Simply banning all facts and topics on this site that were ever the subject of anybody's homework would mean erasing half of it, I assume...
  • Christoph
    Christoph over 14 years
    according to the C99 spec, annex F, conforming implementations should define __STDC_IEC_559__, which in principle could be used as a compile-time check, but is useless in practice as there are issues with gcc ( gcc.gnu.org/c99status.html , scroll down to 'Further Issues')
  • Sam
    Sam over 14 years
    Compilers don't necessarily dictate the IEEE floating point format. There are still computers which use other formats, unfortunately (VAX/Alpha, IBM). But +1 for ensuring you have the endianness right.
  • Sam
    Sam over 14 years
    The problem comes with FP standards that lack some of the "features" of IEEE. Namely the VAX and IBM floating point formats...You're in for a world of hurt w.r.t. corner cases. Thankfully, people have written excellent converters which handle these cases gracefully (I'm looking at you USGS! I owe you a beer).
  • Sam
    Sam over 14 years
    It is portable only to machines sharing the same floating point format. Having been down this road, I will give you the following advice: Standardize on Little Endian IEEE-754 and make everybody else convert to/from that if necessary. You will be MUCH happier in the end. You will have portability through a rigid standard.
  • Fabio Ceconello
    Fabio Ceconello over 14 years
    Right, but they have to know the format used by the platform to support it in the RTL. Also, many platforms (these days especially embedded) don't have a math coprocessor, so they do dictate the format in the accompanying emulation lib. So I thought it'd be easier to refer to the compiler.
  • Fabio Ceconello
    Fabio Ceconello over 14 years
    Shouldn't those platforms that don't support the IEEE standard be treated as exceptions, with the necessary conversions done only there, when the (rare) version for them is needed? Here's a good article about the differences: codeproject.com/KB/applications/libnumber.aspx
  • Chris Dodd
    Chris Dodd over 14 years
    An ANSI compliant frexp function hides most of that for you. Of course, you may end up with cases where serialization and deserialization gives you a (close but) different value.
  • Kevin Cox
    Kevin Cox over 9 years
    The question clearly asks about a portable method, which this is obviously not.
  • old_timer
    old_timer over 9 years
    "floating point" is by definition not portable; there are numerous formats and the specific format was not specified. C isn't very portable either; the question was flawed at best.
  • Malcolm McLean
    Malcolm McLean over 7 years
    Addressed. Code now in. (Link also has single precision but it follows straightforwardly).