What is the size of float and double in C and C++?

107,325

Solution 1

Excerpt from the C99 standard, normative annex F (the C++ standard does not explicitly mention this annex, though it incorporates all affected functions unchanged by reference; the types also have to match for compatibility):

IEC 60559 floating-point arithmetic

F.1 Introduction

1 This annex specifies C language support for the IEC 60559 floating-point standard. The IEC 60559 floating-point standard is specifically Binary floating-point arithmetic for microprocessor systems, second edition (IEC 60559:1989), previously designated IEC 559:1989 and as IEEE Standard for Binary Floating-Point Arithmetic (ANSI/IEEE 754−1985). IEEE Standard for Radix-Independent Floating-Point Arithmetic (ANSI/IEEE 854−1987) generalizes the binary standard to remove dependencies on radix and word length. IEC 60559 generally refers to the floating-point standard, as in IEC 60559 operation, IEC 60559 format, etc. An implementation that defines __STDC_IEC_559__ shall conform to the specifications in this annex. Where a binding between the C language and IEC 60559 is indicated, the IEC 60559-specified behavior is adopted by reference, unless stated otherwise. Since negative and positive infinity are representable in IEC 60559 formats, all real numbers lie within the range of representable values.

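So, test whether __STDC_IEC_559__ is defined. C99 lists it among the conditionally predefined macros, so no header should be required, though including <math.h> (or <cmath> in C++) does no harm.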

If the macro is defined, not only are the types better specified (float being 32 bits and double being 64 bits, among others), but the behavior of built-in operators and standard functions is also more tightly specified.
Lack of the macro does not give any guarantees.
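As a rough illustration of the check described above, the following sketch wraps the macro test in a function and pairs it with std::numeric_limits<T>::is_iec559, which is the C++ library's own way of reporting IEC 60559 conformance for a type. The function names (has_stdc_iec_559_macro, floats_are_iec559, bit_width_of) are made up for this example and are not part of any standard.

```cpp
#include <climits>
#include <limits>

// Hypothetical helper: true when the compiler defines the C99 Annex F macro.
constexpr bool has_stdc_iec_559_macro() {
#ifdef __STDC_IEC_559__
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: true when the C++ library reports IEC 60559
// formats for both float and double.
constexpr bool floats_are_iec559() {
    return std::numeric_limits<float>::is_iec559 &&
           std::numeric_limits<double>::is_iec559;
}

// Hypothetical helper: storage width of T in bits.
template <typename T>
constexpr int bit_width_of() {
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}
```

On an implementation where floats_are_iec559() is true, one would expect bit_width_of<float>() to be 32 and bit_width_of<double>() to be 64. Some compilers may not define the macro even though their types use the IEC 60559 formats, so the two checks can disagree.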

For x86 and x86_64 (amd64), you can rely on the types float and double being IEC-60559-conformant, though functions using them and operations on them might not be.

Solution 2

The C++ standard does not say anything about the size. The relevant paragraph:

3.9.1.8

There are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double. The value representation of floating-point types is implementation-defined. Integral and floating types are collectively called arithmetic types. Specializations of the standard template std::numeric_limits (18.3) shall specify the maximum and minimum values of each arithmetic type for an implementation.
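Since the value representation is implementation-defined, the portable way to learn what a given compiler provides is to ask std::numeric_limits. The sketch below is only an illustration: the helper names mantissa_digits and max_finite are invented here, and comparing mantissa digits as a proxy for "precision" assumes all three types share the same radix.

```cpp
#include <limits>

// Hypothetical helper: number of radix digits in the mantissa of T.
template <typename T>
constexpr int mantissa_digits() {
    return std::numeric_limits<T>::digits;
}

// Hypothetical helper: largest finite value representable in T.
template <typename T>
constexpr T max_finite() {
    return std::numeric_limits<T>::max();
}

// The ordering quoted above, expressed as compile-time checks
// (assuming float, double and long double use the same radix).
static_assert(mantissa_digits<double>() >= mantissa_digits<float>(),
              "double must be at least as precise as float");
static_assert(mantissa_digits<long double>() >= mantissa_digits<double>(),
              "long double must be at least as precise as double");
```

Nothing here fixes a size: the checks only confirm the relative ordering that the standard does promise.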

Solution 3

The C++ standard doesn't say anything, but on most platforms C++ implementations use the IEEE single- and double-precision formats, which define single precision as 4 bytes and double precision as 8 bytes.

I'm not sure about the exceptions for these cases.

Solution 4

As floating-point operations are implemented at a low level by CPUs, the C++ standard does not mandate a size for float, double or long double. All it says is that, in the order just listed, each type provides equal or greater precision than the one before it.

Your best bet is to use static_assert, sizeof, typedef and #define carefully in order to define cross platform floating point types.
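A minimal sketch of that approach might look like the following. The namespace portable and the alias names f32 and f64 are hypothetical, chosen only for this example. The idea is that the build fails loudly on a platform where the assumptions do not hold, instead of silently producing differently sized types.

```cpp
#include <climits>
#include <limits>

// Refuse to compile on platforms that break the assumptions below.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");
static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "float is expected to be a 32-bit IEC 60559 type");
static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "double is expected to be a 64-bit IEC 60559 type");

namespace portable {
// Hypothetical fixed-width aliases, valid only because the
// static_asserts above have already passed.
typedef float  f32;
typedef double f64;
}  // namespace portable
```

Code that needs a guaranteed width would then use portable::f32 and portable::f64. A port to an unusual platform would have to supply a different definition behind the same names.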

Solution 5

I want to point out that even if floats have the same size, you cannot be sure they are interpreted identically on different platforms. You can read a lot of papers about 'floats over network'. Floating-point non-determinism is a known problem.
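One concrete source of such differences is byte order: two machines may agree on the 32-bit IEEE 754 bit pattern yet store its bytes in opposite orders. A common workaround is to copy the bit pattern into a fixed-width integer and write the bytes out in an agreed order. The sketch below does this for float; the function names encode_float_be and decode_float_be are hypothetical, and it assumes 8-bit bytes and that the platform stores float and std::uint32_t with the same endianness.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t) &&
              std::numeric_limits<float>::is_iec559,
              "this sketch assumes a 32-bit IEC 60559 float");

// Hypothetical helper: store the bit pattern of value in out[0..3],
// most significant byte first, regardless of host byte order.
inline void encode_float_be(float value, unsigned char out[4]) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined type punning
    out[0] = static_cast<unsigned char>(bits >> 24);
    out[1] = static_cast<unsigned char>(bits >> 16);
    out[2] = static_cast<unsigned char>(bits >> 8);
    out[3] = static_cast<unsigned char>(bits);
}

// Hypothetical helper: rebuild a float from four big-endian bytes.
inline float decode_float_be(const unsigned char in[4]) {
    const std::uint32_t bits = (static_cast<std::uint32_t>(in[0]) << 24) |
                               (static_cast<std::uint32_t>(in[1]) << 16) |
                               (static_cast<std::uint32_t>(in[2]) << 8) |
                               static_cast<std::uint32_t>(in[3]);
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Under these assumptions, encoding 1.0f yields the bytes 3F 80 00 00 on any host. This addresses byte order only. It does not address differences in how intermediate results are rounded on different hardware.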

Author by

mans

Updated on September 09, 2021

Comments

  • mans
    mans over 2 years

    I was looking to see if there is any standard type similar to uint32_t which always would map into a 32-bit unsigned integral type but I could not find any.

    Is the size of float always 4 bytes on all platforms?
    Is the size of double always 8?

    Does either standard say anything on the matter?

    I want to make sure that my size is always the same on all platforms (x86 and x64) so I am using standard int types, but I could not find any similar typedef for float and double.

  • Deduplicator
    Deduplicator over 9 years
    Used properly, they are not a hazard but a boon. Naturally, you want fixed-representation types for binary serialization.
  • Deduplicator
    Deduplicator over 9 years
    The integer types are not implemented at any higher level than floating-point, yet there have been exact-width fixed-representation integer types since C99. You might want to work on the reasoning.
  • mans
    mans over 9 years
    This is about int types, and I am using them, but this question is about the float/double types, which are not covered by these cross-platform types.
  • mans
    mans over 9 years
    Is it correct to say that all platforms which implement the IEEE standard are the same? In my application, I need to make sure that data is transferred correctly from an ARM system to an Intel-based system.
  • mans
    mans over 9 years
    Does this mean that all platforms which use the IEEE standard are compatible? Is "platform" hardware related or development related?
  • GreenScape
    GreenScape over 9 years
    @mans There are different IEEE standards, and different compilers may not fully implement them. There is also the endianness problem. So you can't be 100% sure. I'd recommend using alternatives such as: 1) fixed-point real numbers; 2) converting to string and back; 3) using existing libraries that provide portability. see this
  • CashCow
    CashCow over 9 years
    The double and float types in C++ have to be compatible with the same types in C as C++ has direct integration with C.
  • Deduplicator
    Deduplicator over 9 years
    @CashCow: Sure. Still, I wanted to provide reasonable expectation for the macro being defined too.
  • MSalters
    MSalters over 9 years
    @mans: Not entirely compatible. For instance, IEEE doesn't say anything about endianness. The Most Significant Bit is the sign bit in IEEE 754, but that doesn't tell you in which byte you find the MSB.