How to detect machine word size in C/C++?


Solution 1

I think you want

sizeof(size_t), which is supposed to be the size of an index, i.e. of the subscript in ar[index].

32 bit machine

char 1
int 4
long 4
long long 8
size_t 4

64 bit machine

char 1
int 4
long 8
long long 8
size_t 8

It may be more complicated than that, because 32-bit compilers run on 64-bit machines, and their output is 32-bit even though the machine is capable of more.

I've added Windows compiler results below.

Visual Studio 2012 compiled win32

char 1
int 4
long 4
long long 8
size_t 4

Visual Studio 2012 compiled x64

char 1
int 4
long 4
long long 8
size_t 8

Solution 2

Because the C and C++ languages deliberately abstract away such considerations as the machine word size, it's unlikely that any method will be 100% reliable. However, there are the various int_fastXX_t types that may help you infer the size. For example, this simple C++ program:

#include <iostream>
#include <cstdint>

#define SHOW(x) std::cout << # x " = " << x << '\n'

int main()
{
    SHOW(sizeof(int_fast8_t));
    SHOW(sizeof(int_fast16_t));
    SHOW(sizeof(int_fast32_t));
    SHOW(sizeof(int_fast64_t));
}

produces this result using gcc version 5.3.1 on my 64-bit Linux machine:

sizeof(int_fast8_t) = 1
sizeof(int_fast16_t) = 8
sizeof(int_fast32_t) = 8
sizeof(int_fast64_t) = 8

This suggests that one way to discover the register size might be to look for the largest difference between a required size (e.g. 2 bytes for a 16-bit value) and the corresponding int_fastXX_t size, and to use the size of that int_fastXX_t as the register size.
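One way to turn that heuristic into code is a compile-time guess. This is only a sketch and inherits the compiler-dependence discussed here: as the further results show, MSVC keeps int_fast16_t at 4 bytes even when targeting x64, so the guess reflects the compiler's choices, not the hardware's:

```cpp
#include <cstddef>
#include <cstdint>

// If the "fast" 16-bit type has been widened past its minimum of 2
// bytes, its size is a hint at the register width; otherwise fall
// back to the fast 32-bit type.
constexpr std::size_t guessed_word_size =
    sizeof(std::int_fast16_t) > 2 ? sizeof(std::int_fast16_t)
                                  : sizeof(std::int_fast32_t);
```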

Further results

Windows 7, gcc 4.9.3 under Cygwin on 64-bit machine: same as above

Windows 7, Visual Studio 2013 (v 12.0) on 64-bit machine:

sizeof(int_fast8_t) = 1
sizeof(int_fast16_t) = 4
sizeof(int_fast32_t) = 4
sizeof(int_fast64_t) = 8

Linux, gcc 4.6.3 on 32-bit ARM and also Linux, gcc 5.3.1 on 32-bit Atom:

sizeof(int_fast8_t) = 1
sizeof(int_fast16_t) = 4
sizeof(int_fast32_t) = 4
sizeof(int_fast64_t) = 8

Solution 3

Even within a machine architecture, a "word" may mean several things. AFAIK you have different hardware-related quantities:

  • character: generally speaking, the smallest element that can be exchanged to or from memory - it is now almost everywhere 8 bits, but used to be 6 bits on some older architectures (CDC in the early 80s)
  • integer: an integer register (e.g. EAX on x86). IMHO an acceptable approximation is sizeof(int)
  • address: what can be addressed on the architecture. IMHO an acceptable approximation is sizeof(uintptr_t)
  • not speaking of floating point...

Let's do some history:

Machine class     |   character    |  integer    | address
-----------------------------------------------------------
old CDC           |     6 bits     |    60 bits  |  ?
8086              |     8 bits     |    16 bits  |  2x16 bits(*)
80x86 (x >= 3)    |     8 bits     |    32 bits  |  32 bits
64bits machines   |     8 bits     |    32 bits  |  64 bits    
                  |                |             |
general case(**)  |     8 bits     | sizeof(int) | sizeof(uintptr_t)

(*) it was a special addressing mode where the 16-bit segment was shifted left by only 4 bits to produce a 20-bit address - but far pointers used to be 32 bits long

(**) uintptr_t does not make much sense on old architectures because the compilers (when they existed) did not support that type. But if a decent compiler were ported to them, I assume those would be the values.

But BEWARE: the types are defined by the compiler, not the architecture. That means that if you used an 8-bit compiler on a 64-bit machine, you would probably get a 16-bit int and a 16-bit uintptr_t. So the above only makes sense if you use a compiler adapted to the architecture...

Solution 4

I'll give you the right answer to the question you should be asking:

Q: How do I choose the fastest hash routine for a particular machine if I don't have to use a particular one and it doesn't have to be the same except within a single build (or maybe run) of an application?

A: Implement a parametrized hashing routine, possibly using a variety of primitives including SIMD instructions. On a given piece of hardware, some set of these will work, and you will want to enumerate that set using some combination of compile-time #ifdefs and dynamic CPU feature detection. (E.g. you can't use AVX2 on any ARM processor, determined at compile time, and you can't use it on older x86, determined by the CPUID instruction.) Take the set that works and time them on test data on the machines of interest. Either do so dynamically at system/application startup, or test as many cases as you can and hardcode which routine to use on which system based on some sniffing algorithm. (E.g. the Linux kernel does this to determine the fastest memcpy routine, etc.)

The circumstances under which you need the hash to be consistent will be application dependent. If you need the choice to be entirely at compile time, then you'll need to craft a set of preprocessor macros the compiler defines. Often it is possible to have multiple implementations that produce the same hash but using different hardware approaches for different sizes.

Skipping SIMD is probably not a good idea if you are defining a new hash and want it to be really fast, though it may be possible in some applications to saturate the memory speed without using SIMD so it doesn't matter.

If all of that sounds like too much work, use size_t as the accumulator size. Or use the largest size for which std::atomic tells you the type is lock free. See: std::atomic_is_lock_free, std::atomic::is_lock_free, or std::atomic::is_always_lock_free.
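A sketch of that last fallback, assuming C++17 for `is_always_lock_free`: pick the widest unsigned type whose atomic form is always lock-free, on the theory that it fits a native register.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Widest-first selection of an accumulator type that the target
// can update atomically without a lock.
using accum_t = std::conditional_t<
    std::atomic<std::uint64_t>::is_always_lock_free, std::uint64_t,
    std::conditional_t<
        std::atomic<std::uint32_t>::is_always_lock_free, std::uint32_t,
        std::size_t>>;
```

Note that lock-freedom can overshoot: 32-bit x86 has an 8-byte compare-exchange, so std::atomic<uint64_t> is lock-free there even though the accumulator registers are 32 bits.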

Solution 5

By "machine word size" we'll have to assume that the meaning is: the largest piece of data that the CPU can process in a single instruction. (This is sometimes called the data bus width, although that's a simplification.)

On various CPUs, size_t, uintptr_t and ptrdiff_t could be anything - they are related to the address bus width rather than to the CPU data width. So we can forget about these types; they don't tell us anything.

On all mainstream CPUs, char is always 8 bits, short is always 16 bits and long long is always 64 bits. So the only interesting types remaining are int and long.


The following mainstream CPUs exist:

8 bits

int   = 16 bits   
long  = 32 bits

16 bits

int   = 16 bits   
long  = 32 bits

32 bits

int   = 32 bits   
long  = 32 bits

64 bits

int   = 32 bits   
long  = 32 or 64 bits (LLP64 vs. LP64)

Unconventional variations may exist, but generally there is no way to tell from the above how to distinguish 8-bit from 16-bit, or 32-bit from 64-bit.

Alignment is no help to us either, because it may or may not apply on various CPUs. Many CPUs can read misaligned words just fine, but at the expense of slower code.

So there is no way to tell the "machine word size" by using standard C.


It is however possible to write fully portable C that can run on anything between 8 and 64 bits, by using the types from stdint.h, notably the uint_fast types. Some things to keep in mind are:

  • Implicit integer promotions across different systems. Anything of uint32_t or larger is generally safe and portable.
  • The default type of integer constants ("literals"). This is most often (but not always) int, and what an int is on a given system may vary.
  • Alignment and struct/union padding.
  • Pointer size is not necessarily the same as the machine word size. This is particularly true on many 8-, 16- and 64-bit computers.
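Putting those rules into practice, a hash loop can let each target pick the width of its accumulator while the result stays fixed. Here is a minimal FNV-1a sketch using the standard 32-bit FNV constants:

```cpp
#include <cstddef>
#include <cstdint>

// 32-bit FNV-1a: accumulate in uint_fast32_t (whatever width the
// target finds fastest) but mask back to 32 bits each round, so
// the final hash value is identical on every platform.
std::uint32_t fnv1a(const unsigned char* data, std::size_t len)
{
    std::uint_fast32_t h = 2166136261u;   // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 16777619u;                   // FNV prime
        h &= 0xFFFFFFFFu;                 // keep 32-bit semantics
    }
    return static_cast<std::uint32_t>(h);
}
```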
rustyx
Author by rustyx

Updated on June 08, 2022

Comments

  • rustyx
    rustyx about 2 years

    Is there a more-or-less reliable way (not necessarily perfect) to detect the machine word size of the target architecture for which I'm compiling?

    By machine word size I mean the size of the integer accumulator register (e.g. EAX on x86, RAX on x86_64 etc., not streaming extensions, segment or floating-point registers).

    The standard does not seem to provide a "machine word" data type. So I'm not looking for a 100% portable way, just something that works in most common cases (Intel x86 Pentium+, ARM, MIPS, PPC - that is, register-based, contemporary commodity processors).

size_t and uintptr_t sound like good candidates (and in practice they matched the register size everywhere I tested), but they are of course defined as something else and thus not guaranteed to always match, as is already described in Is size_t the word size.

    Context

    Let's assume I'm implementing a hashing loop over a block of contiguous data. It is OK to have the resulting hash depend on the compiler, only speed matters.

    Example: http://rextester.com/VSANH87912

    Testing on Windows shows that hashing in chunks of 64 bits is faster in 64-bit mode and in 32 bits in 32-bit mode:

    64-bit mode
    int64: 55 ms
    int32: 111 ms
    
    32-bit mode
    int64: 252 ms
    int32: 158 ms
    
  • rustyx
    rustyx over 8 years
    Nice suggestion. Too bad this doesn't work in Windows (prints the same values on 32-bit and 64-bit platform).
  • rustyx
    rustyx over 8 years
    So which one would I use as the machine word size?
  • Robert Jacobs
    Robert Jacobs over 8 years
    I would go with size_t since array access is inherently a register operation.
  • Edward
    Edward over 8 years
    @rustyx: You're right. I've added to my answer to show that.
  • Robert Jacobs
    Robert Jacobs over 8 years
Your result may depend on which compiler you used. It is possible to use a 32-bit compiler on a 64-bit machine; the result will still run.
  • Edward
    Edward over 8 years
@RobertJacobs: The result is entirely determined by which compiler is used, so it's much more of a compiler test than a CPU test.
  • Robert Jacobs
    Robert Jacobs over 8 years
    @Edward That is why I asked your compiler settings on the Windows 7 Visual Studio 2013 test. Is the result a 32 or 64 bit executable?
  • Edward
    Edward over 8 years
    Command line used for VS: cl /EHsc /O2 fast.cpp. The result is a 64-bit executable.
  • Pete Becker
    Pete Becker over 8 years
    size_t is supposed to be "large enough to contain the size in bytes of any object" [support.types]/6. It has nothing to do with array access.
  • Robert Jacobs
    Robert Jacobs over 8 years
    @PeteBecker See en.cppreference.com/w/cpp/types/size_t. "size_t can store the maximum size of a theoretically possible object of any type (including array)". Also "size_t is commonly used for array indexing and loop counting." There is a reason malloc takes a size_t as an argument.
  • Pete Becker
    Pete Becker over 8 years
    @RobertJacobs - yes, malloc takes an argument of type size_t. That's because size_t can represent the size of any object; that's what the standard requires, which is why I quoted the standard. Array indexing is about pointer ranges; if you're indexing into an array of char your index type had better be big enough to hold the maximum size of a char array, even if a single register is not large enough to hold such an index value.
  • rustyx
    rustyx over 8 years
    sizeof(int) is not an acceptable approximation on x86_64. There it is still 4 whereas RAX is 8 bytes.
  • M.M
    M.M over 8 years
    @Edward The compiler targets a particular CPU though.
  • jotik
    jotik over 8 years
    @rustyx x86_64 is a single architecture with fixed integer accumulation register sizes. What approximation on x86_64 are you talking about? :D
  • Brett Hale
    Brett Hale about 8 years
Even with the x86-64, there's the x32 ABI. It hasn't really caught on, but it's an ILP32 convention where size_t is 32 bits, though it's still a 64-bit 'mode'. MIPS and SGI did this in the late 90s with the N32 ABI.
  • VladP
    VladP over 6 years
    Great and correct answer, in my opinion. Some platforms have more than 8 bits per character, hence, CHAR_BIT is a must as you noticed.
  • Adriano Repetti
    Adriano Repetti over 6 years
Correct me if I'm wrong, but the size of pointers and the size of data registers are different things (regardless of CHAR_BIT, BTW). For example you might have (generally speaking) a 16-bit address space (addressable without segmentation...) and 8-bit registers. Just to mention something well-known, think of the 8008 (and many, many modern microcontrollers). Things may be even more complicated (8080 and successors). A modern (and explicitly mentioned by OP) PPC (with e200z7 cores) is for example a 32-bit CPU with 64-bit general purpose registers.
  • Persixty
    Persixty over 6 years
    As soon as you're running a 32-bit compiler on 64-bit hardware you have to ask "What is the question even asking?" and "Why?".
  • Lundin
    Lundin over 6 years
The answer is incorrect since this isn't true on any of the numerous 8-bit MCUs out there: AVR, PIC, HC08, R8C, 8051, Z80... At least the former 4 are still in mass production.
  • phuclv
    phuclv over 6 years
    unfortunately this relies on the existence of int_fast8/16/32/64_t so it won't work on systems with e.g. 24-bit registers
  • Noah
    Noah about 2 years
    Would suggest warning against using int_fast for actually fast word sizes on systems that use GLIBC. They are mistuned for some popular architectures (x86_64/armv8/basically anything other than alpha) and unchangeable due to ABI concerns.
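For the chunked hashing loop the question describes, a portable sketch might look like this: size_t serves as the "probably register-sized" accumulator, and memcpy keeps the word loads alignment-safe (the multiply-by-31 mixing step is just a placeholder, not a recommended hash):

```cpp
#include <cstddef>
#include <cstring>

// Hash a buffer in size_t-sized chunks, then fold in tail bytes.
std::size_t hash_chunks(const void* p, std::size_t n)
{
    const unsigned char* bytes = static_cast<const unsigned char*>(p);
    std::size_t h = 0;
    std::size_t word;
    while (n >= sizeof(word)) {
        std::memcpy(&word, bytes, sizeof(word));  // alignment-safe load
        h = h * 31 + word;
        bytes += sizeof(word);
        n -= sizeof(word);
    }
    while (n--)                                   // leftover tail bytes
        h = h * 31 + *bytes++;
    return h;
}
```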