Bitwise shift and the largest integer in Bash

27,666

Solution 1

Bash uses intmax_t variables for arithmetic. On your system these are 64 bits in length, so:

$ echo $((1<<62))
4611686018427387904

which is

100000000000000000000000000000000000000000000000000000000000000

in binary (1 followed by 62 0s). Shift that again:

$ echo $((1<<63))
-9223372036854775808

which is

1000000000000000000000000000000000000000000000000000000000000000

in binary (63 0s), in two's complement arithmetic.

To get the biggest representable integer, you need to subtract 1:

$ echo $(((1<<63)-1))
9223372036854775807

which is

111111111111111111111111111111111111111111111111111111111111111

in binary.

As pointed out in ilkkachu's answer, shifting takes the offset modulo 64 on 64-bit x86 CPUs (whether using RCL or SHL), which explains the behaviour you're seeing:

$ echo $((1<<64))
1

is equivalent to $((1<<0)). Thus $((1<<1025)) is $((1<<1)), $((1<<1026)) is $((1<<2))...

You'll find the type definitions and maximum values in stdint.h; on your system:

/* Largest integral types.  */
#if __WORDSIZE == 64
typedef long int                intmax_t;
typedef unsigned long int       uintmax_t;
#else
__extension__
typedef long long int           intmax_t;
__extension__
typedef unsigned long long int  uintmax_t;
#endif

/* Minimum for largest signed integral type.  */
# define INTMAX_MIN             (-__INT64_C(9223372036854775807)-1)
/* Maximum for largest signed integral type.  */
# define INTMAX_MAX             (__INT64_C(9223372036854775807))

Solution 2

From the CHANGES file for bash 2.05b:

j. The shell now performs arithmetic in the largest integer size the machine supports (intmax_t), instead of long.

On x86_64 machines intmax_t corresponds to signed 64-bits integers. So you get meaningful values between -2^63 and 2^63-1. Outside that range you just get wrap-arounds.

Solution 3

Shifting by 1024 gives a one, because the shift amount is effectively taken modulo the number of bits (64), so 1024 === 64 === 0, and 1025 === 65 === 1.

Shifting something other than a 1 makes it clear it's not a bit rotation, as the higher bits don't wrap around to the low end before the shift value is (at least) 64:

$ printf "%x\n" $(( 5 << 63 )) $(( 5 << 64 ))
8000000000000000
5

It may be that this behaviour depends on the system. The bash code Stephen linked to shows just a plain shift, without any checking for the right hand value. If I remember correctly, x86 processors only use the bottom six bits of the shift value (in 64-bit mode), so the behaviour may be directly from the machine language. Also, I think shifts by more than the bit width aren't clearly defined in C either (gcc warns for that).

Solution 4

producing an integer by shifting a bit. How far can I go?

Until the integer representation wraps around (the default in most shells).
A 64 bit integer usually wraps around at 2**63 - 1.
That's 0x7fffffffffffffff or 9223372036854775807 in dec.

That number '+1' becomes negative.

That's the same as 1<<63, thus:

$ echo "$((1<<62)) $((1<<63)) and $((1<<64))"
4611686018427387904 -9223372036854775808 and 1

After that the process repeats again.

$((1<<80000)) $((1<<1022)) $((1<<1023)) $((1<<1024)) $((1<<1025)) $((1<<1026))

The result depends on the mod 64 of the shifting value[a].

[a] From: Intel® 64 and IA-32 Architectures Software Developer’s Manual: Volume 2 The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used)..

Also:remember that $((1<<0)) is 1

$ for i in 80000 1022 1023 1024 1025 1026; do echo "$((i%64)) $((1<<i))"; done
 0 1
62 4611686018427387904
63 -9223372036854775808
 0 1
 1 2
 2 4

So, it all depends on how close is the number to a multiple of 64.

Testing the limit:

The robust way to test which is the maximum positive (and negative) integer is to test each one bit in turn. Its less than 64 steps for most computers anyway, it won't be too slow.

bash

First we need the biggest integer of the form 2^n (1 bit set followed by zeros). We can do that by shifting left until the next shift makes the number negative, also called "wrap around":

a=1;   while ((a>0));  do ((b=a,a<<=1))  ; done

Where b is the result: the value before the last shift that fails the loop.

Then we need to try every bit to find out which ones affect the sign of e:

c=$b;d=$b;
while ((c>>=1)); do
      ((e=d+c))
      (( e>0 )) && ((d=e))
done;
intmax=$d

The maximum integer (intmax) results from the last value of d.

On the negative side (less than 0) we repeat all the tests but testing when a bit could be made 0 without wrapping around.

A whole test with printing of all steps is this (for bash):

#!/bin/bash
sayit(){ printf '%020d 0x%016x\n' "$1"{,}; }
a=1;       while ((a>0)) ; do((b=a,a<<=1))              ; sayit "$a"; done
c=$b;d=$b; while((c>>=1)); do((e=d+c));((e>0))&&((d=e)) ; sayit "$d"; done;
intmax=$d
a=-1;      while ((a<0)) ; do((b=a,a<<=1))              ; sayit "$b"; done;
c=$b;d=$b; while ((c<-1)); do((c>>=1,e=d+c));((e<0))&&((d=e)); sayit "$d"; done
intmin=$d       

printf '%20d max positive value 0x%016x\n' "$intmax" "$intmax"
printf '%20d min negative value 0x%016x\n' "$intmin" "$intmin"

sh

Translated to almost any shell:

#!/bin/sh
printing=false
sayit(){ "$printing" && printf '%020d 0x%016x\n' "$1" "$1"; }
a=1;       while [ "$a" -gt 0  ];do b=$a;a=$((a<<1)); sayit "$a"; done
c=$b;d=$b; while c=$((c>>1)); [ "$c" -gt 0 ];do e=$((d+c)); [ "$e" -gt 0 ] && d=$e ; sayit "$d"; done;
intmax=$d
a=-1;      while [ "$a" -lt 0  ];do b=$a;a=$((a<<1)); sayit "$b"; done;
c=$b;d=$b; while [ "$c" -lt -1 ];do c=$((c>>1));e=$((d+c));[ "$e" -lt 0 ] && d=$e ; sayit "$d"; done
intmin=$d       

printf '%20d max positive value 0x%016x\n' "$intmax" "$intmax"
printf '%20d min negative value 0x%016x\n' "$intmin" "$intmin"

Running the above for many shells,
all (except bash 2.04 and mksh) accepted values up to (2**63 -1) in this computer.

It is interesting to report that the att shell:

$ attsh --version
version         sh (AT&T Research) 93u+ 2012-08-01

printed an error on values of $((2^63)), not ksh though.

Share:
27,666

Related videos on Youtube

Gilles 'SO- stop being evil'
Author by

Gilles 'SO- stop being evil'

Updated on September 18, 2022

Comments

  • Gilles 'SO- stop being evil'
    Gilles 'SO- stop being evil' almost 2 years

    This is an exploration question, meaning I'm not completely sure what this question is about, but I think it's about the biggest integer in Bash. Anyhow, I'll define it ostensively.

    $ echo $((1<<8))
    256
    

    I'm producing an integer by shifting a bit. How far can I go?

    $ echo $((1<<80000))
    1
    

    Not this far, apparently. (1 is unexpected, and I'll return to it.) But,

    $ echo $((1<<1022))
    4611686018427387904
    

    is still positive. Not this, however:

    $ echo $((1<<1023))
    -9223372036854775808
    

    And one step further afield,

    $ echo $((1<<1024))
    1
    

    Why 1? And why the following?

    $ echo $((1<<1025))
    2
    $ echo $((1<<1026))
    4
    

    Would someone like to analyse this series?

    UPDATE

    My machine:

    $ uname -a
    Linux tomas-Latitude-E4200 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
    
    • user
      user over 7 years
      -9223372036854775808 = 0xF333333333333334. That's a funny-looking edge case. Of course, 4611686018427387904 = 0x4000000000000000. I suspect you are hitting some kind of wraparound on the number of bits to shift by. Why are you doing this, anyway?
    • Admin
      Admin over 7 years
      @MichaelKjörling For amusement ;-p
    • hvd
      hvd over 7 years
      @MichaelKjörling No, it's not. -9223372036854775808 would be 0x8000000000000000. You left out the final digit when checking: -922337203685477580 would be 0xF333333333333334.
  • cuonglm
    cuonglm over 7 years
    @StephenKitt Note that bash not only look in stdint.h to find the size of intmax_t, also bash still can handle 2**63-1 on 32-bit machine.
  • Stephen Kitt
    Stephen Kitt over 7 years
    @cuonglm check out stdint.h on your 32-bit machine, it probably defines intmax_t as long long int which stores 64 bits.
  • Stephen Kitt
    Stephen Kitt over 7 years
    @cuonglm also if you look at the source code I linked to you'll see there's no special handling for shifting, it's just the compiler's << operator.
  • Kevin
    Kevin over 7 years
    Does bash do anything to ensure that it does not perform undefined behavior (e.g. shifting a one into the sign bit, or other overflow)? While in practice said UB is likely to be limited to the value returned, in theory it poisons the entire execution of the shell.