bash can't store hexvalue 0x00 in variable

Solution 1

You can't store a null byte in a Bash string because Bash uses C-style strings, which reserve the null byte as the terminator. So you need to rewrite your script to pipe the sequence that contains the null byte directly, so that Bash never has to store it. For example, you can do this:

printf "\x36\xc9\xda\x00\xb4" | hd

Notice, by the way, that you don't need echo; you can use Bash's printf for this and many other simple tasks.
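To make the difference concrete, here is a minimal sketch that compares how many bytes arrive when the sequence is piped directly and when it is first stored in a variable. The variable names (piped_bytes, stored, stored_bytes) are made up for illustration, and the sketch assumes Bash and a wc that supports -c.

```shell
#!/usr/bin/env bash
# Count the bytes that arrive when the sequence is piped directly.
piped_bytes=$(( $(printf '\x36\xc9\xda\x00\xb4' | wc -c) ))

# Round-trip the same sequence through a variable. Recent Bash versions print
# a warning about the ignored null byte on stderr, so silence it for this demo.
{ stored=$(printf '\x36\xc9\xda\x00\xb4'); } 2>/dev/null
stored_bytes=$(( $(printf '%s' "$stored" | wc -c) ))

echo "piped:  $piped_bytes bytes"
echo "stored: $stored_bytes bytes"
```

In Bash the piped count should be 5 and the stored count 4, because the null byte is dropped during command substitution.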

Or instead of chaining, you can use a temporary file:

printf "\x36\xc9\xda\x00\xb4" > /tmp/mysequence
hd /tmp/mysequence

Of course, this has the problem that the file /tmp/mysequence may already exist. And now you need to keep creating temporary files and saving their paths in strings.

Or you can avoid that by using process substitution:

hd <(printf "\x36\xc9\xda\x00\xb4")

The <(command) operator runs command and exposes its output through a path: a /dev/fd entry on most systems, or a named pipe in the file system where /dev/fd is unavailable. hd receives that path as its first argument and opens and reads it almost like any other file. You can read more about it here: https://unix.stackexchange.com/a/17117/136742.
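As a rough illustration of what the command actually receives, the sketch below prints the path that the substitution expands to and then reads the byte sequence through it. It assumes Bash; od is used as a widely available stand-in for hd, and the names fd_bytes and hex_dump are hypothetical.

```shell
#!/usr/bin/env bash
# The substitution expands to a path; the exact name (for example /dev/fd/63)
# depends on the system.
echo "path handed to the command:" <(true)

# Read the sequence through that path and count the bytes. The null byte
# survives because it never passes through a shell variable.
fd_bytes=$(( $(wc -c < <(printf '\x36\xc9\xda\x00\xb4')) ))
echo "bytes read: $fd_bytes"

# Dump the same bytes as hex and strip the whitespace od adds.
hex_dump=$(od -An -tx1 <(printf '\x36\xc9\xda\x00\xb4') | tr -d ' \n')
echo "hex: $hex_dump"
```

Storing the hex dump in a variable is fine here, since the dump is printable text and contains no null byte itself.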

Solution 2

You can use zsh instead, which is the only shell that can store the NUL character in its variables. That character even happens to be in the default value of $IFS in zsh.

nul=$'\0'

Or:

nul=$'\x0'

Or:

nul=$'\u0000'

Or:

nul=$(printf '\0')

Note, however, that you can't pass such a variable as an argument or environment variable to an external command: arguments and environment variables are passed to the execve() system call as NUL-delimited strings (a limitation of the system's API, not of the shell). In zsh you can, however, pass NUL bytes as arguments to functions or builtin commands.

echo $'\0' # works
/bin/echo $'\0' # doesn't

Solution 3

Bash uses C strings internally, which cannot store the null byte. Store the value in a temporary file like this:

    zHex=$(mktemp --tmpdir "$(basename "$0")-XXXX")
    trap "rm -f ${zHex@Q}" EXIT

The variable zHex now contains a unique file name. The file referenced by $zHex can be deleted manually, but the EXIT trap also deletes it automatically when the script exits.

Then use the variable like this:

    echo -ne "\x36\xc9\xda\x00\xb4" > "$zHex"
    hd "$zHex"

This does NOT store the value with null bytes in a variable. Instead, it uses a variable to store the name of a file. The file, like any other file, may contain null bytes and can be used over and over. Because the file is small and short-lived, its contents will most likely stay in the system's cache and never be physically written to the disk.

Via the trap, bash deletes the file automatically, so you need not worry about removing it manually unless you are creating a very large number of temporary files. Due to RAM buffering, this technique is decently fast.
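A comment on the question points at a related workaround that needs no file at all: keep the human-readable escape text in the variable and expand it to raw bytes only at the moment of output. The following is a minimal sketch of that idea, assuming Bash's printf and its %b conversion; the names header, raw_bytes and hex_dump are illustrative, and od again stands in for hd.

```shell
#!/usr/bin/env bash
# The variable holds the printable escape text, not the raw bytes,
# so there is no null byte for Bash to lose.
header='\x36\xc9\xda\x00\xb4'

# %b tells Bash's printf to expand the backslash escapes in its argument.
raw_bytes=$(( $(printf '%b' "$header" | wc -c) ))
hex_dump=$(printf '%b' "$header" | od -An -tx1 | tr -d ' \n')

echo "bytes written: $raw_bytes"
echo "hex: $hex_dump"
```

As the same comment notes, this is not the same thing as storing the bytes themselves: the variable is 20 characters of text, and only the expanded output is the 5-byte sequence.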

Comments

  • Vladgor
    Vladgor almost 2 years

    The following code works as expected in windows xp, but in windows 10 the image starts flickering. How do I make it work in windows 10?

    #include <windows.h>
    #include <ctime>
    #include <vector>
    
    #define xMax 180
    #define yMax 45
    #define Fps 250
    
    class dbconsole
    {
    private:
        int width, height, FPS, delay;
        HANDLE h0, h1;
        std::vector<CHAR_INFO> chiBuffer;
        bool curBuffer;
        int drawingTimer;
    
        void preparebuffer(HANDLE &h)
        {
            CONSOLE_CURSOR_INFO cursor = {false, 1};
            SMALL_RECT windowRectangle = {0,0,width-1,height-1};
            h = CreateConsoleScreenBuffer(
                    GENERIC_READ | GENERIC_WRITE,
                    FILE_SHARE_READ | FILE_SHARE_WRITE,
                    NULL,
                    CONSOLE_TEXTMODE_BUFFER,
                    NULL);
            SetConsoleCursorInfo(h, &cursor);
            SetConsoleScreenBufferSize (h, {width,height});
            SetConsoleWindowInfo(h,true,&windowRectangle);
        }
    
    public:
    
        dbconsole(int Width, int Height, int fps)
        {
            chiBuffer.resize(Width*Height); // resize, not reserve: the elements are written by index below
            width = Width;
            height = Height;
            FPS = fps;
            preparebuffer(h0);
            preparebuffer(h1);
            curBuffer = 0;
            drawingTimer = clock();
            for (int i = 0; i < xMax; i++) for (int j = 0; j < yMax; j++) chiBuffer[i+width*j] = {'t',16};
        }
    
        void depict()
        {
            SMALL_RECT srctWriteRect;
            srctWriteRect.Top = 0;
            srctWriteRect.Left = 0;
            srctWriteRect.Bottom = height-1;
            srctWriteRect.Right = width-1;
            if ((clock()-drawingTimer)*FPS>CLOCKS_PER_SEC)
            {
                if (curBuffer)
                {
                    WriteConsoleOutput(h0, &chiBuffer[0], {width,height}, {0,0}, &srctWriteRect);
                    SetConsoleActiveScreenBuffer(h0);
                }
                else
                {
                    WriteConsoleOutput(h1, &chiBuffer[0], {width,height}, {0,0}, &srctWriteRect);
                    SetConsoleActiveScreenBuffer(h1);
                }
                curBuffer=!curBuffer;
                drawingTimer = clock();
            }
        }
    
    };
    
    int main(void)
    {
        dbconsole myConsole = dbconsole(xMax,yMax,Fps);
        while (true) myConsole.depict();
    }
    

    I want the program to show black letters 't' on blue background, but with no flickering and with double buffering

    • grek40
      grek40 over 8 years
      I remember having a similar problem long before windows 10... I'm not sure, but I think it was the XP era. For me, the solution was to write an update routine for the current screen buffer instead of frequently switching buffers, however I did not go for 250 fps there...
    • Vladgor
      Vladgor over 8 years
      Switching to 60 fps doesn't help. Also, this drawing worked for windows 8. And I want to use those consoles to display some information where I may need to redraw everything in an instant, so double buffering seems necessary.
    • grek40
      grek40 over 8 years
      Of course... 60fps is probably your monitor refresh rate anyway. You should go down to 1fps so you can evaluate the effect of a single refresh. As long as you can see any effect, it's going to flicker with faster updates.
    • Vladgor
      Vladgor over 8 years
      With 1 fps no flickering occurs, but a cursor appears in the top-left corner, while I was sure that the line CONSOLE_CURSOR_INFO cursor = {false, 1}; was supposed to hide it.
    • grek40
      grek40 over 8 years
      Well then, I have no idea how to hide a cursor in windows 10 but I guess you can figure it out now that you know what to look for ;)
    • Kusalananda
      Kusalananda over 7 years
      I get bash: warning: command substitution: ignored null byte in input.
    • ctrl-alt-delor
      ctrl-alt-delor over 7 years
      You are missing quotes it should be header="$(echo -ne "\x36\xc9\xda\x00\xb4")"; echo -n "$header" | hd however this just gives same result.
    • ctrl-alt-delor
      ctrl-alt-delor over 7 years
      This works header="\x36\xc9\xda\x00\xb4"; echo -n "$header" | hd, but is not the same thing as it is storing the human readable form.
  • Vladgor
    Vladgor over 8 years
    Thank you, the cursor has disappeared, it is working at 1fps, but the image is still flickering at 40fps
  • Frank
    Frank over 7 years
    "You can use zsh instead". No thanks - I'm teaching myself bash-scripting as a beginner right now. I don't want to confuse myself with an other syntax. But thank you veray much for suggest it
  • mirabilos
    mirabilos over 7 years
    While correct, this is an implementation detail and not the exact reason. I looked at it, and the POSIX standard actually requires this behaviour, so there you have the actual reason. (As some have pointed out, zsh will do it, but only in nōn-POSIX mode.) I actually looked into it because I was wondering if it was worth to implement this in mksh
  • Stéphane Chazelas
    Stéphane Chazelas over 7 years
    As a matter of fact, you used zsh syntax in your question. echo -n $header to mean to pass the content of the $header variable as a last argument to echo -n is zsh (or fish or rc or es) syntax, not bash syntax. In bash, that has a very different meaning. More generally zsh is like ksh (bash, the GNU shell, being more or less a part-clone of ksh, the Unix de-facto shell) but with most of the design idiosyncrasies of the Bourne shell fixed (and a lot of extra features, and a lot more user-friendly/less astonishing).
  • done
    done over 7 years
    Be careful: zsh may change a zero byte sometimes: echo $(printf 'ab\0cd') | od -vAn -tx1c prints ` 61 62 20 63 64 0a`, that is, a space where a NUL should exist.
  • Stéphane Chazelas
    Stéphane Chazelas over 7 years
    @sorontar, yes, as I said \0 is in the default $IFS, so $(printf 'ab\0cd') is split into ab and cd. Try with echo "$(printf 'ab\0cd')" instead.
  • done
    done over 7 years
    And that is something no other (none, nil) shell will reproduce. That makes a script behave in very special ways in zsh. In my opinion: zsh is just trying to be too clever.
  • Charles Duffy
    Charles Duffy over 7 years
    Having "fixed" the design misfeatures present in the POSIX sh standard means that getting accustomed to writing zsh scripts means one is getting accustomed to practices which would be buggy if exercised in any other shell. This isn't such a problem with a syntax that's so unlike a different language that skills or habits aren't likely to transfer, but such is not the case at hand.
  • Stéphane Chazelas
    Stéphane Chazelas over 7 years
    @sorontar, both echo $(printf 'ab\0cd') and echo "$(printf 'ab\0cd')" are unspecified in POSIX and not working "properly" in every other shell. OTOH, the behaviour is clearly specified in zsh and works as documented. It makes perfect sense to split on the NUL byte by default. That can be useful in ls -ld -- $(grep -rZl whatever .) though you'd rather write ls -ld -- ${(0)"$(grep -rZl whatever .)"} in that case, as you don't want to split on the other $IFS character.
  • Stéphane Chazelas
    Stéphane Chazelas over 7 years
    @CharlesDuffy, that is a fair point. OTOH, shells like rc (or to some extent fish) with a radically different syntax and that have fixed the Bourne issues never took off for the very reason that they're not Bourne-like. IMO, zsh's stance is courageous and laudable here and a step in the right direction.
  • Stéphane Chazelas
    Stéphane Chazelas over 7 years
    @mirabilos, would you care to expand on that? AFAICT, behaviour is unspecified per POSIX for command substitution when the output has NUL characters, and for zsh in POSIX mode, the only relevant difference I can think of is that in sh emulation, \0 is not in the default value of $IFS. echo "$(printf 'a\0b')" still works OK in sh emulation in zsh.
  • done
    done over 7 years
    @StéphaneChazelas I will be bold also and ask a very naive question: isn't it true that "in ZSH, however, word splitting is disabled by default (which is great)"? That should mean that the "Command Substitution" string should not be split. I am sure that I am wrong and will be corrected with a clear statement of why my naive question is invalid in this particular case. But that just misses the point: one has to be an expert in zsh to make it work the way one wants. Simple users easily get lost.
  • Stéphane Chazelas
    Stéphane Chazelas over 7 years
    @sorontar, word splitting (but not globbing which would not make sense) happens in zsh upon command substitution, because that's generally what you want. (though in that specific case, I'm not sure I agree with that particular design decision). Generally zsh chooses the path of least astonishment, that's the opposite of needing to be expert to work with it, I can't think of where you're getting that from.
  • giusti
    giusti over 7 years
    @mirabilos Considering that the shells predates the POSIX standard by a decade or more, I guess you could find out that the actual actual reason is that shells used C-style strings and the standard was built around that.
  • Paulb
    Paulb over 7 years
    I found a good Q for detailed discussion on printf versus echo. unix.stackexchange.com/questions/65803/…
  • fpmurphy
    fpmurphy over 6 years
    @StéphaneChazelas, zsh does not actually store raw NUL characters in a variable. Just as ksh93 uses a 'hack' (base64) to store NUL and other characters in binary variables, zsh also uses a 'hack' to store NUL (and some other characters) in a variable - a Meta byte (0x83) followed by a byte containing 'character xor 32'. See zsh.h.
  • Stéphane Chazelas
    Stéphane Chazelas over 6 years
    @fpmurphy1, that's internal only and transparent to the user. In zsh, $var[1] for instance gets the first character of $var whether it's a NUL character or other. How zsh stores it internally is irrelevant as it's not visible to the user. That's different in ksh93. In ksh93, If a $var contains the base64 encoding of abc, ${var:0:1} will contain the first character of that base64 encoding, not a, which is not useful. ${#var} will expand to the length of the encoding, not the length of the data it is meant to represent.
  • AdminBee
    AdminBee almost 3 years
    Please note that the original problem arises from NULL bytes inside a string variable, whereas your post concerns strings that are terminated by a NULL byte.
  • Paul
    Paul almost 3 years
    I read the first comment about data containing null bytes. Maybe my rewording of the first paragraph will help you understand. The problem with this solution is the terminating null byte.
  • Kusalananda
    Kusalananda almost 3 years
    There seems to be code missing. You also save the temporary file's pathname in zHeader, but then appear to remove $zTemp (but with literal quotes inserted around the name with @Q, for some unexplained reason). The answer is correct, but the code is irrelevant to the question.