How to use pseudo-arrays in POSIX shell script?

Solution 1

The idea is to encode the list of arbitrary strings into a scalar variable, in a format that can later be used to reconstruct the list of arbitrary strings.

 $ save_pseudo_array x "y z" $'x\ny' "a'b"
'x' \
'y z' \
'x
y' \
'a'\''b' \

$

When you stick set -- in front of that, you get shell code that reconstructs the list of strings (x, y z, and so on) and stores it in the positional parameters ($@); you just need to evaluate it with eval.
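As a minimal sketch of that round trip, here is the sed-based save_pseudo_array from the question applied to the same four strings. The variable names saved and nl are only illustrative, and the newline is built with a plain variable because $'...' is not available in every POSIX shell:

```shell
#!/bin/sh
# sed-based serializer, as given in the question
save_pseudo_array() {
    for i do
        printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/"
    done
    echo " "
}

# A literal newline, stored portably
nl='
'

set -- x "y z" "x${nl}y" "a'b"
saved=$(save_pseudo_array "$@")   # scalar holding the quoted list

set --                            # wipe the positional parameters
eval "set -- $saved"              # rebuild them from the saved string

echo "$#"                         # prints 4
printf '<%s>\n' "$@"              # one <...> per restored element
```

The eval is safe here only because saved was produced by save_pseudo_array; never eval a string that came from anywhere else.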

The sed command takes care of properly quoting each string: it adds ' at the beginning of the first line, appends ' \ to the end of the last line, and replaces every ' with '\''.

However, that means running one printf and sed command for each argument, so it's pretty inefficient. That could be done in a more straightforward way with just one awk invocation:

save_pseudo_array() {
  LC_ALL=C awk -v q=\' '
    BEGIN{
      for (i=1; i<ARGC; i++) {
        gsub(q, q "\\" q q, ARGV[i])
        printf "%s ", q ARGV[i] q
      }
      print ""
    }' "$@"
}
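A possible way to use the awk variant, sketched under the assumption that a POSIX awk is first in PATH. The sample strings and the variable name saved are made up for illustration:

```shell
#!/bin/sh
# awk-based serializer from the answer above
save_pseudo_array() {
  LC_ALL=C awk -v q=\' '
    BEGIN{
      for (i=1; i<ARGC; i++) {
        gsub(q, q "\\" q q, ARGV[i])
        printf "%s ", q ARGV[i] q
      }
      print ""
    }' "$@"
}

set -- "it's" "two words" ""
saved=$(save_pseudo_array "$@")
printf '%s\n' "$saved"      # all elements quoted on a single line

set -- other values         # overwrite the positional parameters
eval "set -- $saved"        # restore the saved list
echo "$#"                   # prints 3, the empty string is preserved
```

Unlike the sed version, the output is a single line with the elements separated by spaces, so no line continuations are needed.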

Solution 2

The basic idea is to use set to re-create the experience of working with indexed values from an array. So when you want to work with an array, you instead run set with the values, that is:

set -- 1895 955 1104 691 1131 660 1145 570 1199 381

Then you can use $1, $2, a for loop, etc. to work with the given values.
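A short sketch of what that looks like in practice. The variable names sum, idx and third are hypothetical, chosen for this example only:

```shell
#!/bin/sh
set -- 1895 955 1104 691 1131 660 1145 570 1199 381

echo "count: $#"
echo "first: $1"
echo "tenth: ${10}"     # braces are required from the tenth parameter on

sum=0
for n do                # "for n do" iterates over "$@"
    sum=$((sum + n))
done
echo "sum: $sum"

# Access by a computed index; idx must be a trusted integer
idx=3
eval "third=\${$idx}"
echo "third: $third"
```

The eval is needed only for computed indexes; fixed positions such as $1 or ${10} can be used directly.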

All that’s not much use if you need multiple arrays though. That’s where the save and eval trick comes in: Rich’s save function¹ processes the current positional parameters and outputs a string, with appropriate quoting, which can then be used with eval to restore the stored values. Thus you run

coords=$(save "$@")

to save the current working array into coords, then create a new array, work with that, and when you need to work with coords again, you eval it:

eval "set -- $coords"

To understand the example, keep in mind that you’re working with two arrays here: the one set previously, which you store in coords, and the one containing 1895, 955, etc. The snippet doesn’t make much sense on its own; you’d normally have some processing between the set and eval lines. If you need to return to the 1895, 955 array later, save it before restoring coords:

newarray=$(save "$@")
eval "set -- $coords"

That way you can restore $newarray later.
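Putting the pieces together, the whole sequence might look like the sketch below. The initial values ("first point", "second point") and the helper variables count_new, count_coords and first_coord are assumptions made for the example, not part of the original question:

```shell
#!/bin/sh
# Rich's save function, as defined in the footnote
save () {
for i do printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/" ; done
echo " "
}

# Hypothetical original array
set -- "first point" "second point"
coords=$(save "$@")               # stash it

# Second array: the ten integers
set -- 1895 955 1104 691 1131 660 1145 570 1199 381
count_new=$#
newarray=$(save "$@")             # stash it before switching back

eval "set -- $coords"             # first array is current again
count_coords=$#
first_coord=$1

eval "set -- $newarray"           # and back to the integers
echo "$count_new $count_coords $first_coord $1"
```

Only one array is "live" in $@ at any time; every other array exists as a quoted string in an ordinary variable.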


¹ Defined as

save () {
for i do printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/" ; done
echo " "
}

Author by

Vlastimil Burián

Updated on September 18, 2022

Comments

  • Vlastimil Burián
    Vlastimil Burián almost 2 years

    I want to replace an array of 10 integers in a Bash script with something similar in a POSIX shell script.

    I came across Rich’s sh (POSIX shell) tricks, in the section Working with arrays.

    What I tried:

    save_pseudo_array()
    {
        for i do
            printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/"
        done
        echo " "
    }
    
    coords=$(save_pseudo_array "$@")
    set -- 1895 955 1104 691 1131 660 1145 570 1199 381
    eval "set -- $coords"
    

    The problem is that I don't understand it. If anyone could shed some light on it, I would much appreciate it.

  • mtraceur
    mtraceur over 6 years
    There's something to be said about portability vs efficiency here, regarding printf ... | sed ... vs awk, though: I don't remember all the practical nuances of awk portability vs sed, but it's definitely a bigger minefield. If the target is just strictly POSIX, that might be fine, but if the target is practical portability to systems in practical use today, it might not be.
  • Stephen Kitt
    Stephen Kitt over 6 years
    @mtraceur, AWK is part of POSIX and quite portable (if you avoid GNU extensions). (And I realise you’re not saying it’s not part of POSIX.)
  • Stéphane Chazelas
    Stéphane Chazelas over 6 years
    @mtraceur, yes basically, the problem here would be the /bin/awk of Solaris that is the one with the API from Unix V7 in the late 70s (so without -v, ARGV...). That said on Solaris, there is a POSIX awk in /usr/xpg4/bin/awk, and more generally on Solaris you know that you can't expect much from the default environment and that you need to do a PATH=$(getconf PATH):$PATH to be able to do anything.
  • Harold Fischer
    Harold Fischer over 5 years
    @StéphaneChazelas Is there a particular reason you are using LC_ALL=C with your awk command? I didn't think you needed to do this unless you were comparing strings with the == operator.
  • ʇsәɹoɈ
    ʇsәɹoɈ about 5 years
    Where is this save function defined?