How to split a tab-delimited string in a Bash script WITHOUT collapsing blanks?


Solution 1

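IFS treats a run of delimiters as a single separator only when the delimiters are whitespace characters (space, tab, newline). Each non-whitespace character in IFS is a delimiter on its own, so two of them in a row produce an empty field. So a simple solution, if there is some non-whitespace character that you are confident does not appear in your string, is to translate tabs to that character and then split on it: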

IFS=$'\2' read -ra ITEMS <<<"${LINE//$'\t'/$'\2'}"
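
As a quick illustration of the difference, the sketch below splits the same line twice: once with a tab in IFS, where the empty field vanishes, and once with the \2 substitution shown above. The sample value of LINE and the array name COLLAPSED are made up for the demonstration.

```bash
#!/usr/bin/env bash
# Hypothetical sample: four fields, the second one empty.
LINE=$'a\t\tc\td'

# Tab is IFS whitespace, so the two adjacent tabs count as one delimiter.
IFS=$'\t' read -ra COLLAPSED <<<"$LINE"

# \2 is not whitespace, so every occurrence delimits a field.
IFS=$'\2' read -ra ITEMS <<<"${LINE//$'\t'/$'\2'}"

printf 'tab IFS: %d fields\n' "${#COLLAPSED[@]}"
printf '\\2 IFS:  %d fields\n' "${#ITEMS[@]}"
```

The first split reports 3 fields and the second reports 4. One caveat: read treats the delimiter as a field terminator, so if the line ends with a tab, the final empty field is still not produced by this approach.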

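Unfortunately, assumptions like "there is no instance of \2 in the input" tend to fail in the long run, where "in the long run" translates to "at the worst possible time". So you might want to do it in two steps: first swap tabs and \2 characters and split on \2, then turn the tabs left inside each field back into the \2 characters they originally were: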

IFS=$'\2' read -ra TEMP < <(tr $'\t\2' $'\2\t' <<<"$LINE")
ITEMS=("${TEMP[@]//$'\t'/$'\2'}")
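
To see that the swap really round-trips, here is a small sketch that runs the two steps on an input that already contains a \2 character. The sample value of LINE is an assumption chosen for the demonstration; the two splitting lines are the ones from the answer.

```bash
#!/usr/bin/env bash
# Hypothetical sample: three fields; the first contains a literal \2 and the
# second is empty.
LINE=$'x\2y\t\tz'

# Step 1: swap tabs and \2 bytes, then split on \2 (the former tabs).
IFS=$'\2' read -ra TEMP < <(tr $'\t\2' $'\2\t' <<<"$LINE")

# Step 2: inside each field, turn the tabs (the former \2 bytes) back into \2.
ITEMS=("${TEMP[@]//$'\t'/$'\2'}")

printf '%d fields\n' "${#ITEMS[@]}"
printf '%q\n' "${ITEMS[@]}"
```

The array ends up with 3 elements, and the \2 inside the first field survives intact because it was temporarily stored as a tab while the split happened.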

Solution 2

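One possibility: instead of splitting with IFS, use the -d option of read to consume tab-terminated "lines" from the string one at a time. Setting IFS to the empty string for each read keeps leading and trailing blanks in every item. However, you need to ensure that your string ends with a tab as well, or you will lose the last item.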

items=()
while IFS='' read -r -d$'\t' x; do
   items+=( "$x" )
done <<< $'   foo   \t  bar\nbaz \t   foobar\t'

printf "===%s===\n" "${items[@]}"

Ensuring a trailing tab without adding an extra field can be accomplished with

if [[ $str != *$'\t' ]]; then str+=$'\t'; fi

if necessary.
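
Putting the loop and the trailing-tab check together, one possible shape is a small helper function. This is only a sketch: the function name split_tabs and the use of a global array named items are assumptions, not part of the answer. It feeds the string through printf rather than a here-string so that no newline is appended after the final tab.

```bash
#!/usr/bin/env bash
# split_tabs is a hypothetical helper; it fills the global array "items".
split_tabs() {
    local str=$1 x
    items=()
    # Make sure the last field is terminated by a tab.
    if [[ $str != *$'\t' ]]; then str+=$'\t'; fi
    # printf (rather than <<<) avoids appending a newline to the input.
    while IFS='' read -r -d $'\t' x; do
        items+=( "$x" )
    done < <(printf '%s' "$str")
}

split_tabs $'  foo  \t\tbar baz'
printf '===%s===\n' "${items[@]}"
```

With this sample input the array holds three items: "  foo  " with its blanks preserved, an empty string, and "bar baz".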

Neil C. Obremski

Updated on June 09, 2022

Comments

  • Neil C. Obremski almost 2 years

    I have my string in $LINE and I want $ITEMS to be the array version of this, split on single tabs and retaining blanks. Here's where I'm at now:

    IFS=$'\n' ITEMS=($(echo "$LINE" | tr "\t" "\n"))
    

    The issue here is that IFS is one-or-more, so it gobbles up runs of newlines, tabs, whatever, and the blank fields disappear. I've tried a few other things based on other questions posted here, but they assume that there will always be a value in all fields, never blank. And the one that seems to hold the key is far beyond me and operates on an entire file (I am just splitting a single string).

    My preference here is a pure-Bash solution.