How do I get awk to NOT use space as a delimiter?
Solution 1
It's not awk, but the shell (the default value of IFS) that's causing word splitting.
You could fix that by saying:
while IFS= read -r i; do
    # Quote "$i" so the line reaches awk intact, spaces included
    USERNAME=$(echo "$i" | awk 'BEGIN{FS="[|,:]"} ; {print $1}')
    echo "username: $USERNAME"
done < "$INPUT"
In order to verify how the shell is reading the input, add
echo "This is a line: ${i}"
in the loop.
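As a rough sketch of the same idea without starting awk for every line, read can split off the first field itself when IFS is set to a comma for that one command. The function name print_usernames and the variable _rest are made up for this illustration, and it assumes the first field never contains a comma or an escaped quote.

```shell
# Hypothetical helper: print the first comma-separated field of each line.
# Assumes the first field holds no embedded commas or escaped quotes.
print_usernames() {
    # IFS=, applies to this read only; the rest of the line lands in _rest
    while IFS=, read -r user _rest; do
        user=${user#\"}   # drop a leading double quote, if present
        user=${user%\"}   # drop a trailing double quote, if present
        echo "username: $user"
    done < "$1"
}
```

Calling print_usernames input.csv on the file from the question should print one "username:" line per row, since the loop iterates over lines rather than over whitespace-separated words.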
Solution 2
You can use any regex as the field separator in awk, e.g. an optional comma followed by a double quote:
awk -F ',?"' '{print $2, $4, $6, $8, $10, $12, "<" $14 ">"}' f1
john beatles.com arse [email protected] 1 1 <on holiday>
paul beatles.com bung 0 1 <also on holiday>
The last field, $14, is enclosed in < and > to show that it ends up in a single awk variable, spaces included.
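To see why only the even-numbered fields are printed, it can help to dump every field that this separator produces. The sketch below is only an illustration, and show_fields is a made-up name; it feeds one quoted line to awk with the same -F ',?"' and prints each field with its index.

```shell
# Hypothetical helper: list every field awk sees when splitting on ',?"'
show_fields() {
    printf '%s\n' "$1" |
        awk -F ',?"' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'
}

show_fields '"john","on holiday"'
```

With this separator, the opening quote of the line yields an empty $1, and each "," between values is matched as two separators (the closing quote, then the comma plus opening quote) with an empty field between them. The values therefore sit in $2, $4, $6 and so on, which is why the command above prints only even-numbered fields.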
Comments
-
vmos almost 2 years
I've got a CSV that I'm trying to process, but some of my fields contain commas, line breaks and spaces and now that I think about it, there's probably some apostrophes in there too.
For the commas and line breaks, I've converted them to other strings at the output phase and convert them back at the end (yes, it's messy, but I only need to run this once). I realise that I may have to do this with the spaces too, but I've broken the problem down to its basic parts to see if I can work around it.
Here's an input.csv
"john","beatles.com","arse","[email protected]","1","1","on holiday"
"paul","beatles.com","bung","","0","1","also on holiday"
(I've tried with and without quotes)
here's the script
INPUT="input.csv"
for i in `cat ${INPUT}`
do
#USERNAME=`echo $i | awk -v FS=',' '{print $1}'`
USERNAME=`echo $i | awk 'BEGIN{FS="[|,:]"} ; {print $1}'`
echo "username: $USERNAME"
done
So that should just output john and paul, but instead I get
username: "john"
username: holiday"
username: "paul"
username: on
username: holiday"
because it sees the spaces and interprets them as new rows.
Can I get it to stop that?