How to use awk/cut to extract column data that contains spaces
Solution 1
If the columns are separated by tabs, you can specify the tab character as the field separator. This overrides awk's default behavior of splitting fields on any run of whitespace, so spaces inside a column are preserved.
cat <data file> | awk -F"\t" '{print $1, $2}'
root@ubuntu32:/tmp# cat testtext | awk -F"\t" '{print $1, $2}'
16 SQL*Plus
16 TOAD background query session
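The same idea as a self-contained sketch, assuming the data really is tab-separated. The temporary file and the `sample` and `result` variable names are made up for illustration, and the file name is passed to awk directly rather than through cat:

```shell
# Build a small tab-separated sample in a temporary file (hypothetical data file).
sample=$(mktemp)
printf '16\tSQL*Plus\tvilconv1\tdox-conv2\n' > "$sample"
printf '16\tTOAD background query session\tDisha\tWORKGROUP\\AD\n' >> "$sample"

# -F'\t' splits on tabs only, so the spaces inside column 2 are preserved.
# The comma in print joins the two fields with awk's output separator (a space).
result=$(awk -F'\t' '{print $1, $2}' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

If the file turns out to be padded with spaces rather than tabs, this will print each whole line as a single field, which is a quick way to tell which case you are in.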
Solution 2
I liked @Costas's suggestion; another option is:
gawk '
{
f1=substr($0,2,2)
f2=substr($0,4,36)
gsub(/ *$/, "", f2)
print f1 " " f2
}
'
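A runnable sketch of the same fixed-width approach. The column layout here (id in columns 1-4, name in columns 5-40) is an assumption made up for this sample and differs from the offsets in the answer above; adjust the substr arguments to match the real file:

```shell
# Fixed-width sample: id in columns 1-4, name in columns 5-40 (assumed layout).
data=$(
  printf '%-4s%-36s%-12s%s\n' 16 'SQL*Plus' vilconv1 dox-conv2
  printf '%-4s%-36s%-12s%s\n' 16 'TOAD background query session' Disha 'WORKGROUP\AD'
)

result=$(printf '%s\n' "$data" | awk '{
  id   = substr($0, 1, 4)    # columns 1-4
  name = substr($0, 5, 36)   # columns 5-40
  gsub(/ +$/, "", id)        # strip the padding on the right
  gsub(/ +$/, "", name)
  print id " " name
}')
printf '%s\n' "$result"
```

Because the fields are cut by position, spaces inside the name column do not matter; only the padding at the end of each slice is removed.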
Solution 3
One way to do this could involve unexpand. The description for it and the expand utility can be found here:
- The unexpand utility shall copy files or standard input to standard output, converting <blank> characters at the beginning of each line into the maximum number of <tab> characters followed by the minimum number of <space> characters needed to fill the same column positions originally filled by the translated <blank> characters. By default, tabstops shall be set at every eighth column position. Each <backspace> shall be copied to the output, and shall cause the column position count for tab calculations to be decremented; the count shall never be decremented to a value less than one.
You'd probably want the -a switch though.
- -a
  In addition to translating <blank> characters at the beginning of each line, translate all sequences of two or more <blank> characters immediately preceding a tab stop to the maximum number of <tab> characters followed by the minimum number of <space> characters needed to fill the same column positions originally filled by the translated <blank> characters.
It's a simple utility for converting many spaces in sequence to tabs instead. In that way you could...
unexpand -a <<\IN | cut -f1
16 SQL*Plus vilconv1 dox-conv2
16 TOAD background query session Disha WORKGROUP\AD
IN
...which prints...
16 SQL*Plus
16 TOAD background query session
I just use cut there, but you could use awk or anything else, really. I only suggest it because you almost certainly already have it installed, and it is very simple to use and very fast. It solves the space problem by swapping delimiters, and it does so very easily.
I also use a here-document just to show how it works, but you'd probably want to do this instead...
unexpand -a <infile | filter program
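A sketch of that pipeline with awk as the filter program. The column widths (8, 32, 16) are assumptions chosen so that every column starts on a default tab stop; real data may be laid out differently. Wide padding can become several tabs in a row, so the filter here treats a run of tabs as one separator:

```shell
# Fixed-width sample; the widths (8, 32, 16) are assumptions chosen so that
# every column starts on a default tab stop (every eighth column).
data=$(
  printf '%-8s%-32s%-16s%s\n' 16 'SQL*Plus' vilconv1 dox-conv2
  printf '%-8s%-32s%-16s%s\n' 16 'TOAD background query session' Disha 'WORKGROUP\AD'
)

# unexpand -a turns each run of padding into one or more tabs; single spaces
# inside a field are left alone. -F'\t+' treats a run of tabs as one separator.
result=$(printf '%s\n' "$data" | unexpand -a | awk -F'\t+' '{print $1, $2}')
printf '%s\n' "$result"
```

With plain cut -f, each of those repeated tabs would count as its own (empty) field, so later columns would not line up between rows; that is the reason for the `\t+` separator in this sketch.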
stackoverflow_unicorn
Updated on September 18, 2022
Comments
-
stackoverflow_unicorn over 1 year
I have data in the format below:
16 SQL*Plus vilconv1 dox-conv2
16 TOAD background query session Disha WORKGROUP\AD
Now I want to get the data by column. I am using the command below:
awk '{print $1,$2}'
but since column 2 contains spaces, it gives me this output:
16 SQL*Plus
16 TOAD
whereas what I want is:
16 SQL*Plus
16 TOAD background query session
-
Admin about 9 years
Does your data fit a fixed-width format?
-
Admin about 9 years
Is each row delimited by tabs, or are those literal tabs? Also, is it only the second column that has spaces? Are there spaces in any other column?
-
Admin about 9 years
Use cut -c -40
-
Admin about 9 years
And maybe sed 's/ *$//' to remove trailing spaces, if that matters
-
Admin about 9 years
@Costas that's by far the best method if there's no tabs, why not make it an answer?
-
jasonwryan about 9 years
No need to flog the feline; Awk can be passed a filename for input...
-
Arunas Bartisius over 4 years
thanks, this helped me a lot to find a way to extract part of string of specific column