Search in specific column for pattern and output entire line
Solution 1
The simplest approach would probably be awk:
awk -F'|' '$4~/^5/' file
The -F'|' sets the field separator to |. The $4~/^5/ will be true if the 4th field starts with 5. The default action for awk when something evaluates to true is to print the current line, so the script above will print what you want.
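As a quick check, the command can be run against the sample rows from the question. This is only a minimal sketch: the file name sample.txt is a placeholder chosen for illustration, not something from the original post.

```shell
# Write the question's sample rows to a scratch file
# (sample.txt is a hypothetical name used only for this demo).
cat > sample.txt <<'EOF'
100|20151010|K|5001
695|20151010|K|1010
309|20151010|R|5005
410|20151010|K|5001
107|20151010|K|1062
652|20151010|K|5001
EOF

# Print every line whose 4th |-separated field starts with 5.
awk -F'|' '$4 ~ /^5/' sample.txt
```

With this input, the four rows whose last column is 5001 or 5005 should be printed, which is the output the question asks for.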
Other choices are:
Perl
perl -F'\|' -ane 'print if $F[3]=~/^5/' file
Same idea. The -a switch causes perl to split each input line into fields on the value given by -F, storing them in the array @F. We then print if the 4th element (field) of the array (arrays start counting at 0, so that is $F[3]) starts with a 5.
grep
grep -E '^([^|]*\|){3}5' file
The regex will match a string of non-| characters followed by a |, 3 times, and then a 5.
GNU or BSD sed
sed -En '/^([^|]*\|){3}5/p' file
The -E turns on extended regular expressions and the -n suppresses normal output. The regex is the same as the grep above and the p at the end makes sed print only lines matching the regex.
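Whether the regex is anchored with ^ starts to matter once a line has more than four fields. The sketch below uses a made-up five-column row (an assumption for illustration, not data from the question) to show the difference:

```shell
# Made-up row: the 4th field is 1010, but the 5th field starts with 5.
line='695|20151010|K|1010|5999'

# Anchored: the three "field|" groups must be the first three fields,
# so the 5 has to begin the 4th field. Nothing is printed for this row.
printf '%s\n' "$line" | sed -En '/^([^|]*\|){3}5/p'

# Unanchored: the match may begin at the 2nd field, so the 5 that starts
# the 5th field satisfies the regex and the row is printed.
printf '%s\n' "$line" | sed -En '/([^|]*\|){3}5/p'
```

On strictly four-column input like the question's, both forms select the same lines; the anchored form simply stays correct if more columns appear.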
Solution 2
This will print all lines that match |5 and then no more | until the end of the line:
grep '|5[^|]*$' <in >out
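Note that this pattern tests the last field rather than the 4th one specifically. That is equivalent for the question's four-column data, but not in general. A short sketch, using a made-up five-column row as an assumption for illustration:

```shell
# Four columns, as in the question: the last field is also the 4th, so grep matches.
printf '%s\n' '100|20151010|K|5001' | grep '|5[^|]*$'

# Made-up five-column row: the 4th field starts with 5 but the last one does not,
# so grep finds nothing (the echo only reports that).
printf '%s\n' '100|20151010|K|5001|X' | grep '|5[^|]*$' || echo "grep: no match"

# awk addresses the 4th field directly, so it still prints the row.
printf '%s\n' '100|20151010|K|5001|X' | awk -F'|' '$4 ~ /^5/'
```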
Kit Goodman
Updated on September 18, 2022
Comments
-
Kit Goodman over 1 year
I'm working in HDFS and am trying to get the entire line where the 4th column starts with the number 5:
100|20151010|K|5001
695|20151010|K|1010
309|20151010|R|5005
410|20151010|K|5001
107|20151010|K|1062
652|20151010|K|5001
Hence should output:
100|20151010|K|5001
309|20151010|R|5005
410|20151010|K|5001
652|20151010|K|5001
-
terdon over 8 years
@mikeserv thanks, greater portability is always a good thing but where is that documented? I tried it and it does indeed work on GNU sed but the -E isn't mentioned in either man or info. It does activate ERE, right?
-
mikeserv over 8 years
it's not, except in the source. that happens a lot with open source stuff - people submit a patch to do the same thing something already does because they're used to doing it with a different letter but then don't care to write a new SYNOPSIS. anyway, it's worked for a long time. and -E(xtended) regexp is slated for the next POSIX version, too, so might as well just get used to it. plus, -r doesn't make any sense. and yeah, it's a synonym - they both do exactly the same thing. almost the same - i think w/ -r you can switch back -re ... -Ge ... or something, but who would?