Extracting lines based on conditions


Solution 1

Possible solution with awk:

awk -F',' '$3 == "c" && $5' file

Depending on the actual data, this may not work as desired, as mentioned in the comments (thanks to Janis for pointing this out): it will miss a line such as f,g,c,i,0, where the 5th field is 0. To handle that case, compare the field against the empty string explicitly:

awk -F',' '$3 == "c" && $5 != ""' file

As this is the accepted answer, I am also adding the less obvious approach of forcing the 5th field to a string (as in cuonglm's solution, +1):

awk -F',' '$3 == "c" && $5""' file
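To make the difference between the three variants concrete, here is a small sketch that runs each of them on a few rows from the question plus one extra row, f,g,c,i,0. That extra row is an assumption added for illustration (it is the case raised in the comments, not part of the original data), and the variable names are arbitrary.

```shell
# Sample rows from the question plus one hypothetical row whose 5th field is 0.
data='a,b,c,d,e
f,g,c,i,
j,k,c,m,n
f,g,c,i,0'

# Truthiness test: drops the empty field, but also drops the "0" row.
truthy=$(printf '%s\n' "$data" | awk -F',' '$3 == "c" && $5')

# Explicit string comparison: keeps the "0" row.
nonempty=$(printf '%s\n' "$data" | awk -F',' '$3 == "c" && $5 != ""')

# Concatenating "" forces a string: same result as the explicit comparison.
coerced=$(printf '%s\n' "$data" | awk -F',' '$3 == "c" && $5""')

printf 'truthy:\n%s\n\nnonempty:\n%s\n\ncoerced:\n%s\n' \
    "$truthy" "$nonempty" "$coerced"
```

Only the first variant loses the f,g,c,i,0 row; the other two print it along with a,b,c,d,e and j,k,c,m,n.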

Solution 2

sed -n '/,$/!s/^\([^,]*,\)\{2\}c/&/p'

...will work for a POSIX sed. If you can use a sed which implements AT&T Augmented regular expressions - such as the one freely available in the astopen package - you could do it like:

sed -nX '/^(([^,]*,){2}c.*)&(.*,)!$/p'

Of course, if the latter case is true, you probably have a similar grep (as can be compiled as a ksh93 builtin, incidentally) and so you should probably do instead:

grep -xX '(([^,]*,){2}c.*)&(.*,)!'
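The POSIX sed command at the top of this answer can be tried on the question's sample data as follows. This is only a sketch: the row a,b,cd,e,f and the stricter pattern ending in `c,` are assumptions added here to show that the original expression only requires the third field to begin with c, which is enough when every line has single-character fields as in the question.

```shell
# The question's sample data plus one hypothetical row (3rd field "cd").
data='a,b,c,d,e
f,g,c,i,
j,k,c,m,n
o,p,c,r,s
t,u,c,w,
x,y,z,aa,bb
a,b,cd,e,f'

# /,$/!      : skip lines that end in a comma (empty last field)
# s/.../&/p  : replace the match with itself; print only if it matched
loose=$(printf '%s\n' "$data" | sed -n '/,$/!s/^\([^,]*,\)\{2\}c/&/p')

# Stricter variant (an assumption, not from the answer): the 3rd field must be exactly "c".
strict=$(printf '%s\n' "$data" | sed -n '/,$/!s/^\([^,]*,\)\{2\}c,/&/p')

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

Both variants print a,b,c,d,e, j,k,c,m,n and o,p,c,r,s; only the loose one also prints a,b,cd,e,f.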

Solution 3

With awk:

awk -F, '$3 == "c" && $5""' file

In awk, 0 and "" are both false in a boolean context. So if you write something like $3 == "c" && $5, you will miss lines in which the fifth field is 0. $5"" forces awk to coerce the fifth field to a string, and the string "0" evaluates to true.
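The boolean behaviour described above can be checked one value at a time. The sketch below uses a hypothetical helper, check, which feeds a single value to awk and reports how it evaluates as a raw field and after concatenation with "".

```shell
# check VALUE: show the truth value of the field as-is and as a forced string.
check() {
    printf '%s\n' "$1" | awk -F',' '{
        raw = ($1 ? "true" : "false")          # field evaluated as-is
        str = (($1 "") ? "true" : "false")     # field forced to a string
        print "raw=" raw, "string=" str
    }'
}

check 0     # prints: raw=false string=true
check ''    # prints: raw=false string=false
check e     # prints: raw=true string=true
```

A field containing 0 looks numeric, so it is false on its own, while the string "0" is non-empty and therefore true. An empty field is false either way, which is what the filter relies on.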

Admin
Author by

Admin

Updated on September 18, 2022

Comments

  • Admin
    Admin over 1 year

    Each line in a comma-separated file has 5 fields.

    a,b,c,d,e
    f,g,c,i,
    j,k,c,m,n
    o,p,c,r,s
    t,u,c,w,
    x,y,z,aa,bb
    

    How can I extract the lines which have c in the 3rd field and their 5th field is NOT empty? The result would be:

    a,b,c,d,e
    j,k,c,m,n
    o,p,c,r,s
    
  • Janis
    Janis about 9 years
    Note that, depending on the actual data, this may not work as desired. It would not match f,g,c,i,0, i.e. with a 0 in the last column. The fix is of course easy: awk -F',' '$3 == "c" && $5 != ""'.
  • cuonglm
    cuonglm about 9 years
    @Janis: $5"" is enough. See my answer
  • Anthon
    Anthon about 9 years
    I would assign the rstrip() to a new variable, and the output from split as well, and re-use those for a more readable and faster solution.
  • Janis
    Janis about 9 years
    @cuonglm; Yes, forcing the field to string is also possible. But, IMO, it's still less clear than spending two more characters to create the explicit and (also to awk beginners!) more obvious condition $5!="".
  • heemayl
    heemayl about 9 years
    @Anthon yeah, you are right..answer edited..
  • mikeserv
    mikeserv about 9 years
    @glennjackman - Not at all. The description of the file is: Each line in a comma-separated file has 5 fields. Admittedly if that were not the case an additional test like 's/,//5;t' might be called for.