Grep characters before and after match?

245,663

Solution 1

Up to 3 characters before and up to 4 characters after the match (-P requires GNU grep built with PCRE support):

$> echo "some123_string_and_another" | grep -o -P '.{0,3}string.{0,4}'
23_string_and

Solution 2

grep -E -o ".{0,5}test_pattern.{0,5}" test.txt 

This will match up to 5 characters before and up to 5 characters after your pattern. The -o switch tells grep to show only the matched text, and -E tells it to use extended regular expressions. Make sure to quote the expression; otherwise the shell may interpret parts of it, such as the braces.
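As a quick check of how this behaves when the pattern occurs more than once on a line, here is a small sketch. The sample string is made up for illustration and is not from the answer.

```shell
# Made-up sample input: the pattern appears twice on a single line.
sample='aaa_test_pattern_bbb_ccc_test_pattern_ddd'

# -o prints each non-overlapping match on its own line.
printf '%s\n' "$sample" | grep -E -o '.{0,5}test_pattern.{0,5}'
# prints:
#   aaa_test_pattern_bbb_
#   ccc_test_pattern_ddd
```

Note that the second match shows only four leading characters: the underscore before "ccc" was already consumed as trailing context of the first match, so closely spaced matches share context rather than repeating it.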

Solution 3

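You could use awk:

awk '/test_pattern/ {
    # match() sets RSTART (position) and RLENGTH (length) of the first match on the line.
    match($0, /test_pattern/)
    # Print the match with 10 characters of context on either side.
    print substr($0, RSTART - 10, RLENGTH + 20)
}' file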
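The awk command above reports only the first match on each line. One possible extension, sketched here rather than taken from the answer, loops over the line and prints every match with its context. The helper name `ctx_awk` and its arguments are hypothetical, and the sample input is made up.

```shell
# ctx_awk PATTERN WIDTH [FILE...]
# Hypothetical helper: prints every match of PATTERN with up to WIDTH
# characters of context on each side. Reads stdin when no FILE is given.
ctx_awk() {
    pat=$1
    width=$2
    shift 2
    # Note: awk processes backslash escapes in -v values, so keep PATTERN simple.
    awk -v pat="$pat" -v w="$width" '
    {
        rest = $0      # part of the line not yet searched
        offset = 0     # number of characters already consumed from $0
        while (match(rest, pat) && RLENGTH > 0) {
            start = offset + RSTART          # match position within $0
            from = start - w
            if (from < 1) from = 1           # clamp at the start of the line
            print substr($0, from, (start - from) + RLENGTH + w)
            offset += RSTART + RLENGTH - 1
            rest = substr(rest, RSTART + RLENGTH)
        }
    }' "$@"
}

printf '%s\n' 'aaa_test_pattern_bbb_ccc_test_pattern_ddd' | ctx_awk 'test_pattern' 3
# prints:
#   aa_test_pattern_bb
#   cc_test_pattern_dd
```

Because the context is cut from the original line rather than from the unsearched remainder, neighbouring matches may show overlapping context, which differs from what grep -o prints.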

Solution 4

You mean, like this:

grep -o '.\{0,20\}test_pattern.\{0,20\}' file

?

That will print up to twenty characters on either side of test_pattern. The \{0,20\} notation is like *, but specifies zero to twenty repetitions instead of zero or more. The -o says to show only the match itself, rather than the entire line.
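To illustrate why the lower bound is 0, here is a minimal sketch on a made-up line that has only three characters before the pattern. With a lower bound of 0 the shorter context is accepted; with a fixed count the line no longer matches at all.

```shell
# Made-up sample line with only three characters before the pattern.
line='ab_test_pattern_cd'

# Lower bound 0: shorter context is acceptable, so the line matches.
printf '%s\n' "$line" | grep -o '.\{0,5\}test_pattern.\{0,5\}'
# prints: ab_test_pattern_cd

# \{5\} means exactly five characters: too little context, so grep prints nothing.
printf '%s\n' "$line" | grep -o '.\{5\}test_pattern.\{5\}' || echo 'no match'
# prints: no match
```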

Solution 5

I'll never easily remember these cryptic command modifiers so I took the top answer and turned it into a function in my ~/.bashrc file:

cgrep() {
    # For files whose lines run to tens of thousands of characters.
    # Use cgrep to print up to 30 characters before and after the search pattern.
    if [ $# -eq 2 ] ; then
        # Format was 'cgrep "search string" /path/to/filename'
        grep -o -P ".{0,30}$1.{0,30}" "$2"
    else
        # Format was 'cat /path/to/filename | cgrep "search string"'
        grep -o -P ".{0,30}$1.{0,30}"
    fi
} # cgrep()

Here's what it looks like in action:

$ ll /tmp/rick/scp.Mf7UdS/Mf7UdS.Source
-rw-r--r-- 1 rick rick 25780 Jul  3 19:05 /tmp/rick/scp.Mf7UdS/Mf7UdS.Source
$ cat /tmp/rick/scp.Mf7UdS/Mf7UdS.Source | cgrep "Link to iconic"
1:43:30.3540244000 /mnt/e/bin/Link to iconic S -rwxrwxrwx 777 rick 1000 ri
$ cgrep "Link to iconic" /tmp/rick/scp.Mf7UdS/Mf7UdS.Source
1:43:30.3540244000 /mnt/e/bin/Link to iconic S -rwxrwxrwx 777 rick 1000 ri

The file in question is one continuous 25K line and it is hopeless to find what you are looking for using regular grep.

Notice the two different ways you can call cgrep, which parallel the ways grep itself can be called.

There is a "niftier" way of creating the function where "$2" is only passed when set, which would save 4 lines of code. I don't have it handy, though. Something like ${parm2} $parm2. If I find it, I'll revise the function and this answer.
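One way the shorter form might look, as a sketch rather than the author's eventual revision: take the pattern from the first argument, shift it away, and pass whatever remains to grep as "$@". When no file is given, "$@" expands to nothing and grep reads standard input. This version assumes -E in place of -P so that it also runs where -P is unavailable; swap it back if you need Perl-style syntax.

```shell
# cgrep PATTERN [FILE...]
# Prints each match with up to 30 characters of context on either side.
cgrep() {
    if [ $# -lt 1 ]; then
        echo 'usage: cgrep PATTERN [FILE...]' >&2
        return 2
    fi
    pattern=$1
    shift
    # "$@" now holds the remaining arguments: nothing (read stdin) or file names.
    grep -o -E ".{0,30}${pattern}.{0,30}" "$@"
}
```

Both call styles from the answer still work: `cgrep "search string" /path/to/filename` and `cat /path/to/filename | cgrep "search string"`. With more than one file, grep prefixes each match with the file name.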


Author: Legend

Just a simple guy :)

Updated on July 08, 2022

Comments

  • Legend
    Legend 6 months

    Using this:

    grep -A1 -B1 "test_pattern" file
    

    will produce one line before and after the matched pattern in the file. Is there a way to display not lines but a specified number of characters?

    The lines in my file are pretty big so I am not interested in printing the entire line but rather only observe the match in context. Any suggestions on how to do this?

  • Benubird
    Benubird about 9 years
    A good answer for small amounts of data, but it starts getting slow when you are matching >100 characters - e.g. in my giant xml file, I want {1,200} before and after, and it is too slow to use.
  • ssobczak
    ssobczak over 8 years
    The awk version by @amit_g is much faster.
  • Xofo
    Xofo about 8 years
    Not available on Mac OSX, so really this is not a widely available solution. The -E version (listed below) is a better solution. What is -P? Read on ... -P, --perl-regexp Interpret PATTERN as a Perl regular expression (PCRE, see below). This is highly experimental and grep -P may warn of unimplemented features.
  • Apollon
    Apollon almost 8 years
    Works nicely even with somewhat bigger files also
  • Kev
    Kev over 7 years
    Inexplicably, for me, this prints a certain number of lines of beautiful output, then says "Aborted", every time the same number of lines, which depends on what I'm searching for, but is never the full number of matches, by far. bash 4.1.2(1) and grep 2.6.3, CentOS 6.5.
  • Kev
    Kev over 7 years
    The -E version below does not have this trouble, for some reason. Also, if I search for something that doesn't exist, I get only the Aborted line.
  • glerYbo
    glerYbo about 7 years
    On OSX install via: brew install homebrew/dupes/grep and run it as ggrep.
  • koox00
    koox00 over 6 years
    how can you use this to find multiple matches per line?
  • Alexander  Pravdin
    Alexander Pravdin almost 6 years
    This command is not working for me: grep: Invalid content of \{\}
  • Lew Rockwell Fan
    Lew Rockwell Fan over 5 years
    What's the significance of the first number in the curly-bracketed pairs? Like the 0s in "grep -E -o ".{0,5}test_pattern.{0,5}" test.txt "?
  • matanster
    matanster almost 5 years
    As implied by @Benubird this will be performance-wise impossible to use for huge files with moderately wide surroundings desired for the match target.
  • CodeMonkey
    CodeMonkey over 4 years
    Good answer, interesting that it's capped at 2^8-1 for length in the {} so {0,255} works {0,256} gives grep: invalid repetition count(s)
  • Abdollah
    Abdollah over 3 years
    It's really faster but not as accurate as @ekse's answer.
  • Adam Hughes
    Adam Hughes almost 3 years
    This seems to get considerably less performant as I increase the number of matching chars (5 -> 25 ->50), any idea why?
  • GKP
    GKP 9 months
    not working for me bash-5.1$ echo "some123_string_and_another" | grep -o -P '.{0,3}string.{0,4}' grep: unrecognized option: P