How to grep on a remote machine and print the lines that contain given words?

Solution 1

If you are open to other formats for the word lists, consider separating the words with | so that each list is an extended regular expression:

inc="hello|animal|atttribute|metadata"
exc="timeout|runner" 
ssh machineB "grep -E '$inc' path/ptd.log | grep -vE '$exc'"
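
If the lists have to stay comma-separated as in the question, they can be converted on the fly. The following is a minimal sketch rather than a definitive implementation: to_alternation is a hypothetical helper name, the log path is the one given in the question, and the words are assumed to contain no single quotes or regex metacharacters. It only builds the command string, so it can be tried without a remote host.

```shell
#!/bin/bash
# Hypothetical helper: turn "a,b,c" into "a|b|c" for use with grep -E.
to_alternation() {
    printf '%s' "$1" | tr ',' '|'
}

wordsToInclude="hello,animal,atttribute,metadata"
wordsToExclude="timeout,runner"

inc=$(to_alternation "$wordsToInclude")   # hello|animal|atttribute|metadata
exc=$(to_alternation "$wordsToExclude")   # timeout|runner

# Command string that would be handed to ssh. It is only built here,
# not executed, so the sketch runs without access to machineB.
remote_cmd="grep -E '$inc' /opt/ptd/Logs/ptd.log | grep -vE '$exc'"
printf '%s\n' "$remote_cmd"
```

Running it for real would then be: ssh machineB "$remote_cmd"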

Faster Alternative

If your log files are large and you are grepping for fixed words, as opposed to fancy regular expressions, you may want to consider this approach:

inc='hello
animal
atttribute
metadata'

exc='timeout
runner'

ssh machineB "grep -F '$inc' ptd.log | grep -vF '$exc'"

By putting each word on a separate line, we can use grep's -F feature for fixed strings. This turns off regex processing, making the process faster.
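
The fixed-string filter can be tried on a small local sample before pointing it at the real logs. This is a sketch under assumptions: filter_fixed is a hypothetical function name, the sample lines are made up for illustration, and the comma-separated lists from the question are converted to one word per line with tr.

```shell
#!/bin/bash
# Hypothetical helper.
# $1: words to include, one per line
# $2: words to exclude, one per line
# $3: file to search
filter_fixed() {
    # -F treats each line of the pattern list as a literal string;
    # -e keeps a pattern that starts with "-" from being read as an option.
    grep -F -e "$1" "$3" | grep -vF -e "$2"
}

# Convert the comma-separated lists to one word per line.
inc=$(printf '%s' "hello,animal,atttribute,metadata" | tr ',' '\n')
exc=$(printf '%s' "timeout,runner" | tr ',' '\n')

# Made-up sample standing in for ptd.log.
sample=$(mktemp)
printf '%s\n' 'hello world' 'animal timeout' 'nothing here' 'metadata ok' > "$sample"

result=$(filter_fixed "$inc" "$exc" "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Only the lines that contain an include word and no exclude word are kept, so 'animal timeout' is dropped even though it matches animal.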

Solution 2

It may not seem possible, but you can use grep's -f option with that list of words, even though they're in a shell variable and not a proper file. The trick is to fool grep into thinking that they come from a file, like so:

$ ssh machineB "grep -f <(echo '$wordsToInclude' | tr , '\n') file1 file2 file3"

This runs the grep ... command remotely via ssh on machineB. Because the command is in double quotes, $wordsToInclude is expanded on the local machine before ssh sends the command; inside single quotes it would be expanded on machineB, where the variable is not set. The remote shell, which must support process substitution (bash does), then switches the commas to end-of-line characters (, -> \n) and feeds the resulting list of words to grep via its -f switch.

To run the result through the exclude list, add a second grep after the first one via a pipe. The exclude list needs the same comma-to-newline conversion:

$ ssh machineB "grep -f <(echo '$wordsToInclude' | tr , '\n') \
    file1 file2 file3 | grep -vf <(echo '$wordsToExclude' | tr , '\n')"
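
The pattern list that grep -f receives can be checked locally before involving ssh. The sketch below is an illustration under assumptions: temporary files stand in for the process substitution, and the sample log lines are made up.

```shell
#!/bin/bash
wordsToInclude="hello,animal,atttribute,metadata"
wordsToExclude="timeout,runner"

incfile=$(mktemp)
excfile=$(mktemp)
log=$(mktemp)

# One pattern per line, which is the format grep -f expects.
printf '%s\n' "$wordsToInclude" | tr ',' '\n' > "$incfile"
printf '%s\n' "$wordsToExclude" | tr ',' '\n' > "$excfile"

# Made-up stand-in for a real log file.
printf '%s\n' 'hello there' 'runner says hello' 'plain line' 'animal seen' > "$log"

# Same pipeline as the remote command, with real files instead of <(...).
matches=$(grep -f "$incfile" "$log" | grep -v -f "$excfile")
printf '%s\n' "$matches"

rm -f "$incfile" "$excfile" "$log"
```

Printing the contents of the include file is a quick way to confirm that each word ended up on its own line.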

Solution 3

SSH is run with a command like so:

ssh host command

Or in your case, with the words in each variable separated by | rather than commas:

ssh -t machineB "grep -E \"$wordsToInclude\" ptd.log | grep -vE \"$wordsToExclude\""

The -t option forces ssh to allocate a pseudo-terminal, which prevents an "ioctl error". I'd also recommend using grep's fixed-string mode for increased speed, as described in the answer by @John1024. Just put each word on its own line, like:

wordsToInclude='hello
animal
atttribute
metadata'

wordsToExclude='timeout
runner'

Then pass -F instead of -E, so that the two commands become grep -F and grep -vF.
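
Putting the pieces together, a complete script might look like the following sketch. It rests on several assumptions: run_remote, search_logs and the DRY_RUN switch are hypothetical names introduced here, the -h option (suppress file name prefixes) is assumed to be supported by the grep on machineB, and the words are assumed to contain no single quotes.

```shell
#!/bin/bash
wordsToInclude="hello,animal,atttribute,metadata"
wordsToExclude="timeout,runner"

# One word per line, as grep -F expects.
inc=$(printf '%s' "$wordsToInclude" | tr ',' '\n')
exc=$(printf '%s' "$wordsToExclude" | tr ',' '\n')

# Hypothetical wrapper: with DRY_RUN=1 the command runs in a local shell,
# which mirrors how ssh hands the string to the remote shell.
run_remote() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        sh -c "$1"
    else
        ssh machineB "$1"
    fi
}

# $1: directory that holds ptd.log, ptd.log.1, ptd.log.2, ...
search_logs() {
    # The glob is left unquoted so the executing shell expands it.
    run_remote "grep -hF '$inc' '$1'/ptd.log* | grep -vF '$exc'"
}
```

A real run would be search_logs /opt/ptd/Logs, while setting DRY_RUN=1 first and pointing the function at a local directory of sample logs allows a rehearsal without touching machineB.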

Author: david

Updated on September 18, 2022

Question (asked by david)

    I have a few log files on machineB under the directory /opt/ptd/Logs/, as shown below. The log files are pretty big.

    david@machineB:/opt/ptd/Logs$ ls -lt
    -rw-r--r-- 1 david david  49651720 Oct 11 16:23 ptd.log
    -rw-r--r-- 1 david david 104857728 Oct 10 07:55 ptd.log.1
    -rw-r--r-- 1 david david 104857726 Oct 10 07:50 ptd.log.2
    

    I am trying to write a generic shell script that searches all the log files on machineB for particular patterns and prints the lines that contain them. I will run the script from machineA, which has the ssh keys set up, so I need to grep the log files on machineB remotely from machineA.

    #!/bin/bash
    
    wordsToInclude="hello,animal,atttribute,metadata"
    wordsToExclude="timeout,runner"
    
    # now grep on the various log file for above words and print out the lines accordingly
    

    That is, wordsToInclude holds words separated by commas. If a log line contains the word hello, print that line; likewise, print any line that contains animal, attribute, or metadata.

    Similarly, wordsToExclude holds words separated by commas. If a line contains any of those words, don't print it.

    I am using the above format for storing the words for now, but any better format is fine with me. Both wordsToInclude and wordsToExclude can hold a long list of words, which is why I am storing them in variables.

    I know how to grep for a small set of words. If I were running grep from the command line directly on machineB, I would do it like this:

    grep -E 'hello|animal|atttribute|metadata' ptd.log | grep -v 'timeout'
    

    But I am not sure how to combine this in my shell script so that I can run the grep on machineB remotely over ssh from machineA.