Grep all the lines in a file and write each line to another file, starting from the pattern match

Solution 1

Use grep's -o option to print only the matching part of each line. In your case, the pattern 'on line .*' selects everything from "on line" to the end of the line:

% grep -o 'on line .*' temp.txt >new.txt

% cat new.txt 
on line jhkjhvjdbvjvbvbdjkvn  
on line fdgdgdgdd  
on line safffasffaf  
on line adaddsd
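
To try this end to end, the following sketch recreates the question's sample temp.txt in a scratch directory and runs the same command. The scratch directory and the variable names workdir and result are illustrative assumptions, and the trailing spaces of the sample are left out. As far as I know, -o is a widely supported extension (GNU and BSD grep have it) rather than a POSIX option.

```shell
#!/bin/sh
# Scratch directory so the demo does not touch real files (illustrative).
workdir=$(mktemp -d)

# Sample input taken from the question.
cat > "$workdir/temp.txt" <<'EOF'
adsf on line jhkjhvjdbvjvbvbdjkvn
qerwtt on line fdgdgdgdd
qwqertg on line safffasffaf
wrt on line adaddsd
EOF

# -o prints only the part of each line matched by the pattern.
grep -o 'on line .*' "$workdir/temp.txt" > "$workdir/new.txt"

# Keep the result in a variable, show it, then clean up.
result=$(cat "$workdir/new.txt")
printf '%s\n' "$result"

rm -rf "$workdir"
```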

Solution 2

Given the vi tag on this question, and the fact that I've found that automated file editing with POSIX-compliant ex commands gets short shrift on this site compared to the plethora of advice on sed, awk, grep and even Perl, here is a POSIX-compliant ex command that will perform the desired filtering:

ex -sc 'g/.*\(on line\)/s//\1/ | .w!>>output
q!' input

Note the embedded newline in the command. It is necessary for full POSIX portability, as there is no other definite way to end the global command. However, most implementations allow multiple -c commands, in which case the following one-liner works just the same:

ex -sc 'g/.*\(on line\)/s//\1/ | .w!>>output' -c 'q!' input

There is a bit of regex magic and a lot of ex-command magic contained in this command, and since ex doesn't seem to be very widely known, I'll explain each part:

-s starts ex in silent mode, "in preparation for batch processing", so nothing gets output to your terminal.

-c means "Run the following command when the file is opened." (input is the name of the file to open.)

The ex command itself is really two commands:

g/.*\(on line\)/s//\1/ | .w!>>output
q!

g is the "global" command and means, "Run the following commands (the rest of the line) on all lines of the file matching the specified regex."

The regex given is .*\(on line\), which means 'Any characters any number of times, including 0, followed by "on line"'. The escaped parentheses \( and \) capture "on line" for backreferencing later.

Strictly for selecting lines, the g command could just as well be g/on line/. However, the substitute command I wrote uses an empty regex, s//, which means "reuse the last used regex". The g command therefore carries the full pattern, including the capture group, so that the s command can use \1 as the replacement text, meaning "on line" in this case.
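
The regex and the \1 backreference can be tried in isolation with sed, which uses the same basic regular expression syntax. This is only a sketch for experimenting with the pattern, not the ex command itself; the sample line and the variable names out and out2 are made up for illustration.

```shell
#!/bin/sh
# Sample line in the shape of the question's input.
line='adsf on line jhkjhvjdbvjvbvbdjkvn'

# .* consumes everything before "on line"; \(on line\) captures the phrase,
# and \1 puts it back, so only the leading text is removed.
out=$(printf '%s\n' "$line" | sed 's/.*\(on line\)/\1/')
printf '%s\n' "$out"

# As in ex, an empty regex in s// reuses the last regex used, here the one
# in the address, including its capture group. -n plus the p flag prints
# only the lines where a substitution happened.
out2=$(printf '%s\n' "$line" | sed -n '/.*\(on line\)/s//\1/p')
printf '%s\n' "$out2"
```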

The pipe symbol | in an ex command doesn't mean a pipe as it does in the shell. Instead, it is usually used to delimit separate ex commands, each run sequentially but independently. The global command is an exception: within a global command, the vertical bar separates commands that all belong to the global command. That is, such commands run only on the lines matching the regex specified in the global command.

The command following the vertical bar is in this case a write command. It's preceded by a dot . specifying "current line"; without this address specifier the write command will write the entire file, regardless of what is the current line. (Since we're using the write command within a global command, if we were to omit the dot, the write command would write the entire file after each matching line had the substitution command performed on it!)

The >> means, "If the file already exists, append to it rather than giving an error." Since we're writing to the file multiple times, this is necessary, otherwise we would only end up with the last line that was written to the output file. The ! preceding the >> means "If the file doesn't already exist, create the file and write to it rather than throwing an error." (Without the ! it's unspecified in POSIX whether this would happen or not.) And of course output is the name of the file to write to.
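
The shell's own redirection operators show the same effect. In this sketch (an analogy in sh, not ex itself; the file names and sample lines are hypothetical), writing once per line with > leaves only the last line, while >> accumulates all of them.

```shell
#!/bin/sh
# Scratch directory for the two demo files (illustrative).
workdir=$(mktemp -d)

for line in 'on line one' 'on line two' 'on line three'; do
    # Truncating write: each iteration replaces the previous contents.
    printf '%s\n' "$line" > "$workdir/truncated.txt"
    # Appending write: each iteration adds to the end of the file.
    printf '%s\n' "$line" >> "$workdir/appended.txt"
done

# Count the lines that survived in each file.
truncated_count=$(wc -l < "$workdir/truncated.txt" | tr -d ' ')
appended_count=$(wc -l < "$workdir/appended.txt" | tr -d ' ')
echo "truncated: $truncated_count line(s), appended: $appended_count line(s)"

rm -rf "$workdir"
```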

Finally, of course, q! means "quit without saving changes to the current file." We've made substitutions on many lines of the input file, but we don't want to save those changes, so we use q!.
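
Putting the pieces together, here is a sketch of a complete run on the question's sample data, using the multiple -c form. It assumes an ex implementation (for example the one shipped with Vim) is installed and skips the run otherwise; the scratch directory and the variable names workdir and ex_result are illustrative.

```shell
#!/bin/sh
# Scratch directory so the demo does not touch real files (illustrative).
workdir=$(mktemp -d)

# Sample input taken from the question.
cat > "$workdir/input" <<'EOF'
adsf on line jhkjhvjdbvjvbvbdjkvn
qerwtt on line fdgdgdgdd
qwqertg on line safffasffaf
wrt on line adaddsd
EOF

if command -v ex >/dev/null 2>&1; then
    # Run inside the scratch directory so "output" is created there.
    # stdin comes from /dev/null so ex never waits for interactive input.
    (cd "$workdir" &&
        ex -sc 'g/.*\(on line\)/s//\1/ | .w!>>output' -c 'q!' input </dev/null)
    ex_result=$(cat "$workdir/output")
    printf '%s\n' "$ex_result"
else
    # Assumption: without ex there is nothing to demonstrate.
    ex_result=skipped
    echo 'ex not found; skipping'
fi

rm -rf "$workdir"
```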


There are some other approaches which are equivalent, for example the following:

ex -sc '%s/.*\(on line\)/\1/e | v//d
w output | q!' input

This uses the e flag of the substitute command, which is not in POSIX. (Without this flag, batch processing stops with an error if the regex .*\(on line\) is not found anywhere in the file.)


Of course, where ex really shines is in in-place file editing. But it can certainly be used to filter a file out to another file, as illustrated above.

Solution 3

Try this:

grep -o 'on line .*' temp.txt > out.txt

The -o option makes grep output only the matching part of each line, which is what you want.
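
One consequence worth checking: -o prints nothing for lines that do not contain the pattern, so they are absent from the output. The sketch below adds a made-up non-matching line to one line of the question's sample and, for contrast, runs a sed substitution without -n, which passes such lines through unchanged. The variable names matched and kept are illustrative.

```shell
#!/bin/sh
# grep -o: the second (hypothetical) line has no match and is dropped.
matched=$(printf '%s\n' 'adsf on line jhkjhvjdbvjvbvbdjkvn' 'no match here' \
    | grep -o 'on line .*')
printf '%s\n' "$matched"

# sed without -n: the non-matching line is printed as it is.
kept=$(printf '%s\n' 'adsf on line jhkjhvjdbvjvbvbdjkvn' 'no match here' \
    | sed 's/.*\(on line\)/\1/')
printf '%s\n' "$kept"
```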

Author by Admin

Updated on September 18, 2022

Comments

  • Admin almost 2 years

    For example, a temp.txt file contains information like below:

    adsf on line jhkjhvjdbvjvbvbdjkvn  
    qerwtt on line fdgdgdgdd  
    qwqertg on line safffasffaf  
    wrt on line adaddsd
    

    I want to grep for "on line" in every line of the file and write the rest of each line, starting from the match, to another file. That is, after processing temp.txt, the new file should contain:

    on line jhkjhvjdbvjvbvbdjkvn  
    on line fdgdgdgdd  
    on line safffasffaf  
    on line adaddsd  
    

    How can I do that in a Linux terminal?

  • heemayl over 8 years
    How does it differ from the already provided answer?
  • replay over 8 years
    It's the same, we both wrote it at the same time :P
  • Wildcard over 8 years
    I suppose the OP didn't quite specify whether he wanted to include lines from the original file that don't contain "on line"...I read it that he doesn't want those lines (since he said "grep for it" which would exclude non-matching lines) but I guess you interpreted it differently. You could add -n and the p flag (to the s command) to better approximate grep -o, though.