sed to replace only matching part in search string


You can use a search to specify the line to match, and then a simpler regex in the substitute:

sed "/file1\.jar (MD5: [0-9A-Fa-f]*)/s/(MD5: [^)]*)/(MD5: $(md5 file1.jar | awk '{print $4}'))/"

That uses the $(...) notation to run the command. The tricky bit in that is at the end, where the sequence ))/" appears. The first close parenthesis is the end of the $(...) notation; the second is a character in the replacement text.

The first regex /file1\.jar (MD5: [0-9A-Fa-f]*)/ specifies fairly precisely the line to be matched. Then, knowing it is the correct line, the pattern in the substitute can be simpler: the search part /(MD5: [^)]*)/ looks for just the parenthesized MD5 data, safe in the knowledge that even though many other lines contain the same pattern, the substitution will only be applied to the one desired line.
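To see the address-plus-substitute form in action, here is a minimal sketch that runs it against sample lines modelled on the question. The fixed value in new_md5 is a made-up placeholder standing in for the output of md5 file1.jar | awk '{print $4}', so the example runs without the jar file being present.

```shell
# Sample input modelled on the question; the hashes are dummy values.
input='file1.jar (MD5: 12345678901234567890123456789012)
file2.jar (MD5: 09876543210987654321098765432109)
file3.jar (MD5: 24681357902468135790246813579024)'

# Placeholder (assumption): stands in for $(md5 file1.jar | awk '{print $4}').
new_md5=0123456789abcdef0123456789abcdef

# The address selects the file1.jar line; the substitute runs only on that line.
output=$(printf '%s\n' "$input" |
    sed "/file1\.jar (MD5: [0-9A-Fa-f]*)/s/(MD5: [^)]*)/(MD5: $new_md5)/")

printf '%s\n' "$output"
```

Only the file1.jar line should come out with the new hash; the file2.jar and file3.jar lines match the substitute's pattern too, but the address never selects them. Note that md5 is the BSD/macOS tool; with md5sum the hash is the first field, so the awk field number would differ.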

I might be inclined to use:

md5=$(md5 file1.jar | awk '{print $4}')
sed "/file1\.jar (MD5: [0-9A-Fa-f]*)/  s/(MD5: [^)]*)/(MD5: $md5)/"

which clarifies what's what considerably (and doesn't involve a horizontal scroll bar on SO). You could be even more precise in the line matching pattern:

md5=$(md5 file1.jar | awk '{print $4}')
sed "/^file1\.jar (MD5: [0-9A-Fa-f]\{32\})\$/  s/(MD5: [^)]*)/(MD5: $md5)/"

That insists on exactly 32 hex digits and the close parenthesis at the end of the line.
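As a rough illustration of what the stricter address buys you, the sketch below feeds in one well-formed line and one line whose hash has 40 digits (a hypothetical malformed entry, not taken from the question). The value assigned to md5 is a placeholder, not a real checksum.

```shell
# Placeholder (assumption): a stand-in for the real md5 output.
md5=0123456789abcdef0123456789abcdef

# good: exactly 32 hex digits. bad: 40 digits, so the strict address rejects it.
good='file1.jar (MD5: 12345678901234567890123456789012)'
bad='file1.jar (MD5: 1234567890123456789012345678901234567890)'

strict_out=$(printf '%s\n%s\n' "$good" "$bad" |
    sed "/^file1\.jar (MD5: [0-9A-Fa-f]\{32\})\$/  s/(MD5: [^)]*)/(MD5: $md5)/")

printf '%s\n' "$strict_out"
```

The 32-digit line is rewritten and the 40-digit line passes through untouched, because the address requires the close parenthesis immediately after the 32nd hex digit.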


One of the comments asks:

Can sed operate in such a way that the replacement string replaces only the matching groups in the search pattern? For example, given 's/A B \(D\)/C/', it outputs A B C.

If I understand the (clarification of the) question, then you can do what you want with appropriate capturing - but the replacement part will have to specify exactly what you want as output (no shortcuts like you seem to be after). So, for the example, you would write something like:

s/\(A B \)\(D\)/\1C/

Here the \(D\) does not need the capturing parentheses, since the captured material is not used in the replacement, so you could write either of:

s/\(A B \)D/\1C/
s/\(A B\) D/\1 C/

You could also do:

/A B / s/D/C/

This has a search (for the A B sequence) and then the substitute looks for D and replaces it with C. This is basically what the main answer is suggesting. You can probably also do:

/\(A B\) D/ s//\1 C/

The 'empty search' should repeat the match, but the replacement has to be written out in full, and that is effectively the same as one of the previous commands:

s/\(A B\) D/\1 C/
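
The variants discussed above can be compared side by side. This is a small sketch using the literal input A B D from the comment; the variable names r1 to r5 are only for illustration.

```shell
# Each form rewrites "A B D" so that only the D changes.
r1=$(echo 'A B D' | sed 's/\(A B \)\(D\)/\1C/')   # both parts captured
r2=$(echo 'A B D' | sed 's/\(A B \)D/\1C/')       # only the prefix captured
r3=$(echo 'A B D' | sed 's/\(A B\) D/\1 C/')      # space moved outside the group
r4=$(echo 'A B D' | sed '/A B / s/D/C/')          # address, then a simple substitute
r5=$(echo 'A B D' | sed '/\(A B\) D/ s//\1 C/')   # empty regex reuses the address

printf '%s\n' "$r1" "$r2" "$r3" "$r4" "$r5"
```

In every case the replacement spells out the full output for the matched text; the capture group only saves retyping the part that stays the same.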
Author by

Admin

Updated on July 01, 2022

Comments

  • Admin
    Admin almost 2 years

    I have a file that contains:

    Lorem ipsum dolem file1.jar.

    • file1.jar (MD5: 12345678901234567890123456789012)
    • file2.jar (MD5: 09876543210987654321098765432109)
    • file3.jar (MD5: 24681357902468135790246813579024)

    and I'd like to replace the first MD5. This sed command does the job:

    sed "s/file1.*MD5\:\(.*\)/file1.jar \(MD5\: `md5 file1.jar | awk '{print $4}'`\)/"
    

    Is there a way to tell sed to replace only the matching group while leaving the rest of the line alone? For example:

    sed "s/file1.*MD5\:\(.*\)/`md5 file1.jar | awk '{print $4}'`/"
    
  • Jonathan Leffler
    Jonathan Leffler over 12 years
    You can do better than writing 32 dots with .\{32\}, or (since you're after hex digits) [0-9a-fA-F]\{32\}. Also, your use of naked parentheses as capturing requires 'extended regular expressions', which requires option -E on MacOS X and BSD, and -r (or --regexp-extended) with GNU sed - and which isn't available on native AIX, HP-UX or Solaris versions of sed.
  • Admin
    Admin over 12 years
    +1 Great explanation. While your command might make the search more accurate (much appreciated), it doesn't exactly address my original question. Perhaps I incorrectly phrased it to begin with. Can sed operate in such a way that the replacement string replaces only the matching groups in the search pattern? For example, given 's/A B \(D\)/C/', it outputs A B C.
  • Jonathan Leffler
    Jonathan Leffler over 12 years
    I don't understand what you're asking, then. You can do what you want with appropriate capturing - but the replacement part will have to specify exactly what you want as output (no shortcuts like you seem to be after).
  • Admin
    Admin over 12 years
    Ah ok. That actually answers my question. :) Do you mind editing your answer to include your comment?
  • jaypal singh
    jaypal singh over 12 years
    Is there a reason why we should have such a long RegEx pattern, to identify the line that needs to be changed? /^file1\.jar (MD5: [0-9A-Fa-f]\{32\})\$/ is similar to /^file1/.
  • Jonathan Leffler
    Jonathan Leffler over 12 years
    @Jaypal: you could probably use a much shorter regex - the one I quoted is secure against most accidents or misformed information, whereas simpler regexes might manage to be confused by something unexpected. It depends on your knowledge of what might be found in the data, and is a judgement call. You could almost certainly use just /^file1\.jar / with approximately zero chance of confusion, but if the file has an SHA1 hash, the substitute operation might fail, etc. I doubt whether that's a real problem. You might also need to consider whether the jar files could have path components.
  • vpalmu
    vpalmu over 12 years
    I never can seem to remember which commands want \( for substring and which commands want (.
  • Admin
    Admin over 12 years
    @JonathanLeffler, Thanks again!