What is the meaning of :a;$!N; in a sed command?
These are the, admittedly cryptic, sed commands. Specifically (from man sed):
: label
Label for b and t commands.
t label
If a s/// has done a successful substitution since the last input line was read and since the last t or T command, then branch to label; if label is omitted, branch to end of script.
n N
Read/append the next line of input into the pattern space.
So, the script you posted can be broken down into (spaces added for readability):
sed ':a;  $!N;  s/\n/string/;  ta'
     ---  ----  -------------  --
      |    |          |        |--> go back (`t`) to `a` if the substitution succeeded
      |    |          |-----------> substitute newlines with `string`
      |    |----------------------> If this is not the last line (`$!`), append the
      |                             next line to the pattern space.
      |---------------------------> Create the label `a`.
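As a rough sketch of the same breakdown, the script can also be written with one command per line and a `#` comment on each step. The three-line input here is only an illustration:

```shell
# Same script as above, one sed command per line.
result=$(printf 'line1\nline2\nline3\n' | sed '
# Define the label "a".
:a
# Unless this is the last input line, append the next line to the pattern space.
$!N
# Replace the first embedded newline with the literal text "string".
s/\n/string/
# If that substitution succeeded, jump back to the label "a".
ta
')
printf '%s\n' "$result"
# prints: line1stringline2stringline3
```

Writing the label on its own line also avoids relying on `;` after a label, which not every sed implementation accepts.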
Basically, what this is doing could be written in pseudocode as
while (not at the last line of input){
    append the next line to the current one and replace \n with 'string'
}
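A rough shell equivalent of that loop, as a sketch only: `join_lines` is a hypothetical helper name made up for this illustration, and this is not how sed works internally. It just mirrors the visible result for non-empty input.

```shell
# join_lines SEP: read lines from stdin and join them with SEP (hypothetical helper).
join_lines() {
  sep=$1
  out=
  first=1
  # "|| [ -n "$line" ]" also handles a final line with no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    if [ "$first" -eq 1 ]; then
      out=$line            # the first line starts the accumulated text
      first=0
    else
      out=$out$sep$line    # append the separator, then the next line
    fi
  done
  printf '%s\n' "$out"
}

result=$(printf 'line1\nline2\nline3\n' | join_lines string)
printf '%s\n' "$result"
# prints: line1stringline2stringline3
```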
You can understand this a bit better with a more complex input example:
$ printf "line1\nline2\nline3\nline4\nline5\n" | sed ':a;$!N;s/\n/string/;ta'
line1stringline2stringline3stringline4stringline5
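I am not really sure why the $! is needed. With GNU sed, at least, you get the same output without it, because GNU sed prints the pattern space before exiting when N finds no next line. POSIX lets other implementations quit without printing in that case, so keeping $! is the safer form. The version without it: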
printf "line1\nline2\nline3\nline4\nline5\n" | sed ':a;N;s/\n/string/;ta'
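For what it's worth, `$!N` parses as the address `$` (the last line), then `!` (negate the address), then the command `N`. A small sketch using `d` instead of `N`, on made-up five-line input, shows the negation on its own:

```shell
# 2,4!d : delete every line that is NOT in the range 2 to 4.
inside=$(printf '1\n2\n3\n4\n5\n' | sed '2,4!d')
# $!d : delete every line that is NOT the last line.
last=$(printf '1\n2\n3\n4\n5\n' | sed '$!d')
printf '%s\n' "$inside" "$last"
# prints 2, 3, 4 and then 5, one per line
```

Read the same way, `$!N` means "run `N` on every line except the last one".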
Avinash Raj
Updated on September 18, 2022
Comments
-
Avinash Raj over 1 year
$ (echo hello; echo there) | sed ':a;$!N;s/\n/string/;ta'
hellostringthere
The above sed command replaces the newline character with the string "string". But I don't know the meaning of :a;$!N;s/\n/string/;ta within the single quotes. I know the middle part, s/\n/string/. But I don't know the function of the first (:a;$!N;) and last (ta) parts.
-
Admin almost 10 years
Please look at stackoverflow.com/questions/1251999/….
-
Admin almost 10 years
What about the last part?
-
Admin almost 10 years
The "t" command branches to a named label if the last substitute command modified pattern space.
-
-
Braiam almost 10 years
The !$ is there so that it doesn't match the last newline, IMO.
-
terdon almost 10 years
@Braiam not too sure about that, it's $! not !$. However, it might also be !N and not $!.
-
Braiam almost 10 years
I was trying to parse the texinfo page but didn't find references to either !N or $!. So I still think it is checking whether the last line is a newline or EOF.
-
steeldriver almost 10 years
I try to think of $! as an address 'range' with a postfix complement operator - so $!N (do N everywhere except for address $) is really the same syntax as something like m,n!d (delete everything except lines m to n).
-
Sergiy Kolodyazhnyy over 5 years
: is an analog of a goto label, and in fact : used to be the goto label in the Thompson shell, so it'd be familiar to people back in the day using both sed and the Thompson shell.
-
msciwoj over 5 years
When does "append the next line to the pattern space" happen? Does the substitution happen on just the new line before the append, or on the concatenated text after the append (less efficient, as it tries to substitute in the same beginning all over again)?
-
terdon over 5 years
@msciwoj They happen in the order they are written. That's why it works. If the substitution were only done on the original line, before concatenating, then it would only ever remove one \n from the first line and there would be no point in concatenating.
-
msciwoj over 5 years
@terdon I was rather thinking it would take place on the new line being appended (before appending). Imagine 10 lines of 100 chars each. If the append happens before the substitution, then it should have a huge performance cost (first time substituting on 100 chars, 2nd time on 200 chars, 3rd time on 300 chars and so on, each time going through the beginning of the string that has already been substituted). Is that how this works?
-
terdon over 5 years
@msciwoj I see what you mean. I think that is indeed how it works, but only because I am assuming the operations happen in the order in which they are written. My previous comment was wrong, it could also work by substituting first and appending later. You make a good point. This might be worth its own question, either here or on Unix & Linux.
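One way to check the order, as a sketch: add the `l` command, which prints the pattern space unambiguously (an embedded newline shows as `\n`, the end is marked with `$`), right after `$!N`. The three-line input and the replacement text `X` are made up for illustration, and this shows the order of operations only; it does not measure performance.

```shell
# -n suppresses the normal output so that only the "l" dumps are shown.
trace=$(printf 'a\nb\nc\n' | sed -n '
:a
$!N
# Dump the pattern space as it looks right after the append, before the substitution.
l
s/\n/X/
ta
')
printf '%s\n' "$trace"
# prints:
# a\nb$
# aXb\nc$
# aXbXc$
```

Each dump already contains the text substituted in earlier passes plus the newly appended line, so the append happens first and `s` is applied to the whole accumulated pattern space.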