"^ backslash not last character on line" in gawk
gensub() expects a string as its second argument. You are trying to concatenate "/" and "," around an expression, (\\1+3), which you assume will be evaluated by the function. It will not: awk evaluates the arguments before calling the function. Outside a string or regexp, awk accepts a backslash only as a line continuation, directly before a newline, which is why you get this error message. You use \1 to refer to the matching capture group () in the regexp, but you can only use it in a string, not in an expression.
So at best you could use "/\\1+3," as the second argument, but you would then get the result ...Backslash/49+3,Black. You cannot evaluate the 49+3 part in this way.
If you want to do arithmetic on the match, you must first extract the string, do the arithmetic, then place it back in the string. For example,
awk '{ n = split($0, d, /\/([0-9]+),/, s)
print d[1] "/"(substr(s[1],2)+3)"," d[2] }'
This uses GNU awk's split() function with a regexp to split the line into 3 parts: the part before the match in d[1], the part after the match in d[2], and the matched string "/49," in s[1]. You should really check that n is 2 to ensure you got exactly one match.
You can then extract the number from the matched string by simply skipping over the initial "/", do the arithmetic, and concatenate all the parts together again.
If the pattern may appear several times in one line of your data, a better solution is to use match() to find only the last occurrence and cut up the line using substr():
awk '{ match($0, /.*\/([0-9]+),/, m)
a = m[1,"start"]
b = m[1,"length"]
if(a)print substr($0,1,a-1) substr($0,a,b)+3 substr($0,a+b)
else print }'
Here the pattern has .* added at the front so that it matches only the last occurrence. a is set to the character position of the start of the capture group () in the regexp, and b to its length, so substr($0,a,b) is just the number. The output line is reassembled from the incremented number and the two surrounding parts of the original line.
Updated on September 18, 2022
Comments
-
Tim almost 2 years
I would like to match a number between / and , in each line, and increase it by 3. For example
The Ubiquitous Backslash/49,Black
becomes
The Ubiquitous Backslash/52,Black
My gawk command is:
$ gawk '{b=gensub(/\/([0-9]+),/, "/" (\\1+3) ",") ; print b}' add.jpdf
gawk: cmd. line:1: ^ backslash not last character on line
I was wondering what "^ backslash not last character on line" means? Which gawk syntax rule does my solution violate?
Thanks.
-
Tim almost 7 years
Thanks. (1) From gnu.org/software/gawk/manual/html_node/String-Functions.html, in b = gensub(/(.+) (.+)/, "\\2 \\1", "g", a), are \1 and \2 evaluated by the function or before calling the function? Why does it work? (2) Does substr(s[1],2) return the suffix of string s[1] starting from the second character of s[1]? Will its output include the last character ","?
-
meuh almost 7 years
\1 and \2 in the string passed to gensub will get replaced by gensub for every match. But the string passed to gensub stays otherwise fixed. In your post you try to provide an expression, but that must be converted into a string (which can contain \1) before it gets passed to gensub. The substr call is as you described: it starts from character position 2 and runs until the end.
-
Tim almost 7 years
Thanks. I see. How would you "check n is 2 to ensure you got exactly one match"?
-
Tim almost 7 years
In gensub(/\/([0-9]+),/, "/" (\\1+3) ","), what will \\1+3 in the second argument become after evaluating, and then what will it become after converting to a string?
-
meuh almost 7 years
I added an alternative answer for when there is more than one match on a line. You cannot ask about \\1+3, as it is illegal awk, so it means nothing.
-
Tim almost 7 years
Thanks. Can your way deal with the case that there is no match (where we should do nothing)?
-
meuh almost 7 years
The if(a) tests if there was a match. If not, just print the line. I edited the answer.
-
Tim almost 7 years
Thanks a lot @meuh. I was wondering what the syntax error "^ backslash not last character on line" in my original solution means?