GNU Pattern Match and replace exact number of characters
Your regular expression mixes basic and extended regular expression syntax.
As an extended regular expression (using {13} and \| as a literal pipe):
sed -E 's/\|37[0-9]{13}\|/|37xxxxxxxxxxxxx|/g'
Alternatively, as a basic regular expression (using \{13\} and | as a literal pipe):
sed 's/|37[0-9]\{13\}|/|37xxxxxxxxxxxxx|/g'
This turns your example string into
dlkfhfd|fedfe|dfwe3f347fde|3745978|dlkfhr**|37xxxxxxxxxxxxx|**fedfe|dfwe3f347fde
Also note that there is no need to escape the | in the replacement part of the expression, as that part is never interpreted as a regular expression.
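As a quick sanity check, here is a minimal sketch that runs both commands on the example line from the question and compares the results. The variable names (line, ere, bre) are only illustrative, and it assumes a sed that accepts -E.

```shell
# Example line taken from the question.
line='dlkfhfd|fedfe|dfwe3f347fde|3745978|dlkfhr**|376663781736102|**fedfe|dfwe3f347fde'

# Extended syntax: {13} is an interval, \| is a literal pipe.
ere=$(printf '%s\n' "$line" | sed -E 's/\|37[0-9]{13}\|/|37xxxxxxxxxxxxx|/g')

# Basic syntax: \{13\} is an interval, | is a literal pipe.
bre=$(printf '%s\n' "$line" | sed 's/|37[0-9]\{13\}|/|37xxxxxxxxxxxxx|/g')

# Both lines printed here should be identical.
printf '%s\n' "$ere" "$bre"
```

The shorter field |3745978| is left alone because it has only five digits after the 37, not thirteen.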
In awk:
awk -F '|' -v OFS='|' '
{
    # Mask every 15-character field that starts with 37 followed by a digit
    for (i = 1; i <= NF; ++i)
        if (length($i) == 15 && match($i, "^37[0-9]"))
            $i = "37xxxxxxxxxxxxx"
    print
}'
One could have used gsub() here, but that would have made it more or less identical to the sed solution, and therefore boring.
Looping over the fields has the benefit that the substitution also occurs in the first or last field, even though such a field is not delimited by | at both ends.
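To illustrate the difference, here is a small sketch with a made-up input line (not from the question) whose first and last fields both qualify. The variable names are hypothetical, and the sed command again assumes support for -E.

```shell
# Hypothetical input: matching values in the first and last fields.
line='376663781736102|abc|371234567890123'

# The sed pattern needs a pipe on both sides, so neither field matches.
sed_out=$(printf '%s\n' "$line" | sed -E 's/\|37[0-9]{13}\|/|37xxxxxxxxxxxxx|/g')

# The awk loop tests each field on its own, so both fields are masked.
awk_out=$(printf '%s\n' "$line" | awk -F '|' -v OFS='|' '
{
    for (i = 1; i <= NF; ++i)
        if (length($i) == 15 && match($i, "^37[0-9]"))
            $i = "37xxxxxxxxxxxxx"
    print
}')

printf 'sed: %s\nawk: %s\n' "$sed_out" "$awk_out"
```

Note that the awk test only checks the field length and its first three characters. If a field could contain non-digits further along, a stricter test would be needed.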
Ishan
Updated on September 18, 2022
Comments
-
Ishan over 1 year
This question may have been listed but I was not able to find one exact hit.
I am trying to skim through a file, match a pattern and replace it with something else. However, there are other occurrences of the pattern, and I need to replace only those which are 17 characters in length.
Example:
Content:
dlkfhfd|fedfe|dfwe3f347fde|3745978|dlkfhr**|376663781736102|**fedfe|dfwe3f347fde
Expectation:
dlkfhfd|fedfe|dfwe3f347fde|3745978|dlkfhr**|37xxxxxxxxxxxxx|**fedfe|dfwe3f347fde
Progress: I was able to match the expression with the regexp pattern:
**\|37[0-9]{13}\|**
However, if I put it in sed, it just replaces everything in the file:
sed -e s/\|37[0-9]{13}\|/\|37xxxxxxxxxxxxx\|/g
My sed version is 4.2.2.
-
Ishan about 6 years
Forgot to mention the file is pipe delimited.
-
Ishan about 6 years
One question though: is there a specific version of sed which supports both extended and basic syntax at once? Or does it have to be one or the other?
-
Kusalananda about 6 years
@Ishan I have tested the two sed commands with both GNU and BSD sed. Neither of these two sed implementations allows mixing extended and basic regular expression syntax. Also, if they did, they would not know what to do with | and/or \|.