Linux command to replace string in LARGE file with another string
Solution 1
sed is a good choice for large files.
sed -i.bak -e 's%C://temp%//home//some//blah%' large_file.sql
It is a good choice because it doesn't read the whole file into memory at once to change it. Quoting the manual:
A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline). While in some ways similar to an editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient. But it is sed's ability to filter text in a pipeline which particularly distinguishes it from other types of editors.
See the relevant section of the sed manual. A small explanation follows:
-i.bak enables in place editing leaving a backup copy with .bak extension
s%foo%bar% uses s, the substitution command, which replaces matches of the first string between the % signs, 'foo', with the second string, 'bar'. It's usually written as s/foo/bar/, but because your strings have plenty of slashes, it's more convenient to use a different delimiter so you avoid having to escape them.
Example
vinko@mithril:~$ sed -i.bak -e 's%C://temp%//home//some//blah%' a.txt
vinko@mithril:~$ more a.txt
//home//some//blah
D://temp
//home//some//blah
D://temp
vinko@mithril:~$ more a.txt.bak
C://temp
D://temp
C://temp
D://temp
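One thing the example does not show: without the g flag, s replaces only the first match on each line. If a line of the dump can contain the path more than once, something like the following sketch may be closer to "every occurrence". The scratch directory and file contents here are made up purely for illustration.

```shell
# Hypothetical scratch file, created only for this demonstration.
tmpdir=$(mktemp -d)
printf 'C://temp and again C://temp\nD://temp\n' > "$tmpdir/a.txt"

# Without g: only the first match on each line is replaced (output captured, file untouched).
first_only=$(sed -e 's%C://temp%//home//some//blah%' "$tmpdir/a.txt")

# With g: every match on each line is replaced; -i.bak keeps the original as a.txt.bak.
sed -i.bak -e 's%C://temp%//home//some//blah%g' "$tmpdir/a.txt"

printf '%s\n' "$first_only"
cat "$tmpdir/a.txt"
```

Note that the match is case-sensitive, so C://temp and c://temp are treated as different strings.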
Solution 2
Just for completeness, here is an in-place replacement using perl:
perl -i -p -e 's{c://temp}{//home//some//blah}g' mysql.dmp
No backslash escapes required either. ;)
Solution 3
Try sed? Something like:
sed 's/c:\/\/temp/\/\/home\/\/some\/\/blah/' mydump.sql > fixeddump.sql
Escaping all those slashes makes this look horrible though, here's a simpler example which changes foo to bar.
sed 's/foo/bar/' mydump.sql > fixeddump.sql
As others have noted, you can choose your own delimiter, which would prevent the leaning toothpick syndrome in this case:
sed 's|c://temp|//home//some//blah|' mydump.sql > fixeddump.sql
The clever thing about sed is that it operates on a stream rather than on the file all at once, so you can process huge files using only a modest amount of memory.
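Because sed works on a stream, it can also sit in the middle of a pipeline, so a compressed dump never has to be unpacked to disk. A rough sketch, assuming the dump is gzip-compressed (the question doesn't say so; the file names and the one-line dump below are invented for the demonstration):

```shell
# Hypothetical compressed dump, created only for this demonstration.
tmpdir=$(mktemp -d)
printf 'SET dir = "c://temp";\n' | gzip > "$tmpdir/mydump.sql.gz"

# Decompress, substitute, and recompress in one streaming pass.
gzip -dc "$tmpdir/mydump.sql.gz" \
  | sed 's|c://temp|//home//some//blah|g' \
  | gzip > "$tmpdir/fixeddump.sql.gz"

gzip -dc "$tmpdir/fixeddump.sql.gz"
```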
Solution 4
There's also a non-standard UNIX utility, rpl, which does the exact same thing that the sed examples do; however, I'm not sure whether rpl operates streamwise, so sed may be the better option here.
Solution 5
perl -pi -e 's#c://temp#//home//some//blah#g' yourfilename
-p wraps the script in a loop: Perl reads the specified file line by line, runs the regex search and replace on each line, and prints the result.
-i tells Perl to edit the file in place. It should be used in conjunction with the -p flag.
-e just means execute this Perl code.
Good luck
rockstardev
Updated on July 26, 2022
Comments
-
rockstardev over 1 year
I have a huge SQL file that gets executed on the server. The dump is from my machine and in it there are a few settings relating to my machine. So basically, I want every occurrence of "c://temp" to be replaced by "//home//some//blah".
How can this be done from the command line?
-
dalloliogm over 14 years
You can use a different character to avoid having to quote the slashes, for example sed -e "s%C://temp%/home//some//blah%". Also, the -i option lets you edit the file in place once you are sure of the options.
-
dalloliogm over 14 years
You missed the last underscore: "s_c://temp/_/home//some//blah_"
-
Vinko Vrsalovic over 14 years
Heh, by any chance, are you a friend of the developer of rpl? :-)
-
rockstardev over 14 years
This is the command I'm typing: sed -i.bak -e 's%C:\\temp\%/home/liveon/public_html/tmp' liveon.sql and this is the error I'm getting: sed: -e expression #1, char 41: unterminated `s' command Anyone?
-
Meredith L. Patterson over 14 years
Nope, never heard of the guy outside of the util; it came in handy for doing a batch-replace job on a few thousand text files once and I've kept it in my toolbox.
-
Telemachus over 14 years
Please note that if you use the -i flag without an extension, you get no backup. If you want a backup, try -i.bak which will do the in-place edit and give you a backup of the original as original.bak, pretty much for free.
-
Telemachus over 14 years
It would be worth saying why you recommend it in this case (or why you might, since you half take back the recommendation). That is, rather than just throw up the name of a utility, tell us what you liked about it, please.
-
Tyler McHenry over 14 years
rpl is nice for simple replacements because it has a much more user-friendly syntax than the combination of sed and find that it replaces. It also has a neat dry-run feature where it will tell you what it would replace without actually doing the replacement. Its main limitation is that it only does straight replacements and no regular expressions.
-
Dave Jarvis over 14 years
Also, RD, make sure to escape backslashes properly.
-
Meredith L. Patterson over 14 years
@Telemachus - Tyler nailed it.
-
jrockway over 14 years
I let my version control system handle making the backups.
-
Telemachus over 14 years
@Jrockway: that's lovely for you I'm sure, but it assumes that the files in question are under version control and that you know what -i.bak does and have chosen not to use it. I just wish people who recommend the -i switch would take two seconds to explain the difference between -i and -i.bak. It will really hurt if the files you play with are not under version control and you make a simple typo (e.g., forget the -p flag).
-
humkins over 10 years
Thank you Paul! IntelliJ IDEA goes crazy doing this for tens of minutes, whereas with sed it takes just 1 sec to replace a backslash with a double backslash in my SQL file.