Modifying a character in a string in Lua
Solution 1
Strings in Lua are immutable. That means that any solution that replaces text in a string must end up constructing a new string with the desired content. For the specific case of replacing a single character with some other content, you will need to split the original string into a prefix part and a postfix part, and concatenate them back together around the new content.
This variation on your code:
function replace_char(pos, str, r)
return str:sub(1, pos-1) .. r .. str:sub(pos+1)
end
is the most direct translation to straightforward Lua. It is probably fast enough for most purposes. I've fixed the bug that the prefix should be the first pos-1 chars, and taken advantage of the fact that if the last argument to string.sub is missing it is assumed to be -1, which is equivalent to the end of the string.
But do note that it creates a number of temporary strings that will hang around in the string store until garbage collection eats them. The temporaries for the prefix and postfix can't be avoided in any solution. But this may also have to create a temporary for the first .. operator to be consumed by the second.
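To see the corrected prefix and the default end argument of string.sub in action, here is a small usage sketch. The function is repeated so the snippet runs on its own; the sample string is only an illustration.

```lua
-- Same function as above, repeated so this snippet is self-contained.
function replace_char(pos, str, r)
  return str:sub(1, pos - 1) .. r .. str:sub(pos + 1)
end

local s = "aaaaaa"

-- Omitting the end index of string.sub is the same as passing -1.
print(s:sub(3) == s:sub(3, -1))   -- true

print(replace_char(2, s, "X"))    -- aXaaaa
print(replace_char(1, s, "X"))    -- Xaaaaa (empty prefix)
print(replace_char(#s, s, "X"))   -- aaaaaX (empty postfix)
```

Note that the first and last positions need no special handling, because string.sub returns an empty string when the requested range is empty.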
It is possible that one of two alternate approaches could be faster. The first is the solution offered by Paŭlo Ebermann, but with one small tweak:
function replace_char2(pos, str, r)
return ("%s%s%s"):format(str:sub(1,pos-1), r, str:sub(pos+1))
end
This uses string.format to do the assembly of the result in the hopes that it can guess the final buffer size without needing extra temporary objects.
But do beware that string.format is likely to have issues with any \0 characters in any string that it passes through its %s format. Specifically, since it is implemented in terms of standard C's sprintf() function, it would be reasonable to expect it to terminate the substituted string at the first occurrence of \0. (Noted by user Delusional Logic in a comment.)
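A quick way to check whether this caveat affects your interpreter is to compare result lengths when the replacement is a zero byte. This is only a sketch: the helper names build_with_concat and build_with_format are made up for illustration, and the length reported for the string.format version is expected to vary between Lua versions, so no value is claimed for it here.

```lua
-- Hypothetical helpers, named only for this comparison.
function build_with_concat(prefix, r, postfix)
  return prefix .. r .. postfix
end

function build_with_format(prefix, r, postfix)
  return ("%s%s%s"):format(prefix, r, postfix)
end

local joined    = build_with_concat("ab", "\0", "cd")
local formatted = build_with_format("ab", "\0", "cd")

print(#joined)      -- 5: the .. operator keeps the embedded zero byte
print(#formatted)   -- compare with the line above on your interpreter
```

If the two lengths differ, prefer the .. or table.concat variants for strings that may contain \0.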
A third alternative that comes to mind is this:
function replace_char3(pos, str, r)
return table.concat{str:sub(1,pos-1), r, str:sub(pos+1)}
end
table.concat efficiently concatenates a list of strings into a final result. It has an optional second argument which is text to insert between the strings, which defaults to "", which suits our purpose here.
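As a brief illustration of that optional separator argument (the sample pieces are arbitrary):

```lua
local parts = {"ab", "X", "cd"}

-- With no separator the pieces are joined directly.
print(table.concat(parts))        -- abXcd

-- With a separator, it is inserted between consecutive pieces.
print(table.concat(parts, "-"))   -- ab-X-cd
```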
My guess is that unless your strings are huge and you do this substitution frequently, you won't see any practical performance differences between these methods. However, I've been surprised before, so profile your application to verify there is a bottleneck, and benchmark potential solutions carefully.
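For a first rough comparison, a timing harness along these lines may help. It is a minimal sketch, not a rigorous benchmark: time_it is a hypothetical helper, the subject string and iteration count are arbitrary assumptions, and os.clock measures CPU time, so results will vary between machines and Lua implementations.

```lua
function replace_char(pos, str, r)
  return str:sub(1, pos - 1) .. r .. str:sub(pos + 1)
end

function replace_char2(pos, str, r)
  return ("%s%s%s"):format(str:sub(1, pos - 1), r, str:sub(pos + 1))
end

function replace_char3(pos, str, r)
  return table.concat{str:sub(1, pos - 1), r, str:sub(pos + 1)}
end

-- Hypothetical helper: CPU seconds taken by n calls of f on subject.
function time_it(f, subject, n)
  local start = os.clock()
  for _ = 1, n do
    f(500, subject, "X")
  end
  return os.clock() - start
end

local subject = ("a"):rep(1000)   -- arbitrary 1000-character test string
local n = 100000                  -- arbitrary iteration count

local candidates = {
  {"..",            replace_char},
  {"string.format", replace_char2},
  {"table.concat",  replace_char3},
}
for _, c in ipairs(candidates) do
  print(("%-14s %.3f s"):format(c[1], time_it(c[2], subject, n)))
end
```

Run it several times and with string sizes that resemble your real data before drawing any conclusions.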
Solution 2
You should use pos inside your function instead of literal 1 and 3, but apart from this it looks good. Since Lua strings are immutable you can't really do much better than this.
Maybe
("%s%s%s"):format(str:sub(1,pos-1), r, str:sub(pos+1, str:len()))
is more efficient than the .. operator, but I doubt it. If it turns out to be a bottleneck, measure it (and then decide whether to implement this replacement function in C).
dotminic
Updated on July 29, 2022
Comments
-
dotminic almost 2 years
Is there any way to replace a character at position N in a string in Lua.
This is what I've come up with so far:
function replace_char(pos, str, r)
return str:sub(pos, pos - 1) .. r .. str:sub(pos + 1, str:len())
end
str = replace_char(2, "aaaaaa", "X")
print(str)
I can't use gsub either as that would replace every capture, not just the capture at position N.
-
Arrowmaster about 13 years
Yes the .. operator is the slowest way to concatenate strings since a new string is created for every .. operator. Faster methods include string.format and table.concat. This shouldn't cause any noticeable effects though unless you are working with very large strings or many concatenation operations. For example I had a script using over 500MB of memory to process a less than 1MB file by using around 5 .. operators per line of input while sorting and reconstructing the input as output. Changing it to store strings in a table and table.concat at the end made it so fast I didn't even bother measuring.
-
Paŭlo Ebermann about 13 years
@Arrowmaster: Do you know that in a .. b .. c there are two (instead of only one) new strings created, or do you simply assume this? In principle this could be optimized by the compiler/interpreter to create only one new string, like it is done in Java for the + operator. Your example is another case, since there you really have to create new strings with every statement.
-
dotminic about 13 years
@Paŭlo Ebermann yeah I just copied the code, forgot to remove the literals. @Arrowmaster @Paŭlo Ebermann I'll compare the .. operator to the format method. Thanks for the insight.
-
Arrowmaster about 13 years
@Paŭlo: The reference version of Lua does not have much if any compiler optimizations. I'm not sure about other implementations such as LuaJIT.
-
dotminic about 13 years
Thanks for the in depth explanation
-
Alexander Gladysh about 13 years
You need parens around "%s%s%s" here.
-
Alexander Gladysh about 13 years
About optimizations: as far as I remember, standard Lua does try to transform all .. concatenations in a single expression to a single VM instruction (up to a point). So a .. b .. c does not create an intermediate string. (But a .. (b .. c) should create one.)
-
Alexander Gladysh about 13 years
And usually table.concat (and the table creation which it requires) is worth it only in loops. If you have a single expression, go for the .. operator. (And, anyway, you should not try to optimize prematurely; write it in the most concise way first, profile and optimize later.)
-
sylvanaar about 13 years
Lua has a special opcode "CONCAT" which does not create intermediate strings. Use of parenthesis does cause intermediate strings to be created either.
-
Igorio over 11 years
Does or doesn't? The 'either' is throwing me.
-
Paŭlo Ebermann over 11 years
@RossCharette: I suppose it's "doesn't". But you should include an @sylvanaar in your message so he gets a notification about it.
-
Delusional Logic over 10 years
This is old. But I just got done solving a minor bug in some code I wrote. Turns out that the replace_char2 method doesn't insert null (\0) chars.
-
RBerteig about 10 years
@DelusionalLogic Good point. string.format is based solidly on standard C's sprintf() function, and is likely to have issues with embedded NUL bytes.