Modifying a character in a string in Lua

17,908

Solution 1

Strings in Lua are immutable. That means that any solution that replaces text in a string must end up constructing a new string with the desired content. For the specific case of replacing a single character with some other content, you will need to split the original string into a prefix part and a postfix part, and concatenate them back together around the new content.

This variation on your code:

function replace_char(pos, str, r)
    return str:sub(1, pos-1) .. r .. str:sub(pos+1)
end

is the most direct translation to straightforward Lua. It is probably fast enough for most purposes. I've fixed the bug in the prefix: it should be the first pos-1 chars, whereas str:sub(pos, pos - 1) always yields an empty string. I've also taken advantage of the fact that if the last argument to string.sub is missing, it is assumed to be -1, which is equivalent to the end of the string.
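As a quick check of the behavior described above, here is a minimal usage sketch. The sample strings are arbitrary, and the function is repeated only so the snippet runs on its own.

```lua
-- replace_char exactly as given in the answer above
function replace_char(pos, str, r)
    return str:sub(1, pos - 1) .. r .. str:sub(pos + 1)
end

local original = "aaaaaa"
local changed = replace_char(2, original, "X")

print(changed)   -- aXaaaa
print(original)  -- aaaaaa (the original string is untouched)

-- The replacement does not have to be exactly one character long.
print(replace_char(1, "abc", "XYZ"))  -- XYZbc
print(replace_char(3, "abc", ""))     -- ab

-- The prefix bug in the question: a range whose end precedes its start is empty.
print(("aaaaaa"):sub(2, 1) == "")     -- true
```

Because strings are immutable, the caller has to use the return value; nothing is modified in place.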

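But do note that it creates temporary strings that will hang around in the string store until garbage collection eats them. The temporaries for the prefix and postfix can't be avoided in any solution. Depending on the implementation, the first .. operator may also create a temporary to be consumed by the second, although the reference Lua compiler folds a chain of .. operators in one expression into a single CONCAT instruction, which avoids that intermediate string.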

It is possible that one of two alternate approaches could be faster. The first is the solution offered by Paŭlo Ebermann, but with one small tweak:

function replace_char2(pos, str, r)
    return ("%s%s%s"):format(str:sub(1,pos-1), r, str:sub(pos+1))
end

This uses string.format to do the assembly of the result in the hopes that it can guess the final buffer size without needing extra temporary objects.

But do beware that string.format is likely to have issues with any \0 characters in a string that passes through its %s format. Since it is implemented in terms of standard C's sprintf() function, it is reasonable to expect it to cut the substituted string off at the first \0. The Lua 5.1 reference manual says that string.format does not accept strings containing embedded zeros, except as arguments to the %q option. (Noted by user Delusional Logic in a comment.)
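One way to find out whether this matters on a given interpreter is to run both variants on a string with an embedded \0 and compare the results. This is only a sketch: the input is made up, and the outcome for replace_char2 is deliberately not stated because it depends on the Lua version in use.

```lua
function replace_char(pos, str, r)
    return str:sub(1, pos - 1) .. r .. str:sub(pos + 1)
end

function replace_char2(pos, str, r)
    return ("%s%s%s"):format(str:sub(1, pos - 1), r, str:sub(pos + 1))
end

local input = "ab\0cd"      -- 5 bytes, with a NUL byte in the middle
local expected = "aX\0cd"

local via_concat = replace_char(2, input, "X")
local via_format = replace_char2(2, input, "X")

-- The .. operator works on length-counted strings, so the NUL survives.
print(#via_concat, via_concat == expected)

-- Whether this line reports the same length depends on the interpreter.
print(#via_format, via_format == expected)
```

If the two lines differ, the .. or table.concat variants are the safer choice for binary data.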

A third alternative that comes to mind is this:

function replace_char3(pos, str, r)
    return table.concat{str:sub(1,pos-1), r, str:sub(pos+1)}
end

table.concat efficiently concatenates a list of strings into a final result. It has an optional second argument, the text to insert between the strings, which defaults to "" and so suits our purpose here.
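table.concat tends to pay off when many pieces are joined at once. The sketch below assumes a hypothetical helper, replace_chars, that is not part of the original answer: it takes a table mapping positions to replacement text and assembles the result in a single table.concat call. It also shows the optional separator argument.

```lua
-- Hypothetical helper: replacements maps a position to its replacement text.
function replace_chars(str, replacements)
    local pieces = {}
    for i = 1, #str do
        -- Use the replacement if one is given, else keep the original character.
        pieces[i] = replacements[i] or str:sub(i, i)
    end
    return table.concat(pieces)
end

print(replace_chars("aaaaaa", { [2] = "X", [5] = "YY" }))  -- aXaaYYa

-- The optional second argument is inserted between the pieces.
print(table.concat({ "a", "b", "c" }, ", "))               -- a, b, c
```

Splitting into one piece per character keeps the sketch short; for long strings with few replacements, slicing only around the replaced positions would create fewer pieces.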

My guess is that unless your strings are huge and you do this substitution frequently, you won't see any practical performance differences between these methods. However, I've been surprised before, so profile your application to verify there is a bottleneck, and benchmark potential solutions carefully.
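A rough way to compare the three variants is a timing loop around os.clock. This is a sketch under stated assumptions: time_it is a hypothetical helper, and the string length and iteration count are arbitrary. The numbers only mean something for your own interpreter and data.

```lua
function replace_char(pos, str, r)
    return str:sub(1, pos - 1) .. r .. str:sub(pos + 1)
end

function replace_char2(pos, str, r)
    return ("%s%s%s"):format(str:sub(1, pos - 1), r, str:sub(pos + 1))
end

function replace_char3(pos, str, r)
    return table.concat{str:sub(1, pos - 1), r, str:sub(pos + 1)}
end

-- Hypothetical timing helper; input size and iteration count are arbitrary.
local function time_it(label, fn, iterations)
    local str = ("a"):rep(1000)
    local start = os.clock()
    for _ = 1, iterations do
        fn(500, str, "X")
    end
    print(("%-14s %.3f s"):format(label, os.clock() - start))
end

time_it("..", replace_char, 100000)
time_it("string.format", replace_char2, 100000)
time_it("table.concat", replace_char3, 100000)
```

os.clock reports CPU time, so run the script several times and treat small differences as noise.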

Solution 2

You should use pos inside your function instead of the literals 1 and 3, but apart from this it looks good. Since Lua strings are immutable, you can't really do much better than this.

Maybe

 ("%s%s%s"):format(str:sub(1,pos-1), r, str:sub(pos+1, str:len()))

is more efficient than the .. operator, but I doubt it. If it turns out to be a bottleneck, measure it (and then decide whether to implement this replacement function in C).

Author: dotminic

Updated on July 29, 2022

Comments

  • dotminic
    dotminic almost 2 years

    Is there any way to replace a character at position N in a string in Lua?

    This is what I've come up with so far:

    function replace_char(pos, str, r)
        return str:sub(pos, pos - 1) .. r .. str:sub(pos + 1, str:len())
    end
    
    str = replace_char(2, "aaaaaa", "X")
    print(str)
    

    I can't use gsub either as that would replace every capture, not just the capture at position N.

  • Arrowmaster
    Arrowmaster about 13 years
    Yes, the .. operator is the slowest way to concatenate strings, since a new string is created for every .. operator. Faster methods include string.format and table.concat. This shouldn't cause any noticeable effects, though, unless you are working with very large strings or many concatenation operations. For example, I had a script using over 500MB of memory to process a file of less than 1MB by using around 5 .. operators per line of input while sorting and reconstructing the input as output. Changing it to store strings in a table and call table.concat at the end made it so fast I didn't even bother measuring.
  • Paŭlo Ebermann
    Paŭlo Ebermann about 13 years
    @Arrowmaster: Do you know that in a .. b .. c there are two (instead of only one) new strings created, or do you simply assume this? In principle this could be optimized by the compiler/interpreter to create only one new string, like it is done in Java for the + operator. Your example is another case, since there you really have to create new strings with every statement.
  • dotminic
    dotminic about 13 years
    @Paŭlo Ebermann yeah I just copied the code, forgot to remove the literals. @Arrowmaster @Paŭlo Ebermann I'll compare the .. operator to the format method. Thanks for the insight.
  • Arrowmaster
    Arrowmaster about 13 years
    @Paŭlo: The reference version of Lua does not have much if any compiler optimizations. I'm not sure about other implementations such as LuaJIT.
  • dotminic
    dotminic about 13 years
    Thanks for the in depth explanation
  • Alexander Gladysh
    Alexander Gladysh about 13 years
    You need parens around "%s%s%s" here.
  • Alexander Gladysh
    Alexander Gladysh about 13 years
    About optimizations: as far as I remember, standard Lua does try to transform all .. concatenations in a single expression into a single VM instruction (up to a point). So a .. b .. c does not create an intermediate string. (But a .. (b .. c) should create one.)
  • Alexander Gladysh
    Alexander Gladysh about 13 years
    And usually table.concat (and the table creation it requires) is worth it only in loops. If you have a single expression, go for the .. operator. (And, anyway, you should not try to optimize prematurely; write it in the most concise way first, then profile and optimize later.)
  • sylvanaar
    sylvanaar about 13 years
    Lua has a special opcode "CONCAT" which does not create intermediate strings. Use of parenthesis does cause intermediate strings to be created either.
  • Igorio
    Igorio over 11 years
    Does or doesn't? The 'either' is throwing me.
  • Paŭlo Ebermann
    Paŭlo Ebermann over 11 years
    @RossCharette: I suppose it's "doesn't". But you should include an @sylvanaar in your message so he gets a notification about it.
  • Delusional Logic
    Delusional Logic over 10 years
    This is old, but I just got done solving a minor bug in some code I wrote. It turns out that the replace_char2 method doesn't insert null (\0) chars.
  • RBerteig
    RBerteig about 10 years
    @DelusionalLogic Good point. string.format is based solidly on standard C's sprintf() function, and is likely to have issues with embedded NUL bytes.