Convert string to number & vice versa complexity

20,265

Solution 1

Numeric strings are numbers formatted in positional notation - so the value of each digit multiplied by a power of the base needs to be taken into account in order to convert the number into a binary format.

So yeah, it's an O(N) operation because the running time increases linearly as more digits are added. However, in practice N may be limited by whatever numeric data-types the language supports (e.g. int32_t, int64_t). But if arbitrary precision number types are used (which some languages, like Python, use by default) then there is no limit to the number of digits (other than available memory obviously).

Solution 2

To convert to a number you always have to read all digits. So it's at least O(n).

Now doing something like (pseudocode)

a = 0
foreach digit in string
do
   a = 10 * a + digit
end

Is O(n). So the complexity is O(n)

Solution 3

If you're converting a number N to a string. It takes O(log(N)) with base 10. (If you divide by 10 and keep the remainder) If you're converting a string with length N, then it takes O(N). (If you use the algorithm whic keeps adding to your number 10^(N)*digit(N))

If you use functions which are not yours (let's say for string), you can only expect them to be slower.

Solution 4

I'm reasonably certain that dealing with pure numeric operators (in c++ and c# I think it would be the "%" modulus operator) is going to be more efficient if coded correctly because at some level you have to check for similar features (does the end match the beginning) and performing a conversion between string and number can only add to the complexity of the operation if you can do the same thing without performing that conversion.

That said, I would not worry about the performance impact of converting between numbers and strings because it's probably insignificant compared to the performance impact of most other areas of a program. Numeric types are limited to 64 bits, which puts a relatively low cap on the number of digits you can plan to parse anyway, unless you are implementing/using custom-coded large-number types.

You don't have to worry about the complexity being O(n) where n is the magnitude of the number. It would be more like O(n) where n is the number of digits (which has the low cap I mentioned) or (as mentioned in another reply) O(log(n)) if n is the magnitude of the number. Relatively insignificant performance impact.

Now if, as you suggest, you have no limit on N (which is not possible because with 2 GB of RAM you can only store numbers with up to 2 billion digits), then we might have to think more about the performance of performing mathematical operators. Consider the performance of the "%" and "/" operator on this large number type. But then realize that in order to convert the number to a string, it's basically using those same operators anyway. Once again, you can't beat handling it as a number directly if you do it right.

Solution 5

C# and C/C++ don't have any special information in the strings which represents the (possible) numerical value. Therefore they need to parse the string digit by digit when converting.

However, the number of digits is limited, so we've got just O(1): the conversion time is bounded (usually by the conversion of the biggest number). For a 32-bit int, the conversion has to consider maximum of 10 decimal digits (and possibly a sign).

Conversion from string is actually O(1) as well, because during parsing of it, it's enough to consider only a bounded number of characters (10+1 in case of 32-bit int).

Strictly speaking, we cannot use O-notation for the case of int-to-string conversion, since the maximal value of int is bounded. Anyway, the time needed for conversion (in both directions) is limited by a constant.

As @Charles suggests, other languages (Python) actually can use arbitrary-precision numbers. For parsing such numbers, the time is O(number of digits), which O(string length) and O(log(number)) for both conversions, respectively. With arbitrary-precision numbers, one cannot do it faster, since for both conversions every digit must be considered. For the conversions to/from a limited-precision numbers, the same O(1) reasoning applies. However I didn't profile the parsing in Python myself, so maybe a less efficient algorithm is used there.


EDIT: following the @Steve suggestion, I checked that parsing in C/C++ and C# skips the initial whitespace, so the time for string->int conversion is actually O(input length). In case it's known that the string is trimmed, the conversion is again O(1).

Share:
20,265
Srikar Appalaraju
Author by

Srikar Appalaraju

Hi I am Srikar, I think I have been programmed to feel that I need to write this message...

Updated on July 09, 2022

Comments

  • Srikar Appalaraju
    Srikar Appalaraju almost 2 years

    What would be the complexity of converting a string to its equivalent number or vice versa? Does it change depending on programming language?

    On the face of it, one needs to traverse the entire string to convert it to a number, so it is O(n), or is some typecasting used?

    This doubt came about when I was writing a routine to check if a given number is a palindrome or not. One approach would be to keep dividing the number by the base (here 10), accumulate digits, and put them together at the end. Example: 309/10=rem(9), 30/10=rem(0), 3/10=rem(3). we get 903.

    Another approach I took was to convert this number into a string, and since strings have loads of member functions to split, reverse etc., the code was a lot shorter and cleaner, but is this the best way to do this?

  • Vlad
    Vlad over 13 years
    strictly speaking, it's wrong. for O(n) notation, n must tend to infinity.
  • Charles Salvia
    Charles Salvia over 13 years
    @Vlad, N does tend towards infinity if arbitrary precision numbers are used.
  • Vlad
    Vlad over 13 years
    @Charles: you are right. So this makes sense only for the languages with built-in arbitrary-precision numbers (this excludes C/C++ and C#).
  • Fred Foo
    Fred Foo over 13 years
    Both take O(N) where N is the number of digits. This is the proper way to analyze arithmetic algorithms.
  • Fred Foo
    Fred Foo over 13 years
    @Vlad: who said the bigints must be "built-in"?
  • Fred Foo
    Fred Foo over 13 years
    Of course we can use big O notation, we just get O(1) when not using bigints.
  • Yakov Galka
    Yakov Galka over 13 years
    @Charles and @Vlad: but for arbitrary-precision this becomes O(n^2) since calculating the multiplication 10*a is O(n) operation.
  • Vlad
    Vlad over 13 years
    @larsmans: I am explicitly speaking about C/C++ and C#, which don't have built-in bigints.
  • Vlad
    Vlad over 13 years
    @larsmans: the OP talks about the built-in conversions. The built-in conversions cannot work with non-built-in types. :-P
  • Yakov Galka
    Yakov Galka over 13 years
    @Vlad: for arbitrary-precision it's O((number of digits)^2), see the comment to @Peters answer.
  • Vlad
    Vlad over 13 years
    @ybungalobill: I don't see why it's true. After all, it depends on the internal representation of bigints. If bigints consist of several chunks (corresponding to the representation N = n_0 + 10^kn_1 + 10^2kn_2 etc.), the multiplication by 10 can be very efficient. Did you measure the performance of parsing in Python?
  • Yakov Galka
    Yakov Galka over 13 years
    @Vlad: when you're talking about such a conversion you usually assume that the base of the string (10) is different from the base used by the internal representation. Otherwise it's possible to make O(1). Ad we are talking about complexity, not performance.
  • Vlad
    Vlad over 13 years
    @ybungalobill: I would personally choose an internal representation which allows the typical operations to run fast (even sacrificing several bits). Parsing is definitely one of such operations. So, did you test the Python's performance yourself, or you have another source of information?
  • Yakov Galka
    Yakov Galka over 13 years
    @Vlad: What you say is that you prefer to avoid the conversion by storing the decimal string itself... Well, I'm sure that's what's done in some cases, but it's sort of cheating when answering this question. I don't use python, but some applications (e.g. GIMPS) must use binary representation, so there the complexity of the actual conversion is interesting.
  • Vlad
    Vlad over 13 years
    @ybungalobill: not necessary the decimal representation. You can store chunks which have values in interval 0..10^10-1 (which fits into a 32-bit int. Anyway, the OP asks about not theoretical minimum (which seems to be better with my representation), but about the real performance of the language.
  • Steve Jessop
    Steve Jessop over 13 years
    @Vlad: "it's enough to consider only a bounded number of characters" - unless your conversion skips initial whitespace ;-)
  • Vlad
    Vlad over 13 years
    @Steve: well, that's true. So, for C (sscanf) the performance is O(string size). :-(
  • Steve Jessop
    Steve Jessop over 13 years
    @Vlad: sure, but I was mostly joking. O(number of initial spaces in the string + constant) is still O(1) as long as the input doesn't have any whitespace to skip. If the input does have whitespace to skip then that's necessarily a linear-complexity operation whether we do it or our caller does. So skipping whitespace doesn't really "cost" anyone any complexity, just shifts it to us from our caller.
  • Vlad
    Vlad over 13 years
    @Steve: well, but theoretically this indeed makes the performance O(string size). Practically, there is a nice use case of trimmed strings :-)
  • Steve Jessop
    Steve Jessop over 13 years
    @Vlad: and in theory it's the theoretical performance that matters, but in practice it's the practical performance :-)
  • Vlad
    Vlad over 13 years
    @Steve: ... and for the OP's case we can be sure that there are no leading spaces, since the string is obtained by converting from the number.
  • synepis
    synepis over 13 years
    If the number is N=10^5. How many steps to convert it to a string? I think O(log(N)), because that is the number of digits.