finding substrings of a string

34,095

Solution 1

To understand this, note that any substring needs two parameters: start index and end index.

For example: string str = "Hello World"; // length == 11

Lets begin by taking only one-character substring at time: H, e, l, l, o, , W, o, r, l, d. Then by taking 2 characters at time: He, el, ll, lo, o , W, Wo, or, rl, ld. Then by taking 3 chars: Hel, .. etc.

So the total substring count is 11 + 10 + 9 + .. + 1 and, in general, for i from 1 to n, we have n - i + 1 substrings. By summition:

Sigma (n + 1 - i) from i = 1 to n, yields n * (n + 1) - n * (n + 1) / 2 which is n * (n + 1) / 2

Solution 2

A substring is determined by where it starts and where it ends in the original string. For any substring we have those two end points. Reversely, for any two characters in the string there is exactly one substring that starts and ends at those points.

Thus the number of all substrings is the number of all pairs of (not necessary distinct) characters.

There are n*(n-1)/2 pairs of distinct characters. You also need to add the non-distinct pairs, which are n.

So the total number is n * (n-1) / 2 + n = n * (n+1) / 2.

Solution 3

Well, It is the sum of all substrings of length 1 (exactly n), plus all substrings of length 2 (n-1), plus all substrings of length n (which is a proper string). Then, you have n + n-1 + n-2 ... + 1 = (n * (n+1)) / 2

The sum can be computed using natural induction and it is also known due to Gauss who solved this sum when he was at school.

Solution 4

A substring is defined by it's two endpoints,basically we can consider substrings as slices of a string. Let's understand it with an example. consider string "abc" of length 3

a b c

To slice the string we have 4 positions starting from before a to the end of string after c.From these 4 available options we have to choose 2 positions to create a slice i.e a substring which is equal to 4C2 == n+1C2==n*(n+1)/2

Solution 5

Suppose you want to find out substring of "abc", that would be {"a","ab","abc","b","bc","c"} we would compute it by the following code

for(int i=0;i<len;i++){
        for(int j=i;j<len;j++){

        }
    }

Looking at this code, it is easy to tell that the loops run as

3 times in the first run, 2 times in the second run,1 times in the third run

As it returns a substring everytime,therefore it is equal to summation of n that is = n(n + 1) / 2

Share:
34,095

Related videos on Youtube

Chander Shivdasani
Author by

Chander Shivdasani

A Programmer in race with the universe. While the universe is busy creating fools, im busy writing complex programs.

Updated on July 24, 2022

Comments

  • Chander Shivdasani
    Chander Shivdasani almost 2 years

    For a string of length n, the formula to compute all the substrings are: n(n+1)/2 Can someone help me out in intuitively understanding this formula?

    Wikipedia says: "The number of substrings of a string of length where symbols only occur once, is the number of ways to choose two distinct places between symbols to start/end the substring"

  • madCode
    madCode almost 9 years
    did you mean n*(n+1)/2?