Computing All The Possible Substrings of a Given String
Solution 1
Just use two for-loops:
generate substrings(string):
for start in [0,1,...,string.length-1]:
for end in [start,...,string.length-1]:
yield string[start...end]
You can also do it this way with two for-loops:
generate substrings(string):
for substringLength in [1,2,...,string.length]:
for start in range [0,1,...,string.length-substringLength]:
yield string[start...(start+substringLength-1)]
yield ""
You probably want to include the empty string ""
in the sequence you return as well, as it is a substring of all strings.
You also need to consider if it is valid to yield a duplicate string multiple times (e.g. do you return "ABA" twice as a substring of "ABABA"?). If the answer is no, merely make a hashtable called alreadyYielded
, and whenever you yield, abort if you've already yielded the string, otherwise add the value to the hashtable in case you see it again. For example:
seen = new HashTable()
...
substring = string[...]
if substring not in seen:
seen.add(substring)
yield substring
...
Solution 2
Here's a 2-cent answer:
for (indexOfFirstLetterOfString = 0; indexOfFirstLetterOfString < string.length; indexOfFirstLetterOfString++) {
for (indexOfLastLetterOfString = indexOfFirstLetterOfString + 1; indexOfLastLetterOfString < string.length; indexOfLastLetterOfString++) {
addToArrayOfStrings ( string.substring (indexOfFirstLetterOfString, indexOfLastLetterOfString - indexOfFirstLetterOfString))
incrementCounter();
}
}
To get the number of combinations, simply add a counter to the inner loop.
For instance, in perl, this might look like:
$a = "ABCDE";
$numberOfSubstrings = 0;
for ($indexOfFirstLetter = 0; $indexOfFirstLetter <= length($a); $indexOfFirstLetter++) {
for ($indexOfLastLetter = $indexOfFirstLetter + 1; $indexOfLastLetter <= length($a); $indexOfLastLetter++) {
print substr($a, $indexOfFirstLetter, $indexOfLastLetter - $indexOfFirstLetter) . "\n";
$numberOfSubStrings++;
}
}
print "Number of substrings: " . $numberOfSubStrings;
Related videos on Youtube
neilmarion
Updated on June 04, 2022Comments
-
neilmarion almost 2 years
Possible Duplicate:
How to find all substrings of a string in PHP
Find all subsets of a listHow can I compute all the possible substrings of a string? For example given a string ABCDE. All its possible substrings will be
A, B, C, D, E, AB, BC, CD, DE, ABC, BCD, CDE, ABCD, BCDE, ABCDE
Thanks! A pseudocode will be highly appreciated. :D
-
bobbymcr over 12 yearsThis type of question has been asked many times before: stackoverflow.com/questions/728972 , stackoverflow.com/questions/1592039 , stackoverflow.com/questions/5023081 , stackoverflow.com/questions/6780935 , etc. etc. Just search on "powerset" or "subset".
-
ninjagecko over 12 yearsI disagree with both Ken and bobbymcr. While it is true that the linked question is "how to find substrings of a string?" it seems to involve some heavy PHP with minimal explanation, as do the answers. All of the links bobbymcr linked are completely separate problems, though related (substrings are not subsets). Upon cursory inspection, I cannot find a similar language-agnostic question with clear answers given in pseudocode.
-
bobbymcr over 12 years@ninjagecko: On second reading, I see that you're right -- the OP seems to want something more specialized than pure subsets, but something like substrings with fully connected elements (e.g. from ABC -> AB and BC, but not AC).
-
-
ninjagecko over 12 yearsneilmarion: if you look at the number of strings the second loop returns, it looks like this:
LENGTH+(LENGTH-1)+(LENGTH-2)+...+1
, which is equal toLENGTH*(LENGTH+1)/2
-
DyingIsFun over 7 yearsI get a syntax error for
...
. What am I missing?