how to find the number of distinct subsequences of a string?
Solution 1
It's a classic dynamic programming problem.
Let:
dp[i] = number of distinct subsequences ending with a[i]
sum[i] = dp[1] + dp[2] + ... + dp[i]. So sum[n] will be your answer.
last[i] = last position of character i in the given string.
A null string has one subsequence, so dp[0] = 1
.
read a
n = strlen(a)
for i = 1 to n
dp[i] = sum[i - 1] - sum[last[a[i]] - 1]
sum[i] = sum[i - 1] + dp[i]
last[a[i]] = i
return sum[n]
Explanation
dp[i] = sum[i - 1] - sum[last[a[i]] - 1]
Initially, we assume we can append a[i]
to all subsequences ending on previous characters, but this might violate the condition that the counted subsequences need to be distinct. Remember that last[a[i]]
gives us the last position a[i]
appeared on until now. The only subsequences we overcount are those that the previous a[i]
was appended to, so we subtract those.
sum[i] = sum[i - 1] + dp[i]
last[a[i]] = i
Update these values as per their definition.
If your indexing starts from 0, use a[i - 1]
wherever I used a[i]
. Also remember to wrap your computations in a mod
function if you're going to submit code. This should be implemented like this:
mod(x) = (x % m + m) % m
In order to correctly handle negative values in some languages (such as C/C++).
Solution 2
There exists an easier solution to this problem.
The idea is: If all character of the string are distinct, total number of subsequences is 2^n.
Now, if we find any character that have already occurred before, we should consider its last occurrence only (otherwise sequence won't be distinct). So we have to subtract the number of subsequences due to its previous occurrence.
My implementation is like this:
read s
dp[0] = 1
len = strlen(s)
last[s.length()] = {-1} //declaring `last` array with same as length of string `s` and all elements initialized with -1.
for (i = 1; i <= len; i++)
{
dp[i] = (dp[i - 1] * 2)
if (last[s[i]] > 0) dp[i] = (dp[i] - dp[last[s[i]] - 1])
last[s[i]] = i
}
Admin
Updated on July 09, 2022Comments
-
Admin almost 2 years
Here is another spoj problem that asks how to find the number of distinct subsequences of a string ?
For example,
Input
AAA
ABCDEFG
CODECRAFTOutput
4
128
496How can I solve this problem ?
-
Paul Draper over 11 yearsThe dp array is actually unnecessary here, but it does help the explanation of this.
-
Minkesh Jain over 6 yearsThanks mostafiz. geeksforgeeks.org/count-distinct-subsequences it has complete solution/
-
user3797829 over 5 years@mostafiz why do we subtract -1 from
last[s[i]] - 1]
, i.e. why do we do thisdp[last[s[i]] - 1]
and not justdp[last[s[i]]]
-
ajaysinghnegi over 5 years@MostafizRahman Can you explain what do you mean by
last occurrence
in your answer? -
ajaysinghnegi over 5 years@AjaySinghNegi It means no. of subsequences when dp[i] was calculated with the present character appearing previously at that time. Hence, that many no of subsequences at that time should be deducted from the present no. of subsequences so that they are not calculated redundantly again at present calculation of
dp[i] = 2*dp[i-1]
. -
Mostafiz Rahman over 5 yearsas I've just accepted your edit, I seems like you got your answer, right?