Finding longest substring in alphabetical order

17,305

Solution 1

There are many things to improve in your code but making minimum changes so as to make it work. The problem is you should have if last_pos(i) != None: in your for loop (i instead of i+1) and you should compare diff (not diff - 1) against maxLen. Please read other answers to learn how to do it better.

for i in range(len(s)):
    if last_pos(i) != None:
        diff = last_pos(i) - i + 1
    if diff > maxLen:
        maxLen = diff
        startPos = i
        endPos = startPos + diff - 1

Solution 2

Here. This does what you want. One pass, no need for recursion.

def find_longest_substring_in_alphabetical_order(s):
    groups = []
    cur_longest = ''
    prev_char = ''
    for c in s.lower():
        if prev_char and c < prev_char:
            groups.append(cur_longest)
            cur_longest = c
        else:
            cur_longest += c
        prev_char = c
    return max(groups, key=len) if groups else s

Using it:

>>> find_longest_substring_in_alphabetical_order('hixwluvyhzzzdgd')
'luvy'
>>> find_longest_substring_in_alphabetical_order('eseoojlsuai')
'jlsu'
>>> find_longest_substring_in_alphabetical_order('drurotsxjehlwfwgygygxz')
'ehlw'

Note: It will probably break on strange characters, has only been tested with the inputs you suggested. Since this is a "homework" question, I will leave you with the solution as is, though there is still some optimization to be done, I wanted to leave it a little bit understandable.

Solution 3

You can use nested for loops, slicing and sorted. If the string is not all lower-case then you can convert the sub-strings to lower-case before comparing using str.lower:

def solve(strs):
  maxx = ''
  for i in xrange(len(strs)):
      for j in xrange(i+1, len(strs)):
          s = strs[i:j+1]
          if ''.join(sorted(s)) == s:
              maxx = max(maxx, s, key=len)
          else:
              break
  return maxx

Output:

>>> solve('hixwluvyhzzzdgd')
'luvy'
>>> solve('eseoojlsuai')
'jlsu'
>>> solve('drurotsxjehlwfwgygygxz')
'ehlw'

Solution 4

Here is a single pass solution with a fast loop. It reads each character only once. Inside the loop operations are limited to

  • 1 string comparison (1 char x 1 char)
  • 1 integer increment
  • 2 integer subtractions
  • 1 integer comparison
  • 1 to 3 integer assignments
  • 1 string assignment

No containers are used. No function calls are made. The empty string is handled without special-case code. All character codes, including chr(0), are properly handled. If there is a tie for the longest alphabetical substring, the function returns the first winning substring it encountered. Case is ignored for purposes of alphabetization, but case is preserved in the output substring.

def longest_alphabetical_substring(string):
    start, end = 0, 0     # range of current alphabetical string
    START, END = 0, 0     # range of longest alphabetical string yet found
    prev = chr(0)         # previous character

    for char in string.lower():   # scan string ignoring case
        if char < prev:       # is character out of alphabetical order?
            start = end       #     if so, start a new substring 
        end += 1              # either way, increment substring length 

        if end - start > END - START:  # found new longest?  
            START, END = start, end    #     if so, update longest 
        prev = char                    # remember previous character

    return string[START : END]   # return longest alphabetical substring 

Result

>>> longest_alphabetical_substring('drurotsxjehlwfwgygygxz')
'ehlw'
>>> longest_alphabetical_substring('eseoojlsuai')
'jlsu'
>>> longest_alphabetical_substring('hixwluvyhzzzdgd')
'luvy'
>>>

Solution 5

Simple and easy.

Code :

s = 'hixwluvyhzzzdgd' 
r,p,t = '','',''
for c in s:
    if p <= c:
        t += c
        p = c
    else:
        if len(t) > len(r):
            r = t
        t,p = c,c
if len(t) > len(r):
    r = t
print 'Longest substring in alphabetical order is: ' + r

Output :

Longest substring in alphabetical order which appeared first: luvy
Share:
17,305
Eugene S
Author by

Eugene S

Professional Software Testing Engineer specializing in automation with Open Source tools. Interested in: Test automation: Selenium, BDD, Cucumber Programming: Java, Python, Spring Data scraping Audio: recording/mixing

Updated on August 10, 2022

Comments

  • Eugene S
    Eugene S over 1 year

    EDIT: I am aware that a question with similar task was already asked in SO but I'm interested to find out the problem in this specific piece of code. I am also aware that this problem can be solved without using recursion.

    The task is to write a program which will find (and print) the longest sub-string in which the letters occur in alphabetical order. If more than 1 equally long sequences were found, then the first one should be printed. For example, the output for a string abczabcd will be abcz.

    I have solved this problem with recursion which seemed to pass my manual tests. However when I run an automated tests set which generate random strings, I have noticed that in some cases, the output is incorrect. For example:

    if s = 'hixwluvyhzzzdgd', the output is hix instead of luvy

    if s = 'eseoojlsuai', the output is eoo instead of jlsu

    if s = 'drurotsxjehlwfwgygygxz', the output is dru instead of ehlw

    After some time struggling, I couldn't figure out what is so special about these strings that causes the bug.

    This is my code:

    pos = 0
    maxLen = 0
    startPos = 0
    endPos = 0
    
    
    def last_pos(pos):
        if pos < (len(s) - 1):
            if s[pos + 1] >= s[pos]:
                pos += 1
                if pos == len(s)-1:
                    return len(s)
                else:
                    return last_pos(pos)
            return pos
    
    
    for i in range(len(s)):
        if last_pos(i+1) != None:
            diff = last_pos(i) - i
        if diff - 1 > maxLen:
            maxLen = diff
            startPos = i
            endPos = startPos + diff
    
    print s[startPos:endPos+1]
    
  • Eugene S
    Eugene S over 10 years
    Yes! You are right about the diff - 1 > maxLen comparison being wrong. It forgot to change it after playing with the code. BTW, you can delete the +1 and -1 and the result will not change.
  • ajay
    ajay over 10 years
    yes but just for making out the sense of what they are, I kept +1 and -1.
  • Gareth Rees
    Gareth Rees over 10 years
    η-reduction: instead of lambda x: len(x), just write len.
  • Inbar Rose
    Inbar Rose over 10 years
    @GarethRees You are right, I am surprised I didn't notice that one. Thanks.
  • Nic
    Nic almost 9 years
    Please explain how to fix the problem, rather than posting the fixed code.
  • Rohit
    Rohit almost 9 years
    Interestingly, this fails when string is 'abcdefghijklmnopqrstuvwxyz'. The error occurs because the groups is empty
  • avina k
    avina k almost 9 years
    supports testcase 'abcdefghijklmnopqrstuvwxyz'
  • Inbar Rose
    Inbar Rose almost 9 years
    @Rohit You are Correct! Good find, solution is to simply return s if there are no groups.
  • mbomb007
    mbomb007 over 7 years
    Code-only answers are discouraged. Please add an explanation of how your code works and solves the problem.
  • Tyler
    Tyler over 6 years
    Welcome to SO. A good answer describes what a code snippet is doing, why it is important, etc. Just giving a segment of code does not contribute to the goal of the site for future readers to find this answer very helpful