Find the longest common starting substring in a set of strings

41,535

Solution 1

It's a matter of taste, but this is a simple javascript version: It sorts the array, and then looks just at the first and last items.

//longest common starting substring in an array

function sharedStart(array){
    var A= array.concat().sort(), 
    a1= A[0], a2= A[A.length-1], L= a1.length, i= 0;
    while(i<L && a1.charAt(i)=== a2.charAt(i)) i++;
    return a1.substring(0, i);
}

DEMOS

sharedStart(['interspecies', 'interstelar', 'interstate'])  //=> 'inters'
sharedStart(['throne', 'throne'])                           //=> 'throne'
sharedStart(['throne', 'dungeon'])                          //=> ''
sharedStart(['cheese'])                                     //=> 'cheese'
sharedStart([])                                             //=> ''
sharedStart(['prefix', 'suffix'])                           //=> ''

Solution 2

In Python:

>>> from os.path import commonprefix
>>> commonprefix('interspecies interstelar interstate'.split())
'inters'

Solution 3

You just need to traverse all strings until they differ, then take the substring up to this point.

Pseudocode:

loop for i upfrom 0
     while all strings[i] are equal
     finally return substring[0..i]

Common Lisp:

(defun longest-common-starting-substring (&rest strings)
  (loop for i from 0 below (apply #'min (mapcar #'length strings))
     while (apply #'char=
                  (mapcar (lambda (string) (aref string i))
                          strings))
     finally (return (subseq (first strings) 0 i))))

Solution 4

Ruby one-liner:

l=strings.inject{|l,s| l=l.chop while l!=s[0...l.length];l}

Solution 5

My Haskell one-liner:

import Data.List

commonPre :: [String] -> String
commonPre = map head . takeWhile (\(x:xs)-> all (==x) xs) . transpose

EDIT: barkmadley gave a good explanation of the code below. I'd also add that haskell uses lazy evaluation, so we can be lazy about our use of transpose; it will only transpose our lists as far as necessary to find the end of the common prefix.

Share:
41,535
mislav
Author by

mislav

Ruby and JavaScript master. Contributed to various opensource projects including Ruby on Rails and Prototype.js.

Updated on December 13, 2020

Comments

  • mislav
    mislav over 3 years

    This is a challenge to come up with the most elegant JavaScript, Ruby or other solution to a relatively trivial problem.

    This problem is a more specific case of the Longest common substring problem. I need to only find the longest common starting substring in an array. This greatly simplifies the problem.

    For example, the longest substring in [interspecies, interstelar, interstate] is "inters". However, I don't need to find "ific" in [specifics, terrific].

    I've solved the problem by quickly coding up a solution in JavaScript as a part of my answer about shell-like tab-completion (test page here). Here is that solution, slightly tweaked:

    function common_substring(data) {
      var i, ch, memo, idx = 0
      do {
        memo = null
        for (i=0; i < data.length; i++) {
          ch = data[i].charAt(idx)
          if (!ch) break
          if (!memo) memo = ch
          else if (ch != memo) break
        }
      } while (i == data.length && idx < data.length && ++idx)
    
      return (data[0] || '').slice(0, idx)
    }
    

    This code is available in this Gist along with a similar solution in Ruby. You can clone the gist as a git repo to try it out:

    $ git clone git://gist.github.com/257891.git substring-challenge
    

    I'm not very happy with those solutions. I have a feeling they might be solved with more elegance and less execution complexity—that's why I'm posting this challenge.

    I'm going to accept as an answer the solution I find the most elegant or concise. Here is for instance a crazy Ruby hack I come up with—defining the & operator on String:

    # works with Ruby 1.8.7 and above
    class String
      def &(other)
        difference = other.to_str.each_char.with_index.find { |ch, idx|
          self[idx].nil? or ch != self[idx].chr
        }
        difference ? self[0, difference.last] : self
      end
    end
    
    class Array
      def common_substring
        self.inject(nil) { |memo, str| memo.nil? ? str : memo & str }.to_s
      end
    end
    

    Solutions in JavaScript or Ruby are preferred, but you can show off clever solution in other languages as long as you explain what's going on. Only code from standard library please.

    Update: my favorite solutions

    I've chosen the JavaScript sorting solution by kennebec as the "answer" because it struck me as both unexpected and genius. If we disregard the complexity of actual sorting (let's imagine it's infinitely optimized by the language implementation), the complexity of the solution is just comparing two strings.

    Other great solutions:

    Thanks for participating! As you can see from the comments, I learned a lot (even about Ruby).