Add the least amount of characters to make a palindrome


Solution 1

  1. Reverse the string.
  2. Use a modified Knuth-Morris-Pratt to find the latest match (the simplest modification is to append the original string to the reversed string and ignore matches after len(string)).
  3. Append the unmatched rest of the reversed string to the original.

Steps 1 and 3 are obviously linear, and step 2 is linear because Knuth-Morris-Pratt is.
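The three steps above could be realized roughly as follows. This is only a sketch in Python, not the answerer's own code: it builds the KMP prefix (failure) table over "reversed + separator + original" instead of filtering matches by position, and the names `prefix_function` and `make_palindrome_by_appending` are made up for illustration. The last table entry is the length of the longest prefix of the reversed string that is also a suffix of the original, which is exactly the longest palindromic suffix.

```python
def prefix_function(seq):
    """KMP failure table: pi[i] is the length of the longest proper prefix
    of seq[:i + 1] that is also a suffix of seq[:i + 1]."""
    pi = [0] * len(seq)
    for i in range(1, len(seq)):
        k = pi[i - 1]
        while k > 0 and seq[i] != seq[k]:
            k = pi[k - 1]
        if seq[i] == seq[k]:
            k += 1
        pi[i] = k
    return pi


def make_palindrome_by_appending(s):
    """Append the fewest characters to the end of s to get a palindrome."""
    if not s:
        return s
    rev = s[::-1]
    # None is a sentinel that never equals a character, so no match can
    # run across the boundary (an assumption of this sketch).
    combined = list(rev) + [None] + list(s)
    k = prefix_function(combined)[-1]  # length of the longest palindromic suffix
    return s + rev[k:]                 # append the unmatched rest of the reversed string
```

The table is built in a single pass over 2n + 1 symbols, so the whole function stays linear in the length of the input.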

Solution 2

If only appending is allowed

A Scala solution:

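// Compares the string with its reverse. Comparing views with == is avoided
// because views are not guaranteed to compare structurally in Scala 2.13+.
def isPalindrome(s: String) = s == s.reverse

// Tries each suffix in turn, so the worst case is O(n^2).
def makePalindrome(s: String) =
  s + s.take((0 to s.length).find(i => isPalindrome(s.substring(i))).get).reverse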

If you're allowed to insert characters anywhere

Every palindrome can be viewed as a set of nested letter pairs.

a  n  n  a         b  o  b
|  |  |  |         |  *  |
|   --   |         |     |
---------           -----

If the palindrome length n is even, we'll have n/2 pairs. If it is odd, we'll have (n - 1)/2 full pairs and one single letter in the middle (let's call it a degenerated pair).

Let's represent them by pairs of string indexes - the left index counted from the left end of the string, and the right index counted from the right end of the string, both ends starting with index 0.

Now let's write pairs starting from the outer to the inner. So in our example:

anna: (0, 0) (1, 1)
bob: (0, 0) (1, 1)

In order to make any string a palindrome, we will go from both ends of the string one character at a time, and at every step we'll add a character, if necessary, to produce a correct pair of identical characters.

Example: Assume the input word is "blob"

  1. Pair (0, 0) is (b, b) ok, nothing to do, this pair is fine. Let's increase the counter.
  2. Pair (1, 1) is (l, o). Doesn't match. So let's add "o" at position 1 from the left. Now our word became "bolob".
  3. Pair (2, 2). We don't need to look even at the characters, because we're pointing at the same index in the string. Done.

Wait a moment, we have a problem here: in step 2 we arbitrarily chose to add a character on the left. But we could just as well add the character "l" on the right. That would produce "blolb", also a valid palindrome. So does it matter? Unfortunately it does, because the choice made in earlier steps may affect how many pairs we'll have to fix, and therefore how many characters we'll have to add, in later steps.

Easy algorithm: search all the possibilities. That would give us an O(2^n) algorithm. Better algorithm: use a dynamic programming approach and prune the search space.
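The exhaustive search mentioned above might look like the following Python sketch. The function name `brute_force` is hypothetical, and the sketch assumes that matching end characters are always kept as a pair, so it only branches on a mismatch; in the worst case that is still exponential.

```python
def brute_force(s):
    """Shortest palindrome obtainable from s by inserting characters anywhere.
    Tries both choices at every mismatching pair (exponential time)."""
    if len(s) <= 1:
        return s
    if s[0] == s[-1]:
        # The outer pair already matches; recurse into the inside.
        return s[0] + brute_force(s[1:-1]) + s[-1]
    # Mismatch: either mirror the left character on the right ...
    insert_on_right = s[0] + brute_force(s[1:]) + s[0]
    # ... or mirror the right character on the left.
    insert_on_left = s[-1] + brute_force(s[:-1]) + s[-1]
    return min(insert_on_right, insert_on_left, key=len)
```

For "abcb" the two choices at the first step lead to results of length 5 and 7, which shows why the choice cannot be made arbitrarily.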

To keep things simpler, we now decouple the insertion of new characters from the search: first we only find the right sequence of nested pairs (outer to inner), and we fix their alignment later. So for the word "blob" we have the following possibilities, both ending with a degenerated pair:

(0, 0) (1, 2)
(0, 0) (2, 1)

The more such pairs we find, the fewer characters we will have to add to fix the original string. Every full pair found gives us two characters we can reuse. Every degenerated pair gives us one character to reuse.

The main loop of the algorithm iteratively evaluates pair sequences so that step 1 finds all valid pair sequences of length 1, the next step evaluates sequences of length 2, the third sequences of length 3, and so on. When some step finds no possibilities, the previous step contains the solution with the highest number of pairs.

After each step, we will remove the pareto-suboptimal sequences. A sequence is suboptimal compared to another sequence of the same length, if its last pair is dominated by the last pair of the other sequence. E.g. sequence (0, 0)(1, 3) is worse than (0, 0)(1, 2). The latter gives us more room to find nested pairs and we're guaranteed to find at least all the pairs that we'd find for the former. However sequence (0, 0)(1, 2) is neither worse nor better than (0, 0)(2, 1). The one minor detail we have to beware of is that a sequence ending with a degenerated pair is always worse than a sequence ending with a full pair.
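To make the dominance rule concrete, here is a small Python sketch of the pruning on the last pairs alone. The name `pareto_min` mirrors the helper in the Scala code that follows, but this version is only an illustration and ignores the full-pair versus degenerated-pair detail.

```python
def pareto_min(points):
    """Keep only the pairs that are not dominated by another pair.
    A pair q dominates p when q != p, q[0] <= p[0] and q[1] <= p[1]."""
    result = []
    for p in sorted(set(points)):
        # After sorting, every earlier pair has a left index <= p[0], and the
        # last kept pair has the smallest right index seen so far.
        if not result or p[1] < result[-1][1]:
            result.append(p)
    return result
```

With the pairs from the text, `pareto_min([(1, 3), (1, 2), (2, 1)])` drops (1, 3) and keeps (1, 2) and (2, 1), since neither of those two dominates the other.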

After bringing it all together:

import scala.annotation.tailrec

def makePalindrome(str: String): String = {

  /** Finds the pareto-minimum subset of a set of points (here pair of indices).
    * Could be done in linear time, without sorting, but O(n log n) is not that bad ;) */
  def paretoMin(points: Iterable[(Int, Int)]): List[(Int, Int)] = {
    val sorted = points.toSeq.sortBy(identity)
    sorted.foldLeft(List.empty[(Int, Int)]) { (result, e) =>
      if (result.isEmpty || e._2 <= result.head._2)
        e :: result
      else
        result
    }
  }

  /** Find all pairs directly nested within a given pair.
    * For performance reasons tries to not include suboptimal pairs (pairs nested in any of the pairs also in the result)
    * although it wouldn't break anything as prune takes care of this. */
  def pairs(left: Int, right: Int): Iterable[(Int, Int)] = {
    val builder = List.newBuilder[(Int, Int)]
    var rightMax = str.length
    for (i <- left until (str.length - right)) {
      rightMax = math.min(str.length - left, rightMax)
      val subPairs =
        for (j <- right until rightMax if str(i) == str(str.length - j - 1)) yield (i, j)

      subPairs.headOption match {
        case Some((a, b)) => rightMax = b; builder += ((a, b))
        case None =>
      }
    }

    builder.result()
  }

  /** Builds sequences of size n+1 from sequence of size n */
  def extend(path: List[(Int, Int)]): Iterable[List[(Int, Int)]] =
    for (p <- pairs(path.head._1 + 1, path.head._2 + 1)) yield p :: path

  /** Whether full or degenerated. Full-pairs save us 2 characters, degenerated save us only 1. */
  def isFullPair(pair: (Int, Int)) =
    pair._1 + pair._2 < str.length - 1

  /** Removes pareto-suboptimal sequences */
  def prune(sequences: List[List[(Int, Int)]]): List[List[(Int, Int)]] = {
    val allowedHeads = paretoMin(sequences.map(_.head)).toSet
    val containsFullPair = allowedHeads.exists(isFullPair)
    sequences.filter(s => allowedHeads.contains(s.head) && (isFullPair(s.head) || !containsFullPair))
  }

  /** Dynamic-Programming step */
  @tailrec
  def search(sequences: List[List[(Int, Int)]]): List[List[(Int, Int)]] = {
    val nextStage = prune(sequences.flatMap(extend))
    nextStage match {
      case List() => sequences
      case x => search(nextStage)
    }
  }

  /** Converts a sequence of nested pairs to a palindrome */
  def sequenceToString(sequence: List[(Int, Int)]): String = {
    val lStr = str
    val rStr = str.reverse

    val half =
      (for (List(start, end) <- sequence.reverse.sliding(2)) yield
        lStr.substring(start._1 + 1, end._1) + rStr.substring(start._2 + 1, end._2) + lStr(end._1)).mkString

    if (isFullPair(sequence.head))
      half + half.reverse
    else
      half + half.reverse.substring(1)
  }

  sequenceToString(search(List(List((-1, -1)))).head)
}

Note: The code does not list all the palindromes; it returns only one, which is guaranteed to have the minimum length. There are usually several palindromes of the same minimum length (O(2^n) in the worst case, so you probably don't want to enumerate them all).
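For comparison, the same "insert anywhere" result can also be obtained with the textbook O(n^2) interval dynamic program. The Python sketch below uses that classic technique rather than the Pareto-pruned pair search described in this answer, and the function name `shortest_palindrome_by_insertion` is made up for illustration.

```python
def shortest_palindrome_by_insertion(s):
    """One shortest palindrome obtainable from s by inserting characters anywhere."""
    n = len(s)
    if n == 0:
        return s
    # cost[i][j]: fewest insertions that turn s[i..j] (inclusive) into a palindrome
    cost = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            if s[i] == s[j]:
                cost[i][j] = cost[i + 1][j - 1] if length > 2 else 0
            else:
                cost[i][j] = 1 + min(cost[i + 1][j], cost[i][j - 1])

    # Walk from the outside in, rebuilding one optimal palindrome.
    left, right = [], []
    i, j = 0, n - 1
    while i < j:
        if s[i] == s[j]:
            left.append(s[i])
            right.append(s[j])
            i += 1
            j -= 1
        elif cost[i + 1][j] <= cost[i][j - 1]:
            # Keep s[i] and insert its mirror copy on the right.
            left.append(s[i])
            right.append(s[i])
            i += 1
        else:
            # Keep s[j] and insert its mirror copy on the left.
            left.append(s[j])
            right.append(s[j])
            j -= 1
    middle = s[i] if i == j else ""
    return "".join(left) + middle + "".join(reversed(right))
```

Like the Scala code, this returns just one of the possibly many shortest palindromes; ties are broken in favour of mirroring the left character.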

Solution 3

Algorithm: find the longest palindrome within the given string that contains the last character. Then add all the characters that are not part of that palindrome to the back of the string in reverse order.

Key point: in this problem, the palindrome we are looking for MUST contain the last character, i.e. it must be a suffix of the string.

Example: for the input abacac the output is abacacaba. Here the longest palindrome in the input that contains the last letter is "cac". Therefore all the letters before "cac" are added to the back in reverse order to make the entire string a palindrome.

Written in C#, with a few test cases commented out:

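using System;

// Hypothetical wrapper class; the original class declaration was not included.
static class PalindromeDemo
{
    static public void makePalindrome()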
    {
        //string word = "aababaa";
        //string word = "abacbaa";
        //string word = "abcbd";
        //string word = "abacac";
        //string word = "aBxyxBxBxyxB";
        //string word = "Malayal";
        string word = "abccadac";

        int j = word.Length - 1;
        int mark = j;
        bool found = false;

        for (int i = 0; i < j; i++)
        {
            char cI = word[i];
            char cJ = word[j];

            if (cI == cJ)
            {
                found = true;
                j--;
                if(mark > i)
                    mark = i;
            }
            else
            {
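                if (found)
                {
                    // Restart just after the position where the failed partial
                    // match began. Restarting at the current index can skip a
                    // longer palindromic suffix (e.g. "aaabaa"). This
                    // backtracking makes the worst case quadratic.
                    found = false;
                    i = mark;
                }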
                j = word.Length - 1;
                mark = j;
            }
        }

        for (int i = mark-1; i >=0; i--)
            word += word[i];

        Console.Write(word);

    }
}

Note that this code gives the solution with the fewest letters APPENDED TO THE BACK to make the string a palindrome. If you want to append to the front instead, add a second loop that goes the other way; that adds one more pass of the same cost, so the asymptotic complexity does not change. If you want to insert letters anywhere in the string to make it a palindrome, this code will not work.
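The "append to the front" variant from the note above can also be derived by symmetry: reverse the input, append to the back, and reverse the result. A minimal Python sketch, using a plain quadratic suffix scan for brevity (both function names are hypothetical):

```python
def append_to_back(s):
    """Append the fewest characters to the end of s to make it a palindrome."""
    for i in range(len(s) + 1):
        tail = s[i:]
        if tail == tail[::-1]:
            # s[i:] is the longest palindromic suffix; mirror what precedes it.
            return s + s[:i][::-1]


def append_to_front(s):
    """Prepend the fewest characters to s by reusing append_to_back on the reversed string."""
    return append_to_back(s[::-1])[::-1]
```

The loop always terminates with a result because the empty suffix (i equal to the length of s) is trivially a palindrome.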

Solution 4

I believe @Chronical's answer is wrong, as it seems to describe the best-case scenario, not the worst case, which is what big-O complexity is computed for. I welcome a proof, but the "solution" doesn't actually describe a valid answer.

KMP finds a matching substring in O(n + k) time, where n is the length of the input string and k is the length of the substring we're searching for, but it does not tell you in O(n) time what the longest palindrome in the input string is.

To solve this problem, we need to find the longest palindrome at the end of the string. If this longest suffix palindrome has length x, the minimum number of characters to add is n - x. E.g. the longest suffix palindrome of the string aaba is aba, of length 3, so the answer is 1. Checking whether a string is a palindrome takes O(n) time, whether using KMP or the simpler and more efficient algorithm (n/2 comparisons):

  1. Take two pointers, one at the first character and one at the last character.
  2. Compare the characters at the pointers. If they're equal, move each pointer inward; otherwise return false.
  3. When the pointers point to the same index (odd string length) or have crossed (even string length), return true.

Using the simple algorithm, we start from the entire string and check if it's a palindrome. If it is, we return 0; if not, we check the strings string[1...end], string[2...end], and so on, until we have reached a single character and return n - 1. This results in a runtime of O(n^2).
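The two-pointer check and the quadratic scan described above can be sketched in Python as follows. The function names are made up for illustration; the scan returns the number of characters to append, i.e. n - x in the notation of this answer.

```python
def is_palindrome(s, start, end):
    """Two-pointer check of s[start..end] (both indices inclusive)."""
    while start < end:
        if s[start] != s[end]:
            return False
        start += 1
        end -= 1
    return True


def min_chars_to_append(s):
    """Number of characters that must be appended to s to make it a palindrome."""
    n = len(s)
    for start in range(n):
        # The first suffix that is a palindrome is the longest one.
        if is_palindrome(s, start, n - 1):
            return start
    return 0  # only reached for the empty string
```

For "aaba" the scan rejects the whole string, accepts "aba", and returns 1.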

Splitting up the KMP algorithm into:

  1. Build table
  2. Search for longest suffix palindrome

Building the table takes O(n) time, and then the "are you a palindrome" check for each of the substrings string[0...end], string[1...end], ..., string[end - 2...end] takes O(n) time. k in this case is the same factor of n that the simple algorithm needs to check each substring, because it starts as k = n, then goes through k = n - 1, k = n - 2, ... just as the simple algorithm did.

TL; DR:

KMP can tell you if a string is a palindrome in O(n) time, but that does not supply an answer to the question, because you have to check whether each of the substrings string[0...end], string[1...end], ..., string[end - 2...end] is a palindrome, resulting in the same (but actually worse) runtime as a simple palindrome-check algorithm.

Solution 5

If someone wants to solve this in Ruby, the solution can be very simple (this version adds the characters to the front of the string):

str = 'xcbc' # Any string that you want.
arr1 = str.split('')
arr2 = arr1.reverse
count = 0

while(str != str.reverse)
  count += 1
  arr1.insert(count-1, arr2[count-1])
  str = arr1.join('')
end

puts str
puts str.length - arr2.count
Author: Waley Chen

Updated on July 31, 2022

Comments

  • Waley Chen almost 2 years

    The question:

    Given any string, add the least amount of characters possible to make it a palindrome in linear time.

    I'm only able to come up with an O(N^2) solution.

    Can someone help me with an O(N) solution?

  • Chronial almost 8 years
    Your first solution (only appending) has quadratic runtime, right?