Is there a better way to write a "string contains X" method?

29,401

Solution 1

If you search Hoogle for the signature of the function you're looking for (String -> String -> Bool) you should see isInfixOf among the top results.

Solution 2

isInfixOf from Data.List will surely solve the problem, however in case of longer haystacks or perverse¹ needles you should consider more advanced string matching algorithms with a much better average and worst case complexity.


¹ Consider a really long string consisting only of a's and a needle with a lot of a's at the beginning and one b at the end.

Solution 3

Consider using the text package(text on Hackage, now also part of Haskell Platform) for your text-processing needs. It provides a Unicode text type, which is more time- and space-efficient than the built-in list-based String. For string search, the text package implements a Boyer-Moore-based algorithm, which has better complexity than the naïve method used by Data.List.isInfixOf.

Usage example:

Prelude> :s -XOverloadedStrings
Prelude> import qualified Data.Text as T
Prelude Data.Text> T.breakOnAll "abc" "defabcged"
[("def","abcged")]
Prelude Data.Text> T.isInfixOf "abc" "defabcged"
True
Share:
29,401

Related videos on Youtube

Programmin Tool
Author by

Programmin Tool

It should be illegal to be this awesome. Then again if it were, I would have been put to death a long time ago.

Updated on August 28, 2020

Comments

  • Programmin Tool
    Programmin Tool over 3 years

    Just stared using Haskell and realized (at far as I can tell) there is no direct way to check a string to see if it contains a smaller string. So I figured I'd just take a shot at it.

    Essentially the idea was to check if the two strings were the same size and were equal. If the string being checked was longer, recursively lop of the head and run the check again until the string being checked was the same length.

    The rest of the possibilities I used pattern matching to handle them. This is what I came up with:

    stringExists "" wordToCheckAgainst = False
    stringExists wordToCheckFor "" = False
    stringExists wordToCheckFor wordToCheckAgainst | length wordToCheckAgainst < length wordToCheckFor = False
                                                   | length wordToCheckAgainst == length wordToCheckFor = wordToCheckAgainst == wordToCheckFor
                                                   | take (length wordToCheckFor) wordToCheckAgainst == wordToCheckFor = True
                                                   | otherwise = stringExists wordToCheckFor (tail wordToCheckAgainst)
    
  • Tom Crockett
    Tom Crockett over 12 years
    Note that any more efficient algorithm would require a different data structure than String (or [a]).
  • Jan
    Jan over 12 years
    Indeed, they usually need some initial preprocessing of the input string . Implementing them in a way resulting in a (String -> String -> Bool) function is absolutely possible, though.
  • dave4420
    dave4420 over 12 years
    Examining the source of isInfixOf is instructive.
  • Dan Burton
    Dan Burton over 12 years
    If you're really interested in performance, you'd probably implement it like this: Data.Text.Search.indices (note Data.Text.isInfixOf is implemented in terms of this; due to laziness it can stop after the first index is found)
  • ely
    ely about 8 years
    Needing OverloadedStrings just to check for substring matches is a pretty nasty wart though. The alternative of having to wrap in pack and unpack is really unsatisfactory as well.
  • ThreeFx
    ThreeFx over 7 years
    @dave4420 Link is broken.
  • Loovjo
    Loovjo almost 7 years
  • Charlie
    Charlie over 6 years
    here's the source: isInfixOf needle haystack = any (isPrefixOf needle) (tails haystack)