Is there a better way to write a "string contains X" method?
Solution 1
If you search Hoogle for the signature of the function you're looking for (String -> String -> Bool
) you should see isInfixOf
among the top results.
Solution 2
isInfixOf
from Data.List
will surely solve the problem, however in case of longer haystacks or perverse¹ needles you should consider more advanced string matching algorithms with a much better average and worst case complexity.
¹ Consider a really long string consisting only of a
's and a needle with a lot of a
's at the beginning and one b
at the end.
Solution 3
Consider using the text
package(text
on Hackage, now also part of Haskell Platform) for your text-processing needs. It provides a Unicode text type, which is more time- and space-efficient than the built-in list-based String
. For string search, the text
package implements a Boyer-Moore-based algorithm, which has better complexity than the naïve method used by Data.List.isInfixOf
.
Usage example:
Prelude> :s -XOverloadedStrings
Prelude> import qualified Data.Text as T
Prelude Data.Text> T.breakOnAll "abc" "defabcged"
[("def","abcged")]
Prelude Data.Text> T.isInfixOf "abc" "defabcged"
True
Related videos on Youtube
Programmin Tool
It should be illegal to be this awesome. Then again if it were, I would have been put to death a long time ago.
Updated on August 28, 2020Comments
-
Programmin Tool over 3 years
Just stared using Haskell and realized (at far as I can tell) there is no direct way to check a string to see if it contains a smaller string. So I figured I'd just take a shot at it.
Essentially the idea was to check if the two strings were the same size and were equal. If the string being checked was longer, recursively lop of the head and run the check again until the string being checked was the same length.
The rest of the possibilities I used pattern matching to handle them. This is what I came up with:
stringExists "" wordToCheckAgainst = False stringExists wordToCheckFor "" = False stringExists wordToCheckFor wordToCheckAgainst | length wordToCheckAgainst < length wordToCheckFor = False | length wordToCheckAgainst == length wordToCheckFor = wordToCheckAgainst == wordToCheckFor | take (length wordToCheckFor) wordToCheckAgainst == wordToCheckFor = True | otherwise = stringExists wordToCheckFor (tail wordToCheckAgainst)
-
Tom Crockett over 12 yearsNote that any more efficient algorithm would require a different data structure than
String
(or[a]
). -
Jan over 12 yearsIndeed, they usually need some initial preprocessing of the input string . Implementing them in a way resulting in a
(String -> String -> Bool)
function is absolutely possible, though. -
dave4420 over 12 yearsExamining the source of
isInfixOf
is instructive. -
Dan Burton over 12 yearsIf you're really interested in performance, you'd probably implement it like this: Data.Text.Search.indices (note
Data.Text.isInfixOf
is implemented in terms of this; due to laziness it can stop after the first index is found) -
ely about 8 yearsNeeding OverloadedStrings just to check for substring matches is a pretty nasty wart though. The alternative of having to wrap in
pack
andunpack
is really unsatisfactory as well. -
ThreeFx over 7 years@dave4420 Link is broken.
-
Loovjo almost 7 years
-
Charlie over 6 yearshere's the source:
isInfixOf needle haystack = any (isPrefixOf needle) (tails haystack)