What is the best way to split a string by a delimiter functionally?
Solution 1
Doesn't Data.List.Split.splitOn do this?
Solution 2
splitBy delimiter = foldr f [[]]
where f c l@(x:xs) | c == delimiter = []:l
| otherwise = (c:x):xs
Edit: not by the original author, but below is a more (overly?) verbose, and less flexible version (specific to Char
/String
) to help clarify how this works. Use the above version because it works on any list of a type with an Eq
instance.
splitBy :: Char -> String -> [String]
splitBy _ "" = [];
splitBy delimiterChar inputString = foldr f [""] inputString
where f :: Char -> [String] -> [String]
f currentChar allStrings@(partialString:handledStrings)
| currentChar == delimiterChar = "":allStrings -- start a new partial string at the head of the list of all strings
| otherwise = (currentChar:partialString):handledStrings -- add the current char to the partial string
-- input: "a,b,c"
-- fold steps:
-- first step: 'c' -> [""] -> ["c"]
-- second step: ',' -> ["c"] -> ["","c"]
-- third step: 'b' -> ["","c"] -> ["b","c"]
-- fourth step: ',' -> ["b","c"] -> ["","b","c"]
-- fifth step: 'a' -> ["","b","c"] -> ["a","b","c"]
Solution 3
This is a bit of a hack, but heck, it works.
yourFunc str = map (+1) $ read ("[" ++ str ++ "]")
Here is a non-hack version using unfoldr
:
import Data.List
import Control.Arrow(second)
-- break' is like break but removes the
-- delimiter from the rest string
break' d = second (drop 1) . break d
split :: String -> Maybe (String,String)
split [] = Nothing
split xs = Just . break' (==',') $ xs
yourFunc :: String -> [Int]
yourFunc = map ((+1) . read) . unfoldr split
Solution 4
Just for fun, here is how you could create a simple parser with Parsec:
module Main where
import Control.Applicative hiding (many)
import Text.Parsec
import Text.Parsec.String
line :: Parser [Int]
line = number `sepBy` (char ',' *> spaces)
number = read <$> many digit
One advantage is that it's easily create a parser which is flexible in what it will accept:
*Main Text.Parsec Text.Parsec.Token> :load "/home/mikste/programming/Temp.hs"
[1 of 1] Compiling Main ( /home/mikste/programming/Temp.hs, interpreted )
Ok, modules loaded: Main.
*Main Text.Parsec Text.Parsec.Token> parse line "" "1, 2, 3"
Right [1,2,3]
*Main Text.Parsec Text.Parsec.Token> parse line "" "10,2703, 5, 3"
Right [10,2703,5,3]
*Main Text.Parsec Text.Parsec.Token>
Solution 5
This is application of HaskellElephant's answer to original question with minor changes
splitByDelimiter :: Char -> String -> [String] splitByDelimiter = unfoldr . splitSingle splitSingle :: Char -> String -> Maybe (String,String) splitSingle _ [] = Nothing splitSingle delimiter xs = let (ys, zs) = break (== delimiter) xs in Just (ys, drop 1 zs)
Where the function splitSingle split the list in two substrings by first delimiter.
For example:
"1,2,-5,-23,15" -> Just ("1", "2,-5,-23,15")
Related videos on Youtube
sign
Updated on July 09, 2022Comments
-
sign over 1 year
I tried to write the program in Haskell that will take a string of integer numbers delimitated by comma, convert it to list of integer numbers and increment each number by 1.
For example
"1,2,-5,-23,15" -> [2,3,-4,-22,16]
Below is the resulting program
import Data.List main :: IO () main = do n <- return 1 putStrLn . show . map (+1) . map toInt . splitByDelimiter delimiter $ getList n getList :: Int -> String getList n = foldr (++) [] . intersperse [delimiter] $ replicate n inputStr delimiter = ',' inputStr = "1,2,-5,-23,15" splitByDelimiter :: Char -> String -> [String] splitByDelimiter _ "" = [] splitByDelimiter delimiter list = map (takeWhile (/= delimiter) . tail) (filter (isPrefixOf [delimiter]) (tails (delimiter : list))) toInt :: String -> Int toInt = read
The most hard part for me was programming of function
splitByDelimiter
that take a String and return list of Strings"1,2,-5,-23,15" -> ["1","2","-5","-23","15"]
Thought it is working, I am not happy with the way it is written. There are a lot of parentheses, so it looks Lisp like. Also the algorithm is somewhat artificial:
Prepend delimiter to beginning of string
",1,2,-5,-23,15"
Generate list of all tails
[",1,2,-5,-23,15", "1,2,-5,-23,15", ",2,-5,-23,15", .... ]
Filter and left only strings that begins with delimiter
[",1,2,-5,-23,15", ",2,-5,-23,15", .... ]
Drop first delimiter and take symbols until next delimiter will be met
["1", "2", .... ]
So the questions are:
How I can improve function
splitByDelimiter
?Can I remove prepend and drop of delimiter and make direct split of string?
How I can rewrite the function so there will be less parentheses?
May be I miss something and there are already standard function with this functionality?
-
pat over 10 years
foldr (++) []
is otherwise known asconcat
,putStrLn . show
is otherwise known asprint
. Also,n <- return 1
is a little odd; you can just dolet n = 1
and avoid wrapping and unwrapping the monad. -
Norman Ramsey over 9 yearspossible duplicate of How to split a string in Haskell?
-
Daniel Pratt almost 13 yearsWhereas this package is not part of the basic install (Haskell Platform), I think it tends to get overlooked.
-
sign almost 13 yearsThank you. It does exactly what I need.
-
sign almost 13 yearsThank you. This is a good point of view. I like the way how unfoldr is used here.
-
CoR over 11 yearsYour split is faster than splitOn by 43ns on my comp in ghci :)
-
ljedrz about 10 yearsThis is brilliant; it took me way too long to understand how it works, but I love it.
-
ljedrz about 10 yearsThis implementation of split function works differently than you would expect - it doesn't properly split strings with commas at the end - one "" is missing. If you want to make sure that a split function is 100% functional, it should be reversible by interspersing with the same delimiter for all permutations of a delimited string, eg. "a,b,c".
-
fotNelton almost 10 yearsDoesn't work for empty strings, though, i.e. it evaluates to
[""]
rather than[]
. -
rob almost 7 yearsMinor, but could use
many1
as innumber = read <$> many1 digit
so that invalid input like "1,,2" results in a Left value instead of an exception from Prelude.read. -
Andy White over 6 yearsI agree with @ljedrz - it took me way to long to understand, but it is brilliant! I hope you don't mind but I added a less flexible, but extremely verbose addendum to your answer to help other people understand what's happening.
-
Steven Armstrong over 6 yearsMinor nitpick, but this is the functionality I would expect for a
splitOn
function, notsplitBy
. ForsplitBy
I would expectsplitBy fn = foldr f [[]] where f c l@(x:xs) = bool ((c:x):xs) ([]:l) $ fn c
, with the currentsplitOn c
functionality recovered bysplitOn c = splitBy (==c)
-
Will Ness over 3 yearsuse of
length
is an anti-pattern (destroys laziness), usespan
/break
instead;(if bs == [] then [] else tail bs)
==drop 1 bs
. -
Will Ness over 3 yearsrepeated singletons appending on the right is an anti-pattern (leads to quadratic behavior).
-
Andrew Koster over 3 years
splitOneOf
is a generally more useful function, especially if you need to take arbitrary whitespace into account.