F#, char seq -> strings


Solution 1

I was just researching this myself. I found that System.String.Concat works pretty well, e.g.

"abcdef01234567" |> Seq.take 5 |> String.Concat;;

assuming that you've opened System.
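
If you do this often, the call can be wrapped in a small helper. The following is only a sketch: the name ofChars is made up for illustration, and the type argument on String.Concat is spelled out just to make the chosen overload explicit.

```fsharp
open System

// Hypothetical helper (the name is an assumption, not part of any library).
// Uses the generic String.Concat<T>(IEnumerable<T>) overload on a char sequence.
let ofChars (chars : seq<char>) : string =
    String.Concat<char>(chars)

// Usage: take the first five characters and turn them back into a string.
let firstFive = "abcdef01234567" |> Seq.take 5 |> ofChars
```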

Solution 2

The functions in the Seq module only deal with sequences -- i.e., when you call them with a string, they only "see" a seq<char> and operate on it accordingly. Even if they made a special check to see if the argument was a string and took some special action (e.g., an optimized version of the function just for strings), they'd still have to return it as a seq<char> to appease the F# type system -- in which case, you'd need to check the return value everywhere to see if it was actually a string.

The good news is that F# has built-in shortcuts for some of the code you're writing. For example:

"abcdef01234567" |> Seq.take 5

can be shortened to:

"abcdef01234567".[..4]  // Returns the first _5_ characters (indices 0-4).
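
One thing to watch for is input shorter than the requested length: Seq.take throws when the sequence has too few elements. A minimal sketch of two more forgiving variants follows; the helper names are hypothetical, and n is assumed to be non-negative.

```fsharp
// Hypothetical helper: the first n characters, or the whole string if it is shorter.
// Assumes n >= 0 (Substring throws on a negative length).
let firstChars (n : int) (s : string) : string =
    s.Substring(0, min n s.Length)

// Sequence-level counterpart: Seq.truncate returns at most n elements
// and, unlike Seq.take, does not throw when fewer are available.
let firstCharsSeq (n : int) (s : string) : seq<char> =
    Seq.truncate n s
```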

For other operations you'll still have to use Seq, or write your own optimized implementation that operates on strings.

Here's a function to get the distinct characters in a string:

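open System.Collections.Generic

// Collects the distinct characters of a string into a HashSet<char>.
// Note: a HashSet is an unordered collection, so no ordering is guaranteed.
let distinctChars (str : string) =
    let chars = HashSet<char>()
    for i = 0 to str.Length - 1 do
        chars.Add(str.[i]) |> ignore
    chars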
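
If what you want back is a string rather than a set, the same idea can be combined with a StringBuilder. This is a sketch under that assumption, with a hypothetical function name; it keeps the first occurrence of each character in its original position.

```fsharp
open System.Collections.Generic
open System.Text

// Hypothetical variant of distinctChars that returns a string.
// HashSet.Add returns true only when the item was not already present,
// so each character is appended the first time it is seen.
let distinctString (str : string) : string =
    let seen = HashSet<char>()
    let sb = StringBuilder()
    for c in str do
        if seen.Add(c) then
            sb.Append(c) |> ignore
    sb.ToString()
```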

Solution 3

F# has a String module which contains some of the Seq module functionality specialised for strings.
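
A few examples of what that looks like in practice. The functions used here (String.map, String.filter, String.collect) are in the FSharp.Core String module; the value names are just for illustration. Each one takes a string and returns a string, so no conversion back from seq<char> is needed.

```fsharp
// String.map: apply a char -> char function to every character.
let upper = "abc" |> String.map (fun c -> System.Char.ToUpper c)

// String.filter: keep only the characters that satisfy a predicate.
let digitsOnly = "abcdef01234567" |> String.filter (fun c -> System.Char.IsDigit c)

// String.collect: map each character to a string and concatenate the results.
let doubled = "abc" |> String.collect (fun c -> string c + string c)
```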

Solution 4

F# has gained the ability to use constructors as functions since this question was asked 5 years ago. I would use the String(Char[]) constructor to convert characters to a string. You can convert to and from an F# sequence or an F# list, but I'd probably just stay with arrays: get the characters with the String.ToCharArray method and work on them with the F# Array module.

printfn "%s" ("abcdef01234567".ToCharArray() |> Array.take 5 |> String)

If you really wanted to use a char seq then you can pipe it to a String like so:

printfn "%s" ("abcdef01234567" |> Seq.take 5 |> Array.ofSeq |> String)
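
For completeness, here is a rough sketch of the sequence and list conversions mentioned above. The helper names are hypothetical, and the constructor is called explicitly rather than piped to, which is purely a style choice.

```fsharp
open System

// Hypothetical helpers, named here only for illustration.
// Each one ends up calling the String(char[]) constructor.
let ofCharArray (cs : char[]) : string = String(cs)
let ofCharList (cs : char list) : string = String(List.toArray cs)
let ofCharSeq (cs : seq<char>) : string = String(Seq.toArray cs)

// Going the other way: a string is already a seq<char>.
let toCharList (s : string) : char list = List.ofSeq s
```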
Author by

Admin

Updated on June 18, 2022

Comments


    A quick question that may be more of a rant (but I hope to be enlightened instead).

    In F# a string is compatible with Seq such that "abcd" |> Seq.map f will work on a string.

    This is a brilliant facility for working with strings, for example to take the first 5 chars from a string:

    "abcdef01234567" |> Seq.take 5
    

    Or removing duplicate characters:

    "abcdeeeeeee" |> Seq.distinct
    

    The problem is that once you have the char seq result, it becomes extremely awkward to convert it back to a string. String.concat "" requires that the members are strings, so I end up doing this a lot:

    "abcdef01234567" 
    |> Seq.take 5
    |> Seq.map string
    |> String.concat ""
    

    So much so that I have a function I use in 90% of my projects:

    let toString : char seq -> string = Seq.map string >> String.concat ""
    

    I feel this is over the top, but everywhere I look to find an alternative I am met with heinous things like StringBuilder or inlining a lambda and using new:

    "abcdef01234567" 
    |> Seq.take 5
    |> Seq.toArray 
    |> fun cs -> new string (cs) (* note you cannot just |> string *)
    

    My (perhaps crazy) expectation is that when Seq is used on a string, the type signature of the resulting expression should be string -> string. Meaning, what goes in is what comes out: "abcd" |> Seq.take 3 = "abc".

    Is there a reason my expectation of high-level string manipulation is mistaken in this case?

    Does anyone have a recommendation for approaching this in a nice manner? I feel like I must be missing something.