PowerShell: use a regular expression to split a string

Solution 1

In PowerShell, when you use the -split operator and part of the pattern is enclosed in parentheses (), you are asking for that part of the match to be returned as well. The same is true of the static [regex]::Split method. Consider the output of the following two commands (which are similar to yours):

[regex]::split("1,2   3", '(,|\s+)')

1
,
2

3

[regex]::split("1,2   3", ',|\s+')

1
2
3

In the first example, the comma and the whitespace are returned as elements. This behavior is documented in about_Split:

By default, the delimiter is omitted from the results. To preserve all or part of the delimiter, enclose in parentheses the part that you want to preserve.
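
To see why the pattern from the question yields five elements, here is a small sketch (the variable name $parts is only for illustration). With a quantified capture group such as (,|\s)+, the group holds only its last repetition, so the run of three spaces comes back as a single space:

```powershell
# Pattern from the question: a capturing group repeated with +.
$parts = [regex]::Split("1,2   3", '(,|\s)+')

$parts.Count       # 5: the two separators are returned alongside 1, 2, 3
$parts[1]          # the comma captured by the first match
$parts[3].Length   # 1: only the last space of the three-space run is captured
```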

In this particular case

As pointed out in the comments, two regex patterns handle this particular case better:

(?:,|\s)+ or [,\s]+

The former uses a non-capturing group; the latter is a character class.
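
Both patterns can be checked side by side. The sketch below also tries a third option that is not mentioned above: the ExplicitCapture regex option, which stops unnamed parentheses from capturing, so the original pattern can stay unchanged. The variable names are illustrative only.

```powershell
$text = "1,2   3"

# Non-capturing group: groups the alternation without returning the delimiter.
$viaGroup = [regex]::Split($text, '(?:,|\s)+')

# Character class: any run of commas and whitespace counts as one delimiter.
$viaClass = $text -split '[,\s]+'

# Alternative (an addition, not from the answer): keep the original pattern
# but pass RegexOptions.ExplicitCapture so unnamed groups are not captured.
$opt = [System.Text.RegularExpressions.RegexOptions]::ExplicitCapture
$viaOption = [regex]::Split($text, '(,|\s)+', $opt)

$viaGroup.Count, $viaClass.Count, $viaOption.Count   # each is 3
```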

Solution 2

I might be late to the party, but the example below works for me.

"Lastname,FirstName, Address pincode" -split "[\s,]+"
Lastname
FirstName
Address
pincode 
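
One caveat with this pattern, shown as a sketch with a made-up input string: if the input starts or ends with a delimiter, the split produces an empty string at that edge. Filtering out the empty strings afterwards is one way to handle it.

```powershell
# Hypothetical input with a leading space and a trailing comma.
$raw = " Lastname,FirstName, Address pincode,"

# Splitting directly leaves an empty string at each end.
$direct = $raw -split "[\s,]+"
$direct.Count    # 6

# Drop the empty strings; @() keeps the result an array.
$clean = @($direct | Where-Object { $_ -ne '' })
$clean.Count     # 4
```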
Author by

Just a learner

Updated on July 09, 2022

Comments

  • Just a learner, almost 2 years ago

    This is my code:

    [regex]::split("1,2   3", '(,|\s)+')
    

    What I want is an array with three elements (1, 2, 3); however, what I got is an array with five elements.

    PS C:\Users\a> [regex]::split("1,2   3", '(,|\s)+').Length
    5
    PS C:\Users\a>
    

    How can I get what I want?

    Update

    Added the actual split result, not just the length.

    PS E:\> [regex]::split("1,2   3", '(,|\s)+')
    1
    ,
    2
    
    3
    PS E:\> [regex]::split("1,2   3", '(,|\s)+').length
    5
    PS E:\> [regex]::split("1,2   3", '[,\s]+')
    1
    2
    3
    PS E:\> [regex]::split("1,2   3", '[,\s]+').length
    3
    PS E:\>
    

    Update

    Thanks to @Matt's answer, which pointed me in the right direction. The about_Split help topic states:

    By default, the delimiter is omitted from the results. To preserve all or part of the delimiter, enclose in parentheses the part that you want to preserve.

    Below are some of my tests.

    PS E:\tutorial> "Lastname/:/FirstName/:/Address" -split "/(:)/"
    Lastname
    :
    FirstName
    :
    Address
    PS E:\tutorial> "Lastname/:/FirstName/:/Address" -split "/:/"
    Lastname
    FirstName
    Address
    PS E:\tutorial> "Lastname/:/FirstName/:/Address" -split "(/:/)"
    Lastname
    /:/
    FirstName
    /:/
    Address
    PS E:\tutorial> "Lastname/:/FirstName/:/Address" -split "/:(/)"
    Lastname
    /
    FirstName
    /
    Address
    PS E:\tutorial> "Lastname/:/FirstName/:/Address" -split "(/):(/)"
    Lastname
    /
    /
    FirstName
    /
    /
    Address
    PS E:\tutorial> "Lastname/:/FirstName/:/Address" -split "(/)(:)(/)"
    Lastname
    /
    :
    /
    FirstName
    /
    :
    /
    Address
    PS E:\tutorial> [regex]::split("Lastname/:/FirstName/:/Address", '/(:)/')
    Lastname
    :
    FirstName
    :
    Address
    PS E:\tutorial> [regex]::split("Lastname/:/FirstName/:/Address", '/:/')
    Lastname
    FirstName
    Address
    PS E:\tutorial> [regex]::split("Lastname/:/FirstName/:/Address", '/:(/)')
    Lastname
    /
    FirstName
    /
    Address
    PS E:\tutorial> [regex]::split("Lastname/:/FirstName/:/Address", '(/):(/)')
    Lastname
    /
    /
    FirstName
    /
    /
    Address
    PS E:\tutorial> [regex]::split("Lastname/:/FirstName/:/Address", '(/)(:)(/)')
    Lastname
    /
    :
    /
    FirstName
    /
    :
    /
    Address
    PS E:\tutorial>