Split comma separated column data into additional columns

158,826

Solution 1

If the number of fields in the CSV is constant then you could do something like this:

select a[1], a[2], a[3], a[4]
from (
    select regexp_split_to_array('a,b,c,d', ',')
) as dt(a)

For example:

=> select a[1], a[2], a[3], a[4] from (select regexp_split_to_array('a,b,c,d', ',')) as dt(a);
 a | a | a | a 
---+---+---+---
 a | b | c | d
(1 row)

If the number of fields in the CSV is not constant then you could get the maximum number of fields with something like this:

select max(array_length(regexp_split_to_array(csv, ','), 1))
from your_table

and then build the appropriate a[1], a[2], ..., a[M] column list for your query. So if the above gave you a max of 6, you'd use this:

select a[1], a[2], a[3], a[4], a[5], a[6]
from (
    select regexp_split_to_array(csv, ',')
    from your_table
) as dt(a)

You could combine those two queries into a function if you wanted.

For example, give this data (that's a NULL in the last row):

=> select * from csvs;
     csv     
-------------
 1,2,3
 1,2,3,4
 1,2,3,4,5,6

(4 rows)

=> select max(array_length(regexp_split_to_array(csv, ','), 1)) from csvs;
 max 
-----
   6
(1 row)

=> select a[1], a[2], a[3], a[4], a[5], a[6] from (select regexp_split_to_array(csv, ',') from csvs) as dt(a);
 a | a | a | a | a | a 
---+---+---+---+---+---
 1 | 2 | 3 |   |   | 
 1 | 2 | 3 | 4 |   | 
 1 | 2 | 3 | 4 | 5 | 6
   |   |   |   |   | 
(4 rows)

Since your delimiter is a simple fixed string, you could also use string_to_array instead of regexp_split_to_array:

select ...
from (
    select string_to_array(csv, ',')
    from csvs
) as dt(a);

Thanks to Michael for the reminder about this function.

You really should redesign your database schema to avoid the CSV column if at all possible. You should be using an array column or a separate table instead.

Solution 2

split_part() does what you want in one step:

SELECT split_part(col, ',', 1) AS col1
     , split_part(col, ',', 2) AS col2
     , split_part(col, ',', 3) AS col3
     , split_part(col, ',', 4) AS col4
FROM   tbl;

Add as many lines as you have items in col (the possible maximum).
Columns exceeding data items will be empty strings ('').

Share:
158,826

Related videos on Youtube

Gallop
Author by

Gallop

Updated on March 04, 2022

Comments

  • Gallop
    Gallop about 2 years

    I have comma separated data in a column:

    Column 
    ------- 
    a,b,c,d 
    

    I want to split the comma separated data into multiple columns to get this output:

    Column1  Column2 Column3 Column4 
    -------  ------- ------- -------
    a        b       c       d 
    

    How can this be achieved?

  • Gallop
    Gallop over 12 years
    Thanks shall check and revert
  • John Powell
    John Powell about 10 years
    And would appear to execute a lot faster than the regexp_split_to_array version.
  • Erwin Brandstetter
    Erwin Brandstetter about 10 years
    @JohnBarça: All regular expression function are comparatively expensive. Powerful, but for a price ...
  • Michael
    Michael over 9 years
    Consider using string_to_array instead of regexp_split_to_array; it should be faster since it doesn't have the overhead of regular expression processing.
  • mu is too short
    mu is too short over 9 years
    @Michael You could add that as another answer if you'd like. Or I could add string_to_array as an option in mine, not sure how I missed that.
  • Dennis Bauszus
    Dennis Bauszus about 8 years
    Just had to run a big load of the US census. string_to_array was about faster by the factor 10.
  • mu is too short
    mu is too short about 8 years
    @DennisBauszus: Nice. Did you check split_part too? Just curious.
  • Dennis Bauszus
    Dennis Bauszus about 8 years
    Smashing. Faster by factor 3 over string_to_array. Should be marked as the answer. Note to myself: Must read all the answers.
  • Dennis Bauszus
    Dennis Bauszus about 8 years
    Legend! This is by far the fastest approach to this kind of issue.
  • mu is too short
    mu is too short about 8 years
    @DennisBauszus Wouldn't argue with that. It is usually a good idea to look for what Erwin Brandsetter or Craig Ringer have to say on PostgreSQL questions, they really know their stuff and their answers tend to be quite thorough.
  • juanitogan
    juanitogan almost 8 years
    mu's answer is one step as well if the number of values is known such as this answer assumes. Michael's addition of string_to_array nullifies the regexp expense.
  • Erwin Brandstetter
    Erwin Brandstetter about 6 years
    @juanitogan: True, string_to_array nullifies the regexp expense. But for a hand full of values, this is much faster, still. See the comment with feedback above.
  • Vérace
    Vérace over 2 years
    The question refers to PostgreSQL - that's SQL Server syntax?