How to split a number into separate columns in a pandas DataFrame

Solution 1

# make string version of original column, call it 'col'
df['col'] = df['col1'].astype(str)

# make the new columns using string indexing
df['col1'] = df['col'].str[0:2]
df['col2'] = df['col'].str[2:4]
df['col3'] = df['col'].str[4:6]

# get rid of the extra variable (if you want)
df.drop('col', axis=1, inplace=True)
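
For reference, the slicing approach can be run end to end on the sample data from the question. This is only a minimal sketch: the `zfill(6)` step is an addition that is not in the original answer, and it assumes every value should be treated as a six-digit code. The names `s` and `out` are made up for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'col1': [100000, 100001, 100002, 100003, 100004]})

# String version of the column. zfill(6) is an assumption: it left-pads
# shorter numbers with zeros so that every value has exactly six characters.
s = df['col1'].astype(str).str.zfill(6)

# Build the three columns by slicing two characters at a time
out = pd.DataFrame({
    'col1': s.str[0:2],
    'col2': s.str[2:4],
    'col3': s.str[4:6],
})
print(out)
```

The resulting columns hold strings, which is what keeps the leading zeros in values such as '00' and '01'. Converting them to integers would drop those zeros.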

Solution 2

One option is to use the extractall() method with the regex (\d{2})(\d{2})(\d{2}), which captures each consecutive pair of digits as its own column. In the code below, ?P<col1> names a capture group, and the group names become the column names:

# raw string (r"...") so that \d is passed to the regex engine unchanged
df.col1.astype(str).str.extractall(r"(?P<col1>\d{2})(?P<col2>\d{2})(?P<col3>\d{2})").reset_index(drop=True)

#   col1  col2  col3
# 0   10    00    00
# 1   10    00    01
# 2   10    00    02
# 3   10    00    03
# 4   10    00    04
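
Since each value here produces exactly one match, `str.extract()` might be a slightly simpler alternative: it returns one row per input row and keeps the original index, so no `reset_index` is needed. The sketch below is a variation on the answer above, not part of it, and the name `parts` is made up for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'col1': [100000, 100001, 100002, 100003, 100004]})

# Named groups become the column names of the returned DataFrame.
# extract() keeps only the first match per row, which is enough here.
pattern = r"(?P<col1>\d{2})(?P<col2>\d{2})(?P<col3>\d{2})"
parts = df['col1'].astype(str).str.extract(pattern)
print(parts)
```

Rows that do not match the pattern (for example, numbers with fewer than six digits) come back as NaN rather than being dropped, which can make bad inputs easier to spot.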
Author by

Heisenberg

Updated on June 25, 2022

Comments

  • Heisenberg, almost 2 years ago

    I have a DataFrame:

    df=pd.DataFrame({'col1':[100000,100001,100002,100003,100004]})
    
         col1    
    0   100000    
    1   100001
    2   100002
    3   100003
    4   100004
    

    I would like to get the result below:

        col1   col2    col3
    0   10     00       00 
    1   10     00       01
    2   10     00       02
    3   10     00       03
    4   10     00       04
    

    Each row shows the number split into two-digit parts. I guess the number should be converted to a string first, but I have no idea what the next step is. How can I split a number into separate columns?