pandas: Use if-else to populate new column

10,242

Solution 1

You could convert the boolean series df.col2 > 0 to an integer series (True becomes 1 and False becomes 0):

df['col3'] = (df.col2 > 0).astype('int')

(To create a new column, you simply need to name it and assign it to a Series, array or list of the same length as your DataFrame.)

This produces col3 as:

   col2  col3
0     0     0
1     1     1
2     0     0
3     0     0
4     3     1
5     0     0
6     4     1

Another way to create the column is np.where, which lets you specify one value to use where the condition is true and another where it is false. It is perhaps closer to the syntax of the R function ifelse. For example:

>>> np.where(df['col2'] > 0, 4, -1)
array([-1,  4, -1, -1,  4, -1,  4])
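To get the 1/0 column from the question, the same call can be assigned straight to the new column. Here is a minimal, self-contained sketch; it assumes the DataFrame from the question, rebuilt under the name df:

```python
import numpy as np
import pandas as pd

# Sample data from the question (the name df is an assumption)
df = pd.DataFrame({'col1': [1, 0, 0, 0, 3, 2, 0],
                   'col2': [0, 1, 0, 0, 3, 0, 4]})

# np.where(condition, value_if_true, value_if_false),
# the analogue of R's ifelse(df1$col2 > 0, 1, 0)
df['col3'] = np.where(df['col2'] > 0, 1, 0)

print(df['col3'].tolist())  # [0, 1, 0, 0, 1, 0, 1]
```

np.where returns a plain NumPy array of the same length as the DataFrame, so it can be assigned as a column directly.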

Solution 2

I assume that you're using pandas (because of the 'df' notation). If so, you can use .gt (greater than) to compare col2 against zero, which gives a boolean Series. Multiplying the result by one converts the boolean flags into ones and zeros, and that result can be assigned to col3.

df1 = pd.DataFrame({'col1': [1, 0, 0, 0, 3, 2, 0], 
                    'col2': [0, 1, 0, 0, 3, 0, 4]})

df1['col3'] = df1.col2.gt(0) * 1

>>> df1
   col1  col2  col3
0     1     0     0
1     0     1     1
2     0     0     0
3     0     0     0
4     3     3     1
5     2     0     0
6     0     4     1

You can also use apply with a lambda expression to achieve the same result, but I believe the method above is simpler for your given example.

df1['col3'] = df1['col2'].apply(lambda x: 1 if x > 0 else 0)
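As a quick sanity check, the approaches mentioned on this page can be compared on the sample data. This is only a sketch; the via_* variable names are made up for illustration:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 0, 0, 0, 3, 2, 0],
                    'col2': [0, 1, 0, 0, 3, 0, 4]})

# Four ways to flag rows where col2 > 0 (variable names are illustrative)
via_astype = (df1['col2'] > 0).astype(int)           # cast booleans to ints
via_gt = df1['col2'].gt(0) * 1                       # multiply booleans by one
via_where = np.where(df1['col2'] > 0, 1, 0)          # ifelse-style selection
via_apply = df1['col2'].apply(lambda x: 1 if x > 0 else 0)  # row-by-row lambda

# Compare as plain lists so integer dtype differences do not matter
results = [via_astype.tolist(), via_gt.tolist(),
           via_where.tolist(), via_apply.tolist()]
print(all(r == results[0] for r in results))  # True
```

The first three are vectorized, while apply calls the lambda once per row in Python, so it tends to be the slowest option on large frames.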
Author by

screechOwl

https://financenerd.blog/blog/

Updated on June 11, 2022

Comments

  • screechOwl
    screechOwl almost 2 years

    I have a DataFrame like this:

    col1       col2      
      1          0
      0          1
      0          0
      0          0
      3          3
      2          0
      0          4
    

    I'd like to add a column that is a 1 if col2 is > 0 or 0 otherwise. If I was using R I'd do something like

    df1[,'col3'] <- ifelse(df1$col2 > 0, 1, 0)
    

    How would I do this in python / pandas?