Creating a new column based on a condition with values from another column in Python

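Use where with your boolean condition. It keeps the original value wherever the condition is True and substitutes the second argument (here 0) everywhere else, setting all rows at once rather than iterating row-wise: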

In [120]:
df2['Occupied'] = df2['Value'].where(df2['Hour'] >= 9, 0)
df2

Out[120]:
                 Date  Value  Hour  Occupied
0 2016-02-02 21:00:00    0.6    21       0.6
1 2016-02-02 22:00:00    0.4    22       0.4
2 2016-02-02 23:00:00    0.4    23       0.4
3 2016-02-03 00:00:00    0.3     0       0.0
4 2016-02-03 01:00:00    0.2     1       0.0
5 2016-02-03 02:00:00    0.2     2       0.0
6 2016-02-03 03:00:00    0.1     3       0.0
7 2016-02-03 04:00:00    0.2     4       0.0
8 2016-02-03 05:00:00    0.1     5       0.0
9 2016-02-03 06:00:00    0.4     6       0.0
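
For anyone who wants to reproduce this without the CSV file, here is a minimal sketch that rebuilds the ten sample rows inline (the inline construction and the name occupied_np are assumptions for illustration, not part of the original post) and checks the where call against the equivalent numpy.where call:

```python
import numpy as np
import pandas as pd

# Rebuild the sample rows inline; the original post reads them from a CSV file.
df2 = pd.DataFrame({
    "Date": pd.to_datetime([
        "2016-02-02 21:00:00", "2016-02-02 22:00:00", "2016-02-02 23:00:00",
        "2016-02-03 00:00:00", "2016-02-03 01:00:00", "2016-02-03 02:00:00",
        "2016-02-03 03:00:00", "2016-02-03 04:00:00", "2016-02-03 05:00:00",
        "2016-02-03 06:00:00",
    ]),
    "Value": [0.6, 0.4, 0.4, 0.3, 0.2, 0.2, 0.1, 0.2, 0.1, 0.4],
})
df2["Hour"] = df2["Date"].dt.hour

# Keep Value where Hour >= 9, otherwise use 0.
df2["Occupied"] = df2["Value"].where(df2["Hour"] >= 9, 0)

# Same result via numpy.where(condition, value_if_true, value_if_false).
occupied_np = np.where(df2["Hour"] >= 9, df2["Value"], 0)
```

Series.where reads as "keep where the condition holds", while numpy.where takes both branches explicitly; which one is clearer is largely a matter of taste.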
Author by

Muhammad

Updated on June 04, 2022

Comments

  • Muhammad (almost 2 years)

    I have a DataFrame and would like to create a new column based on a condition: if the condition is met, the value should come from another column; otherwise it should be zero. The original DataFrame is:

    import pandas as pd
    df2 = pd.read_csv(r'C:\Users\ABC.csv')  # raw string, so \U is not parsed as an escape sequence
    df2['Date'] = pd.to_datetime(df2['Date'])
    df2['Hour'] = df2.Date.dt.hour
    df2['Occupied'] = ''
    Date                 Value  Hour    Occupied
    2016-02-02 21:00:00  0.6    21  
    2016-02-02 22:00:00  0.4    22  
    2016-02-02 23:00:00  0.4    23  
    2016-02-03 00:00:00  0.3    0   
    2016-02-03 01:00:00  0.2    1   
    2016-02-03 02:00:00  0.2    2   
    2016-02-03 03:00:00  0.1    3   
    2016-02-03 04:00:00  0.2    4   
    2016-02-03 05:00:00  0.1    5   
    2016-02-03 06:00:00  0.4    6
    

    I would like the Occupied column to hold the same values as df2.Value where df2.Hour is greater than or equal to 9, and zero otherwise. I have tried the following code, but it does not do what I want (it writes the same values as df2.Value and ignores the else branch):

    for i in df2['Hour']:
        if i >= 9:
            df2['Occupied'] = df2.Value
        else:
            df2['Occupied'] = 0
    

    Any idea what is wrong with this?
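
A likely explanation, sketched on a hypothetical three-row frame (the frame df and the names loop_result and mask_result are made up for illustration): each assignment inside the loop writes the entire column, so only the last Hour visited decides the final contents. A boolean mask makes the choice per row instead.

```python
import pandas as pd

# Hypothetical three-row frame; only the Hour and Value columns matter here.
df = pd.DataFrame({"Hour": [21, 23, 3], "Value": [0.6, 0.4, 0.1]})

# The loop from the question: every pass assigns the WHOLE column.
for i in df["Hour"]:
    if i >= 9:
        df["Occupied"] = df.Value
    else:
        df["Occupied"] = 0
# The last Hour is 3, so the else branch ran last and overwrote every row.
loop_result = df["Occupied"].tolist()

# A boolean mask selects the value row by row.
df["Occupied"] = df["Value"].where(df["Hour"] >= 9, 0)
mask_result = df["Occupied"].tolist()
```

If the last row of the real data had an Hour of 9 or more, the same loop would instead leave a full copy of Value in the column, which matches the behaviour described in the question.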