Conditionally fill column values based on another columns value in pandas
Solution 1
You probably want to do
df['Normalized'] = np.where(df['Currency'] == '$', df['Budget'] * 0.78125, df['Budget'])
Solution 2
Similar results via an alternate style might be to write a function that performs the operation you want on a row, using row['fieldname']
syntax to access individual values/columns, and then perform a DataFrame.apply method upon it
This echoes the answer to the question linked here: pandas create new column based on values from other columns
def normalise_row(row):
if row['Currency'] == '$'
...
...
...
return result
df['Normalized'] = df.apply(lambda row : normalise_row(row), axis=1)
Solution 3
An option that doesn't require an additional import for numpy
:
df['Normalized'] = df['Budget'].where(df['Currency']=='$', df['Budget'] * 0.78125)
Solution 4
Taking Tom Kimber's suggestion one step further, you could use a Function Dictionary to set various conditions for your functions. This solution is expanding the scope of the question.
I'm using an example from a personal application.
# write the dictionary
def applyCalculateSpend (df_name, cost_method_col, metric_col, rate_col, total_planned_col):
calculations = {
'CPMV' : df_name[metric_col] / 1000 * df_name[rate_col],
'Free' : 0
}
df_method = df_name[cost_method_col]
return calculations.get(df_method, "not in dict")
# call the function inside a lambda
test_df['spend'] = test_df.apply(lambda row: applyCalculateSpend(
row,
cost_method_col='cost method',
metric_col='metric',
rate_col='rate',
total_planned_col='total planned'), axis = 1)
cost method metric rate total planned spend
0 CPMV 2000 100 1000 200.0
1 CPMV 4000 100 1000 400.0
4 Free 1 2 3 0.0
Comments
-
Jan Willem Tulp almost 3 years
I have a
DataFrame
with a few columns. One columns contains a symbol for which currency is being used, for instance a euro or a dollar sign. Another column contains a budget value. So for instance in one row it could mean a budget of 5000 in euro and in the next row it could say a budget of 2000 in dollar.In pandas I would like to add an extra column to my DataFrame, normalizing the budgets in euro. So basically, for each row the value in the new column should be the value from the budget column * 1 if the symbol in the currency column is a euro sign, and the value in the new column should be the value of the budget column * 0.78125 if the symbol in the currency column is a dollar sign.
I know how to add a column, fill it with values, copy values from another column etc. but not how to fill the new column conditionally based on the value of another column.
Any suggestions?