How can I map True/False to 1/0 in a Pandas DataFrame?
Solution 1
A succinct way to convert a single column of boolean values to a column of integers 1 or 0:
df["somecolumn"] = df["somecolumn"].astype(int)
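One caveat: astype(int) raises an error when the column contains NaN. The following is a minimal sketch, assuming a hypothetical object-dtype column with one missing value. Casting to float instead keeps the NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical column: booleans plus one missing value (object dtype).
s = pd.Series([True, False, np.nan], dtype=object)

# Casting straight to int fails because NaN has no integer representation.
try:
    s.astype(int)
    int_cast_failed = False
except (ValueError, TypeError):
    int_cast_failed = True

# Casting to float maps True/False to 1.0/0.0 and leaves NaN in place.
as_float = s.astype(float)
print(int_cast_failed)
print(as_float)
```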
Solution 2
Just multiply your DataFrame by 1 (an int):
In [1]: data = pd.DataFrame([[True, False, True], [False, False, True]])
In [2]: print(data)
       0      1     2
0   True  False  True
1  False  False  True
In [3]: print(data * 1)
   0  1  2
0  1  0  1
1  0  0  1
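A small sketch of how multiplying by 1 behaves when the frame also holds float columns, compared with casting the whole frame to int. The column names flag and x are invented for illustration.

```python
import pandas as pd

# Hypothetical frame with one boolean column and one float column.
df = pd.DataFrame({"flag": [True, False], "x": [1.5, 2.5]})

# Multiplying by 1 turns the booleans into integers and leaves floats unchanged.
multiplied = df * 1

# Casting the whole frame to int also truncates the float column.
cast = df.astype(int)

print(multiplied)
print(cast)
```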
Solution 3
True is 1 in Python, and likewise False is 0*:
>>> True == 1
True
>>> False == 0
True
You should be able to perform any operations you want on them by just treating them as though they were numbers, as they are numbers:
>>> issubclass(bool, int)
True
>>> True * 5
5
So to answer your question, no work necessary - you already have what you are looking for.
* Note I use "is" as an English word, not the Python keyword is - True will not be the same object as any random 1.
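This holds for plain Python bools, but pandas stores boolean columns as NumPy bools, where + acts as a logical OR. A brief sketch of the difference, using a made-up three-element Series:

```python
import numpy as np
import pandas as pd

# Plain Python bools add like integers.
python_sum = True + True

# NumPy bools stay boolean: + behaves like a logical OR.
numpy_sum = np.bool_(True) + np.bool_(True)

# Hypothetical boolean Series to show the same effect in pandas.
s = pd.Series([True, True, False])
bool_added = s + s                          # still boolean
int_added = s.astype(int) + s.astype(int)   # integer arithmetic

print(python_sum, numpy_sum)
print(bool_added.tolist(), int_added.tolist())
```

Reductions such as s.sum() still count the True values, so whether a conversion is needed depends on the operation.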
Solution 4
This question specifically mentions a single column, so the currently accepted answer works. However, it doesn't generalize to multiple columns. For those interested in a general solution, use the following:
df.replace({False: 0, True: 1}, inplace=True)
This works for a DataFrame that contains columns of many different types, regardless of how many are boolean.
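If only the boolean columns should change, one alternative sketch (not this answer's method) is to pick them out by dtype with select_dtypes and cast just those columns. The column names below are hypothetical.

```python
import pandas as pd

# Hypothetical frame mixing boolean, float and string columns.
df = pd.DataFrame(
    {
        "flag": [True, False, True],
        "score": [0.5, 1.5, 2.5],
        "name": ["a", "b", "c"],
    }
)

# Find the boolean columns, then cast only those to int.
bool_cols = df.select_dtypes(include="bool").columns
converted = df.astype({col: int for col in bool_cols})

print(converted.dtypes)
```

The float and string columns keep their original dtypes because they are never included in the cast.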
Solution 5
You can also do this directly on DataFrames:
In [104]: df = pd.DataFrame(dict(A=True, B=False), index=range(3))
In [105]: df
Out[105]:
      A      B
0  True  False
1  True  False
2  True  False
In [106]: df.dtypes
Out[106]:
A    bool
B    bool
dtype: object
In [107]: df.astype(int)
Out[107]:
   A  B
0  1  0
1  1  0
2  1  0
In [108]: df.astype(int).dtypes
Out[108]:
A    int64
B    int64
dtype: object
Simon Righley
Updated on July 08, 2022
Comments
-
Simon Righley almost 2 years
I have a column in python pandas DataFrame that has boolean True/False values, but for further calculations I need 1/0 representation. Is there a quick pandas/numpy way to do that?
-
Jon Clements almost 11 years
What further calculations are required?
-
cs95 almost 4 years
To parrot @JonClements, why do you need to convert bool to int to use in calculation? bool works with arithmetic directly (since it is internally an int).
-
sql_knievel over 2 years
@cs95 - Pandas uses numpy bools internally, and they can behave a little differently. In plain Python, True + True = 2, but in Pandas, numpy.bool_(True) + numpy.bool_(True) = True, which may not be the desired behavior on your particular calculation.
-
-
jorgeca almost 11 years
Just be careful with data types if doing floating point math: np.sin(True).dtype is float16 for me.
-
dwanderson over 7 years
I've got a dataframe with a boolean column, and I can call df.my_column.mean() just fine (as you imply), but when I try df.groupby("some_other_column").agg({"my_column":"mean"}) I get DataError: No numeric types to aggregate, so it appears they are NOT always the same. Just FYI.
-
BallpointBen about 5 years
In pandas version 0.24 (and maybe earlier) you can aggregate bool columns just fine.
-
Amadou Kone about 5 years
It looks like numpy also throws errors with boolean types: TypeError: numpy boolean subtract, the `-` operator, is deprecated, use the bitwise_xor, the `^` operator, or the logical_xor function instead. Using @User's answer fixes this.
-
colorlace almost 5 years
Another reason it's not the same: df.col1 + df.col2 + df.col3 doesn't work for bool columns as it does for int columns.
-
DustByte over 4 years
The corner case is if there are NaN values in somecolumn. Using astype(int) will then fail. Another approach, which converts True to 1.0 and False to 0.0 (floats) while preserving NaN values, is to do: df.somecolumn = df.somecolumn.replace({True: 1, False: 0})
-
Homunculus Reticulli about 4 years
@DustByte Good catch!
-
AMC about 4 years
@DustByte Couldn't you just use astype(float) and get the same result?
-
AMC about 4 years
What are the advantages of this solution?
-
AMC about 4 years
This is identical to this solution, posted 3 years earlier.
-
Golden Lion over 3 years
If the value is text and a lowercase "true" or "false", then first do astype(bool).astype(int) and the conversion will work. SAS outputs its bools as lowercase true and false.
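A word of caution on string values: astype(bool) on an object column uses Python truthiness, so every non-empty string, including "false", becomes True. A minimal sketch, assuming a hypothetical column of lowercase strings, compares that cast with an explicit mapping:

```python
import pandas as pd

# Hypothetical column of lowercase strings, as a text export might produce.
s = pd.Series(["true", "false", "true"], dtype=object)

# Any non-empty string is truthy, so this yields 1 for every row.
via_bool = s.astype(bool).astype(int)

# An explicit mapping converts each string to the intended integer.
via_map = s.map({"true": 1, "false": 0})

print(via_bool.tolist())
print(via_map.tolist())
```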
-
Phillip Copley over 3 years
@AMC There are none, it's a hacky way to do it.
-
AMC over 3 years
Much simpler: df['type'] = df['type'].map({'REAL': 1, 'FAKE': 0}). In any case, I'm not sure it's too relevant to this question.
-
kaishu over 3 years
Thanks for providing a simpler solution. As I mentioned in my answer, I was trying to find a solution to a slightly different question, and only similar questions like this one were available. I hope my answer and your solution will help someone in the future.
-
AMC over 3 years
There are other questions which already cover that, though, like stackoverflow.com/q/20250771.
-
Dmitriy Work about 3 years
@AMC if your dataframe has float types beside booleans this method won't ruin them, df.astype(int) does. And since it's hacky it's probably a good idea to make intention clear with comment like # bool -> int.
-
Dmitriy Work about 3 years
There is an advantage of using data * 1 against data + 0 with mixed types: it works on strings as well, where data + 0 throws an error. Equivalent performance-wise.
-
unaied about 3 years
How can this be applied to a number of columns?
-
Avv almost 3 years
Thank you. Should I do this to all columns, or is there a command that works without specifying a column name?
-
qwr over 2 years
Advantage: slightly shorter.