Elegant way to create empty pandas DataFrame with NaN of type float

131,115

Solution 1

Pass the fill value as the first argument: for example 0, math.inf or, as here, np.nan. The constructor then allocates the underlying array with the shape given by the index and columns arguments and fills it with that value:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.nan, index=[0, 1, 2, 3], columns=['A', 'B'])

>>> df
    A   B
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN

>>> df.dtypes
A    float64
B    float64
dtype: object
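
As a quick check that a frame built this way works with interpolate(), which is the use case from the question, here is a minimal sketch. The end-point values 0.0 and 4.0 and the five-row index are made up for illustration:

```python
import numpy as np
import pandas as pd

# Build a NaN-filled float64 frame by passing the scalar as the first argument.
df = pd.DataFrame(np.nan, index=range(5), columns=['A'])

# Illustrative values only: set the two end points.
df.loc[0, 'A'] = 0.0
df.loc[4, 'A'] = 4.0

# Linear interpolation (the default method) fills the interior NaNs.
filled = df.interpolate()
print(filled['A'].tolist())  # [0.0, 1.0, 2.0, 3.0, 4.0]
```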

Solution 2

You could specify the dtype directly when constructing the DataFrame:

>>> df = pd.DataFrame(index=range(0,4),columns=['A'], dtype='float')
>>> df.dtypes
A    float64
dtype: object

Specifying the dtype makes pandas create the DataFrame with that type instead of inferring it.
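
To see the effect of the dtype argument, the two constructions can be compared side by side. This is only a sketch; the names without_dtype and with_dtype are illustrative, and the dtype printed for the first frame may depend on the pandas version (the question reports object):

```python
import pandas as pd

# Same index and columns, with and without an explicit dtype.
without_dtype = pd.DataFrame(index=range(0, 4), columns=['A'])
with_dtype = pd.DataFrame(index=range(0, 4), columns=['A'], dtype='float')

# Print the column dtypes of each frame for comparison.
print(without_dtype.dtypes)
print(with_dtype.dtypes)
```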

Solution 3

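Pass np.nan as the fill value and build the index with np.arange, replacing <num_rows> with the number of rows you need: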

 pd.DataFrame(np.nan, index = np.arange(<num_rows>), columns = ['A'])

Solution 4

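For multiple columns, build a NaN-filled array and pass it to the constructor, where nrow and ncol are the desired numbers of rows and columns: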

df = pd.DataFrame(np.zeros([nrow, ncol])*np.nan)
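
A possible alternative, not part of the original answer, is np.full, which creates the NaN-filled array directly instead of multiplying zeros by NaN. A minimal sketch, with nrow and ncol set to arbitrary illustrative sizes:

```python
import numpy as np
import pandas as pd

# Illustrative sizes; substitute your own.
nrow, ncol = 3, 2

# np.full builds an array of the given shape filled with NaN (float64).
df = pd.DataFrame(np.full((nrow, ncol), np.nan))
print(df.dtypes)
```

Both forms should give the same float64 frame; np.full just states the intent more directly.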

Solution 5

You can try this line of code:

pdDataFrame = pd.DataFrame([np.nan] * 7)

This creates a single-column DataFrame with 7 rows of NaN of type float. Printing pdDataFrame gives:

    0
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN

Also the output for pdDataFrame.dtypes is:

0    float64
dtype: object
Author by

Admin

Updated on July 08, 2022

Comments

  • Admin
    Admin almost 2 years

    I want to create a Pandas DataFrame filled with NaNs. During my research I found an answer:

    import pandas as pd
    
    df = pd.DataFrame(index=range(0,4),columns=['A'])
    

    This code results in a DataFrame filled with NaNs of dtype "object", so it cannot be used later with, for example, the interpolate() method. Therefore, I created the DataFrame with this more complicated code (inspired by this answer):

    import pandas as pd
    import numpy as np
    
    dummyarray = np.empty((4,1))
    dummyarray[:] = np.nan
    
    df = pd.DataFrame(dummyarray)
    

    This results in a DataFrame filled with NaN of type "float", so it can be used later on with interpolate(). Is there a more elegant way to create the same result?

  • Bill
    Bill over 7 years
    Works for pd.Series too. Excellent!
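
The Series case mentioned in the last comment can be sketched the same way, passing np.nan as the scalar data together with an index. The four-element index is just an example:

```python
import numpy as np
import pandas as pd

# A scalar plus an index broadcasts the value to every position.
s = pd.Series(np.nan, index=[0, 1, 2, 3])
print(s.dtype)  # float64
```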