Understanding math errors in pandas DataFrames

17,165

Solution 1

Functions in the math module, such as math.radians, expect a single numeric value such as a float, not a sequence such as a pandas.Series.

Instead, you could use numpy.radians, which accepts an array-like input such as a Series and operates on it element-wise:

In [95]: np.radians(frame['lat'])
Out[95]: 
0    1.071370
1    0.572451
2    0.610015
3    0.248565
4    0.589419
Name: lat, dtype: float64
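
Since the goal in the question is a new column, the result of np.radians can be assigned straight back to the frame. A minimal sketch using the data from the question; the column names lat_rad and lng_rad are illustrative and not part of the original post:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({
    'loc': ['1', '2', '3', '4', '5'],
    'lat': [61.3850, 32.7990, 34.9513, 14.2417, 33.7712],
    'lng': [-152.2683, -86.8073, -92.3809, -170.7197, -111.3877],
})

# np.radians works element-wise and returns a Series with the same index.
# 'lat_rad' and 'lng_rad' are hypothetical column names chosen for this example.
frame['lat_rad'] = np.radians(frame['lat'])
frame['lng_rad'] = np.radians(frame['lng'])
```

The original lat and lng columns are left untouched, so the degree values remain available.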

Only a Series of length 1 can be converted to a float (and recent pandas releases warn that even this is deprecated in favor of float(ser.iloc[0])). So while this works,

In [103]: math.radians(pd.Series([1]))
Out[103]: 0.017453292519943295

in general it does not:

In [104]: math.radians(pd.Series([1,2]))
TypeError: cannot convert the series to <type 'float'>

The reason is that math.radians calls float on its argument. Note that you get the same error when calling float directly on pd.Series([1,2]):

In [105]: float(pd.Series([1,2]))
TypeError: cannot convert the series to <type 'float'>
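
A function from math can still be used on one value at a time by pulling a scalar out of the Series first. The sketch below contrasts the two cases; the variable names are made up for the example, and on Python 3 the error message reads <class 'float'> rather than <type 'float'>:

```python
import math
import pandas as pd

lat = pd.Series([61.3850, 32.7990, 34.9513])

# A single element is a plain number, so math.radians accepts it.
first_rad = math.radians(lat.iloc[0])

# The whole Series cannot be collapsed to one float, so this raises TypeError.
try:
    math.radians(lat)
    raised = False
except TypeError:
    raised = True
```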

Solution 2

I had a similar issue with a custom function. The solution was to use the Series.apply method, which calls the function on each element:

def monthdiff(x):
    z = (int(x/100) * 12) + (x - int(x/100) * 100)
    return z

series['age'].apply(monthdiff)

This applies my simple (yet beautiful) calculation to every row. Note that apply returns a new Series, so the result has to be assigned back to the data frame to get a new column.
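
For simple arithmetic like this, the same calculation can also be written with vectorized operators, which avoids a Python-level function call per row. This is a sketch under the assumption that the values are non-negative integers encoding years and months as YYMM (for example, 205 for 2 years and 5 months); the frame df and the column name age_months are hypothetical:

```python
import pandas as pd

def monthdiff(x):
    # Years part times 12, plus the remaining months part.
    z = (int(x / 100) * 12) + (x - int(x / 100) * 100)
    return z

# Hypothetical data: ages encoded as YYMM integers.
df = pd.DataFrame({'age': [205, 1011, 7]})

# Element-wise with apply; assign the result to store it as a column.
df['age_months'] = df['age'].apply(monthdiff)

# Vectorized equivalent for non-negative integers:
# // and % operate on the whole Series at once.
vectorized = (df['age'] // 100) * 12 + df['age'] % 100
```

Both forms give the same values here. For negative inputs, int(x / 100) truncates toward zero while // rounds down, so the two would differ.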

Author: user3654387

Updated on July 26, 2022

Comments

  • user3654387
    user3654387 almost 2 years

    I'm trying to generate a new column in a pandas dataframe from other columns and am getting some math errors that I don't understand. Here is a snapshot of the problem and some simplifying diagnostics...

    I can generate a data frame that looks pretty good:

    import pandas
    import math as m
    
    data = {'loc':['1','2','3','4','5'],
            'lat':[61.3850,32.7990,34.9513,14.2417,33.7712],
            'lng':[-152.2683,-86.8073,-92.3809,-170.7197,-111.3877]}
    frame = pandas.DataFrame(data)
    
    frame
    
    Out[15]:
           lat        lng  loc
    0  61.3850  -152.2683    1
    1  32.7990   -86.8073    2
    2  34.9513   -92.3809    3
    3  14.2417  -170.7197    4
    4  33.7712  -111.3877    5
    5 rows × 3 columns
    

    I can do simple math (i.e. degrees to radians):

    In [32]:
    m.pi*frame.lat/180.
    
    Out[32]:
    0    1.071370
    1    0.572451
    2    0.610015
    3    0.248565
    4    0.589419
    Name: lat, dtype: float64
    

    But I can't convert from degrees to radians using the Python math library:

    In [33]:
    m.radians(frame.lat)
    
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-33-99a986252f80> in <module>()
    ----> 1 m.radians(frame.lat)
    
    /Users/user/anaconda/lib/python2.7/site-packages/pandas/core/series.pyc in wrapper(self)
         72             return converter(self.iloc[0])
         73         raise TypeError(
    ---> 74             "cannot convert the series to {0}".format(str(converter)))
         75     return wrapper
         76 
    
    TypeError: cannot convert the series to <type 'float'>
    

    And I can't even convert the values to floats to try to force it to work:

    In [34]:
    
    float(frame.lat)
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-34-3311aee92f31> in <module>()
    ----> 1 float(frame.lat)
    
    /Users/user/anaconda/lib/python2.7/site-packages/pandas/core/series.pyc in wrapper(self)
         72             return converter(self.iloc[0])
         73         raise TypeError(
    ---> 74             "cannot convert the series to {0}".format(str(converter)))
         75     return wrapper
         76 
    
    TypeError: cannot convert the series to <type 'float'>
    

    I'm sure there must be a simple explanation and would appreciate your help in finding it. Thanks!