Pandas Lambda Function : attribute error 'occurred at index 0'


The error occurs because the values in dCount are str, not datetime. Either convert the column using to_datetime:

DfT_raw['dCount'] = pd.to_datetime(DfT_raw['dCount'])

or, better, tell read_csv to parse that column as datetime when loading the file:

DfT_raw = pd.read_csv('./file.csv', parse_dates=['dCount'], index_col=False)

Afterwards, you can get just the date from the whole column through the dt.date accessor, with no apply or lambda needed.
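As a rough sketch of the whole fix, the snippet below loads a small in-memory CSV, converts dCount and then pulls out the date. The sample data, the io.StringIO stand-in for './file.csv' and the explicit format string are assumptions based on the values shown in the question (day first, two-digit year), so adjust them to the real file.

```python
import io

import pandas as pd

# Hypothetical two-column sample mimicking the dCount values in the question.
csv_text = """Road,dCount
A516,14/04/00 00:00
A516,15/04/00 00:00
"""

df = pd.read_csv(io.StringIO(csv_text))

# Straight after loading, dCount holds plain strings, which is why .date() fails.
is_str_before = isinstance(df["dCount"].iloc[0], str)

# Convert with an explicit day-first format (assumed from the sample values).
df["dCount"] = pd.to_datetime(df["dCount"], format="%d/%m/%y %H:%M")

# The .dt accessor operates on the whole column at once.
df["date"] = df["dCount"].dt.date

print(df)
```

Passing an explicit format avoids any day/month guessing. If you prefer parse_dates in read_csv, it may be worth checking the resulting dtype with df.dtypes before relying on the dt accessor.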

Author by

LearningSlowly

PhD Student Civil engineer now lost in the world of computers.

Updated on June 04, 2022

Comments

  • LearningSlowly
    LearningSlowly almost 2 years

    I am using Pandas to create a new column in a data frame created from a csv.

    [in] DfT_raw = pd.read_csv('./file.csv', index_col = False)
    [in] print(DfT_raw)
    
    [out]            Region Name dCount ONS    CP  S Ref E  S Ref N   Road  \
    0        East Midlands  E06000015      14/04/00 00:00  37288   434400   336000   A516   
    1        East Midlands  E06000015       14/04/00 00:00  37288   434400   336000   A516   
    2        East Midlands  E06000015       14/04/00 00:00  37288   434400   336000   A516   
    3        East Midlands  E06000015       14/04/00 00:00  37288   434400   336000   A516   
    

    I define a function to strip the time from the datetime field (dCount) and then create a new column 'date'.

    [in] def date_convert(dCount):
             return dCount.date()
    
         DfT_raw['date'] = DfT_raw.apply(lambda row: date_convert(row['dCount']), axis=1)
    
    [out] AttributeError: ("'str' object has no attribute 'date'", u'occurred at index 0')
    

    There is some issue with the index_col. I previously used index_col = 1 but got the same error.

    When I print 'dCount' I get

    0          14/04/00 00:00
    1          14/04/00 00:00
    2          14/04/00 00:00
    3          14/04/00 00:00
    4          14/04/00 00:00
    

    The index column is causing the error. How do I ensure this isn't given to the function?