groupby - TypeError 'DataFrame' object is not callable


The error is caused by the duplicated 'DateAdded' column: the frame contains two columns with that name. Rename one of them and groupby works.
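A minimal sketch of how the duplicate could be spotted and renamed, written for a current pandas rather than the 0.12.0 from the question. The small `reddf` built here is a made-up stand-in for the frame read from SQL, and `dedupe_columns` is a hypothetical helper, not a pandas function.

```python
import pandas as pd

# Hypothetical stand-in for the frame read from SQL; only the repeated
# 'DateAdded' label mirrors the question, the values are invented.
reddf = pd.DataFrame(
    [[1, "2013-01-01", 10, "2013-02-01"],
     [2, "2013-01-02", 10, "2013-02-02"],
     [3, "2013-01-03", 20, "2013-02-03"]],
    columns=["AssetId", "DateAdded", "ModelId", "DateAdded"],
)

# Column labels that occur more than once.
dupes = reddf.columns[reddf.columns.duplicated()].unique().tolist()
print(dupes)


def dedupe_columns(columns):
    """Hypothetical helper: suffix repeated labels as name_1, name_2, ..."""
    seen = {}
    result = []
    for name in columns:
        count = seen.get(name, 0)
        result.append(name if count == 0 else "{}_{}".format(name, count))
        seen[name] = count + 1
    return result


# Assign unique labels, then group as usual.
reddf.columns = dedupe_columns(reddf.columns)
group_sizes = reddf.groupby("ModelId").size()
print(group_sizes)
```

The helper does not check whether a suffixed name such as DateAdded_1 already exists in the frame. Aliasing one of the columns in the SQL query itself would avoid the duplicate before it reaches pandas.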

Updated on August 05, 2022

Comments

  • Admin
    Admin almost 2 years

    Newbie here. My first foray seemed OK, but this is only my second use of pandas. Using pandas 0.12.0 on Windows 7, I read two DataFrames from SQL. One works with groupby as expected, so I'm sure my problem isn't syntax. On the other, type(reddf) returns pandas.core.frame.DataFrame, but when I try reddf.groupby('any column') I get the following (last few lines of the traceback):

        c:\python27\lib\site-packages\pandas\core\groupby.pyc in __init__(self, index, grouper,     name, level, sort)
       1197             # no level passed
       1198             if not isinstance(self.grouper, np.ndarray):
    -> 1199                 self.grouper = self.index.map(self.grouper)
       1200                 if not (hasattr(self.grouper,"__len__") and \
       1201                    len(self.grouper) == len(self.index)):
    
    c:\python27\lib\site-packages\pandas\algos.pyd in pandas.algos.arrmap_int64 (pandas\algos.c:62839)()
    

    TypeError: 'DataFrame' object is not callable

    I know groupby is OK, and the column exists, so there's some other constraint / condition on the dataframe that I'm just not aware of or blew past. So what could cause this error? And what should I do? What should I look for in the future?

    Info requested:

    print type(reddf.index)
    <class 'pandas.core.index.Int64Index'>
    
    print repr(reddf.index) 
    Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], dtype=int64)
    
    print type(reddf.index.map)
    <type 'instancemethod'>
    
    print repr(reddf.index.map)
    <bound method Int64Index.map of Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], dtype=int64)>
    
    Just in case, reddf gives:
    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 20 entries, 0 to 19
    Data columns (total 24 columns):
    AssetId                  20  non-null values
    DateAdded                20  non-null values
    ModelId                  20  non-null values
    UsageTypeId              20  non-null values
    DateAdded                20  non-null values
    Name                     20  non-null values
    NatureId                 20  non-null values
    IsContainer              20  non-null values
    SparePartNumber          8  non-null values
    ProductNumber            19  non-null values
    SupportCategoryOid       20  non-null values
    SerialNumber             20  non-null values
    IpAddress                20  non-null values
    Description              20  non-null values
    CustomsId                15  non-null values
    AssetTag                 20  non-null values
    ParentId                 5  non-null values
    ManagementProcessorId    7  non-null values
    OperatingSystem          20  non-null values
    OsVersion                20  non-null values
    SystemName               20  non-null values
    LocationId               10  non-null values
    RomVersion               20  non-null values
    MacAddress               19  non-null values
    dtypes: bool(1), datetime64[ns](2), float64(3), int64(5), object(13)
    

    In particular, I get the error when calling reddf.groupby('ModelId'). Thanks.

    Thanks to everyone. The duplicate field name caused the issue; I can't believe I did not notice it before the last comment.

    Now, I don't understand how the .index output ruled out other problems. Could you elaborate? If the index were missing, shouldn't groupby still have been able to function properly, and if not, why not? I'm just looking for a short explanation; if you point to code, that's fine. I appreciate the help, guys.