Convert month,day,year to month,year with python/pandas?


Solution 1

Convert the column with to_datetime first, then call to_period:

# 'M' is the monthly frequency alias; the result is a column of Period values
df.col = pd.to_datetime(df.col).dt.to_period('M')
print(df)
       col
0  2009-10
1  2009-12
2  2009-04
3  2007-08
4  2008-07
5  2009-06
6  2009-01
7  2007-12
8  2009-09
9  2006-02
10 2009-03
11 2007-02

print (type(df.loc[0,'col']))
<class 'pandas._period.Period'>
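
As a self-contained sketch of the to_period approach, the snippet below uses a few sample rows from the question. The explicit format='%m/%d/%Y' and the extra column name period are assumptions added for illustration; they are not part of the original answer.

```python
import pandas as pd

# Sample rows copied from the question (month/day/year strings).
df = pd.DataFrame({'col': ['10/30/2009', '12/19/2009', '4/13/2009', '8/18/2007']})

# An explicit format avoids any month-first vs. day-first guessing.
dates = pd.to_datetime(df['col'], format='%m/%d/%Y')

# 'period' is a hypothetical column name used only in this sketch.
df['period'] = dates.dt.to_period('M')

# Period values sort chronologically and expose .month and .year.
earliest = df['period'].min()
print(df)
print(earliest, earliest.month, earliest.year)
```

Keeping Period values rather than strings is useful if the month/year column will later be sorted or grouped, since '10/2009' style strings do not sort in date order.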

Or use strftime, which returns plain strings with zero-padded months:

df.col = pd.to_datetime(df.col).dt.strftime('%m/%Y')
print (df)
        col
0   10/2009
1   12/2009
2   04/2009
3   08/2007
4   07/2008
5   06/2009
6   01/2009
7   12/2007
8   09/2009
9   02/2006
10  03/2009
11  02/2007

print (type(df.loc[0,'col']))
<class 'str'>
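
Note that '%m' zero-pads the month (04/2009), while the question asks for 4/2009. One possible way to get unpadded strings while still parsing real dates is to assemble them from the month and year components. This is a sketch under that assumption, with a hypothetical variable name unpadded:

```python
import pandas as pd

# Sample rows copied from the question.
col = pd.Series(['10/30/2009', '4/13/2009', '2/13/2006'])
dates = pd.to_datetime(col, format='%m/%d/%Y')

# Integer month and year converted to text, so no zero padding is applied.
unpadded = dates.dt.month.astype(str) + '/' + dates.dt.year.astype(str)
print(unpadded.tolist())
```

Unlike the purely string-based solutions, this raises an error on rows that are not valid dates, which may or may not be what you want for 9000 rows of input.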

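Or strip the day with a regex replacement:

# regex=True is required in pandas 2.0+, where str.replace matches literally by default
df.col = df.col.str.replace(r'/.+/', '/', regex=True)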
print (df)
        col
0   10/2009
1   12/2009
2    4/2009
3    8/2007
4    7/2008
5    6/2009
6    1/2009
7   12/2007
8    9/2009
9    2/2006
10   3/2009
11   2/2007

print (type(df.loc[0,'col']))
<class 'str'>

Solution 2

You can use str.split to build the strings:

In [32]:
df['date'] = df['date'].str.split('/').str[0] + '/' + df['date'].str.split('/').str[-1]
df

Out[32]:
       date
0   10/2009
1   12/2009
2    4/2009
3    8/2007
4    7/2008
5    6/2009
6    1/2009
7   12/2007
8    9/2009
9    2/2006
10   3/2009
11   2/2007
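
The line above splits every string twice. A possible variant, sketched here with a hypothetical intermediate name parts, splits once with expand=True and reuses the resulting columns:

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({'date': ['10/30/2009', '4/13/2009', '2/13/2006']})

# expand=True returns a DataFrame whose columns 0, 1, 2 hold month, day, year.
parts = df['date'].str.split('/', expand=True)

# Join month and year, dropping the day column.
df['date'] = parts[0] + '/' + parts[2]
print(df)
```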

Solution 3

Alternatively, if you prefer not to use pandas, a regular expression from the standard library does the job:

import re

res = re.sub(r"/\d\d?/", "/", s)

(Here s is the date string, either a single date or one long string containing all the dates, and the result is bound to res.)
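
To make both usages concrete, here is a minimal sketch that applies the same pattern to a list of separate date strings and to one long newline-joined string. The names DAY_PART, dates, and joined are illustrative assumptions, not part of the original answer.

```python
import re

# Matches the "/day/" part: a slash, one or two digits, a slash.
DAY_PART = re.compile(r"/\d\d?/")

# Sample rows copied from the question.
dates = ["10/30/2009", "4/13/2009", "2/13/2006"]

# Case 1: separate date strings, one substitution per string.
res = [DAY_PART.sub("/", s) for s in dates]

# Case 2: one long string containing all dates.
joined = "\n".join(dates)
res_joined = DAY_PART.sub("/", joined)

print(res)
print(res_joined)
```

Because the pattern only accepts one or two digits between the slashes, it cannot match across a four-digit year or a line break.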

Author by

Joan Triay

Updated on June 15, 2022

Comments

  • Joan Triay
    Joan Triay almost 2 years

    I have a list of strings with 9000 rows, where each row is a date in month/day/year format:

    10/30/2009
    12/19/2009
    4/13/2009
    8/18/2007
    7/17/2008
    6/16/2009
    1/14/2009
    12/18/2007
    9/14/2009
    2/13/2006
    3/25/2009
    2/23/2007
    

    I want to convert it so that the list contains only month/year, as a date format if possible, like this:

    10/2009
    12/2009
    4/2009
    8/2007
    7/2008
    6/2009
    1/2009
    12/2007
    9/2009
    2/2006
    3/2009
    2/2007