Re-order Pandas Series on weekday

10,780

You can use Ordered Categorical and then sort_index:

print bc
   DAY_OF_WEEK    a    b
0       Sunday  0.7  0.5
1       Monday  0.4  0.1
2      Tuesday  0.3  0.2
3    Wednesday  0.4  0.1
4     Thursday  0.3  0.6
5       Friday  0.4  0.9
6     Saturday  0.3  0.2
7       Sunday  0.7  0.5
8       Monday  0.4  0.1
9      Tuesday  0.3  0.2
10   Wednesday  0.4  0.1
11    Thursday  0.3  0.6
12      Friday  0.4  0.9
13    Saturday  0.3  0.2
14      Sunday  0.7  0.5
15      Monday  0.4  0.1
16     Tuesday  0.3  0.2
17   Wednesday  0.4  0.1
18    Thursday  0.3  0.6
19      Friday  0.4  0.9
20    Saturday  0.3  0.2
bc['DAY_OF_WEEK'] = pd.Categorical(bc['DAY_OF_WEEK'], categories=
    ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday', 'Sunday'],
    ordered=True)

print bc['DAY_OF_WEEK']
0        Sunday
1        Monday
2       Tuesday
3     Wednesday
4      Thursday
5        Friday
6      Saturday
7        Sunday
8        Monday
9       Tuesday
10    Wednesday
11     Thursday
12       Friday
13     Saturday
14       Sunday
15       Monday
16      Tuesday
17    Wednesday
18     Thursday
19       Friday
20     Saturday
Name: DAY_OF_WEEK, dtype: category
Categories (7, object): [Monday < Tuesday < Wednesday < Thursday < Friday < Saturday < Sunday]
crashes_by_day = bc['DAY_OF_WEEK'].value_counts()
crashes_by_day = crashes_by_day.sort_index()
print crashes_by_day
Monday       3
Tuesday      3
Wednesday    3
Thursday     3
Friday       3
Saturday     3
Sunday       3
dtype: int64

crashes_by_day.plot(kind='bar')

Next possible solution without Categorical is set sorting by mapping:

crashes_by_day = bc['DAY_OF_WEEK'].value_counts().reset_index()
crashes_by_day.columns = ['DAY_OF_WEEK', 'count']
print crashes_by_day
  DAY_OF_WEEK  count
0    Thursday      3
1   Wednesday      3
2      Friday      3
3     Tuesday      3
4      Monday      3
5    Saturday      3
6      Sunday      3

days = ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday', 'Sunday']
mapping = {day: i for i, day in enumerate(days)}
key = crashes_by_day['DAY_OF_WEEK'].map(mapping)
print key
0    3
1    2
2    4
3    1
4    0
5    5
6    6
Name: DAY_OF_WEEK, dtype: int64

crashes_by_day = crashes_by_day.iloc[key.argsort()].set_index('DAY_OF_WEEK')
print crashes_by_day
             count
DAY_OF_WEEK       
Monday           3
Tuesday          3
Wednesday        3
Thursday         3
Friday           3
Saturday         3
Sunday           3

crashes_by_day.plot(kind='bar')

graph

Share:
10,780
jakc
Author by

jakc

GIS Analyst

Updated on July 19, 2022

Comments

  • jakc
    jakc almost 2 years

    Using Pandas, I have pulled in a CSV file and then created a series of the data to find out which days of the week have the most crashes:

    crashes_by_day = bc['DAY_OF_WEEK'].value_counts()
    

    enter image description here

    I have then plotted this out, but of course it plots them in the same ranked order as the series.

    crashes_by_day.plot(kind='bar')
    

    enter image description here

    What is the most efficient way to re-rank these to Mon, Tue, Wed, Thur, Fri, Sat, Sun?

    Do I have to break it out into a list? Thanks.

  • jakc
    jakc over 8 years
    Went with the Ordered Categorical approach, seems the most elegant? Ill do some reading up on that now. Thanks loads.
  • jezrael
    jezrael over 8 years
    Categorical solution is more elegant and more faster.