Re-order Pandas Series on weekday
You can use an ordered Categorical and then sort_index:
print(bc)
DAY_OF_WEEK a b
0 Sunday 0.7 0.5
1 Monday 0.4 0.1
2 Tuesday 0.3 0.2
3 Wednesday 0.4 0.1
4 Thursday 0.3 0.6
5 Friday 0.4 0.9
6 Saturday 0.3 0.2
7 Sunday 0.7 0.5
8 Monday 0.4 0.1
9 Tuesday 0.3 0.2
10 Wednesday 0.4 0.1
11 Thursday 0.3 0.6
12 Friday 0.4 0.9
13 Saturday 0.3 0.2
14 Sunday 0.7 0.5
15 Monday 0.4 0.1
16 Tuesday 0.3 0.2
17 Wednesday 0.4 0.1
18 Thursday 0.3 0.6
19 Friday 0.4 0.9
20 Saturday 0.3 0.2
bc['DAY_OF_WEEK'] = pd.Categorical(
    bc['DAY_OF_WEEK'],
    categories=['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'],
    ordered=True)
print(bc['DAY_OF_WEEK'])
0 Sunday
1 Monday
2 Tuesday
3 Wednesday
4 Thursday
5 Friday
6 Saturday
7 Sunday
8 Monday
9 Tuesday
10 Wednesday
11 Thursday
12 Friday
13 Saturday
14 Sunday
15 Monday
16 Tuesday
17 Wednesday
18 Thursday
19 Friday
20 Saturday
Name: DAY_OF_WEEK, dtype: category
Categories (7, object): [Monday < Tuesday < Wednesday < Thursday < Friday < Saturday < Sunday]
crashes_by_day = bc['DAY_OF_WEEK'].value_counts()
crashes_by_day = crashes_by_day.sort_index()
print(crashes_by_day)
Monday 3
Tuesday 3
Wednesday 3
Thursday 3
Friday 3
Saturday 3
Sunday 3
dtype: int64
crashes_by_day.plot(kind='bar')
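For anyone who wants to try the Categorical approach without the original CSV, here is a minimal self-contained sketch. The sample frame is a hypothetical stand-in for bc (three weeks of rows starting on a Sunday), and the days list is just a name introduced here for the weekday order.

```python
import pandas as pd

# Desired weekday order, Monday first (name introduced for this sketch).
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

# Hypothetical stand-in for the CSV data: three weeks, each starting on Sunday.
bc = pd.DataFrame({'DAY_OF_WEEK': (['Sunday'] + days[:-1]) * 3})

# Make the column an ordered categorical so sorting follows `days`, not the alphabet.
bc['DAY_OF_WEEK'] = pd.Categorical(bc['DAY_OF_WEEK'], categories=days, ordered=True)

# Count rows per day, then sort the resulting index by category order.
crashes_by_day = bc['DAY_OF_WEEK'].value_counts().sort_index()
print(crashes_by_day)
```

Calling crashes_by_day.plot(kind='bar') on the result should then draw the bars from Monday to Sunday.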
Another possible solution, without Categorical, is to sort by a mapping of day names to positions:
crashes_by_day = bc['DAY_OF_WEEK'].value_counts().reset_index()
crashes_by_day.columns = ['DAY_OF_WEEK', 'count']
print(crashes_by_day)
DAY_OF_WEEK count
0 Thursday 3
1 Wednesday 3
2 Friday 3
3 Tuesday 3
4 Monday 3
5 Saturday 3
6 Sunday 3
days = ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday', 'Sunday']
mapping = {day: i for i, day in enumerate(days)}
key = crashes_by_day['DAY_OF_WEEK'].map(mapping)
print(key)
0 3
1 2
2 4
3 1
4 0
5 5
6 6
Name: DAY_OF_WEEK, dtype: int64
crashes_by_day = crashes_by_day.iloc[key.argsort()].set_index('DAY_OF_WEEK')
print(crashes_by_day)
count
DAY_OF_WEEK
Monday 3
Tuesday 3
Wednesday 3
Thursday 3
Friday 3
Saturday 3
Sunday 3
crashes_by_day.plot(kind='bar')
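The mapping idea can also be written as a single sort, sketched below. This is a variation on the steps above, not the code from the answer: it assumes a pandas version whose sort_values accepts a key argument (1.1 or later), and the sample frame is again a hypothetical stand-in for bc.

```python
import pandas as pd

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
mapping = {day: i for i, day in enumerate(days)}

# Hypothetical stand-in for the CSV data: three weeks, each starting on Sunday.
bc = pd.DataFrame({'DAY_OF_WEEK': (['Sunday'] + days[:-1]) * 3})

# Same counting step as above; columns are named explicitly because the
# default names from reset_index differ between pandas versions.
crashes_by_day = bc['DAY_OF_WEEK'].value_counts().reset_index()
crashes_by_day.columns = ['DAY_OF_WEEK', 'count']

# Sort rows by each day's position in `days` instead of by name or count.
crashes_by_day = (
    crashes_by_day
    .sort_values('DAY_OF_WEEK', key=lambda col: col.map(mapping))
    .set_index('DAY_OF_WEEK')
)
print(crashes_by_day)
```

On older pandas versions the iloc and argsort form shown above remains the way to go.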
Comments
-
jakc almost 2 years
Using Pandas, I have pulled in a CSV file and then created a series of the data to find out which days of the week have the most crashes:
crashes_by_day = bc['DAY_OF_WEEK'].value_counts()
I have then plotted this out, but of course it plots them in the same ranked order as the series.
crashes_by_day.plot(kind='bar')
What is the most efficient way to re-rank these to Mon, Tue, Wed, Thur, Fri, Sat, Sun?
Do I have to break it out into a list? Thanks.
-
jakc over 8 years
Went with the Ordered Categorical approach; it seems the most elegant. I'll do some reading up on that now. Thanks loads.
-
jezrael over 8 years
The Categorical solution is more elegant and faster.