Save the output of a pandas groupby operation to CSV

Solution 1

What you're asking for doesn't quite make sense, though you may not realize it. groupby doesn't produce a table; it creates a staging area in which to perform aggregations or transformations across groups of data. For example, counting the number of observations in each group is an aggregation.
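To make that concrete, here is a minimal sketch using the df1 from the question. The names `grouped` and `counts` are only illustrative, and it assumes a reasonably recent pandas.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Score": ["A", "B", "C", "A", "B", "A"],
    "Class": ["Physics", "Science", "Chemistry", "Biology", "History", "English"],
})

# groupby on its own returns a GroupBy object, not a table of rows
grouped = df1.groupby("Score")
print(type(grouped).__name__)  # DataFrameGroupBy

# An aggregation turns the groups into an ordinary pandas object:
# here, the number of observations per score
counts = grouped.size()
print(counts)
```

Only the aggregated result (`counts` here) is something you can hand to to_csv.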

Because you expected to write the result out as a table, I'm going to guess that you thought groupby actually rearranges the rows so that each group sits together. That isn't a bad interpretation of the term if you have never seen it used before, even though it is incorrect. To get rows arranged that way, sort with the sort_values method.

df1.sort_values('Score')

       Class Score
0    Physics     A
3    Biology     A
5    English     A
1    Science     B
4    History     B
2  Chemistry     C

If Score were something else that wasn't already ordered lexicographically, we could use the categorical type to handle it for us.

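# astype('category', categories=...) no longer works in current pandas; pass a CategoricalDtype instead
score = df1.Score.astype(pd.CategoricalDtype(categories=list('ABCDF'), ordered=True))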
df1.assign(Score=score).sort_values('Score')

       Class Score
0    Physics     A
3    Biology     A
5    English     A
1    Science     B
4    History     B
2  Chemistry     C
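With letter grades the categorical sort and the plain sort happen to agree, so the benefit is easier to see with labels whose alphabetical order is wrong. The sketch below uses a hypothetical `Level` column (not part of the original question) and assumes a pandas version that provides `pd.CategoricalDtype`.

```python
import pandas as pd

# Hypothetical data: as plain strings these labels sort High < Low < Medium
df = pd.DataFrame({
    "Class": ["Physics", "Science", "Chemistry", "Biology"],
    "Level": ["Medium", "High", "Low", "Medium"],
})

# Declare the intended order explicitly
level_type = pd.CategoricalDtype(categories=["Low", "Medium", "High"], ordered=True)

# mergesort is stable, so rows with equal levels keep their original order
by_level = (
    df.assign(Level=df["Level"].astype(level_type))
      .sort_values("Level", kind="mergesort")
)
print(by_level)
```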

Finally, you can write the data to a file as you expected:

df1.sort_values('Score').to_csv("Score.txt", sep="\t")
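If the goal was instead one file per score, a common pattern is to iterate over the groups and write each one separately. This is only a sketch under that reading of the question; the temporary directory and the `Score_<letter>.txt` file names are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

df1 = pd.DataFrame({
    "Score": ["A", "B", "C", "A", "B", "A"],
    "Class": ["Physics", "Science", "Chemistry", "Biology", "History", "English"],
})

# Hypothetical output location; replace with a real directory as needed
out_dir = Path(tempfile.mkdtemp())

written = []
# Iterating over a GroupBy yields (key, sub-DataFrame) pairs
for score, group in df1.groupby("Score"):
    path = out_dir / f"Score_{score}.txt"
    group.to_csv(path, sep="\t", index=False)
    written.append(path.name)

print(written)
```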

Solution 2

You need to say what you want from the groupby: counts, means, or something else.

df1.groupby("Score").count().to_csv('d.csv')
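As a slightly fuller sketch of the same idea, the example below computes two aggregations per score and writes them out. The column names `n` and `classes` are made up for illustration, and an in-memory buffer stands in for a real file so the result can be inspected directly.

```python
import io

import pandas as pd

df1 = pd.DataFrame({
    "Score": ["A", "B", "C", "A", "B", "A"],
    "Class": ["Physics", "Science", "Chemistry", "Biology", "History", "English"],
})

# Aggregate first; the result is an ordinary DataFrame, which has to_csv
summary = df1.groupby("Score").agg(
    n=("Class", "count"),                           # rows per score
    classes=("Class", lambda s: "; ".join(s)),      # class names per score
)

buffer = io.StringIO()   # pass a file path here instead to write to disk
summary.to_csv(buffer)
csv_text = buffer.getvalue()
print(csv_text)
```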

Solution 3

Here is a solution that I think is close to what you want.

df1 = df1.reset_index()
# each (Score, index) group holds a single row, so sum() just returns that row's Class
df1 = df1.groupby(['Score', 'index']).Class.sum().to_frame()
df1

Out[102]: 
                 Class
Score index           
A     0        Physics
      3        Biology
      5        English
B     1        Science
      4        History
C     2      Chemistry
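The same nested layout can probably be reached without groupby at all, by moving both columns into the index and sorting. This is a sketch of that alternative, not the answer's own method; the name `nested` is illustrative and the buffer stands in for an output file.

```python
import io

import pandas as pd

df1 = pd.DataFrame({
    "Score": ["A", "B", "C", "A", "B", "A"],
    "Class": ["Physics", "Science", "Chemistry", "Biology", "History", "English"],
})

# reset_index() copies the row labels into a column named "index";
# indexing by (Score, index) and sorting places equal scores together
nested = df1.reset_index().set_index(["Score", "index"]).sort_index()
print(nested)

# The result is a normal DataFrame, so it can be written out directly
buffer = io.StringIO()
nested.to_csv(buffer, sep="\t")
```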

Author: Tom_Hanks

Updated on June 04, 2022

Comments

  • Tom_Hanks
    Tom_Hanks almost 2 years

    I would like to ask a question about Pandas groupby. I am using ipython notebook (python3).

    For example, there is a dataframe like this.

    df1 = pd.DataFrame({"Score": ["A", "B", "C", "A", "B", "A"],
                        "Class": ["Physics", "Science", "Chemistry", "Biology", "History", "English"]})
    

    Then, I want to group by Score.

    df1.groupby("Score")
    

    I need an output file of this, so I tried

    df1.groupby("Score").to_csv("Score.txt",sep="\t")
    

    but this does not work. Does anyone know how to write the output to a file?

    • cs95
      cs95 over 6 years
      If your question was answered, please accept an answer. Same for your other question too, thanks.
  • BENY
    BENY over 6 years
    Nice explanation ~ :-)
  • cs95
    cs95 over 6 years
    The categories are a nice touch but their effect will probably be lost on OP... :(
  • piRSquared
    piRSquared over 6 years
    I'm trying to think of future readers as well. As a matter of fact, I'm thinking of going back and rewriting a ton of my answers. I have 3,800. There's a lot of opportunity to improve them, whether because the explanations could be better or because they are simply outdated.