sort csv by column

Solution 1

import csv
import operator

reader = csv.reader(open("files.csv"), delimiter=";")
sortedlist = sorted(reader, key=operator.itemgetter(3), reverse=True)

Or use a lambda:

sortedlist = sorted(reader, key=lambda row: row[3], reverse=True)
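Neither snippet skips a header row or writes the result anywhere. The following is a minimal self-contained sketch, assuming a semicolon-delimited file with a header line and an ISO-8601 date in the fourth column (index 3). The sample data and the names `sample`, `header`, and `out` are made up for illustration; in real use they would be open file objects.

```python
import csv
import io
import operator

# Hypothetical sample data standing in for an open file; the layout
# (semicolon-delimited, ISO-8601 date at column index 3) is an assumption.
sample = io.StringIO(
    "id;file;description;date\n"
    "1;a.txt;first;2003-04-22\n"
    "2;b.txt;second;2005-01-10\n"
    "3;c.txt;third;2004-07-01\n"
)

reader = csv.reader(sample, delimiter=";")
header = next(reader)  # pull the header off so it is not sorted with the data
sortedlist = sorted(reader, key=operator.itemgetter(3), reverse=True)

# Write the result to a separate in-memory "file"; the input is left untouched.
out = io.StringIO(newline="")
writer = csv.writer(out, delimiter=";")
writer.writerow(header)
writer.writerows(sortedlist)
print(out.getvalue())
```

Because ISO-8601 dates sort correctly as plain strings, no date parsing is needed in the key function.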

Solution 2

To sort by multiple columns (by column_1 first, then by column_2 to break ties):

import csv

with open('unsorted.csv', newline='') as csvfile:
    spamreader = csv.DictReader(csvfile, delimiter=";")
    # The key is a tuple, so rows are compared on column_1 and then on column_2
    sortedlist = sorted(spamreader, key=lambda row: (row['column_1'], row['column_2']), reverse=False)


with open('sorted.csv', 'w', newline='') as f:
    fieldnames = ['column_1', 'column_2', 'column_3']
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    for row in sortedlist:
        writer.writerow(row)
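One thing to watch with a tuple key: `csv.DictReader` returns every value as a string, so a numeric column is compared as text unless it is converted inside the key. A small sketch of the difference, using made-up in-memory data in which `column_2` is assumed to hold integers:

```python
import csv
import io

# Hypothetical data: column_2 holds numbers, which DictReader returns as strings.
sample = io.StringIO(
    "column_1;column_2\n"
    "b;10\n"
    "a;9\n"
    "a;10\n"
    "b;2\n"
)

rows = list(csv.DictReader(sample, delimiter=";"))

# Plain string comparison: "10" sorts before "9".
as_text = sorted(rows, key=lambda row: (row["column_1"], row["column_2"]))

# Converting inside the key gives numeric order for column_2.
as_numbers = sorted(rows, key=lambda row: (row["column_1"], int(row["column_2"])))

print([(r["column_1"], r["column_2"]) for r in as_text])
print([(r["column_1"], r["column_2"]) for r in as_numbers])
```

The conversion only affects the comparison. The rows themselves still contain the original strings, so they can be written back out unchanged.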

Solution 3

The reader acts like a generator. On a file with some fake data:

>>> import sys, csv
>>> data = csv.reader(open('data.csv'),delimiter=';')
>>> data
<_csv.reader object at 0x1004a11a0>
>>> next(data)
['a', ' b', ' c']
>>> next(data)
['x', ' y', ' z']
>>> next(data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Using operator.itemgetter as Ignacio suggests:

>>> data = csv.reader(open('data.csv'),delimiter=';')
>>> import operator
>>> sortedlist = sorted(data, key=operator.itemgetter(2), reverse=True)
>>> sortedlist
[['x', ' y', ' z'], ['a', ' b', ' c']]
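Note that the values above keep their leading spaces (' c', ' z'), so the sort compares strings that begin with a space. If that is not wanted, `csv.reader` accepts `skipinitialspace=True`, which drops whitespace immediately after the delimiter. A small sketch, with an in-memory `sample` (a made-up stand-in for data.csv):

```python
import csv
import io
import operator

# Hypothetical stand-in for data.csv, with a space after each delimiter.
sample = io.StringIO("a; b; c\nx; y; z\n")

# skipinitialspace=True strips the whitespace that follows each ';'
data = csv.reader(sample, delimiter=";", skipinitialspace=True)
sortedlist = sorted(data, key=operator.itemgetter(2), reverse=True)
print(sortedlist)
```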

Solution 4

To sort a CSV by column, I would use pandas, something like this:

import pandas
csvData = pandas.read_csv('myfile.csv')
csvData.sort_values(["date"], axis=0, ascending=[False], inplace=True)
print(csvData)
Author by

wishi

Updated on July 09, 2022

Comments

  • wishi
    wishi almost 2 years

    I want to sort a CSV table by date. Started out being a simple task:

    import sys
    import csv
    
    reader = csv.reader(open("files.csv"), delimiter=";")
    
    for id, path, title, date, author, platform, type, port in reader:
        print(date)
    

    I used Python's CSV module to read in a file with that structure:

    id;file;description;date;author;platform;type;port
    
    • The date is ISO-8601 (e.g. 2003-04-22), so I can sort it quite easily without parsing.
    • I want to sort the table by date, newest entries first.
    • How do I get this reader into a sortable data structure? I think with some effort I could make a datelist: datelist += date, split and sort. However, I would then have to re-identify the complete entry in the CSV table. It's not just sorting a list of things.
    • csv doesn't seem to have a built-in sorting function.

    The optimal solution would be to have a CSV client that handles the file like a database. I didn't find anything like that.

    I hope somebody knows some nice sorting magic here ;)

  • Jeff
    Jeff about 10 years
    Does this re-write the file, or just save the sorted list in the variable?
  • Ignacio Vazquez-Abrams
    Ignacio Vazquez-Abrams about 10 years
    @Jeff: It does not touch the original file. If you want to write out the results then you must do so as a separate operation.
  • abaumg
    abaumg almost 7 years
    @IgnacioVazquez-Abrams What is the difference between these two methods, what are they doing? Which one should one choose?
  • Ignacio Vazquez-Abrams
    Ignacio Vazquez-Abrams almost 7 years
    @abaumg: Functionally they are identical. There may be a small speed difference between them, but that probably won't matter unless there are millions of records in the file.
  • Foreever
    Foreever about 6 years
    Headers of csv considered here!!
  • gies0r
    gies0r over 4 years
    This is a very good, generic approach which also works if you load the data into a list of rows, each of which then holds a list of columns. Great - Thank you!