How to read the contents of a csv file into a class with each csv row as a class instance

Solution 1

You can try this:

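import csv

class City:
    def __init__(self, row, header):
        # Turn each header name into an attribute holding that row's value.
        self.__dict__ = dict(zip(header, row))

# The csv module recommends opening files with newline=''.
with open('file.csv', newline='') as f:
    data = list(csv.reader(f))

# data[0] is the header row; every later row becomes one City instance.
instances = [City(i, data[0]) for i in data[1:]]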

However, since you mentioned that there are many rows, you may want to give each city an id that serves as its string representation when the list is printed:

import csv

class City:
    def __init__(self, row, header, the_id):
        self.__dict__ = dict(zip(header, row))
        self.the_id = the_id

    def __repr__(self):
        # Show the short id instead of the default object representation.
        return self.the_id

with open('file.csv', newline='') as f:
    data = list(csv.reader(f))

instances = [City(a, data[0], "city_{}".format(i+1)) for i, a in enumerate(data[1:])]

Your output will be a listing like this:

[city_1, city_2, city_3...]

Any attribute can then be accessed by its header name. For example, the latitude of the second city:

instances[1].latitude
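
To try the pattern above without a file on disk, here is a minimal sketch that reads from an in-memory string instead. The use of io.StringIO, the shortened header, and the second data row (a made-up placeholder) are assumptions for illustration; the City class is the one shown above.

```python
import csv
import io

class City:
    def __init__(self, row, header, the_id):
        self.__dict__ = dict(zip(header, row))
        self.the_id = the_id

    def __repr__(self):
        return self.the_id

# Stand-in for file.csv: the Tokyo row comes from the question,
# the second row is a made-up placeholder.
SAMPLE = """id,latitude,longitude,city,label,yr1970,yr1975
1,35.6832085,139.8089447,Tokyo,Tokyo,23.3,26.61
2,0.0,0.0,Example City,Example,1.0,2.0
"""

data = list(csv.reader(io.StringIO(SAMPLE)))
instances = [City(a, data[0], "city_{}".format(i + 1))
             for i, a in enumerate(data[1:])]

print(instances)              # [city_1, city_2]
print(instances[0].latitude)  # 35.6832085 (still a string)
print(instances[1].city)      # Example City
```

Note that csv.reader returns every value as a string, so numeric columns need an explicit float() conversion before doing arithmetic.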

Regarding your recent comment, to access city attributes by city name, you can restructure instances as a dictionary keyed on the city column (index 3 in each row):

instances = {a[3]:City(a, data[0], "city_{}".format(i+1)) for i, a in enumerate(data[1:])}
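
A minimal sketch of the dictionary version follows, assuming the name column is called city as in the question's header. Looking the index up with header.index("city") instead of hard-coding 3 is a variation of mine, and the in-memory sample stands in for the real file.

```python
import csv
import io

class City:
    def __init__(self, row, header, the_id):
        self.__dict__ = dict(zip(header, row))
        self.the_id = the_id

    def __repr__(self):
        return self.the_id

# Stand-in for file.csv (shortened header, one row from the question).
SAMPLE = """id,latitude,longitude,city,label,yr1970
1,35.6832085,139.8089447,Tokyo,Tokyo,23.3
"""

data = list(csv.reader(io.StringIO(SAMPLE)))
header, rows = data[0], data[1:]
name_col = header.index("city")  # position of the city column

# Map each city name to its City instance.
by_name = {row[name_col]: City(row, header, "city_{}".format(i + 1))
           for i, row in enumerate(rows)}

print(by_name["Tokyo"].longitude)  # 139.8089447
print(by_name.get("Atlantis"))     # None
```

If two rows share the same city name, the later row overwrites the earlier one in the dictionary.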

Solution 2

Some tips to help you clean up your code:

  1. Instead of this:

    self.yr1970
    

    Define a dictionary to keep track of years and their values:

    tokyo_years = {
        1970: 23.3,
        1975: 26.61,
        # ...
    }
    

    Now pair this structure with each city:

    cities = [
        { 'city': 'Tokyo',     'years': tokyo_years },
        { 'city': 'Vancouver', 'years': vancouver_years },
        # ...
    ]
    
  2. Don't nest so deeply. Also, the following is really weird:

    for row in cityList:
        if row != 'label':
            for row in cityList:
    

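    You are looping over cityList and then looping over it again inside the first loop, reusing the name row. Because csv.reader is an iterator, the inner loop consumes all the remaining rows, so the outer loop body runs only once.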

  3. Classes belong at top-level. That means there should be 0 spaces preceding class.

             class City:
    

    should be:

    class City:
    

I mention all this because trying to do anything further with messy code just results in messier code. :) Try to improve your current code by:

  1. Using data structures (lists, dictionaries).
  2. Restricting levels of nested code to 2 max. (Consider using functions to help you with this.)
  3. Putting classes at top-level.
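
One way the three tips could fit together is sketched below. The function name load_cities, the attribute names, the layout of the years dictionary, and the in-memory sample are all assumptions for illustration, not the only reasonable design.

```python
import csv
import io

class City:  # tip 3: defined at top level
    def __init__(self, name, label, lat, lon, years):
        self.name = name
        self.label = label
        self.lat = lat
        self.lon = lon
        self.years = years  # tip 1: {1970: 23.3, ...} instead of yr1970, yr1975, ...

def load_cities(f):
    """Build one City per data row; tip 2: a single loop, no deep nesting."""
    reader = csv.reader(f)
    header = next(reader)
    # Columns from index 5 on are named like 'yr1970'; strip the 'yr' prefix.
    year_keys = [int(col[2:]) for col in header[5:]]
    cities = []
    for row in reader:
        years = dict(zip(year_keys, map(float, row[5:])))
        cities.append(City(row[3], row[4], float(row[1]), float(row[2]), years))
    return cities

# Stand-in for the real file (shortened header, one row from the question).
SAMPLE = """id,latitude,longitude,city,label,yr1970,yr1975
1,35.6832085,139.8089447,Tokyo,Tokyo,23.3,26.61
"""

cities = load_cities(io.StringIO(SAMPLE))
print(cities[0].name, cities[0].years[1975])  # Tokyo 26.61
```

Because load_cities accepts any file-like object, the same function works with open('filepath', newline='') in place of io.StringIO.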

Solution 3

If your data is just an immutable record, use namedtuple:

>>> from collections import namedtuple

>>> City = namedtuple('City', 'lat lon cityName label '
...                   'yr1970 yr1975 yr1980 yr1985 yr1990 yr1995 yr2000 yr2005 yr2010')

You can slice the row as you don't need the first value, and unpack it using *:

>>> row = ['1', '35.6832085', '139.8089447', 'Tokyo', 'Tokyo',
...        '23.3', '26.61', '28.55', '30.3', '32.53', '33.59', '34.45', '35.62', '35.7']

>>> city = City(*row[1:])

>>> city
City(lat='35.6832085', lon='139.8089447', cityName='Tokyo', label='Tokyo',
     yr1970='23.3', yr1975='26.61', yr1980='28.55', yr1985='30.3', yr1990='32.53',
     yr1995='33.59', yr2000='34.45', yr2005='35.62', yr2010='35.7')

You need to add just this object to your list of cities, not every attribute:

>>> cities.append(city)

Putting it together with a list comprehension, skipping the header row first:

import csv
from collections import namedtuple

City = namedtuple('City',
                  'lat lon cityName label '
                  'yr1970 yr1975 yr1980 yr1985 yr1990 yr1995 yr2000 yr2005 yr2010')

with open('filepath', newline='') as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row (id, latitude, longitude, ...)
    cities = [City(*row[1:]) for row in reader]
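
Every field read this way is still a string. If numeric values are needed, one option is to convert them while building each tuple. The sketch below assumes the column layout used in this answer; parse_city is a hypothetical helper, and the in-memory sample stands in for the real file.

```python
import csv
import io
from collections import namedtuple

City = namedtuple('City',
                  'lat lon cityName label '
                  'yr1970 yr1975 yr1980 yr1985 yr1990 yr1995 yr2000 yr2005 yr2010')

def parse_city(row):
    """Hypothetical helper: drop the id column and convert numeric fields."""
    _id, lat, lon, name, label, *years = row
    return City(float(lat), float(lon), name, label, *map(float, years))

# Stand-in for the real file, using the Tokyo row from this answer.
SAMPLE = """id,latitude,longitude,city,label,yr1970,yr1975,yr1980,yr1985,yr1990,yr1995,yr2000,yr2005,yr2010
1,35.6832085,139.8089447,Tokyo,Tokyo,23.3,26.61,28.55,30.3,32.53,33.59,34.45,35.62,35.7
"""

reader = csv.reader(io.StringIO(SAMPLE))
next(reader)  # skip the header row
cities = [parse_city(row) for row in reader]

print(cities[0].cityName, cities[0].yr2010)  # Tokyo 35.7
print(cities[0].yr1975 - cities[0].yr1970)   # change between 1970 and 1975
```

The star-unpacking in parse_city collects every column after label into years, so the helper raises a TypeError if a row has more or fewer year columns than the namedtuple defines.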
Author by

Willard A.

Updated on June 18, 2022

Comments

  • Willard A. almost 2 years ago

    I'm a Python newbie, and I've been struggling with a class assignment for days. I have a csv file that contains data as such:

    id,latitude,longitude,city,label,yr1970,yr1975,yr1980,yr1985,yr1990,yr1995,yr2000,yr2005
    1,35.6832085,139.8089447,Tokyo,Tokyo,23.3,26.61,28.55,30.3,32.53,33.59,34.45,35.62
    

    There are about 40 rows in this file, each containing data related to a world city. As you can see, the top row is the header. I am supposed to create a class in Python and read the csv file into the class, where every row becomes an instance of the class. I am then to store the class instances in a list. I've been able to create one instance where all of the data is stored, but I can't seem to create an instance for each row (and I obviously do not want to do it manually).

    Here's what I've got so far:

    import csv
    Cities = []
    
    
    with open('filepath','rb') as f:
    cityList = csv.reader(f)
    for row in cityList:
        if row != 'label':
            for row in cityList:
                citysName = row[3]
    
    
                class City:
    
                    def __init__(self, cityName=row[3], Label=row[4], Lat=row[1],
                             Lon=row[2], yr1970=row[5], yr1975=row[6], yr1980=row[7],
                                 yr1985=row[8], yr1990=row[9], yr1995=row[10], yr2000=row[11],
                                 yr2005=row[12], yr2010=row[13]):
                        self.cityName = cityName
                        self.label = Label
                        self.lat = Lat
                        self.lon = Lon
                        self.yr1970 = yr1970
                        self.yr1975 = yr1975
                        self.yr1980 = yr1980
                        self.yr1985 = yr1985
                        self.yr1990 = yr1990
                        self.yr1995 = yr1995
                        self.yr2000 = yr2000
                        self.yr2005 = yr2005
                        self.yr2010 = yr2010
    
                citysName = City()
    
                Cities.append(citysName.cityName)
                Cities.append(citysName.label)
                Cities.append(citysName.lat)
                Cities.append(citysName.lon)
                Cities.append(citysName.yr1970)
                Cities.append(citysName.yr1975)
                Cities.append(citysName.yr1980)
                Cities.append(citysName.yr1985)
                Cities.append(citysName.yr1990)
                Cities.append(citysName.yr1995)
                Cities.append(citysName.yr2000)
                Cities.append(citysName.yr2005)
                Cities.append(citysName.yr2010)
    
            print Cities
    

    Again, I'm quite new with Python (and coding in general), and I realize this code is not good, but I'm having a lot of difficulty finding tips for reading csv files into a Python class.