How to read the contents of a csv file into a class with each csv row as a class instance
Solution 1
You can try this:
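import csv

class City:
    def __init__(self, row, header):
        # Map each header name to the matching value in this row
        self.__dict__ = dict(zip(header, row))

with open('file.csv') as f:
    data = list(csv.reader(f))

# data[0] is the header row; every later row becomes one City instance
instances = [City(i, data[0]) for i in data[1:]]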
However, since you mentioned that there are many rows, you may want to give each city an id that serves as its string representation when the list is printed:
import csv

class City:
    def __init__(self, row, header, the_id):
        self.__dict__ = dict(zip(header, row))
        self.the_id = the_id

    def __repr__(self):
        # Shown when the instance appears inside a printed list
        return self.the_id

with open('file.csv') as f:
    data = list(csv.reader(f))

instances = [City(a, data[0], "city_{}".format(i+1)) for i, a in enumerate(data[1:])]
Your output will be a listing like this:
[city_1, city_2, city_3...]
Any attribute can then be accessed like so (this reads the latitude of the second city):
instances[1].latitude
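To try this without a file on disk, here is a minimal sketch that feeds csv.reader an in-memory sample. The SAMPLE string is an assumption for illustration: the header names and the Tokyo row follow the question, while the Vancouver year values are made up. Note that csv.reader yields every value as a string, so numeric fields need an explicit conversion.

```python
import csv
import io

# Hypothetical sample data; the Vancouver year values are made-up numbers.
SAMPLE = """id,latitude,longitude,city,label,yr1970,yr1975
1,35.6832085,139.8089447,Tokyo,Tokyo,23.3,26.61
2,49.2827,-123.1207,Vancouver,Vancouver,1.0,1.1
"""

class City:
    def __init__(self, row, header, the_id):
        self.__dict__ = dict(zip(header, row))
        self.the_id = the_id

    def __repr__(self):
        return self.the_id

data = list(csv.reader(io.StringIO(SAMPLE)))
instances = [City(row, data[0], "city_{}".format(i + 1))
             for i, row in enumerate(data[1:])]

print(instances)                     # the list is rendered through __repr__
print(instances[1].city)             # attribute of the second data row
print(float(instances[0].latitude))  # values stay strings until converted
```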
Regarding your recent comment: to access city attributes by city name, you can restructure instances as a dictionary keyed by the city column (index 3 in each row):
instances = {a[3]:City(a, data[0], "city_{}".format(i+1)) for i, a in enumerate(data[1:])}
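As a variation, the column position can be looked up from the header instead of hard-coding 3. This is only a sketch: the SAMPLE string and the names name_col and by_name are assumptions for illustration, and only the header names come from the question.

```python
import csv
import io

# Hypothetical sample; only the header names come from the question.
SAMPLE = """id,latitude,longitude,city,label
1,35.6832085,139.8089447,Tokyo,Tokyo
"""

class City:
    def __init__(self, row, header):
        self.__dict__ = dict(zip(header, row))

data = list(csv.reader(io.StringIO(SAMPLE)))
header = data[0]
name_col = header.index('city')  # position of the city column in each row

# A later row with the same city name would overwrite an earlier one.
by_name = {row[name_col]: City(row, header) for row in data[1:]}

print(by_name['Tokyo'].latitude)
```

A dictionary gives direct lookup by name, but it only works cleanly if city names are unique in the file.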
Solution 2
Some tips to help you clean up your code:
- Instead of this:
self.yr1970
Define a dictionary to keep track of years and their values:
tokyo_years = {
    1970: 23.3,
    1975: 26.61,
    # ...
}
Now pair this structure with each city:
cities = [
    {'city': 'Tokyo', 'years': tokyo_years},
    {'city': 'Vancouver', 'years': vancouver_years},
    # ...
]
- Don't nest so deeply. Also, the following is really weird:
for row in cityList:
    if row != 'label':
        for row in cityList:
You are looping over something and then looping over it again while you are still looping over it!
- Classes belong at top-level. That means there should be 0 spaces preceding class. This:
    class City:
should be:
class City:
I mention all this because trying to do anything further with messy code just results in messier code. :) Try to improve your current code by:
- Using data structures (lists, dictionaries).
- Restricting levels of nested code to 2 max. (Consider using functions to help you with this.)
- Putting classes at top-level.
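The three tips above could be combined roughly as follows. This is a sketch under stated assumptions, not the answerer's code: it swaps in csv.DictReader (from the standard library) so columns can be read by header name, the read_cities helper and the SAMPLE string are hypothetical, and the header names are taken from the question.

```python
import csv
import io

# Hypothetical sample; header names and the Tokyo row follow the question.
SAMPLE = """id,latitude,longitude,city,label,yr1970,yr1975
1,35.6832085,139.8089447,Tokyo,Tokyo,23.3,26.61
"""

class City:
    """Top-level class: one instance per CSV row."""
    def __init__(self, name, lat, lon, years):
        self.name = name
        self.lat = lat
        self.lon = lon
        self.years = years  # dictionary mapping year -> value

def read_cities(f):
    """Build a list of City objects from an open CSV file."""
    cities = []
    for row in csv.DictReader(f):
        # Collect every 'yrNNNN' column into one dictionary keyed by year
        years = {int(key[2:]): float(value)
                 for key, value in row.items() if key.startswith('yr')}
        cities.append(City(row['city'], float(row['latitude']),
                           float(row['longitude']), years))
    return cities

cities = read_cities(io.StringIO(SAMPLE))
print(cities[0].name, cities[0].years[1975])
```

Because the years live in a dictionary, adding a yr2010 column to the file needs no change to the class.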
Solution 3
If your data is just an immutable record, use namedtuple:
>>> from collections import namedtuple
>>> City = namedtuple('City', 'lat lon cityName label '
... 'yr1970 yr1975 yr1980 yr1985 yr1990 yr1995 yr2000 yr2005 yr2010')
You can slice the row, since you don't need the first value (the id), and unpack the rest using *:
>>> row = ['1', '35.6832085', '139.8089447', 'Tokyo', 'Tokyo',
... '23.3', '26.61', '28.55', '30.3', '32.53', '33.59', '34.45', '35.62', '35.7']
>>> city = City(*row[1:])
>>> city
City(lat='35.6832085', lon='139.8089447', cityName='Tokyo', label='Tokyo',
yr1970='23.3', yr1975='26.61', yr1980='28.55', yr1985='30.3', yr1990='32.53',
yr1995='33.59', yr2000='34.45', yr2005='35.62', yr2010='35.7')
You need to add just this object to your list of cities, not every attribute:
>>> cities.append(city)
Putting it together with a list comprehension filtering out the label rows:
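import csv
from collections import namedtuple

City = namedtuple('City',
                  'lat lon cityName label '
                  'yr1970 yr1975 yr1980 yr1985 yr1990 yr1995 yr2000 yr2005 yr2010')

with open('filepath') as f:
    # The filter assumes label rows have 'label' in the first column;
    # adjust the test (for example to 'id') to match your file's header.
    cities = [City(*row[1:]) for row in csv.reader(f)
              if row[0] != 'label']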
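A possible refinement, sketched under assumptions: skip the header with next() so the code does not depend on what the first column is called, and convert the numeric fields to float while building each record. The to_city helper, the SAMPLE string, and the shortened field list are hypothetical and only for illustration.

```python
import csv
import io
from collections import namedtuple

# Hypothetical sample with only two year columns to keep the sketch short.
SAMPLE = """id,latitude,longitude,city,label,yr1970,yr1975
1,35.6832085,139.8089447,Tokyo,Tokyo,23.3,26.61
"""

City = namedtuple('City', 'lat lon cityName label yr1970 yr1975')

def to_city(row):
    """Drop the id column and convert the numeric fields to float."""
    _id, lat, lon, name, label, *years = row
    return City(float(lat), float(lon), name, label, *map(float, years))

reader = csv.reader(io.StringIO(SAMPLE))
next(reader)  # skip the header row, whatever its first column is called
cities = [to_city(row) for row in reader]

print(cities[0])
print(cities[0].yr1975 - cities[0].yr1970)  # arithmetic works on floats
```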
Willard A.
Updated on June 18, 2022
Comments
- Willard A., almost 2 years ago
I'm a Python newbie, and I've been struggling with a class assignment for days. I have a csv file that contains data as such:
id,latitude,longitude,city,label,yr1970,yr1975,yr1980,yr1985,yr1990,yr1995,yr2000,yr2005
1,35.6832085,139.8089447,Tokyo,Tokyo,23.3,26.61,28.55,30.3,32.53,33.59,34.45,35.62
There are about 40 rows in this file, each containing data related to a world city. As you can see, the top row is the header. I am supposed to create a class in Python and read the csv file into the class, where every row becomes an instance of the class. I am then to store the class instances in a list. I've been able to create one instance where all of the data is stored, but I can't seem to create an instance for each row (and I obviously do not want to do it manually).
Here's what I've got so far:
import csv

Cities = []

with open('filepath','rb') as f:
    cityList = csv.reader(f)
    for row in cityList:
        if row != 'label':
            for row in cityList:
                citysName = row[3]
                class City:
                    def __init__(self, cityName=row[3], Label=row[4], Lat=row[1], Lon=row[2],
                                 yr1970=row[5], yr1975=row[6], yr1980=row[7], yr1985=row[8],
                                 yr1990=row[9], yr1995=row[10], yr2000=row[11], yr2005=row[12],
                                 yr2010=row[13]):
                        self.cityName = cityName
                        self.label = Label
                        self.lat = Lat
                        self.lon = Lon
                        self.yr1970 = yr1970
                        self.yr1975 = yr1975
                        self.yr1980 = yr1980
                        self.yr1985 = yr1985
                        self.yr1990 = yr1990
                        self.yr1995 = yr1995
                        self.yr2000 = yr2000
                        self.yr2005 = yr2005
                        self.yr2010 = yr2010
                citysName = City()
                Cities.append(citysName.cityName)
                Cities.append(citysName.label)
                Cities.append(citysName.lat)
                Cities.append(citysName.lon)
                Cities.append(citysName.yr1970)
                Cities.append(citysName.yr1975)
                Cities.append(citysName.yr1980)
                Cities.append(citysName.yr1985)
                Cities.append(citysName.yr1990)
                Cities.append(citysName.yr1995)
                Cities.append(citysName.yr2000)
                Cities.append(citysName.yr2005)
                Cities.append(citysName.yr2010)

print Cities
Again, I'm quite new to Python (and coding in general), and I realize this code is not good, but I'm having a lot of difficulty finding tips for reading csv files into a Python class.