Converting a CSV file into a list of tuples with Python


Solution 1

One option is to build a dictionary of columns with csv.DictReader, converting numeric fields to floats as you go:

import csv

def fitem(item):
    # Strip surrounding whitespace and convert to float where possible.
    item = item.strip()
    try:
        item = float(item)
    except ValueError:
        pass
    return item

with open('/tmp/test.csv', 'r') as csvin:
    reader = csv.DictReader(csvin)
    # Seed each column's list with the first data row.
    data = {k.strip(): [fitem(v)] for k, v in next(reader).items()}
    for line in reader:
        for k, v in line.items():
            data[k.strip()].append(fitem(v))

print(data)

Prints:

{'Price': [6.05, 8.05, 6.54, 6.05, 7.05, 7.45, 5.45, 6.05, 6.43, 7.05, 8.05, 3.05],
 'Type': ['orange', 'orange', 'orange', 'pear', 'pear', 'pear', 'apple', 'apple', 'apple', 'apple', 'plum', 'plum'], 
 'Brand': ['brand1', 'brand2', 'brand3', 'brand1', 'brand2', 'brand3', 'brand1', 'brand2', 'brand3', 'brand4', 'brand1', 'brand2'], 
 'Weight': [3.2, 5.2, 4.2, 3.2, 3.6, 3.9, 2.7, 3.2, 3.5, 3.9, 4.2, 2.2]}

If you want the CSV file literally as a list of tuples, one per row:

import csv
with open('/tmp/test.csv') as f:
    data = [tuple(line) for line in csv.reader(f)]

print(data)
# [('Brand', ' Price', ' Weight', ' Type'), ('brand1', ' 6.05', ' 3.2', ' orange'), ('brand2', ' 8.05', ' 5.2', ' orange'), ('brand3', ' 6.54', ' 4.2', ' orange'), ('brand1', ' 6.05', ' 3.2', ' pear'), ('brand2', ' 7.05', ' 3.6', ' pear'), ('brand3', ' 7.45', ' 3.9', ' pear'), ('brand1', ' 5.45', ' 2.7', ' apple'), ('brand2', ' 6.05', ' 3.2', ' apple'), ('brand3', ' 6.43', ' 3.5', ' apple'), ('brand4', ' 7.05', ' 3.9', ' apple'), ('brand1', ' 8.05', ' 4.2', ' plum'), ('brand2', ' 3.05', ' 2.2', ' plum')]
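
The rows above still include the header, the fields keep their leading spaces, and the numbers are strings. A minimal sketch of one way to clean that up, assuming Python 3 and the column layout from the question (the sample is inlined with io.StringIO so the snippet runs on its own, and `load_rows` is a hypothetical helper name, not part of the original answer):

```python
import csv
import io

# A few rows of the question's sample data, inlined so the sketch is self-contained.
SAMPLE = """Brand, Price, Weight, Type
brand1, 6.05, 3.2, orange
brand2, 8.05, 5.2, orange
brand1, 8.05, 4.2, plum
"""

def load_rows(f):
    """Return (header, rows); rows is a list of (brand, price, weight, type) tuples."""
    # skipinitialspace=True drops the space that follows each comma.
    reader = csv.reader(f, skipinitialspace=True)
    header = tuple(next(reader))
    # Skip blank lines and convert the two numeric columns to float.
    rows = [(r[0], float(r[1]), float(r[2]), r[3]) for r in reader if r]
    return header, rows

header, rows = load_rows(io.StringIO(SAMPLE))
print(header)   # ('Brand', 'Price', 'Weight', 'Type')
print(rows[0])  # ('brand1', 6.05, 3.2, 'orange')
```

With a real file, pass `open(path, newline='')` to `load_rows` instead of the `io.StringIO` object.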

Solution 2

import csv
with open("some.csv") as f:
    r = csv.reader(f)
    # filter(None, ...) drops the empty lists produced by blank lines.
    print(list(filter(None, r)))

Or with a list comprehension:

import csv
with open("some.csv") as f:
    r = csv.reader(f)
    print([row for row in r if row])

For comparison, here is a timing of both approaches in IPython (Python 2, where filter returns a list):

In [3]: N = 100000

In [4]: the_list = [randint(0,3) for _ in range(N)]

In [5]: %timeit filter(None,the_list)
1000 loops, best of 3: 1.91 ms per loop

In [6]: %timeit [i for i in the_list if i]
100 loops, best of 3: 4.01 ms per loop
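
The session above omits the randint import and relies on IPython's %timeit. A rough sketch of the same comparison as a standalone script, assuming Python 3 and the standard timeit module (the run count of 20 is an arbitrary choice, and the timings will differ by machine and interpreter):

```python
import timeit
from random import randint

N = 100000
the_list = [randint(0, 3) for _ in range(N)]

# In Python 3, filter() is lazy, so wrap it in list() to make it do the same work.
filtered = list(filter(None, the_list))
comprehension = [i for i in the_list if i]

# Time 20 runs of each approach.
t_filter = timeit.timeit(lambda: list(filter(None, the_list)), number=20)
t_comp = timeit.timeit(lambda: [i for i in the_list if i], number=20)

print("filter:        %.4f s for 20 runs" % t_filter)
print("comprehension: %.4f s for 20 runs" % t_comp)
```

Both expressions produce the same list, so the choice between them is mostly about readability.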

Edit: since your actual data does not have blank lines, you do not need the list comprehension or the filter; you can just use list(r).

Final answer without blank lines:

import csv
with open("some.csv") as f:
    print(list(csv.reader(f)))

If you want dicts, you can do:

import csv
with open("some.csv") as f:
    rows = list(csv.reader(f))
    header = rows[0]
    # Skip the header row itself so it is not turned into a record.
    print([dict(zip(header, x)) for x in rows[1:]])
    # or
    print(list(map(lambda x: dict(zip(header, x)), rows[1:])))
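
The zip-based version keeps the raw header strings as keys, leading spaces included. A sketch of an alternative using csv.DictReader from the standard library, assuming Python 3 and the question's column layout (the sample is inlined with io.StringIO, and the name `records` is only for illustration):

```python
import csv
import io

# Two rows of the question's sample data, inlined so the sketch is self-contained.
SAMPLE = """Brand, Price, Weight, Type
brand1, 6.05, 3.2, orange
brand2, 3.05, 2.2, plum
"""

# DictReader uses the first row as field names;
# skipinitialspace=True drops the space after each comma, in the header too.
reader = csv.DictReader(io.StringIO(SAMPLE), skipinitialspace=True)
records = [dict(row) for row in reader]

print(records[0]["Price"])  # '6.05' (values are still strings)
```

The values stay strings here; convert them with float() where numbers are needed.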
Author: Sean

Updated on September 13, 2020

Comments

  • Sean, over 3 years ago

    I have to take a CSV file with 4 columns: brand, price, weight, and type.

    The types are orange, apple, pear, and plum.

    Parameters: I need to select the greatest possible total weight while choosing 1 orange, 2 pears, 3 apples, and 1 plum, without exceeding a $20 budget. I cannot repeat brands of the same fruit (for example, selecting the same brand of apple 3 times).

    I can open and read the CSV file in Python, but I'm not sure how to create a dictionary or a list of tuples from it.

    For more clarity, here's an idea of the data.

    Brand, Price, Weight, Type
    brand1, 6.05, 3.2, orange
    brand2, 8.05, 5.2, orange
    brand3, 6.54, 4.2, orange
    brand1, 6.05, 3.2, pear
    brand2, 7.05, 3.6, pear
    brand3, 7.45, 3.9, pear
    brand1, 5.45, 2.7, apple
    brand2, 6.05, 3.2, apple
    brand3, 6.43, 3.5, apple
    brand4, 7.05, 3.9, apple
    brand1, 8.05, 4.2, plum
    brand2, 3.05, 2.2, plum
    

    Here's all I have right now:

    import csv
    test_file = 'testallpos.csv'
    csv_file = csv.DictReader(open(test_file, 'rb'), ["brand"], ["price"], ["weight"], ["type"])