Tab-delimited file using csv.reader not delimiting where I expect it to

python csv python-requests

Solution 1

So what's happening? A call to help() may shed some light.

>>> help(csv.reader)
 reader(...)
    csv_reader = reader(iterable [, dialect='excel']
                            [optional keyword args])
        for row in csv_reader:
            process(row)

    The "iterable" argument can be any object that returns a line
    of input for each iteration, such as a file object or a list.  The
    optional "dialect" parameter is discussed below.  The function
    also accepts optional keyword arguments which override settings
    provided by the dialect.

So csv.reader expects an iterable that returns one line per iteration, but we are passing a string, which iterates character by character. That is why the reader parses the data one character at a time. One fix would be to write the data to a temporary file, but that isn't necessary: we just need to pass an iterable of lines.
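To see the difference without a network request, here is a minimal sketch, assuming Python 3. The two-line `data` string is a made-up stand-in for the downloaded text, not the real file contents.

```python
import csv

# Hypothetical two-line sample standing in for the downloaded text.
data = "RaceID\tVotes\n2\t417141\n"

# Iterating a string yields single characters, so each character
# is parsed as if it were a whole line.
by_char = list(csv.reader(data, delimiter='\t'))

# Iterating a list of lines yields one line at a time,
# which is what the reader expects.
by_line = list(csv.reader(data.splitlines(), delimiter='\t'))

print(by_char[:8])
print(by_line)
```

The first list should show one-character rows, with each tab character appearing as a row of two empty strings, much like the output in the question.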

Note the following, which simply splits the string into a list of lines before it is fed to the reader.

import csv
import requests

r = requests.get('http://vote.wa.gov/results/current/export/MediaResults.txt') 
data = r.text
reader = csv.reader(data.splitlines(), delimiter='\t')
for row in reader:
    print(row)

This seems to work.

I also recommend using csv.DictReader; it's quite useful.

>>> reader = csv.DictReader(data.splitlines(), delimiter='\t')
>>> for row in reader:
...     print(row)
{'Votes': '417141', 'BallotName': 'Michael Baumgartner', 'RaceID': '2', 'RaceName': 'U.S. Senator', 'PartyName': '(Prefers Republican Party)', 'TotalBallotsCastByRace': '1387059', 'RaceJurisdictionTypeName': 'Federal', 'BallotID': '23036'}
{'Votes': '15005', 'BallotName': 'Will Baker', 'RaceID': '2', 'RaceName': 'U.S. Senator', 'PartyName': '(Prefers Reform Party)', 'TotalBallotsCastByRace': '1387059', 'RaceJurisdictionTypeName': 'Federal', 'BallotID': '27435'}

Basically, it returns a dictionary for every row, using the header fields as keys. This way we don't need to keep track of column order, just the column names, which makes things a bit easier: row['Votes'] is more readable than row[4].
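As a small illustration of name-based access, here is a sketch assuming Python 3. The sample string is invented for the example and reuses only a few of the column names and values shown in the output above; `votes_by_name` is a hypothetical helper variable.

```python
import csv

# Made-up sample using a few of the column names shown above.
data = (
    "RaceID\tBallotName\tVotes\n"
    "2\tMichael Baumgartner\t417141\n"
    "2\tWill Baker\t15005\n"
)

reader = csv.DictReader(data.splitlines(), delimiter='\t')

# Fields come back as strings, so convert before doing arithmetic.
votes_by_name = {row['BallotName']: int(row['Votes']) for row in reader}

print(votes_by_name)
print(sum(votes_by_name.values()))
```

Because the lookup is by header name, the code keeps working if the columns are reordered in the file.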

Solution 2

This works perfectly:

import csv

# Use a context manager so the file is closed when we are done.
with open('./MediaResults.txt') as f:
    reader = csv.reader(f, delimiter='\t')
    for row in reader:
        print(row)

The first parameter to csv.reader should be:

any object which supports the iterator protocol and returns a string each time its next() method is called

as per the docs, and you are passing a string, not a file object. Iterating over a string yields single characters, hence the behavior you are observing.
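If the data is already in memory as a string, one option is to wrap it in a file-like object rather than writing it to disk. A minimal sketch, assuming Python 3 and a made-up sample string in place of the downloaded text:

```python
import csv
import io

# Hypothetical sample standing in for r.text.
data = "RaceID\tVotes\n2\t417141\n"

# io.StringIO wraps the string in a file-like object
# that yields one line per iteration.
rows = list(csv.reader(io.StringIO(data), delimiter='\t'))

print(rows)
```

This keeps the code shaped like the local-file version while avoiding a temporary file.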

Solution 3

Simple problem: The csv.reader didn't expect a string for its input.

Simple solution: Change the input to: data.splitlines().

The csv reader expects an iterable that returns lines one at a time. A string, unfortunately, iterates a character at a time. To solve the problem, use splitlines() to turn the string into a list of lines:

reader = csv.reader(data.splitlines(), delimiter='\t')
for row in reader:
    print(row)

Solution 4

Perhaps you want to sniff the dialect through the csv API:

import csv

# In Python 3, open in text mode with newline=''; Python 2 would use "rb".
csvfile = open("example.csv", newline="")
dialect = csv.Sniffer().sniff(csvfile.read(1024))
csvfile.seek(0)
reader = csv.reader(csvfile, dialect)

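The sniffer guesses the delimiter from a sample of the file, so the reader can split on tabs without the delimiter being spelled out. Note that the reader is still given a file object here, not a single string.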
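The same idea can be tried on text that is already in memory. This is a sketch only, assuming Python 3: the sample string is invented, and restricting the candidates with `delimiters='\t,'` is a choice made for this example to keep the guess predictable.

```python
import csv

# Made-up tab-delimited sample.
sample = (
    "RaceID\tBallotName\tVotes\n"
    "2\tMichael Baumgartner\t417141\n"
    "2\tWill Baker\t15005\n"
)

# Restrict the candidate delimiters to tab and comma
# (an assumption for this sketch).
dialect = csv.Sniffer().sniff(sample, delimiters='\t,')

# Pass an iterable of lines, not the raw string.
rows = list(csv.reader(sample.splitlines(), dialect))

print(repr(dialect.delimiter))
print(rows[1])
```

Sniffing is a heuristic, so when the delimiter is known in advance, passing delimiter='\t' directly is the simpler route.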

foxyNinja7

Updated on August 28, 2020

Comments

foxyNinja7 over 3 years

I am trying to loop through a tab-delimited file of election results using Python. The following code does not work, but when I use a local file with the same results (the commented out line), it does work as expected.

The only thing I can think of is some headers or content type I need to pass the url, but I cannot figure it out.

Why is this happening?

import csv
import requests

r = requests.get('http://vote.wa.gov/results/current/export/MediaResults.txt') 
data = r.text
#data = open('data/MediaResults.txt', 'r')
reader = csv.reader(data, delimiter='\t')
for row in reader:
    print(row)

Results in:

...
['', '']
['', '']
['2']
['3']
['1']
['1']
['8']
['', '']
['D']
['a']
['v']
['i']
['d']
[' ']
['F']
['r']
['a']
['z']
['i']
['e']
['', '']
...

Andreas Jung over 11 years

The original problem is actually that you are passing the data directly to the reader() constructor instead of a file handle.
