Can Python remove double quotes from a string, when reading in text file?

Solution 1

The csv module (standard library) does it automatically, although the docs aren't very specific about skipinitialspace:

>>> import csv

>>> with open(name, 'rb') as f:
...     for row in csv.reader(f, delimiter=' ', skipinitialspace=True):
...             print '|'.join(row)

5.6|4.5|6.8|6.5
5.4|8.3|1.2|9.3
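
To see what skipinitialspace actually changes, here is a minimal Python 3 sketch that parses the sample rows from an in-memory string instead of a file. The `SAMPLE` text and the `parse_rows` helper are illustrative names I made up, not part of the answer above.

```python
import csv
import io

# Sample data copied from the question: columns separated by two spaces,
# last column wrapped in double quotes.
SAMPLE = '5.6  4.5  6.8  "6.5"\n5.4  8.3  1.2  "9.3"\n'


def parse_rows(text, skipinitialspace):
    """Parse space-delimited text with csv.reader and return the raw rows."""
    reader = csv.reader(io.StringIO(text), delimiter=' ',
                        skipinitialspace=skipinitialspace)
    return [row for row in reader]


# Without skipinitialspace, each extra space produces an empty field.
loose = parse_rows(SAMPLE, skipinitialspace=False)

# With it, spaces that follow a delimiter are ignored, and the quotes
# around the last column are stripped by the reader.
tight = parse_rows(SAMPLE, skipinitialspace=True)
floats = [[float(x) for x in row] for row in tight]
```

Only the `tight` rows can be passed straight to `float`; the empty strings in `loose` would raise a ValueError.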

Solution 2

for line in open(name, "r"):
    line = line.replace('"', '').strip()
    a, b, c, d = map(float, line.split())

This is bare-bones: it raises a ValueError if, for example, a line doesn't contain exactly four values or a value can't be converted to float.
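
If malformed lines should be skipped rather than crash the loop, one possible sketch (Python 3) is below. The `parse_line` helper and its choice to return None for bad lines are my own assumptions, not something the answer prescribes.

```python
def parse_line(line):
    """Return a tuple of four floats, or None if the line is malformed."""
    # Drop the double quotes, then split on any run of whitespace.
    fields = line.replace('"', '').split()
    if len(fields) != 4:
        return None
    try:
        return tuple(float(x) for x in fields)
    except ValueError:
        # At least one field was not a valid number.
        return None
```

The caller can then test the result with `if values is not None:` before unpacking it into a, b, c, d.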

Solution 3

There's a module you can use from the standard library called shlex:

>>> import shlex
>>> print shlex.split('5.6  4.5  6.8  "6.5"')
['5.6', '4.5', '6.8', '6.5']
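
A small Python 3 sketch of how this could be applied per line and converted to floats. The name `floats_from_line` is hypothetical. Keep in mind that shlex follows shell quoting rules, so a line with an unbalanced quote raises ValueError.

```python
import shlex


def floats_from_line(line):
    """Split a line with shell-like quoting rules and convert to floats."""
    # shlex.split removes the surrounding double quotes and treats any
    # run of whitespace (including the trailing newline) as a separator.
    return [float(token) for token in shlex.split(line)]
```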

Solution 4

for line in open(fname):
    line = line.split()
    line[-1] = line[-1].strip('"\n')
    floats = [float(i) for i in line]

Another option is to use a built-in module intended for this task, namely csv:

>>> import csv
>>> for line in csv.reader(open(fname), delimiter=' ', skipinitialspace=True):
...     print([float(i) for i in line])

[5.6, 4.5, 6.8, 6.5]
[5.6, 4.5, 6.8, 6.5]

Solution 5

Or you can simply replace your line

l = re.split("\s+",string.strip(line)).replace('\"','')

with this:

l = re.split('[\s"]+',string.strip(line))
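
For reference, a Python 3 sketch of the same idea. `string.strip` is a Python 2 function, so the str method is used instead. Because the line ends with a quote, the split leaves a trailing empty string, which this version filters out. The `split_fields` name is just for illustration.

```python
import re


def split_fields(line):
    """Split on runs of whitespace and/or double quotes."""
    # A leading or trailing quote produces an empty string at that end
    # of the result, so empty tokens are dropped.
    return [tok for tok in re.split(r'[\s"]+', line.strip()) if tok]
```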
Author by

Open the way

Updated on July 09, 2022

Comments

  • Open the way

    I have a text file like this, with about 5000 lines:

    5.6  4.5  6.8  "6.5" (new line)
    5.4  8.3  1.2  "9.3" (new line)
    

    so the last term is a number between double quotes.

    What I want to do, using Python if possible, is assign the four columns to double (float) variables. The main problem is the last term: I found no way to remove the double quotes from the number. Is this possible on Linux?

    This is what I tried:

    #!/usr/bin/python
    
    import os,sys,re,string,array
    
    name=sys.argv[1]
    infile = open(name,"r")
    
    cont = 0
    while 1:
        line = infile.readline()
        if not line: break
        l = re.split("\s+",string.strip(line)).replace('\"','')
        cont = cont +1
        a = l[0]
        b = l[1]
        c = l[2]
        d = l[3]