Python: How to read file and store certain columns in array
Solution 1
This would work:
data = []
target = []
with open('faban.txt') as fobj:
    for line in fobj:
        row = line.split()
        data.append(row[:-1])
        target.append(row[-1])
Now:
>>> data
[['faban', '1', '0', '0.288'],
['faban', '2', '0', '0.243'],
['simulated', '1', '0', '0.159'],
['faban', '1', '1', '0.189']]
>>> target
['withspy', 'withoutspy', 'withoutspy', 'withoutspy']
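Note that every value in data is still a string. Since the question mentions that the columns have different types, here is a minimal sketch of converting each field while reading. The names convert and read_columns are made up for this illustration, and io.StringIO stands in for the real file:

```python
import io

def convert(field):
    """Try int, then float; fall back to the original string."""
    for cast in (int, float):
        try:
            return cast(field)
        except ValueError:
            pass
    return field

def read_columns(fobj):
    """Split whitespace-separated rows into (data, target) lists."""
    data, target = [], []
    for line in fobj:
        row = line.split()
        if not row:  # skip blank lines
            continue
        data.append([convert(field) for field in row[:-1]])
        target.append(row[-1])
    return data, target

# io.StringIO stands in for open('faban.txt') here
sample = io.StringIO(
    "faban 1 0 0.288 withspy\n"
    "faban 2 0 0.243 withoutspy\n"
)
data, target = read_columns(sample)
print(data)    # [['faban', 1, 0, 0.288], ['faban', 2, 0, 0.243]]
print(target)  # ['withspy', 'withoutspy']
```

With a real file you would pass the object returned by open() instead of the StringIO.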
Solution 2
I think numpy has a clean, easy solution here.
>>> import numpy as np
>>> data, target = np.array_split(np.loadtxt('file', dtype=str), [-1], axis=1)
results in:
>>> data.tolist()
[['faban', '1', '0', '0.288'],
['faban', '2', '0', '0.243'],
['simulated', '1', '0', '0.159'],
['faban', '1', '1', '0.189']]
>>> target.flatten().tolist()
['withspy', 'withoutspy', 'withoutspy', 'withoutspy']
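If array_split feels indirect, plain slicing should give the same split, and target comes out one-dimensional so no flatten() is needed. This is only a sketch: io.StringIO stands in for the real file, and the sample is cut down to two rows:

```python
import io
import numpy as np

# io.StringIO stands in for the real file
sample = io.StringIO(
    "faban 1 0 0.288 withspy\n"
    "faban 2 0 0.243 withoutspy\n"
)
arr = np.loadtxt(sample, dtype=str)  # 2-D array of strings
data = arr[:, :-1]                   # every column except the last
target = arr[:, -1]                  # last column, already 1-D

print(data.tolist())
print(target.tolist())
```

As with the array_split version, everything is read as a string, so numeric columns still need converting afterwards.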
Solution 3
You could do that with pandas, using read_table to read your data, iloc to subset it, values to get the values from the DataFrame, and the tolist method to convert the numpy array to a list:
import pandas as pd
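# sep=r'\s+' splits on any run of whitespace
# (it replaces delim_whitespace=True, which is deprecated in recent pandas)
df = pd.read_table('path_to_your_file', sep=r'\s+', header=None)
print(df)
           0  1  2      3           4
0      faban  1  0  0.288     withspy
1      faban  2  0  0.243  withoutspy
2  simulated  1  0  0.159  withoutspy
3      faban  1  1  0.189  withoutspy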
data = df.iloc[:,:-1].values.tolist()
target = df.iloc[:,-1].tolist()
print(data)
[['faban', 1, 0, 0.28800000000000003],
['faban', 2, 0, 0.243],
['simulated', 1, 0, 0.159],
['faban', 1, 1, 0.18899999999999997]]
print(target)
['withspy', 'withoutspy', 'withoutspy', 'withoutspy']
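If you don't actually need plain lists, you could also keep the two parts as pandas objects so each column keeps the type pandas inferred for it. A rough sketch, assuming the same whitespace-separated layout; io.StringIO stands in for the real file, and read_csv with sep=r"\s+" is used here in place of read_table:

```python
import io
import pandas as pd

# io.StringIO stands in for the real file
sample = io.StringIO(
    "faban 1 0 0.288 withspy\n"
    "faban 2 0 0.243 withoutspy\n"
)
# sep=r"\s+" splits on any run of whitespace
df = pd.read_csv(sample, sep=r"\s+", header=None)

data = df.iloc[:, :-1]   # DataFrame: each column keeps its own dtype
target = df.iloc[:, -1]  # Series holding the last column

print(data.dtypes)       # per-column types inferred by pandas
print(target.tolist())
```

This sidesteps the mixed-type conversion that happens when the whole frame is turned into a single numpy array.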
Author by
SaadH
Updated on June 16, 2022
Comments
SaadH almost 2 years
I am reading a dataset (separated by whitespace) from a file. I need to store all columns apart from the last one in the array data, and the last column in the array target. Can you guide me on how to proceed further?
That's what I have so far:
with open(filename) as f:
    data = f.readlines()
Or should I be reading line by line?
PS: The data type of columns is also different.
Edit: Sample Data
faban 1 0 0.288 withspy
faban 2 0 0.243 withoutspy
simulated 1 0 0.159 withoutspy
faban 1 1 0.189 withoutspy