python - using numpy loadtxt reading a csv file with different data types for each column
Solution 1
You are very close to what you are looking for. Try this:
data = np.loadtxt('TS.csv', dtype='str,int', delimiter=',', usecols=(0, 1), unpack=True)
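A minimal sketch of the same idea with an explicit string width, since a bare `str` field may come back empty or truncated on some NumPy versions. The in-memory sample, the width of 16 characters, and the `i8` integer type are assumptions for illustration; the sample is comma-separated and has no missing values.

```python
import io
import numpy as np

# Hypothetical comma-separated stand-in for TS.csv (no missing values).
sample = io.StringIO(
    "2015/1/1 0:00,5\n"
    "2015/1/1 0:15,10\n"
    "2015/1/1 0:30,10\n"
)

# 'U16' is a Unicode string of up to 16 characters (assumed wide enough for
# timestamps such as '2015/12/31 23:45'); 'i8' is a 64-bit integer.
# With a structured dtype and unpack=True, one array is returned per field.
time, data = np.loadtxt(sample, dtype="U16,i8", delimiter=",", unpack=True)

print(time)
print(data)
```

To read the real file, pass `'TS.csv'` instead of `sample`.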
Solution 2
I would generally suggest np.genfromtxt if you have something that np.loadtxt can't handle, but they both struggle with space-delimited files if there is missing data. Without a comma separator, for instance, it would be hard to tell how many data points are missing. A similar function that may work is pd.read_csv or pd.read_table (mostly the same thing), which does take care of this issue. Just make sure to set the parameter delim_whitespace to True with this file formatting.
pd.read_table('TS.csv', delim_whitespace=True, header=None)
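A rough sketch of how this could look on rows with missing values. It assumes the file is whitespace-separated, so the date and the time land in separate columns; the column names and the in-memory sample are made up for illustration. It uses `sep=r"\s+"`, which has the same effect as `delim_whitespace=True` (newer pandas releases deprecate the latter).

```python
import io
import pandas as pd

# Hypothetical whitespace-separated stand-in for TS.csv, including rows
# where the measured value is missing.
sample = io.StringIO(
    "2015/1/1 2:45 10\n"
    "2015/1/1 3:00\n"
    "2015/1/1 3:15\n"
    "2015/1/1 4:30 30\n"
)

# Column names are assumptions; short rows are padded with NaN.
df = pd.read_csv(sample, sep=r"\s+", header=None,
                 names=["date", "time", "value"])

# Rebuild the timestamp strings and pull both columns out as NumPy arrays.
timestamps = (df["date"] + " " + df["time"]).to_numpy()
values = df["value"].to_numpy()  # float array, NaN where data is missing

print(timestamps)
print(values)
```

Because NaN cannot be stored in an integer array, the value column comes back as floats; drop or fill the missing rows first if integers are required.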
Superstar
Updated on May 14, 2020
Comments
-
Superstar almost 4 years
I created a csv file with two columns, the first column is time data, and the second one is some measured data values.
2015/1/1 0:00 5
2015/1/1 0:15 10
2015/1/1 0:30 10
2015/1/1 0:45 15
2015/1/1 1:00 5
2015/1/1 1:15 20
2015/1/1 1:30 20
2015/1/1 1:45 40
2015/1/1 2:00 30
2015/1/1 2:15 20
2015/1/1 2:30 25
2015/1/1 2:45 10
2015/1/1 3:00
2015/1/1 3:15
2015/1/1 3:30
2015/1/1 3:45
2015/1/1 4:00
2015/1/1 4:15
2015/1/1 4:30 30
2015/1/1 4:45 50
2015/1/1 5:00 70
Now I want to use the numpy.loadtxt function to read these two columns into two different numpy arrays, with a string data type for the date column and an integer data type for the value column. I tried several different statements to do that, but none of them work.
time, data = np.loadtxt('TS.csv',dtype=str,delimiter=',',usecols=(0, 1),unpack=True)
time, data = np.loadtxt('TS.csv',dtype=(str,int),delimiter=',',usecols=(0, 1),unpack=True)
time, data = np.loadtxt('TS.csv',dtype=[str,int],delimiter=',',usecols=(0, 1),unpack=True)
Does anyone know how to achieve this? Thanks for your help!
-
Superstar almost 9 years
In general, your solution works well. That's what I'm looking for! But the particular dataset I posted here has several rows with empty values, so the arguments you suggested don't work in this situation. Anyway, your suggestion is really helpful! Thank you very much!
-
Earlee about 2 years
This works, but in my case (perhaps because of the numpy version) I have to specify the maximum length. For example:
dtype='S10,int'
where the 10 after S tells numpy to expect up to 10 characters.