pandas read ASCII formatted table
Assuming that your ASCII data is in a string, x:
In [1099]: x
Out[1099]: ' ----------------------------------------------------\n | col1 col2 col3 col4 |\n ------------ ------------ ------------ -------------\n 1002 0.402397E-01 0.883513E-02 0.450885E-01 0.118748E-02\n 1003 0.105235 0.474509E-02 0.118508 0.168397E-03\n 1004 0.102625 0.225842E-02 0.317864E-02 0.997383 \n 1 0.603750 0.475112E-01 0.679590 0.114713E-02\n 2 0.534171E-01 0.119815E-01 0.600187E-01 0.830949E-04\n 3 0.283291E-01 0.119353E-01 0.317530E-01 0.243996E-04\n 104 0.739759E-02 0.463873E-02 0.827061E-02 0.145207E-05\n -----------------------------------------------------'
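A few options available in pd.read_csv can get you to this dataframe (StringIO is imported from io; engine='python' is set explicitly because skipfooter is not supported by the C parser):
In [1123]: pd.read_csv(StringIO(x), sep=' ', skipfooter=1, skiprows=1, skipinitialspace=True, engine='python').drop([0])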
Out[1123]:
| col1 col2 col3 col4 |.1
1 1002 0.402397E-01 0.883513E-02 0.450885E-01 0.001187 NaN
2 1003 0.105235 0.474509E-02 0.118508 0.000168 NaN
3 1004 0.102625 0.225842E-02 0.317864E-02 0.997383 NaN
4 1 0.603750 0.475112E-01 0.679590 0.001147 NaN
5 2 0.534171E-01 0.119815E-01 0.600187E-01 0.000083 NaN
6 3 0.283291E-01 0.119353E-01 0.317530E-01 0.000024 NaN
7 104 0.739759E-02 0.463873E-02 0.827061E-02 0.000001 NaN
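The result above still carries the pipe characters in the header and a trailing NaN column. If that matters, one alternative is to drop the rule lines and the frame characters first, then let pandas split on whitespace. The following is a minimal sketch under some assumptions: rule lines contain only dashes and spaces, the first remaining line is the header, and the data rows have one extra unlabeled leading column. The helper name read_ascii_table and the index name "id" are made up for illustration and do not appear in the question.

```python
from io import StringIO

import pandas as pd

# Sample table from the question.
x = """ ----------------------------------------------------
 | col1 col2 col3 col4 |
 ------------ ------------ ------------ -------------
 1002 0.402397E-01 0.883513E-02 0.450885E-01 0.118748E-02
 1003 0.105235 0.474509E-02 0.118508 0.168397E-03
 1004 0.102625 0.225842E-02 0.317864E-02 0.997383
 1 0.603750 0.475112E-01 0.679590 0.114713E-02
 2 0.534171E-01 0.119815E-01 0.600187E-01 0.830949E-04
 3 0.283291E-01 0.119353E-01 0.317530E-01 0.243996E-04
 104 0.739759E-02 0.463873E-02 0.827061E-02 0.145207E-05
 -----------------------------------------------------"""


def read_ascii_table(text, index_name="id"):
    """Parse a dash-ruled, pipe-framed ASCII table into a DataFrame.

    Hypothetical helper: the name and the default index_name are
    illustrative assumptions, not part of pandas.
    """
    rows = []
    for line in text.splitlines():
        # Remove the frame characters and surrounding whitespace.
        cleaned = line.replace("|", " ").strip()
        # Skip blank lines and rule lines made only of dashes and spaces.
        if not cleaned or set(cleaned) <= {"-", " "}:
            continue
        rows.append(cleaned)

    header, body = rows[0].split(), rows[1:]
    # Assumption: each data row starts with one unlabeled column,
    # which is used as the index here.
    return pd.read_csv(
        StringIO("\n".join(body)),
        sep=r"\s+",
        header=None,
        names=[index_name] + header,
        index_col=index_name,
    )


df = read_ascii_table(x)
print(df)
```

This keeps all four value columns numeric and uses the leading column as the index. It still depends on the table having no embedded spaces inside a cell.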
Author by
denfromufa
Updated on September 14, 2022
Comments
-
denfromufa over 1 year
EDIT:
I found a partial answer here:
https://stackoverflow.com/a/26551913/2230844
https://stackoverflow.com/a/15026839/2230844
How can I read an ASCII-formatted table like this one into pandas:
----------------------------------------------------
| col1 col2 col3 col4 |
------------ ------------ ------------ -------------
1002 0.402397E-01 0.883513E-02 0.450885E-01 0.118748E-02
1003 0.105235 0.474509E-02 0.118508 0.168397E-03
1004 0.102625 0.225842E-02 0.317864E-02 0.997383
1 0.603750 0.475112E-01 0.679590 0.114713E-02
2 0.534171E-01 0.119815E-01 0.600187E-01 0.830949E-04
3 0.283291E-01 0.119353E-01 0.317530E-01 0.243996E-04
104 0.739759E-02 0.463873E-02 0.827061E-02 0.145207E-05
-----------------------------------------------------
I noticed this answer using read_fwf(), but it requires manually specifying the column widths:
-
denfromufa almost 8 years
This seems hackish and very breakable, depending on the ASCII table format.