Python read .txt-files header

Solution 1

I would read the file with the built-in open() rather than npy.loadtxt, so that the header lines are available as plain strings:

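with open(filename, 'r') as the_file:
    # Read every line, dropping surrounding whitespace and the trailing newline
    all_data = [line.strip() for line in the_file]
    # The fourth line (index 3) holds "Total Height"
    height_line = all_data[3]
    # Indices 0-5 are the header, 6 is blank, 7 is the column header; data starts at 8
    data = all_data[8:]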

You can then parse height_line to get the Total Height, for example by cutting off the field name and converting the remainder with int(). The data rows from the file end up in the variable data as a list of strings.
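As a minimal sketch of that parsing step: the helper names parse_header_value and parse_rows are hypothetical, the sample lines only mirror the header shown in the question, and the assumption that depth and N are the first two whitespace-separated columns is mine, not confirmed by the asker.

```python
import math

def parse_header_value(line, field):
    """Return the text that follows `field` on a header line, stripped."""
    if not line.startswith(field):
        raise ValueError(f"expected line to start with {field!r}: {line!r}")
    return line[len(field):].strip()

def parse_rows(rows):
    """Split rows on whitespace and return the first two columns as floats."""
    depth, n_values = [], []
    for row in rows:
        parts = row.split()
        if len(parts) < 2:
            continue  # skip blank or incomplete rows
        depth.append(float(parts[0]))
        n_values.append(float(parts[1]))  # float("NaN") gives nan
    return depth, n_values

# Sample content standing in for the stripped lines of a real file (assumed layout).
all_data = [
    "Date    20160122",
    "SP Number   8",
    "Gauge   250N internal",
    "Total Height    61",
    "SP Modell   SP2",
    "Corner Distance 150",
    "",
    "Height  Value   Comment",
    "60  NaN",
    "59  12.5",
]

total_height = int(parse_header_value(all_data[3], "Total Height"))
depth, N = parse_rows(all_data[8:])
print(total_height, depth, N)
```

Rows whose first two columns are not numeric would raise ValueError here, so real files may need extra filtering.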

Solution 2

This should work!

field = "Total Height"

# Get the first 6 lines (the header block)
with open(filename) as file:
    lines = [next(file) for _ in range(6)]

value = None
for line in lines:
    if line.startswith(field):
        # Get the part of the string after the field name
        end_of_string = line[len(field):]

        # Convert it to an int:
        value = int(end_of_string.strip())

print(value)  # Should print 61 for the example file

If you know that the field names and values are separated by a tab character rather than spaces, you could instead call line.split('\t') to break each line into a field name and a field value. Then check whether the field name is the one you care about and, if so, use the value directly, rather than using startswith and slicing off the end of the string.
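A minimal sketch of the tab-separated variant, assuming the header really uses one tab between name and value (the question does not confirm this). The function name read_header_fields and the sample lines are hypothetical.

```python
def read_header_fields(lines, sep="\t"):
    """Build a {field_name: value_text} dict from 'name<sep>value' lines."""
    fields = {}
    for line in lines:
        name, found, value = line.rstrip("\n").partition(sep)
        if not found:
            continue  # no separator on this line, so there is nothing to split
        fields[name.strip()] = value.strip()
    return fields

# Sample header lines as they would come from iterating over a file (assumed layout).
sample = [
    "Date\t20160122\n",
    "SP Number\t8\n",
    "Gauge\t250N internal\n",
    "Total Height\t61\n",
    "SP Modell\tSP2\n",
    "Corner Distance\t150 \n",
]

header = read_header_fields(sample)
total_height = int(header["Total Height"])
print(total_height)
```

Using str.partition instead of unpacking the result of split means a line without the separator is skipped rather than raising an unpacking error.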

Author by

brium-brium

Physics student and hobby programmer. Using Python, Matlab, Gnuplot, Shellscript, C, C++, LaTeX and more.

Updated on June 04, 2022

Comments

  • brium-brium
    brium-brium almost 2 years

    I need to read some information from a txt file header which looks like this:

    Date    20160122
    SP Number   8
    Gauge   250N internal
    Total Height    61
    SP Modell   SP2
    Corner Distance 150 
    
    Height  Value   Comment
    60  NaN 
    ...
    

    I have a Python program that currently does this:

    depth, N = npy.loadtxt(filename, skiprows=8, unpack=True, usecols = usecols)
    

    However, I would like to read some of the values from the header. Is there a way to do this? I am mostly interested in getting the value of "Total Height". In my search I only seem to find answers concerning .csv files.

  • brium-brium
    brium-brium over 7 years
    With that I get: ValueError: need more than 1 value to unpack
  • brium-brium
    brium-brium over 7 years
    How do I parse it, and how do I get the values N and depth out of data?
  • brium-brium
    brium-brium over 7 years
    But doesn't that limit it to numbers as entries? I also have letter entries there.
  • brium-brium
    brium-brium over 7 years
    It works. The only issue is that some of the files have a NaN entry or even an empty entry. Is there a way to filter that?
  • Christopher Shroba
    Christopher Shroba over 7 years
    You could put the value = ... line inside a try/except block, so that if it tries to parse something that's not a number, the except block catches the exception and sets value to None. Then, after the for loop finishes, you can check whether value is None. If it is not, do your normal thing; if it is, either do nothing or add some logic for what should happen when a file doesn't have a good Total Height specified. Also, if this answer answers your question, do you mind upvoting it and marking it as correct? :)
  • Greg
    Greg over 7 years
    @brium-brium you can replace ([0-9]+) with (.+?) and it will find any characters after space and before newline
  • brium-brium
    brium-brium over 7 years
    I added height = height_line.split('\t')[1] and that solves my problem nicely. I also added some filters with height_line.find('\t'), math.isnan and panda.isnull(), and it also catches NaNs and empty entries.
  • jez
    jez over 7 years
    Sounds like it is trying to split a line on a character (e.g. '\t') that does not appear in the line. It was only a guess on my part that \t was being used. As I said, there are issues with the file format that you will have to resolve before proceeding: what is the rule that separates keys from values? The code in my answer must be adapted according to the answer to that question, which cannot be determined from your example.
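The try/except idea from the comments, combined with a NaN check, might look like the following sketch. The function name parse_number is hypothetical, and returning None for unusable entries is one possible convention, not something the thread settles on.

```python
import math

def parse_number(text):
    """Return float(text), or None if text is empty, non-numeric, or NaN."""
    try:
        value = float(text.strip())
    except ValueError:
        return None  # empty string or letters such as "SP2"
    return None if math.isnan(value) else value

# Typical header values: a number, a NaN entry, an empty entry, a text entry.
for raw in ["61", "NaN", "", "SP2"]:
    print(repr(raw), "->", parse_number(raw))
```

The caller can then test the result against None and decide what to do with files that lack a usable Total Height.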