Pandas Dataframe - Get index values based on condition

18,058

Solution 1

You can add the following code:

Retrieve the index based on the condition. Assumes constant increasing down the channels.

lower_channel_energy = df[df[period]>lower_energy].index[0]
high_channel_energy =  df[(df[period]<higher_energy).shift(-1)==False].index[0]

Printing the channels that we calculated:

print("The lower energy channel is {}".format(lower_channel_energy))
print("The higher energy channel is {}".format(high_channel_energy))

This solution assumes that the energy is increasing on the channels going down.

Solution 2

You can actually read your file directly with Pandas to simplify the program. I can reproduce the output you are expecting with:

import pandas as pd

df = pd.read_csv('data.txt', engine='python' header=1,sep=r'\s{2,}')

period = input('Enter the period: ')
lower_energy = float(input('Enter the lower energy value: '))
higher_energy = float(input('Enter the higher energy value: '))

# select the channels within the ranges provided
lo_e_range = (df[period] > lower_energy)
hi_e_range = (df[period] > higher_energy)

# Indices of the lower and higher energy channels
lec = df[period][lo_e_range].index[0]
hec = df[period][hi_e_range].index[0]

print('The lower energy channel is {}'.format(df['CHANNELS'][lec]))
print('The higher energy channel is {}'.format(df['CHANNELS'][hec]))

I have edited the code to take into account your comment.

Share:
18,058
Neal Titus Thomas
Author by

Neal Titus Thomas

Updated on June 25, 2022

Comments

  • Neal Titus Thomas
    Neal Titus Thomas almost 2 years

    I have a text file called data.txt containing tabular data look like this:

                            PERIOD
    CHANNELS    1      2      3      4       5 
    0         1.51   1.61   1.94   2.13   1.95 
    5         1.76   1.91   2.29   2.54   2.38 
    6         2.02   2.22   2.64   2.96   2.81 
    7         2.27   2.52   2.99   3.37   3.24 
    8         2.53   2.83   3.35   3.79   3.67 
    9         2.78   3.13   3.70   4.21   4.09 
    10        3.04   3.44   4.05   4.63   4.53
    

    In the CHANNELS column are the channel numbers of an instrument and in the other 5 columns are the maximum energy that that particular channel can detect in periods 1, 2, 3, 4 and 5 respectively.

    I want to write a python code which gets the inputs: Period, Lower energy and Higher energy from the user and then gives out the channel numbers corresponding to the Lower energy and Higher energy for a given period.

    For example:

    Enter the period:
    >>1
    Enter the Lower energy:
    >1.0
    Enter the Higher energy:
    >2.0
    #Output
    The lower energy channel is 0
    The higher energy channel is 6
    

    This is what I have written so far:

    import numpy as np
    import pandas as pd
    
    period = int(input('Enter the period: '))
    lower_energy = float(input('Enter the lower energy value: '))
    higher_energy = float(input('Enter the higher energy value: '))
    row_names = [0, 5, 6, 7, 8, 9, 10]
    column_names = [1, 2, 3, 4, 5] 
    data_list = []
    with open('data.txt') as f:
    lines = f.readlines()[2:]
    for line in lines:
        arr = [float(num) for num in line.split()[1:]]
        data_list.append(arr)
    df = pd.DataFrame(data_list, columns=column_names, index=row_names)
    print (df, '\n')
    print (df[period])
    

    Help me add to this.

    • Bryce Ramgovind
      Bryce Ramgovind over 5 years
      I am not sure what the actual ask/problem is ?
    • Neal Titus Thomas
      Neal Titus Thomas over 5 years
      I want to write a python code which gets the inputs: Period, Lower energy and Higher energy from the user and then gives out the channel numbers corresponding to the Lower energy and Higher energy for a given period.
    • T. Ray
      T. Ray over 5 years
      Your criteria for choosing higher and lower energy channels is unclear. For example, if the user enters 2.0 for higher energy, shouldn't the higher energy channel returned be 5 since 2.0 < 2.02 in period 1? Please clarify.
    • Neal Titus Thomas
      Neal Titus Thomas over 5 years
      The period columns are the values of MAXIMUM ENERGY THAT A CHANNEL CAN DETECT. For Period = 1, channel 5 can detect only up to 1.76. Whereas, the higher energy value = 2.0 is greater than. Therefore the correct output should be Channel 6 as it an detect up to 2.02.
  • Neal Titus Thomas
    Neal Titus Thomas over 5 years
    It isn't a typo. The period columns are the values of MAXIMUM ENERGY THAT A CHANNEL CAN DETECT. For Period = 1, channel 5 can detect only up to 1.76. Whereas, the higher energy value = 2.0 is greater than. Therefore the correct output should be Channel 6 as it an detect up to 2.02
  • Neal Titus Thomas
    Neal Titus Thomas over 5 years
    This, isn't working. It gives me the output - The lower energy channel is 0, The higher energy channel is 10 for ANY inputs.
  • Bryce Ramgovind
    Bryce Ramgovind over 5 years
    Sorry made a mistake, but this should work for your problem.
  • Neal Titus Thomas
    Neal Titus Thomas over 5 years
    It does work per se. However, I am still not getting the expected channel vaules as output. The period columns are the values of MAXIMUM ENERGY THAT A CHANNEL CAN DETECT. E.g. For Period = 1, Energy = 2.0. Channel 5 can detect only up to 1.76. Whereas, the energy value = 2.0 is greater than. Therefore the correct output should be Channel 6 as it an detect up to 2.02. Right now i am getting the channel number preceding the one i expect.
  • ejb
    ejb over 5 years
    I have edited the code to take into account your comment.
  • Bryce Ramgovind
    Bryce Ramgovind over 5 years
    Ok, that's not a problem just remove .shift(-1). I thought the requirement was the one preceding, since 2.0 is less then 2.2.