To extract non-nan values from multiple rows in a pandas dataframe

python python-2.7 numpy dataframe pandas

24,511

Solution 1

df.ix[1:6].dropna(axis=1)

As a heads up, irow will be deprecated in the next release of pandas. New methods, with clearer usage, replace it.

http://pandas.pydata.org/pandas-docs/dev/indexing.html#deprecations
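For readers on a recent pandas release, here is a minimal sketch of the same idea using .iloc, since .ix was itself later deprecated and then removed in pandas 1.0. The toy frame and its column names are made up for illustration. Note that .iloc slices by position and excludes the end point.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the taxi dataset from the question.
df = pd.DataFrame(
    {
        "A": [1.0, 2.0, np.nan, 4.0],
        "B": [1.0, 2.0, 3.0, 4.0],
        "C": [np.nan, 2.0, 3.0, 4.0],
    }
)

# Select rows at positions 1, 2 and 3, then drop every column
# that contains at least one NaN within those rows.
subset = df.iloc[1:4].dropna(axis=1)
print(subset)
```

Column A is dropped because it has a NaN inside the selected rows; column C survives because its only NaN lies outside the slice.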

Solution 2

In 0.11 (0.11rc1 is out now), this is very easy: use .iloc to select the first 6 rows, then dropna to drop any row that contains a NaN. (You can also pass options to dropna to control exactly which columns are considered.)

Note that you asked for rows 1:6, whereas I used 0:6 in the example below.

In [8]: df = DataFrame(randn(10,3),columns=list('ABC'),index=date_range('20130101',periods=10))

In [9]: df.ix[6,'A'] = np.nan

In [10]: df.ix[6,'B'] = np.nan

In [11]: df.ix[2,'A'] = np.nan

In [12]: df.ix[4,'B'] = np.nan

In [13]: df.iloc[0:6]
Out[13]: 
                   A         B         C
2013-01-01  0.442692 -0.109415 -0.038182
2013-01-02  1.217950  0.006681 -0.067752
2013-01-03       NaN -0.336814 -1.771431
2013-01-04 -0.655948  0.484234  1.313306
2013-01-05  0.096433       NaN  1.658917
2013-01-06  1.274731  1.909123 -0.289111

In [14]: df.iloc[0:6].dropna()
Out[14]: 
                   A         B         C
2013-01-01  0.442692 -0.109415 -0.038182
2013-01-02  1.217950  0.006681 -0.067752
2013-01-04 -0.655948  0.484234  1.313306
2013-01-06  1.274731  1.909123 -0.289111
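The dropna options mentioned above can be sketched as follows. This is only an illustration on a made-up frame: subset restricts the NaN check to the listed columns, and how="all" drops a row only when every value in it is NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data chosen so that each option gives a different result.
df = pd.DataFrame(
    {
        "A": [1.0, np.nan, 3.0, np.nan],
        "B": [1.0, 2.0, np.nan, np.nan],
        "C": [1.0, 2.0, 3.0, np.nan],
    }
)

first_rows = df.iloc[0:4]

# Only column A is considered when deciding which rows to drop.
only_a = first_rows.dropna(subset=["A"])

# A row is dropped only if all of its values are NaN.
not_all_nan = first_rows.dropna(how="all")

print(only_a)
print(not_all_nan)
```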


Author by

user2179627

Updated on November 02, 2020

Comments

user2179627 over 3 years

I am working on several taxi datasets. I have used pandas to concatenate all the datasets into a single dataframe.

My dataframe looks something like this.

                     675                     1039                    # and the remaining 125 taxis
                     longitude   latitude    longitude   latitude
date
2008-02-02 13:31:21  116.56359   40.06489    NaN         NaN
2008-02-02 13:31:51  116.56486   40.06415    NaN         NaN
2008-02-02 13:32:21  116.56855   40.06352    116.58243   39.6313
2008-02-02 13:32:51  116.57127   40.06324    NaN         NaN
2008-02-02 13:33:21  116.57120   40.06328    116.55134   39.6313
2008-02-02 13:33:51  116.57121   40.06329    116.55126   39.6123
2008-02-02 13:34:21  NaN         NaN         116.55134   39.5123

where 675 and 1039 are taxi IDs. In total there are 127 taxis, each with its own longitude and latitude columns.

I have several ways to extract the non-null values from a single row:

df.ix[k,df.columns[np.isnan(df.irow(0))!=1]]
              (or)
df.irow(0)[np.isnan(df.irow(0))!=1]
              (or)
df.irow(0)[np.where(df.irow(0)[df.columns].notnull())[0]]

Any of the above commands returns:

675   longitude    116.56359
      latitude     40.064890 
4549  longitude    116.34642
      latitude      39.96662
Name: 2008-02-02 13:31:21
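On current pandas, where irow and .ix are no longer available, the single-row case can be written more directly. The following is a rough sketch, assuming a small frame rebuilt from the first three rows of the table above (only two taxis, and the construction code itself is an assumption, not the original loading code).

```python
import numpy as np
import pandas as pd

# Two-level columns: taxi id on top, coordinate name below.
columns = pd.MultiIndex.from_product([[675, 1039], ["longitude", "latitude"]])

df = pd.DataFrame(
    [
        [116.56359, 40.06489, np.nan, np.nan],
        [116.56486, 40.06415, np.nan, np.nan],
        [116.56855, 40.06352, 116.58243, 39.6313],
    ],
    index=pd.to_datetime(
        ["2008-02-02 13:31:21", "2008-02-02 13:31:51", "2008-02-02 13:32:21"]
    ),
    columns=columns,
)

# Take the first row by position and keep only its non-null entries.
row0 = df.iloc[0].dropna()
print(row0)
```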

Now I want to extract all the non-null values from the first few rows (say, from row 1 to row 6).

How do I do that?

I could probably loop over the rows, but I want a non-looped way of doing it.

Any help or suggestions are welcome. Thanks in advance! :)
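One possible non-looped approach, offered only as a sketch and not taken from the answers on this page, is to reshape the selected rows into a long Series with stack and then drop the NaN entries. The frame below is an assumption rebuilt from the first three rows of the table, with two taxis only. Depending on the pandas version, stack may already drop NaN values or may emit a FutureWarning about its new implementation; the explicit dropna makes the result the same either way.

```python
import numpy as np
import pandas as pd

# Two-level columns: taxi id on top, coordinate name below.
columns = pd.MultiIndex.from_product([[675, 1039], ["longitude", "latitude"]])

df = pd.DataFrame(
    [
        [116.56359, 40.06489, np.nan, np.nan],
        [116.56486, 40.06415, np.nan, np.nan],
        [116.56855, 40.06352, 116.58243, 39.6313],
    ],
    index=pd.to_datetime(
        ["2008-02-02 13:31:21", "2008-02-02 13:31:51", "2008-02-02 13:32:21"]
    ),
    columns=columns,
)

# Move both column levels into the row index, giving one entry per
# (timestamp, taxi id, coordinate), then keep only the non-null entries.
non_null = df.iloc[0:3].stack(level=[0, 1]).dropna()
print(non_null)
```

The result is indexed by timestamp, taxi id and coordinate name, so the values for one timestamp can be pulled out afterwards with non_null.loc[timestamp].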
