Determining when a column value changes in pandas dataframe

39,943

Solution 1

You can create a new column for the difference

> df['C'] = df['B'].diff()
> print df
   #  A  B   C
0  1  2  3 NaN
1  2  3  3   0
2  3  4  4   1
3  4  5  4   0
4  5  5  4   0

> df_filtered = df[df['C'] != 0]
> print df_filtered
   #  A  B  C
2  3  4  4  1

This will your required rows

Solution 2

You can do the following which also works for non numerical values:

>>> import pandas as pd
>>> df = pd.DataFrame({"Status": ["A","A","B","B","C","C","C"]})
>>> df["isStatusChanged"] = df["Status"].shift(1, fill_value=df["Status"].head(1)) != df["Status"]
>>> df
  Status  isStatusChanged
0      A            False
1      A            False
2      B             True
3      B            False
4      C             True
5      C            False
6      C            False
>>> 

Note the fill_value could be different depending on your application.

Solution 3

you can use this it is much faster, hope it helps!!

my_column_changes = df["MyStringColumn"].shift() != df["MyStringColumn"]
Share:
39,943
badrobit
Author by

badrobit

Updated on July 09, 2022

Comments

  • badrobit
    badrobit almost 2 years

    I am looking to write a quick script that will run through a csv file with two columns and provide me the rows in which the values in column B switch from one value to another:

    eg:

    dataframe:

    # |  A  |  B  
    --+-----+-----
    1 |  2  |  3
    2 |  3  |  3
    3 |  4  |  4
    4 |  5  |  4
    5 |  5  |  4
    

    would tell me that the change happened between row 2 and row 3. I know how to get these values using for loops but I was hoping there was a more pythonic way of approaching this problem.

  • amc
    amc almost 7 years
    How can you do this with differences for strings? It seems that this diff() only works for numbers. I could convert all the strings to numbers...
  • Kathirmani Sukumar
    Kathirmani Sukumar about 4 years
    If you have numbers which are embedded in strings, you could use something like df['A'].astype(float).diff()
  • Eamonn Kenny
    Eamonn Kenny over 3 years
    This does not work for large strings such as "NL5358383". It can't be converted to a float.
  • Jinhua Wang
    Jinhua Wang over 3 years
    @amc you can map the string to int first.