Remove duplicates values based on multiple column with a condition in query editor Power BI

13,890

Solution 1

I bet the suggestion from @Alexis Olson works just fine, but since you specifically mentioned the Query Editor, here's how I would do it there:


  1. Have your data loaded like below, and just accept the changes made under Changed Type:

enter image description here

Don't worry about the other steps under the Query Settings. We'll get to that eventually.

  1. Select Add Column and click Index Column, so that you get this:

enter image description here

  1. Select Add Column, click Custom Column and insert this little formula in the appearing dialog box Table.AddColumn(#"Added Index", "Custom", each #"Added Index"[Time Spent]{[Index]}-#"Added Index"[Time Spent]{[Index]-1}):

enter image description here

  1. Click OK, and make sure that you're getting this:

enter image description here

  1. I think this step is a little weird, but you'll have to click 'Table' there in the column:

enter image description here

  1. You will get an Error message in the first row, but you can remove that by right-clicking that column, and clicking Remove Errors:

enter image description here

  1. Now you can click the drop-down menu in the Custom Column, select Number Filter and Does Not Equal

enter image description here

  1. And insert 0, or select 0 from the drop-down menu in the dialog box:

enter image description here

  1. This is it, your required numbers should now be filtered away:

enter image description here

Note, however, that this procedure comes at a cost since you're losing the first value due to the first step in the indexing. If the rest of this is something you can use, I can see if we can fix that last little part as well.

Solution 2

You can pick a representative [Time Spent] value from each unique set of rows by taking a max or min over the list of "duplicate" values. Here's the formula for such a custom column, which I'll call [Min Time]:

= List.Min(
      Table.SelectRows(#"Previous Step",
          (C) => (C[Time] = [Time] and
                  C[ID] = [ID] and
                  Number.Abs(C[Time Spent] - [Time Spent]) < 1)
      )[Time Spent])

Once you have this custom column, you can group by [Time], [ID], and [Min Time] to roll up the duplicates and then rename the [Min Time] column to [Time Spent].

Share:
13,890
Admin
Author by

Admin

Updated on July 23, 2022

Comments

  • Admin
    Admin almost 2 years

    I am new to power bi and would require your help to sort out below issue which I am facing.

    Basically I am taking three columns into consideration as below:

    enter image description here

    Question: I would like to remove duplicate values from above table based on conditon " Equal value for "Time" ,"ID" and Absolute difference in "Time spent" is lower or equal than 1" as you can see in the image Rows highlighted falls in this category.

    I would like to get these below rows removed based upon condition.

    enter image description here

    Question: I would like to remove duplicate values from above table based on conditon " Equal value for "Time" ,"ID" and Absolute difference in "Time spent" is lower or equal than 1" as you can see in the image Rows highlighted falls in this category.

    I would like to get these below rows removed based upon condition.

    enter image description here

    I am able to perform this in excel by making us of a fourth column with formulae =IF(AND(A3=A2,B3=B2,ABS(F3-F2)<1),"problem",0) and then filtering out the rows marked as probelm. Please help!!

    Regards

    Mahi

  • Admin
    Admin over 5 years
    Hi vestland thanks for your help this worked well :)
  • vestland
    vestland over 5 years
    No problem! We need more activity on Power BI related problems on this forum =) Now that you've earned more than 15 reputation points, you've got the possibility to upvote answers that are useful to you. And that does not have to be limited to the accepted answer.