Compare two arrays using LINQ


Solution 1

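Build a HashSet from the second array, then filter the first, keeping an item only when it cannot be removed from the HashSet. Because a HashSet stores each value once, every distinct value in arrayTwo cancels at most one occurrence in arrayOne, even if arrayTwo contains that value several times.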

var hs = new HashSet<string>(arrayTwo);
var filtered = arrayOne.Where(item => !hs.Remove(item)).ToArray();

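Taking account of your extra requirements in the comments, some nifty use of ILookup works nicely here. For each distinct value, it keeps as many copies as arrayOne has beyond the number found in arrayTwo.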

var lookup1 = arrayOne.ToLookup(item => item);
var lookup2 = arrayTwo.ToLookup(item => item);
var output = lookup1.SelectMany(i => i.Take(i.Count() - lookup2[i.Key].Count())).ToArray();
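The same count-based idea can also be written as an explicit loop. The following is only a sketch under assumptions: the names MultisetDemo and ExceptWithCounts are hypothetical and not part of the original answer, and the inputs are assumed to contain no null strings. Unlike the lookup version, it keeps the leftover items in their original order instead of grouping them by value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultisetDemo
{
    // Hypothetical helper: returns the items of `first` that are left over
    // after each item of `second` has cancelled one equal item of `first`.
    public static string[] ExceptWithCounts(string[] first, string[] second)
    {
        // Count how many times each value occurs in `second`.
        var remaining = new Dictionary<string, int>();
        foreach (var item in second)
        {
            remaining.TryGetValue(item, out var count); // count is 0 when missing
            remaining[item] = count + 1;
        }

        var output = new List<string>();
        foreach (var item in first)
        {
            if (remaining.TryGetValue(item, out var count) && count > 0)
                remaining[item] = count - 1; // cancelled by one occurrence in `second`
            else
                output.Add(item);            // nothing left to cancel it
        }
        return output.ToArray();
    }
}
```

With the arrays from the question, MultisetDemo.ExceptWithCounts(arrayOne, arrayTwo) yields "Three", "Three".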

Solution 2

The answer depends on the array sizes, the number of duplicate elements, and how much code speed matters.

For small arrays, the following code would be the simplest and the best:

List<string> result = new List<string>(arrayOne);
foreach (string element in arrayTwo)
    result.Remove(element);

If you want more efficiency for large arrays, you can use spender's answer.

If you want the most efficient code, you will have to implement the following algorithm by hand:

1. Sort both arrayOne and arrayTwo.
2. Iterate over both arrays simultaneously (as in the merge step of mergesort) and omit pairs with the same elements.

Pros: no heavy Lookup object. Cons: needs hand-written code.
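The sort-and-merge algorithm described above could be sketched as follows. This is an illustration rather than the answerer's own code: the names SortedDifference and Compute are hypothetical, ordinal string comparison is an assumption, and the result comes back in sorted order rather than in the original order of arrayOne.

```csharp
using System;
using System.Collections.Generic;

public static class SortedDifference
{
    // Hypothetical helper: multiset difference arrayOne minus arrayTwo
    // computed by sorting both inputs and walking them in parallel.
    public static List<string> Compute(string[] arrayOne, string[] arrayTwo)
    {
        // Sort copies so the caller's arrays are left untouched.
        var a = (string[])arrayOne.Clone();
        var b = (string[])arrayTwo.Clone();
        Array.Sort(a, StringComparer.Ordinal);
        Array.Sort(b, StringComparer.Ordinal);

        var result = new List<string>();
        int i = 0, j = 0;
        while (i < a.Length && j < b.Length)
        {
            int cmp = string.CompareOrdinal(a[i], b[j]);
            if (cmp == 0) { i++; j++; }                   // matching pair: omit both
            else if (cmp < 0) { result.Add(a[i]); i++; }  // a[i] has no partner in b
            else { j++; }                                 // b[j] has no partner in a
        }

        // Anything still left in a has nothing to cancel it.
        for (; i < a.Length; i++) result.Add(a[i]);
        return result;
    }
}
```

For arrayOne = { "One", "Two", "Three", "Three", "Three", "X" } and arrayTwo = { "One", "Two", "Three" }, this returns "Three", "Three", "X".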

Solution 3

One way to do it would be to include indices as well like:

var result = arrayOne.Select((r, i) => new {Value = r, Index = i})
    .Except(arrayTwo.Select((r, i) => new {Value = r, Index = i}))
    .Select(t => t.Value);

This gives the required output for your input, but the problem with this approach is that the same string at different indices is treated as a different element.

The other approach, which ignores indices, could look like this:

string[] arrayOne = { "One", "Two", "Three", "Three", "Three", "X" };
string[] arrayTwo = { "One", "Two", "Three" };

var query1 = arrayOne.GroupBy(r => r)
    .Select(grp => new
    {
        Value = grp.Key,
        Count = grp.Count(),
    });

var query2 = arrayTwo.GroupBy(r => r)
    .Select(grp => new
    {
        Value = grp.Key,
        Count = grp.Count(),
    });

var result = query1.Select(r => r.Value).Except(query2.Select(r => r.Value)).ToList();
var matchedButDifferentCount = from r1 in query1
    join r2 in query2 on r1.Value equals r2.Value
    where r1.Count > r2.Count
    select Enumerable.Repeat(r1.Value, r1.Count - r2.Count);

result.AddRange(matchedButDifferentCount.SelectMany(r => r));

result will contain {"X", "Three", "Three"}

Solution 4

You can get the desired output by adding an index to each element of the arrays to make them look like

{{ "One", 0 }, { "Two", 0 }, { "Three", 0 }, { "Three", 1 }, { "Three", 2 }}
{{ "One", 0 }, { "Two", 0 }, { "Three", 0 }}

Then you can use Except to remove the pairs that also appear in the second array:

var arrayOneWithIndex = arrayOne
    .GroupBy(x => x)
    .SelectMany(g => g.Select((e, i) => new { Value = e, Index = i }));

var arrayTwoWithIndex = arrayTwo
    .GroupBy(x => x)
    .SelectMany(g => g.Select((e, i) => new { Value = e, Index = i }));

var result = arrayOneWithIndex.Except(arrayTwoWithIndex).Select(x => x.Value);

Solution 5

Coming to this discussion late, and recording this here for reference. LINQ's Except method uses the default equality comparer to determine which items match in your two arrays. The default equality comparer, in this case, invokes the Equals method on the object. For strings, this method has been overridden to compare the content of the string, not its identity (reference). Except is also a set operation: a value that occurs in the second array is excluded however many times it occurs in the first, so all three "Three" entries are dropped.

This explains why this is occurring in this particular scenario. Granted, it doesn't provide a solution, but I believe that others have already provided excellent answers. (And realistically, this is more than I could fit into a comment.)

One suggestion I might have made was to write a custom comparer and pass it to the Except overload that accepts one. Custom comparers are not overly complicated, but given your scenario, I understand why you might not have wanted to do so.
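As a rough illustration of that overload, here is a sketch that passes a comparer to Except(second, comparer). The comparer class, its case-insensitive rule, and the sample arrays are all hypothetical choices made for this example. Note that a comparer only changes which strings count as equal; Except still returns each surviving value once, so on its own it does not keep the extra duplicates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: two strings are equal when they match ignoring case.
public sealed class IgnoreCaseComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y) =>
        string.Equals(x, y, StringComparison.OrdinalIgnoreCase);

    // Must agree with Equals: strings that are equal ignoring case get the same hash.
    public int GetHashCode(string obj) =>
        StringComparer.OrdinalIgnoreCase.GetHashCode(obj);
}

public static class ExceptDemo
{
    public static string[] Run()
    {
        string[] arrayOne = { "One", "Two", "Three", "Three", "Four", "Four" };
        string[] arrayTwo = { "one", "TWO", "three" };

        // The second argument replaces the default equality comparer.
        return arrayOne.Except(arrayTwo, new IgnoreCaseComparer()).ToArray();
    }
}
```

ExceptDemo.Run() returns a single "Four": the case-insensitive matches are removed, and the duplicate "Four" is collapsed because Except yields distinct values.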


Author: Sandeep Kushwah


Updated on June 18, 2022

Comments

  • Sandeep Kushwah almost 2 years

    For example, I have two arrays:

    string[] arrayOne = {"One", "Two", "Three", "Three", "Three"};
    string[] arrayTwo = {"One", "Two", "Three"};
    
    var result = arrayOne.Except(arrayTwo);
    
    foreach (string s in result) Console.WriteLine(s);
    

    I want the items from arrayOne that are not in arrayTwo. So here I need the result to be: Three Three, but I am getting no results because it treats "Three" as common and does not check the other two items ("Three", "Three").

    I don't want to end up writing a huge method to solve this. I tried a couple of other answers on SO, but they didn't work as expected :(.

    Thanks!!!

    • spender over 8 years
      That depends on whether having string[] arrayTwo = {"Two", "Three", "Three", "One"}; would still filter "One" out of arrayOne... Habib's doesn't do that.
  • Andrey Nasonov over 8 years
    This will not work for arrayTwo with duplicate elements
  • spender over 8 years
    @AndreyNasonov My modded version will take this into account.
  • Sandeep Kushwah over 8 years
    Yeah,.. Quite Nice and Tricky. Thanks for elaborating it clearly.
  • StuartLC over 8 years
    +1 - For once imperative, mutative code is actually more concise than most of our Linq efforts. Plus it also retains a left to right 'annihilation' of the elements.
  • Sandeep Kushwah over 8 years
    Yeah, your updated code works. Thanks for giving your time to give it a try :)
  • Sandeep Kushwah over 8 years
    True that custom comparers are not overly complicated. I even created one, but I was excited to learn something new instead of going for if/else again.
  • Sandeep Kushwah over 8 years
    Would you like to give an example of the Except overload that you were thinking of?
  • Sandeep Kushwah over 8 years
    Thanks @StuartLC :). I have tested your answer and it worked as expected, but I was looking for some shorter code and so went with spender's answer :)
  • StuartLC over 8 years
    Agreed - my answer is clumsy compared to Spender's clever use of ToLookup() and Andrey's second answer is elegantly simple.