Is there a faster way to extract unique values from an object collection?
Solution 1
If you want to get or count the distinct areas in the employee list, you can use a set of strings. I'm changing the variable names to match Java conventions. You can get the count afterwards by calling size() on the result. Ideally, these would be lazy methods.
Imperative Code
public Set<String> areas(final List<Employee> employees) {
    Set<String> areas = new HashSet<>();
    for (final Employee employee : employees) {
        areas.add(employee.getArea());
    }
    return areas;
}
Functional Code (Google Guava)
public Set<String> areas(final List<Employee> employees) {
    return Sets.newHashSet(
        Lists.transform(employees, new Function<Employee, String>() {
            public String apply(Employee e) {
                return e.getArea();
            }
        }));
}
Lambdas (Java 8)
public Set<String> areas(final List<Employee> employees) {
    // Requires java.util.stream.Collectors
    return employees.stream().map(Employee::getArea).collect(Collectors.toSet());
}
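The versions above assume a List<Employee> with getters, while the question uses an Employee[] array with public fields and wants a String[] back. A minimal sketch of the imperative version adapted to that shape follows. It avoids the diamond operator and lambdas so it should compile for older Android targets. The class name DistinctAreas, the convenience constructor, and the choice of LinkedHashSet (swapped in for HashSet only to keep first-seen order) are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class DistinctAreas {
    // Mirrors the Employee class from the question (public fields, no getters).
    static class Employee {
        public String AREA;
        public String EMPLOYEE_ID;
        public String EMPLOYEE_NAME;

        // Hypothetical convenience constructor, not part of the original class.
        Employee(String area, String id, String name) {
            this.AREA = area;
            this.EMPLOYEE_ID = id;
            this.EMPLOYEE_NAME = name;
        }
    }

    // Collects the distinct AREA values in a single pass over the array.
    // LinkedHashSet keeps the order in which areas were first seen.
    static String[] distinctAreas(Employee[] employees) {
        Set<String> areas = new LinkedHashSet<String>();
        for (Employee e : employees) {
            areas.add(e.AREA); // duplicates are silently ignored
        }
        return areas.toArray(new String[areas.size()]);
    }

    public static void main(String[] args) {
        Employee[] staff = {
            new Employee("Sales", "1", "Ann"),
            new Employee("IT", "2", "Bob"),
            new Employee("Sales", "3", "Cy")
        };
        System.out.println(Arrays.toString(distinctAreas(staff))); // prints [Sales, IT]
    }
}
```

If the order of the areas does not matter, a plain HashSet works the same way.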
Solution 2
Insert all employees into a HashSet. By the definition of Set, they will all be unique.
Set<Employee> unique = new HashSet<Employee>(Arrays.asList(employeeTress));
// unique.toArray() if needed
If you want Employee objects to be considered equal when they have the same AREA, you need to properly override the equals() method (and hashCode() along with it) in the Employee class.
Solution 3
You can use a Set to do this, as others have already stated, but if you want items to be considered equal when they have the same AREA, then you'll need to override the equals method in your Employee class so that it compares itself to others based on that field.
You need to know a few things before just overriding the equals method. There's a discussion about it here: What issues should be considered when overriding equals and hashCode in Java?
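As a rough illustration of what that involves, here is a minimal sketch of an equals/hashCode pair that treats two employees as equal when their areas match. The class AreaEquality, the lower-case field names, and the helper countDistinctByArea are hypothetical. The key point is that hashCode must be derived from the same field that equals compares, otherwise a HashSet will not detect the duplicates.

```java
import java.util.HashSet;
import java.util.Set;

public class AreaEquality {
    // Hypothetical Employee variant whose identity is defined by its area only.
    static final class Employee {
        final String area;
        final String name;

        Employee(String area, String name) {
            this.area = area;
            this.name = name;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Employee)) return false; // also rejects null
            Employee other = (Employee) obj;
            return area == null ? other.area == null : area.equals(other.area);
        }

        @Override
        public int hashCode() {
            // Must agree with equals: equal areas give equal hash codes.
            return area == null ? 0 : area.hashCode();
        }
    }

    // Counts employees that differ by area, relying on the overrides above.
    static int countDistinctByArea(Employee[] employees) {
        Set<Employee> unique = new HashSet<Employee>();
        for (Employee e : employees) {
            unique.add(e);
        }
        return unique.size();
    }
}
```

Note that with this definition the set keeps only one employee per area, which is rarely what you want for the Employee type in general; collecting the area strings directly is usually simpler.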
Solution 4
Just use a HashSet; it will only add unique elements. The add(Object) method of HashSet returns true on successful addition of the object and false if an equal element is already present:
Set<Employee> hs = new HashSet<Employee>();
if (!hs.add(i2)) {
    // do some operation here
}
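To make the return value of add() concrete, here is a small sketch that uses it to report which areas occur more than once. The class DuplicateAreas and the method repeatedAreas are hypothetical names, and plain strings are used so that no equals override is needed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DuplicateAreas {
    // Returns each area that occurs more than once, in order of its first repeat.
    static List<String> repeatedAreas(String[] areas) {
        Set<String> seen = new HashSet<String>();
        List<String> repeated = new ArrayList<String>();
        for (String area : areas) {
            // add() returns false when the value is already in the set.
            if (!seen.add(area) && !repeated.contains(area)) {
                repeated.add(area);
            }
        }
        return repeated;
    }

    public static void main(String[] args) {
        String[] areas = { "Sales", "IT", "Sales", "HR", "IT", "Sales" };
        System.out.println(repeatedAreas(areas)); // prints [Sales, IT]
    }
}
```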
You will also need to override the equals method (and hashCode) here.
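@Override
public boolean equals(Object obj) {
    if (obj == null)
        return false;
    if (obj == this)
        return true;
    if (!(obj instanceof Employee))
        return false;
    // HERE PERFORM YOUR CHECK: compare the field(s) that define equality
    Employee other = (Employee) obj;
    return EMPLOYEE_NAME == null ? other.EMPLOYEE_NAME == null
                                 : EMPLOYEE_NAME.equals(other.EMPLOYEE_NAME);
}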
Also make sure that the hashCode() of the key objects that you put into the collection never changes while the object is in the collection. The best way to ensure this is to make your keys immutable.
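The following sketch shows why a changing hash code is a problem. The class MutableKeyHazard and its Key type are hypothetical; the field is left mutable on purpose to demonstrate the failure. After the field changes, the set looks for the element under its new hash code, while the entry was stored under the old one, so the lookup fails even though the object is still in the set.

```java
import java.util.HashSet;
import java.util.Set;

public class MutableKeyHazard {
    // Hypothetical key type; the field is deliberately NOT final.
    static final class Key {
        String area;

        Key(String area) {
            this.area = area;
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Key && area.equals(((Key) obj).area);
        }

        @Override
        public int hashCode() {
            return area.hashCode();
        }
    }

    // Returns whether the set can still find the key after it was mutated.
    static boolean lookupSurvivesMutation() {
        Set<Key> set = new HashSet<Key>();
        Key key = new Key("Sales");
        set.add(key);
        key.area = "IT"; // hash code changes while the key is in the set
        return set.contains(key); // false: the entry is filed under the old hash
    }

    public static void main(String[] args) {
        System.out.println(lookupSurvivesMutation());
    }
}
```

Declaring the field final (and offering no setter) rules this out at compile time.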
montelof
Updated on September 15, 2022
Comments
-
montelof over 1 year
I have a method to extract the values from an object collection that holds employee information:
public class Employee {
    public String AREA;
    public String EMPLOYEE_ID;
    public String EMPLOYEE_NAME;
}
I'd like to get all the distinct areas. I did what I thought would be easiest: check whether the ArrayList already contains the value and, if not, add it. It takes 187ms to complete:
long startTime = System.currentTimeMillis();
ArrayList<String> distinct_areas = new ArrayList<String>();
for (int i = 0; i < this.employeeTress.length; i++) {
    if (!distinct_areas.contains(this.employeeTress[i].AREA))
        distinct_areas.add(this.employeeTress[i].AREA);
}
String[] unique = new String[distinct_areas.size()];
distinct_areas.toArray(unique);
long endTime = System.currentTimeMillis();
System.out.println("Total execution time: " + (endTime - startTime) + "ms");
Then I thought I'd do it differently to see if it gets faster: sort the array, then compare each item only with the last one added and add it if it is different. It's a little faster; it takes 121ms to complete:
startTime = System.currentTimeMillis();
String[] vs = new String[this.employeeTress.length];
for (int i = 0; i < this.employeeTress.length; i++) {
    vs[i] = this.employeeTress[i].AREA;
}
Arrays.sort(vs);
ArrayList<String> vsunique = new ArrayList<String>();
vsunique.add(vs[0]);
for (int i = 0; i < vs.length; i++) {
    if (!vsunique.get(vsunique.size() - 1).equals(vs[i])) {
        vsunique.add(vs[i]);
    }
}
String[] uni = new String[vsunique.size()];
vsunique.toArray(uni);
endTime = System.currentTimeMillis();
System.out.println("Total execution time: " + (endTime - startTime) + "ms");
I'm new to Java and I'd like to know a better way to do this. Note: this code should work on Android Gingerbread (API level 10). Regards.
-
Luiggi Mendoza
Use a Set instead of a List. Also, I would not worry about this performance improvement until it proves to be a real bottleneck in the application.
-
Luiggi Mendoza over 10 years
In fact, the Set should hold the Employee data instead of the array the OP is currently using.
-
montelof over 10 years
The imperative form using HashSet is much faster than checking whether the item is already in the collection. Thank you.
-
Eric Jablow over 10 years
That's because the JRE is doing the same thing. The other versions simply remove the explicit looping. They don't buy you that much since this isn't a lazy problem; you need to compute the entire thing. Still, you should consider other ways of storing the data. A relational database could store your employees, and with appropriate indexing, the db could work extremely fast. I did fix a typo.