How to efficiently remove duplicates from an array without using Set
Solution 1
Since this question is still getting a lot of attention, I decided to answer it by copying this answer from Code Review.SE:
You're following the same philosophy as the bubble sort, which is very, very, very slow. Have you tried this?:
Sort your unordered array with quicksort. Quicksort is much faster than bubble sort (I know, you are not sorting, but the algorithm you follow is almost the same as bubble sort to traverse the array).
Then start removing duplicates (repeated values will be next to each other). In a
for
loop you could have two indices:source
anddestination
. (On each loop you copysource
todestination
unless they are the same, and increment both by 1). Every time you find a duplicate you increment source (and don't perform the copy). @morgano
Solution 2
you can take the help of Set collection
int end = arr.length;
Set<Integer> set = new HashSet<Integer>();
for(int i = 0; i < end; i++){
set.add(arr[i]);
}
now if you will iterate through this set, it will contain only unique values. Iterating code is like this :
Iterator it = set.iterator();
while(it.hasNext()) {
System.out.println(it.next());
}
Solution 3
If you are allowed to use Java 8 streams:
Arrays.stream(arr).distinct().toArray();
Solution 4
Note: I am assuming the array is sorted.
Code:
int[] input = new int[]{1, 1, 3, 7, 7, 8, 9, 9, 9, 10};
int current = input[0];
boolean found = false;
for (int i = 0; i < input.length; i++) {
if (current == input[i] && !found) {
found = true;
} else if (current != input[i]) {
System.out.print(" " + current);
current = input[i];
found = false;
}
}
System.out.print(" " + current);
output:
1 3 7 8 9 10
Solution 5
Slight modification to the original code itself, by removing the innermost for loop.
public static int[] removeDuplicates(int[] arr){
int end = arr.length;
for (int i = 0; i < end; i++) {
for (int j = i + 1; j < end; j++) {
if (arr[i] == arr[j]) {
/*int shiftLeft = j;
for (int k = j+1; k < end; k++, shiftLeft++) {
arr[shiftLeft] = arr[k];
}*/
arr[j] = arr[end-1];
end--;
j--;
}
}
}
int[] whitelist = new int[end];
/*for(int i = 0; i < end; i++){
whitelist[i] = arr[i];
}*/
System.arraycopy(arr, 0, whitelist, 0, end);
return whitelist;
}
ashur
Updated on July 08, 2022Comments
-
ashur almost 2 years
I was asked to write my own implementation to remove duplicated values in an array. Here is what I have created. But after tests with 1,000,000 elements it took very long time to finish. Is there something that I can do to improve my algorithm or any bugs to remove ?
I need to write my own implementation - not to use
Set
,HashSet
etc. Or any other tools such as iterators. Simply an array to remove duplicates.public static int[] removeDuplicates(int[] arr) { int end = arr.length; for (int i = 0; i < end; i++) { for (int j = i + 1; j < end; j++) { if (arr[i] == arr[j]) { int shiftLeft = j; for (int k = j+1; k < end; k++, shiftLeft++) { arr[shiftLeft] = arr[k]; } end--; j--; } } } int[] whitelist = new int[end]; for(int i = 0; i < end; i++){ whitelist[i] = arr[i]; } return whitelist; }