Fast Way to Get All Unique Values from a List in Java
Let’s say you have an ArrayList of string values and want to get all unique values from it. A seemingly obvious way to do this would be to create a list and before adding the next item to check that an identical item doesn’t already exist in the list.
ArrayList uniqueVals = new ArrayList(); for (Object x : myList) if (!uniqueVals.contains(x)) uniqueVals.add(x);
Unfortunately, this can be extremely slow when you are dealing with large lists. An alternative is the following workaround that uses an efficient algorithm implemented in the set class. It builds on some code I recently published for finding the union of two collections.
public static Collection Union(Collection coll1, Collection coll2) { Set union = new HashSet(coll1); union.addAll(new HashSet(coll2)); return new ArrayList(union); } public static ArrayList GetUniqueValues(Collection values) { return (ArrayList)Union(values, values); }
Essentially you’re performing the union on a single list.
This could be tested in the following way.
ArrayList x = new ArrayList(); x.add("abc"); x.add("abc"); x.add("abc"); x.add("def"); x.add("def"); x.add("ghi"); for (Object y : Lists.GetUniqueValues(x)) System.out.println(y);
This would give you abc, def, and ghi.
The above two methods could be combined into one, but that’s not really helpful because the way it is now you get two useful methods for the price of one.
return new ArrayList(new HashSet(values)); will serve the same purpose.
There’s only one problem I can see with both of those solutions: It relies on there not being any collision between hashCode() values. In general it’s the case, but the contract of the hashCode() method doesn’t guarantee it, so you could potentially lose unique values that hash to the same as something else.
not to mention it is not the most efficient way to do it
Stuart and Oleg suggest this is not the best way to approach the solution. What do you (or someone else) suggest as a way of doing it better? This forum is about sharing information.
The example only works because the items you add to the List are added following a natural order. If you test it using:
x.add(“def”);
x.add(“abc”);
x.add(“abc”);
x.add(“ghi”);
x.add(“def”);
Lists.GetUniqueValues(x) will return a List that contains (“abc”, “def”, “ghi”) which is not what is expected… What is expected is (“def”, “abc”, “ghi”).
What I suggest is one of the following solutions:
1- You can include external free libraries in your application? Go with the Apache library Commons Collections, the SetUniqueList is what you are looking for(http://commons.apache.org/collections/api-release/org/apache/commons/collections/list/SetUniqueList.html),
2- You can’t include any external libraries? Create a new class UniqueArrayList that extends ArrayList and overrides the add methods so they check if the object to add is already in the list to prevent duplicates.
one way to do it efficiently is to use Arrays.sort method on your array, using the default comparator for the types which implement Comparable or defying a new Comparator for your custom types. Than just iterate through the array to get unique values like in the code below
int[] a = new int[6];
a[0] = 10; a[1] = 100; a[2] = 100;
a[3] = 20; a[4] = 10; a[5] = 10;
Arrays.sort(a);
int value = a[0];
int i = 1;
while(i < a.length) {
if(a[i] == value) {
if(i == (a.length – 1)) {
System.out.println(a[i]);
}
} else {
System.out.println(a[i - 1]);
value = a[i];
if(i == (a.length – 1)) {
System.out.println(a[i]);
}
}
i++;
}
this will fine as long as a proper Comparator is provided to Arrays.sort() method
[...] Instead of looping through each letter of the given letter list (letters), Get a sub-list of all of the unique letters in the letter list (link: Fast way to get all unique value from a list in Java) [...]