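Java's collection classes derive mainly from two interfaces: Collection and Map. Of these, List and Map are the ones we use most often, and Set gets comparatively little use. When I learned this from textbooks, all I took away was that a Set is unordered and holds no duplicates, so I never thought carefully about how ordering works in a Set; I knew the what but not the why. Recently, though, I ran into a Set ordering problem in a project, so I picked the books up again, read the material on collections closely, and went through online tutorials and forum posts. What follows is what I collected and summarized about Set, for reference.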
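The first thing that comes to mind about a Set is that it is unordered and contains no duplicates. When you try to add an object that equals one already in the Set, it is not added. A HashSet first uses the element's hashCode() to locate a bucket and then uses equals() to check whether an equal element is already stored there; if so, add returns false and the set is unchanged.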
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
/**
* Adds the specified element to this set if it is not already present.
* More formally, adds the specified element <tt>e</tt> to this set if
* this set contains no element <tt>e2</tt> such that
* <tt>(e==null ? e2==null : e.equals(e2))</tt>.
* If this set already contains the element, the call leaves the set
* unchanged and returns <tt>false</tt>.
*
* @param e element to be added to this set
* @return <tt>true</tt> if this set did not already contain the specified
* element
*/
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
/**
* Associates the specified value with the specified key in this map.
* If the map previously contained a mapping for the key, the old
* value is replaced.
*
* @param key key with which the specified value is to be associated
* @param value value to be associated with the specified key
* @return the previous value associated with <tt>key</tt>, or
* <tt>null</tt> if there was no mapping for <tt>key</tt>.
* (A <tt>null</tt> return can also indicate that the map
* previously associated <tt>null</tt> with <tt>key</tt>.)
*/
public V put(K key, V value) {
if (table == EMPTY_TABLE) {
inflateTable(threshold);
}
if (key == null)
return putForNullKey(value);
int hash = hash(key);
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, key, value, i);
return null;
}
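To see the return value of add in action, here is a minimal sketch. The class name DuplicateAddDemo and the helper countAccepted are my own names for illustration and are not part of the JDK.

```java
import java.util.HashSet;
import java.util.Set;

class DuplicateAddDemo {
    // Adds every value to a fresh HashSet and counts how many add() calls returned true.
    static int countAccepted(String... values) {
        Set<String> set = new HashSet<>();
        int accepted = 0;
        for (String value : values) {
            // add() returns false when an equal element is already present
            if (set.add(value)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(countAccepted("a", "b", "a")); // prints 2
    }
}
```

The second "a" is rejected because map.put finds an existing key with the same hash and an equal value, so put returns the old PRESENT object instead of null.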
Set has several implementation classes: HashSet, LinkedHashSet, TreeSet, and others.
HashSet
/**
* Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
* default initial capacity (16) and load factor (0.75).
*/
public HashSet() {
map = new HashMap<>();
}
private transient HashMap<E,Object> map;
/**
* Constructs an empty <tt>HashMap</tt> with the default initial capacity
* (16) and the default load factor (0.75).
*/
public HashMap() {
this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
}
/**
* Constructs an empty <tt>HashMap</tt> with the specified initial
* capacity and load factor.
*
* @param initialCapacity the initial capacity
* @param loadFactor the load factor
* @throws IllegalArgumentException if the initial capacity is negative
* or the load factor is nonpositive
*/
public HashMap(int initialCapacity, float loadFactor) {
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal initial capacity: " +
initialCapacity);
if (initialCapacity > MAXIMUM_CAPACITY)
initialCapacity = MAXIMUM_CAPACITY;
if (loadFactor <= 0 || Float.isNaN(loadFactor))
throw new IllegalArgumentException("Illegal load factor: " +
loadFactor);
this.loadFactor = loadFactor;
threshold = initialCapacity;
init();
}
/**
* The default initial capacity - MUST be a power of two.
*/
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
/**
* The load factor used when none specified in constructor.
*/
static final float DEFAULT_LOAD_FACTOR = 0.75f;
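Looking at the JDK source, it is clear that HashSet is backed by a HashMap, so many of HashSet's characteristics are inherited from HashMap.
Characteristics of HashSet:
Iteration order is unspecified and is not guaranteed to stay the same over time
HashSet is not thread-safe; it does no synchronization, so it avoids locking overhead
A null element is allowed
HashSet considers two elements equal when equals() returns true and their hashCode() values are also equal. When an element is stored in a HashSet, HashSet calls the object's hashCode() method and uses the result to decide where in the table the object is stored. If two objects are equal according to equals() but return different hashCode() values, HashSet still treats them as different elements, and both are added. So when a class whose objects are stored in a HashSet overrides equals() and hashCode(), it must guarantee that whenever two objects are equal according to equals(), their hashCode() values are equal as well.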
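The effect of a broken hashCode() is easy to reproduce. The sketch below uses two hypothetical key classes of my own (BrokenKey and GoodKey, neither from the JDK); BrokenKey returns a caller-supplied salt as its hash code purely to force equal objects to hash differently.

```java
import java.util.HashSet;
import java.util.Set;

class HashContractDemo {
    // Hypothetical key: equals() compares names, but hashCode() ignores the name.
    static final class BrokenKey {
        final String name;
        final int salt; // only here to force different hash codes

        BrokenKey(String name, int salt) {
            this.name = name;
            this.salt = salt;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BrokenKey && ((BrokenKey) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return salt; // violates the equals/hashCode contract
        }
    }

    // Hypothetical key that derives both equals() and hashCode() from the name.
    static final class GoodKey {
        final String name;

        GoodKey(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof GoodKey && ((GoodKey) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    static int brokenSize() {
        Set<BrokenKey> set = new HashSet<>();
        set.add(new BrokenKey("a", 1));
        set.add(new BrokenKey("a", 2)); // equal by equals(), different hash
        return set.size();
    }

    static int goodSize() {
        Set<GoodKey> set = new HashSet<>();
        set.add(new GoodKey("a"));
        set.add(new GoodKey("a"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(brokenSize()); // prints 2
        System.out.println(goodSize());   // prints 1
    }
}
```

With BrokenKey the two equal objects have different hashes, so the e.hash == hash test in put fails before equals() is ever consulted.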
/**
* Compares the specified object with this set for equality. Returns
* <tt>true</tt> if the given object is also a set, the two sets have
* the same size, and every member of the given set is contained in
* this set. This ensures that the <tt>equals</tt> method works
* properly across different implementations of the <tt>Set</tt>
* interface.<p>
*
* This implementation first checks if the specified object is this
* set; if so it returns <tt>true</tt>. Then, it checks if the
* specified object is a set whose size is identical to the size of
* this set; if not, it returns false. If so, it returns
* <tt>containsAll((Collection) o)</tt>.
*
* @param o object to be compared for equality with this set
* @return <tt>true</tt> if the specified object is equal to this set
*/
public boolean equals(Object o) {
if (o == this)
return true;
if (!(o instanceof Set))
return false;
Collection c = (Collection) o;
if (c.size() != size())
return false;
try {
return containsAll(c);
} catch (ClassCastException unused) {
return false;
} catch (NullPointerException unused) {
return false;
}
}
LinkedHashSet
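LinkedHashSet is a subclass of HashSet that uses a linked list to maintain the order of its elements. Iterating over a LinkedHashSet therefore visits the elements in the order in which they were added. Because it has to maintain that insertion order, it is slightly slower than HashSet for insertion, but iterating over all of its elements performs well because the order is kept in the linked list. It is unsynchronized and not thread-safe.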
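A quick way to check the insertion-order guarantee is the sketch below; the class InsertionOrderDemo and its helper are illustrative names of mine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class InsertionOrderDemo {
    // Copies the values into a LinkedHashSet and returns them in iteration order.
    static List<String> insertionOrder(String... values) {
        return new ArrayList<>(new LinkedHashSet<>(Arrays.asList(values)));
    }

    public static void main(String[] args) {
        // The second "pear" is a duplicate and does not change the order.
        System.out.println(insertionOrder("pear", "apple", "pear", "fig"));
        // prints [pear, apple, fig]
    }
}
```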
TreeSet
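TreeSet is an implementation of the SortedSet interface. As the name suggests, TreeSet keeps its elements sorted, storing them in a red-black tree. TreeSet supports two kinds of ordering: natural ordering and custom ordering.
Natural ordering: TreeSet calls the elements' compareTo(Object obj) method to compare them and keeps them in ascending order. This requires the element class to implement the compareTo(Object obj) method of the Comparable interface. Under natural ordering, the only criterion TreeSet uses to decide whether two objects are the same is the result of compareTo(Object o): if it returns 0 the objects are considered equal, and any nonzero result (negative or positive) means they are not. Common Java classes such as Character, String, and Date already implement Comparable.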
/**
* Adds the specified element to this set if it is not already present.
* More formally, adds the specified element {@code e} to this set if
* the set contains no element {@code e2} such that
* <tt>(e==null ? e2==null : e.equals(e2))</tt>.
* If this set already contains the element, the call leaves the set
* unchanged and returns {@code false}.
*
* @param e element to be added to this set
* @return {@code true} if this set did not already contain the specified
* element
* @throws ClassCastException if the specified object cannot be compared
* with the elements currently in this set
* @throws NullPointerException if the specified element is null
* and this set uses natural ordering, or its comparator
* does not permit null elements
*/
public boolean add(E e) {
return m.put(e, PRESENT)==null;
}
/**
* The backing map.
*/
private transient NavigableMap<E,Object> m;
/**
* Associates the specified value with the specified key in this map.
* If the map previously contained a mapping for the key, the old
* value is replaced.
*
* @param key key with which the specified value is to be associated
* @param value value to be associated with the specified key
*
* @return the previous value associated with {@code key}, or
* {@code null} if there was no mapping for {@code key}.
* (A {@code null} return can also indicate that the map
* previously associated {@code null} with {@code key}.)
* @throws ClassCastException if the specified key cannot be compared
* with the keys currently in the map
* @throws NullPointerException if the specified key is null
* and this map uses natural ordering, or its comparator
* does not permit null keys
*/
public V put(K key, V value) {
Entry<K,V> t = root;
if (t == null) {
compare(key, key); // type (and possibly null) check
root = new Entry<>(key, value, null);
size = 1;
modCount++;
return null;
}
int cmp;
Entry<K,V> parent;
// split comparator and comparable paths
Comparator<? super K> cpr = comparator;
if (cpr != null) {
do {
parent = t;
cmp = cpr.compare(key, t.key);
if (cmp < 0)
t = t.left;
else if (cmp > 0)
t = t.right;
else
return t.setValue(value);
} while (t != null);
}
else {
if (key == null)
throw new NullPointerException();
Comparable<? super K> k = (Comparable<? super K>) key;
do {
parent = t;
cmp = k.compareTo(t.key);
if (cmp < 0)
t = t.left;
else if (cmp > 0)
t = t.right;
else
return t.setValue(value);
} while (t != null);
}
Entry<K,V> e = new Entry<>(key, value, parent);
if (cmp < 0)
parent.left = e;
else
parent.right = e;
fixAfterInsertion(e);
size++;
modCount++;
return null;
}
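The natural-ordering rule described above, where compareTo returning 0 is the only test for a duplicate, can be sketched as follows. I picked BigDecimal for the illustration because "1.0" and "1.00" compare as 0 yet are not equal according to equals(); the class name NaturalOrderDemo and its methods are my own.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class NaturalOrderDemo {
    // Returns the values in the ascending order imposed by String.compareTo.
    static List<String> sorted(String... values) {
        return new ArrayList<>(new TreeSet<>(Arrays.asList(values)));
    }

    // TreeSet sees 1.0 and 1.00 as the same element because compareTo returns 0.
    static int treeSetSize() {
        Set<BigDecimal> set = new TreeSet<>();
        set.add(new BigDecimal("1.0"));
        set.add(new BigDecimal("1.00"));
        return set.size();
    }

    // HashSet keeps both because BigDecimal.equals also compares the scale.
    static int hashSetSize() {
        Set<BigDecimal> set = new HashSet<>();
        set.add(new BigDecimal("1.0"));
        set.add(new BigDecimal("1.00"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sorted("pear", "apple", "fig")); // prints [apple, fig, pear]
        System.out.println(treeSetSize());                  // prints 1
        System.out.println(hashSetSize());                  // prints 2
    }
}
```

This matches the put source above: when cmp is 0 the method calls t.setValue(value) and returns without creating a new node.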
Custom ordering: when creating the TreeSet, pass in a Comparator object that implements compare(T o1, T o2), the method used to compare o1 and o2.
/**
* Constructs a new, empty tree set, sorted according to the specified
* comparator. All elements inserted into the set must be <i>mutually
* comparable</i> by the specified comparator: {@code comparator.compare(e1,
* e2)} must not throw a {@code ClassCastException} for any elements
* {@code e1} and {@code e2} in the set. If the user attempts to add
* an element to the set that violates this constraint, the
* {@code add} call will throw a {@code ClassCastException}.
*
* @param comparator the comparator that will be used to order this set.
* If {@code null}, the {@linkplain Comparable natural
* ordering} of the elements will be used.
*/
public TreeSet(Comparator<? super E> comparator) {
this(new TreeMap<>(comparator));
}
/**
* Constructs a new, empty tree map, ordered according to the given
* comparator. All keys inserted into the map must be <em>mutually
* comparable</em> by the given comparator: {@code comparator.compare(k1,
* k2)} must not throw a {@code ClassCastException} for any keys
* {@code k1} and {@code k2} in the map. If the user attempts to put
* a key into the map that violates this constraint, the {@code put(Object
* key, Object value)} call will throw a
* {@code ClassCastException}.
*
* @param comparator the comparator that will be used to order this map.
* If {@code null}, the {@linkplain Comparable natural
* ordering} of the keys will be used.
*/
public TreeMap(Comparator<? super K> comparator) {
this.comparator = comparator;
}
/**
* The comparator used to maintain order in this tree map, or
* null if it uses the natural ordering of its keys.
*
* @serial
*/
private final Comparator<? super K> comparator;
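As a small sketch of custom ordering, the example below sorts strings by length. The comparators and the class name CustomOrderDemo are my own choices for illustration. Note that a comparator that looks only at length makes TreeSet treat two different strings of the same length as duplicates, which is why the first variant adds an alphabetical tie-break.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class CustomOrderDemo {
    // Orders by length, then alphabetically, so distinct strings never compare as 0.
    static List<String> byLengthThenAlpha(String... values) {
        Comparator<String> cmp = (a, b) -> {
            int byLength = Integer.compare(a.length(), b.length());
            return byLength != 0 ? byLength : a.compareTo(b);
        };
        return collect(cmp, values);
    }

    // Orders by length only: strings of equal length compare as 0 and collapse.
    static List<String> byLengthOnly(String... values) {
        Comparator<String> cmp = (a, b) -> Integer.compare(a.length(), b.length());
        return collect(cmp, values);
    }

    private static List<String> collect(Comparator<String> cmp, String... values) {
        Set<String> set = new TreeSet<>(cmp);
        for (String value : values) {
            set.add(value);
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(byLengthThenAlpha("pear", "fig", "apple", "pie"));
        // prints [fig, pie, pear, apple]
        System.out.println(byLengthOnly("pear", "fig", "apple", "pie"));
        // prints [fig, pear, apple]
    }
}
```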
Source code study
Because TreeSet simply delegates to the methods of TreeMap, let's take a closer look at the TreeMap source.
TreeMap constructors
public TreeMap() {
comparator = null;
}
public TreeMap(Comparator<? super K> comparator) {
this.comparator = comparator;
}
public TreeMap(Map<? extends K, ? extends V> m) {
comparator = null;
putAll(m);
}
public TreeMap(SortedMap<K, ? extends V> m) {
comparator = m.comparator();
try {
buildFromSorted(m.size(), m.entrySet().iterator(), null, null);
} catch (java.io.IOException cannotHappen) {
} catch (ClassNotFoundException cannotHappen) {
}
}
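Constructor 1 is the default constructor: comparator is null, so the natural ordering of the keys maintains the order of the nodes in the TreeMap.
Constructor 2 takes the comparator that will be used for comparisons.
Constructor 3 uses natural ordering to maintain the order of the nodes and also adds the contents of the given Map to the TreeMap.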
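Constructor 4 takes a SortedMap and uses that map's comparator to maintain the order of the nodes. It also adds the contents of the SortedMap to the TreeMap through buildFromSorted; the four-argument overload called in the constructor delegates to this recursive helper:
private final Entry<K,V> buildFromSorted(int level, int lo, int hi,
int redLevel,
Iterator it,
java.io.ObjectInputStream str,
V defaultVal)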
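The difference between constructors 3 and 4 only shows when the source map has its own comparator. In the sketch below (the class TreeMapConstructorDemo and its helpers are hypothetical names of mine), the same reverse-ordered map is copied twice; which constructor runs is decided by the static type of the argument.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class TreeMapConstructorDemo {
    // Builds a source map whose keys are kept in descending order.
    static SortedMap<String, Integer> descendingSource() {
        SortedMap<String, Integer> source =
                new TreeMap<String, Integer>(Comparator.<String>reverseOrder());
        source.put("a", 1);
        source.put("b", 2);
        source.put("c", 3);
        return source;
    }

    // Static type SortedMap selects TreeMap(SortedMap): the comparator is kept.
    static String firstKeyViaSortedMapConstructor() {
        SortedMap<String, Integer> source = descendingSource();
        return new TreeMap<String, Integer>(source).firstKey();
    }

    // Static type Map selects TreeMap(Map): natural ordering is used instead.
    static String firstKeyViaMapConstructor() {
        Map<String, Integer> source = descendingSource();
        return new TreeMap<String, Integer>(source).firstKey();
    }

    public static void main(String[] args) {
        System.out.println(firstKeyViaSortedMapConstructor()); // prints c
        System.out.println(firstKeyViaMapConstructor());       // prints a
    }
}
```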
The put method of TreeMap
/**
* Associates the specified value with the specified key in this map.
* If the map previously contained a mapping for the key, the old
* value is replaced.
*
* @param key key with which the specified value is to be associated
* @param value value to be associated with the specified key
*
* @return the previous value associated with {@code key}, or
* {@code null} if there was no mapping for {@code key}.
* (A {@code null} return can also indicate that the map
* previously associated {@code null} with {@code key}.)
* @throws ClassCastException if the specified key cannot be compared
* with the keys currently in the map
* @throws NullPointerException if the specified key is null
* and this map uses natural ordering, or its comparator
* does not permit null keys
*/
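public V put(K key, V value) {
    Entry<K,V> t = root;
    if (t == null) {
        // If the root is null, the new key-value pair becomes the root
        // (the root has no parent, so null is passed as the parent)
        root = new Entry<K,V>(key, value, null);
        size = 1;
        modCount++;
        return null;
    }
    // Holds the result of the most recent comparison
    int cmp;
    Entry<K,V> parent;
    // Split the Comparator path from the Comparable path
    Comparator<? super K> cpr = comparator;
    // A comparator was supplied
    if (cpr != null) {
        // Walk down from the root to find where the new pair belongs
        do {
            // Remember the node that will become the parent of the new node
            parent = t;
            // Compare the new key with the current node's key using the comparator
            cmp = cpr.compare(key, t.key);
            // The new key is smaller: go left
            if (cmp < 0)
                t = t.left;
            // The new key is larger: go right
            else if (cmp > 0)
                t = t.right;
            // Keys are equal: replace the value of node t and return the old value (put ends here)
            else
                return t.setValue(value);
        } while (t != null);
    }
    // No comparator: use natural ordering
    else {
        // A null key throws NullPointerException
        if (key == null)
            throw new NullPointerException();
        Comparable<? super K> k = (Comparable<? super K>) key;
        // Same walk as in the if branch, except that compareTo does the comparison
        do {
            parent = t;
            cmp = k.compareTo(t.key);
            if (cmp < 0)
                t = t.left;
            else if (cmp > 0)
                t = t.right;
            else
                return t.setValue(value);
        } while (t != null);
    }
    // Reached only when no node with an equal key was found
    // Create the new node from the key-value pair and the parent found above
    Entry<K,V> e = new Entry<K,V>(key, value, parent);
    // The last comparison decides whether the new node is the left or the right child
    if (cmp < 0)
        parent.left = e;
    else
        parent.right = e;
    // Rebalance the tree after the insertion
    fixAfterInsertion(e);
    // Update size and modCount
    size++;
    modCount++;
    // A new node was inserted, so there is no previous value to return
    return null;
}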
First, a property it shares with every other Map: TreeMap's put adds a key-value pair to the map, and if the key already exists it replaces the value and returns the previous one.
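A minimal sketch of that return-value behavior follows; the class PutReturnDemo and the helper putTwice are names of mine.

```java
import java.util.Map;
import java.util.TreeMap;

class PutReturnDemo {
    // Puts the same key twice and returns {first result, second result, final value}.
    static Integer[] putTwice() {
        Map<String, Integer> map = new TreeMap<>();
        Integer first = map.put("k", 1);  // no previous mapping, so null
        Integer second = map.put("k", 2); // returns the replaced value
        return new Integer[] {first, second, map.get("k")};
    }

    public static void main(String[] args) {
        Integer[] result = putTwice();
        System.out.println(result[0]); // prints null
        System.out.println(result[1]); // prints 1
        System.out.println(result[2]); // prints 2
    }
}
```

This is also why TreeSet.add can be written as m.put(e, PRESENT)==null: a null result means the key was not there before.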
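At the end of put(K key, V value), the method fixAfterInsertion(Entry x) is called. It restructures and recolors the tree after a node is inserted so that the tree still satisfies the red-black tree properties:
Every node is colored either red or black
The root is black
If a node is red, its children must be black
Every path from a node to a null reference contains the same number of black nodes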
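To make the four properties concrete, here is a small checker over a hand-built tree. It is a sketch of my own and not the JDK's code: the Node class and the RedBlackCheck methods are hypothetical, the first property holds by construction because color is a boolean, and binary-search-tree ordering of the keys is not verified.

```java
class RedBlackCheck {
    // Hypothetical immutable node; red == false means black.
    static final class Node {
        final int key;
        final boolean red;
        final Node left;
        final Node right;

        Node(int key, boolean red, Node left, Node right) {
            this.key = key;
            this.red = red;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(Node node) {
        return node != null && node.red; // null references count as black
    }

    // Returns the black height of the subtree, or -1 if a property is violated.
    static int blackHeight(Node node) {
        if (node == null) {
            return 1;
        }
        // A red node must not have a red child.
        if (node.red && (isRed(node.left) || isRed(node.right))) {
            return -1;
        }
        int left = blackHeight(node.left);
        int right = blackHeight(node.right);
        // Both sides must be valid and contain the same number of black nodes.
        if (left == -1 || right == -1 || left != right) {
            return -1;
        }
        return left + (node.red ? 0 : 1);
    }

    // The root must be black and every subtree must pass the checks above.
    static boolean isValid(Node root) {
        return !isRed(root) && blackHeight(root) != -1;
    }

    public static void main(String[] args) {
        Node balanced = new Node(2, false,
                new Node(1, true, null, null),
                new Node(3, true, null, null));
        Node redRed = new Node(3, false,
                new Node(2, true, new Node(1, true, null, null), null),
                null);
        System.out.println(isValid(balanced)); // prints true
        System.out.println(isValid(redRed));   // prints false
    }
}
```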
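Summary of the Set implementations
HashSet outperforms TreeSet (for lookup, insertion, and so on): its basic operations take expected constant time, while TreeSet pays O(log n) per operation to keep its elements ordered in a red-black tree. Use TreeSet only when the Set has to stay sorted; otherwise use HashSet;
LinkedHashSet is slightly slower than HashSet for insertion and removal because it maintains a linked list to keep insertion order, but for the same reason iteration is faster than over a HashSet: the time is proportional to the number of elements rather than to the capacity of the table;
In HashSet and TreeSet, try to add only immutable objects: if a field that feeds hashCode() or compareTo() changes after the element is added, the set may no longer be able to find it;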
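Neither HashSet nor TreeSet is thread-safe. If several threads access a Set at the same time and at least one of them modifies it, access must be synchronized externally, for example by wrapping the Set with the synchronizedSet or synchronizedSortedSet method of the Collections utility class. Note that synchronizedSortedSet needs a SortedSet such as TreeSet; a HashSet cannot be passed to it.
SortedSet<String> sortedSet = Collections.synchronizedSortedSet(new TreeSet<String>());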
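As a rough sketch of the wrapper in use, the example below has several threads add disjoint ranges of integers to one synchronized TreeSet. The class SynchronizedSetDemo and the method concurrentAdd are hypothetical names of mine. Individual calls such as add are synchronized by the wrapper, but iteration still has to be guarded manually by locking on the wrapper, as the Collections documentation requires.

```java
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeSet;

class SynchronizedSetDemo {
    // Starts the given number of threads, each adding perThread distinct values,
    // and returns how many elements the set holds afterwards.
    static int concurrentAdd(int threads, int perThread) {
        SortedSet<Integer> set = Collections.synchronizedSortedSet(new TreeSet<Integer>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // each thread gets its own value range
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    set.add(base + i); // synchronized by the wrapper
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        int count = 0;
        synchronized (set) { // iteration must hold the wrapper's lock
            for (Integer ignored : set) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(concurrentAdd(4, 1000)); // prints 4000
    }
}
```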