《Thinking in Java 第4版》（十三）——容器深入研究-yourtommy-ChinaUnix博客

淘米挣＊博文精选yourtommy.blog.chinaunix.net

首页　| 　博文目录　| 　关于我

yourtommy

博客访问： 1738555
博文数量： 438
博客积分： 9799
博客等级：中将
技术积分： 6092
用户组：普通用户
注册时间： 2012-03-25 17:25

文章分类

全部博文（438）

人工智能（6）
决策支持系统（3）
网络（57）

云计算（9）

Apache（8）

web前端（11）

TCP/IP（29）
数据库（9）

数据库基础（9）
Unix环境高级编程（195）

17-高级IPC（7）

16-网络IPC：套接（8）

15-进程间通信（9）

14-高级I／O（7）

13-守护进程（7）

12-线程控制（11）

11-线程（7）

10-信号（16）

9-进程关系（11）

8-进程控制（16）

7-进程环境（11）

6-系统数据文件和（10）

5-标准I／O库（15）

4-文件与目录（23）

3-文件I/O（15）

2-UNIX标准与实现（10）

1-UNIX系统总览（12）
算法（10）

算法导论（10）
综合（15）

代码之美（11）

性能与优化（4）
Android（12）

Android编程环境（1）

Android 从入门到（11）
高层设计（29）

设计模式Head Fir（21）

重构（8）
编程语言（59）

Java EE（2）

javascript（8）

C语言（6）

Fortran77-90-95（12）

Perl入门（15）

Java编程思想（16）
Linux（41）

makefile（11）

GDB（9）

Linux Shell编程（5）

Linux编辑器（4）

Linux命令（5）

Linux工具（3）

vi（3）

Ubuntu（1）
未分配的博文（2）

文章存档

2019年（1）

2013年（8）

2012年（429）

我的朋友

相关博文

《Thinking in Java 第4版》（十三）——容器深入研究

分类： Java

2012-03-25 23:50:23

Java容器分为两大类Collection和Map，它们分别对应着两个接口。我们先来研究Collection这一大类。

Collection又分为三类List、Set和Queue，它们三个同样对应着三个接口。看一下Collection接口的定义：

interface Collection<T> {
boolean add(T e);
boolean addAll(Collection<? extends T> c);
void clear();
boolean contains(Object o);
boolean containsAll(Collection<?> c);
boolean isEmpty();
Iterator<T> iterator();
boolean remove(Object o);
boolean removeAll(Collection<?> c);
boolean retainAll(Collection<?> c);
int size();
Object[] toArray();
<T> T[] toArray(T[] a);
}

这些方法从字面上看都很理解。其中retainAll与removeAll相对，只保留参数中的元素。同时它还可以返回一个迭代器Iterator。

再看看List接口的定义：

interface List<T> extends Collection<T> {
void add(int index, T element);
boolean addAll(int index, Collection<? extends T> c);
T get(int index);
int indexOf(Object o);
int lastIndexOf(Object o);
ListIterator<T> listIterator();
ListIterator<T> listIterator(int index);
T remove(int index);
T set(int index, T element);
List<T> subList(int fromIndex, int toIndex);
}

List加入了对index的支持。同时它还可以另一个迭代器类型ListIterator。

接下来是Set接口的定义：

interface Set<T> extends Collection<T> {
}

它没有加入任何新的方法。

还有Queue接口的定义：

interface Queue<T> extends Collection<T> {
T element();
boolean offer(T e);
T peek();
T poll();
T remove();
}

element方法和peek方法都是获取队首元素，不同的是当队列为空是element抛出异常（NoSuchElementException）而peek返回null。poll和remove都是返回并删除队首元素，当队列为空时remove抛出异常而poll返回null。offer方法类似于add。它在不违反队列容量限制的情况下试图插入一个元素，不成功的话返回false而不抛出异常（add方法会抛出异常IllegalStateException）。

虽然这些接口定义了各种方法，但实现它们的类并不必支持这些方法。实现接口的类在方法的实现中抛出UnSupportedOperationException来表示它不支持该方法。比如一个ArrayList的长度是可以变化的，但是如果由Arrays.asList()得到的话，那它就不再支持改变长度的方法了：

List<Integer> arrayAsList = Arrays.asList(1, 3, 5);
List<Integer> modifiableList = new ArrayList<Integer>(arrayAsList);
modifiableList.clear(); // success!
arrayAsList.set(2, 3); // success! Able to change an element
arrayAsList.clear(); // UnsupportedOperationException: cannot change the size

UnSupportedOperationException是一个RuntimeException，它表示一个编程错误，其实和C++里的Assert一样。Collections.unmodifiableList方法会得到一个只读的List：

List<Integer> unmodifiableList = Collections.unmodifiableList(modifiableList);
unmodifiableList.set(2, 3); // UnsupportedOperationException: cannot modify the list

Collections.unmodifiableList返回的其实是一个内部类Collections$UnmodifiableRandomAccessList的一个实例。

相应的，Collections.synchronizedCollection、Collections.synchronizedList、Collections.synchronizedMap、Collections.synchronizedSet、Collections.synchronizedSortedMap和Collections.synchronizedSortedSet会返回支持线程同步的容器。它们也都是Collections的内部类。

每种类型的接口都有一个Abstract类的实现，比如Collection接口被AbstractCollection实现。AbstractCollection实现了Collection里绝大多数的方法，除了iterator()方法和size()方法。那些被实现的方法仅仅是抛出一个UnSupportedOperationException。Abstract容器类存在的价值在于，当你需要写一个容器类，可以继承自这个Abstract容器类并覆盖你仅需要支持的方法，其它的不需要支持的方法可以继续抛出异常。这样你就不必耗费大量精力去实现容器接口的所有方法。相应的，你还可以发现AbstractList、AbstractSet、AbstractMap等Abstract容器类。Java类库里所有的具体容器类都继承自某个Abstract容器类。

现在来看看Iterator类。一个Collection都会返回一个Iterator用于遍历这个容器。看看Iterator的定义：

interface Iterator<T> {
boolean hasNext();
T next();
void remove();
}

当要遍历一个容器时，用hasNext判断是否有下一个元素，用next取得下一个元素，如果没有下一个元素next方法会抛出NoSuchElementException。通过remove删除当前元素，删除后当前的元素不再可用。有些Iterator不支持remove，这种情况下这个方法会简单抛出一个UnSupportedOperationException。
一旦容器被修改，之前获得的Iterator就会失效。如果这时还试图访问这个Iterator就会得到异常：

Collection<Integer> c = new ArrayList<Integer>();
c.add(1);
Iterator<Integer> itor = c.iterator();
c.add(3); // modify the collection
if (itor.hasNext())
itor.next(); // java.util.ConcurrentModificationException

List可以返回一个ListIterator类型。看看它的定义：

interface ListIterator<T> {
void add(T e);
boolean hasPrevious();
int nextIndex();
T previous();
int previousIndex();
void set(T e);
}

可以看到ListIterator支持了向前和向后遍历，同时还支持添加元素和修改当前元素的操作。

实现List接口的类有ArrayList和LinkedList。一个用数组实现，另一个用双向链表实现。除了这两个类外，还有早期开发的不再推荐使用的Vector和Stack类。

实现Set接口的类有HashSet、TreeSet和LinkedHashSet。
HashSet用散列的方式存储集合元素。它支持快速查找。放入HashSet的元素必须支持hashCode和equals方法。
LinkedHashSet可以达到HashSet一样的查找速度，同时内部维护了一个链表。当使用跌代器遍历LinkedHashSet时，遍历的顺序和元素插入的顺序一样。放入LinkedHashSet的元素同样需要支持hashCode。
TreeSet使用红黑树实现集合，它对内部元素排序。在遍历元素时按照由小到大的顺序。放入TreeSet的元素必须实现Comparable接口。TreeSet实现了SortedSet接口：

interface SortedSet<T> {
Comparator<? super T> comparator();
T first();
SortedSet<T> headSet(T toElement);
T last();
SortedSet<T> subSet(T fromElement, T toElement);
SortedSet<T> tailSet(T fromElement);
}

first方法返回集合中最小的元素。last返回最大的元素。headSet、subSet和tailSet返回子集。

实现Queue接口的类有LinkedList和PriorityQueue。
其实LinkedList同时实现了List接口和Deque接口。Deque是双向队列：

interface Deque<T> extends Queue<T> {
void addFirst(T e);
void addLast(T e);
Iterator<T> descendingIterator();
T getFirst();
T getLast();
boolean offerFirst(T e);
boolean offerLast(T e);
T peekFirst();
T peekLast();
T pollFirst();
T pollLast();
T pop();
void push(T e);
T removeFirst();
boolean removeFirstOccurrence(Object o);
T removeLast();
boolean removeLastOccurrence(Object o);
}

PriorityQueue根据优先级确定下一个移出对列的元素。放入PriorityQueue的元素需要实现Comparable接口，最小的元素会先在队首。看下面的示例代码：

PriorityQueue<Integer> pq = new PriorityQueue<Integer>();
pq.offer(5);
pq.offer(13);
pq.offer(9);
pq.offer(12);
pq.offer(6);
while (!pq.isEmpty()) {
System.out.println(pq.poll()); // 5, 6, 9, 12, 13
}

接下来看看Map接口的定义：

interface Map<K, V> {
void clear();
boolean containsKey(Object key);
boolean containsValue(Object value);
Set<java.util.Map.Entry<K, V>> entrySet();
V get(Object key);
boolean isEmpty();
Set<K> keySet();
V put(K key, V value);
void putAll(Map<? extends K, ? extends V> m);
V remove(Object key);
int size();
Collection<V> values();
}

注意到entrySet方法返回一个Map.Enry的集合。Map.Entry也是一个接口：

interface Map.Entry<K, V> {
K getKey();
V getValue();
V setValue(V value);
}

还可以看到Map通过values方法转成一个Collection类型。

实现Map接口的类有HashMap、LinkedHashMap、TreeMap、WeakHashMap、ConcurrentHashMap和IdentityHashMap等。
其实可以把Map看成特殊的Collection。Collection里的元素是一个value而Map里的元素是一个键值对（Entry）。从这种角度看HashMap、LinkedHashMap和TreeMap与HashSet、LinkedHashSet和TreeSet其实大同小异。其中TreeMap同样也实现了一个SortedMap的接口，它和SortedSet很相似：

interface SortedMap<K,V> extends Map<K,V> {
Comparator<? super K> comparator();
K firstKey();
SortedMap<K, V> headMap(K toKey);
K lastKey();
SortedMap<K, V> subMap(K fromKey, K toKey);
SortedMap<K, V> tailMap(K fromKey);
}

WeakHashMap里的某个键如果没有在Map外被引用时，这个键会被垃圾回收器回收并且该键的条目会从Map中删除。这个键其实是一个弱引用，即它不会增加对所引用对象的计数。
ConcurrentHashMap是一种线程安全的Map，它不涉及同步加锁。
IdentityHashMap在比较键值时是调用“==”操作符，而不是对象的equals方法。

说到WeakHashMap里的弱引用，我们来研究下java里的引用对象（reference object）。在java.lang.ref包里有一个Reference对象。它有三个子类SoftReference,WeakReference和PhantomReference。

通常情况一下，我们用new创建一个对象并把它赋给一个引用，这个引用是一个强引用（Strong Reference）。比如“Object o = new Integer(3);”里的“o”就是一个强引用。垃圾回收器是绝对不会释放一个强引用所引用的对象的。

软引用（soft reference）比强引用的程度稍弱些。一般情况下，垃圾回收器是不会清理软引用所引用的对象的，看这个例子：

class MyClass {
protected void finalize() {
System.out.println("MyClass.finalize");
}
}
// caller
MyClass mc = new MyClass();
SoftReference sr = new SoftReference(mc, rq);
mc = null; // free the strong reference!!
System.gc(); // output: nothing.

当使用System.gc强行清理垃圾的时候，软引用所引用的对象并不会被清理。但如果内存不够的情况下，被软引用的对象就会被清理掉：

// caller
MyClass mc = new MyClass();
SoftReference sr = new SoftReference(mc, rq);
mc = null;
// now try to use out the memory
ArrayList al = new ArrayList();
while (true)
al.add(new int[204800]);
// output: MyClass.finalize
// output: java.lang.OutOfMemoryError

可以看到在抛出异常前，虚拟机把软引用所指的对象清理掉了。软引用的这样的特性特别适合作为内存的Cache。

弱引用（Weak reference）要比软引用的程度还弱些。如果某对象没有被强引用或软引用，而只是被弱引用，垃圾回收器会清理掉这个对象：

MyClass mc = new MyClass();
WeakReference wr = new WeakReference(mc);
mc = null; // free the strong reference!!
System.gc(); // output: MyClass.finalize

虚引用（Phantom reference）不干涉对象的生命周期，它只保证当它所引用的对象被清理时，虚引用本身被放进一个ReferenceQueue里。

ReferenceQueue rq = new ReferenceQueue();
MyClass mc = new MyClass();
PhantomReference pr = new PhantomReference(mc, rq);
mc = null;
System.gc();
Reference r = null;
try {
r = rq.remove();
} catch (InterruptedException e) {
e.printStackTrace();
}
if (r != null)
System.out.println("You die!!!");

当被虚引用的对象被gc回收时，gc会把PhantomReference放入ReferenceQueue里。由于finalize方法的不可靠，虚引用可以更好地保证一些清理工作。

最后介绍一个很节省存储空间的容器BitSet。它的每个元素是一个位，你可以设置它的各个位，并且可以进行与或操作等。BitSet的最小size是64位。下面给出一个例子：

void print(BitSet bs) {
for (int i = 0; i < bs.size(); ++i)
System.out.print(bs.get(i) ? '1' : '0');
System.out.println();
}
// caller
BitSet bs1 = new BitSet();
bs1.set(2, 15);
print(bs1); // 0011111111111110000000000000000000000000000000000000000000000000
bs1.set(4, false);
print(bs1); // 0011011111111110000000000000000000000000000000000000000000000000
BitSet bs2 = new BitSet();
bs2.set(8, 27);
print(bs2); // 0000000011111111111111111110000000000000000000000000000000000000
bs2.and(bs1);
print(bs2); // 0000000011111110000000000000000000000000000000000000000000000000

阅读(947) | 评论(0) | 转发(0) |

上一篇：《Thinking in Java 第4版》（十二）——数组

下一篇：《Thinking in Java 第4版》（十四）——Java I/O系统

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6