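LRU (Least Recently Used) is a common idea in caching. Taken literally, the name combines two dimensions: time ("recently") and frequency ("least"). If the K-V entries in a cache are to be ranked by priority, one might expect both dimensions to matter. In LRU, though, the ranking comes down to recency: the most recently accessed entry ranks first, and the entry that has gone unused the longest is the first to be evicted. That is the general idea of LRU.
In operating systems, LRU is a page replacement algorithm used for memory management. Pages (blocks of memory) that are resident but have gone unused the longest are the least recently used ones; the operating system identifies them and moves them out of memory to make room for loading other data.
Wikipedia describes LRU as follows: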
In computing, cache algorithms (also frequently called cache replacement algorithms or cache replacement policies) are optimizing instructions—or algorithms—that a computer program or a hardware-maintained structure can follow in order to manage a cache of information stored on the computer. When the cache is full, the algorithm must choose which items to discard to make room for the new ones.
Least Recently Used (LRU)
Discards the least recently used items first. This algorithm requires keeping track of what was used when, which is expensive if one wants to make sure the algorithm always discards the least recently used item. General implementations of this technique require keeping "age bits" for cache-lines and track the "Least Recently Used" cache-line based on age-bits. In such an implementation, every time a cache-line is used, the age of all other cache-lines changes. LRU is actually a family of caching algorithms with members including 2Q by Theodore Johnson and Dennis Shasha,[3] and LRU/K by Pat O'Neil, Betty O'Neil and Gerhard Weikum.[4]
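To make the "keeping track of what was used when" cost concrete, here is a minimal sketch. It is a timestamp variant, not the age-bit scheme from the quote: every entry carries a logical "last used" stamp and eviction scans all stamps for the smallest one. The class name AgeStampLru and its methods are made up for illustration, and the sketch assumes a capacity of at least 1.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of any library.
class AgeStampLru {
    private final int capacity;                 // assumed to be >= 1
    private long clock = 0;                     // logical time, bumped on every access
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> lastUsed = new HashMap<>();

    AgeStampLru(int capacity) {
        this.capacity = capacity;
    }

    String get(String key) {
        if (!values.containsKey(key)) {
            return null;
        }
        lastUsed.put(key, ++clock);             // a read refreshes the stamp
        return values.get(key);
    }

    void put(String key, String value) {
        if (!values.containsKey(key) && values.size() >= capacity) {
            evictOldest();
        }
        values.put(key, value);
        lastUsed.put(key, ++clock);             // a write counts as an access too
    }

    boolean contains(String key) {
        return values.containsKey(key);
    }

    // O(n) scan for the smallest stamp: this is the expensive bookkeeping.
    private void evictOldest() {
        String oldest = null;
        long min = Long.MAX_VALUE;
        for (Map.Entry<String, Long> e : lastUsed.entrySet()) {
            if (e.getValue() < min) {
                min = e.getValue();
                oldest = e.getKey();
            }
        }
        if (oldest != null) {
            values.remove(oldest);
            lastUsed.remove(oldest);
        }
    }
}
```

Lookups stay cheap, but every eviction walks the whole cache, which is the cost a linked-list ordering avoids.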
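Analysis and implementation of an LRUCache
1. We can start with a FIFO version. It determines priority by insertion order only and ignores access order, so it is not yet a complete LRUCache.
With Java's LinkedHashMap this is very simple.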
private int capacity;

private java.util.LinkedHashMap<Integer, Integer> cache =
        new java.util.LinkedHashMap<Integer, Integer>() {
    @Override
    protected boolean removeEldestEntry(java.util.Map.Entry<Integer, Integer> eldest) {
        // returning true tells LinkedHashMap to drop the eldest entry
        return size() > capacity;
    }
};
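The code overrides removeEldestEntry(): once the size exceeds the configured capacity, the lowest-priority element is removed. In the FIFO version, the lowest-priority element is the one that was inserted first.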
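The fragment above can be tried out as a complete program. The following is a small sketch (the class name FifoCacheDemo and the helper newFifoCache are invented for this example) showing that a read does not protect an entry from eviction in the FIFO version:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class for the FIFO variant.
class FifoCacheDemo {
    // Insertion-ordered map that drops its oldest entry once it grows past capacity.
    static Map<Integer, Integer> newFifoCache(final int capacity) {
        return new LinkedHashMap<Integer, Integer>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                return size() > capacity;
            }
        };
    }

    public static void main(String[] args) {
        Map<Integer, Integer> cache = newFifoCache(2);
        cache.put(1, 1);
        cache.put(2, 2);
        cache.get(1);                        // a read does not change insertion order
        cache.put(3, 3);                     // evicts key 1, the first one inserted
        System.out.println(cache.keySet());  // prints [2, 3]
    }
}
```

Key 1 is evicted even though it was just read, which is exactly what separates FIFO from LRU.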
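2. If you know LinkedHashMap well, implementing an LRUCache is just as simple. LinkedHashMap provides a constructor that takes the initial capacity, the load factor and the ordering mode. To get an LRUCache, set the ordering argument to true, which means access order rather than the default insertion (FIFO) order. The load factor is set to the default 0.75 here. removeEldestEntry() must also be overridden to keep the size within the capacity. This gives two ways to write a LinkedHashMap-based LRUCache: inheritance and composition.
Inheritance: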
package lrucache.one;

import java.util.LinkedHashMap;
import java.util.Map;

/**
 * LRU cache built on LinkedHashMap, using inheritance.
 * @author wxisme
 * @time 2015-10-18 10:27:37 AM
 */
public class LRUCache extends LinkedHashMap<Integer, Integer> {

    private int initialCapacity;

    public LRUCache(int initialCapacity) {
        // true selects access order instead of insertion order
        super(initialCapacity, 0.75f, true);
        this.initialCapacity = initialCapacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
        return size() > initialCapacity;
    }

    @Override
    public String toString() {
        StringBuilder cacheStr = new StringBuilder();
        cacheStr.append("{");
        for (Map.Entry<Integer, Integer> entry : this.entrySet()) {
            cacheStr.append("[" + entry.getKey() + "," + entry.getValue() + "]");
        }
        cacheStr.append("}");
        return cacheStr.toString();
    }
}
Composition:
package lrucache.three;

import java.util.LinkedHashMap;
import java.util.Map;

/**
 * LRU cache built on LinkedHashMap, using composition.
 * @author wxisme
 * @time 2015-10-18 11:07:01 AM
 */
public class LRUCache {

    private final int initialCapacity;
    private Map<Integer, Integer> cache;

    public LRUCache(final int initialCapacity) {
        this.initialCapacity = initialCapacity;
        cache = new LinkedHashMap<Integer, Integer>(initialCapacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                return size() > initialCapacity;
            }
        };
    }

    public void put(int key, int value) {
        cache.put(key, value);
    }

    public int get(int key) {
        // Unboxing a null Integer would throw a NullPointerException for a
        // missing key, so return -1 in that case (a convention chosen here).
        Integer value = cache.get(key);
        return value == null ? -1 : value;
    }

    public void remove(int key) {
        cache.remove(key);
    }

    @Override
    public String toString() {
        StringBuilder cacheStr = new StringBuilder();
        cacheStr.append("{");
        for (Map.Entry<Integer, Integer> entry : cache.entrySet()) {
            cacheStr.append("[" + entry.getKey() + "," + entry.getValue() + "]");
        }
        cacheStr.append("}");
        return cacheStr.toString();
    }
}
Test code:
public static void main(String[] args) {
    LRUCache cache = new LRUCache(5);
    cache.put(5, 5);
    cache.put(4, 4);
    cache.put(3, 3);
    cache.put(2, 2);
    cache.put(1, 1);
    System.out.println(cache.toString());
    cache.put(0, 0);
    System.out.println(cache.toString());
}
Output:
{[5,5][4,4][3,3][2,2][1,1]}
{[4,4][3,3][2,2][1,1][0,0]}
As we can see, the basic functionality of an LRUCache is in place.
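A test that only calls put cannot tell access order apart from insertion order. As a small supplementary sketch (the class name AccessOrderDemo and the helper newLruCache are invented for this example), adding a single get before the eviction shows the LRU behaviour:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class for the access-ordered variant.
class AccessOrderDemo {
    static Map<Integer, Integer> newLruCache(final int capacity) {
        // third constructor argument true = access order
        return new LinkedHashMap<Integer, Integer>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                return size() > capacity;
            }
        };
    }

    public static void main(String[] args) {
        Map<Integer, Integer> cache = newLruCache(3);
        cache.put(1, 1);
        cache.put(2, 2);
        cache.put(3, 3);
        cache.get(1);                        // key 1 becomes the most recently used
        cache.put(4, 4);                     // evicts key 2, now the least recently used
        System.out.println(cache.keySet());  // prints [3, 1, 4]
    }
}
```

With insertion order, key 1 would have been evicted instead of key 2.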
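3. How can LRU be implemented without the LinkedHashMap that the Java API provides? First pin down the operations: insert, delete and look up, all while maintaining an order. There are many candidates: arrays, linked lists, stacks, queues and Maps, alone or combined. Take stacks and queues first: they give a clear FIFO or FILO order, but LRU has to remove the element at the tail and also move a just-accessed element from anywhere in the structure to the head, so they would be inefficient. One fact matters here: an array or a Map can look an element up in O(1) (by index or by key), but neither maintains an access order cheaply, since inserting or deleting in the middle of an array is O(n) and a HashMap keeps no order at all. A linked list is the opposite: given a node, relinking it is O(1), but finding the node is O(n). Using only one of these structures therefore makes either lookups or reordering slow, so we can combine a linked list with a Map. With a singly linked list, unlinking a node or removing the tail still costs O(n), because the predecessor has to be found by traversal. All things considered, a doubly linked list plus a Map should be the best choice.
In this implementation, the doubly linked list maintains the priority order, i.e. the access order, and handles the mutating operations. The Map stores the K-V entries and handles lookups. Access order: the most recently accessed entry (an insert also counts as an access) moves to the head of the list; when the cache reaches its limit, the element at the tail is removed.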
package lrucache.tow;

import java.util.HashMap;
import java.util.Map;

/**
 * LRUCache implemented with a doubly linked list plus a HashMap.
 * @author wxisme
 * @time 2015-10-18 12:34:36 PM
 */
public class LRUCache<K, V> {

    private final int initialCapacity; // capacity

    private Node<K, V> head; // head node: most recently used
    private Node<K, V> tail; // tail node: least recently used

    private Map<K, Node<K, V>> map;

    public LRUCache(int initialCapacity) {
        // a cache that cannot hold anything would fail on the first eviction
        if (initialCapacity <= 0) {
            throw new IllegalArgumentException("Illegal capacity: " + initialCapacity);
        }
        this.initialCapacity = initialCapacity;
        map = new HashMap<K, Node<K, V>>();
    }

    /**
     * Node of the doubly linked list.
     * @author wxisme
     */
    private static class Node<K, V> {
        Node<K, V> pre;
        Node<K, V> next;
        K key;
        V value;

        Node(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Adds a key-value pair to the cache.
     * @param key
     * @param value
     */
    public void put(K key, V value) {
        Node<K, V> node = map.get(key);

        // the node is not in the cache
        if (node == null) {
            // the cache is full
            if (map.size() >= this.initialCapacity) {
                map.remove(tail.key); // drop the least recently used entry from the map
                removeTailNode();
            }
            node = new Node<K, V>(key, value);
        }
        node.value = value;
        moveToHead(node);
        map.put(key, node);
    }

    /**
     * Gets a value from the cache.
     * @param key
     * @return the value, or null if the key is not cached
     */
    public V get(K key) {
        Node<K, V> node = map.get(key);
        if (node == null) {
            return null;
        }
        // just accessed: move it to the head
        moveToHead(node);
        return node.value;
    }

    /**
     * Removes a key-value pair from the cache.
     * @param key
     */
    public void remove(K key) {
        Node<K, V> node = map.remove(key); // remove from the HashMap
        if (node != null) {
            unlink(node); // remove from the doubly linked list
        }
    }

    /**
     * Detaches node from the list and fixes head and tail.
     * Does nothing for a node that is not linked yet.
     */
    private void unlink(Node<K, V> node) {
        if (node.pre != null) {
            node.pre.next = node.next;
        }
        if (node.next != null) {
            node.next.pre = node.pre;
        }
        if (node == head) {
            head = node.next;
        }
        if (node == tail) {
            tail = node.pre;
        }
        // clear the node's own references
        node.pre = null;
        node.next = null;
    }

    /**
     * Moves node to the head of the list.
     * @param node
     */
    private void moveToHead(Node<K, V> node) {
        if (node == head) {
            return;
        }
        // cut the node out of its current position
        unlink(node);

        // link it in front of the current head
        node.next = head;
        if (head != null) {
            head.pre = node;
        }
        head = node;
        if (tail == null) {
            tail = node; // the list was empty
        }
    }

    /**
     * Removes the tail node of the list.
     */
    private void removeTailNode() {
        if (tail != null) {
            unlink(tail); // also resets head when the list becomes empty
        }
    }

    @Override
    public String toString() {
        StringBuilder cacheStr = new StringBuilder();
        cacheStr.append("{");
        // the access order lives in the linked list, so walk the list here
        Node<K, V> node = head;
        while (node != null) {
            cacheStr.append("[" + node.key + "," + node.value + "]");
            node = node.next;
        }
        cacheStr.append("}");
        return cacheStr.toString();
    }
}
Test code:
public static void main(String[] args) {
    LRUCache<Integer, Integer> cache = new LRUCache<Integer, Integer>(5);
    cache.put(5, 5);
    cache.put(4, 4);
    cache.put(3, 3);
    cache.put(2, 2);
    cache.put(1, 1);
    System.out.println(cache.toString());
    cache.put(0, 0);
    System.out.println(cache.toString());
}
Output:
{[1,1][2,2][3,3][4,4][5,5]}
{[0,0][1,1][2,2][3,3][4,4]}
This also implements the basic operations of an LRUCache.
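Wait! Why does the same test data produce a different result from the LinkedHashMap implementation above?
Looking closely, both versions implement LRU, but the doubly linked list + HashMap version clearly prints in access order, while the LinkedHashMap output still looks like insertion order?
Let's dig into the source code: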
private static final long serialVersionUID = 3801124242820219131L;

/**
 * The head of the doubly linked list.
 */
private transient Entry<K,V> header;

/**
 * The iteration ordering method for this linked hash map: <tt>true</tt>
 * for access-order, <tt>false</tt> for insertion-order.
 *
 * @serial
 */
private final boolean accessOrder;

/**
 * LinkedHashMap entry.
 */
private static class Entry<K,V> extends HashMap.Entry<K,V> {
    // These fields comprise the doubly linked list used for iteration.
    Entry<K,V> before, after;

    Entry(int hash, K key, V value, HashMap.Entry<K,V> next) {
        super(hash, key, value, next);
    }
    ……
}

In short:

private transient Entry<K,V> header;

private static class Entry<K,V> extends HashMap.Entry<K,V> {
    Entry<K,V> before, after;
    ……
}
From the snippets above we can see that LinkedHashMap also uses a doubly linked list, together with the hashing of HashMap. LinkedHashMap extends HashMap and implements Map.
/**
 * Constructs an empty <tt>LinkedHashMap</tt> instance with the
 * specified initial capacity, load factor and ordering mode.
 *
 * @param initialCapacity the initial capacity
 * @param loadFactor the load factor
 * @param accessOrder the ordering mode - <tt>true</tt> for
 *        access-order, <tt>false</tt> for insertion-order
 * @throws IllegalArgumentException if the initial capacity is negative
 *         or the load factor is nonpositive
 */
public LinkedHashMap(int initialCapacity, float loadFactor, boolean accessOrder) {
    super(initialCapacity, loadFactor);
    this.accessOrder = accessOrder;
}
The code above is the constructor we used.
public V get(Object key) {
    Entry<K,V> e = (Entry<K,V>)getEntry(key);
    if (e == null)
        return null;
    e.recordAccess(this);
    return e.value;
}

void recordAccess(HashMap<K,V> m) {
    LinkedHashMap<K,V> lm = (LinkedHashMap<K,V>)m;
    if (lm.accessOrder) {
        lm.modCount++;
        remove();
        addBefore(lm.header);
    }
}

void recordRemoval(HashMap<K,V> m) {
    remove();
}
This is the key code that implements access order.
/**
 * Inserts this entry before the specified existing entry in the list.
 */
private void addBefore(Entry<K,V> existingEntry) {
    after  = existingEntry;
    before = existingEntry.before;
    before.after = this;
    after.before = this;
}

void addEntry(int hash, K key, V value, int bucketIndex) {
    createEntry(hash, key, value, bucketIndex);

    // Remove eldest entry if instructed, else grow capacity if appropriate
    Entry<K,V> eldest = header.after;
    if (removeEldestEntry(eldest)) {
        removeEntryForKey(eldest.key);
    } else {
        if (size >= threshold)
            resize(2 * table.length);
    }
}

/**
 * This override differs from addEntry in that it doesn't resize the
 * table or remove the eldest entry.
 */
void createEntry(int hash, K key, V value, int bucketIndex) {
    HashMap.Entry<K,V> old = table[bucketIndex];
    Entry<K,V> e = new Entry<K,V>(hash, key, value, old);
    table[bucketIndex] = e;
    e.addBefore(header);
    size++;
}
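From these two pieces of code we can see the cause of the discrepancy: the two implementations keep the access order in opposite directions. The linked list + HashMap version puts the most recently accessed entry at the head and evicts from the tail, whereas LinkedHashMap links an accessed or newly added entry just before header, i.e. at the end of the list, and treats header.after, the first entry, as the eldest. Because the test only calls put, LinkedHashMap's access order happens to coincide with insertion order, which is why its output looks like insertion order.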
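If the LinkedHashMap version should print in the same direction as the hand-written list, one option is to reverse a copy of the keys. A minimal sketch (the class name RecentFirstView and the helper keysMostRecentFirst are invented for this example):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; it only copies and reverses the key order.
class RecentFirstView {
    // An access-ordered LinkedHashMap iterates eldest first, so the
    // reversed copy lists the most recently used key first.
    static <K, V> List<K> keysMostRecentFirst(Map<K, V> accessOrdered) {
        List<K> keys = new ArrayList<>(accessOrdered.keySet());
        Collections.reverse(keys);
        return keys;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> cache = new LinkedHashMap<>(16, 0.75f, true);
        for (int i = 5; i >= 1; i--) {
            cache.put(i, i);                              // same put sequence as the tests
        }
        System.out.println(cache.keySet());               // prints [5, 4, 3, 2, 1]
        System.out.println(keysMostRecentFirst(cache));   // prints [1, 2, 3, 4, 5]
    }
}
```

Copying the keys costs O(n), so this suits printing and debugging rather than a hot path.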
A further note:
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);

    // Find a power of 2 >= initialCapacity
    int capacity = 1;
    while (capacity < initialCapacity)
        capacity <<= 1;

    this.loadFactor = loadFactor;
    threshold = (int)(capacity * loadFactor);
    table = new Entry[capacity];
    init();
}
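The code above is the HashMap constructor. The capacity starts at 1 and is doubled until it is greater than or equal to the requested capacity, so the table length is always a power of two, which lets the bucket index be computed with a bit mask. The threshold is the product of this capacity and the load factor; as addEntry above shows, once the size reaches the threshold the table is resized to twice its length.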
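The arithmetic of that constructor can be checked in isolation. The sketch below copies only the rounding loop and the threshold formula from the quoted source; the class name HashMapSizing and its method names are invented, and 1 << 30 stands in for MAXIMUM_CAPACITY.

```java
// Hypothetical helper that mirrors the sizing arithmetic of the quoted constructor.
class HashMapSizing {
    private static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two that is >= the requested capacity.
    static int tableCapacity(int initialCapacity) {
        if (initialCapacity > MAXIMUM_CAPACITY) {
            initialCapacity = MAXIMUM_CAPACITY;
        }
        int capacity = 1;
        while (capacity < initialCapacity) {
            capacity <<= 1;
        }
        return capacity;
    }

    // Size at which the quoted addEntry would trigger a resize.
    static int threshold(int initialCapacity, float loadFactor) {
        return (int) (tableCapacity(initialCapacity) * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(tableCapacity(5));       // prints 8
        System.out.println(threshold(5, 0.75f));    // prints 6
        System.out.println(tableCapacity(16));      // prints 16
        System.out.println(threshold(16, 0.75f));   // prints 12
    }
}
```

So an LRUCache created with initialCapacity 5 sits on a table of 8 buckets, and since removeEldestEntry caps the size at 5, the threshold of 6 is never reached.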
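All of the implementations above are single-threaded and are not suitable for concurrent use. They can be made thread-safe with the utilities in the java.util.concurrent package and the Collections utility class.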
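One simple way to do this, sketched here as an assumption rather than a tuned solution, is to wrap the access-ordered LinkedHashMap with Collections.synchronizedMap. The class name SynchronizedLruDemo and its helpers are invented for this example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo: coarse-grained locking around a LinkedHashMap-based LRU.
class SynchronizedLruDemo {
    static <K, V> Map<K, V> newSynchronizedLru(final int capacity) {
        Map<K, V> lru = new LinkedHashMap<K, V>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
        // In access order even get() relinks entries, so reads need the lock
        // as well; the wrapper synchronizes every call on a single mutex.
        return Collections.synchronizedMap(lru);
    }

    // Two threads insert disjoint key ranges into the same cache.
    static void fillFromTwoThreads(final Map<Integer, Integer> cache) {
        Thread a = new Thread(() -> { for (int i = 0; i < 1000; i++) cache.put(i, i); });
        Thread b = new Thread(() -> { for (int i = 1000; i < 2000; i++) cache.put(i, i); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Map<Integer, Integer> cache = newSynchronizedLru(100);
        fillFromTwoThreads(cache);
        System.out.println(cache.size());   // prints 100
    }
}
```

Every operation takes the same lock, so this trades throughput for simplicity. Iterating over the wrapped map still requires synchronizing on it manually, as the Collections.synchronizedMap documentation states.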
The LinkedHashMap implementation in the JDK is quite efficient. Here is an application to a LeetCode problem: http://www.cnblogs.com/wxisme/p/4888648.html
References:
http://www.cnblogs.com/lzrabbit/p/3734850.html#f1
https://en.wikipedia.org/wiki/Cache_algorithms#LRU
http://zhangshixi.iteye.com/blog/673789
Corrections are welcome if you spot any errors.