removeAll - 走看看

zoukankan html css js c++ java

removeAll
Java中，List是最常用到的一种集合类。我们也经常对List进行操作，也没有碰到什么问题。但是刚刚在调用removeAll方法是，碰到了严重的性能问题。

问题是这样的：

原集合：List<T> source，有大概200,000数据。

目标集合：List<T> destination，有大概150,000数据。

两者中都可能有重复的元素，两者中可能有相同的元素。已经实现了T中的hashCode()，equals()方法。我调用了source.removeAll(destination)，结果花费了15分钟时间。这真是不可接受的。

下来就是自己瞎折腾，试图实现一个与removeAll()方法功能一样的方法，但是性能要有提高。

思路一：有资料表明，给List中add()数据的速度要比从List中remove()数据的快。试着实现了下，但是效果不明显，与原来的removeAll()差别不大。源代码如下：
[html] view plain copy

public List<T> removeAll_01(List<T> source, List<T> destination) {

    List<T> result = new LinkedList<T>();

    for(T t : source) {

        if (!destination.contains(t)) {

            result.add(t);

        }

    }

    return result;

}
思路二：运用Set可以去重这一特性。将source中的元素逐个添加到由destination生成的Set中，如果Set.add(e)为true，说明e应该保留到结果中，否则放弃e。因为source中可能存在重复的元素，因此想到用Map来保存source中的元素与其在source中出现的次数。结果令我大跌眼镜，太JB帅了。性能的提高有好几个数量级。源代码如下：
[html] view plain copy

public List<T> removeAll_02(List<T> source, List<T> destination) {

    List<T> result = new LinkedList<T>();



    Map<T, Integer> sourceMap = new HashMap<T, Integer>();

    for (T t : source) {

        if (sourceMap.containsKey(t)) {

            sourceMap.put(t, sourceMap.get(t) + 1);

        } else {

            sourceMap.put(t, 1);

        }

    }



    Set<T> all = new HashSet<T>(destination);

    for (Map.Entry<T, Integer> entry : sourceMap.entrySet()) {

        T key = entry.getKey();

        Integer value = entry.getValue();

        if (all.add(key)) {

            for (int i = 0; i < value; i++) {

                result.add(key);

            }

        }

    }

    return result;

}
思路三：比较下思路二和思路一的代码实现，思路二优于思路一？为什么，为什么？难道Map.containsKey()方法要比List.contains()方法快几何倍数。所以有了思路三。因为Map.containsKey()实际就是Set.contains()，所以对思路一的代码做了少许修改。证实了想法，结果比思路二更好。源代码如下：
[html] view plain copy

public List<T> removeAll_03(List<T> source, List<T> destination) {

    List<T> result = new LinkedList<T>();

    Set<T> destinationSet = new HashSet<T>(destination);

    for(T t : source) {

        if (!destinationSet.contains(t)) {

            result.add(t);

        }

    }

    return result;

}
那么，让我没看看具体的下效果如何。为了方便测试，T选用Integer，List中的元素用随机数Random.nextInt()，随机数的最大之为1,000,000。

上述结果表明，随着数据量的变大，思路二和思路三的表现非常出色。为什么会这样呢？归根到底，还是因为Map.containsKey()和Set.contains()的速度快。
查看全文

相关阅读:
AngularJs的UI组件ui-Bootstrap分享（十三）——Progressbar
AngularJs的UI组件ui-Bootstrap分享（十二）——Rating
AngularJs的UI组件ui-Bootstrap分享（十一）——Typeahead
AngularJs的UI组件ui-Bootstrap分享（十）——Model
AngularJs的UI组件ui-Bootstrap分享（九）——Alert
AngularJs的UI组件ui-Bootstrap分享（八）——Tooltip和Popover
AngularJs的UI组件ui-Bootstrap分享（七）——Buttons和Dropdown
AngularJs的UI组件ui-Bootstrap分享（六）——Tabs
Xamarin Error：Could not find android.jar for API Level 23.
nodejs文件压缩-使用gulp命令（安装过程）

原文地址：https://www.cnblogs.com/pureEve/p/6553428.html