  • Efficiently counting substring occurrences in a Java String

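    Conclusion: when using substring, prefer the two-index form, substring(beginIndex, endIndex).

    Reason: substring(beginIndex, endIndex) copies only the characters inside that range, whereas the single-index form substring(beginIndex) copies everything from beginIndex to the end of the string, which is usually far more.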
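To make the reason concrete, here is a minimal sketch that tallies how many characters each form asks substring to copy on a small input. The class and method names are hypothetical and not part of the original post, and the sketch assumes that substring copies its result, as OpenJDK has done since Java 7u6.

```java
// Hypothetical helper, not from the original post: tallies the characters
// that each substring form is asked to copy (assumes substring copies).
class SubstringCopyCost {

    // Sliding window: one substring(i, i + k) call per position, k characters each.
    static long twoIndexCopied(String content, String key) {
        long copied = 0;
        int k = key.length();
        for (int i = 0; i + k <= content.length(); i++) {
            copied += content.substring(i, i + k).length();
        }
        return copied;
    }

    // Shrinking remainder: one substring(begin) call per match, copying the whole tail.
    static long singleIndexCopied(String content, String key) {
        long copied = 0;
        String temp = content;
        while (temp.contains(key)) {
            temp = temp.substring(temp.indexOf(key) + key.length());
            copied += temp.length();
        }
        return copied;
    }

    public static void main(String[] args) {
        // Build "abab...ab": 2000 characters containing 1000 matches of "ab".
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("ab");
        }
        String content = sb.toString();
        System.out.println("two-index:    " + twoIndexCopied(content, "ab"));    // 3998
        System.out.println("single-index: " + singleIndexCopied(content, "ab")); // 999000
    }
}
```

The two-index total grows with the text length times the key length. The single-index total grows with the number of matches times the average remaining tail, so it becomes much larger when the text is long and matches are frequent.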

    Performance test:

    import org.apache.commons.io.FileUtils;
    
    import java.io.File;
    import java.io.IOException;
    import java.time.Duration;
    import java.time.LocalTime;
    
    public class TestProcessing {
        public static void main(String[] args) throws IOException {
    
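            // Small file for functional testing
            //String oldfile = "I:\\StudyProject\\5sProject\\filesearch\\test-source\\test.txt";
    
            // Large file for performance testing (backslashes must be escaped in Java string literals)
            String oldfile = "I:\\StudyProject\\5sProject\\filesearch\\test-source\\深入理解JVM-学习笔记.txt";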
            String[] keys = {"加载", "接口", "使用", "初始化", "文件"};
            String content = FileUtils.readFileToString(new File(oldfile), "utf-8");
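            // Approach 1: slide a key-length window over the text; each substring(i, i + keyLength) copies only keyLength characters.
            int count = 0;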
            LocalTime start = LocalTime.now();
            for (String key : keys) {
                for (int i = 0, length = content.length(), keyLength = key.length(); i + keyLength <= length; i++) {
                    if (content.substring(i, i + keyLength).equals(key)) {
                        count++;
                    }
                }
            }
            Duration between = Duration.between(start, LocalTime.now());
            System.out.println("count1: " + count + "  between1: " + between);
    
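            // Approach 2: repeatedly cut the text off after each match; each substring(beginIndex) copies the whole remaining tail.
            int sum = 0;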
            LocalTime start2 = LocalTime.now();
            for (String key : keys) {
                String temp = content;
                while (temp.contains(key)) {
                    temp = temp.substring(temp.indexOf(key) + key.length());
                    sum++;
                }
            }
            Duration between2 = Duration.between(start2, LocalTime.now());
            System.out.println("count2: " + sum + "  between2: " + between2);
    
    
        }
    
    }

    Test results:

    count1: 262890  between1: PT0.663S
    count2: 262890  between2: PT4M55.925S
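Since the gap above comes from copying, one further option is to avoid substring entirely and let String.indexOf(String, int) resume the search from a moving offset. This is a sketch under that assumption rather than something measured in the original post. The class and method names are hypothetical, and it counts non-overlapping matches, like the second loop in the test.

```java
// Hypothetical helper, not from the original post: counts matches without
// creating any intermediate strings.
class IndexOfCounter {

    // Counts non-overlapping occurrences of key in content.
    static int countOccurrences(String content, String key) {
        if (key.isEmpty()) {
            return 0; // an empty key would otherwise match at every position
        }
        int count = 0;
        int from = 0;
        int idx;
        while ((idx = content.indexOf(key, from)) != -1) {
            count++;
            from = idx + key.length(); // use idx + 1 instead to count overlapping matches
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("abcabcab", "ab")); // 3
    }
}
```

Each character of the text is scanned roughly once per key and nothing is copied, so the cost no longer depends on which substring form is used.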
  • Original article: https://www.cnblogs.com/tyxuanCX/p/12592762.html