  • [Algorithm] Counting the words in an article (C and Java implementations)

    1. C: read the file one character at a time

     (To be posted when I have time.)
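
    The C version is not posted, so here is a rough sketch of the one-character-at-a-time idea, written in Java (the language of the rest of this post) rather than C. The class name CharWordCounter and the method countWords are hypothetical, and treating a word as a maximal run of ASCII letters is an assumption. It counts every occurrence of a word, not just distinct words.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only; not the author's C code.
final class CharWordCounter {

    // Counts words by reading one character at a time.
    // Assumption: a word is a maximal run of ASCII letters.
    static int countWords(Reader reader) {
        int count = 0;
        boolean inWord = false; // true while the scanner is inside a word
        try {
            int c;
            while ((c = reader.read()) != -1) {
                boolean isLetter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
                if (isLetter && !inWord) {
                    count++; // a letter right after a non-letter starts a new word
                }
                inWord = isLetter;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWords(new StringReader("Hello, world! It's b2c.")));
    }
}
```

    For the sample string in main this prints 6, because "It's" and "b2c" are each broken into two words (It, s and b, c).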

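    2. Java: read the file line by line and split each line into words with a regular expression; the counting can then be done in parallel with MapReduce. The snippets below show the single-threaded HashMap version. (I use IntelliJ IDEA, which differs from Eclipse in a few places.)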

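        // Requires java.io.BufferedReader, java.io.FileReader, java.io.FileNotFoundException.
        /**
         * Opens a file as a buffered character stream (call reader.readLine() to get one line).
         * The caller is responsible for closing the returned reader.
         *
         * @param path path of the file to open
         * @return a BufferedReader over the file
         * @throws FileNotFoundException if the file does not exist or cannot be opened
         */
        public static BufferedReader openFile(final String path) throws FileNotFoundException {
            return new BufferedReader(new FileReader(path));
        }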
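        /**
         * Adds every distinct word of one line to the hash map.
         * The map is used as a set: each word is stored once, with the value 1.
         *
         * @param hashMap map of the words seen so far, updated in place
         * @param line    one line of the article
         */
        public static void hash(final HashMap<String, Integer> hashMap, final String line) {
            // Limitation: tokens such as "b2c" or "it's" are not kept whole;
            // the regex splits them at the digit or the apostrophe.
            // Lowercase first, otherwise uppercase letters are treated as separators.
            String[] words = line.toLowerCase().split("[^a-z]+");

            for (String word : words) {
                // Skip the empty strings produced by leading separators and blank lines
                if (!word.isEmpty()) {
                    hashMap.putIfAbsent(word, 1);
                }
            }
        }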
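
    To make the limitation noted in hash() concrete, the sketch below applies the same regex, "[^a-z]+", to a few sample lines. The class SplitDemo and the method rawSplit are hypothetical names used only for this illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; shows what split("[^a-z]+") returns on its own.
final class SplitDemo {

    // Splits on every run of characters that are not lowercase ASCII letters.
    static String[] rawSplit(String line) {
        return line.split("[^a-z]+");
    }

    public static void main(String[] args) {
        // The apostrophe and the digit act as separators.
        System.out.println(Arrays.toString(rawSplit("it's a b2c site"))); // [it, s, a, b, c, site]
        // A leading separator yields an empty first element, hence the length check.
        System.out.println(Arrays.toString(rawSplit("  hello")));         // [, hello]
        // With this regex alone, an uppercase letter is a separator too.
        System.out.println(Arrays.toString(rawSplit("Hello")));           // [, ello]
    }
}
```

    The last case is why a line should be lowercased before it is split with this pattern.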
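        /**
         * Returns the number of distinct words collected so far.
         *
         * @param hashMap map filled by hash()
         * @return the number of distinct words (the number of keys in the map)
         */
        public static int computeWordCount(final HashMap<String, Integer> hashMap) {
            return hashMap.size();
        }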

    Test case:

        public static void main(String[] args) throws IOException {
            // PROJECT_ROOT_DIR is assumed to be a constant defined elsewhere in the project.
            String path = Paths.get(PROJECT_ROOT_DIR, "src/main/resources/articles/test.txt").toString();
            HashMap<String, Integer> hashMap = new HashMap<>();

            // try-with-resources closes the reader even if reading fails
            try (BufferedReader reader = openFile(path)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    hash(hashMap, line);
                }
            }

            int wordCount = computeWordCount(hashMap);
            System.out.println(wordCount);
        }
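
    Item 2 mentions counting in parallel with MapReduce, but no such code appears in the post. As a minimal sketch of the map/reduce idea, the version below uses Java 8 parallel streams as a stand-in; it is not Hadoop MapReduce and not the author's implementation. The class ParallelWordCount and the method countWords are hypothetical names, and lowercasing each line is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch: map/reduce-style word count with parallel streams.
final class ParallelWordCount {

    // Same separator pattern as the post: anything that is not a lowercase letter.
    private static final Pattern NON_LETTERS = Pattern.compile("[^a-z]+");

    static Map<String, Long> countWords(List<String> lines) {
        return lines.parallelStream()
                // "map" step: turn every line into a stream of words
                .flatMap(line -> NON_LETTERS.splitAsStream(line.toLowerCase()))
                // drop the empty strings produced by leading separators
                .filter(word -> !word.isEmpty())
                // "reduce" step: group equal words and count each group
                .collect(Collectors.groupingByConcurrent(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> counts = countWords(Arrays.asList("the quick brown fox", "The lazy dog"));
        System.out.println(counts.size());     // 6 distinct words
        System.out.println(counts.get("the")); // 2 occurrences of "the"
    }
}
```

    Unlike the HashMap version above, this map keeps the number of occurrences of each word, so both the distinct-word count (size()) and the per-word frequencies are available.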
  • Original article: https://www.cnblogs.com/yrqiang/p/5331628.html