Code for counting the words in a document, from Think Python


import string


def process_line(line, hist):
    """Adds the words in the line to the histogram.

    Modifies hist.

    line: string
    hist: histogram (map from word to frequency)
    """
    # replace hyphens with spaces before splitting
    line = line.replace('-', ' ')

    for word in line.split():
        # remove punctuation and convert to lowercase
        # Key idea: a word must begin and end with a letter, so only the
        # leading and trailing characters are stripped; punctuation in the
        # middle of a word (as in "isn't") is left intact.
        word = word.strip(string.punctuation + string.whitespace)
        word = word.lower()

        # update the histogram
        hist[word] = hist.get(word, 0) + 1
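
To see the effect of stripping only the ends of each token, here is a minimal, self-contained sketch that runs the same logic over two made-up lines. The `sample_lines` list is a hypothetical input chosen for illustration; it is not part of the original post or the book.

```python
import string


def process_line(line, hist):
    """Add the words in line to hist (same logic as the function above)."""
    # Hyphenated compounds such as "well-known" are split into two words.
    line = line.replace('-', ' ')
    for word in line.split():
        # Strip punctuation only from the ends, then normalize case.
        word = word.strip(string.punctuation + string.whitespace).lower()
        hist[word] = hist.get(word, 0) + 1


# Hypothetical sample text, used only for illustration.
sample_lines = [
    "The well-known cat isn't here.",
    '"The cat," she said, "is well."',
]

hist = {}
for line in sample_lines:
    process_line(line, hist)

# Print the histogram in alphabetical order.
for word in sorted(hist):
    print(word, hist[word])
```

In this run the apostrophe inside "isn't" survives, while the surrounding quotes, commas, and periods are removed. Note that a token made up entirely of punctuation (for example "...") strips down to an empty string and is counted under the key `''`; callers who do not want that entry can skip empty words before updating the histogram.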


Original article: https://www.cnblogs.com/instona/p/3350153.html
