  • Python assignment: the jieba library

    Use the jieba library to count word frequencies and sort the words by frequency.

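    import jieba

    # Read the whole article; undecodable bytes are replaced instead of raising.
    with open("文章.txt", "r", encoding="gbk", errors="replace") as f:
        txt = f.read()

    words = jieba.lcut(txt)  # segment the text and return a list of words
    counts = {}
    for word in words:
        # Skip single-character tokens (mostly punctuation and particles).
        if len(word) == 1:
            continue
        counts[word] = counts.get(word, 0) + 1

    # Sort (word, count) pairs by count, highest first.
    items = sorted(counts.items(), key=lambda x: x[1], reverse=True)
    # Print the top 15, or fewer if the text has fewer distinct words.
    for word, count in items[:15]:
        print("{0:<10}{1:>5}".format(word, count))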
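    The same counting and sorting can also be expressed with the standard library's collections.Counter. The following is only a sketch: the helper name top_words, its min_len parameter, and the sample token list are made up for illustration, and the tokens stand in for the output of jieba.lcut(txt).

```python
from collections import Counter

def top_words(tokens, n=15, min_len=2):
    """Return the n most frequent tokens that have at least min_len characters."""
    counts = Counter(t for t in tokens if len(t) >= min_len)
    # most_common sorts by count, highest first, and returns at most n pairs.
    return counts.most_common(n)

# Hypothetical pre-segmented input; in the script above it would be jieba.lcut(txt).
sample_tokens = ["数据", "分析", "的", "数据", "词频", "分析", "数据", "。"]
for word, count in top_words(sample_tokens, n=3):
    print("{0:<10}{1:>5}".format(word, count))
```

    Because most_common returns at most n pairs, asking for 15 words from a short text does not raise an IndexError.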

    Word cloud

    from wordcloud import WordCloud
    import matplotlib.pyplot as plt
    import jieba

    def create_word_cloud(filename):
        # Read the text to visualize; the encoding must match the file.
        with open(filename, encoding='utf-8') as f:
            text = f.read()
        # Full-mode segmentation, then join with spaces so WordCloud can split the words.
        wordlist = jieba.cut(text, cut_all=True)
        wl = " ".join(wordlist)
        wc = WordCloud(
            background_color="black",
            max_words=2000,
            font_path='msyhl.ttf',  # font used to render the Chinese text
            height=1200,
            width=1600,
            max_font_size=100,
            random_state=100,
            )
        myword = wc.generate(wl)
        plt.imshow(myword)
        plt.axis("off")
        plt.show()
        wc.to_file('img_book.png')  # save the image to disk

    if __name__ == '__main__':
        create_word_cloud('文章.txt')
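    Segmentation output may contain blank tokens, single characters, or common words that would dominate the picture, so it can help to count and filter the tokens before drawing. The sketch below uses only the standard library; cloud_frequencies, its parameters, the token list, and the stopword set are hypothetical and stand in for real jieba.cut(text) output.

```python
from collections import Counter

def cloud_frequencies(tokens, stopwords=frozenset(), min_len=2):
    """Count tokens for a word cloud, skipping blanks, stopwords, and short tokens."""
    cleaned = (t.strip() for t in tokens)
    kept = (t for t in cleaned if len(t) >= min_len and t not in stopwords)
    return dict(Counter(kept))

# Hypothetical tokens and stopword set, standing in for jieba.cut(text) output.
tokens = ["词云", " ", "我们", "词云", "图片", "", "我们", "词云"]
freqs = cloud_frequencies(tokens, stopwords={"我们"})
print(freqs)

# With the wordcloud package installed, a dict like this could be rendered with
# wc.generate_from_frequencies(freqs) in place of wc.generate(wl).
```

    Passing frequencies directly keeps the filtering under your control, whereas generate(wl) leaves the splitting and counting to WordCloud.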

    Word cloud references: https://blog.csdn.net/weixin_40902527/article/details/86717490

    https://www.jb51.net/article/142134.htm

  • Original article: https://www.cnblogs.com/linjiaxin59/p/12650734.html