  • Counting word frequencies and saving them to an .xls file

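    import json
    import jieba.analyse as anl
    import xlwt
    
    # Collect the text to analyse.
    # Each line of zhilian.json holds one JSON object, possibly followed by a trailing comma.
    ans_data = ''
    # The with-block closes the file automatically; every line is read instead of a hard-coded count.
    with open('zhilian.json', 'r', encoding='utf-8') as f:
        for line in f:
            line = line.strip().rstrip(',')
            if not line.startswith('{'):
                continue  # skip blank lines and any enclosing brackets
            record = json.loads(line)
            ans_data += record['job_content']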
    
    # Excel (.xls) setup
    # Create the workbook; it is written to disk by save() at the end
    workbook = xlwt.Workbook(encoding='utf-8')
    # Add a worksheet
    worksheet = workbook.add_sheet('python招聘分词')
    
    # Extract the top 150 keywords from ans_data with jieba.
    # Note: extract_tags ranks keywords by TF-IDF weight, not by raw occurrence count.
    seg = anl.extract_tags(ans_data, topK=150, withWeight=True)
    for index, (tag, weight) in enumerate(seg):
        print("%-20s:%3s %-8s" % (weight, index, tag))
        # Write rank, keyword and weight into columns 0-2 of the current row
        worksheet.write(index, 0, label=index + 1)
        worksheet.write(index, 1, label=tag)
        worksheet.write(index, 2, label=weight)
    # Save the .xls file
    workbook.save('python招聘分词统计.xls')
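
jieba's extract_tags ranks keywords by TF-IDF weight, so the third column written above is a weight rather than an occurrence count. If raw counts are what you want, a minimal sketch with collections.Counter might look like the following. The function name count_words and its parameters (tokenizer, top_k, min_len) are hypothetical and not part of the original post; str.split is only a stand-in tokenizer so the example runs without jieba, and jieba.lcut could be passed instead for Chinese text.

```python
from collections import Counter


def count_words(text, tokenizer=str.split, top_k=150, min_len=2):
    """Return up to top_k (word, count) pairs, most frequent first.

    tokenizer is any callable that maps a string to a list of tokens
    (hypothetical parameter; e.g. jieba.lcut for Chinese text).
    """
    tokens = (tok.strip() for tok in tokenizer(text))
    # Drop empty strings and tokens shorter than min_len characters,
    # which are mostly punctuation and single-character particles.
    counts = Counter(tok for tok in tokens if len(tok) >= min_len)
    return counts.most_common(top_k)


if __name__ == "__main__":
    sample = "python django python mysql django python"
    for rank, (word, count) in enumerate(count_words(sample), start=1):
        print(rank, word, count)
```

The returned pairs can be written to the worksheet the same way as the TF-IDF results, with the count in place of the weight.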
    

      

  • Original post: https://www.cnblogs.com/andy9468/p/7860389.html