  • Composite data types and English word frequency statistics

    Assignment source: https://edu.cnblogs.com/campus/gzcc/GZCC-16SE2/homework/2696

    1. How to add, delete, modify, look up, and iterate over elements in lists, tuples, dictionaries, and sets.

    1. Lists

        # The variable is called names rather than list so that the
        # built-in list type is not shadowed.

        # Append an element to the end
        names = ['wang', 'li', 'chen']
        names.append('huang')
        print(names)

        # Insert an element at a given position
        names = ['wang', 'li', 'chen']
        names.insert(2, 'huang')
        print(names)

        # Delete an element by value
        names = ['wang', 'li', 'chen']
        names.remove('li')
        print(names)

        # Delete an element by position
        names = ['wang', 'li', 'chen']
        names.pop(1)
        print(names)

        # Modify an element by position
        names = ['wang', 'li', 'chen']
        names[1] = 'huang'
        print(names)

        # Look up an element by position
        names = ['wang', 'li', 'chen']
        print(names[1])

        # Iterate; enumerate yields the index together with each element
        names = ['wang', 'li', 'chen']
        for i, bl in enumerate(names):
            print("Index: {}  {}".format(i, bl))
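    The delete and lookup calls above assume the value is present. A minimal sketch of what happens otherwise, using the same sample names (the name 'zhang' is a made-up value that is not in the list):

```python
names = ['wang', 'li', 'chen']

# pop() returns the element it removes
removed = names.pop(1)
print(removed)   # li
print(names)     # ['wang', 'chen']

# remove() raises ValueError when the value is absent,
# so test membership with `in` first
if 'zhang' in names:
    names.remove('zhang')
else:
    print('zhang is not in the list')

# index() also raises ValueError for a missing value
try:
    position = names.index('zhang')
except ValueError:
    position = -1
print(position)  # -1
```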

    2. Tuples

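        # "Add" elements: a tuple is immutable, so concatenation builds a new tuple
        ob = ('wang', 'li')
        ob2 = ('chen', 'huang')
        ob3 = ob + ob2
        print(ob3)

        # Look up elements by index
        ob3 = ('wang', 'li', 'chen', 'huang')
        print("First: {} Second: {}".format(ob3[0], ob3[1]))

        # Delete: single elements cannot be removed; del unbinds the whole tuple
        ob = ('wang', 'li')
        del ob
        print("Tuple deleted")

        # Iterate over a tuple
        ob3 = ('wang', 'li', 'chen', 'huang')
        for bl in ob3:
            print(bl)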
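    The tuple listing has no "modify" step because tuples are immutable. A short sketch of what an attempted assignment does, and of the usual workaround of building a new tuple (the names ob_new and changed are introduced here for illustration):

```python
ob = ('wang', 'li')

# Item assignment is not supported on tuples and raises TypeError
try:
    ob[0] = 'huang'
    changed = True
except TypeError:
    changed = False
print(changed)  # False

# "Modifying" means building a new tuple from pieces of the old one
ob_new = ('huang',) + ob[1:]
print(ob_new)  # ('huang', 'li')
print(ob)      # ('wang', 'li'), the original is untouched
```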

    3. Dictionaries

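        # The variable is called scores rather than dict so that the
        # built-in dict type is not shadowed.

        # Add an element
        scores = {'wang': 100, 'li': 90, 'chen': 80}
        scores['huang'] = 70
        print(scores)

        # Delete an element with del
        scores = {'wang': 100, 'li': 90, 'chen': 80}
        del scores['wang']
        print(scores)

        # Delete an element with pop(), which also returns the removed value
        scores = {'wang': 100, 'li': 90, 'chen': 80}
        scores.pop('wang')
        print(scores)

        # Modify an element
        scores = {'wang': 100, 'li': 90, 'chen': 80}
        scores['wang'] = 99
        print(scores)

        # Look up an element by key
        scores = {'wang': 100, 'li': 90, 'chen': 80}
        print("Score found: {}".format(scores['wang']))

        # Iterate over the dictionary
        scores = {'wang': 100, 'li': 90, 'chen': 80}
        for bl in scores:
            print("{}:{}".format(bl, scores[bl]))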
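    Looking up or deleting a key that does not exist raises KeyError. A minimal sketch of the tolerant alternatives, with the same sample data (the variable name scores and the missing key 'huang' are chosen here for illustration):

```python
scores = {'wang': 100, 'li': 90, 'chen': 80}

# get() returns a default instead of raising KeyError
print(scores.get('huang'))     # None
print(scores.get('huang', 0))  # 0

# pop() accepts a default as well
removed = scores.pop('huang', None)
print(removed)  # None

# items() yields each key together with its value
for name, score in scores.items():
    print("{}:{}".format(name, score))
```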

    4. Sets

        # Add an element
        s = {'wang', 'li', 'chen'}
        s.add('huang')
        print(s)

        # Delete an element
        s = {'wang', 'li', 'chen'}
        s.remove('wang')
        print(s)

        # Modify an element: a set has no index, so remove the old value
        # and add the new one (converting to a list and assigning to [0]
        # would replace an arbitrary element)
        s = {'wang', 'li', 'chen'}
        s.remove('wang')
        s.add('huang')
        print(s)

        # Remove all elements
        s = {'wang', 'li', 'chen'}
        s.clear()
        print(s)

        # Iterate (the order is not guaranteed)
        s = {'wang', 'li', 'chen'}
        for bl in s:
            print(bl)
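    Sets also support membership tests and set algebra; the difference operator is what the word frequency code in part 3 relies on. A small sketch with a hypothetical second set named other:

```python
s = {'wang', 'li', 'chen'}
other = {'li', 'zhao'}

# discard() silently ignores a missing element; remove() would raise KeyError
s.discard('zhang')

# Membership test
print('li' in s)  # True

# sorted() is used only to make the printed order predictable
print(sorted(s & other))  # intersection: ['li']
print(sorted(s | other))  # union: ['chen', 'li', 'wang', 'zhao']
print(sorted(s - other))  # difference: ['chen', 'wang']
```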

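    2. Summarize the similarities and differences among lists, tuples, dictionaries, and sets, with reference to the following aspects: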

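    • Brackets
      List: [ ]
      Tuple: ( )
      Dictionary: { } containing key: value pairs
      Set: { } (an empty set must be written set(), because {} creates a dictionary)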
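    • Ordered or unordered
      List: ordered
      Tuple: ordered
      Dictionary: keeps insertion order since Python 3.7, but has no positional index
      Set: unordered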
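    • Mutable or immutable
      List: mutable
      Tuple: immutable
      Dictionary: mutable (the keys must be hashable, for example strings, numbers, or tuples)
      Set: mutable (the elements must be hashable)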
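    • Duplicates allowed or not
      List: duplicates allowed
      Tuple: duplicates allowed
      Dictionary: keys must be unique; values may repeat
      Set: no duplicates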
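    • Storage and lookup
      List: elements are stored in sequence and read by offset (index)
      Tuple: same as a list, but read-only
      Dictionary: stores key-value pairs as references to objects, not copies (as a list does); values are looked up by key through a hash table
      Set: stores hashed elements; supports membership tests but not indexing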
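    The properties in this list can be checked directly in the interpreter. A minimal sketch, with hypothetical sample values that deliberately contain a repeated 'wang':

```python
lst = ['wang', 'li', 'wang']
tup = ('wang', 'li', 'wang')
dct = {'wang': 100, 'li': 90, 'wang': 99}  # the later duplicate key wins
st = {'wang', 'li', 'wang'}

# Brackets: which literal builds which type
print(type(lst).__name__, type(tup).__name__, type(dct).__name__, type(st).__name__)
print(type({}).__name__)     # dict, empty braces make a dictionary
print(type(set()).__name__)  # set

# Duplicates: lists and tuples keep them, dict keys and sets do not
print(len(lst), len(tup), len(dct), len(st))  # 3 3 2 2
print(dct['wang'])  # 99

# Lookup: by position, by key, or by membership only
print(lst[0], tup[0])  # wang wang
print(dct['li'])       # 90
print('li' in st)      # True
```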

    3. Word frequency statistics

        import pandas as pd

        # File path exactly as it appears in the post; point it at your own text file
        path = r'D:pyhomeworkigbigword.txt'

        # Stop words (function words) to leave out of the count
        stop = {'a','the','and','i','you','in','but','not','with','by','its','for','of','an','to','my','myself','we','our','ours','ourselves','about','no','nor'}

        def gettext():
            # Read the file, lowercase the text, and strip punctuation
            sep = "~`*()!<>?,./;':[]{}-=_+"
            with open(path, encoding='utf8') as f:
                text = f.read().lower()
            for s in sep:
                text = text.replace(s, '')
            return text

        # Split the text into words
        textList = gettext().split()
        print(textList)

        # Exclude the stop words
        textSet = set(textList) - stop
        print(textSet)

        # Count the occurrences of each remaining word
        textDict = {}
        for word in textSet:
            textDict[word] = textList.count(word)
        print(textDict.items())

        # Sort by count, highest first
        word = list(textDict.items())
        word.sort(key=lambda x: x[1], reverse=True)
        print(word)

        # The 20 most frequent words (slicing is safe if there are fewer than 20)
        for item in word[:20]:
            print(item)

        # Save all counts to a CSV file
        pd.DataFrame(data=word).to_csv("text.csv", encoding='utf-8')
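    Calling count() once per distinct word rescans the whole word list every time. As an alternative sketch, not the method used in this post, collections.Counter from the standard library tallies all words in a single pass. The sample sentence and the three-word stop set below are made up for illustration and stand in for the file contents and the full stop list:

```python
from collections import Counter

# Hypothetical sample text standing in for the contents of the file
text = "The dog and the cat. A dog, a cat, and the dog: a bird!"
stop = {'a', 'the', 'and'}        # small illustrative subset of the stop list
sep = "~`*()!<>?,./;':[]{}-=_+"   # same punctuation set as above

# Lowercase and strip punctuation, as gettext() does
cleaned = text.lower()
for ch in sep:
    cleaned = cleaned.replace(ch, '')

# Drop stop words, then count every remaining word in one pass
words = [w for w in cleaned.split() if w not in stop]
counts = Counter(words)

# most_common(n) returns the n highest counts, already sorted
top = counts.most_common(3)
print(top)  # [('dog', 3), ('cat', 2), ('bird', 1)]
```

    The list returned by most_common() has the same (word, count) shape as the sorted word list above, so it could be passed to pd.DataFrame in the same way.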

  • Original post: https://www.cnblogs.com/hzj111/p/10525860.html