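I. Filtering data
Approaches:
1. The filter function. In Python 3 it returns a lazy iterator (a filter object), not a list, so the results have to be consumed by iterating.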
    from random import randint

    # random sample data (not used below; data2 is a fixed sample)
    data = [randint(-100,100) for i in range(10)]
    data2 = [34, -59, -13, 96, -78, 38, 89, -96, -79, 98]

    # keep only the positive values
    info = filter(lambda x: x > 0, data2)
    for i in info:
        print(i)
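Because the filter object is lazy, it can only be walked once. The following is a minimal sketch of that behavior, reusing the sample list from above; the variable names are illustrative only.

```python
data = [34, -59, -13, 96, -78, 38, 89, -96, -79, 98]

# filter() returns a lazy iterator, not a list
positives = filter(lambda x: x > 0, data)

# an iterator returns itself from iter()
is_iterator = iter(positives) is positives

first_pass = list(positives)    # consumes the iterator
second_pass = list(positives)   # nothing is left to yield

print(is_iterator)   # True
print(first_pass)    # [34, 96, 38, 89, 98]
print(second_pass)   # []
```

If the filtered values are needed more than once, storing them in a list first is the simpler choice.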
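2. List comprehension
    from random import randint

    # random sample data (not used below; data2 is a fixed sample)
    data = [randint(-100,100) for i in range(10)]
    data2 = [34, -59, -13, 96, -78, 38, 89, -96, -79, 98]

    # keep only the positive values
    info = [i for i in data2 if i > 0]
    print(info)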
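Comparing filter and list comprehensions:
In Python 2, filter builds a list and the list comprehension is generally the faster of the two. In Python 3, a bare filter(...) call looks faster in the timing below, but only because it returns a lazy iterator without filtering anything yet. For a fair comparison the iterator has to be consumed, for example with list(filter(...)).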
    import timeit

    # builds the full result list on every run
    t1 = timeit.Timer('[i for i in [34, -59, -13, 96, -78, 38, 89, -96, -79, 98] if i >0]')
    # Python 3: only constructs the lazy filter object; no element is tested
    t2 = timeit.Timer('filter(lambda x:x>0,[34, -59, -13, 96, -78, 38, 89, -96, -79, 98])')

    print(t1.timeit())
    print(t2.timeit())

Result (seconds for timeit's default 1,000,000 runs):
1.9095332647026366
0.6967581773661176
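A fairer comparison makes both approaches produce a finished list. The sketch below is one way to set that up; the function names and the run count n are assumptions chosen for illustration. No timings are shown, because they depend on the machine and the Python version.

```python
import timeit

data = [34, -59, -13, 96, -78, 38, 89, -96, -79, 98]

def with_comprehension():
    # builds the result list directly
    return [x for x in data if x > 0]

def with_filter():
    # list() forces the lazy filter object to do the filtering
    return list(filter(lambda x: x > 0, data))

n = 100_000  # number of runs per measurement (arbitrary choice)
t_comp = timeit.timeit(with_comprehension, number=n)
t_filter = timeit.timeit(with_filter, number=n)

print(f"list comprehension: {t_comp:.4f}s")
print(f"list(filter(...)):  {t_filter:.4f}s")
```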
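3. Dict comprehension
Use a dict comprehension to filter data by value.
    from random import randint

    # keys 1..10 mapped to random values
    info = {x: randint(10,100) for x in range(1,11)}
    print(info)

    # keep only the entries whose value is above 60
    info2 = {key: value for key, value in info.items() if value > 60}
    print(info2)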
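4. Filtering a set
A set comprehension looks much like a dict comprehension, but it holds single elements instead of key: value pairs.
    data = [34, -59, -13, 96, -78, 38, 89, -96, -79, 98]
    data2 = set(data)
    print(data2)

    # keep only the positive values
    info = {x for x in data2 if x > 0}
    print(info)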
II. Naming the elements of a tuple to improve readability
1. Assign numeric constants to the indexes, similar to an enum in C
    # one constant per field; the tuple has five fields, so age needs an index too
    name, age, work, play, address = range(5)

    people = ("Tom", 35, "Teacher", "swimming", "shenzhen")
    print(people[name], people[work], people[address])
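The same index-constant idea can also be written with enum.IntEnum from the standard library, which keeps the constants in one namespace. This is an alternative sketch rather than the approach used above, and the class name Field is hypothetical.

```python
from enum import IntEnum

class Field(IntEnum):
    # one member per tuple position
    NAME = 0
    AGE = 1
    WORK = 2
    PLAY = 3
    ADDRESS = 4

people = ("Tom", 35, "Teacher", "swimming", "shenzhen")

# IntEnum members are ints, so they work directly as tuple indexes
picked = (people[Field.NAME], people[Field.WORK], people[Field.ADDRESS])
print(picked)  # ('Tom', 'Teacher', 'shenzhen')
```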
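2. Use collections.namedtuple from the standard library in place of the built-in tuple. It creates a custom tuple subclass whose fields can be read by name, at a cost only slightly higher than a plain tuple.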
    from collections import namedtuple

    # namedtuple(typename, field_names) returns a new tuple subclass
    people2 = namedtuple('people2', ['name', 'age', 'work', 'play', 'address'])
    info = people2("Tom", 35, "Teacher", "swimming", "shenzhen")
    print(info)
    print(info.name, info.age, info.work, info.play, info.address)

Result:
people2(name='Tom', age=35, work='Teacher', play='swimming', address='shenzhen')
Tom 35 Teacher swimming shenzhen
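Since a namedtuple is still a tuple, indexing and unpacking keep working, and a few helper methods come with it. A short sketch, with the type name Person and the variable names chosen only for illustration:

```python
from collections import namedtuple

Person = namedtuple("Person", ["name", "age", "work", "play", "address"])
tom = Person("Tom", 35, "Teacher", "swimming", "shenzhen")

is_tuple = isinstance(tom, tuple)   # still a real tuple
name, age, *rest = tom              # ordinary unpacking works

# instances are immutable; _replace returns a modified copy
older = tom._replace(age=36)

# _asdict converts the fields to a mapping of name -> value
as_dict = tom._asdict()

print(is_tuple, tom[1], older.age, as_dict["work"])
```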
III. Counting how often each element occurs in a sequence
1. Initialize a dict with dict.fromkeys, then count the occurrences in a for loop.
    # from random import randint
    #
    # data = [randint(0,10) for x in range(30)]
    # print(data)
    data2 = [3, 0, 9, 1, 4, 1, 5, 7, 4, 7, 7, 3, 10, 4, 0, 6, 9, 2, 2, 4, 1, 1, 7, 8, 2, 7, 3, 1, 4, 9]

    # every distinct value becomes a key with a starting count of 0
    dict1 = dict.fromkeys(data2, 0)
    print(dict1)

    for x in data2:
        dict1[x] += 1
    print(dict1)
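Result (key order as in the author's run; on Python 3.7+ dicts preserve insertion order, so the keys print in first-seen order instead):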
{0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0}
{0: 2, 1: 5, 2: 3, 3: 3, 4: 5, 5: 1, 6: 1, 7: 5, 8: 1, 9: 3, 10: 1}
2. Use a collections.Counter object
    # from random import randint
    #
    # data = [randint(0,10) for x in range(30)]
    # print(data)
    data2 = [3, 0, 9, 1, 4, 1, 5, 7, 4, 7, 7, 3, 10, 4, 0, 6, 9, 2, 2, 4, 1, 1, 7, 8, 2, 7, 3, 1, 4, 9]

    from collections import Counter

    dict1 = Counter(data2)               # counts every element in one pass
    print(dict1)
    print(dict1[1], dict1[4], dict1[7])  # look up individual counts
    print(dict1.most_common(3))          # the three most frequent elements
Result:
Counter({1: 5, 4: 5, 7: 5, 2: 3, 3: 3, 9: 3, 0: 2, 5: 1, 6: 1, 8: 1, 10: 1})
5 5 5
[(1, 5), (4, 5), (7, 5)]
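Two more Counter behaviors are handy when counting: a missing key reads as 0 instead of raising KeyError, and update adds new observations to the existing counts. A small sketch using the same sample data; the extra batch [7, 7, 10] is made up for illustration.

```python
from collections import Counter

data2 = [3, 0, 9, 1, 4, 1, 5, 7, 4, 7, 7, 3, 10, 4, 0, 6, 9, 2, 2, 4, 1, 1,
         7, 8, 2, 7, 3, 1, 4, 9]
counts = Counter(data2)

# a key that never occurred reads as 0 and is not inserted
missing = counts[42]
still_absent = 42 not in counts

# update() adds to the existing counts instead of replacing them
counts.update([7, 7, 10])
top = counts.most_common(1)

print(missing, still_absent)   # 0 True
print(counts[7], counts[10])   # 7 2
print(top)                     # [(7, 7)]
```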
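IV. Sorting a dict's keys by their values
For this kind of sorting the built-in sorted is the usual choice. In CPython, built-ins like it are implemented in C, so they tend to run faster than hand-written loops.
1. Use zip to pair each value with its key. Tuples compare element by element, so the pairs sort by value first.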
    from random import randint

    dict1 = {x: randint(50,100) for x in 'abcdefg'}
    print(dict1)

    # build (value, key) pairs so that sorting orders them by value
    ret = sorted(zip(dict1.values(), dict1.keys()))
    print(ret)

Result:
{'e': 77, 'd': 100, 'b': 51, 'c': 78, 'g': 55, 'f': 80, 'a': 87}
[(51, 'b'), (55, 'g'), (77, 'e'), (78, 'c'), (80, 'f'), (87, 'a'), (100, 'd')]
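The zip approach yields (value, key) pairs. If only the ordered keys are wanted, they can be pulled out of the sorted pairs afterwards. A sketch using the dict from the sample output above as fixed input, so the result is reproducible:

```python
scores = {'e': 77, 'd': 100, 'b': 51, 'c': 78, 'g': 55, 'f': 80, 'a': 87}

# sort (value, key) pairs, then keep only the key of each pair
pairs = sorted(zip(scores.values(), scores.keys()))
keys_by_value = [key for _, key in pairs]

print(keys_by_value)  # ['b', 'g', 'e', 'c', 'f', 'a', 'd']
```

When two values are equal, the pairs fall back to comparing the keys, so ties come out in key order.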
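2. By default sorted compares the items themselves, and tuples are compared starting from their first element. To sort on something else, pass the key argument: a function that sorted calls once for each item and whose return value is used as the sort key.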
    from random import randint

    dict1 = {x: randint(50,100) for x in 'abcdefg'}
    print(dict1)
    print(dict1.items())

    # each item is a (key, value) tuple; x[1] selects the value as the sort key
    ret = sorted(dict1.items(), key=lambda x: x[1])
    print(ret)

Result:
{'a': 64, 'f': 51, 'd': 67, 'e': 73, 'c': 57, 'g': 100, 'b': 71}
dict_items([('a', 64), ('f', 51), ('d', 67), ('e', 73), ('c', 57), ('g', 100), ('b', 71)])
[('f', 51), ('c', 57), ('a', 64), ('d', 67), ('b', 71), ('e', 73), ('g', 100)]
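The key argument also allows a few shorter variants. The sketch below uses the dict from the sample output above as fixed input; operator.itemgetter and dict.get are standard-library alternatives to the lambda, not something the original code used.

```python
from operator import itemgetter

scores = {'a': 64, 'f': 51, 'd': 67, 'e': 73, 'c': 57, 'g': 100, 'b': 71}

# iterating a dict yields its keys; scores.get maps each key to its value
keys_ascending = sorted(scores, key=scores.get)

# reverse=True sorts from the highest value to the lowest
keys_descending = sorted(scores, key=scores.get, reverse=True)

# itemgetter(1) does the same job as lambda x: x[1]
items_by_value = sorted(scores.items(), key=itemgetter(1))

print(keys_ascending)    # ['f', 'c', 'a', 'd', 'b', 'e', 'g']
print(keys_descending)   # ['g', 'e', 'b', 'd', 'a', 'c', 'f']
print(items_by_value[0]) # ('f', 51)
```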