zoukankan html css js c++ java

Python基础-map/reduce/filter

一、map

Python内置函数，用法及说明如下：

class map(object):
    """
    map(func, *iterables) --> map object
    
    Make an iterator that computes the function using arguments from
    each of the iterables.  Stops when the shortest iterable is exhausted.
    """

map()函数接收两个参数，一个是函数，一个是Iterable，map将传入的函数依次作用到序列的每个元素，并把结果作为新的Iterator返回。　　

举例说明，比如我们有一个函数f(x)=x²，要把这个函数作用在一个list [1, 2, 3, 4, 5, 6, 7, 8, 9]上，就可以用map()实现如下：

def f(x):
    return x * x
r = map(f, [1, 2, 3, 4, 5, 6, 7, 8, 9])
list(r)
[1, 4, 9, 16, 25, 36, 49, 64, 81]

map()传入的第一个参数是f，即函数对象本身。由于结果r是一个Iterator，Iterator是惰性序列，因此通过list()函数让它把整个序列都计算出来并返回一个list。

#使用lambda匿名函数
list(map(lambda x: x * x, [1, 2, 3, 4, 5, 6, 7, 8, 9]))    
[1, 4, 9, 16, 25, 36, 49, 64, 81]

map()作为高阶函数，事实上它把运算规则抽象了，因此，我们不但可以计算简单的f(x)=x²，还可以计算任意复杂的函数，比如，把这个list所有数字转为字符串：

list(map(str, [1, 2, 3, 4, 5, 6, 7, 8, 9]))
['1', '2', '3', '4', '5', '6', '7', '8', '9']

map函数的优点：

函数逻辑更加清晰，参数‘f’就表明了对元素的操作
map是高阶函数，可以执行抽象度更高的运算

二、 reduce

def reduce(function, sequence, initial=None): # real signature unknown; restored from __doc__
    """
    reduce(function, sequence[, initial]) -> value
    
    Apply a function of two arguments cumulatively to the items of a sequence,
    from left to right, so as to reduce the sequence to a single value.
    For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates
    ((((1+2)+3)+4)+5).  If initial is present, it is placed before the items
    of the sequence in the calculation, and serves as a default when the
    sequence is empty.
    """
    pass

reduce把一个函数作用在一个序列[x1, x2, x3, ...]上，这个函数必须接收两个参数，reduce把结果继续和序列的下一个元素做累积计算，其效果就是：　　

reduce(f, [x1, x2, x3, x4]) = f(f(f(x1, x2), x3), x4)

比方说对一个序列求和，就可以用reduce实现：

from functools import reduce
def add(x, y):
    return x + y
reduce(add, [1, 3, 5, 7, 9])
25

匿名函数实现：

reduce(lambda x, y : x + y, [1, 3, 5, 7, 9])
25

当然求和运算可以直接用Python内建函数sum()，没必要动用reduce。

但是如果要把序列[1, 3, 5, 7, 9]变换成整数13579，reduce就可以派上用场：

from functools import reduce
def fn(x, y):
    return x * 10 + y
reduce(fn, [1, 3, 5, 7, 9])
13579

匿名函数实现：

reduce(lambda x, y: x * 10 + y, [1, 3, 5, 7, 9])
13579

这个例子本身没多大用处，但是，如果考虑到字符串str也是一个序列，对上面的例子稍加改动，配合map()，我们就可以写出把str转换为int的函数：

from functools import reduce
def fn(x, y):
    return x * 10 + y
def char2num(s):
    return {'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9}[s]
reduce(fn, map(char2num, '13579'))
13579

整理成一个str2int的函数就是：

from functools import reduce

def str2int(s):
    def fn(x, y):
        return x * 10 + y
    def char2num(s):
        return {'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9}[s]
    return reduce(fn, map(char2num, s))

还可以用lambda函数进一步简化成：

from functools import reduce

def char2num(s):
    return {'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9}[s]

def str2int(s):
    return reduce(lambda x, y: x * 10 + y, map(char2num, s))

小练习：

利用map()函数，把用户输入的不规范的英文名字，变为首字母大写，其他小写的规范名字。输入：['adam', 'LISA', 'barT']，输出：['Adam', 'Lisa', 'Bart']：

list(map(lambda x: x.capitalize(), ['adam', 'LISA', 'barT']))
['Adam', 'Lisa', 'Bart']

　　2. Python提供的sum()函数可以接受一个list并求和，请编写一个prod()函数，可以接受一个list并利用reduce()求积：

def prod(l):
    return reduce(lambda x, y: x * y, l)
l = [1, 2 ,3, 4, 5]
print(prod(l))
120

　　匿名函数实现：

reduce(lambda x, y: x * y, [1, 2, 3, 4, 5])
120

　　3. 利用map和reduce编写一个str2float函数，把字符串'123.456'转换成浮点数123.456：　　

from functools import reduce
def char2num(s):
    return {'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9}[s]
def str_split(s):
    s1, s2 = s.split('.')
    return s1, s2
def str2int_1(s1):
    return reduce(lambda x, y: x * 10 + y, map(char2num, s1))
def str2int_2(s2):
    return (reduce(lambda x, y: x * 10 + y, map(char2num, s2)))/pow(10, len(s2))
def str2float(s):
    s1, s2 = str_split(s)
    res = str2int_1(s1) + str2int_2(s2)
    return res
a = str2float('123.456')
print(a)
123.456

三、filter

Python内建的filter()函数用于过滤序列。

和map()类似，filter()也接收一个函数和一个序列。和map()不同的是，filter()把传入的函数依次作用于每个元素，然后根据返回值是True还是False决定保留还是丢弃该元素。

class filter(object):
    """
    filter(function or None, iterable) --> filter object
    
    Return an iterator yielding those items of iterable for which function(item)
    is true. If function is None, return the items that are true.
    """

例如，在一个list中，删掉偶数，只保留奇数，可以这么写：

def is_odd(n):
    return n % 2 == 1
list(filter(is_odd, [1, 2, 3, 4, 5, 6, 7, 8, 9]))
[1, 3, 5, 7, 9]

同样可以加入匿名函数：

list(filter(lambda x: x % 2 == 1, [1, 2, 3, 4, 5, 6, 7, 8, 9]))
[1, 3, 5, 7, 9]

把一个序列中的空字符串删掉，可以这么写：

def not_empty(s):
    return s and s.strip()
list(filter(not_empty, ['A', '', 'B', None, 'C', ' ']))
['A', 'B', 'C']

匿名函数的形式：

list(filter(lambda x: x and x.strip(), ['A', '', 'B', None, 'C', ' ']))
['A', 'B', 'C']

可见用filter()这个高阶函数，关键在于正确实现一个“筛选”函数。

注意到filter()函数返回的是一个Iterator，也就是一个惰性序列，所以要强迫filter()完成计算结果，需要用list()函数获得所有结果并返回list。

实例：

用filter求素数

计算素数的一个方法是埃氏筛法，它的算法理解起来非常简单：

首先，列出从2开始的所有自然数，构造一个序列：

2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...

取序列的第一个数2，它一定是素数，然后用2把序列的2的倍数筛掉：

3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...

取新序列的第一个数3，它一定是素数，然后用3把序列的3的倍数筛掉：

5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...

取新序列的第一个数5，然后用5把序列的5的倍数筛掉：

7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...

不断筛下去，就可以得到所有的素数。

用Python来实现这个算法，可以先构造一个从3开始的奇数序列：

def _odd_iter():
    n = 1
    while True:
        n = n + 2
        yield n

注意这是一个生成器，并且是一个无限序列。

然后定义一个筛选函数：

def _not_divisible(n):
    return lambda x: x % n > 0

最后，定义一个生成器，不断返回下一个素数：

def primes():
    yield 2
    it = _odd_iter()    #初始序列
    while True:
        n = next(it)    #返回序列的第一个数
        yield n
        it = filter(_n

这个生成器先返回第一个素数2，然后，利用filter()不断产生筛选后的新的序列。

由于primes()也是一个无限序列，所以调用时需要设置一个退出循环的条件：

#打印100以内的素数
for n in primes():
    if n < 100:
        print(n)
    else:
        break
        
2
3
5
7
11
13
17
19
23
29
31
37
41
43
47
53
59
61
67
71
73
79
83
89
97

注意到Iterator是惰性计算的序列，所以我们可以用Python表示“全体自然数”，“全体素数”这样的序列，而代码非常简洁。

小练习：

回数是指从左向右读和从右向左读都是一样的数，例如12321，909。请利用filter()滤掉非回数：

list(filter(lambda x: str(x) == str(x)[::-1], range(1,1000)))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 22, 33, 44, 55, 66, 77, 88, 99, 101, 111, 121, 131, 141, 151, 161, 171, 181, 191, 202, 212, 222, 232, 242, 252, 262, 272, 282, 292, 
303, 313, 323, 333, 343, 353, 363, 373, 383, 393, 404, 414, 424, 434, 444, 454, 464, 474, 484, 494, 505, 515, 525, 535, 545, 555, 565, 575, 585, 595, 606, 616, 626,
 636, 646, 656, 666, 676, 686, 696, 707, 717, 727, 737, 747, 757, 767, 777, 787, 797, 808, 818, 828, 838, 848, 858, 868, 878, 888, 898, 909, 919, 929, 939, 949, 959,
 969, 979, 989, 999]

　　思路：先将int数字类型转换为str字符串类型，然后比较原字符串和取反后的字符串是否相等来返回值。

参考资料：

廖雪峰的官方网站

帮助很大，非常感谢！

查看全文

相关阅读:
datatables出现横向滚动条
 mumu模拟器设置代理/打开网络连接（windows）
a标签下划线
 python 配置导入方式
 redis命令
 django+mongodb 内置用户控制
 异常补充
 关于异常的总结
 java 常用类的方法
 java 日期时间类加Calendar的set和add方法

原文地址：https://www.cnblogs.com/OldJack/p/6705897.html