  • Python Automation Development Class Notes [Day 10]

    Threads

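    Definition: a thread is the execution of one assembly line. Every assembly line belongs to a workshop, and the running of a workshop is a process (a process contains at least one thread).
              A process is the unit of resource allocation, while a thread is the unit of execution on the CPU. Creating a thread costs far less than creating a process.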

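    Multithreading: one workshop holds several assembly lines, and they all share the workshop's resources (multiple threads share the resources of one process).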

    Why create multiple threads:
      1. Resources are shared
      2. Creation overhead is low
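
The resource-sharing point can be sketched as follows. This is a minimal illustration, not part of the original notes; the names `shared` and `record` are made up for the example. Three threads append to one module-level list and report the PID they run under.

```python
import os
from threading import Thread

shared = []  # lives in the process; every thread sees the same list object

def record(tag):
    # Each thread appends its tag together with the PID it runs under.
    shared.append((tag, os.getpid()))

threads = [Thread(target=record, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

tags = sorted(tag for tag, _ in shared)
pids = {pid for _, pid in shared}
print(tags)       # every thread wrote into the same list
print(len(pids))  # a single distinct PID: all threads belong to one process
```

Nothing has to be copied or passed between the threads, which is what sharing the resources of one process means in practice.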

    Two ways to start a thread:

    Method 1:
    
    from threading import Thread
    
    def work(name):
        print('%s say hello' % name)
        
    if __name__ == '__main__':
        t = Thread(target=work,args=('Albert',))
        t.start()
        print('main thread')
    
    Method 2:
    
    from threading import Thread
    
    class MyThread(Thread):
    
        def __init__(self,name):
            super().__init__()
            self.name = name
    
        def run(self):
            print('%s say hello' % self.name)
    
    if __name__ == '__main__':
    
        t = MyThread('Albert')
        t.start()
        print('main thread')

    P.S. The main process and the main thread share the same PID. To verify:

    from threading import Thread
    import os
    
    def work():
        print('%s say hello' % os.getpid())
    
    if __name__ == '__main__':
        t = Thread(target=work,)
        t.start()
        print('main thread:%s' % os.getpid())

    Multithreading exercises:

    Exercise 1:
    
    Server:
    
    import socket
    import threading
    
    server = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET,socket.SO_REUSEADDR,1)
    server.bind(('127.0.0.1',8080))
    server.listen(5)
    
    def action(conn):
        while True:
            try:
                data = conn.recv(1024)
                if not data: break
                print(data)
                conn.send(data.upper())
            except Exception:
                break
    
    if __name__ == '__main__':
        while True:
            conn,addr = server.accept()
            p = threading.Thread(target=action,args=(conn,))
            p.start()
            
    Client:
    
    import socket
    
    client = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
    client.connect(('127.0.0.1',8080))
    
    while True:
        msg = input('>>>:').strip()
        if not msg: continue
        client.send(msg.encode())
        back_msg = client.recv(1024)
        print(back_msg.decode())
    
    Exercise 2:
    
    from threading import Thread
    
    data_l = []
    format_data_l = []
    
    def inp():
        while True:
            data = input('>>>:').strip()
            if not data:continue
            data_l.append(data)
    
    def format():
        while True:
            if data_l:
                data = data_l.pop()
                format_data = data.upper()
                format_data_l.append(format_data)
    
    def write():
        while True:
            if format_data_l:
                data = format_data_l.pop()
                with open('c.txt','a',encoding='utf-8') as f:
                    f.write(data + '\n')
    
    if __name__ == '__main__':
    
        t1 = Thread(target=inp)
        t2 = Thread(target=format)
        t3 = Thread(target=write)
    
        t1.start()
        t2.start()
        t3.start()

    Other thread attributes

    import threading
    from threading import Thread
    import os
    
    def work():
        print('%s say hello' % threading.current_thread().name)
    
    if __name__ == '__main__':
        t = Thread(target=work,)
        t.daemon = True # mark the thread as a daemon thread (must be set before start())
        t.start()
        t.join()
        print(threading.enumerate()) # list of the threads that are currently alive
        print(threading.active_count()) # number of threads that are currently alive
        print('main thread:%s' % threading.current_thread().name) # name of the current thread

    The GIL

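    Because of the GIL in the CPython interpreter, only one of the threads started within a process can execute at any given moment, so multithreading cannot take advantage of multiple cores.
    The GIL is not a feature of the Python language; it is a concept introduced by one implementation of the Python interpreter (CPython).
    In other words, with the GIL in place only one thread in the same process executes at any instant.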

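    Conclusions:
      For computation, more CPUs are better; for I/O, extra CPUs do not help.
      Of course no program is purely computational or purely I/O. We can only judge whether a program is relatively compute-bound or I/O-bound, and from that decide whether Python multithreading is useful for it.
    Almost all computers today are multi-core. For compute-bound tasks, Python multithreading brings little performance gain and can even be slower than serial execution (which avoids the heavy thread switching); for I/O-bound tasks, however, it improves efficiency significantly.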

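    Applications:
    Multithreading suits I/O-bound work, such as sockets, web crawlers and web services
    Multiprocessing suits compute-bound work, such as financial analysis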
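
As a rough illustration of the I/O-bound case, the sketch below uses the standard-library `concurrent.futures.ThreadPoolExecutor` (a convenience wrapper that these notes do not use; they create `Thread` objects by hand). `fake_io` is a hypothetical stand-in for a real network or disk call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task_id):
    # time.sleep stands in for a blocking network or disk call;
    # a sleeping thread releases the GIL so other threads can run.
    time.sleep(0.1)
    return task_id * 2

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fake_io, range(10)))
elapsed = time.perf_counter() - start

print(results)
print('elapsed: %.2fs' % elapsed)  # well under the ~1s a serial loop would need
```

The ten waits overlap, so the total time stays close to a single wait rather than the sum of all of them.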

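    Note:
    The GIL and a Lock are two different locks that protect different data. The former works at the interpreter level (so it protects interpreter-level data, such as the data used by garbage collection);
    the latter protects the data of the application the user writes. The GIL clearly does not take care of that, so users must add their own locking, i.e. a Lock.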
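
The difference can be made visible with a small experiment. This is a sketch under assumptions: the helper `run` and the `state` dictionary are invented for the illustration. The same read, pause, write-back sequence runs once without a lock and once with one.

```python
import time
from contextlib import nullcontext
from threading import Thread, Lock

def run(n_threads, lock=None):
    state = {'n': n_threads}
    # Without a lock, nullcontext() makes the `with` statement a no-op.
    guard = lock if lock is not None else nullcontext()

    def decrement():
        with guard:
            temp = state['n']      # read
            time.sleep(0.001)      # widen the window between read and write
            state['n'] = temp - 1  # write back

    threads = [Thread(target=decrement) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state['n']

unsafe = run(20)
safe = run(20, Lock())
print('without Lock:', unsafe)  # usually well above 0: updates were lost
print('with Lock:', safe)       # every decrement is applied
```

The GIL is in effect in both runs, yet only the run with an explicit Lock is guaranteed to apply all twenty decrements.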

    Example:
    I/O-bound
    
    from threading import Thread
    from multiprocessing import Process
    import time
    import os
    
    def work():
        time.sleep(1)
        print(os.getpid())
    
    if __name__ == '__main__':
        tp_l = []
        start_time = time.time()
        for i in range(100):
            tp = Thread(target=work) #run_time is 1.0190582275390625
            # tp = Process(target=work) #run_time is 10.807618141174316
            tp_l.append(tp)
            tp.start()
    
        for tp in tp_l:
            tp.join()
    
        stop_time = time.time()
        print('run_time is %s' % (stop_time - start_time))
        
        
    Compute-bound
    
    from threading import Thread
    from multiprocessing import Process
    import os
    import time
    
    def work():
        res = 0
        for i in range(100000):
            res+=i
    
    if __name__ == '__main__':
        tp_l = []
        start_time = time.time()
        for i in range(300):
            # tp = Thread(target=work) # run_time is 4.402251720428467
            tp = Process(target=work) # run_time is 28.153610229492188
            tp_l.append(tp)
            tp.start()
    
        for tp in tp_l:
            tp.join()
    
        stop_time = time.time()
        print('run_time is %s' % (stop_time - start_time))

    Mutex (mutual exclusion lock)

    from threading import Thread, Lock
    import time
    
    n = 100
    def work():
        with mutex:
            global n
            temp = n
            time.sleep(0.1)
            n = temp - 1
    
    if __name__ == '__main__':
        mutex = Lock()
        t_l = []
        for i in range(100):
            t = Thread(target=work)
            t_l.append(t)
            t.start()
        for i in t_l:
            i.join()
        print(n)  

    Deadlock and recursive locks

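    Deadlock: a situation in which two or more processes or threads wait on one another while competing for resources during execution; without outside intervention, none of them can make progress.
                      The system is then said to be deadlocked, and the processes that wait on each other forever are called deadlocked processes. The following is a deadlock: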

    Deadlock example:
    
    from threading import Thread,Lock
    import time
    
    class MyThread(Thread):
        def run(self):
            self.f1()
            self.f2()
        def f1(self):
            mutexA.acquire()
            print('\033[40m%s get LockA\033[0m' % self.name)
            mutexB.acquire()
            print('\033[41m%s get LockB\033[0m' % self.name)
            mutexB.release()
            mutexA.release()
    
        def f2(self):
            mutexB.acquire()
            time.sleep(1)
            print('\033[41m%s get LockB\033[0m' % self.name)
            mutexA.acquire()
            print('\033[40m%s get LockA\033[0m' % self.name)
            mutexA.release()
            mutexB.release()
    
    if __name__ == '__main__':
        mutexA = Lock()
        mutexB = Lock()
        for i in range(20):
            t = MyThread()
            t.start()

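    How to solve the deadlock problem:
      Use a recursive lock. To allow the same resource to be requested several times within one thread, Python provides the reentrant lock RLock.
      An RLock internally maintains a Lock and a counter variable; the counter records the number of acquire calls, so the resource can be acquired multiple times by the same thread.
      Other threads can obtain the resource only after that thread has released every acquire. If the example above uses RLock instead of Lock, no deadlock occurs: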

    from threading import Thread,RLock
    import time
    
    class MyThread(Thread):
        def run(self):
            self.f1()
            self.f2()
        def f1(self):
            mutexA.acquire()
            print('\033[40m%s get LockA\033[0m' % self.name)
            mutexB.acquire()
            print('\033[41m%s get LockB\033[0m' % self.name)
            mutexB.release()
            mutexA.release()
    
        def f2(self):
            mutexB.acquire()
            time.sleep(1)
            print('\033[41m%s get LockB\033[0m' % self.name)
            mutexA.acquire()
            print('\033[40m%s get LockA\033[0m' % self.name)
            mutexA.release()
            mutexB.release()
    
    if __name__ == '__main__':
        # mutexA = Lock()
        # mutexB = Lock()
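        # Both names refer to the same lock; do not mistake them for two locks
        mutexA = mutexB = RLock() # When a thread acquires the lock the counter goes up by 1; each further acquire in that thread adds 1 again. Meanwhile all other threads must wait until that thread releases every acquire, i.e. until the counter drops back to 0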
        for i in range(20):
            t = MyThread()
            t.start()
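
The reentrancy that separates RLock from Lock can be checked in isolation. The sketch below uses non-blocking acquires so that the plain Lock case reports failure instead of hanging; the variable names are chosen for the illustration only.

```python
from threading import Lock, RLock

plain = Lock()
plain.acquire()
# A second blocking acquire by the same thread would wait forever,
# so ask without blocking and look at the return value instead.
second_plain = plain.acquire(blocking=False)
plain.release()

reentrant = RLock()
reentrant.acquire()
# Same thread, same RLock: the internal counter goes from 1 to 2.
second_reentrant = reentrant.acquire(blocking=False)
reentrant.release()  # counter back to 1
reentrant.release()  # counter back to 0, other threads may now acquire

print(second_plain)      # False
print(second_reentrant)  # True
```

Each acquire on an RLock needs its own release; the lock becomes available to other threads only after the last one.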

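    Semaphore

    It works the same way as the Semaphore for processes.
    A Semaphore manages an internal counter:
    each call to acquire() decrements the counter by 1;
    each call to release() increments the counter by 1;
    the counter can never drop below 0; when it is 0, acquire() blocks the thread until another thread calls release().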

    from threading import Thread,Semaphore
    import time
    
    def work(id):
        with sem:
            time.sleep(2)
            print('%s say hello' %id)
    
    if __name__ == '__main__':
        sem = Semaphore(5)
        for i in range(20):
            t = Thread(target=work,args=(i,))
            t.start()
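
One way to observe the counter at work is to record how many threads are inside the guarded block at the same time. This is a sketch; `LIMIT`, `running` and `peak` are names invented for the illustration, and a separate Lock protects the bookkeeping itself.

```python
import time
from threading import Thread, Semaphore, Lock

LIMIT = 3
sem = Semaphore(LIMIT)
stats_lock = Lock()  # protects the two counters below
running = 0          # threads currently inside the semaphore
peak = 0             # highest value `running` ever reached

def work():
    global running, peak
    with sem:  # acquire() on entry, release() on exit
        with stats_lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.05)
        with stats_lock:
            running -= 1

threads = [Thread(target=work) for _ in range(9)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print('peak concurrency:', peak)  # never exceeds LIMIT
```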

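    Event

    event.isSet(): returns the event's state (is_set() is the current spelling; isSet() is the older alias);
    event.wait(): blocks the thread while event.isSet() == False;
    event.set(): sets the event's state to True; every thread blocked in wait() is woken up, becomes ready, and waits for the operating system to schedule it;
    event.clear(): resets the event's state to False.
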
    from threading import Event ,Thread
    import threading
    import time
    
    def conn_mysql():
        print('%s waiting...' % threading.current_thread().getName())
        print(e.isSet()) #False
        e.wait()
        print('%s start to connect mysql...' % threading.current_thread().getName())
        print(e.isSet()) #True
        time.sleep(2)
    
    def check_mysql():
        print('%s is checking...' % threading.current_thread().getName())
        time.sleep(3)
        print(e.isSet()) #False
        e.set()
        print(e.isSet()) #True
    
    if __name__ == '__main__':
        e = Event()
        t1 = Thread(target=conn_mysql)
        t2 = Thread(target=conn_mysql)
        t3 = Thread(target=conn_mysql)
        t4 = Thread(target=check_mysql)
        t1.start()
        t2.start()
        t3.start()
        t4.start()
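
`wait()` also accepts a timeout and reports through its return value whether the flag was set. A minimal sketch, with names (`ready`, `waiter`, `seen`) invented for the illustration:

```python
import threading

ready = threading.Event()

# Nobody has called set() yet, so this returns False after 0.05s.
first = ready.wait(timeout=0.05)

seen = []

def waiter():
    # Blocks until set() is called (or 5 seconds pass) and records the result.
    seen.append(ready.wait(timeout=5))

t = threading.Thread(target=waiter)
t.start()
ready.set()    # wakes the waiter
t.join()

ready.clear()  # back to the unset state
after_clear = ready.is_set()

print(first, seen, after_clear)  # False [True] False
```

A timeout like this lets a worker give up, or retry, if the condition it is waiting for never arrives.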

    Timer

    Runs an operation after a delay of n seconds
    
    from threading import Timer
    
    def hello():
        print('hello, world')
    t = Timer(3,hello)
    t.start()
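
A Timer that has not fired yet can be stopped with `cancel()`. The sketch below is illustrative only (the `fired` list and the labels are made up); it cancels one timer and lets another run.

```python
from threading import Timer

fired = []

cancelled = Timer(5, fired.append, args=('cancelled timer',))
cancelled.start()
cancelled.cancel()  # stops the timer while it is still waiting
cancelled.join()    # returns promptly because the wait was cancelled

quick = Timer(0.05, fired.append, args=('quick timer',))
quick.start()
quick.join()        # Timer is a Thread subclass, so join() waits for the callback

print(fired)        # ['quick timer']
```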

    Thread queues (the queue module)

    import queue
    
    q = queue.Queue() # first in, first out ---> queue
    
    q.put('first')
    q.put('second')
    q.put((1,2,3,4))
    
    print(q.get())
    print(q.get())
    print(q.get())
    
    q = queue.LifoQueue() # last in, first out ---> stack
    
    q.put('first')
    q.put('second')
    q.put((1,2,3,4))
    
    print(q.get())
    print(q.get())
    print(q.get())
    
    q = queue.PriorityQueue() # priority queue: the smaller the number, the higher the priority
    
    q.put((1,'a'))
    q.put((4,'b'))
    q.put((3,'c'))
    
    print(q.get())
    print(q.get())
    print(q.get())
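
These queues are thread-safe, so they can hand work from one thread to another without an explicit Lock. A minimal producer/consumer sketch follows; the sentinel convention and the names `producer`, `consumer` and `formatted` are assumptions made for the example.

```python
import queue
from threading import Thread

q = queue.Queue()
SENTINEL = None  # special value that tells the consumer to stop
formatted = []

def producer():
    for word in ['first', 'second', 'third']:
        q.put(word)
    q.put(SENTINEL)

def consumer():
    while True:
        item = q.get()  # blocks until an item is available, no busy-waiting
        if item is SENTINEL:
            break
        formatted.append(item.upper())

threads = [Thread(target=producer), Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(formatted)  # ['FIRST', 'SECOND', 'THIRD']
```

Because `get()` blocks, the consumer sleeps while the queue is empty instead of spinning in a `while True` loop.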

    Coroutines

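    Definition: concurrency within a single thread, also called a micro-thread. A coroutine is a lightweight user-space thread, i.e. its scheduling is controlled by the user program itself.
              The key to implementing coroutines is that the user program controls the switching itself. Before switching, the program must save the state of the coroutine's last call, so that each time it is resumed it continues from where it left off.
             (In detail: a coroutine has its own register context and stack. On a switch, the register context and stack are saved elsewhere; on switching back, the previously saved register context and stack are restored.)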

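    Criteria for a coroutine (meeting 1, 2 and 3 is enough to call it a coroutine):

      1. Concurrency must be achieved within a single thread
      2. Modifying shared data requires no locking
      3. The user program itself keeps the context stacks of the multiple control flows
      4. Extra: a coroutine switches to another coroutine automatically when it hits an I/O operation (detecting I/O is something neither yield nor greenlet can do, which is where the gevent module (select mechanism) comes in)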

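    To emphasize:
      1. Python threads are kernel-level, i.e. scheduled by the operating system (for example, once a thread blocks on I/O it is forced to give up the CPU and another thread is switched in)
      2. With coroutines inside a single thread, switching on I/O is controlled at the application level (not by the operating system)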

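    Compared with thread switching controlled by the operating system, switching coroutines under user control within a single thread has these advantages:
      1. Switching coroutines is cheaper: it is a program-level switch that the operating system is completely unaware of, hence more lightweight
      2. Concurrency is achieved within a single thread, making maximum use of the CPU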
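
User-controlled switching can be sketched with plain generators and a round-robin loop. This is a toy scheduler written for illustration (the names `task` and `run_all` are invented); it does not detect I/O, it only shows the program itself deciding when to switch.

```python
from collections import deque

log = []

def task(name, steps):
    for i in range(steps):
        log.append('%s step %d' % (name, i))
        yield  # hand control back to the scheduler; the local state (i) is kept

def run_all(tasks):
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # resume the task until its next yield
            ready.append(current)  # not finished: put it at the back of the line
        except StopIteration:
            pass                   # finished: drop it

run_all([task('A', 2), task('B', 3)])
print(log)
# ['A step 0', 'B step 0', 'A step 1', 'B step 1', 'B step 2']
```

Both tasks make progress in turn inside one thread, and no lock is needed because only one of them runs at any moment.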


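    yield:
    1. yield can save state. The way yield saves state is much like the way the operating system saves thread state, but yield is controlled at the code level and is more lightweight
    2. send can pass the result of one function to another function, which is how switching between routines within a single thread is achieved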
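
Both points can be seen in a small generator that keeps a running average. This is an illustrative sketch; `averager` is a name chosen for the example. The local variables survive between calls, and `send` delivers each new value to the paused `yield`.

```python
def averager():
    total = 0
    count = 0
    average = None
    while True:
        # Pause here, hand `average` to the caller, and resume
        # when the caller sends the next value.
        value = yield average
        total += value
        count += 1
        average = total / count

gen = averager()
next(gen)  # prime the generator: run it up to the first yield
results = [gen.send(v) for v in (10, 20, 30)]
print(results)  # [10.0, 15.0, 20.0]
```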

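    Drawbacks:
    Coroutines are by nature single-threaded and cannot use multiple cores on their own; a program can, however, start multiple processes, with multiple threads in each process and coroutines in each thread
    Coroutines all run in one thread, so once a coroutine blocks, the whole thread is blocked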

    Without yield:
    
    from threading import Thread
    import time
    
    def consumer(item):
        print(item)  # note: printing dominates this timing; the yield version below does not print
        x = 1
        y = 2
        z = 3
    
    def producer(target,seq):
        for item in seq:
            target(item)
    
    s_time = time.time()
    producer(consumer,range(500000))
    e_time = time.time()
    print('run time %s' % (e_time - s_time)) #4.764272451400757
    
    With yield:
    
    from threading import Thread
    import time
    
    def consumer():
        x = 1
        y = 2
        z = 3
        while True:
            item = yield
    
    def producer(target,seq):
        for item in seq:
            target.send(item)
    
    g=consumer()
    next(g)
    s_time = time.time()
    producer(g,range(500000))
    e_time = time.time()
    print('run time %s' % (e_time - s_time)) #run time 0.12200713157653809

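    The greenlet module

    greenlet is a coroutine module implemented in C. Compared with Python's built-in yield, it lets you switch freely between arbitrary functions without first declaring them as generators.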

    from greenlet import greenlet
    
    def test1():
        print('test1,1')
        gr2.switch()
        print('test1,2')
        gr2.switch()
    def test2():
        print('test2,1')
        gr1.switch()
        print('test2,2')
    gr1 = greenlet(test1)
    gr2 = greenlet(test2)
    gr1.switch()

    The gevent module

    Switches automatically on I/O within a single thread
    
    from gevent import monkey
    monkey.patch_all()
    import gevent
    import time
    
    def eat(name):
        print('%s eat food first' % name)
        # gevent.sleep(2)
        time.sleep(2)
        print('%s eat food second' % name)
    
    def play(name):
        print('%s play phone 1' % name)
        # gevent.sleep(1)
        time.sleep(1)
        print('%s play phone 2' % name)
    
    def drink(name):
        print('%s is drinking' % name)
        # gevent.sleep(4)
        time.sleep(4)
        print('%s is drinking' % name)
    
    g1 = gevent.spawn(eat,'Albert')
    g2 = gevent.spawn(play,'Albert')
    g3 = gevent.spawn(drink,'Albert')
    g1.join()
    g2.join()
    g3.join()
    print('main thread')
    
    
    Fetching web pages concurrently with coroutines
    
    from gevent import monkey
    monkey.patch_all()
    import gevent
    import requests
    import time
    
    def get_page(url):
        print('GET Page: %s' % url)
        res = requests.get(url)
        if res.status_code == 200:
            print(res.text)
    
    s_time = time.time()
    gevent.joinall([gevent.spawn(get_page,'https://www.python.org/'),
                    gevent.spawn(get_page,'https://github.com/')])
    e_time = time.time()
    print('run time %s' % (e_time - s_time))

    Concurrent sockets in a single thread

    from gevent import monkey
    monkey.patch_all()
    from socket import *
    import gevent
    
    def server(ip,port):
        s = socket(AF_INET,SOCK_STREAM)
        s.setsockopt(SOL_SOCKET,SO_REUSEADDR,1)
        s.bind((ip,port))
        s.listen(5)
        while True:
            conn,addr = s.accept()
            gevent.spawn(talk,conn,addr)
    
    def talk(conn,addr):
        try:
            while True:
                res = conn.recv(1024)
                if not res: break # empty bytes mean the client closed the connection
                print('client %s : %s msg: %s' % (addr[0],addr[1],res))
                conn.send(res.upper())
        except Exception as e:
            print(e)
        finally:
            conn.close()
    
    if __name__ == '__main__':
        server('127.0.0.1',8080)
    Server (code above)
    from threading import Thread
    from socket import *
    import threading
    
    def client(ip,port):
        c = socket(AF_INET,SOCK_STREAM)
        c.connect((ip,port))
    
        count = 0
        while True:
            c.send(('%s say hello %s' % (threading.current_thread().getName(),count)).encode())
            msg = c.recv(1024)
            print(msg.decode())
            count += 1
    
    if __name__ == '__main__':
    
        for i in range(100):
            t = Thread(target=client,args=('127.0.0.1',8080))
            t.start()
    Client (code above)

    socketserver

    import socketserver
    
    class MyHandler(socketserver.BaseRequestHandler):
        def handle(self):
            while True:
                res = self.request.recv(1024)
                if not res: break # empty bytes mean the client closed the connection
                print('client %s msg: %s' % (self.client_address,res))
                self.request.send(res.upper())
    
    if __name__ == '__main__':
        s = socketserver.ThreadingTCPServer(('127.0.0.1',8080),MyHandler)
        s.serve_forever()
    Server (code above)
    import socket
    
    client = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
    client.connect(('127.0.0.1',8080))
    
    while True:
        msg = input('>>>:').strip()
        if not msg: continue
        client.send(msg.encode())
        back_msg = client.recv(1024)
        print(back_msg.decode())
    Client (code above)
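
For a quick check without two terminals, server and client can run in one script. The sketch below makes some assumptions: port 0 asks the operating system for any free port (the notes use 8080), the handler name `UpperHandler` is invented, and the server runs in a background thread.

```python
import socket
import socketserver
import threading

class UpperHandler(socketserver.BaseRequestHandler):
    def handle(self):
        while True:
            data = self.request.recv(1024)
            if not data:  # empty bytes: the client closed the connection
                break
            self.request.sendall(data.upper())

# Port 0 lets the OS pick a free port (an assumption made for this demo).
server = socketserver.ThreadingTCPServer(('127.0.0.1', 0), UpperHandler)
server.daemon_threads = True  # handler threads will not block interpreter exit
host, port = server.server_address
threading.Thread(target=server.serve_forever, daemon=True).start()

with socket.create_connection((host, port)) as client:
    client.sendall(b'hello')
    reply = client.recv(1024)

server.shutdown()      # stop the serve_forever() loop
server.server_close()  # release the listening socket
print(reply)           # b'HELLO'
```

ThreadingTCPServer starts one thread per connection, so several clients can be served at once without writing the accept loop by hand.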

    UDP-based sockets

    # Without concurrency
    # from socket import *
    #
    # s = socket(AF_INET,SOCK_DGRAM)
    # s.setsockopt(SOL_SOCKET,SO_REUSEADDR,1)
    # s.bind(('127.0.0.1',8080))
    #
    # while True:
    #     msg,addr = s.recvfrom(1024)
    #     print(msg)
    #     s.sendto(msg.upper(),addr)
    
    # With concurrency, based on socketserver
    
    import socketserver
    
    class MyUDPhandler(socketserver.BaseRequestHandler):
        def handle(self):
            client_msg,s = self.request
            s.sendto(client_msg.upper(),self.client_address)
    
    if __name__ == '__main__':
        s = socketserver.ThreadingUDPServer(('127.0.0.1',8080),MyUDPhandler)
        s.serve_forever()
    Server (code above)
    from socket import *
    
    c = socket(AF_INET,SOCK_DGRAM)
    
    while True:
        msg = input('>>>:').strip()
        c.sendto(msg.encode(),('127.0.0.1',8080))
        back_msg,addr= c.recvfrom(1024)
        print('from server %s:%s' % (addr,back_msg.decode()))
    Client (code above)
  • Original URL: https://www.cnblogs.com/paodanke/p/7115571.html