Python Notes: Processes (Process), Threads (Thread), and Locks

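1. To the operating system, a task is a process (Process). For example, opening a browser starts a browser process, and opening Notepad starts a Notepad process.

2. When a process needs to do several things at once, it runs multiple "subtasks" concurrently. These subtasks inside a process are called threads (Thread). Word, for example, can handle typing, spell checking, and printing at the same time.

3. A thread is the smallest unit of execution, and every process consists of at least one thread.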

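Multiprocessing

1. Unix/Linux: multiple processes are created with the fork() system call.

2. Windows has no fork(). The multiprocessing module is the cross-platform way to use multiple processes; it provides a Process class that represents a process object.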

    # Start a child process and wait for it to finish:
    from multiprocessing import Process
    import os

    # Code executed by the child process
    def run_proc(name):
        print('Run child process %s (%s)...' % (name, os.getpid()))

    # Main entry point
    if __name__ == '__main__':
        print('Parent process %s.' % os.getpid())
        # To create a child process, pass in the function to run and its
        # arguments, create a Process instance, and start it with start().
        p = Process(target=run_proc, args=('test',))
        print('Child process will start.')
        p.start()
        # join() waits for the child to finish before continuing;
        # it is commonly used to synchronize processes.
        p.join()
        print('Child process end.')

Output:

    Parent process 928.
    Child process will start.
    Run child process test (929)...
    Child process end.
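For comparison, the fork() call mentioned in item 1 can be sketched directly with the os module. This is only a minimal illustration and runs on Unix-like systems only; the helper name fork_and_wait and the exit code 3 are made up for the example.

```python
import os
import sys


def fork_and_wait(exit_code=0):
    """Fork a child that exits with exit_code; return the code the parent observes."""
    sys.stdout.flush()  # flush first so buffered output is not duplicated in the child
    pid = os.fork()     # returns 0 in the child and the child's pid in the parent
    if pid == 0:
        # Child branch: report, then leave immediately without running parent code.
        try:
            print('Child process %s, parent is %s.' % (os.getpid(), os.getppid()),
                  flush=True)
        finally:
            os._exit(exit_code)
    # Parent branch: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == '__main__':
    print('Parent process %s.' % os.getpid())
    print('Child exited with code %s.' % fork_and_wait(3))
```

A single call to os.fork() returns twice, once in each process, which is why the code branches on the returned pid.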

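Inter-process communication

1. Processes need to communicate with each other. Python's multiprocessing module wraps the underlying OS mechanisms and provides several ways to exchange data, including Queue and Pipe.

Taking Queue as an example, the parent process creates two child processes: one writes data to the Queue and the other reads from it: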

    from multiprocessing import Process, Queue
    import os, time, random

    # Code executed by the writer process:
    def write(q):
        print('Process to write: %s' % os.getpid())
        for value in ['A', 'B', 'C']:
            print('Put %s to queue...' % value)
            q.put(value)
            time.sleep(random.random())

    # Code executed by the reader process:
    def read(q):
        print('Process to read: %s' % os.getpid())
        while True:
            value = q.get(True)
            print('Get %s from queue.' % value)

    if __name__ == '__main__':
        # The parent creates the Queue and passes it to each child:
        q = Queue()
        pw = Process(target=write, args=(q,))
        pr = Process(target=read, args=(q,))
        # Start the writer child pw:
        pw.start()
        # Start the reader child pr:
        pr.start()
        # Wait for pw to finish:
        pw.join()
        # pr runs an infinite loop and cannot be joined, so terminate it forcibly:
        pr.terminate()

Output:

    Process to write: 50563
    Put A to queue...
    Process to read: 50564
    Get A from queue.
    Put B to queue...
    Get B from queue.
    Put C to queue...
    Get C from queue.
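The example above stops the reader with terminate() because its loop never ends. A possible alternative, sketched here under the assumption that None never appears as real data, is to have the writer send an end-of-stream marker so the reader can exit on its own and be joined normally. The SENTINEL name and the extra values parameter are not part of the original example.

```python
from multiprocessing import Process, Queue

SENTINEL = None  # assumed end-of-stream marker; real data must never be None


def write(q, values):
    """Put every value on the queue, then the sentinel."""
    for value in values:
        print('Put %s to queue...' % value)
        q.put(value)
    q.put(SENTINEL)


def read(q):
    """Consume values until the sentinel arrives and return what was read."""
    received = []
    while True:
        value = q.get()
        if value is SENTINEL:
            break
        print('Get %s from queue.' % value)
        received.append(value)
    return received


if __name__ == '__main__':
    q = Queue()
    pw = Process(target=write, args=(q, ['A', 'B', 'C']))
    pr = Process(target=read, args=(q,))
    pw.start()
    pr.start()
    pw.join()
    pr.join()  # returns because read() stops at the sentinel
    print('Reader exit code: %s' % pr.exitcode)
```

With one reader, one sentinel is enough; with several readers, the writer would need to send one sentinel per reader.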

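Multithreading

1. Python's standard library provides two modules: _thread (low-level) and threading (high-level, a wrapper around _thread). In the vast majority of cases, the high-level threading module is all we need.

2. To start a thread, pass a function to a new Thread instance and call start() to begin execution: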

    import time, threading

    # Code executed by the new thread:
    def loop():
        print('thread %s is running...' % threading.current_thread().name)
        n = 0
        while n < 5:
            n = n + 1
            print('thread %s >>> %s' % (threading.current_thread().name, n))
            time.sleep(1)
        print('thread %s ended.' % threading.current_thread().name)

    print('thread %s is running...' % threading.current_thread().name)
    t = threading.Thread(target=loop, name='LoopThread')
    t.start()
    t.join()
    print('thread %s ended.' % threading.current_thread().name)

Output:

    thread MainThread is running...
    thread LoopThread is running...
    thread LoopThread >>> 1
    thread LoopThread >>> 2
    thread LoopThread >>> 3
    thread LoopThread >>> 4
    thread LoopThread >>> 5
    thread LoopThread ended.
    thread MainThread ended.

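Every process starts with one thread by default (the main thread), and the main thread can start new threads. current_thread() always returns the instance of the thread that calls it. The main thread's instance is named MainThread; a child thread's name is given when it is created. The name is only used for display and has no other meaning. If no name is given, Python names the threads automatically as Thread-1, Thread-2, and so on.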
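The naming rules can be checked with a small sketch. The helper names report and collect_names are made up for this illustration; only threading.Thread and threading.current_thread() come from the standard library.

```python
import threading


def report(names, key):
    # Record the name of whichever thread is running this function.
    names[key] = threading.current_thread().name


def collect_names():
    names = {}
    report(names, 'caller')  # runs in the calling thread
    named = threading.Thread(target=report, args=(names, 'named'), name='LoopThread')
    unnamed = threading.Thread(target=report, args=(names, 'unnamed'))
    for t in (named, unnamed):
        t.start()
    for t in (named, unnamed):
        t.join()
    return names


if __name__ == '__main__':
    for key, name in collect_names().items():
        print('%s -> %s' % (key, name))
```

When run as a script, the caller entry is MainThread. The automatic name always starts with Thread- followed by a number, but its exact form depends on the Python version: newer versions also append the target function's name.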

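3. Multiprocessing: each process has its own copy of a given variable, and the copies do not affect one another.
Multithreading: global variables and other shared objects are visible to every thread in the process, so any of them can be modified by any thread. The greatest danger in sharing data between threads is therefore that several threads modify the same variable at the same time and corrupt its contents.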

    # See how several threads operating on one variable can corrupt it
    import time, threading

    # Suppose this is your bank balance:
    balance = 0

    def change_it(n):
        # Deposit first, then withdraw; the result should be 0:
        global balance
        balance = balance + n
        balance = balance - n

    def run_thread(n):
        for i in range(100000):
            change_it(n)

    t1 = threading.Thread(target=run_thread, args=(5,))
    t2 = threading.Thread(target=run_thread, args=(8,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print(balance)

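In theory, balance should end up as 0. However, thread scheduling is decided by the operating system, so when t1 and t2 run alternately and the loop count is large enough, the final value of balance is not necessarily 0.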

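The reason is that a single statement in a high-level language is executed as several lower-level instructions. Even a simple calculation such as:

    balance = balance + n

takes two steps:

Compute balance + n and store the result in a temporary variable;
Assign the value of the temporary variable to balance.

In other words, it can be viewed as:

    x = balance + n
    balance = x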

Because x is a local variable, each thread has its own x. When the code runs in the expected order:

    # Initial value: balance = 0

    t1: x1 = balance + 5  # x1 = 0 + 5 = 5
    t1: balance = x1      # balance = 5
    t1: x1 = balance - 5  # x1 = 5 - 5 = 0
    t1: balance = x1      # balance = 0

    t2: x2 = balance + 8  # x2 = 0 + 8 = 8
    t2: balance = x2      # balance = 8
    t2: x2 = balance - 8  # x2 = 8 - 8 = 0
    t2: balance = x2      # balance = 0

    # Result: balance = 0

However, t1 and t2 run alternately. If the operating system schedules them in the following order:

    # Initial value: balance = 0

    t1: x1 = balance + 5  # x1 = 0 + 5 = 5

    t2: x2 = balance + 8  # x2 = 0 + 8 = 8
    t2: balance = x2      # balance = 8

    t1: balance = x1      # balance = 5
    t1: x1 = balance - 5  # x1 = 5 - 5 = 0
    t1: balance = x1      # balance = 0

    t2: x2 = balance - 8  # x2 = 0 - 8 = -8
    t2: balance = x2      # balance = -8

    # Result: balance = -8
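The second schedule can be replayed deterministically in a single thread by writing out each step in the order shown. This is only a replay of the trace, not real concurrency, and the function name replay_bad_interleaving is made up for the illustration.

```python
def replay_bad_interleaving():
    """Execute the steps of t1 and t2 in the problematic order from the trace."""
    balance = 0
    x1 = balance + 5  # t1 reads balance while it is still 0
    x2 = balance + 8  # t2 also reads 0
    balance = x2      # t2 writes 8
    balance = x1      # t1 overwrites it with 5, so t2's deposit is lost
    x1 = balance - 5  # t1 computes 0
    balance = x1      # t1 writes 0
    x2 = balance - 8  # t2 computes -8 from the wrong balance
    balance = x2      # t2 writes -8
    return balance


if __name__ == '__main__':
    print(replay_bad_interleaving())
```

The lost update happens at the fourth step, where t1 writes a value computed from a stale read.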

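The result goes wrong because modifying balance takes several statements, and a thread may be interrupted partway through them, so that multiple threads corrupt the contents of the same object.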
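One way to see that the update is not a single step, assuming the CPython interpreter, is to disassemble it with the standard dis module. The function name deposit and the helper opnames are made up for this sketch, and the exact instruction names vary between Python versions.

```python
import dis

balance = 0


def deposit(n):
    global balance
    balance = balance + n


def opnames(func):
    """Return the names of the bytecode instructions that make up func."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == '__main__':
    dis.dis(deposit)  # prints one line per bytecode instruction
```

In the listing, the instruction that loads the global balance and the instruction that stores it back are separate, with the addition in between, so another thread can run between the read and the write.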

We must ensure that while one thread is modifying balance, no other thread can modify it.

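4. To make sure balance is computed correctly, we put a lock around change_it(). When a thread starts executing change_it(), it holds the lock, so other threads cannot execute change_it() at the same time; they must wait until the lock is released and can proceed only after acquiring it themselves. Because there is only one lock, at most one thread holds it at any moment no matter how many threads there are, so the modifications cannot conflict.

A lock is created with threading.Lock():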

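    import threading

    balance = 0
    lock = threading.Lock()

    # change_it() is the function defined in the previous example.
    def run_thread(n):
        for i in range(100000):
            # Acquire the lock first:
            lock.acquire()
            try:
                # Now it is safe to modify balance:
                change_it(n)
            finally:
                # Always release the lock when done:
                lock.release()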

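When several threads call lock.acquire() at the same time, only one of them acquires the lock and continues; the others keep waiting until they acquire it.

5. A thread that holds the lock must release it when it is done; otherwise the threads waiting for the lock will wait forever and become dead threads. That is why we use try...finally to guarantee that the lock is always released.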
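The same acquire and release pairing can also be written with the lock as a context manager, which releases the lock on exit even if the body raises. The sketch below is a complete version of the bank-balance example in that style; the run_demo helper and the loops parameter are additions for this illustration.

```python
import threading

balance = 0
lock = threading.Lock()


def change_it(n):
    # Deposit first, then withdraw; the net effect should be 0.
    global balance
    balance = balance + n
    balance = balance - n


def run_thread(n, loops):
    for _ in range(loops):
        # "with lock" acquires on entry and always releases on exit.
        with lock:
            change_it(n)


def run_demo(loops=100000):
    t1 = threading.Thread(target=run_thread, args=(5, loops))
    t2 = threading.Thread(target=run_thread, args=(8, loops))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return balance


if __name__ == '__main__':
    print(run_demo())
```

Because every call to change_it() runs while the lock is held, the two threads can no longer interleave inside it, and the final balance is 0.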

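6. Multiprocess model: highly stable (a crashed child process does not affect the main process or the other children, although if the main process itself dies, everything goes down with it). However, creating a process is expensive, and the number of processes the operating system can run at once is limited: under memory and CPU constraints, with several thousand processes running simultaneously the OS struggles even to schedule them.

7. Multithreaded model: usually slightly faster than the multiprocess model, but not by much. In addition, any single thread that crashes can bring down the entire process, because all threads share the process's memory.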

Original article: https://www.cnblogs.com/fengff/p/9606255.html
