  • Python threading usage patterns

    Reference reading: http://www.ibm.com/developerworks/aix/library/au-threadingpython/

    A small example:

        import threading
        import datetime

        class ThreadClass(threading.Thread):
            def run(self):
                # runs in the new thread once start() is called
                now = datetime.datetime.now()
                print "%s says Hello World at time: %s" % (self.getName(), now)

        for i in range(2):
            t = ThreadClass()
            t.start()
     
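    A custom thread class must inherit from threading.Thread and override the run method. Calling start() creates the new thread, which then executes run().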
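
    As a minimal sketch of the same subclassing idea in Python 3 syntax (the listing above uses the Python 2 print statement), the following stores each greeting on the thread object and uses join() to wait for the threads before reading the results. The names GreeterThread and greet are hypothetical and chosen only for illustration.

```python
import datetime
import threading


class GreeterThread(threading.Thread):
    """Hypothetical example class: stores a greeting instead of printing it."""

    def run(self):
        # run() executes in the new thread after start() is called
        now = datetime.datetime.now()
        self.message = "%s says Hello World at time: %s" % (self.name, now)


def greet(count=2):
    """Start `count` threads, wait for all of them, and return their messages."""
    threads = [GreeterThread() for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # block until this thread's run() has returned
    return [t.message for t in threads]


if __name__ == "__main__":
    for line in greet():
        print(line)
```

    Reading t.message only after join() avoids touching the attribute before run() has set it.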
     

    Noah Gift recommends using the queue pattern when working with threads in Python:

        #!/usr/bin/env python
        import Queue
        import threading
        import urllib2
        import time

        hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com",
                 "http://ibm.com", "http://apple.com"]

        queue = Queue.Queue()

        class ThreadUrl(threading.Thread):
            """Threaded Url Grab"""
            def __init__(self, queue):
                threading.Thread.__init__(self)
                self.queue = queue

            def run(self):
                while True:
                    # grab a host from the queue (blocks until one is available)
                    host = self.queue.get()

                    # fetch the URL and print the first 1024 bytes of the page
                    url = urllib2.urlopen(host)
                    print url.read(1024)

                    # signal to the queue that this job is done
                    self.queue.task_done()

        start = time.time()

        def main():
            # spawn a pool of threads and pass them the queue instance
            for i in range(5):
                t = ThreadUrl(queue)
                t.setDaemon(True)
                t.start()

            # populate the queue with data
            for host in hosts:
                queue.put(host)

            # wait on the queue until everything has been processed
            queue.join()

        main()
        print "Elapsed Time: %s" % (time.time() - start)

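    This example shows the pattern for using a queue:

    1. Create a queue instance with Queue.Queue() and use it to hold the data to be processed.

    2. Pass that queue instance into the thread class.

    3. Spawn a pool of daemon threads.

    4. In each thread's run method, take one item from the queue at a time and do the work on that item.

    5. When the work is finished, call queue.task_done() to signal to the queue that the task is complete.

    6. Call join() on the queue. This blocks the main program until task_done() has been called for every item that was put on the queue, and only then lets it exit.

    The threads are marked as daemons so that the main program can exit while only daemon threads are still running. Because each worker loops forever in run, the program would otherwise never terminate; the daemon flag keeps the control flow simple.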
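
    The six steps can be sketched without any network access, assuming Python 3, where the Queue module is named queue. The Worker class, the square_all function and the squaring task are hypothetical stand-ins for the URL fetch in the listing above.

```python
import queue
import threading


class Worker(threading.Thread):
    """Hypothetical worker: squares each number taken from the queue."""

    def __init__(self, tasks, results, lock):
        # daemon=True: these threads do not keep the interpreter alive
        super().__init__(daemon=True)
        self.tasks = tasks
        self.results = results
        self.lock = lock

    def run(self):
        while True:
            item = self.tasks.get()  # blocks until an item is available
            try:
                with self.lock:
                    self.results.append(item * item)
            finally:
                # always mark the item as processed so join() can return
                self.tasks.task_done()


def square_all(numbers, pool_size=3):
    tasks = queue.Queue()            # step 1: create the queue
    results = []
    lock = threading.Lock()
    for _ in range(pool_size):       # steps 2 and 3: pool of daemon threads
        Worker(tasks, results, lock).start()
    for n in numbers:
        tasks.put(n)
    tasks.join()                     # step 6: wait for every task_done()
    return sorted(results)


if __name__ == "__main__":
    print(square_all([1, 2, 3, 4]))
```

    Wrapping the work in try/finally is a design choice of this sketch, not part of the original listing: it ensures task_done() is still called if the work raises, so join() does not wait forever on a failed item.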

    Chained processing:

        import Queue
        import threading
        import urllib2
        import time
        from BeautifulSoup import BeautifulSoup

        hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com",
                 "http://ibm.com", "http://apple.com"]

        queue = Queue.Queue()
        out_queue = Queue.Queue()

        class ThreadUrl(threading.Thread):
            """Threaded Url Grab"""
            def __init__(self, queue, out_queue):
                threading.Thread.__init__(self)
                self.queue = queue
                self.out_queue = out_queue

            def run(self):
                while True:
                    # grab a host from the queue
                    host = self.queue.get()

                    # fetch the URL and read the whole page as one chunk
                    url = urllib2.urlopen(host)
                    chunk = url.read()

                    # place the chunk into the out queue
                    self.out_queue.put(chunk)

                    # signal to the queue that this job is done
                    self.queue.task_done()

        class DatamineThread(threading.Thread):
            """Threaded data mining of fetched pages"""
            def __init__(self, out_queue):
                threading.Thread.__init__(self)
                self.out_queue = out_queue

            def run(self):
                while True:
                    # grab a chunk from the out queue
                    chunk = self.out_queue.get()

                    # parse the chunk and print its title tags
                    soup = BeautifulSoup(chunk)
                    print soup.findAll(['title'])

                    # signal to the out queue that this job is done
                    self.out_queue.task_done()

        start = time.time()

        def main():
            # spawn a pool of fetcher threads and pass them both queues
            for i in range(5):
                t = ThreadUrl(queue, out_queue)
                t.setDaemon(True)
                t.start()

            # populate the queue with data
            for host in hosts:
                queue.put(host)

            # spawn a pool of parser threads that read from the out queue
            for i in range(5):
                dt = DatamineThread(out_queue)
                dt.setDaemon(True)
                dt.start()

            # wait on both queues until everything has been processed
            queue.join()
            out_queue.join()

        main()
        print "Elapsed Time: %s" % (time.time() - start)

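    As these examples show, driving threads through a queue is simple and convenient, and the pattern can be extended by chaining queues so that the output of one pool feeds the next. The small program above can be seen as a basic building block of search engines and data mining.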
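
    The chaining idea can be sketched in Python 3 with two queues and no network access. This sketch uses plain target functions rather than Thread subclasses, and the names stage and run_pipeline, along with the upper-casing and length-counting steps, are hypothetical stand-ins for fetching and parsing.

```python
import queue
import threading


def stage(inbox, handle):
    """Loop forever: take an item from inbox, process it, mark it done."""
    while True:
        item = inbox.get()
        try:
            handle(item)
        finally:
            inbox.task_done()


def run_pipeline(words, pool_size=2):
    first = queue.Queue()    # raw input
    second = queue.Queue()   # output of stage one, input of stage two
    results = []
    lock = threading.Lock()

    def produce(word):
        # stage one: transform the item and hand it to the next queue
        second.put(word.upper())

    def consume(text):
        # stage two: record the transformed item and its length
        with lock:
            results.append((text, len(text)))

    for inbox, handle in ((first, produce), (second, consume)):
        for _ in range(pool_size):
            threading.Thread(target=stage, args=(inbox, handle),
                             daemon=True).start()

    for w in words:
        first.put(w)

    first.join()    # every word has been handed to the second queue
    second.join()   # every handed-over item has been consumed
    return sorted(results)


if __name__ == "__main__":
    print(run_pipeline(["yahoo", "ibm"]))
```

    The order of the two join() calls matters: stage one puts its result on the second queue before calling task_done(), so once the first join() returns, everything that will ever reach the second queue is already on it.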

  • Original article: https://www.cnblogs.com/westwind/p/2520632.html