Original: http://sdiehl.github.com/gevent-tutorial/
gevent For the Working Python Developer
Written by the Gevent Community
gevent is a concurrency library based around libev. It provides a clean API for a variety of concurrency and network-related tasks.
Introduction
The structure of this tutorial assumes an intermediate-level knowledge of Python but not much else. No knowledge of concurrency is expected. The goal is to give you the tools you need to get going with gevent, help you tame your existing concurrency problems, and start writing asynchronous applications today.
Contributors
In chronological order of contribution: Stephen Diehl, Jérémy Bethmont, sww, Bruno Bigras, David Ripton, Travis Cline, Boris Feld, youngsterxyf, Eddie Hebert, Alexis Metaireau
This is a collaborative document published under the MIT license. Have something to add? See a typo? Fork and issue a pull request on GitHub. Any and all contributions are welcome.
Core
Greenlets
The primary pattern used in gevent is the Greenlet, a lightweight coroutine provided to Python as a C extension module. Greenlets all run inside the OS process of the main program but are scheduled cooperatively. This differs from the parallelism constructs provided by the multiprocessing or threading libraries, which spin up processes and POSIX threads that are scheduled by the operating system and are truly parallel.
Synchronous & Asynchronous Execution
The core idea of concurrency is that a larger task can be broken down into a collection of subtasks whose operation does not depend on the other tasks and which can therefore be run asynchronously instead of one at a time, synchronously. A switch between two such executions is known as a context switch.
A context switch in gevent is done through yielding. In this example we have two contexts which yield to each other by invoking gevent.sleep(0).
import gevent

def foo():
    print('Running in foo')
    gevent.sleep(0)
    print('Explicit context switch to foo again')

def bar():
    print('Explicit context to bar')
    gevent.sleep(0)
    print('Implicit context switch back to bar')

gevent.joinall([
    gevent.spawn(foo),
    gevent.spawn(bar),
])
Running in foo
Explicit context to bar
Explicit context switch to foo again
Implicit context switch back to bar
It is illuminating to visualize the control flow of the program or walk through it with a debugger to see the context switches as they occur.
The real power of gevent comes when we use it for network and IO bound functions which can be cooperatively scheduled. Gevent takes care of the details to ensure that your network libraries implicitly yield their greenlet contexts whenever possible. I cannot stress enough what a powerful idiom this is. But maybe an example will illustrate.
import time
import gevent
from gevent import select

start = time.time()
tic = lambda: 'at %1.1f seconds' % (time.time() - start)

def gr1():
    # Waits in select for up to 2 seconds, yielding to other greenlets meanwhile
    print('Started Polling: ', tic())
    select.select([], [], [], 2)
    print('Ended Polling: ', tic())

def gr2():
    # Same 2 second wait; it overlaps with gr1 instead of running after it
    print('Started Polling: ', tic())
    select.select([], [], [], 2)
    print('Ended Polling: ', tic())

def gr3():
    print("Hey lets do some stuff while the greenlets poll, at", tic())
    gevent.sleep(1)

gevent.joinall([
    gevent.spawn(gr1),
    gevent.spawn(gr2),
    gevent.spawn(gr3),
])
Started Polling: at 0.0 seconds
Started Polling: at 0.0 seconds
Hey lets do some stuff while the greenlets poll, at at 0.0 seconds
Ended Polling: at 2.0 seconds
Ended Polling: at 2.0 seconds
A somewhat synthetic example defines a task function which is non-deterministic (i.e. it is not guaranteed to give the same result for the same inputs). In this case the side effect of running the function is that the task pauses its execution for a random, very short amount of time.
import gevent
import random

def task(pid):
    """
    Some non-deterministic task
    """
    gevent.sleep(random.randint(0,2)*0.001)
    print('Task', pid, 'done')

def synchronous():
    for i in range(1,10):
        task(i)

def asynchronous():
    threads = [gevent.spawn(task, i) for i in range(10)]
    gevent.joinall(threads)

print('Synchronous:')
synchronous()

print('Asynchronous:')
asynchronous()
Synchronous:
Task 1 done
Task 2 done
Task 3 done
Task 4 done
Task 5 done
Task 6 done
Task 7 done
Task 8 done
Task 9 done
Asynchronous:
Task 1 done
Task 6 done
Task 5 done
Task 0 done
Task 9 done
Task 8 done
Task 7 done
Task 4 done
Task 3 done
Task 2 done
In the synchronous case all the tasks are run sequentially, which results in the main program blocking (i.e. pausing the execution of the main program) while each task executes.
The important part of the program is gevent.spawn, which wraps the given function inside a Greenlet thread. The initialized greenlets are stored in the list threads, which is passed to the gevent.joinall function. That call blocks the current program while all the given greenlets run. Execution steps forward only when all the greenlets have terminated.
The important fact to notice is that the order of execution in the async case is essentially random and that the total execution time in the async case is much less than in the sync case. With the code above, the longest the synchronous case can take is when each task pauses for 0.002 seconds, resulting in roughly 0.02 seconds for the whole queue. In the async case the maximum runtime is roughly 0.002 seconds, since none of the tasks block the execution of the others.
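The timing claim can be checked with a small sketch. It swaps the random pause for a fixed one so that the numbers are repeatable. The names PAUSE, N_TASKS and timed are assumptions made for this illustration and are not part of the original example.

```python
import time
import gevent

PAUSE = 0.05    # hypothetical fixed pause per task, in seconds
N_TASKS = 5     # hypothetical number of tasks

def task():
    # Cooperative sleep: yields to other greenlets while waiting
    gevent.sleep(PAUSE)

def timed(fn):
    # Return the wall-clock time taken by fn()
    begin = time.time()
    fn()
    return time.time() - begin

def synchronous():
    # Each task must finish before the next one starts
    for _ in range(N_TASKS):
        task()

def asynchronous():
    # All tasks are spawned first, so their pauses overlap
    gevent.joinall([gevent.spawn(task) for _ in range(N_TASKS)])

sync_elapsed = timed(synchronous)
async_elapsed = timed(asynchronous)
print('synchronous:  %.2f seconds' % sync_elapsed)
print('asynchronous: %.2f seconds' % async_elapsed)
```

The synchronous total should come out near PAUSE * N_TASKS, while the asynchronous total should stay close to a single PAUSE.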
In a more common use case, fetching data from a server asynchronously, the runtime of fetch() will differ between requests depending on the load on the remote server.
import gevent.monkey
gevent.monkey.patch_socket()

import gevent
import urllib2  # Python 2 only; on Python 3 the equivalent module is urllib.request
import simplejson as json

def fetch(pid):
    response = urllib2.urlopen('http://json-time.appspot.com/time.json')
    result = response.read()
    json_result = json.loads(result)
    datetime = json_result['datetime']

    print('Process %s: %s' % (pid, datetime))
    return datetime

def synchronous():
    for i in range(1,10):
        fetch(i)

def asynchronous():
    threads = []
    for i in range(1,10):
        threads.append(gevent.spawn(fetch, i))
    gevent.joinall(threads)

print('Synchronous:')
synchronous()

print('Asynchronous:')
asynchronous()
Determinism
As mentioned previously, greenlets are deterministic. Given the same inputs, they always produce the same output. For example, let's spread a task across a multiprocessing pool and compare the results to those of a gevent pool.
import time

def echo(i):
    time.sleep(0.001)
    return i

# Non Deterministic Process Pool

from multiprocessing.pool import Pool

p = Pool(10)
run1 = [a for a in p.imap_unordered(echo, range(10))]
run2 = [a for a in p.imap_unordered(echo, range(10))]
run3 = [a for a in p.imap_unordered(echo, range(10))]
run4 = [a for a in p.imap_unordered(echo, range(10))]

print( run1 == run2 == run3 == run4 )

# Deterministic Gevent Pool

from gevent.pool import Pool

p = Pool(10)
run1 = [a for a in p.imap_unordered(echo, range(10))]
run2 = [a for a in p.imap_unordered(echo, range(10))]
run3 = [a for a in p.imap_unordered(echo, range(10))]
run4 = [a for a in p.imap_unordered(echo, range(10))]

print( run1 == run2 == run3 == run4 )
False
True
Even though gevent is normally deterministic, sources of non-determinism can creep into your program when you begin to interact with outside services such as sockets and files. Thus even though green threads are a form of "deterministic concurrency", they still can experience some of the same problems that POSIX threads and processes experience.
The perennial problem involved with concurrency is known as a race condition. Simply put, a race condition occurs when two concurrent threads or processes depend on some shared resource but also attempt to modify its value. This results in resources whose values depend on the order of execution. This is a problem, and in general one should try very hard to avoid race conditions, since they result in program behavior which is globally non-deterministic.
The best approach is simply to avoid all global state at all times. Global state and import-time side effects will always come back to bite you!
(Translator's note: the passage above was too difficult to translate and is left in English.)
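A minimal sketch of such a race, assuming two greenlets that share one dictionary. The names shared and unsafe_increment are hypothetical and chosen only for this illustration. Each greenlet yields between reading the counter and writing it back, so updates made in the meantime are lost.

```python
import gevent

shared = {'count': 0}   # global state shared by every greenlet

def unsafe_increment(times):
    for _ in range(times):
        current = shared['count']       # read
        gevent.sleep(0)                 # yield between the read and the write
        shared['count'] = current + 1   # write back a possibly stale value

workers = [gevent.spawn(unsafe_increment, 100) for _ in range(2)]
gevent.joinall(workers)

# 200 increments were attempted, but writes based on stale reads
# overwrite each other, so the total comes up short.
print(shared['count'])
```

Removing the yield between the read and the write, or keeping the counter local to each greenlet and combining the results after joinall, avoids the lost updates.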
Spawning Threads
gevent provides a few wrappers around Greenlet initialization. Some of the most common patterns are:
import gevent
from gevent import Greenlet

def foo(message, n):
    """
    Each thread will be passed the message, and n arguments
    in its initialization.
    """
    gevent.sleep(n)
    print(message)

# Initialize a new Greenlet instance running the named function
# foo
thread1 = Greenlet.spawn(foo, "Hello", 1)

# Wrapper for creating and running a new Greenlet from the named
# function foo, with the passed arguments
thread2 = gevent.spawn(foo, "I live!", 2)

# Lambda expressions
thread3 = gevent.spawn(lambda x: (x+1), 2)

threads = [thread1, thread2, thread3]

# Block until all threads complete.
gevent.joinall(threads)
Hello
I live!
In addition to using the base Greenlet class, you may also subclass Greenlet and override the _run method.
import gevent
from gevent import Greenlet

class MyGreenlet(Greenlet):

    def __init__(self, message, n):
        Greenlet.__init__(self)
        self.message = message
        self.n = n

    def _run(self):
        print(self.message)
        gevent.sleep(self.n)

g = MyGreenlet("Hi there!", 3)
g.start()
g.join()
Hi there!
Greenlet State
Like any other segment of code, Greenlets can fail in various ways. A greenlet may throw an exception, fail to halt, or consume too many system resources.
The internal state of a greenlet is generally a time-dependent parameter. There are a number of flags on greenlets which let you monitor the state of the thread:
started -- Boolean, indicates whether the Greenlet has been started.
ready() -- Boolean, indicates whether the Greenlet has halted.
successful() -- Boolean, indicates whether the Greenlet has halted and not thrown an exception.
value -- arbitrary, the value returned by the Greenlet.
exception -- exception, the uncaught exception instance thrown inside the greenlet.
import gevent

def win():
    return 'You win!'

def fail():
    raise Exception('You fail at failing.')

winner = gevent.spawn(win)
loser = gevent.spawn(fail)

print(winner.started) # True
print(loser.started)  # True

# Exceptions raised in the Greenlet, stay inside the Greenlet.
try:
    gevent.joinall([winner, loser])
except Exception as e:
    print('This will never be reached')

print(winner.value) # 'You win!'
print(loser.value)  # None

print(winner.ready()) # True
print(loser.ready())  # True

print(winner.successful()) # True
print(loser.successful())  # False

# The exception raised in fail will not propagate outside the
# greenlet. A stack trace will be printed to stderr but it
# will not unwind the stack of the parent.

print(loser.exception)

# It is possible though to raise the exception again outside
# raise loser.exception
# or with
# loser.get()
True
True
You win!
None
True
True
True
False
You fail at failing.
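The closing comments of the example mention that get() can raise the stored exception again outside the greenlet. A minimal sketch of that behavior, in which the variable caught is an assumption introduced only for the illustration:

```python
import gevent

def fail():
    raise ValueError('You fail at failing.')

loser = gevent.spawn(fail)
loser.join()   # the exception stays inside the greenlet; gevent logs a traceback

try:
    loser.get()   # get() re-raises the stored exception in the caller
except ValueError as error:
    caught = error
    print('Caught outside the greenlet: %s' % caught)
```

This is useful when the caller wants a failure in a greenlet to be handled like an ordinary exception rather than inspected through the exception attribute.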
Program Shutdown
Greenlets that fail to yield when the main program receives a SIGQUIT may hold the program's execution longer than expected. This results in so-called "zombie processes" which need to be killed from outside of the Python interpreter.
A common pattern is to listen for SIGQUIT events in the main program and to invoke gevent.shutdown before exit.
import gevent
import signal

def run_forever():
    gevent.sleep(1000)

if __name__ == '__main__':
    gevent.signal(signal.SIGQUIT, gevent.shutdown)
    thread = gevent.spawn(run_forever)
    thread.join()
Timeouts
Timeouts are a constraint on the runtime of a block of code or a Greenlet.
import gevent
from gevent import Timeout

seconds = 10

timeout = Timeout(seconds)
timeout.start()

def wait():
    gevent.sleep(10)

try:
    gevent.spawn(wait).join()
except Timeout:
    print('Could not complete')
A Timeout can also be used as a context manager in a with statement.
import gevent
from gevent import Timeout

time_to_wait = 5 # seconds

class TooLong(Exception):
    pass

with Timeout(time_to_wait, TooLong):
    gevent.sleep(10)
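The with block above raises the exception to the caller once the time is up. As a hedged variation, passing False in place of an exception class asks the context manager to suppress the timeout, so the block is abandoned silently. The variable status is an assumption used only to show which lines ran.

```python
import gevent
from gevent import Timeout

status = 'not finished'

# Passing False as the second argument asks the context manager to
# swallow the timeout instead of raising it to the caller.
with Timeout(0.1, False):
    gevent.sleep(1)
    status = 'finished'   # skipped: the timeout fires during the sleep

print(status)
```

This form suits optional work that may simply be dropped when it takes too long.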
In addition, gevent also provides timeout arguments for a variety of Greenlet and data structure related calls. For example:
import gevent
from gevent import Timeout

def wait():
    gevent.sleep(2)

timer = Timeout(1).start()
thread1 = gevent.spawn(wait)

try:
    thread1.join(timeout=timer)
except Timeout:
    print('Thread 1 timed out')

# --

timer = Timeout.start_new(1)
thread2 = gevent.spawn(wait)

try:
    thread2.get(timeout=timer)
except Timeout:
    print('Thread 2 timed out')

# --

try:
    gevent.with_timeout(1, wait)
except Timeout:
    print('Thread 3 timed out')
Thread 1 timed out
Thread 2 timed out
Thread 3 timed out
Data Structures
Events
Events are a form of asynchronous communication between Greenlets.
import gevent
from gevent.event import AsyncResult

a = AsyncResult()

def setter():
    """
    After 3 seconds, wake all threads waiting on the value of
    a.
    """
    gevent.sleep(3)
    a.set()

def waiter():
    """
    After 3 seconds the get call will unblock.
    """
    a.get() # blocking
    print('I live!')

gevent.joinall([
    gevent.spawn(setter),
    gevent.spawn(waiter),
])
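The same wake-up pattern can be sketched with a plain gevent.event.Event, which carries no value and only signals that something has happened. The list log and the waiter names are assumptions introduced for this illustration.

```python
import gevent
from gevent.event import Event

evt = Event()
log = []

def setter():
    gevent.sleep(0.1)
    log.append('setter: waking the waiters')
    evt.set()    # wakes every greenlet blocked in evt.wait()

def waiter(name):
    evt.wait()   # blocks this greenlet until evt.set() is called
    log.append('%s: woke up' % name)

gevent.joinall([
    gevent.spawn(setter),
    gevent.spawn(waiter, 'first'),
    gevent.spawn(waiter, 'second'),
])
print('\n'.join(log))
```

One set() call releases all waiters at once, which is the difference from a queue, where each item goes to a single consumer.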
An extension of the Event object is the AsyncResult, which allows you to send a value along with the wakeup call. This is sometimes called a future or a deferred, since it holds a reference to a future value that can be set on an arbitrary time schedule.
import gevent
from gevent.event import AsyncResult

a = AsyncResult()

def setter():
    """
    After 3 seconds set the result of a.
    """
    gevent.sleep(3)
    a.set('Hello!')

def waiter():
    """
    After 3 seconds the get call will unblock after the setter
    puts a value into the AsyncResult.
    """
    print(a.get())

gevent.joinall([
    gevent.spawn(setter),
    gevent.spawn(waiter),
])
Queues
Queues are ordered sets of data that have the usual put / get operations but are written in such a way that they can be safely manipulated across Greenlets.
For example, if one Greenlet grabs an item off of the queue, the same item will not be grabbed by another Greenlet executing simultaneously.
import gevent
from gevent.queue import Queue

tasks = Queue()

def worker(n):
    while not tasks.empty():
        task = tasks.get()
        print('Worker %s got task %s' % (n, task))
        gevent.sleep(0)

    print('Quitting time!')

def boss():
    for i in range(1,25):
        tasks.put_nowait(i)

gevent.spawn(boss).join()

gevent.joinall([
    gevent.spawn(worker, 'steve'),
    gevent.spawn(worker, 'john'),
    gevent.spawn(worker, 'nancy'),
])
Worker steve got task 1
Worker john got task 2
Worker nancy got task 3
Worker steve got task 4
Worker nancy got task 5
Worker john got task 6
Worker steve got task 7
Worker john got task 8
Worker nancy got task 9
Worker steve got task 10
Worker nancy got task 11
Worker john got task 12
Worker steve got task 13
Worker john got task 14
Worker nancy got task 15
Worker steve got task 16
Worker nancy got task 17
Worker john got task 18
Worker steve got task 19
Worker john got task 20
Worker nancy got task 21
Worker steve got task 22
Worker nancy got task 23
Worker john got task 24
Quitting time!
Quitting time!
Quitting time!
Queues can also block on either put or get as the need arises.
Each of the put and get operations has a non-blocking counterpart, put_nowait and get_nowait. These will not block, but will instead raise gevent.queue.Full (for put_nowait on a full queue) or gevent.queue.Empty (for get_nowait on an empty queue) if the operation is not possible.
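A short sketch of the non-blocking calls, assuming a queue limited to a single item so that both exceptions are easy to trigger. The list events is a hypothetical name used only to record what happened.

```python
from gevent.queue import Queue, Empty, Full

q = Queue(maxsize=1)
events = []

q.put_nowait('first')          # fits: the queue has room for one item
try:
    q.put_nowait('second')     # the queue is full, so this raises immediately
except Full:
    events.append('Full')

events.append(q.get_nowait())  # removes 'first'
try:
    q.get_nowait()             # the queue is now empty, so this raises immediately
except Empty:
    events.append('Empty')

print(events)
```

Neither call ever suspends the calling greenlet, so the caller decides what to do when the queue cannot accept or supply an item.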
In this example we have the boss running simultaneously with the workers and a restriction on the Queue that it can contain no more than three elements. This restriction means that the put operation will block until there is space on the queue. Conversely, the get operation will block if there are no elements on the queue to fetch. It also takes a timeout argument, which lets the call give up with the exception gevent.queue.Empty if no work can be found within the time frame of the timeout.
import gevent
from gevent.queue import Queue, Empty

tasks = Queue(maxsize=3)

def worker(n):
    try:
        while True:
            task = tasks.get(timeout=1) # decrements queue size by 1
            print('Worker %s got task %s' % (n, task))
            gevent.sleep(0)
    except Empty:
        print('Quitting time!')

def boss():
    """
    Boss will wait to hand out work until an individual worker is
    free since the maxsize of the task queue is 3.
    """

    for i in range(1,10):
        tasks.put(i)
    print('Assigned all work in iteration 1')

    for i in range(10,20):
        tasks.put(i)
    print('Assigned all work in iteration 2')

gevent.joinall([
    gevent.spawn(boss),
    gevent.spawn(worker, 'steve'),
    gevent.spawn(worker, 'john'),
    gevent.spawn(worker, 'bob'),
])
Worker steve got task 1
Worker john got task 2
Worker bob got task 3
Worker steve got task 4
Worker bob got task 5
Worker john got task 6
Assigned all work in iteration 1
Worker steve got task 7
Worker john got task 8
Worker bob got task 9
Worker steve got task 10
Worker bob got task 11
Worker john got task 12
Worker steve got task 13
Worker john got task 14
Worker bob got task 15
Worker steve got task 16
Worker bob got task 17
Worker john got task 18
Assigned all work in iteration 2
Worker steve got task 19
Quitting time!
Quitting time!
Quitting time!
Groups and Pools
Locks and Semaphores
Thread Locals
Werkzeug
Actors
The actor model is a higher level concurrency model popularized by the language Erlang. In short the main idea is that you have a collection of independent Actors which have an inbox from which they receive messages from other Actors. The main loop inside the Actor iterates through its messages and takes action according to its desired behavior.
Gevent does not have a primitive Actor type, but we can define one very simply using a Queue inside of a subclassed Greenlet.
import gevent
from gevent.queue import Queue

class Actor(gevent.Greenlet):

    def __init__(self):
        # Each actor owns an inbox that other actors put messages into
        self.inbox = Queue()
        gevent.Greenlet.__init__(self)

    def receive(self, message):
        """
        Define in your subclass.
        """
        raise NotImplementedError()

    def _run(self):
        self.running = True

        while self.running:
            # Blocks (and yields) until a message arrives
            message = self.inbox.get()
            self.receive(message)
In a use case:
import gevent
from gevent.queue import Queue
from gevent import Greenlet

class Pinger(Actor):
    def receive(self, message):
        print(message)
        pong.inbox.put('ping')
        gevent.sleep(0)

class Ponger(Actor):
    def receive(self, message):
        print(message)
        ping.inbox.put('pong')
        gevent.sleep(0)

ping = Pinger()
pong = Ponger()

ping.start()
pong.start()

ping.inbox.put('start')
gevent.joinall([ping, pong])
Real World Applications
Gevent ZeroMQ
ZeroMQ is described by its authors as "a socket library that acts as a concurrency framework". It is a very powerful messaging layer for building concurrent and distributed applications.
ZeroMQ provides a variety of socket primitives, the simplest of which is a Request-Response socket pair. A socket has two methods of interest, send and recv, both of which are normally blocking operations. This is remedied by a brilliant library by Travis Cline which uses gevent.socket to poll ZeroMQ sockets in a non-blocking manner. You can install gevent-zeromq from PyPI via: pip install gevent-zeromq
# Note: Remember to ``pip install pyzmq gevent_zeromq``
import gevent
from gevent_zeromq import zmq

# Global Context
context = zmq.Context()

def server():
    server_socket = context.socket(zmq.REQ)
    server_socket.bind("tcp://127.0.0.1:5000")

    for request in range(1,10):
        server_socket.send("Hello")
        print('Switched to Server for ', request)
        # Implicit context switch occurs here
        server_socket.recv()

def client():
    client_socket = context.socket(zmq.REP)
    client_socket.connect("tcp://127.0.0.1:5000")

    for request in range(1,10):

        client_socket.recv()
        print('Switched to Client for ', request)
        # Implicit context switch occurs here
        client_socket.send("World")

# Use distinct names so the greenlets do not shadow the functions
server_greenlet = gevent.spawn(server)
client_greenlet = gevent.spawn(client)

gevent.joinall([server_greenlet, client_greenlet])
Switched to Server for 1
Switched to Client for 1
Switched to Server for 2
Switched to Client for 2
Switched to Server for 3
Switched to Client for 3
Switched to Server for 4
Switched to Client for 4
Switched to Server for 5
Switched to Client for 5
Switched to Server for 6
Switched to Client for 6
Switched to Server for 7
Switched to Client for 7
Switched to Server for 8
Switched to Client for 8
Switched to Server for 9
Switched to Client for 9
Simple Telnet Servers
# On Unix: Access with ``$ nc 127.0.0.1 5000``
# On Windows: Access with ``$ telnet 127.0.0.1 5000``

from gevent.server import StreamServer

def handle(socket, address):
    socket.send("Hello from a telnet!\n")
    for i in range(5):
        socket.send(str(i) + '\n')
    socket.close()

server = StreamServer(('127.0.0.1', 5000), handle)
server.serve_forever()
WSGI Servers
Gevent provides two WSGI servers for serving content over HTTP, henceforth called wsgi and pywsgi:
- gevent.wsgi.WSGIServer
- gevent.pywsgi.WSGIServer
In versions of gevent before 1.0.x, gevent used libevent instead of libev. Libevent included a fast HTTP server which was used by gevent's wsgi server.
In gevent 1.0.x there is no HTTP server included. Instead, gevent.wsgi is now an alias for the pure Python server in gevent.pywsgi.
Streaming Servers
If you are using gevent 1.0.x, this section does not apply.
For those familiar with streaming HTTP services, the core idea is that in the headers we do not specify a length of the content. We instead hold the connection open and flush chunks down the pipe, prefixing each with a hexadecimal number indicating the length of the chunk. The stream is closed when a chunk of size zero is sent.
HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked
8
<p>Hello
9
World</p>
0
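The framing shown above can be reproduced with a few lines of plain Python. This is only an illustrative sketch of the wire format, not how gevent implements it, and the helper name encode_chunked is hypothetical.

```python
def encode_chunked(chunks):
    """Frame an iterable of byte strings as an HTTP/1.1 chunked body."""
    body = b''
    for chunk in chunks:
        if not chunk:
            continue  # an empty chunk would be read as the terminator
        # Each chunk is prefixed with its length in hexadecimal
        size_line = ('%x' % len(chunk)).encode('ascii')
        body += size_line + b'\r\n' + chunk + b'\r\n'
    # A zero-length chunk marks the end of the stream
    body += b'0\r\n\r\n'
    return body

wire = encode_chunked([b'<p>Hello', b'World</p>'])
print(wire)
```

The two chunks are 8 and 9 bytes long, which matches the size lines in the response shown above.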
The above HTTP connection could not be created in wsgi because streaming is not supported. The response would instead have to be buffered.
from gevent.wsgi import WSGIServer

def application(environ, start_response):
    status = '200 OK'
    body = '<p>Hello World</p>'

    headers = [
        ('Content-Type', 'text/html')
    ]

    start_response(status, headers)
    return [body]

WSGIServer(('', 8000), application).serve_forever()
Using pywsgi we can, however, write our handler as a generator and yield the result chunk by chunk.
from gevent.pywsgi import WSGIServer

def application(environ, start_response):
    status = '200 OK'

    headers = [
        ('Content-Type', 'text/html')
    ]

    start_response(status, headers)
    yield "<p>Hello"
    yield "World</p>"

WSGIServer(('', 8000), application).serve_forever()
But regardless, performance on Gevent servers is phenomenal compared to other Python servers. libev is a very vetted technology and its derivative servers are known to perform well at scale.
To benchmark, try Apache Benchmark (ab), or see this Benchmark of Python WSGI Servers for comparison with other servers.
$ ab -n 10000 -c 100 http://127.0.0.1:8000/
Long Polling
import gevent
from gevent.queue import Queue, Empty
from gevent.pywsgi import WSGIServer
import simplejson as json

data_source = Queue()

def producer():
    while True:
        data_source.put_nowait('Hello World')
        gevent.sleep(1)

def ajax_endpoint(environ, start_response):
    status = '200 OK'
    headers = [
        ('Content-Type', 'application/json')
    ]

    try:
        # Hold the request open until data arrives or 5 seconds pass
        datum = data_source.get(timeout=5)
    except Empty:
        datum = []

    start_response(status, headers)
    # WSGI expects an iterable of strings, so wrap the body in a list
    return [json.dumps(datum)]

gevent.spawn(producer)

WSGIServer(('', 8000), ajax_endpoint).serve_forever()
Websockets
A Websocket example, which requires gevent-websocket.
# Simple gevent-websocket server
import json
import random

from gevent import pywsgi, sleep
from geventwebsocket.handler import WebSocketHandler

class WebSocketApp(object):
    '''Send random data to the websocket'''

    def __call__(self, environ, start_response):
        ws = environ['wsgi.websocket']
        x = 0
        while True:
            data = json.dumps({'x': x, 'y': random.randint(1, 5)})
            ws.send(data)
            x += 1
            sleep(0.5)

server = pywsgi.WSGIServer(("", 10000), WebSocketApp(),
    handler_class=WebSocketHandler)
server.serve_forever()
HTML Page:
<html>
<head>
<title>Minimal websocket application</title>
<script type="text/javascript" src="jquery.min.js"></script>
<script type="text/javascript">
$(function() {
// Open up a connection to our server
var ws = new WebSocket("ws://localhost:10000/");
// What do we do when we get a message?
ws.onmessage = function(evt) {
$("#placeholder").append('<p>' + evt.data + '</p>')
}
// Just update our conn_status field with the connection status
ws.onopen = function(evt) {
$('#conn_status').html('<b>Connected</b>');
}
ws.onerror = function(evt) {
$('#conn_status').html('<b>Error</b>');
}
ws.onclose = function(evt) {
$('#conn_status').html('<b>Closed</b>');
}
});
</script>
</head>
<body>
<h1>WebSocket Example</h1>
<div id="conn_status">Not Connected</div>
<div id="placeholder" style="width:600px;height:300px;"></div>
</body>
</html>
Chat Server
The final motivating example is a realtime chat room. This example requires Flask (though not necessarily; you could use Django, Pyramid, etc.). The corresponding Javascript and HTML files can be found here.
# Micro gevent chatroom.
# ----------------------

from flask import Flask, render_template, request

from gevent import queue
from gevent.pywsgi import WSGIServer

import simplejson as json

app = Flask(__name__)
app.debug = True

class Room(object):

    def __init__(self):
        self.users = set()
        self.messages = []

    def backlog(self, size=25):
        return self.messages[-size:]

    def subscribe(self, user):
        self.users.add(user)

    def add(self, message):
        for user in self.users:
            print(user)
            user.queue.put_nowait(message)
        self.messages.append(message)

class User(object):

    def __init__(self):
        self.queue = queue.Queue()

# Defined after Room so that the class exists when the rooms are created
rooms = {
    'topic1': Room(),
    'topic2': Room(),
}

users = {}

@app.route('/')
def choose_name():
    return render_template('choose.html')

@app.route('/<uid>')
def main(uid):
    return render_template('main.html',
        uid=uid,
        rooms=rooms.keys()
    )

@app.route('/<room>/<uid>')
def join(room, uid):
    user = users.get(uid, None)

    if not user:
        users[uid] = user = User()

    active_room = rooms[room]
    active_room.subscribe(user)
    print('subscribe %s %s' % (active_room, user))

    messages = active_room.backlog()

    return render_template('room.html',
        room=room, uid=uid, messages=messages)

@app.route("/put/<room>/<uid>", methods=["POST"])
def put(room, uid):
    user = users[uid]
    room = rooms[room]

    message = request.form['message']
    room.add(':'.join([uid, message]))

    return ''

@app.route("/poll/<uid>", methods=["POST"])
def poll(uid):
    try:
        # Long poll: wait up to 10 seconds for a message for this user
        msg = users[uid].queue.get(timeout=10)
    except queue.Empty:
        msg = []
    return json.dumps(msg)

if __name__ == "__main__":
    http = WSGIServer(('', 5000), app)
    http.serve_forever()
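P.S. I translated this document in order to learn gevent better, and did not have much time to look things up or tidy the text. My English is limited, so please bear with me.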