介绍
概观
Supervisor是一个客户端/服务器系统,允许其用户在类似UNIX操作系统上控制许多进程,有以下几点:
方便
为每个流程实例编写rc.d脚本通常很不方便。 rc.d脚本是进程初始化/自动启动/管理的最低通用分母形式,但编写和维护可能会很痛苦。此外,rc.d脚本无法自动重新启动崩溃的进程,并且许多程序在崩溃时无法正常重新启动。Supervisord将进程作为其子进程启动,并且可以配置为在崩溃时自动重新启动它们。它还可以自动配置为在自己的调用上启动进程。
准确性
在UNIX上的进程上,通常很难获得准确的上/下状态。Pidfiles经常撒谎。Supervisord将进程作为子进程启动,因此它始终知道其子进程的真正上/下状态,并且可以方便地查询此数据。
代表团
需要控制进程状态的用户通常只需要这样做。他们不希望或需要对运行进程的机器进行全面的shell访问。侦听“低”TCP端口的进程通常需要以root用户身份启动和重新启动(UNIX错误)。通常情况下,允许“普通”人员停止或重新启动此类进程是完全正常的,但为他们提供shell访问权限通常是不切实际的,并且通常无法为他们提供root访问权限或sudo访问权限。它(正确地)很难向他们解释为什么存在这个问题。如果以root身份启动supervisord,则可以允许“普通”用户控制此类进程,而无需向他们解释问题的复杂性。Supervisorctl允许以非常有限的方式访问机器,
流程组
流程通常需要分组启动和停止,有时甚至是“优先顺序”。通常很难向人们解释如何做到这一点。Supervisor允许您为进程分配优先级,并允许用户通过supervisorctl客户端发出命令,如“start all”和“restart all”,它以预先分配的优先级顺序启动它们。此外,可以将流程分组为“流程组”,并且可以停止一组逻辑相关流程并将其作为一个单元启动。
兼容
除了Windows之外,Supervisor几乎可以处理所有事情。它在Linux,Mac OS X,Solaris和FreeBSD上经过测试和支持。它完全用Python编写,因此安装不需要C编译器。
平台要求
Supervisor已经过测试,并且已知可以在Linux(Ubuntu 9.10),Mac OS X(10.4 / 10.5 / 10.6)和Solaris(10 for Intel)和FreeBSD 6.1上运行。它可能适用于大多数UNIX系统。
在任何版本的Windows下,Supervisor都不会运行。
众所周知,Supervisor可以使用Python 2.4或更高版本,但不能在任何版本的Python 3下运行。
主管组件
supervisord
服务器主管名为supervisord。它负责在自己的调用中启动子程序,响应来自客户端的命令,重新启动崩溃或退出的子进程,记录其子进程stdout和stderr 输出,以及生成和处理与子进程生命周期中的点相对应的“事件”。
服务器进程使用配置文件。这通常位于/etc/supervisord.conf中。此配置文件是“Windows-INI”样式配置文件。通过适当的文件系统权限保持此文件的安全非常重要,因为它可能包含未加密的用户名和密码。
supervisorctl
主管的命令行客户端部分名为 supervisorctl。它为supervisord提供的功能提供了类似shell的界面。从 supervisorctl,用户可以连接到不同的 supervisord进程(一次一个),获取由子进程控制的状态,停止和启动子进程,并获取supervisord的运行进程列表。
命令行客户端通过UNIX域套接字或Internet(TCP)套接字与服务器通信。服务器可以声明客户端的用户在允许他执行命令之前应该提供身份验证凭据。客户端进程通常使用与服务器相同的配置文件,但其中包含[supervisorctl]部分的任何配置文件都可以使用。
网络服务器
与功能媲美A(稀疏)的Web用户界面 supervisorctl可以通过浏览器,如果你开始访问supervisord对互联网插座。在激活配置文件的[inet_http_server]部分后,访问服务器URL(例如http:// localhost:9001 /)以通过Web界面查看和控制进程状态。
XML-RPC接口
为Web UI提供服务的同一HTTP服务器提供了一个XML-RPC接口,可用于询问和控制主管及其运行的程序。请参阅XML-RPC API文档。
安装
当前Supervisor的最高版本是3.0,之前尝试使用2.x版本管理实验集群中的若干mdrill进程,发现使用客户端无法有效启动和停止服务器端管理的各个子进程,从网上搜索错误发现2.x版本有一些bug,建议升级到3.0版本。因此我卸载了2.x版本,重新安装了3.0版本,发现3.0版本很好使。3.0版本相对2.x版本,配置文件不同部分的配置项都发生了变化,详见官方文档。
安装easy_install
yum install python-setuptools-devel
安装Supervisor
easy_install supervisor
生成配置文件
echo_supervisord_conf > /etc/supervisord.conf
配置文件
配置顺序
经过上面两步,Supervisor就安装好了。配置文件模版已经在/etc/supervisord.conf中生成。当然配置文件的位置不一定放在/etc/supervisord.conf中。在不指定配置文件位置的情况下,supervisord和supervisorctl会按照以下三个顺序来搜索配置文件的位置。
- $CWD/supervisord.conf
- $CWD/etc/supervisord.conf
- /etc/supervisord.conf
- /etc/supervisor/supervisord.conf(自Supervisor 3.3.0起)
- ../etc/supervisord.conf(相对于可执行文件)
- ../supervisord.conf(相对于可执行文件)
具体的配置文件内容
在一个三个结点的集群中部署了mdrill,三台主机名称分别为cma01, cma02, cma03,在cma01和cma02上打算启动Supervisor进程(这里的Supervisor进程是storm的进程,与supervisor进程管理工具不是一回事),在cma03上打算启动NimbusServer,Mdrillui和Supervisor三个进程。在三台主机上都安装supervisor进程控制系统后,分别在每台主机的/etc/supervisord.conf文件中配置每台主机要管理的进程。具体配置如下,其中以;开始的行为注释行:
[cma01 /etc/supervisord.conf]
; Sample supervisor config file. ; ; For more information on the config file, please see: ; http://supervisord.org/configuration.html ; ; Note: shell expansion ("~" or "$HOME") is not supported. Environment ; variables can be expanded using this syntax: "%(ENV_HOME)s". [unix_http_server] file=/tmp/supervisor.sock ; (the path to the socket file) ;chmod=0700 ; socket file mode (default 0700) ;chown=nobody:nogroup ; socket file uid:gid owner ;username=user ; (default is no username (open server)) ;password=123 ; (default is no password (open server)) [inet_http_server] ; inet (TCP) server disabled by default port=cma01:9001 ; (ip_address:port specifier, *:port for all iface) ;username=user ; (default is no username (open server)) ;password=123 ; (default is no password (open server)) [supervisord] logfile=/tmp/supervisord.log ; (main log file;default $CWD/supervisord.log) logfile_maxbytes=50MB ; (max main logfile bytes b4 rotation;default 50MB) logfile_backups=10 ; (num of main logfile rotation backups;default 10) loglevel=info ; (log level;default info; others: debug,warn,trace) pidfile=/tmp/supervisord.pid ; (supervisord pidfile;default supervisord.pid) nodaemon=false ; (start in foreground if true;default false) minfds=1024 ; (min. avail startup file descriptors;default 1024) minprocs=200 ; (min. avail process descriptors;default 200) ;umask=022 ; (process file creation umask;default 022) ;user=chrism ; (default is current user, required if root) ;identifier=supervisor ; (supervisord identifier, default is 'supervisor') ;directory=/tmp ; (default is not to cd during start) ;nocleanup=true ; (don't clean up tempfiles at start;default false) ;childlogdir=/tmp ; ('AUTO' child log dir, default $TEMP) ;environment=KEY="value" ; (key value pairs to add to environment) ;strip_ansi=false ; (strip ansi escape codes in logs; def. false) ; the below section must remain in the config file for RPC ; (supervisorctl/web interface) to work, additional interfaces may be ; added by defining them in separate rpcinterface: sections [rpcinterface:supervisor] supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface [supervisorctl] serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket ;serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket ;username=chris ; should be same as http_username if set ;password=123 ; should be same as http_password if set ;prompt=mysupervisor ; cmd line prompt (default "supervisor") ;history_file=~/.sc_history ; use readline history if available ; The below sample program section shows all possible program subsection values, ; create one or more 'real' program: sections to be able to control them under ; supervisor. ;[program:theprogramname] ;command=/bin/cat ; the program (relative uses PATH, can take args) ;process_name=%(program_name)s ; process_name expr (default %(program_name)s) ;numprocs=1 ; number of processes copies to start (def 1) ;directory=/tmp ; directory to cwd to before exec (def no cwd) ;umask=022 ; umask for process (default None) ;priority=999 ; the relative start priority (default 999) ;autostart=true ; start at supervisord start (default: true) ;autorestart=unexpected ; whether/when to restart (default: unexpected) ;startsecs=1 ; number of secs prog must stay running (def. 1) ;startretries=3 ; max # of serial start failures (default 3) ;exitcodes=0,2 ; 'expected' exit codes for process (default 0,2) ;stopsignal=QUIT ; signal used to kill process (default TERM) ;stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10) ;stopasgroup=false ; send stop signal to the UNIX process group (default false) ;killasgroup=false ; SIGKILL the UNIX process group (def false) ;user=chrism ; setuid to this UNIX account to run the program ;redirect_stderr=true ; redirect proc stderr to stdout (default false) ;stdout_logfile=/a/path ; stdout log path, NONE for none; default AUTO ;stdout_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB) ;stdout_logfile_backups=10 ; # of stdout logfile backups (default 10) ;stdout_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0) ;stdout_events_enabled=false ; emit events on stdout writes (default false) ;stderr_logfile=/a/path ; stderr log path, NONE for none; default AUTO ;stderr_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB) ;stderr_logfile_backups=10 ; # of stderr logfile backups (default 10) ;stderr_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0) ;stderr_events_enabled=false ; emit events on stderr writes (default false) ;environment=A="1",B="2" ; process environment additions (def no adds) ;serverurl=AUTO ; override serverurl computation (childutils) [program:spvs-01] command=/root/alimama/adhoc-core/bin/bluewhale supervisor process_name=%(program_name)s numprocs=1 user=root autostart=true autorestart=true startsecs=5 startretries=3 redirect_stderr=true stdout_logfile=/root/alimama/adhoc-core/bin/supervisor.log stdout_logfile_maxbytes=20MB stdout_logfile_backups=10 stopasgroup=true ; The below sample eventlistener section shows all possible ; eventlistener subsection values, create one or more 'real' ; eventlistener: sections to be able to handle event notifications ; sent by supervisor. ;[eventlistener:theeventlistenername] ;command=/bin/eventlistener ; the program (relative uses PATH, can take args) ;process_name=%(program_name)s ; process_name expr (default %(program_name)s) ;numprocs=1 ; number of processes copies to start (def 1) ;events=EVENT ; event notif. types to subscribe to (req'd) ;buffer_size=10 ; event buffer queue size (default 10) ;directory=/tmp ; directory to cwd to before exec (def no cwd) ;umask=022 ; umask for process (default None) ;priority=-1 ; the relative start priority (default -1) ;autostart=true ; start at supervisord start (default: true) ;autorestart=unexpected ; whether/when to restart (default: unexpected) ;startsecs=1 ; number of secs prog must stay running (def. 1) ;startretries=3 ; max # of serial start failures (default 3) ;exitcodes=0,2 ; 'expected' exit codes for process (default 0,2) ;stopsignal=QUIT ; signal used to kill process (default TERM) ;stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10) ;stopasgroup=false ; send stop signal to the UNIX process group (default false) ;killasgroup=false ; SIGKILL the UNIX process group (def false) ;user=chrism ; setuid to this UNIX account to run the program ;redirect_stderr=true ; redirect proc stderr to stdout (default false) ;stdout_logfile=/a/path ; stdout log path, NONE for none; default AUTO ;stdout_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB) ;stdout_logfile_backups=10 ; # of stdout logfile backups (default 10) ;stdout_events_enabled=false ; emit events on stdout writes (default false) ;stderr_logfile=/a/path ; stderr log path, NONE for none; default AUTO ;stderr_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB) ;stderr_logfile_backups ; # of stderr logfile backups (default 10) ;stderr_events_enabled=false ; emit events on stderr writes (default false) ;environment=A="1",B="2" ; process environment additions ;serverurl=AUTO ; override serverurl computation (childutils) ; The below sample group section shows all possible group values, ; create one or more 'real' group: sections to create "heterogeneous" ; process groups. ;[group:thegroupname] ;programs=progname1,progname2 ; each refers to 'x' in [program:x] definitions ;priority=999 ; the relative start priority (default 999) ; The [include] section can just contain the "files" setting. This ; setting can list multiple files (separated by whitespace or ; newlines). It can also contain wildcards. The filenames are ; interpreted as relative to this file. Included files *cannot* ; include files themselves. ;[include] ;files = relative/directory/*.ini
关键部分是program:spvs-01,该部分为主机cma01配置了让supervisord管理的storm进程Supervisor。
[cma02 /etc/supervisord.conf]
; Sample supervisor config file. ; ; For more information on the config file, please see: ; http://supervisord.org/configuration.html ; ; Note: shell expansion ("~" or "$HOME") is not supported. Environment ; variables can be expanded using this syntax: "%(ENV_HOME)s". [unix_http_server] file=/tmp/supervisor.sock ; (the path to the socket file) ;chmod=0700 ; socket file mode (default 0700) ;chown=nobody:nogroup ; socket file uid:gid owner ;username=user ; (default is no username (open server)) ;password=123 ; (default is no password (open server)) [inet_http_server] ; inet (TCP) server disabled by default port=cma02:9001 ; (ip_address:port specifier, *:port for all iface) ;username=user ; (default is no username (open server)) ;password=123 ; (default is no password (open server)) [supervisord] logfile=/tmp/supervisord.log ; (main log file;default $CWD/supervisord.log) logfile_maxbytes=50MB ; (max main logfile bytes b4 rotation;default 50MB) logfile_backups=10 ; (num of main logfile rotation backups;default 10) loglevel=info ; (log level;default info; others: debug,warn,trace) pidfile=/tmp/supervisord.pid ; (supervisord pidfile;default supervisord.pid) nodaemon=false ; (start in foreground if true;default false) minfds=1024 ; (min. avail startup file descriptors;default 1024) minprocs=200 ; (min. avail process descriptors;default 200) ;umask=022 ; (process file creation umask;default 022) ;user=chrism ; (default is current user, required if root) ;identifier=supervisor ; (supervisord identifier, default is 'supervisor') ;directory=/tmp ; (default is not to cd during start) ;nocleanup=true ; (don't clean up tempfiles at start;default false) ;childlogdir=/tmp ; ('AUTO' child log dir, default $TEMP) ;environment=KEY="value" ; (key value pairs to add to environment) ;strip_ansi=false ; (strip ansi escape codes in logs; def. false) ; the below section must remain in the config file for RPC ; (supervisorctl/web interface) to work, additional interfaces may be ; added by defining them in separate rpcinterface: sections [rpcinterface:supervisor] supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface [supervisorctl] serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket ;serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket ;username=chris ; should be same as http_username if set ;password=123 ; should be same as http_password if set ;prompt=mysupervisor ; cmd line prompt (default "supervisor") ;history_file=~/.sc_history ; use readline history if available ; The below sample program section shows all possible program subsection values, ; create one or more 'real' program: sections to be able to control them under ; supervisor. ;[program:theprogramname] ;command=/bin/cat ; the program (relative uses PATH, can take args) ;process_name=%(program_name)s ; process_name expr (default %(program_name)s) ;numprocs=1 ; number of processes copies to start (def 1) ;directory=/tmp ; directory to cwd to before exec (def no cwd) ;umask=022 ; umask for process (default None) ;priority=999 ; the relative start priority (default 999) ;autostart=true ; start at supervisord start (default: true) ;autorestart=unexpected ; whether/when to restart (default: unexpected) ;startsecs=1 ; number of secs prog must stay running (def. 1) ;startretries=3 ; max # of serial start failures (default 3) ;exitcodes=0,2 ; 'expected' exit codes for process (default 0,2) ;stopsignal=QUIT ; signal used to kill process (default TERM) ;stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10) ;stopasgroup=false ; send stop signal to the UNIX process group (default false) ;killasgroup=false ; SIGKILL the UNIX process group (def false) ;user=chrism ; setuid to this UNIX account to run the program ;redirect_stderr=true ; redirect proc stderr to stdout (default false) ;stdout_logfile=/a/path ; stdout log path, NONE for none; default AUTO ;stdout_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB) ;stdout_logfile_backups=10 ; # of stdout logfile backups (default 10) ;stdout_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0) ;stdout_events_enabled=false ; emit events on stdout writes (default false) ;stderr_logfile=/a/path ; stderr log path, NONE for none; default AUTO ;stderr_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB) ;stderr_logfile_backups=10 ; # of stderr logfile backups (default 10) ;stderr_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0) ;stderr_events_enabled=false ; emit events on stderr writes (default false) ;environment=A="1",B="2" ; process environment additions (def no adds) ;serverurl=AUTO ; override serverurl computation (childutils) [program:spvs-02] command=/root/alimama/adhoc-core/bin/bluewhale supervisor process_name=%(program_name)s numprocs=1 user=root autostart=true autorestart=true startsecs=5 startretries=3 redirect_stderr=true stdout_logfile=/root/alimama/adhoc-core/bin/supervisor.log stdout_logfile_maxbytes=20MB stdout_logfile_backups=10 stopasgroup=true ; The below sample eventlistener section shows all possible ; eventlistener subsection values, create one or more 'real' ; eventlistener: sections to be able to handle event notifications ; sent by supervisor. ;[eventlistener:theeventlistenername] ;command=/bin/eventlistener ; the program (relative uses PATH, can take args) ;process_name=%(program_name)s ; process_name expr (default %(program_name)s) ;numprocs=1 ; number of processes copies to start (def 1) ;events=EVENT ; event notif. types to subscribe to (req'd) ;buffer_size=10 ; event buffer queue size (default 10) ;directory=/tmp ; directory to cwd to before exec (def no cwd) ;umask=022 ; umask for process (default None) ;priority=-1 ; the relative start priority (default -1) ;autostart=true ; start at supervisord start (default: true) ;autorestart=unexpected ; whether/when to restart (default: unexpected) ;startsecs=1 ; number of secs prog must stay running (def. 1) ;startretries=3 ; max # of serial start failures (default 3) ;exitcodes=0,2 ; 'expected' exit codes for process (default 0,2) ;stopsignal=QUIT ; signal used to kill process (default TERM) ;stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10) ;stopasgroup=false ; send stop signal to the UNIX process group (default false) ;killasgroup=false ; SIGKILL the UNIX process group (def false) ;user=chrism ; setuid to this UNIX account to run the program ;redirect_stderr=true ; redirect proc stderr to stdout (default false) ;stdout_logfile=/a/path ; stdout log path, NONE for none; default AUTO ;stdout_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB) ;stdout_logfile_backups=10 ; # of stdout logfile backups (default 10) ;stdout_events_enabled=false ; emit events on stdout writes (default false) ;stderr_logfile=/a/path ; stderr log path, NONE for none; default AUTO ;stderr_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB) ;stderr_logfile_backups ; # of stderr logfile backups (default 10) ;stderr_events_enabled=false ; emit events on stderr writes (default false) ;environment=A="1",B="2" ; process environment additions ;serverurl=AUTO ; override serverurl computation (childutils) ; The below sample group section shows all possible group values, ; create one or more 'real' group: sections to create "heterogeneous" ; process groups. ;[group:thegroupname] ;programs=progname1,progname2 ; each refers to 'x' in [program:x] definitions ;priority=999 ; the relative start priority (default 999) ; The [include] section can just contain the "files" setting. This ; setting can list multiple files (separated by whitespace or ; newlines). It can also contain wildcards. The filenames are ; interpreted as relative to this file. Included files *cannot* ; include files themselves. ;[include] ;files = relative/directory/*.ini
关键部分是program:spvs-02,该部分为主机cma02配置了让supervisord管理的storm进程Supervisor。
[cma03 /etc/supervisord.conf]
; Sample supervisor config file. ; ; For more information on the config file, please see: ; http://supervisord.org/configuration.html ; ; Note: shell expansion ("~" or "$HOME") is not supported. Environment ; variables can be expanded using this syntax: "%(ENV_HOME)s". [unix_http_server] file=/tmp/supervisor.sock ; (the path to the socket file) ;chmod=0700 ; socket file mode (default 0700) ;chown=nobody:nogroup ; socket file uid:gid owner ;username=user ; (default is no username (open server)) ;password=123 ; (default is no password (open server)) [inet_http_server] ; inet (TCP) server disabled by default port=cma03:9001 ; (ip_address:port specifier, *:port for all iface) ;username=user ; (default is no username (open server)) ;password=123 ; (default is no password (open server)) [supervisord] logfile=/tmp/supervisord.log ; (main log file;default $CWD/supervisord.log) logfile_maxbytes=50MB ; (max main logfile bytes b4 rotation;default 50MB) logfile_backups=10 ; (num of main logfile rotation backups;default 10) loglevel=info ; (log level;default info; others: debug,warn,trace) pidfile=/tmp/supervisord.pid ; (supervisord pidfile;default supervisord.pid) nodaemon=false ; (start in foreground if true;default false) minfds=1024 ; (min. avail startup file descriptors;default 1024) minprocs=200 ; (min. avail process descriptors;default 200) ;umask=022 ; (process file creation umask;default 022) ;user=chrism ; (default is current user, required if root) ;identifier=supervisor ; (supervisord identifier, default is 'supervisor') ;directory=/tmp ; (default is not to cd during start) ;nocleanup=true ; (don't clean up tempfiles at start;default false) ;childlogdir=/tmp ; ('AUTO' child log dir, default $TEMP) ;environment=KEY="value" ; (key value pairs to add to environment) ;strip_ansi=false ; (strip ansi escape codes in logs; def. false) ; the below section must remain in the config file for RPC ; (supervisorctl/web interface) to work, additional interfaces may be ; added by defining them in separate rpcinterface: sections [rpcinterface:supervisor] supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface [supervisorctl] serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket ;serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket ;username=chris ; should be same as http_username if set ;password=123 ; should be same as http_password if set ;prompt=mysupervisor ; cmd line prompt (default "supervisor") ;history_file=~/.sc_history ; use readline history if available ; The below sample program section shows all possible program subsection values, ; create one or more 'real' program: sections to be able to control them under ; supervisor. ;[program:theprogramname] ;command=/bin/cat ; the program (relative uses PATH, can take args) ;process_name=%(program_name)s ; process_name expr (default %(program_name)s) ;numprocs=1 ; number of processes copies to start (def 1) ;directory=/tmp ; directory to cwd to before exec (def no cwd) ;umask=022 ; umask for process (default None) ;priority=999 ; the relative start priority (default 999) ;autostart=true ; start at supervisord start (default: true) ;autorestart=unexpected ; whether/when to restart (default: unexpected) ;startsecs=1 ; number of secs prog must stay running (def. 1) ;startretries=3 ; max # of serial start failures (default 3) ;exitcodes=0,2 ; 'expected' exit codes for process (default 0,2) ;stopsignal=QUIT ; signal used to kill process (default TERM) ;stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10) ;stopasgroup=false ; send stop signal to the UNIX process group (default false) ;killasgroup=false ; SIGKILL the UNIX process group (def false) ;user=chrism ; setuid to this UNIX account to run the program ;redirect_stderr=true ; redirect proc stderr to stdout (default false) ;stdout_logfile=/a/path ; stdout log path, NONE for none; default AUTO ;stdout_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB) ;stdout_logfile_backups=10 ; # of stdout logfile backups (default 10) ;stdout_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0) ;stdout_events_enabled=false ; emit events on stdout writes (default false) ;stderr_logfile=/a/path ; stderr log path, NONE for none; default AUTO ;stderr_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB) ;stderr_logfile_backups=10 ; # of stderr logfile backups (default 10) ;stderr_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0) ;stderr_events_enabled=false ; emit events on stderr writes (default false) ;environment=A="1",B="2" ; process environment additions (def no adds) ;serverurl=AUTO ; override serverurl computation (childutils) [program:nmbs] command=/root/alimama/adhoc-core/bin/bluewhale nimbus process_name=%(program_name)s numprocs=1 user=root priority=700 autostart=true autorestart=true startsecs=5 startretries=3 redirect_stderr=true stdout_logfile=/root/alimama/adhoc-core/bin/nimbus.log stdout_logfile_maxbytes=20MB stdout_logfile_backups=10 stopasgroup=true [program:spvs-03] command=/root/alimama/adhoc-core/bin/bluewhale supervisor process_name=%(program_name)s numprocs=1 user=root priority=800 autostart=true autorestart=true startsecs=5 startretries=3 redirect_stderr=true stdout_logfile=/root/alimama/adhoc-core/bin/supervisor.log stdout_logfile_maxbytes=20MB stdout_logfile_backups=10 stopasgroup=true [program:ui] command=/root/alimama/adhoc-core/bin/bluewhale mdrillui 1107 /root/alimama/adhoc-core/lib/adhoc-web-0.18-beta.jar process_name=%(program_name)s numprocs=1 user=root priority=900 autostart=true autorestart=true startsecs=5 startretries=3 redirect_stderr=true stdout_logfile=/root/alimama/adhoc-core/bin/ui.log stdout_logfile_maxbytes=20MB stdout_logfile_backups=10 stopasgroup=true ; The below sample eventlistener section shows all possible ; eventlistener subsection values, create one or more 'real' ; eventlistener: sections to be able to handle event notifications ; sent by supervisor. ;[eventlistener:theeventlistenername] ;command=/bin/eventlistener ; the program (relative uses PATH, can take args) ;process_name=%(program_name)s ; process_name expr (default %(program_name)s) ;numprocs=1 ; number of processes copies to start (def 1) ;events=EVENT ; event notif. types to subscribe to (req'd) ;buffer_size=10 ; event buffer queue size (default 10) ;directory=/tmp ; directory to cwd to before exec (def no cwd) ;umask=022 ; umask for process (default None) ;priority=-1 ; the relative start priority (default -1) ;autostart=true ; start at supervisord start (default: true) ;autorestart=unexpected ; whether/when to restart (default: unexpected) ;startsecs=1 ; number of secs prog must stay running (def. 1) ;startretries=3 ; max # of serial start failures (default 3) ;exitcodes=0,2 ; 'expected' exit codes for process (default 0,2) ;stopsignal=QUIT ; signal used to kill process (default TERM) ;stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10) ;stopasgroup=false ; send stop signal to the UNIX process group (default false) ;killasgroup=false ; SIGKILL the UNIX process group (def false) ;user=chrism ; setuid to this UNIX account to run the program ;redirect_stderr=true ; redirect proc stderr to stdout (default false) ;stdout_logfile=/a/path ; stdout log path, NONE for none; default AUTO ;stdout_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB) ;stdout_logfile_backups=10 ; # of stdout logfile backups (default 10) ;stdout_events_enabled=false ; emit events on stdout writes (default false) ;stderr_logfile=/a/path ; stderr log path, NONE for none; default AUTO ;stderr_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB) ;stderr_logfile_backups ; # of stderr logfile backups (default 10) ;stderr_events_enabled=false ; emit events on stderr writes (default false) ;environment=A="1",B="2" ; process environment additions ;serverurl=AUTO ; override serverurl computation (childutils) ; The below sample group section shows all possible group values, ; create one or more 'real' group: sections to create "heterogeneous" ; process groups. ;[group:thegroupname] ;programs=progname1,progname2 ; each refers to 'x' in [program:x] definitions ;priority=999 ; the relative start priority (default 999) ; The [include] section can just contain the "files" setting. This ; setting can list multiple files (separated by whitespace or ; newlines). It can also contain wildcards. The filenames are ; interpreted as relative to this file. Included files *cannot* ; include files themselves. ;[include] ;files = relative/directory/*.ini
关键部分是program:nmbs, program:spvs-03和program:ui,这三部分为主机cma03配置了让supervisord管理的storm进程NimbusServer, Supervisor和Mdrillui。
有几个配置项值得解释一下,可以根据需要自行设置。
1. stopasgroup=true。这一配置项的作用是:如果supervisord管理的进程px又产生了若干子进程,使用supervisorctl停止px进程,停止信号会传播给px产生的所有子进程,确保子进程也一起停止。这一配置项对希望停止所有进程的需求是非常有用的。
2. autostart=true。这一配置项的作用是:当启动supervisord的时候会将该配置项设置为true的所有进程自动启动。
【启动supervisord】
确保配置无误后可以在每台主机上使用下面的命令启动supervisor的服务器端supervisord
supervisord
【停止supervisord】
supervisorctl shutdown
【重新加载配置文件】
supervisorctl reload
进程管理
1. 启动supervisord管理的所有进程
supervisorctl start all
2. 停止supervisord管理的所有进程
supervisorctl stop all
3. 启动supervisord管理的某一个特定进程
supervisorctl start program-name // program-name为[program:xx]中的xx
4. 停止supervisord管理的某一个特定进程
supervisorctl stop program-name // program-name为[program:xx]中的xx
5. 重启所有进程或所有进程
supervisorctl restart all // 重启所有 supervisorctl reatart program-name // 重启某一进程,program-name为[program:xx]中的xx
6. 查看supervisord当前管理的所有进程的状态
supervisorctl status
遇到问题及解决方案
在使用命令supervisorctl start all启动控制进程时,遇到如下错误
unix:///tmp/supervisor.sock no such file
出现上述错误的原因是supervisord并未启动,只要在命令行中使用命令sudo supervisord启动supervisord即可。