zoukankan      html  css  js  c++  java
  • Greenplum 集群部署

    最近开始接触Greenplum,线上也在使用了,感觉还不错,本次介绍一下集群的部署方法。那么Greenplum的架构如下:

                                                         (架构图来源网络)

    简单来说GPDB是一个分布式数据库软件,其可以管理和处理分布在多个不同主机上的海量数据。对于GPDB来说,一个DB实例实际上是由多个独立的PostgreSQL实例组成的,它们分布在不同的物理主机上,协同工作,呈现给用户的是一个DB的效果。Master是GPDB系统的访问入口,其负责处理客户端的连接及SQL 命令、协调系统中的其他Instance(Segment)工作,Segment负责管理和处理用户数据。

    环境准备:

    操作系统:centos 6.3 64位

    master 1台(架构图中的主节点),Standby 1台(架构图中的从节点),Segment 2台。共4台服务器。

         ip       主机名
    10.69.30.251   mdw
    10.36.25.152   smdw
    10.36.25.250   sdw1
    221.5.32.152   sdw2

    1. 修改/etc/hosts文件,添加下面内容(4台服务器相同的配置)

    10.69.30.251  mdw
    10.36.25.152  smdw
    10.36.25.250  sdw1
    10.37.237.152  sdw2

    2. 服务器关闭selinux,防火墙4台服务器相互开放,测试环境可以直接先关闭防火墙。(4台服务器)

    3. 操作系统参数设置

    vim /etc/sysctl.conf   (注:4台服务器)

    kernel.shmmax = 500000000
    kernel.shmmni = 4096
    kernel.shmall = 4000000000
    kernel.sem = 250 512000 100 2048
    kernel.sysrq = 1
    kernel.core_uses_pid = 1
    kernel.msgmnb = 65536
    kernel.msgmax = 65536
    net.ipv4.tcp_syncookies = 1
    net.ipv4.ip_forward = 0
    net.ipv4.conf.default.accept_source_route = 0
    net.ipv4.tcp_tw_recycle = 1
    net.ipv4.tcp_max_syn_backlog = 4096
    net.ipv4.conf.all.arp_filter = 1
    net.ipv4.conf.default.arp_filter = 1
    net.core.netdev_max_backlog = 10000
    vm.overcommit_memory = 2
    kernel.msgmni = 2048
    net.ipv4.ip_local_port_range = 1025 65535

    vim /etc/security/limits.conf  (注:4台服务器)

    * soft nofile 65536
    * hard nofile 65536
    * soft nproc 131072
    * hard nproc 131072

    我的操作系统是centos 6.3,所以还有一个文件要修改:那就是/etc/security/limits.d/90-nproc.conf (注:4台服务器都要修改)

    修改后的内容如下:

    * soft nproc 131072
    * hard nproc 131072
    * soft nofile 1048576
    * hard nofile 1048576

    磁盘预读参数及 deadline算法修改 (注:4台服务器)

    blockdev --setra 65536 /dev/sda
    echo deadline > /sys/block/sda/queue/scheduler

    软件下载地址:https://network.pivotal.io/products/pivotal-gpdb

    在Master主机上安装GP二进制文件,也就是主机名是mdw的服务器。(注:在master上安装即可,后面通过批量的方法安装剩下的服务器)

    我下载的文件是greenplum-db-4.3.7.3-build-2-RHEL5-x86_64.zip

    文件解压之是一个可执行文件,greenplum-db-4.3.7.3-build-2-RHEL5-x86_64.bin,直接执行安装。

    ./greenplum-db-4.3.7.3-build-2-RHEL5-x86_64.bin 

    然后有些说明,一直回车,出现如下图按照提示安装即可完成。

    在master上添加用户

    useradd gpadmin

    确保所有的服务器安装了ed软件,否则后面初始化集群就会报错。

    yum install ed -y

    准备文件,all_hosts ,seg_hosts放到/home/gpadmin。(用于批量安装软件以及后续集群的初始化)

    [gpadmin@mdw ~]$ cat seg_hosts 
    sdw1
    sdw2
    [gpadmin@mdw ~]$ cat all_hosts 
    mdw
    smdw
    sdw1
    sdw2
    [gpadmin@mdw ~]$ 

    建立节点服务器间的信任(我服务器已经打通了信任的,没有打通的会要求输入密码)

    [root@mdw gpadmin]# gpssh-exkeys -f all_hosts
    [STEP 1 of 5] create local ID and authorize on local host
      ... /root/.ssh/id_rsa file exists ... key generation skipped
    
    [STEP 2 of 5] keyscan all hosts and update known_hosts file
    
    [STEP 3 of 5] authorize current user on remote hosts
      ... send to smdw
      ... send to sdw1
      ... send to sdw2
    
    [STEP 4 of 5] determine common authentication file content
    
    [STEP 5 of 5] copy authentication files to all remote hosts
      ... finished key exchange with smdw
      ... finished key exchange with sdw1
      ... finished key exchange with sdw2
    
    [INFO] completed successfully

    批量安装软件(GP)

    cd /home/gpadmin/
    source /usr/local/greenplum-db/greenplum_path.sh gpseginstall -f all_hosts -u gpadmin -p gpadmin

    最后出现如下提示代表批量安装成功。

    20160827:13:58:47:020986 gpseginstall:greenplum_master:root-[INFO]:-SUCCESS -- Requested commands completed

    检查是否安装完成

    su - gpadmin
    source /usr/local/greenplum-db/greenplum_path.sh 
    gpssh -f all_hosts -e ls -l $GPHOME

    输出如下代表成功:

    [ mdw] ls -l /usr/local/greenplum-db/.
    [ mdw] total 276
    [ mdw] drwxr-xr-x 3 gpadmin gpadmin   4096 Feb 25  2016 bin
    [ mdw] drwxr-xr-x 2 gpadmin gpadmin   4096 Feb 25  2016 demo
    [ mdw] drwxr-xr-x 5 gpadmin gpadmin   4096 Feb 25  2016 docs
    [ mdw] drwxr-xr-x 2 gpadmin gpadmin   4096 Feb 25  2016 etc
    [ mdw] drwxr-xr-x 3 gpadmin gpadmin   4096 Feb 25  2016 ext
    [ mdw] -rw-r--r-- 1 gpadmin gpadmin  43025 Feb 25  2016 GPDB-LICENSE.txt
    [ mdw] -rw-rw-r-- 1 gpadmin gpadmin    735 Aug 27 12:30 greenplum_path.sh
    [ mdw] drwxr-xr-x 6 gpadmin gpadmin   4096 Feb 25  2016 include
    [ mdw] drwxr-xr-x 9 gpadmin gpadmin   4096 Feb 25  2016 lib
    [ mdw] -rw-r--r-- 1 gpadmin gpadmin 192912 Feb 25  2016 LICENSE.thirdparty
    [ mdw] drwxr-xr-x 2 gpadmin gpadmin   4096 Aug 27 15:16 sbin
    [ mdw] drwxr-xr-x 4 gpadmin gpadmin   4096 Feb 25  2016 share
    [sdw1] ls -l /usr/local/greenplum-db/.
    [sdw1] total 276
    [sdw1] drwxr-xr-x 3 gpadmin gpadmin   4096 Feb 25  2016 bin
    [sdw1] drwxr-xr-x 2 gpadmin gpadmin   4096 Feb 25  2016 demo
    [sdw1] drwxr-xr-x 5 gpadmin gpadmin   4096 Feb 25  2016 docs
    [sdw1] drwxr-xr-x 2 gpadmin gpadmin   4096 Feb 25  2016 etc
    [sdw1] drwxr-xr-x 3 gpadmin gpadmin   4096 Feb 25  2016 ext
    [sdw1] -rw-r--r-- 1 gpadmin gpadmin  43025 Feb 25  2016 GPDB-LICENSE.txt
    [sdw1] -rw-rw-r-- 1 gpadmin gpadmin    735 Aug 27 12:30 greenplum_path.sh
    [sdw1] drwxr-xr-x 6 gpadmin gpadmin   4096 Feb 25  2016 include
    [sdw1] drwxr-xr-x 9 gpadmin gpadmin   4096 Feb 25  2016 lib
    [sdw1] -rw-r--r-- 1 gpadmin gpadmin 192912 Feb 25  2016 LICENSE.thirdparty
    [sdw1] drwxr-xr-x 2 gpadmin gpadmin   4096 Aug 27 15:16 sbin
    [sdw1] drwxr-xr-x 4 gpadmin gpadmin   4096 Feb 25  2016 share
    [smdw] ls -l /usr/local/greenplum-db/.
    [smdw] total 276
    [smdw] drwxr-xr-x 3 gpadmin gpadmin   4096 Feb 25  2016 bin
    [smdw] drwxr-xr-x 2 gpadmin gpadmin   4096 Feb 25  2016 demo
    [smdw] drwxr-xr-x 5 gpadmin gpadmin   4096 Feb 25  2016 docs
    [smdw] drwxr-xr-x 2 gpadmin gpadmin   4096 Feb 25  2016 etc
    [smdw] drwxr-xr-x 3 gpadmin gpadmin   4096 Feb 25  2016 ext
    [smdw] -rw-r--r-- 1 gpadmin gpadmin  43025 Feb 25  2016 GPDB-LICENSE.txt
    [smdw] -rw-rw-r-- 1 gpadmin gpadmin    735 Aug 27 12:30 greenplum_path.sh
    [smdw] drwxr-xr-x 6 gpadmin gpadmin   4096 Feb 25  2016 include
    [smdw] drwxr-xr-x 9 gpadmin gpadmin   4096 Feb 25  2016 lib
    [smdw] -rw-r--r-- 1 gpadmin gpadmin 192912 Feb 25  2016 LICENSE.thirdparty
    [smdw] drwxr-xr-x 2 gpadmin gpadmin   4096 Aug 27 15:16 sbin
    [smdw] drwxr-xr-x 4 gpadmin gpadmin   4096 Feb 25  2016 share
    [sdw2] ls -l /usr/local/greenplum-db/.
    [sdw2] total 276
    [sdw2] drwxr-xr-x 3 gpadmin gpadmin   4096 Feb 25  2016 bin
    [sdw2] drwxr-xr-x 2 gpadmin gpadmin   4096 Feb 25  2016 demo
    [sdw2] drwxr-xr-x 5 gpadmin gpadmin   4096 Feb 25  2016 docs
    [sdw2] drwxr-xr-x 2 gpadmin gpadmin   4096 Feb 25  2016 etc
    [sdw2] drwxr-xr-x 3 gpadmin gpadmin   4096 Feb 25  2016 ext
    [sdw2] -rw-r--r-- 1 gpadmin gpadmin  43025 Feb 25  2016 GPDB-LICENSE.txt
    [sdw2] -rw-rw-r-- 1 gpadmin gpadmin    735 Aug 27 12:30 greenplum_path.sh
    [sdw2] drwxr-xr-x 6 gpadmin gpadmin   4096 Feb 25  2016 include
    [sdw2] drwxr-xr-x 9 gpadmin gpadmin   4096 Feb 25  2016 lib
    [sdw2] -rw-r--r-- 1 gpadmin gpadmin 192912 Feb 25  2016 LICENSE.thirdparty
    [sdw2] drwxr-xr-x 2 gpadmin gpadmin   4096 Aug 27 15:16 sbin
    [sdw2] drwxr-xr-x 4 gpadmin gpadmin   4096 Feb 25  2016 share
    View Code

    创建存储目录

    master,standby 操作命令:

    mkdir -p /data/master
    chown -R gpadmin.gpadmin /data/master/

    在master上用下面命令批量操作segment,建立目录,改权限

    gpssh -f seg_hosts -e 'mkdir -p /data/primary'
    gpssh -f seg_hosts -e 'mkdir -p /data/mirror'
    gpssh -f seg_hosts -e 'chown -R gpadmin.gpadmin /data/primary'
    gpssh -f seg_hosts -e 'chown -R gpadmin.gpadmin /data/mirror'

    设置NTP同步

    在每个Segment主机,以root登录编辑/etc/ntp.conf文件。设置第一个server参数指向Master主机,第二个server参数指向Standby主机。如下面:

    server mdw prefer
    server smdw

    在Standby主机,以root登录编辑/etc/ntp.conf文件。设置第一个server参数指向Master主机,第二个参数指向数据中心的时间服务器。

    server mdw prefer

    在Master主机,使用NTP守护进程同步所有Segment主机的系统时钟。例如,使用gpssh来完成:

    gpssh -f all_hosts -v -e 'ntpd'

    输出如下代表成功:

    [root@mdw gpadmin]# gpssh -f all_hosts -v -e 'ntpd'
    [WARN] Reference default values as $MASTER_DATA_DIRECTORY/gpssh.conf could not be found
    Using delaybeforesend 0.05 and prompt_validation_timeout 1.0
    
    [Reset ...]
    [INFO] login mdw
    [INFO] login smdw
    [INFO] login sdw1
    [INFO] login sdw2
    [ mdw] ntpd
    [smdw] ntpd
    [sdw1] ntpd
    [sdw2] ntpd
    [INFO] completed successfully
    
    [Cleanup...]

    要配置集群自动同步系统时钟,应开启ntpd服务,并设置为开机时自动运行,4台服务器都启动:

     /etc/init.d/ntpd start 

    检查环境(我这里检查一下操作系统参数设置是否ok,网络,io,都可以检查的)

    [root@mdw gpadmin]# gpcheck -f all_hosts -m mdw
    20160827:16:47:47:022142 gpcheck:mdw:root-[INFO]:-dedupe hostnames
    20160827:16:47:47:022142 gpcheck:mdw:root-[INFO]:-Detected platform: Generic Linux Cluster
    20160827:16:47:47:022142 gpcheck:mdw:root-[INFO]:-generate data on servers
    20160827:16:47:47:022142 gpcheck:mdw:root-[INFO]:-copy data files from servers
    20160827:16:47:48:022142 gpcheck:mdw:root-[INFO]:-delete remote tmp files
    20160827:16:47:48:022142 gpcheck:mdw:root-[INFO]:-Using gpcheck config file: /usr/local/greenplum-db/./etc/gpcheck.cnf
    20160827:16:47:48:022142 gpcheck:mdw:root-[INFO]:-GPCHECK_NORMAL
    20160827:16:47:48:022142 gpcheck:mdw:root-[INFO]:-gpcheck completing...

    在master上进行下面的操作:

    创建Greenplum数据库配置文件,以gpadmin用户登录

    su - gpadmin
    cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config   /home/gpadmin/gpinitsystem_config
    chmod 775 gpinitsystem_config

    设置所有必须的参数

    ARRAY_NAME="EMC Greenplum DW"
    SEG_PREFIX=gpseg
    PORT_BASE=40000
    declare -a DATA_DIRECTORY=(/data/primary)
    MASTER_HOSTNAME=mdw
    MASTER_DIRECTORY=/data/master
    MASTER_PORT=5432
    TRUSTED_SHELL=ssh
    CHECK_POINT_SEGMENTS=8
    ENCODING=UNICODE

    设置可选参数

    MIRROR_PORT_BASE=50000
    REPLICATION_PORT_BASE=41000
    declare -a MIRROR_DATA_DIRECTORY=(/data/mirror)

    运行初始化工具初始化数据库

    su - gpadmin
    source /usr/local/greenplum-db/greenplum_path.sh 
    gpinitsystem -c gpinitsystem_config -h seg_hosts

    正常情况下初始化完成以后最后的日志没有任何报错,而且gp进程也是启动了的。

    初始化日志:

    20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-Review options for gpinitstandby
    20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------
    20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-The Master /data/master/gpseg-1/pg_hba.conf post gpinitsystem
    20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-has been configured to allow all hosts within this new
    20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
    20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-new array must be explicitly added to this file
    20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
    20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-located in the /usr/local/greenplum-db/./docs directory
    20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------

    启动进程:

    [root@mdw gpadmin]# ps -ef | grep postgres
    gpadmin  18082     1  0 16:23 ?        00:00:00 /usr/local/greenplum-db-4.3.7.3/bin/postgres -D /data/master/gpseg-1 -p 5432 -b 1 -z 2 --silent-mode=true -i -M master -C -1 -x 0 -E
    gpadmin  18083 18082  0 16:23 ?        00:00:00 postgres: port  5432, master logger process                                                                                         
    gpadmin  18086 18082  0 16:23 ?        00:00:00 postgres: port  5432, stats collector process                                                                                       
    gpadmin  18087 18082  0 16:23 ?        00:00:00 postgres: port  5432, writer process                                                                                                
    gpadmin  18088 18082  0 16:23 ?        00:00:00 postgres: port  5432, checkpoint process                                                                                            
    gpadmin  18089 18082  0 16:23 ?        00:00:00 postgres: port  5432, seqserver process                                                                                             
    gpadmin  18090 18082  0 16:23 ?        00:00:00 postgres: port  5432, ftsprobe process                                                                                              
    gpadmin  18091 18082  0 16:23 ?        00:00:00 postgres: port  5432, sweeper process                                                                                               
    root     22716 20223  0 16:52 pts/0    00:00:00 grep postgres
    [root@mdw gpadmin]# 


    现在只有1个master,2个segment,没有standby,那么接下来把standby加入集群。

    在master服务器上执行

    gpinitstandby -s smdw

    输出如下:

    [gpadmin@mdw ~]$ gpinitstandby -s smdw
    20160827:16:59:24:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Validating environment and parameters for standby initialization...
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Checking for filespace directory /data/master/gpseg-1 on smdw
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master initialization parameters
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master hostname               = mdw
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master data directory         = /data/master/gpseg-1
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master port                   = 5432
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master hostname       = smdw
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master port           = 5432
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master data directory = /data/master/gpseg-1
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum update system catalog         = On
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:- Filespace locations
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
    20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-pg_system -> /data/master/gpseg-1
    Do you want to continue with standby master initialization? Yy|Nn (default=N):
    > y
    20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Syncing Greenplum Database extensions to standby
    20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-The packages on smdw are consistent.
    20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Adding standby master to catalog...
    20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Database catalog updated successfully.
    20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Updating pg_hba.conf file...
    20160827:16:59:37:023346 gpinitstandby:mdw:gpadmin-[INFO]:-pg_hba.conf files updated successfully.
    20160827:16:59:39:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Updating filespace flat files...
    20160827:16:59:39:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Filespace flat file updated successfully.
    20160827:16:59:39:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Starting standby master
    20160827:16:59:39:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Checking if standby master is running on host: smdw  in directory: /data/master/gpseg-1
    20160827:16:59:40:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Cleaning up pg_hba.conf backup files...
    20160827:16:59:46:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Backup files of pg_hba.conf cleaned up successfully.
    20160827:16:59:46:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Successfully created standby master on smdw
    View Code

    设置环境变量,master,standby都设置。

    1.  以gpadmin用户登录:

    su - gpadmin

    2.  打开用户启动文件(如.bashrc)。比如:

    vim ~/.bashrc

    3.  添加新行用以加载greenplum_path.sh文件和设置MASTER_DATA_DIRECTORY环境变量。例如:

    source /usr/local/greenplum-db/greenplum_path.sh
    export MASTER_DATA_DIRECTORY=/data/master/gpseg-1

    启动和停止数据库测试是否能正常启动和关闭,命令如下

    gpstart
    gpstop 

    到此 greenplum 就部署完成了。下面进行一些简单的测试。

    登录数据库:psql -d postgres

    建表,插入,查询

    postgres=# create table student ( no int primary key,student_name varchar(40),age int);
    NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "student_pkey" for table "student"
    CREATE TABLE
    postgres=# insert into student values(1,'yayun',18);
    INSERT 0 1
    postgres=# select * from student;
     no | student_name | age 
    ----+--------------+-----
      1 | yayun        |  18
    (1 row)
  • 相关阅读:
    Android中GC_EXTERNAL_ALLOC的含义
    Phonegap开发的前后台数据交互
    代码管理工具TortoiseSVN
    14款响应式前端开发框架
    简化工作流程,10款必备的HTML5开发工具
    [C#.net]处理UTF-8文件乱码
    [Oracle]ORA-14400:插入的分区关键字未映射到任何分区
    [网络]10M、100M、1000M网线的水晶头接法
    [Office]Execl取消保护密码
    SLI的相关学习
  • 原文地址:https://www.cnblogs.com/gomysql/p/5812591.html
Copyright © 2011-2022 走看看