Greenplum Installation
Estimating Storage Capacity
Calculating Usable Space
Step 1: raw storage capacity = disk size * number of disks
Step 2: with RAID10, formatted disk space = (raw storage capacity * 0.9) / 2
Step 3: usable disk space = formatted disk space * 0.7
Step 4: space available for user data:
With mirrors: (2 * user data) + user data/3 = usable disk space
Without mirrors: user data + user data/3 = usable disk space
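The steps above can be sketched as a small Python helper (the function names and the example disk sizes are illustrative, not from the original text):

```python
def usable_disk_space(disk_size_gb, num_disks):
    """Steps 1-3: raw capacity -> RAID10 formatted space -> 70% usable."""
    raw = disk_size_gb * num_disks        # step 1: raw storage capacity
    formatted = (raw * 0.9) / 2           # step 2: RAID10 + ~10% formatting loss
    return formatted * 0.7                # step 3: keep 30% headroom

def max_user_data(usable_gb, mirrored=True):
    """Step 4: solve the equations for user data, reserving user_data/3
    of extra work space."""
    if mirrored:
        # (2 * U) + U/3 = usable  =>  U = usable * 3/7
        return usable_gb * 3 / 7
    # U + U/3 = usable  =>  U = usable * 3/4
    return usable_gb * 3 / 4

# Example: 8 disks of 1000 GB each
usable = usable_disk_space(1000, 8)       # ~2520 GB
print(max_user_data(usable))              # ~1080 GB of user data with mirrors
```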
Calculating User Data Size
On average, the actual disk space consumed is about user data * 1.4.
Page overhead: 20 bytes per 32KB page
Row overhead: 24 bytes per row (4 bytes per row for append-only tables)
Index overhead:
B-tree: unique values * (data type size + 24 bytes)
Bitmap: (unique values * rows * 1 bit * compression ratio / 8) + (unique values * 32)
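As a quick sanity check, these overhead formulas can be written out directly (a sketch; the function names and example values are illustrative):

```python
def actual_disk_usage(user_data_bytes):
    # On average, data occupies about 1.4x its raw size on disk
    return user_data_bytes * 1.4

def btree_index_bytes(unique_values, datatype_size):
    # B-tree: unique values * (data type size + 24 bytes)
    return unique_values * (datatype_size + 24)

def bitmap_index_bytes(unique_values, rows, compression_ratio=1.0):
    # Bitmap: (unique values * rows * 1 bit * compression ratio / 8)
    #         + (unique values * 32)
    return (unique_values * rows * 1 * compression_ratio) / 8 \
        + unique_values * 32

# Example: a bigint (8-byte) B-tree index over 1,000,000 unique values
print(btree_index_bytes(1_000_000, 8))    # 32,000,000 bytes
```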
Calculating Space Requirements for Metadata and Logs
System metadata: 20MB
Write-ahead log (WAL): the WAL is split into 64MB files, and the number of WAL files is at most 2 * checkpoint_segments + 1. With the default checkpoint_segments of 8, each instance therefore needs up to 1088MB of WAL space.
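The 1088MB figure falls out of the formula directly (a one-line sketch; the function name is illustrative):

```python
def wal_space_mb(checkpoint_segments=8, wal_file_mb=64):
    # At most 2 * checkpoint_segments + 1 WAL files, each 64MB
    return (2 * checkpoint_segments + 1) * wal_file_mb

print(wal_space_mb())  # (2*8 + 1) * 64 = 1088
```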
Greenplum database log files: log rotation
Performance monitoring data
Test Environment
Hardware Environment
VMware virtualization software 10.0, running three Linux virtual machines: Red Hat Enterprise Linux Server release 5.4
Database: greenplum-db-4.2.8.0-build-1-RHEL5-x86_64.zip
Network Adapter Settings
Virtual Machine Configuration
Basic Configuration
Node   | IP             | Memory | Disk
master | 192.168.80.200 | 1024M  | 20G
slave1 | 192.168.80.201 | 1024M  | 20G
slave2 | 192.168.80.202 | 1024M  | 20G
Setting the IP address from the command line:
vi /etc/sysconfig/network-scripts/ifcfg-eth0
Fill in the IP address, netmask, gateway, DNS, and so on:
DEVICE=eth0
HWADDR=00:0c:29:b7:7d:69
ONBOOT=yes
BOOTPROTO=static
DNS1=192.168.80.200
IPV6INIT=no
USERCTL=no
IPADDR=192.168.80.200
NETMASK=255.255.255.0
GATEWAY=192.168.80.1
Disabling the Firewall
[root@localhost ~]# service iptables status
Table: filter
Chain INPUT (policy ACCEPT)
num  target     prot opt source     destination
1    RH-Firewall-1-INPUT  all  --  0.0.0.0/0  0.0.0.0/0

Chain FORWARD (policy ACCEPT)
num  target     prot opt source     destination
1    RH-Firewall-1-INPUT  all  --  0.0.0.0/0  0.0.0.0/0

Chain OUTPUT (policy ACCEPT)
num  target     prot opt source     destination

Chain RH-Firewall-1-INPUT (2 references)
num  target  prot opt source     destination
1    ACCEPT  all  --  0.0.0.0/0  0.0.0.0/0
2    ACCEPT  icmp --  0.0.0.0/0  0.0.0.0/0  icmp type 255
3    ACCEPT  esp  --  0.0.0.0/0  0.0.0.0/0
4    ACCEPT  ah   --  0.0.0.0/0  0.0.0.0/0
5    ACCEPT  udp  --  0.0.0.0/0  224.0.0.251  udp dpt:5353
6    ACCEPT  udp  --  0.0.0.0/0  0.0.0.0/0  udp dpt:631
7    ACCEPT  tcp  --  0.0.0.0/0  0.0.0.0/0  tcp dpt:631
8    ACCEPT  all  --  0.0.0.0/0  0.0.0.0/0  state RELATED,ESTABLISHED
9    ACCEPT  tcp  --  0.0.0.0/0  0.0.0.0/0  state NEW tcp dpt:21
10   ACCEPT  tcp  --  0.0.0.0/0  0.0.0.0/0  state NEW tcp dpt:22
11   ACCEPT  tcp  --  0.0.0.0/0  0.0.0.0/0  state NEW tcp dpt:23
12   ACCEPT  tcp  --  0.0.0.0/0  0.0.0.0/0  state NEW tcp dpt:80
13   ACCEPT  tcp  --  0.0.0.0/0  0.0.0.0/0  state NEW tcp dpt:443
14   REJECT  all  --  0.0.0.0/0  0.0.0.0/0  reject-with icmp-host-prohibited

[root@localhost ~]# service iptables stop
Flushing firewall rules: [ OK ]
Setting chains to policy ACCEPT: filter [ OK ]
Unloading iptables modules: [ OK ]
Set the hostname and the IP mapping:
[root@master ~]# vi /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=master
[root@localhost ~]# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1       localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
192.168.80.200  master
System Parameter Configuration
These settings cover shared memory, networking, user limits, and so on. Modify or add the following in /etc/sysctl.conf:

# Kernel sysctl configuration file for Red Hat Linux
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# for greenplum begin
xfs_mount_options = rw,noatime,inode64,allocsize=16m
# Controls the maximum shared memory segment size, in bytes
kernel.shmmax = 500000000
# Controls the maximum number of shared memory segment identifiers
kernel.shmmni = 4096
# Controls the total amount of shared memory, in pages
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 1
# Controls whether core dumps will append the PID to the core filename
# Useful for debugging multi-threaded applications
kernel.core_uses_pid = 1
# Controls the maximum number of bytes in a message queue
kernel.msgmnb = 65536
# Controls the maximum size of a single message, in bytes
kernel.msgmax = 65536
# Controls the maximum number of message queue identifiers
kernel.msgmni = 2048
# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1
# Controls IP packet forwarding
net.ipv4.ip_forward = 0
# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 1025 65535
net.core.netdev_max_backlog = 10000
vm.overcommit_memory = 2
# for greenplum end
System Settings (all nodes)
Configure /etc/security/limits.conf:
# for greenplum begin
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072
# for greenplum end
Setting the Disk I/O Scheduler
See this introductory article: http://blog.csdn.net/theorytree/article/details/6259104
The deadline scheduler balances I/O scheduling against process scheduling, so waiting processes are not repeatedly starved.
[root@master ~]# cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq]
[root@master ~]# cat /sys/block/fd0/queue/scheduler
noop anticipatory deadline [cfq]
[root@master ~]# dmesg | grep -i scheduler
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
[root@master ~]# echo deadline > /sys/block/sda/queue/scheduler
[root@master ~]# echo deadline > /sys/block/fd0/queue/scheduler
[root@master ~]# cat /sys/block/sda/queue/scheduler
noop anticipatory [deadline] cfq
[root@master ~]# cat /sys/block/fd0/queue/scheduler
noop anticipatory [deadline] cfq
Set the read-ahead value to 16384:
A data warehouse's defining trait is that it stores large volumes of historical data and performs heavy data operations (inserts, deletes, updates, and queries); the larger the read-ahead setting, the better the read performance. 16384 is the minimum value Greenplum requires.
[root@master ~]# /sbin/blockdev --setra 16384 /dev/sda
[root@master ~]# /sbin/blockdev --setra 16384 /dev/sda1
[root@master ~]# /sbin/blockdev --setra 16384 /dev/sda2
[root@master ~]# /sbin/blockdev --setra 16384 /dev/sda3
[root@master ~]# fdisk -l

Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1         653     5245191   83  Linux
/dev/sda2             654         914     2096482+  82  Linux swap / Solaris
/dev/sda3             915        1045     1052257+  83  Linux
Edit /etc/hosts and add the following:
127.0.0.1       localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
192.168.80.200  master
192.168.80.201  slave1
192.168.80.202  slave2
Installing the Greenplum Software on the Master Node
Preparing the media
Download: http://gopivotal.com/products/pivotal-greenplum-database
Unpack:
# unzip greenplum-db-4.2.8.0-build-1-RHEL5-x86_64.zip
Install:
# /bin/bash greenplum-db-4.2.8.0-build-1-RHEL5-x86_64.bin
I HAVE READ AND AGREE TO THE TERMS OF THE ABOVE PIVOTAL SOFTWARE LICENSE AGREEMENT.
********************************************************************************
Do you accept the Pivotal Database license agreement? [yes|no]
********************************************************************************
yes
********************************************************************************
Provide the installation path for Greenplum Database or press ENTER to
accept the default installation path: /usr/local/greenplum-db-4.2.8.0
********************************************************************************
Install Greenplum Database into </usr/local/greenplum-db-4.2.8.0>? [yes|no]
********************************************************************************
yes
********************************************************************************
/usr/local/greenplum-db-4.2.8.0 does not exist.
Create /usr/local/greenplum-db-4.2.8.0 ? [yes|no]
(Selecting no will exit the installer)
********************************************************************************
yes
********************************************************************************
[Optional] Provide the path to a previous installation of Greenplum Database,
or press ENTER to skip this step. e.g. /usr/local/greenplum-db-4.1.1.3

This installation step will migrate any Greenplum Database extensions from the
provided path to the version currently being installed. This step is optional
and can be run later with:
 gppkg --migrate <path_to_old_gphome> /usr/local/greenplum-db-4.2.8.0
********************************************************************************

Extracting product to /usr/local/greenplum-db-4.2.8.0
Skipping migration of Greenplum Database extensions...
********************************************************************************
Installation complete.
Greenplum Database is installed in /usr/local/greenplum-db-4.2.8.0
Pivotal Greenplum documentation is available for download at http://docs.gopivotal.com/gpdb
********************************************************************************
Installation Result
[root@master greenplum-db-4.2.8.0]# pwd
/usr/local/greenplum-db-4.2.8.0
[root@master greenplum-db-4.2.8.0]# ls -ltr
total 328
drwxr-xr-x 3  510  510   4096 Jun 18  2014 share
drwxr-xr-x 2  510  510   4096 Jun 18  2014 demo
drwxr-xr-x 5  510  510   4096 Jun 18  2014 docs
drwxr-xr-x 7  510  510   4096 Jun 18  2014 lib
drwxr-xr-x 6  510  510   4096 Jun 18  2014 include
drwxr-xr-x 3  510  510   4096 Jun 18  2014 ext
drwxr-xr-x 2  510  510   4096 Jun 18  2014 etc
drwxr-xr-x 2  510  510   4096 Jun 18  2014 sbin
-rw-rw-r-- 1  510  510 193083 Jun 18  2014 LICENSE.thirdparty
-rw-rw-r-- 1  510  510  43025 Jun 18  2014 GPDB-LICENSE.txt
drwxr-xr-x 3  510  510   4096 Jun 18  2014 bin
-rw-r--r-- 1 root root    676 Mar 20 19:41 greenplum_path.sh
File descriptions:
greenplum_path.sh: Greenplum environment variable file
GPDB-LICENSE.txt: Greenplum license agreement
bin: management utilities and client/server programs
demo: demo programs
docs: documentation
etc: sample OpenSSL configuration
ext: bundled programs used by some Greenplum utilities
include: C header files
lib: library files
sbin: supporting/internal scripts and programs
share: shared files
Initial Greenplum Configuration on All Hosts
Source the environment:
# source /usr/local/greenplum-db/greenplum_path.sh
GPHOME=/usr/local/greenplum-db-4.2.8.0

# Replace with symlink path if it is present and correct
if [ -h ${GPHOME}/../greenplum-db ]; then
    GPHOME_BY_SYMLINK=`(cd ${GPHOME}/../greenplum-db/ && pwd -P)`
    if [ x"${GPHOME_BY_SYMLINK}" = x"${GPHOME}" ]; then
        GPHOME=`(cd ${GPHOME}/../greenplum-db/ && pwd -L)`/.
    fi
    unset GPHOME_BY_SYMLINK
fi
PATH=$GPHOME/bin:$GPHOME/ext/python/bin:$PATH
LD_LIBRARY_PATH=$GPHOME/lib:$GPHOME/ext/python/lib:$LD_LIBRARY_PATH
PYTHONPATH=$GPHOME/lib/python
PYTHONHOME=$GPHOME/ext/python
OPENSSL_CONF=$GPHOME/etc/openssl.cnf
export GPHOME
export PATH
export LD_LIBRARY_PATH
export PYTHONPATH
export PYTHONHOME
export OPENSSL_CONF
Create the host file all_hosts with the following contents:
master
slave1
slave2
env
[root@master greenplum-db-4.2.8.0]# env
HOSTNAME=master
TERM=vt100
SHELL=/bin/bash
HISTSIZE=1000
SSH_CLIENT=192.168.80.120 49852 22
GPHOME=/usr/local/greenplum-db/.
OLDPWD=/root/installer
SSH_TTY=/dev/pts/2
USER=root
LD_LIBRARY_PATH=/usr/local/greenplum-db/./lib:/usr/local/greenplum-db/./ext/python/lib:
LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:
MAIL=/var/spool/mail/root
PATH=/usr/local/greenplum-db/./bin:/usr/local/greenplum-db/./ext/python/bin:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
INPUTRC=/etc/inputrc
PWD=/usr/local/greenplum-db-4.2.8.0
LANG=en_US.UTF-8
PYTHONHOME=/usr/local/greenplum-db/./ext/python
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
SHLVL=1
HOME=/root
OPENSSL_CONF=/usr/local/greenplum-db/./etc/openssl.cnf
PYTHONPATH=/usr/local/greenplum-db/./lib/python
LOGNAME=root
CVS_RSH=ssh
SSH_CONNECTION=192.168.80.120 49852 192.168.80.200 22
LESSOPEN=|/usr/bin/lesspipe.sh %s
G_BROKEN_FILENAMES=1
_=/bin/env
[root@master greenplum-db-4.2.8.0]# ssh slave1
The authenticity of host 'slave1 (192.168.80.201)' can't be established.
RSA key fingerprint is 04:6d:9c:fc:68:e3:9a:24:dc:11:ff:25:14:9e:d1:5b.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave1,192.168.80.201' (RSA) to the list of known hosts.
root@slave1's password:
Last login: Fri Mar 20 17:49:51 2015 from 192.168.80.120
Running the gpseginstall Utility
# gpseginstall -f all_hosts -u gpadmin -p gpadmin
[root@master greenplum-db]# gpseginstall -f all_hosts -u gpadmin -p gpadmin
20150320:19:58:32:003371 gpseginstall:master:root-[INFO]:-Installation Info:
link_name greenplum-db
binary_path /usr/local/greenplum-db-4.2.8.0
binary_dir_location /usr/local
binary_dir_name greenplum-db-4.2.8.0
20150320:19:58:32:003371 gpseginstall:master:root-[INFO]:-check cluster password access
*** Enter password for slave1:
20150320:19:58:38:003371 gpseginstall:master:root-[INFO]:-de-duplicate hostnames
20150320:19:58:38:003371 gpseginstall:master:root-[INFO]:-master hostname: master
20150320:19:58:38:003371 gpseginstall:master:root-[INFO]:-check for user gpadmin on cluster
20150320:19:58:38:003371 gpseginstall:master:root-[INFO]:-add user gpadmin on master
20150320:19:58:41:003371 gpseginstall:master:root-[INFO]:-add user gpadmin on cluster
20150320:19:58:42:003371 gpseginstall:master:root-[INFO]:-chown -R gpadmin:gpadmin /usr/local/greenplum-db
20150320:19:58:43:003371 gpseginstall:master:root-[INFO]:-chown -R gpadmin:gpadmin /usr/local/greenplum-db-4.2.8.0
20150320:19:58:43:003371 gpseginstall:master:root-[INFO]:-rm -f /usr/local/greenplum-db-4.2.8.0.tar; rm -f /usr/local/greenplum-db-4.2.8.0.tar.gz
20150320:19:58:43:003371 gpseginstall:master:root-[INFO]:-cd /usr/local; tar cf greenplum-db-4.2.8.0.tar greenplum-db-4.2.8.0
20150320:19:58:50:003371 gpseginstall:master:root-[INFO]:-gzip /usr/local/greenplum-db-4.2.8.0.tar
20150320:19:59:04:003371 gpseginstall:master:root-[INFO]:-remote command: mkdir -p /usr/local
20150320:19:59:04:003371 gpseginstall:master:root-[INFO]:-remote command: rm -rf /usr/local/greenplum-db-4.2.8.0
20150320:19:59:04:003371 gpseginstall:master:root-[INFO]:-scp software to remote location
20150320:19:59:34:003371 gpseginstall:master:root-[INFO]:-remote command: gzip -f -d /usr/local/greenplum-db-4.2.8.0.tar.gz
20150320:19:59:39:003371 gpseginstall:master:root-[INFO]:-md5 check on remote location
20150320:19:59:41:003371 gpseginstall:master:root-[INFO]:-remote command: cd /usr/local; tar xf greenplum-db-4.2.8.0.tar
20150320:19:59:59:003371 gpseginstall:master:root-[INFO]:-remote command: rm -f /usr/local/greenplum-db-4.2.8.0.tar
20150320:20:00:01:003371 gpseginstall:master:root-[INFO]:-remote command: cd /usr/local; rm -f greenplum-db; ln -fs greenplum-db-4.2.8.0 greenplum-db
20150320:20:00:01:003371 gpseginstall:master:root-[INFO]:-remote command: chown -R gpadmin:gpadmin /usr/local/greenplum-db
20150320:20:00:02:003371 gpseginstall:master:root-[INFO]:-remote command: chown -R gpadmin:gpadmin /usr/local/greenplum-db-4.2.8.0
20150320:20:00:02:003371 gpseginstall:master:root-[INFO]:-rm -f /usr/local/greenplum-db-4.2.8.0.tar.gz
20150320:20:00:02:003371 gpseginstall:master:root-[INFO]:-Changing system passwords ...
20150320:20:00:05:003371 gpseginstall:master:root-[INFO]:-exchange ssh keys for user root
20150320:20:00:16:003371 gpseginstall:master:root-[INFO]:-exchange ssh keys for user gpadmin
20150320:20:00:18:003371 gpseginstall:master:root-[INFO]:-/usr/local/greenplum-db/./sbin/gpfixuserlimts -f /etc/security/limits.conf -u gpadmin
20150320:20:00:20:003371 gpseginstall:master:root-[INFO]:-remote command: . /usr/local/greenplum-db/./greenplum_path.sh; /usr/local/greenplum-db/./sbin/gpfixuserlimts -f /etc/security/limits.conf -u gpadmin
20150320:20:00:21:003371 gpseginstall:master:root-[INFO]:-version string on master: gpssh version 4.2.8.0 build 1
20150320:20:00:21:003371 gpseginstall:master:root-[INFO]:-remote command: . /usr/local/greenplum-db/./greenplum_path.sh; /usr/local/greenplum-db/./bin/gpssh --version
20150320:20:00:21:003371 gpseginstall:master:root-[INFO]:-remote command: . /usr/local/greenplum-db-4.2.8.0/greenplum_path.sh; /usr/local/greenplum-db-4.2.8.0/bin/gpssh --version
20150320:20:00:27:003371 gpseginstall:master:root-[INFO]:-SUCCESS -- Requested commands completed
Initial Greenplum Configuration on All Hosts
Switch to the gpadmin user and source the environment variables:
$ su - gpadmin
$ source /usr/local/greenplum-db/greenplum_path.sh
Use gpssh to test passwordless login to all hosts:
$ gpssh -f host_list -e 'ls -l $GPHOME'
[gpadmin@master greenplum-db]$ gpssh -f all_hosts -e ls -l $GPHOME
[master] ls -l /usr/local/greenplum-db/.
[master] total 336
......
[slave1] ls -l /usr/local/greenplum-db/.
[slave1] total 336
......
[slave2] ls -l /usr/local/greenplum-db/.
[slave2] total 336
......
Append the line sourcing greenplum_path.sh (". /usr/local/greenplum-db/greenplum_path.sh") to the end of the .bashrc file and copy it to the segment servers:
[gpadmin@master ~]$ cat .bashrc
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi
. /usr/local/greenplum-db/greenplum_path.sh
# User specific aliases and functions
[gpadmin@master ~]$ env
HOSTNAME=master
SHELL=/bin/bash
TERM=vt100
HISTSIZE=1000
GPHOME=/usr/local/greenplum-db/.
USER=gpadmin
LD_LIBRARY_PATH=/usr/local/greenplum-db/./lib:/usr/local/greenplum-db/./ext/python/lib:
LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:
MAIL=/var/spool/mail/gpadmin
PATH=/usr/local/greenplum-db/./bin:/usr/local/greenplum-db/./ext/python/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/home/gpadmin/bin
INPUTRC=/etc/inputrc
PWD=/home/gpadmin
LANG=en_US.UTF-8
PYTHONHOME=/usr/local/greenplum-db/./ext/python
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
SHLVL=1
HOME=/home/gpadmin
OPENSSL_CONF=/usr/local/greenplum-db/./etc/openssl.cnf
PYTHONPATH=/usr/local/greenplum-db/./lib/python
LOGNAME=gpadmin
CVS_RSH=ssh
LESSOPEN=|/usr/bin/lesspipe.sh %s
G_BROKEN_FILENAMES=1
_=/bin/env
[gpadmin@master ~]$ scp .bashrc slave1:`pwd`
.bashrc                         100%  167     0.2KB/s   00:00
[gpadmin@master ~]$ scp .bashrc slave2:`pwd`
.bashrc                         100%  167     0.2KB/s   00:00
Verifying OS Settings
Creating Storage Areas
a) Create the master data storage area
# mkdir -p /data/master
b) Change ownership of the directory
# chown gpadmin /data/master
c) Create a file seg_hosts listing all segment hosts
[root@master ~]# mkdir /tmp/greenplum
[root@master ~]# cd /tmp/greenplum/
[root@master greenplum]# vi seg_hosts
slave1
slave2
"seg_hosts" [New] 2L, 14C written
[root@master greenplum]# cat seg_hosts
slave1
slave2
d) Use gpssh to create the primary and mirror data directories on all segment hosts
[root@master greenplum]# . /usr/local/greenplum-db/greenplum_path.sh
[root@master greenplum]# gpssh -f seg_hosts -e 'mkdir -p /data/primary'
[slave1] mkdir -p /data/primary
[slave2] mkdir -p /data/primary
[root@master greenplum]# gpssh -f seg_hosts -e 'mkdir -p /data/mirror'
[slave1] mkdir -p /data/mirror
[slave2] mkdir -p /data/mirror
[root@master greenplum]# gpssh -f seg_hosts -e 'chown gpadmin /data/primary'
[slave1] chown gpadmin /data/primary
[slave2] chown gpadmin /data/primary
[root@master greenplum]# gpssh -f seg_hosts -e 'chown gpadmin /data/mirror'
[slave1] chown gpadmin /data/mirror
[slave2] chown gpadmin /data/mirror
Synchronizing System Time
a) On the master host, edit /etc/ntp.conf to set the following:
[root@master greenplum]# cat /etc/ntp.conf
# Permit time synchronization with our time source, but do not
# permit the source to query or modify the service on this system.
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery

# Permit all access over the loopback interface. This could
# be tightened as well, but to do so would effect some of
# the administrative functions.
restrict 127.0.0.1
restrict -6 ::1

# Hosts on local network are less restricted.
#restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server 0.rhel.pool.ntp.org
server 1.rhel.pool.ntp.org
server 2.rhel.pool.ntp.org

#broadcast 192.168.1.255 key 42         # broadcast server
#broadcastclient                        # broadcast client
#broadcast 224.0.1.1 key 42             # multicast server
#multicastclient 224.0.1.1              # multicast client
#manycastserver 239.255.254.254         # manycast server
#manycastclient 239.255.254.254 key 42  # manycast client

# Undisciplined Local Clock. This is a fake driver intended for backup
# and when no outside source of synchronized time is available.
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10

# Drift file. Put this in a directory which the daemon can write to.
# No symbolic links allowed, either, since the daemon updates the file
# by creating a temporary in the same directory and then rename()'ing
# it to the file.
driftfile /var/lib/ntp/drift

# Key file containing the keys and key identifiers used when operating
# with symmetric key cryptography.
keys /etc/ntp/keys

# Specify the key identifiers which are trusted.
#trustedkey 4 8 42

# Specify the key identifier to use with the ntpdc utility.
#requestkey 8

# Specify the key identifier to use with the ntpq utility.
#controlkey 8
b) On the segment hosts, edit /etc/ntp.conf:
# Undisciplined Local Clock. This is a fake driver intended for backup
# and when no outside source of synchronized time is available.
#server 127.127.1.0 # local clock
server master # local clock
fudge 127.127.1.0 stratum 10
c) On the master host, synchronize the system clocks via the NTP daemon:
[root@master greenplum]# gpssh -f all_hosts -v -e 'ntpd'
[Reset ...]
[INFO] login master
[INFO] login slave1
[INFO] login slave2
[master] ntpd
[slave1] ntpd
[slave2] ntpd
[INFO] completed successfully
[Cleanup...]
Verifying OS Settings
[root@master greenplum]# gpcheck -f all_hosts -m master
.......
20150320:21:20:17:004588 gpcheck:master:root-[INFO]:-gpcheck completing...
[root@master greenplum]# gpcheck -f all_hosts -m master
......
20150320:21:23:52:004727 gpcheck:master:root-[INFO]:-gpcheck completing...
[root@master greenplum]# echo deadline > /sys/block/sr0/queue/scheduler
[root@master greenplum]# gpcheck -f all_hosts -m master
......
20150320:21:27:50:004866 gpcheck:master:root-[INFO]:-gpcheck completing...
Creating the Greenplum Database Configuration File
Initializing the Greenplum Database System
a) Log in as the gpadmin user
# su - gpadmin
b) Copy the gpinitsystem_config template
[gpadmin@master ~]$ cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/
[gpadmin@master ~]$ chmod 775 /home/gpadmin/gpinitsystem_config
c) Set all required and optional parameters
[gpadmin@master ~]$ vi gpinitsystem_config
# FILE NAME: gpinitsystem_config
# Configuration file needed by the gpinitsystem

################################################
#### REQUIRED PARAMETERS
################################################

#### Name of this Greenplum system enclosed in quotes.
ARRAY_NAME="EMC Greenplum DW"

#### Naming convention for utility-generated data directories.
SEG_PREFIX=gpseg

#### Base number by which primary segment port numbers
#### are calculated.
PORT_BASE=40000

declare -a DATA_DIRECTORY=(/data/primary)

#### OS-configured hostname or IP address of the master host.
MASTER_HOSTNAME=master

#### File system location where the master data directory
#### will be created.
MASTER_DIRECTORY=/data/master

#### Port number for the master instance.
MASTER_PORT=5432

#### Shell utility used to connect to remote hosts.
TRUSTED_SHELL=ssh

#### Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=8

#### Default server-side character set encoding.
ENCODING=UNICODE

################################################
#### OPTIONAL MIRROR PARAMETERS
################################################

#### Base number by which mirror segment port numbers
#### are calculated.
MIRROR_PORT_BASE=50000

#### Base number by which primary file replication port
#### numbers are calculated.
REPLICATION_PORT_BASE=41000

#### Base number by which mirror file replication port
#### numbers are calculated.
MIRROR_REPLICATION_PORT_BASE=51000

#### File system location(s) where mirror segment data directories
#### will be created. The number of mirror locations must equal the
#### number of primary locations as specified in the
#### DATA_DIRECTORY parameter.
declare -a MIRROR_DATA_DIRECTORY=(/data/mirror)

################################################
#### OTHER OPTIONAL PARAMETERS
################################################

#### Create a database of this name after initialization.
#DATABASE_NAME=name_of_database

#### Specify the location of the host address file here instead of
#### with the -h option of gpinitsystem.
#MACHINE_LIST_FILE=/home/gpadmin/gpconfigs/hostfile_gpinitsystem
Initializing the Database
a) Run the initialization utility
[gpadmin@master ~]$ cp /tmp/greenplum/seg_hosts .
[gpadmin@master ~]$ ls
gpinitsystem_config  seg_hosts
[gpadmin@master ~]$ gpinitsystem -c gpinitsystem_config -h seg_hosts
20150320:21:53:15:005103 gpinitsystem:master:gpadmin-[FATAL]:-Errors generated from parallel processes
20150320:21:53:15:005103 gpinitsystem:master:gpadmin-[INFO]:-Dumped contents of status file to the log file
20150320:21:53:15:005103 gpinitsystem:master:gpadmin-[INFO]:-Building composite backout file
20150320:21:53:15:gpinitsystem:master:gpadmin-[FATAL]:-Failures detected, see log file /home/gpadmin/gpAdminLogs/gpinitsystem_20150320.log for more detail
Script Exiting!
20150320:21:53:15:005103 gpinitsystem:master:gpadmin-[WARN]:-Script has left Greenplum Database in an incomplete state
20150320:21:53:15:005103 gpinitsystem:master:gpadmin-[WARN]:-Run command /bin/bash /home/gpadmin/gpAdminLogs/backout_gpinitsystem_gpadmin_20150320_215158 to remove these changes
20150320:21:53:15:005103 gpinitsystem:master:gpadmin-[INFO]:-Start Function BACKOUT_COMMAND
20150320:21:53:15:005103 gpinitsystem:master:gpadmin-[INFO]:-End Function BACKOUT_COMMAND
After re-synchronizing the time across the servers, initialization completed correctly:
[gpadmin@master ~]$ gpinitsystem -c gpinitsystem_config -h seg_hosts
20150321:14:23:03:032514 gpstart:master:gpadmin-[INFO]:-----------------------------------------------------
20150321:14:23:03:032514 gpstart:master:gpadmin-[INFO]:-Successfully started 2 of 2 segment instances
20150321:14:23:03:032514 gpstart:master:gpadmin-[INFO]:-----------------------------------------------------
20150321:14:23:03:032514 gpstart:master:gpadmin-[INFO]:-Starting Master instance master directory /data/master/gpseg-1
20150321:14:23:04:032514 gpstart:master:gpadmin-[INFO]:-Command pg_ctl reports Master master instance active
Initialization is successful when the following processes appear:
[gpadmin@master ~]$ ps -ef | grep post
gpadmin   1444 15784  0 14:27 pts/1  00:00:00 grep post
gpadmin  32570     1  0 14:23 ?      00:00:00 /usr/local/greenplum-db-4.2.8.0/bin/postgres -D /data/master/gpseg-1 -p 5432 -b 1 -z 2 --silent-mode=true -i -M master -C -1 -x 0 -E
gpadmin  32571 32570  0 14:23 ?      00:00:00 postgres: port 5432, master logger process
gpadmin  32574 32570  0 14:23 ?      00:00:00 postgres: port 5432, stats collector process
gpadmin  32575 32570  0 14:23 ?      00:00:00 postgres: port 5432, writer process
gpadmin  32576 32570  0 14:23 ?      00:00:00 postgres: port 5432, checkpoint process
gpadmin  32577 32570  0 14:23 ?      00:00:00 postgres: port 5432, seqserver process
gpadmin  32578 32570  0 14:23 ?      00:00:00 postgres: port 5432, WAL Send Server process
gpadmin  32579 32570  0 14:23 ?      00:00:00 postgres: port 5432, ftsprobe process
gpadmin  32580 32570  0 14:23 ?      00:00:00 postgres: port 5432, sweeper process
b) Set environment variables
Append "export MASTER_DATA_DIRECTORY=/data/master/gpseg-1" to the end of ~/.bashrc and sync the file to the other nodes.
[gpadmin@master ~]$ cat .bashrc
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi
. /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master/gpseg-1
# User specific aliases and functions
[gpadmin@master ~]$ scp .bashrc slave1:`pwd`
.bashrc                         100%  217     0.2KB/s   00:00
[gpadmin@master ~]$ scp .bashrc slave2:`pwd`
.bashrc                         100%  217     0.2KB/s   00:00
Starting and Stopping the Database
a) Start the database
$ gpstart
[gpadmin@master ~]$ gpstart
20150321:14:49:18:007507 gpstart:master:gpadmin-[INFO]:-Starting gpstart with args:
.......
20150321:14:49:26:007507 gpstart:master:gpadmin-[INFO]:-Database successfully started
[gpadmin@master ~]$ psql -d template1
psql (8.2.15)
Type "help" for help.
template1=#
Note: verify that the firewall is disabled first, or errors will occur.
b) Stop the database
$ gpstop
[gpadmin@master ~]$ gpstop
20150321:14:38:08:004265 gpstop:master:gpadmin-[INFO]:-Starting gpstop with args:
Continue with Greenplum instance shutdown Yy|Nn (default=N):
> y
20150321:14:38:14:004265 gpstop:master:gpadmin-[INFO]:-Successfully shutdown 2 of 2 segment instances
20150321:14:38:14:004265 gpstop:master:gpadmin-[INFO]:-Database successfully shutdown with no errors reported
Source: oschina
Link: https://my.oschina.net/u/4402583/blog/4486349