1. Plan deployment resources
1. Memory: officially recommended 16 GB for each host and 30 GB for each primary.
2. Disk space: 2 GB for the GP software installation; usage of the GP data disks must not exceed 70%.
3. Network: the official recommendation is 10 Gigabit Ethernet, with multiple network ports bonded.
4. File directory: The XFS file system is recommended.
5. RHEL 7 installation requirements:
Operating system version: RHEL7.9
Mount points:
/boot   /sda1   xfs    2048 MB
/       /sda2   xfs
swap    /sdb    swap   memory / 2
swap    /sdd    swap   memory / 2
Language choice: English
Time zone: Shanghai
Software selection: File and Print Server
Optional add-ons: Development Tools
The default root password is 123456
Environment used in this installation:
1. System version: RHEL 7.9
2. Hardware: 3 VMs, each with 2 cores, 16 GB of memory, and a 50 GB disk
3. Node planning:

| Host IP | Hostname | Node planning |
| --- | --- | --- |
| 192.168.31.201 | mdw | master |
| 192.168.31.202 | sdw1 | seg1, seg2, mirror3, mirror4 |
| 192.168.31.203 | sdw2 | seg3, seg4, mirror1, mirror2 |
2. Configure deployment parameters
Dependencies:
# Dependency list for the yum install (the GP 5.x and GP 6.2 dependency checks differ slightly):
apr apr-util bash bzip2 curl krb5 libcurl libevent (or libevent2 on RHEL/CentOS 6) libxml2 libyaml zlib openldap openssh openssl openssl-libs (RHEL7/CentOS7) perl readline rsync R sed (used by gpinitsystem) tar zip
# Configure a local yum repository from the installation media, then install:
mount /dev/cdrom /mnt
mv /etc/yum.repos.d/* /tmp/
echo "[local]" >> /etc/yum.repos.d/local.repo
echo "name = local" >> /etc/yum.repos.d/local.repo
echo "baseurl = file:///mnt/" >> /etc/yum.repos.d/local.repo
echo "enabled = 1" >> /etc/yum.repos.d/local.repo
echo "gpgcheck = 0" >> /etc/yum.repos.d/local.repo
yum clean all
yum repolist all
yum install -y apr apr-util bash bzip2 curl krb5 libcurl libevent libxml2 libyaml zlib openldap openssh openssl openssl-libs perl readline rsync R sed tar zip krb5-devel
1. Disable the firewall and SELinux
systemctl stop firewalld.service
systemctl disable firewalld.service
systemctl status firewalld.service
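The heading also calls for disabling SELinux; a minimal sketch (assuming the standard RHEL 7 config file /etc/selinux/config):
setenforce 0    # switch to permissive mode for the current boot
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config    # disable permanently, effective after reboot
getenforce      # verify the current mode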
2. Change the host name (run the corresponding command on each host)
hostnamectl set-hostname mdw
hostnamectl set-hostname sdw1
hostnamectl set-hostname sdw2
3. Modify the /etc/hosts file
vim /etc/hosts
192.168.31.201 mdw
192.168.31.202 sdw1
192.168.31.203 sdw2
4. Configure the system parameter file sysctl.conf
Modify the system parameters according to the actual situation of the system (before GP 5.0 the documentation gave fixed default values; since 5.0 it provides calculation formulas for some of them). Reload the parameters afterwards with sysctl -p. The documented template:
# kernel.shmall = _PHYS_PAGES / 2            # See Shared Memory Pages
kernel.shmall = 4000000000
# kernel.shmmax = kernel.shmall * PAGE_SIZE  # See Shared Memory Pages
kernel.shmmax = 500000000
kernel.shmmni = 4096
vm.overcommit_memory = 2                     # See Segment Host Memory
vm.overcommit_ratio = 95                     # See Segment Host Memory
net.ipv4.ip_local_port_range = 10000 65535   # See Port Settings
kernel.sem = 500 2048000 200 40960
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 0                # See System Memory
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736
vm.dirty_bytes = 4294967296
-- Shared memory
$ echo $(expr $(getconf _PHYS_PAGES) / 2)
$ echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
[root@mdw ~]# echo $(expr $(getconf _PHYS_PAGES) / 2)
2053918
[root@mdw ~]# echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
8412848128
-- Overcommit. vm.overcommit_memory controls how the kernel decides how much memory can be allocated to processes; for Greenplum databases it should be set to 2. vm.overcommit_ratio is the percentage of RAM made available to processes, with the remainder reserved for the operating system; the default on Red Hat is 50. The recommended calculation is: vm.overcommit_ratio = (RAM - 0.026 * gp_vmem) / RAM
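To put numbers on that formula for this 16 GB RAM / 16 GB swap host, a rough sketch (the gp_vmem expression below is an assumption taken from the usual Greenplum memory-calculation guidance, not from this article):
# Sketch: estimate vm.overcommit_ratio; values are in GB
RAM=16
SWAP=16
gp_vmem=$(echo "(($SWAP + $RAM) - (7.5 + 0.05 * $RAM)) / 1.7" | bc -l)
ratio=$(echo "($RAM - 0.026 * $gp_vmem) / $RAM * 100" | bc -l)
echo "gp_vmem=${gp_vmem} GB, vm.overcommit_ratio=${ratio}%"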
-- Port settings. To avoid port conflicts between Greenplum and other applications during initialization, note the range specified by net.ipv4.ip_local_port_range. When initializing Greenplum with gpinitsystem, do not choose Greenplum database ports inside that range. For example, with net.ipv4.ip_local_port_range = 10000 65535, set the Greenplum base port numbers below it:
PORT_BASE = 6000
MIRROR_PORT_BASE = 7000
-- Dirty page settings. For hosts with more than 64 GB of memory, the recommendation is:
vm.dirty_background_ratio = 0
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736  # 1.5GB
vm.dirty_bytes = 4294967296  # 4GB
For hosts with 64 GB of memory or less, remove vm.dirty_background_bytes and vm.dirty_bytes and set:
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10
-- vm.min_free_kbytes ensures that memory requests from network and storage drivers using PF_MEMALLOC can be satisfied. This is especially important on systems with large amounts of memory, where the default is usually too low. The recommended value is 3% of the physical memory, which can be appended to sysctl.conf with awk:
awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print "vm.min_free_kbytes =", $2 * .03;}' /proc/meminfo >> /etc/sysctl.conf
Do not set vm.min_free_kbytes to more than 5% of the system memory, as this may cause out-of-memory conditions.
This experiment uses RHEL 7.9 with 16 GB of memory; the configuration used is as follows:
vim /etc/sysctl.conf
kernel.shmall = 2053918
kernel.shmmax = 8412848128
kernel.shmmni = 4096
vm.overcommit_memory = 2
vm.overcommit_ratio = 95
net.ipv4.ip_local_port_range = 10000 65535
kernel.sem = 500 2048000 200 4096
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10
5. Modify /etc/security/limits.conf
vim /etc/security/limits.conf
* soft nofile 524288
* hard nofile 524288
* soft nproc 131072
* hard nproc 131072
On RHEL/CentOS 7, also change the nproc value in the /etc/security/limits.d/20-nproc.conf file to 131072:
[root@mdw ~]# cat /etc/security/limits.d/20-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.
* soft nproc 131072
root soft nproc unlimited
The Linux pam_limits module sets user limits by reading the limits.conf file. Log in again and run ulimit -u to display the maximum number of processes available to each user ("max user processes"); verify that it returns 131072.
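A quick way to confirm the limits after logging in again (a sketch; the expected values match the settings above):
ulimit -u    # max user processes, expect 131072
ulimit -n    # max open files, expect 524288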
6. XFS mount
Compared with ext4, XFS has several advantages. ext4 is very stable as a traditional file system, but with growing storage requirements it no longer scales well: it supports at most about 4 billion inodes (32-bit) and a maximum file size of 16 TB. XFS manages space with 64-bit structures and B+Tree-based metadata, so the file system can reach EB scale. Greenplum requires XFS for its data directories; RHEL/CentOS 7 and Oracle Linux already use XFS as the default file system, and SUSE/openSUSE has long supported it. Because each VM in this experiment has only one disk and it is the system disk, the file system cannot be changed, so mounting a dedicated XFS volume is skipped here.
[root@mdw ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Sat Feb 27 08:37:50 2021
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/rhel-root / xfs defaults 0 0
UUID=8553f10d-0334-4cd5-8e3f-b915b6e0ccaa /boot xfs defaults 0 0
/dev/mapper/rhel-swap swap swap defaults 0 0
/dev/mapper/rhel-swap00 swap swap defaults 0 0
GP 6 no longer ships the gpcheck tool, so skipping this does not affect the cluster installation. With versions before GP 6, the file-system checks in the gpcheck script could be commented out instead.
The file system is usually specified during operating-system installation or when a new disk is formatted and mounted. A non-system disk can also be formatted as XFS and mounted, for example:
mkfs.xfs /dev/sda3
mkdir -p /data/master
vi /etc/fstab
/dev/sda3 /data xfs nodev,noatime,nobarrier,inode64 0 0
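If such a data disk is actually added, a minimal verification sketch (using the /dev/sda3 and /data examples above) would be:
mount -a           # mount everything listed in /etc/fstab
df -Th /data       # confirm the mount point and that the type is xfs
xfs_info /data     # optionally inspect the XFS geometry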
7. Disk I/O settings
Set the disk read-ahead value to 16384. Disk device names differ between systems; use lsblk to check how the disks are attached and mounted.
[root@mdw ~]# lsblk
(output lists the rhel-root, rhel-swap, and rhel-swap00 LVM volumes on the system disk, and sr0 mounted at /mnt)
[root@mdw ~]# /sbin/blockdev --setra 16384 /dev/sda
[root@mdw ~]# /sbin/blockdev --getra /dev/sda
-- To make the setting persistent across reboots, add it to /etc/rc.d/rc.local and make that file executable:
blockdev --setra 16384 /dev/sda
chmod +x /etc/rc.d/rc.local
8. Disk I/O scheduler (the algorithm used to schedule disk I/O)
-- On RHEL 7.x or CentOS 7.x, which use grub2, the value can be changed with the grubby system tool:
grubby --update-kernel=ALL --args="elevator=deadline"
-- Verify the configured kernel arguments:
grubby --info=ALL
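After a reboot, the active scheduler can also be checked per device at runtime (a sketch; sda is the example device used in this article):
cat /sys/block/sda/queue/scheduler    # the active scheduler is shown in brackets, e.g. [deadline]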
9. Disable Transparent Huge Pages (THP)
Disable THP because it degrades the Greenplum database performance.
-- On RHEL 7.x or CentOS 7.x, which use grub2, change the value with the grubby system tool:
grubby --update-kernel=ALL --args="transparent_hugepage=never"
-- Check the status:
$ cat /sys/kernel/mm/transparent_hugepage/enabled
always [never]
10. IPC Object Removal
Disable IPC object removal on RHEL 7.2, CentOS 7.2, or Ubuntu. The default systemd setting RemoveIPC=yes removes IPC connections when non-system user accounts log out, which causes the Greenplum Database utility gpinitsystem to fail with semaphore errors. Do one of the following to avoid the issue: create gpadmin as a system account when adding the Greenplum administrative user on the master node, or disable RemoveIPC by setting it in /etc/systemd/logind.conf on every Greenplum Database host:
vi /etc/systemd/logind.conf
RemoveIPC=no
service systemd-logind restart
11. SSH Connection Threshold
Greenplum Database utilities such as gpexpand, gpinitsystem, and gpaddmirrors use SSH connections to perform their tasks. In a large Greenplum cluster, the number of SSH connections opened by these programs may exceed the host's maximum threshold for unauthenticated connections. When this happens, you receive the error: ssh_exchange_identification: Connection closed by remote host. To avoid this, update the MaxStartups and MaxSessions parameters in the /etc/ssh/sshd_config or /etc/sshd_config file.
If you specify MaxStartups and MaxSessions using a single integer value, you identify the maximum number of concurrent unauthenticated connections (MaxStartups) and maximum number of open shell, login, or subsystem sessions permitted per network connection (MaxSessions). For example:
MaxStartups 200
MaxSessions 200
If you specify MaxStartups using the “start:rate:full” syntax, you enable random early connection drop by the SSH daemon. start identifies the maximum number of unauthenticated SSH connection attempts allowed. Once start number of unauthenticated connection attempts is reached, the SSH daemon refuses rate percent of subsequent connection attempts. full identifies the maximum number of unauthenticated connection attempts after which all attempts are refused. For example:
MaxStartups 10:30:200
MaxSessions 200
vi /etc/ssh/sshd_config (or /etc/sshd_config)
MaxStartups 10:30:200
MaxSessions 200
-- Reload sshd for the parameters to take effect:
# systemctl reload sshd.service
12. Synchronizing System Clocks (NTP)
Edit /etc/ntp.conf on the master host so that it points to the data center's clock (NTP) server, if one exists. Otherwise, set the master's time correctly and edit /etc/ntp.conf on the other nodes so that they synchronize with the master.
-- As root, edit /etc/ntp.conf on the master host:
vi /etc/ntp.conf
# 10.6.220.20 is your time server IP
server 10.6.220.20
-- As root, edit /etc/ntp.conf on each segment host:
server mdw prefer   # prefer the master node
server smdw         # if there is no standby node, this can instead point to the data center clock server
-- Restart the NTP service:
service ntpd restart
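After ntpd restarts on all hosts, synchronization can be spot-checked (a sketch):
ntpq -p                  # the peer marked with '*' is the currently selected time source
systemctl status ntpd    # confirm the service is running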
13. Check the character set
-- Check the current character set; if it is not UTF-8, add RC_LANG=en_US.UTF-8 to /etc/sysconfig/language.
[root@mdw greenplum-db]# echo $LANG
en_US.UTF-8
14. Creating the Greenplum Administrative User
GP 6.2 no longer ships gpseginstall, so the gpadmin user must be created before installation.
Create a gpadmin user on each node to manage and run the GP cluster, preferably with sudo permission. After GP is installed on the master node, gpssh can be used to create the user on the other nodes in batch. Example:
groupadd gpadmin
useradd gpadmin -r -m -g gpadmin
passwd gpadmin
echo "gpadmin" |passwd gpadmin --stdin
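Since the text above suggests giving gpadmin sudo permission, a minimal sketch (assuming the usual /etc/sudoers.d mechanism) is:
echo 'gpadmin ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/gpadmin
chmod 440 /etc/sudoers.d/gpadmin
visudo -cf /etc/sudoers.d/gpadmin    # syntax check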
3. Configure and install GP
1. Upload the installation file and install it
[root@mdw ~]# mkdir /soft
[root@mdw ~]#
[root@mdw ~]# id gpadmin
uid=995(gpadmin) gid=1000(gpadmin) groups=1000(gpadmin)
[root@mdw ~]# chown -R gpadmin:gpadmin /soft/
[root@mdw ~]# chmod 775 /soft/
[root@mdw ~]# cd /soft/
[root@mdw soft]# ls
open-source-greenplum-db-6.14.1-rhel7-x86_64.rpm
-- Install the package:
[root@mdw soft]# rpm -ivh open-source-greenplum-db-6.14.1-rhel7-x86_64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:open-source-greenplum-db-6-6.14.1################################# [100%]
-- The software is installed under /usr/local/ by default; change the ownership of the installation directories:
chown -R gpadmin:gpadmin /usr/local/greenplum*
2. Configure SSH trust and password-free login within the cluster (required for both root and gpadmin).
Note: if gpssh-exkeys -f all_host is used, the manual ssh-keygen (3.3.1) and ssh-copy-id (3.3.2) steps are not strictly required; both are shown here for completeness.
$ su gpadmin
## Create hostfile_exkeys
In the $GPHOME directory, create two host files (all_host and seg_host) for later use by scripts such as gpssh and gpscp. all_host contains the host names or IP addresses of all cluster hosts: master, segments, and standby. seg_host contains the host names or IP addresses of all segment hosts. If a machine has multiple network adapters that are not bonded (bond0), list the host name or IP address of each adapter.
[gpadmin@mdw ~]$ cd /usr/local/
[gpadmin@mdw local]$ ls
bin  etc  games  greenplum-db  greenplum-db-6.14.1  include  lib  lib64  libexec  sbin  share  src
[gpadmin@mdw local]$ cd greenplum-db
[gpadmin@mdw greenplum-db]$ ls
bin docs ext include libexec NOTICE sbin
COPYRIGHT etc greenplum_path.sh lib LICENSE open_source_license_greenplum_database.txt share
[gpadmin@mdw greenplum-db]# vim all_host
[gpadmin@mdw greenplum-db]# vim seg_host
[gpadmin@mdw greenplum-db]# cat all_host
mdw
sdw1
sdw2
[gpadmin@mdw greenplum-db]# cat seg_host
sdw1
sdw2
Generate the key
$ ssh-keygen -t rsa -b 4096
Generating public/private rsa key pair.
Enter file in which to save the key (/home/gpadmin/.ssh/id_rsa):
Created directory '/home/gpadmin/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
## Master and segment trust
su - gpadmin
ssh-copy-id -i ~/.ssh/id_rsa.pub gpadmin@sdw1
ssh-copy-id -i ~/.ssh/id_rsa.pub gpadmin@sdw2
## Use gpssh-exkeys to enable n-to-n password-free login
[gpadmin@mdw greenplum-db]$ gpssh-exkeys -f all_host
bash: gpssh-exkeys: command not found
The environment variables need to be sourced first:
[gpadmin@mdw greenplum-db]$ source /usr/local/greenplum-db/greenplum_path.sh
[gpadmin@mdw greenplum-db]$ gpssh-exkeys -f all_host
[STEP 1 of 5] create local ID and authorize on local host
... /home/gpadmin/.ssh/id_rsa file exists ... key generation skipped
[STEP 2 of 5] keyscan all hosts and update known_hosts file
[STEP 3 of 5] retrieving credentials from remote hosts
... send to sdw1
... send to sdw2
[STEP 4 of 5] determine common authentication file content
[STEP 5 of 5] copy authentication files to all remote hosts
... finished key exchange with sdw1
... finished key exchange with sdw2
[INFO] completed successfully
3. Verify GPSSH
[gpadmin@mdw greenplum-db]$ gpssh -f /usr/local/greenplum-db/all_host -e 'ls /usr/local/'
[sdw1] ls /usr/local/
[sdw1] bin etc games include lib lib64 libexec sbin share src
[ mdw] ls /usr/local/
[ mdw] bin  games  greenplum-db-6.14.1  lib    libexec  share
[ mdw] etc  greenplum-db  include  lib64  sbin  src
[sdw2] ls /usr/local/
[sdw2] bin etc games include lib lib64 libexec sbin share src
4. Set environment variables in batches
## Set the Greenplum environment variables for the gpadmin user in batch
Add the GP installation directory and environment information to the user's environment variables.
vim /home/gpadmin/.bash_profile
cat >> /home/gpadmin/.bash_profile << EOF
source /usr/local/greenplum-db/greenplum_path.sh
EOF
vim .bashrc
cat >> /home/gpadmin/.bashrc << EOF
source /usr/local/greenplum-db/greenplum_path.sh
EOF
vim /etc/profile
cat >> /etc/profile << EOF
source /usr/local/greenplum-db/greenplum_path.sh
EOF
## Environment variable files are distributed to other nodes
su - root
source /usr/local/greenplum-db/greenplum_path.sh
gpscp -f /usr/local/greenplum-db/seg_host /etc/profile @=:/etc/profile
su - gpadmin
source /usr/local/greenplum-db/greenplum_path.sh
gpscp -f /usr/local/greenplum-db/seg_host /home/gpadmin/.bash_profile gpadmin@=:/home/gpadmin/.bash_profile
gpscp -f /usr/local/greenplum-db/seg_host /home/gpadmin/.bashrc gpadmin@=:/home/gpadmin/.bashrc
4. Cluster node installation
## Different from older versions
This section is missing from the official documentation. Prior to GP 6 there was a tool, gpseginstall, that installed the GP software on every node. Its logs show that the main steps are: 1. create the gp user on each node (already done above, so skipped); 2. package the installation directory on the master node; 3. distribute the package to each segment host and unpack it; 4. create the greenplum-db soft link; 5. grant ownership to gpadmin. For the gpseginstall installation log, refer to the GP 5 installation notes.
1. Simulate the gpseginstall script
Run the command as user root
# Variable settings
link_name='greenplum-db'                        # soft link name
binary_dir_location='/usr/local'                # install path
binary_dir_name='greenplum-db-6.14.1'           # install directory
binary_path='/usr/local/greenplum-db-6.14.1'    # full path
Master node packaging
chown -R gpadmin:gpadmin $binary_path
rm -f ${binary_path}.tar; rm -f ${binary_path}.tar.gz
cd $binary_dir_location; tar cf ${binary_dir_name}.tar ${binary_dir_name}
gzip ${binary_path}.tar
[root@mdw local]# chown -R gpadmin:gpadmin $binary_path
[root@mdw local]# rm -f ${binary_path}.tar; rm -f ${binary_path}.tar.gz
[root@mdw local]# cd $binary_dir_location; tar cf ${binary_dir_name}.tar ${binary_dir_name}
[root@mdw local]# gzip ${binary_path}.tar
[root@mdw local]# ls
bin  etc  games  greenplum-db  greenplum-db-6.14.1  greenplum-db-6.14.1.tar.gz  include  lib  lib64  libexec  sbin  share  src
Distribute to segments as root
link_name='greenplum-db'
binary_dir_location='/usr/local'
binary_dir_name='greenplum-db-6.14.1'
binary_path='/usr/local/greenplum-db-6.14.1'
source /usr/local/greenplum-db/greenplum_path.sh
gpssh -f ${binary_path}/seg_host -e "mkdir -p ${binary_dir_location}; rm -rf${binary_path}; rm -rf${binary_path}.tar; rm -rf${binary_path}.tar.gz"
gpscp -f ${binary_path}/seg_host ${binary_path}.tar.gz root@=:${binary_path}.tar.gz
gpssh -f ${binary_path}/seg_host -e "cd ${binary_dir_location}; gzip -f -d${binary_path}.tar.gz; tar xf${binary_path}.tar"
gpssh -f ${binary_path}/seg_host -e "rm -rf ${binary_path}.tar; rm -rf${binary_path}.tar.gz; rm -f${binary_dir_location}/${link_name}"
gpssh -f ${binary_path}/seg_host -e ln -fs ${binary_dir_location}/${binary_dir_name} ${binary_dir_location}/${link_name}
gpssh -f ${binary_path}/seg_host -e "chown -R gpadmin:gpadmin ${binary_dir_location}/${link_name}; chown -R gpadmin:gpadmin${binary_dir_location}/${binary_dir_name}"
gpssh -f ${binary_path}/seg_host -e "source ${binary_path}/greenplum_path"
gpssh -f ${binary_path}/seg_host -e "cd ${binary_dir_location}; ll"
2. Create a cluster data directory
Create the master data directory
mkdir -p /opt/greenplum/data/master
chown gpadmin:gpadmin /opt/greenplum/data/master
## Standby data directory (this experiment has no standby)
## Use gpssh to create the data directory on the standby remotely:
# source /usr/local/greenplum-db/greenplum_path.sh
# gpssh -h smdw -e 'mkdir -p /data/master'
# gpssh -h smdw -e 'chown gpadmin:gpadmin /data/master'
## Create the segment data directories
## This deployment plans two primary segments and two mirrors per host.
source /usr/local/greenplum-db/greenplum_path.sh
gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /opt/greenplum/data1/primary'
gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /opt/greenplum/data1/mirror'
gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /opt/greenplum/data2/primary'
gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /opt/greenplum/data2/mirror'
gpssh -f /usr/local/greenplum-db/seg_host -e 'chown -R gpadmin /opt/greenplum/data*'
3. Perform a cluster performance test
## Different from older versions
GP 6 removes the gpcheck tool, which verified the system parameters and hardware configuration required by GP, so what can currently be verified is network and disk I/O performance. As a rule of thumb, disk throughput should reach about 2000 MB/s and network throughput at least 1000 MB/s.
4. Network performance test
[root@mdw ~]# gpcheckperf -f /usr/local/greenplum-db/seg_host -r N -d /tmp
/usr/local/greenplum-db-6.14.1/bin/gpcheckperf -f /usr/local/greenplum-db/seg_host -r N -d /tmp
-------------------
--  NETPERF TEST
-------------------
NOTICE: -t is deprecated, and has no effect
NOTICE: -f is deprecated, and has no effect
NOTICE: -t is deprecated, and has no effect
NOTICE: -f is deprecated, and has no effect

====================
==  RESULT 2021-02-27T11:…
====================
Netperf bisection bandwidth test
sdw1 -> sdw2 = 2069.110000
sdw2 -> sdw1 = 2251.890000

Summary:
sum = 4321.00 MB/sec
min = 2069.11 MB/sec
max = 2251.89 MB/sec
avg = 2160.50 MB/sec
median = 2251.89 MB/sec
5. Test disk I/O performance
gpcheckperf -f /usr/local/greenplum-db/seg_host -r ds -D -d /opt/greenplum/data1/primary
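Both planned data directories can be covered in one run by repeating -d (a sketch; -r ds runs the disk and stream tests, -D prints per-host results):
gpcheckperf -f /usr/local/greenplum-db/seg_host -r ds -D -d /opt/greenplum/data1/primary -d /opt/greenplum/data2/primary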
6. Cluster clock verification (unofficial step)
Verify the cluster time. If the cluster time is inconsistent, change the NTP time
gpssh -f /usr/local/greenplum-db/all_host -e 'date'
5. Cluster initialization
Refer to the official installation guide: gpdb.docs.pivotal.io/6-2/install…
1. Write an initial configuration file
su - gpadmin
mkdir -p /home/gpadmin/gpconfigs
cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/gpinitsystem_config
2. Modify parameters as required
Note: To specify PORT_BASE, review the port range specified in the net.ipv4.ip_local_port_range parameter in the /etc/sysctl.conf file. Main modified parameters:
ARRAY_NAME="Greenplum Data Platform"
SEG_PREFIX=gpseg
PORT_BASE=6000
declare -a DATA_DIRECTORY=(/opt/greenplum/data1/primary /opt/greenplum/data2/primary)
MASTER_HOSTNAME=mdw
MASTER_DIRECTORY=/opt/greenplum/data/master
MASTER_PORT=5432
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
MIRROR_PORT_BASE=7000
declare -a MIRROR_DATA_DIRECTORY=(/opt/greenplum/data1/mirror /opt/greenplum/data2/mirror)
DATABASE_NAME=gpdw
3. Cluster initialization command parameters
## Run as root
## Fix for: /usr/local/greenplum-db/./bin/gpinitsystem: line 244: /tmp/cluster_tmp_file.8070: Permission denied
gpssh -f /usr/local/greenplum-db/all_host -e 'chmod 777 /tmp'
## Fix for: /bin/mv: cannot stat '/tmp/cluster_tmp_file.8070': Permission denied
gpssh -f /usr/local/greenplum-db/all_host -e 'chmod u+s /bin/ping'
su - gpadmin
gpinitsystem -c /home/gpadmin/gpconfigs/gpinitsystem_config -h /usr/local/greenplum-db/seg_host -D
If initialization completes successfully, "Greenplum Database instance successfully created" is printed. The log is written to the /home/gpadmin/gpAdminLogs/ directory with the naming pattern gpinitsystem_${date}.log. The last part of the installation log looks like this:
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Function PARALLEL_SUMMARY_STATUS_REPORT
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Function CREATE_SEGMENT
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Start Function FORCE_FTS_PROBE
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Function FORCE_FTS_PROBE
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Start Function SCAN_LOG
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Scanning utility log file for any warning messages
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Log file scan check passed
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Function SCAN_LOG
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Greenplum Database instance successfully created
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-To complete the environment configuration, please
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1"
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:- to access the Greenplum scripts for this instance:
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:- or, use -d /opt/greenplum/data/master/gpseg-1 option for the Greenplum scripts
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:- Example gpstate -d /opt/greenplum/data/master/gpseg-1
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20210227.log
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-To initialize a Standby Master Segment for this Greenplum instance
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Review options for gpinitstandby
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-The Master /opt/greenplum/data/master/gpseg-1/pg_hba.conf post gpinitsystem
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-new array must be explicitly added to this file
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-located in the /usr/local/greenplum-db-6.14.1/docs directory
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Main
Read the bottom of the log carefully; there are a few more steps to take.
4.3.1 Checking the log content. The log contains the following information:
Scan of log file indicates that some warnings or errors
were generated during the array creation
Please review contents of log file
/home/gpadmin/gpAdminLogs/gpinitsystem_20210227.log
Scan warnings or errors:
[gpadmin@mdw ~]$ cat /home/gpadmin/gpAdminLogs/gpinitsystem_20210227.log|grep -E -i 'WARN|ERROR]'
WARNING: enabling "trust" authentication for local connections
WARNING: enabling "trust" authentication for local connections
WARNING: enabling "trust" authentication for local connections
WARNING: enabling "trust" authentication for local connections
WARNING: enabling "trust" authentication for local connections
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Scanning utility log file for any warning messages
Address the warnings reported in the log as needed to optimize cluster performance.
4. Set environment variables
Edit the environment variable for user gpadmin and add:
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1
In addition to this, it is usually added:
export PGPORT=5432 # Set this parameter based on the actual situation
export PGUSER=gpadmin # Set this parameter based on the actual situation
export PGDATABASE=gpdw # Set this parameter based on the actual situation
Environment variables for reference: gpdb.docs.pivotal.io/510/install…
su - gpadmin
cat >> /home/gpadmin/.bash_profile << EOF
export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=gpdw
EOF
cat >> /home/gpadmin/.bashrc << EOF
export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=gpdw
EOF
## Environment variable files are distributed to other nodes
gpscp -f /usr/local/greenplum-db/seg_host /home/gpadmin/.bash_profile gpadmin@=:/home/gpadmin/.bash_profile
gpscp -f /usr/local/greenplum-db/seg_host /home/gpadmin/.bashrc gpadmin@=:/home/gpadmin/.bashrc
gpssh -f /usr/local/greenplum-db/all_host -e 'source /home/gpadmin/.bash_profile; source /home/gpadmin/.bashrc; '
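With the environment variables distributed, the cluster state can be checked with gpstate as gpadmin on the master (a quick sketch):
gpstate        # brief status of the master and segments
gpstate -s     # detailed status for every segment
gpstate -e     # segments with mirror status issues, if mirrors are configured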
5. To delete the cluster and reinstall, use gpdeletesystem
For more details on the gpdeletesystem utility, see gpdb.docs.pivotal.io/6-2/… . Use the following command:
gpdeletesystem -d /opt/greenplum/data/master/gpseg-1 -f
-d specifies MASTER_DATA_DIRECTORY (the master data directory); the master and all segment data directories are deleted. -f forces deletion, terminating all running processes first. Example:
gpdeletesystem -d /opt/greenplum/data/master/gpseg-1 -f
After deleting the cluster, adjust the cluster initialization configuration file as required and initialize the cluster again.
vi /home/gpadmin/gpconfigs/gpinitsystem_config
gpinitsystem -c /home/gpadmin/gpconfigs/gpinitsystem_config -h /usr/local/greenplum-db/seg_host -D
6. Configure after successful installation
1. Log in to GP with psql and set a password
Use psql to log in to GP. The general command format is:
psql -h <hostname> -p <port> -d <database> -U <user> -W
-h is followed by the host name of the master or segment; -p by the port of the master or segment; -d by the database name; -U by the user; -W forces a password prompt. These parameters can also be set as user environment variables. On Linux, the gpadmin OS user does not require a password for local login. Example of logging in with psql and setting the gpadmin password:
[gpadmin@mdw gpseg-1]$ psql -h mdw -p 5432 -d gpdw
psql (9.4.24)
Type "help" for help.
gpdw=# alter user gpadmin with password 'gpadmin';
ALTER ROLE
gpdw=# \q
2. Log in to different nodes
[gpadmin@mdw gpseg-1]$ PGOPTIONS='-c gp_session_role=utility' psql -h mdw -p 5432 -d postgres
psql (9.4.24)
Type "help" for help.
postgres=# \q
[gpadmin@mdw gpseg-1]$ PGOPTIONS='-c gp_session_role=utility' psql -h sdw1 -p 6000 -d postgres
psql (9.4.24)
Type "help" for help.
postgres=# \q
3. Log in to the GP client
Two files need to be configured: pg_hba.conf and postgresql.conf.
Configure pg_hba.conf
Reference configuration instructions: blog.csdn.net/yaoqiancuo3…
vim /opt/greenplum/data/master/gpseg-1/pg_hba.conf
## Change
host replication gpadmin 192.168.31.201/32 trust
## to
host all gpadmin 192.168.31.201/32 trust
## Add
host all gpadmin 0.0.0.0/0 md5    # new rule that allows password login from any IP
Configure postgresql.conf
GP 6.0 already sets listen_addresses = '*' by default, so this usually needs no change:
vim /opt/greenplum/data/master/gpseg-1/postgresql.conf
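A quick check that the default is already in place (a sketch, using the master data directory created during initialization):
grep listen_addresses /opt/greenplum/data/master/gpseg-1/postgresql.conf
# expected output: listen_addresses = '*'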
4. Reload the modified configuration
gpstop -u