
1. Plan deployment resources

1. Memory: the official recommendation is at least 16 GB per host, and about 30 GB per primary segment.

2. Disk space: roughly 2 GB for the GP software installation; usage of the GP data disks should not exceed 70%.

3. Network: the official recommendation is 10 Gigabit Ethernet, with multiple network ports bonded.

4. File system: XFS is recommended for the data directories.

5. RHEL 7 installation requirements:

Operating system version: RHEL 7.9

Mount points:

/boot    /dev/sda1    XFS     2048 MB

/        /dev/sda2    XFS

swap     /dev/sdb     swap    memory / 2

swap     /dev/sdd     swap    memory / 2

Language choice: English

Time zone: Asia/Shanghai

Software selection: File and Print Server

Optional add-ons: Development Tools

The default root password is 123456
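
A quick way to check whether a host meets the memory and disk-usage guidelines above (standard RHEL commands; the 16 GB and 70% thresholds are the ones from items 1 and 2):

free -g                             # total memory in GB, should be at least 16
df -h | grep -vE 'tmpfs|devtmpfs'   # data-disk usage, keep below 70%
nproc                               # CPU core count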

1. System version: RHEL 7.9
2. Hardware: 3 VMs, each with 2 cores, 16 GB memory, and a 50 GB disk
3. Node planning:

Host IP          Hostname   Node planning
192.168.31.201   mdw        master
192.168.31.202   sdw1       seg1, seg2, mirror3, mirror4
192.168.31.203   sdw2       seg3, seg4, mirror1, mirror2

2. Configure deployment parameters

Dependencies:

# Installation dependencies (checked by yum install; applies to GP 5.x and GP 6.2):
# apr apr-util bash bzip2 curl krb5 libcurl libevent (or libevent2 on RHEL/CentOS 6)
# libxml2 libyaml zlib openldap openssh openssl openssl-libs (RHEL7/CentOS7)
# perl readline rsync R sed (used by gpinitsystem) tar zip

# Set up a local yum repository from the installation media
mount /dev/cdrom /mnt
mv /etc/yum.repos.d/* /tmp/
echo "[local]" >> /etc/yum.repos.d/local.repo
echo "name = local" >> /etc/yum.repos.d/local.repo
echo "baseurl = file:///mnt/" >> /etc/yum.repos.d/local.repo
echo "enabled = 1" >> /etc/yum.repos.d/local.repo
echo "gpgcheck = 0" >> /etc/yum.repos.d/local.repo

yum clean all
yum repolist all

yum install -y apr apr-util bash bzip2 curl krb5 libcurl libevent libxml2 libyaml zlib openldap openssh openssl openssl-libs perl readline rsync R sed tar zip krb5-devel

1. Disable the firewall and SELinux

systemctl stop firewalld.service
systemctl disable firewalld.service
systemctl status firewalld.service
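
The heading above also calls for disabling SELinux; the commands are not shown in the original, so here is a common way to do it on RHEL 7 (adjust to your security policy):

setenforce 0                                                  # takes effect immediately, until reboot
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config  # persists across reboots
sestatus                                                      # verify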

2. Change the host name

hostnamectl set-hostname mdw     # on 192.168.31.201
hostnamectl set-hostname sdw1    # on 192.168.31.202
hostnamectl set-hostname sdw2    # on 192.168.31.203

3. Modify the /etc/hosts file

vim /etc/hosts

192.168.31.201 mdw
192.168.31.202 sdw1
192.168.31.203 sdw2

4. Configure the system parameter file sysctl.conf

Adjust the kernel parameters to the actual system (before GP 5.0 the documentation gave fixed default values; since 5.0 it also provides calculation formulas for some of them). After editing, reload the parameters with sysctl -p:

# kernel.shmall = _PHYS_PAGES / 2            # See Shared Memory Pages
kernel.shmall = 4000000000
# kernel.shmmax = kernel.shmall * PAGE_SIZE
kernel.shmmax = 500000000
kernel.shmmni = 4096
vm.overcommit_memory = 2                     # See Segment Host Memory
vm.overcommit_ratio = 95                     # See Segment Host Memory
net.ipv4.ip_local_port_range = 10000 65535   # See Port Settings
kernel.sem = 500 2048000 200 40960
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 0                # See System Memory
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736
vm.dirty_bytes = 4294967296
-- Shared memory
$ echo $(expr $(getconf _PHYS_PAGES) / 2)
$ echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))

[root@mdw ~]# echo $(expr $(getconf _PHYS_PAGES) / 2)
2053918
[root@mdw ~]# echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
8412848128
-- vm.overcommit_memory: the kernel uses this parameter to decide how much memory can be allocated to processes. For Greenplum it should be set to 2. vm.overcommit_ratio is the percentage of RAM that may be used for process allocations, with the rest reserved for the operating system; the default on Red Hat is 50. The recommended calculation is: vm.overcommit_ratio = (RAM - 0.026 * gp_vmem) / RAM
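
As a rough illustration of that formula (the gp_vmem formula below is the one from the Greenplum documentation, and the 16 GB RAM / 16 GB swap values are assumptions for this lab, not from the original):

# gp_vmem = ((SWAP + RAM) - (7.5GB + 0.05 * RAM)) / 1.7
RAM_GB=16; SWAP_GB=16
awk -v ram="$RAM_GB" -v swap="$SWAP_GB" 'BEGIN {
  gp_vmem = ((swap + ram) - (7.5 + 0.05 * ram)) / 1.7
  printf "gp_vmem ~= %.1f GB\n", gp_vmem
  printf "vm.overcommit_ratio ~= %.0f\n", 100 * (ram - 0.026 * gp_vmem) / ram
}'

The configuration later in this article simply uses vm.overcommit_ratio = 95.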
-- Port range: net.ipv4.ip_local_port_range is set to avoid port conflicts with other applications during Greenplum initialization. When initializing with gpinitsystem, do not place the Greenplum database ports inside this range. For example, with net.ipv4.ip_local_port_range = 10000 65535, choose base port numbers below the range, such as PORT_BASE = 6000 and MIRROR_PORT_BASE = 7000.
-- Dirty pages: when the system has more than 64 GB of memory, the recommendation is vm.dirty_background_ratio = 0, vm.dirty_ratio = 0, vm.dirty_background_bytes = 1610612736 (1.5 GB) and vm.dirty_bytes = 4294967296 (4 GB). When the system has 64 GB of memory or less, remove vm.dirty_background_bytes and vm.dirty_bytes and set vm.dirty_background_ratio = 3 and vm.dirty_ratio = 10.
-- Set vm.min_free_kbytes to ensure that PF_MEMALLOC allocations from the network and storage drivers succeed. This is especially important on large-memory systems, where the default is usually too low. The recommended value is about 3% of physical memory, which can be calculated and appended with awk:

awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print "vm.min_free_kbytes =", $2 * .03;}' /proc/meminfo >> /etc/sysctl.conf

Do not set vm.min_free_kbytes to more than 5% of system memory, as this may cause out-of-memory conditions.

This experiment uses RHEL 7.9 with 16 GB of memory; the configuration is as follows:

vim /etc/sysctl.conf

kernel.shmall = 2053918
kernel.shmmax = 8412848128
kernel.shmmni = 4096
vm.overcommit_memory = 2
vm.overcommit_ratio = 95

net.ipv4.ip_local_port_range = 10000 65535
kernel.sem = 500 2048000 200 4096
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10
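
After saving the file, reload the parameters as mentioned above:

sysctl -p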

5. Modify /etc/security/limits.conf

vim /etc/security/limits.conf

* soft nofile 524288
* hard nofile 524288
* soft nproc 131072
* hard nproc 131072

On RHEL/CentOS 7, also change the nproc value in the /etc/security/limits.d/20-nproc.conf file to 131072:

[root@mdw ~]# cat /etc/security/limits.d/20-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.

*          soft    nproc     131072
root       soft    nproc     unlimited

The Linux pam_limits module sets user limits by reading the limits.conf files. The ulimit -u command shows the maximum number of processes available to a user; verify that "max user processes" returns 131072.
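
A quick verification (the limits apply to new login sessions, so log in again first):

ulimit -u    # max user processes, expected 131072
ulimit -n    # open files, expected 524288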

6. XFS mount

Compared with ext4, XFS has the following advantages: ext4, as a traditional file system, is very stable, but it is showing its age as storage requirements keep growing. ext4 supports at most about 4 billion inodes (32-bit) and a maximum file size of 16 TB, while XFS uses 64-bit space management, scales to the EB level, and manages its metadata with B+ trees. Greenplum requires XFS for its data directories. RHEL/CentOS 7 and Oracle Linux use XFS as the default file system, and SUSE/openSUSE has long supported it. Because each VM in this experiment has only one disk and it is the system disk, the file system cannot be changed, so mounting a dedicated XFS volume is skipped here.

[root@mdw ~]# cat /etc/fstab 

#
# /etc/fstab
# Created by anaconda on Sat Feb 27 08:37:50 2021
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/rhel-root   /                       xfs     defaults        0 0
UUID=8553f10d-0334-4cd5-8e3f-b915b6e0ccaa /boot                   xfs     defaults        0 0
/dev/mapper/rhel-swap   swap                    swap    defaults        0 0
/dev/mapper/rhel-swap00 swap                    swap    defaults        0 0

GP 6 no longer ships the gpcheck tool, so not using XFS does not block the cluster installation. (Before GP 6, the file-system check in the gpcheck script could be commented out instead.)

The file system is usually specified during the installation of the operating system or formatted when a new disk is mounted. You can also format disks that are not system disks into a specified file system. For example, mount a new XFS:

mkfs.xfs /dev/sda3
mkdir -p /data/master

vi /etc/fstab
/dev/sda3 /data xfs nodev,noatime,nobarrier,inode64 0 0
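
After the fstab entry is added, the new file system can be mounted and checked; a small follow-up sketch using the device and mount point from the example above:

mount -a          # mount everything listed in /etc/fstab
df -hT /data      # should report an xfs file system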

7. Disk I/O Settings

Set the disk read-ahead (prefetch) value to 16384. Device names differ from system to system; use lsblk to check how the disks are laid out and mounted.

[root@mdw ~]# lsblk
(output trimmed: sda holds the rhel-root LVM volume mounted on /, plus the rhel-swap and rhel-swap00 swap volumes; sr0 is the 4.2G installation ISO mounted on /mnt)

[root@mdw ~]# /sbin/blockdev --setra 16384 /dev/sda
[root@mdw ~]# /sbin/blockdev --getra /dev/sda

-- Make the setting persistent across reboots by adding it to rc.local:
echo '/sbin/blockdev --setra 16384 /dev/sda' >> /etc/rc.d/rc.local
chmod +x /etc/rc.d/rc.local

8. Disk I/O Scheduler

-- On RHEL 7.x or CentOS 7.x (which use grub2), the grubby tool can change the kernel boot arguments:
grubby --update-kernel=ALL --args="elevator=deadline"
-- Verify:
grubby --info=ALL
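
The grubby change takes effect after a reboot. To switch the scheduler of a single disk immediately (not persistent across reboots), it can also be set through sysfs; a hedged example assuming the data disk is /dev/sda:

cat /sys/block/sda/queue/scheduler              # the active scheduler is shown in brackets
echo deadline > /sys/block/sda/queue/scheduler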

9. Disable Transparent Huge Pages (THP)

Disable THP because it degrades the Greenplum database performance.

-- On RHEL 7.x or CentOS 7.x (grub2), use grubby to change the value:
grubby --update-kernel=ALL --args="transparent_hugepage=never"
-- Check the status after a reboot:
$ cat /sys/kernel/mm/transparent_hugepage/enabled
always [never]

10. IPC Object Removal

Disable IPC Object removal for RHEL 7.2 or CentOS 7.2, or Ubuntu. The default systemd setting RemoveIPC=yes removes IPC connections when non-system user accounts log out. This causes the Greenplum Database utility gpinitsystem to fail with semaphore errors. Perform one of the following to avoid this issue. When you add the gpadmin operating system user account to the master node in Creating the Greenplum Administrative User, create the user as a system account. Disable RemoveIPC. Set this parameter in /etc/systemd/logind.conf on the Greenplum Database host systems.

vi /etc/systemd/logind.conf

RemoveIPC=no

service systemd-logind restart

11. SSH Connection Threshold

Greenplum management utilities such as gpexpand, gpinitsystem, and gpaddmirrors use SSH connections to perform their tasks. In a large Greenplum cluster, the number of SSH connections opened by these programs may exceed the host's maximum threshold for unauthenticated connections. When this happens, you receive an error like: ssh_exchange_identification: Connection closed by remote host. To avoid this, update the MaxStartups and MaxSessions parameters in /etc/ssh/sshd_config (or /etc/sshd_config).

If you specify MaxStartups and MaxSessions using a single integer value, you identify the maximum number of concurrent unauthenticated connections (MaxStartups) and maximum number of open shell, login, or subsystem sessions permitted per network connection (MaxSessions). For example:

MaxStartups 200
MaxSessions 200

If you specify MaxStartups using the “start:rate:full” syntax, you enable random early connection drop by the SSH daemon. start identifies the maximum number of unauthenticated SSH connection attempts allowed. Once start number of unauthenticated connection attempts is reached, the SSH daemon refuses rate percent of subsequent connection attempts. full identifies the maximum number of unauthenticated connection attempts after which all attempts are refused. For example:

MaxStartups 10:30:200
MaxSessions 200
vi /etc/ssh/sshd_config

MaxStartups 10:30:200
MaxSessions 200

-- Reload sshd for the parameters to take effect
# systemctl reload sshd.service

12. Synchronizing System Clocks (NTP)

Edit /etc/ntp.conf on the master host to point at the data center's NTP time server. If there is no such server, set the master's clock to the correct time manually, and then edit /etc/ntp.conf on the other nodes so that they synchronize with the master.

-- As root, log in to the master host:
vi /etc/ntp.conf
# 10.6.220.20 is your time server IP
server 10.6.220.20

-- As root, log in to the segment hosts:
vi /etc/ntp.conf
server mdw prefer     # prefer the master node
server smdw           # if there is no standby node, this can point to the data center clock server instead

service ntpd restart  # restart the NTP service
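
Once ntpd has restarted, synchronization can be checked with the standard NTP query tool:

ntpq -p    # the peer marked with '*' is the currently selected time source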

13. Check the character set

-- If it is not set, add RC_LANG=en_US.UTF-8 to /etc/sysconfig/language
[root@mdw greenplum-db]# echo $LANG
en_US.UTF-8

14. Creating the Greenplum Administrative User

GP 6.2 no longer provides gpseginstall, so the gpadmin user must be created manually before installation.

Create a gpadmin user on each node to manage and run the GP cluster, preferably with sudo permission. After GP has been installed on the master node, gpssh can be used to create the user on the other nodes in batch (see the sketch after the example below). Example:

groupadd gpadmin
useradd gpadmin -r -m -g gpadmin
passwd gpadmin
echo "gpadmin" |passwd gpadmin --stdin
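
A sketch of the batch creation on the segment hosts once gpssh is available on the master, as mentioned above (it reuses the seg_host file created later in this article; adjust the path and password to your environment):

source /usr/local/greenplum-db/greenplum_path.sh
gpssh -f /usr/local/greenplum-db/seg_host -e 'groupadd gpadmin; useradd gpadmin -r -m -g gpadmin; echo "gpadmin" | passwd gpadmin --stdin'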

3. Configure and install GP

1. Upload the installation file and install it

[root@mdw ~]# mkdir /soft
[root@mdw ~]# 
[root@mdw ~]# id gpadmin
uid=995(gpadmin) gid=1000(gpadmin) groups=1000(gpadmin)
[root@mdw ~]# chown -R gpadmin:gpadmin /soft/
[root@mdw ~]# chmod 775 /soft/
[root@mdw ~]# cd /soft/
[root@mdw soft]# ls
open-source-greenplum-db-6.14.1-rhel7-x86_64.rpm

-- Install
[root@mdw soft]# rpm -ivh open-source-greenplum-db-6.14.1-rhel7-x86_64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:open-source-greenplum-db-6-6.14.1################################# [100%]

-- By default the software is installed under /usr/local/
chown -R gpadmin:gpadmin /usr/local/greenplum*

2. Configure SSH mutual trust and passwordless login across the cluster (required for both root and gpadmin).

If gpssh-exkeys -f all_host is used, the manual ssh-keygen (3.3.1) and ssh-copy-id (3.3.2) steps below are not strictly required.

$ su gpadmin

## Create hostfile_exkeys
Under the $GPHOME directory, create two host files (all_host and seg_host) for later use by scripts such as gpssh and gpscp.
all_host: contains all host names or IP addresses of the cluster, including master, segments, and standby.
seg_host: contains the host names or IP addresses of all segment hosts.
If a machine has multiple network adapters that are not bonded as bond0, list the IP address or host name of every adapter.

[gpadmin@mdw ~]$ cd /usr/local/
[gpadmin@mdw local]$ ls
bin  etc  games  greenplum-db  greenplum-db-6.14.1  include  lib  lib64  libexec  sbin  share  src
[gpadmin@mdw local]$ cd greenplum-db
[gpadmin@mdw greenplum-db]$ ls
bin        docs  ext                include  libexec  NOTICE                                      sbin
COPYRIGHT  etc   greenplum_path.sh  lib      LICENSE  open_source_license_greenplum_database.txt  share
[gpadmin@mdw greenplum-db]# vim all_host
[gpadmin@mdw greenplum-db]# vim seg_host
[gpadmin@mdw greenplum-db]# cat all_host
mdw
sdw1
sdw2
[gpadmin@mdw greenplum-db]# cat seg_host
sdw1
sdw2

Generate the key
$ ssh-keygen -t rsa -b 4096
Generating public/private rsa key pair.
Enter file in which to save the key (/home/gpadmin/.ssh/id_rsa):
Created directory '/home/gpadmin/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:


## Master and segment trust
su - gpadmin
ssh-copy-id -i ~/.ssh/id_rsa.pub gpadmin@sdw1
ssh-copy-id -i ~/.ssh/id_rsa.pub gpadmin@sdw2

Use gpssh-exkeys to enable N-to-N passwordless login
[gpadmin@mdw greenplum-db]$ gpssh-exkeys -f all_host
bash: gpssh-exkeys: command not found
Environment variables need to be activated
[gpadmin@mdw greenplum-db]$ source /usr/local/greenplum-db/greenplum_path.sh
[gpadmin@mdw greenplum-db]$ gpssh-exkeys -f all_host
[STEP 1 of 5] create local ID and authorize on local host
  ... /home/gpadmin/.ssh/id_rsa file exists ... key generation skipped

[STEP 2 of 5] keyscan all hosts and update known_hosts file

[STEP 3 of 5] retrieving credentials from remote hosts
  ... send to sdw1
  ... send to sdw2

[STEP 4 of 5] determine common authentication file content

[STEP 5 of 5] copy authentication files to all remote hosts
  ... finished key exchange with sdw1
  ... finished key exchange with sdw2

[INFO] completed successfully

3. Verify GPSSH

[gpadmin@mdw greenplum-db]$ gpssh -f /usr/local/greenplum-db/all_host -e 'ls /usr/local/'
[sdw1] ls /usr/local/
[sdw1] bin  etc  games  include  lib  lib64  libexec  sbin  share  src
[ mdw] ls /usr/local/
[ mdw] bin  games  greenplum-db-6.14.1  lib    libexec  share
[ mdw] etc  greenplum-db  include      lib64  sbin     src
[sdw2] ls /usr/local/
[sdw2] bin  etc  games  include  lib  lib64  libexec  sbin  share  src

4. Set environment variables in batches

## Set the Greenplum environment variables for the gpadmin user in batch
## Add the GP installation directory and environment settings to the user's environment files.
vim /home/gpadmin/.bash_profile

cat >> /home/gpadmin/.bash_profile << EOF
source /usr/local/greenplum-db/greenplum_path.sh
EOF

vim .bashrc

cat >> /home/gpadmin/.bashrc << EOF
source /usr/local/greenplum-db/greenplum_path.sh
EOF

vim /etc/profile

cat >> /etc/profile << EOF
source /usr/local/greenplum-db/greenplum_path.sh
EOF

## Environment variable files are distributed to other nodes
su - root
source /usr/local/greenplum-db/greenplum_path.sh
gpscp -f /usr/local/greenplum-db/seg_host /etc/profile @=:/etc/profile

su - gpadmin
source /usr/local/greenplum-db/greenplum_path.sh
gpscp -f /usr/local/greenplum-db/seg_host /home/gpadmin/.bash_profile  gpadmin@=:/home/gpadmin/.bash_profile
gpscp -f /usr/local/greenplum-db/seg_host /home/gpadmin/.bashrc gpadmin@=:/home/gpadmin/.bashrc

4. Cluster node installation

## Different from older versions
This part is missing from the official documentation. Prior to GP 6 there was a tool, gpseginstall, that installed the GP software on every node. Its logs show that its main steps were:
1. Create the gp user on each node (skipped here, already done).
2. Package the installation directory on the master node.
3. Distribute the package to each segment host and unpack it.
4. Recreate the greenplum-db soft link.
5. Grant ownership to gpadmin.
For the gpseginstall installation log, refer to the GP 5 installation notes.

1. Simulate the gpseginstall script

Run the following as user root.

# Variable settings
link_name='greenplum-db'                      # soft link name
binary_dir_location='/usr/local'              # install path
binary_dir_name='greenplum-db-6.14.1'         # install directory
binary_path='/usr/local/greenplum-db-6.14.1'  # full path

Master node packaging

chown -R gpadmin:gpadmin $binary_path
rm -f ${binary_path}.tar; rm -f ${binary_path}.tar.gz
cd $binary_dir_location; tar cf ${binary_dir_name}.tar ${binary_dir_name}
gzip ${binary_path}.tar



[root@mdw local]# chown -R gpadmin:gpadmin $binary_path
[root@mdw local]# rm -f ${binary_path}.tar; rm -f ${binary_path}.tar.gz
[root@mdw local]# cd $binary_dir_location; tar cf ${binary_dir_name}.tar ${binary_dir_name}
[root@mdw local]# gzip ${binary_path}.tar
[root@mdw local]# ls
bin  etc  games  greenplum-db  greenplum-db-6.14.1  greenplum-db-6.14.1.tar.gz  include  lib  lib64  libexec  sbin  share  src

Distribute to segments as root

link_name='greenplum-db'
binary_dir_location='/usr/local'
binary_dir_name='greenplum-db-6.14.1'
binary_path='/usr/local/greenplum-db-6.14.1'
source /usr/local/greenplum-db/greenplum_path.sh
gpssh -f ${binary_path}/seg_host -e "mkdir -p ${binary_dir_location}; rm -rf ${binary_path}; rm -rf ${binary_path}.tar; rm -rf ${binary_path}.tar.gz"
gpscp -f ${binary_path}/seg_host ${binary_path}.tar.gz root@=:${binary_path}.tar.gz
gpssh -f ${binary_path}/seg_host -e "cd ${binary_dir_location}; gzip -f -d ${binary_path}.tar.gz; tar xf ${binary_path}.tar"
gpssh -f ${binary_path}/seg_host -e "rm -rf ${binary_path}.tar; rm -rf ${binary_path}.tar.gz; rm -f ${binary_dir_location}/${link_name}"
gpssh -f ${binary_path}/seg_host -e "ln -fs ${binary_dir_location}/${binary_dir_name} ${binary_dir_location}/${link_name}"
gpssh -f ${binary_path}/seg_host -e "chown -R gpadmin:gpadmin ${binary_dir_location}/${link_name}; chown -R gpadmin:gpadmin ${binary_dir_location}/${binary_dir_name}"
gpssh -f ${binary_path}/seg_host -e "source ${binary_path}/greenplum_path.sh"
gpssh -f ${binary_path}/seg_host -e "cd ${binary_dir_location}; ls -l"

2. Create a cluster data directory

## Create the master data directory
mkdir -p /opt/greenplum/data/master
chown gpadmin:gpadmin /opt/greenplum/data/master

## Standby data directory (this experiment has no standby)
## Use gpssh to create the data directory on the standby remotely:
# source /usr/local/greenplum-db/greenplum_path.sh
# gpssh -h smdw -e 'mkdir -p /data/master'
# gpssh -h smdw -e 'chown gpadmin:gpadmin /data/master'

## Create the segment data directories
## Two segments and two mirrors are planned per host this time.
source /usr/local/greenplum-db/greenplum_path.sh
gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /opt/greenplum/data1/primary'
gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /opt/greenplum/data1/mirror'
gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /opt/greenplum/data2/primary'
gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /opt/greenplum/data2/mirror'
gpssh -f /usr/local/greenplum-db/seg_host -e 'chown -R gpadmin /opt/greenplum/data*'

3. Perform a cluster performance test

## Different from older versions
GP 6 removed the gpcheck tool, which used to verify the system parameters and hardware configuration required by GP. What can still be verified is network and disk I/O performance (with gpcheckperf). As a rule of thumb, disk throughput should reach about 2000 MB/s and the network at least 1000 MB/s.

4. Network performance test

[root@mdw ~]# gpcheckperf -f /usr/local/greenplum-db/seg_host -r N -d /tmp
/usr/local/greenplum-db-6.14.1/bin/gpcheckperf -f /usr/local/greenplum-db/seg_host -r N -d /tmp
-------------------
--  NETPERF TEST
-------------------
NOTICE: -t is deprecated, and has no effect
NOTICE: -f is deprecated, and has no effect
NOTICE: -t is deprecated, and has no effect
NOTICE: -f is deprecated, and has no effect

====================
==  RESULT 2021-02-27T11:...502661
====================

Netperf bisection bandwidth test
sdw1 -> sdw2 = 2069.110000
sdw2 -> sdw1 = 2251.890000

Summary:
sum = 4321.00 MB/sec
min = 2069.11 MB/sec
max = 2251.89 MB/sec
avg = 2160.50 MB/sec
median = 2251.89 MB/sec

5. Test disk I/O performance

gpcheckperf -f /usr/local/greenplum-db/seg_host -r ds -D -d /opt/greenplum/data1/primary

6. Cluster clock verification (unofficial step)

Verify the cluster time. If the cluster time is inconsistent, change the NTP time

gpssh -f /usr/local/greenplum-db/all_host -e 'date'

5. Cluster initialization

Reference: gpdb.docs.pivotal.io/6-2/install…

1. Write an initial configuration file

su - gpadmin
mkdir -p /home/gpadmin/gpconfigs
cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/gpinitsystem_config

2. Modify parameters as required

Note: when choosing PORT_BASE, check the port range specified by net.ipv4.ip_local_port_range in /etc/sysctl.conf and keep the database ports outside it. The main parameters to modify are:

 
ARRAY_NAME="Greenplum Data Platform"
SEG_PREFIX=gpseg
PORT_BASE=6000
declare -a DATA_DIRECTORY=(/opt/greenplum/data1/primary /opt/greenplum/data2/primary)
MASTER_HOSTNAME=mdw
MASTER_DIRECTORY=/opt/greenplum/data/master
MASTER_PORT=5432
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
MIRROR_PORT_BASE=7000
declare -a MIRROR_DATA_DIRECTORY=(/opt/greenplum/data1/mirror /opt/greenplum/data2/mirror)
DATABASE_NAME=gpdw

3. Cluster initialization command parameters

## Run the following as root first, to avoid two errors seen during initialization:
## Error 1: /usr/local/greenplum-db/./bin/gpinitsystem: line 244: /tmp/cluster_tmp_file.8070: Permission denied

gpssh -f /usr/local/greenplum-db/all_host -e 'chmod 777 /tmp'

## Error 2: /bin/mv: cannot stat '/tmp/cluster_tmp_file.8070': Permission denied

gpssh -f /usr/local/greenplum-db/all_host -e 'chmod u+s /bin/ping'

su - gpadmin
gpinitsystem -c /home/gpadmin/gpconfigs/gpinitsystem_config -h /usr/local/greenplum-db/seg_host -D

If initialization completes successfully, "Greenplum Database instance successfully created" is printed. The log is written to the /home/gpadmin/gpAdminLogs/ directory and named gpinitsystem_${date}.log. The last part of the installation log looks like this:

20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Function PARALLEL_SUMMARY_STATUS_REPORT
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Function CREATE_SEGMENT
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Start Function FORCE_FTS_PROBE
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Function FORCE_FTS_PROBE
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Start Function SCAN_LOG
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Scanning utility log file for any warning messages
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Log file scan check passed
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Function SCAN_LOG
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Greenplum Database instance successfully created
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-To complete the environment configuration, please 
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1"
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-   to access the Greenplum scripts for this instance:
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-   or, use -d /opt/greenplum/data/master/gpseg-1 option for the Greenplum scripts
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-   Example gpstate -d /opt/greenplum/data/master/gpseg-1
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20210227.log
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-To initialize a Standby Master Segment for this Greenplum instance
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Review options for gpinitstandby
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-The Master /opt/greenplum/data/master/gpseg-1/pg_hba.conf post gpinitsystem
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-new array must be explicitly added to this file
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-located in the /usr/local/greenplum-db-6.14.1/docs directory
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Main

Read the end of the log carefully; there are a few more steps to take. First, check the log content: it also contains the following message:

Scan of log file indicates that some warnings or errors
were generated during the array creation
Please review contents of log file
/home/gpadmin/gpAdminLogs/gpinitsystem_20210227.log

Scan for warnings or errors:

[gpadmin@mdw ~]$ cat /home/gpadmin/gpAdminLogs/gpinitsystem_20210227.log|grep -E -i 'WARN|ERROR]'
WARNING: enabling "trust" authentication for local connections
WARNING: enabling "trust" authentication for local connections
WARNING: enabling "trust" authentication for local connections
WARNING: enabling "trust" authentication for local connections
WARNING: enabling "trust" authentication for local connections
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Scanning utility log file for any warning messages

Adjust the configuration according to the log content to optimize cluster performance.

4. Set environment variables

Edit the environment variable for user gpadmin and add:

source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1

In addition, the following are usually added:

export PGPORT=5432       # Set this parameter based on the actual situation
export PGUSER=gpadmin    # Set this parameter based on the actual situation
export PGDATABASE=gpdw  # Set this parameter based on the actual situation

Environment variables for reference: gpdb.docs.pivotal.io/510/install…

su - gpadmin
cat >> /home/gpadmin/.bash_profile << EOF
export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=gpdw
EOF

cat >> /home/gpadmin/.bashrc << EOF
export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=gpdw
EOF

## Environment variable files are distributed to other nodes
gpscp -f /usr/local/greenplum-db/seg_host /home/gpadmin/.bash_profile gpadmin@=:/home/gpadmin/.bash_profile
gpscp -f /usr/local/greenplum-db/seg_host /home/gpadmin/.bashrc gpadmin@=:/home/gpadmin/.bashrc
gpssh -f /usr/local/greenplum-db/all_host -e 'source /home/gpadmin/.bash_profile; source /home/gpadmin/.bashrc; '


5. To delete the cluster and reinstall, use gpdeletesystem

For more details, see the gpdeletesystem utility reference at gpdb.docs.pivotal.io/6-2/… . Use the following command:

gpdeletesystem -d /opt/greenplum/data/master/gpseg-1 -f

-d MASTER_DATA_DIRECTORY: specifies the master data directory; the data directories of the master and all segments are deleted. -f: force; terminates remaining processes and deletes forcibly. Example:

gpdeletesystem -d /opt/greenplum/data/master/gpseg-1 -f

After deleting the cluster, adjust the cluster initialization configuration file as required and initialize the cluster again.

vi /home/gpadmin/gpconfigs/gpinitsystem_config
gpinitsystem -c /home/gpadmin/gpconfigs/gpinitsystem_config -h /usr/local/greenplum-db/seg_host -D

6. Configure after successful installation

1. Log in to GP with psql and set a password

Use psql to log in to GP. The general command format is as follows:

psql -h hostname -p port -d database -U user -W password

-h: the host name of the master (or a segment host)
-p: the port number of the master (or segment)
-d: the database name
These parameters can also be set as user environment variables. On Linux, the gpadmin user can log in locally without a password. Example of logging in with psql and setting the password of user gpadmin:

[gpadmin@mdw gpseg-1]$ psql -h mdw -p 5432 -d gpdw
psql (9.4.24)
Type "help" for help.

gpdw=# alter user gpadmin with password 'gpadmin';
ALTER ROLE
gpdw=# \q

2. Log in to different nodes

[gpadmin@mdw gpseg-1]$ PGOPTIONS='-c gp_session_role=utility' psql -h mdw -p 5432 -d postgres
psql (9.4.24)
Type "help" for help.

postgres=# \q
[gpadmin@mdw gpseg-1]$ PGOPTIONS='-c gp_session_role=utility' psql -h sdw1 -p 6000 -d postgres
psql (9.4.24)
Type "help" for help.

postgres=# \q

3. Log in to the GP client

Configure pg_hba.conf and postgresql.conf.

Configure pg_hba.conf

Reference configuration instructions: blog.csdn.net/yaoqiancuo3…

vim /opt/greenplum/data/master/gpseg-1/pg_hba.conf

## Change
host replication gpadmin 192.168.31.201/32 trust
## to
host all gpadmin 192.168.31.201/32 trust
## Add
host all gpadmin 0.0.0.0/0 md5    # new rule: allow password login from any IP

Configure postgresql.conf

GP 6.0 already sets listen_addresses = '*' by default, so usually no change is needed:

vim /opt/greenplum/data/master/gpseg-1/postgresql.conf
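
A quick way to confirm the setting (path as used above):

grep listen_addresses /opt/greenplum/data/master/gpseg-1/postgresql.conf
# expected: listen_addresses = '*'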

4. Reload the modified configuration files

gpstop -u    # reload pg_hba.conf and runtime configuration parameters without restarting the cluster