Introduction to Greenplum
PostgreSQL
PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions. This distribution also contains C language bindings.
Greenplum
The Greenplum Database (GPDB) is an advanced, fully featured, open source data warehouse. It provides powerful and rapid analytics on petabyte-scale data volumes. Geared toward big data analytics, Greenplum Database is powered by an advanced cost-based query optimizer that delivers high analytical query performance on large data volumes. Greenplum is based on PostgreSQL and has almost the same syntax. It is essentially a relational database cluster: one logical database composed of multiple independent database services. Like Oracle or PostgreSQL, it is accessed through standard SQL statements.
Compiling and installing Greenplum from source
CentOS 6.9 is used as an example. The installation images can be downloaded from the Alibaba Cloud mirror:
CentOS-6.9-x86_64-bin-DVD1.iso
CentOS-6.9-x86_64-bin-DVD2.iso
It is strongly recommended that the user name gpadmin be used.
Preparation
Grant gpadmin root (sudo) privileges
Open the /etc/sudoers file with vim (as root) and find the following two lines
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
Add the following on the next line
gpadmin ALL=(ALL) ALL # keep the columns aligned
Save and exit
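As a quick sanity check after the edit, the sketch below tests whether a sudoers-style file contains a full-privilege entry for a given user. The function name has_sudo_entry is hypothetical, and the check is a plain text match rather than a real sudoers parser, so visudo -c remains the authoritative syntax check.

```shell
# has_sudo_entry FILE USER
# Succeeds if FILE has an uncommented "USER ALL=(ALL) ALL" line.
# Hypothetical helper: a text match only, not a sudoers parser.
has_sudo_entry() {
    grep -Eq "^[[:space:]]*$2[[:space:]]+ALL=\(ALL\)[[:space:]]+ALL" "$1"
}

# Example (reading /etc/sudoers needs root):
#   has_sudo_entry /etc/sudoers gpadmin && echo "entry present"
```

Trailing comments on the entry are tolerated because the pattern is only anchored at the start of the line.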
Install gcc-c++
sudo yum install gcc-c++
Install git
sudo yum install git
Install cmake
cd /home/gpadmin
wget https://cmake.org/files/v3.5/cmake-3.5.2.tar.gz
tar -zxvf cmake-3.5.2.tar.gz
cd cmake-3.5.2
./configure --prefix=/usr/local
make
sudo make install
Install pip
sudo yum -y install epel-release
sudo yum -y install python-pip
pip install --upgrade pip # the pip version must be at least 7.x.x
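The version requirement can be checked from the output of pip --version. The following is a minimal sketch; the function names pip_major and pip_new_enough are hypothetical, and it assumes the usual "pip X.Y.Z from ..." output format.

```shell
# pip_major VERSION_LINE
# Prints the major version from a line such as "pip 9.0.1 from /usr/lib/...".
pip_major() {
    echo "$1" | awk '{ split($2, v, "."); print v[1] }'
}

# pip_new_enough VERSION_LINE
# Succeeds if the major version is at least 7.
pip_new_enough() {
    [ "$(pip_major "$1")" -ge 7 ]
}

# Example:
#   pip_new_enough "$(pip --version)" || echo "upgrade pip first"
```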
Make sure you have the necessary Python modules installed
- psutil
- lockfile (>= 0.9.1)
- paramiko
- setuptools
Make shared libraries visible to the dynamic linker
Add /usr/local/lib and /usr/local/lib64 to /etc/ld.so.conf, then run:
sudo ldconfig
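If the step is scripted instead of done in an editor, it helps to append each path only once so that re-running the script does not duplicate lines. A minimal sketch, with add_line_once as a hypothetical helper name:

```shell
# add_line_once FILE LINE
# Appends LINE to FILE unless an identical line is already present.
add_line_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example (run as root, since /etc/ld.so.conf is root-owned):
#   add_line_once /etc/ld.so.conf /usr/local/lib
#   add_line_once /etc/ld.so.conf /usr/local/lib64
#   ldconfig
```

The -x flag makes grep match whole lines, so /usr/local/lib is not mistaken for /usr/local/lib64.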
C++11 support
sudo yum install -y centos-release-scl
sudo yum install -y devtoolset-6-toolchain
echo 'source scl_source enable devtoolset-6' >> ~/.bashrc
source ~/.bashrc
Install dependencies
sudo yum -y install gcc git apr bison flex readline gcc-c++ curl-devel bzip2-devel python-devel readline-devel apr-devel libevent-devel openssl-devel perl-ExtUtils-Embed libxml2-devel openldap-devel pam pam-devel
Install GPORCA
Clone GPORCA
cd /home/gpadmin
git clone https://github.com/greenplum-db/gporca.git
Install gp-xerces
git clone https://github.com/greenplum-db/gp-xerces.git
cd gp-xerces
mkdir build
cd build
../configure --prefix=/usr/local
make
sudo make install
Install ninja
cd /home/gpadmin
git clone https://github.com/ninja-build/ninja.git && cd ninja
./configure.py --bootstrap
sudo cp ninja /usr/bin
Compile and install GPORCA
cd /home/gpadmin/gporca
cmake -GNinja -H. -Bbuild
sudo ninja install -C build
Compile Greenplum
cd /home/gpadmin
git clone https://github.com/greenplum-db/gpdb.git
cd gpdb
./configure --with-perl --with-python --with-libxml --with-gssapi --prefix=/usr/local/gpdb
# Compile and install
make -j8
sudo make -j8 install
# Bring in greenplum environment into your running shell
source /usr/local/gpdb/greenplum_path.sh
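The -j8 above assumes a machine that can comfortably run eight parallel jobs. One common alternative is to match the job count to the number of online CPUs. The sketch below does that; build_jobs is a hypothetical helper name, and it relies on getconf _NPROCESSORS_ONLN, which is available on Linux with glibc.

```shell
# build_jobs
# Prints the number of online CPUs, falling back to 1 if it cannot be determined.
build_jobs() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case "$n" in
        ''|*[!0-9]*) n=1 ;;   # empty or non-numeric output
    esac
    [ "$n" -ge 1 ] || n=1
    echo "$n"
}

# Example:
#   make -j"$(build_jobs)"
```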
If the following message appears, the installation succeeded:
Greenplum Database installation complete.
Deployment
This example runs one master node and one segment node on the same host.
Configure the hosts file
Check the IP address
ip addr show eth0
Use the IP address shown on the inet line.
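Picking the address out of the inet line can also be done with a small filter. A minimal sketch, assuming the usual ip addr output in which the IPv4 address appears as "inet ADDRESS/PREFIX"; first_inet_addr is a hypothetical helper name.

```shell
# first_inet_addr
# Reads `ip addr show` output on stdin and prints the first IPv4 address
# (the value on the first "inet" line, without the /prefix length).
first_inet_addr() {
    awk '$1 == "inet" { split($2, a, "/"); print a[1]; exit }'
}

# Example:
#   ip addr show eth0 | first_inet_addr
```

Matching on the first field being exactly inet skips the inet6 lines.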
Modify the hosts file
sudo vim /etc/hosts
Add the following two lines
<ip> mdw
<ip> sdw1
Both entries use the same IP address, since the master and the segment share one host.
Modify the network file
sudo vim /etc/sysconfig/network
# Set HOSTNAME=mdw
Create host list file and segment file
cd /home/gpadmin
mkdir conf
cd conf
sudo vim hostlist
Add the following
mdw
sdw1
sudo vim seg_hosts
Add the following
sdw1
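The two files serve different tools: hostlist is passed to gpssh-exkeys and seg_hosts is referenced by MACHINE_LIST_FILE in the initialization file. It seems sensible (an assumption of this sketch, not a documented requirement) that every segment host also appears in hostlist. The hypothetical helper below lists any that do not.

```shell
# hosts_missing_from LIST SUBSET
# Prints every host named in SUBSET that does not appear in LIST.
# Always returns 0; an empty output means nothing is missing.
hosts_missing_from() {
    grep -vxF -f "$1" "$2" || true
}

# Example:
#   hosts_missing_from /home/gpadmin/conf/hostlist /home/gpadmin/conf/seg_hosts
```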
Shared memory and network parameters
sudo vim /etc/sysctl.conf
Replace the contents with the following
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 1
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.sem = 250 64000 100 512
kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.core.netdev_max_backlog = 10000
vm.overcommit_memory = 2
net.ipv4.conf.all.arp_filter = 1
sudo sysctl -p
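When a sysctl.conf is assembled by hand it is easy to assign the same key twice. The sketch below reports such keys; duplicate_sysctl_keys is a hypothetical helper name, and it treats lines starting with # or ; as comments.

```shell
# duplicate_sysctl_keys FILE
# Prints each key that is assigned more than once in a sysctl.conf-style file.
duplicate_sysctl_keys() {
    awk -F= '
        /^[ \t]*(#|;|$)/ { next }            # skip comments and blank lines
        {
            key = $1
            gsub(/[ \t]/, "", key)           # drop whitespace around the key
            if (++seen[key] == 2) print key  # report on the second sighting only
        }
    ' "$1"
}

# Example:
#   duplicate_sysctl_keys /etc/sysctl.conf
```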
Add resource limits
sudo vim /etc/security/limits.conf
Add the following
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072
# The asterisk must not be removed
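The new limits only apply to sessions started after the change, so it is worth confirming them from a fresh login. A minimal sketch of the comparison, with limit_at_least as a hypothetical helper name; the ulimit -u option in the example is a bash feature.

```shell
# limit_at_least ACTUAL REQUIRED
# Succeeds if ACTUAL (as printed by ulimit) is "unlimited" or >= REQUIRED.
limit_at_least() {
    [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

# Example, in a fresh bash login shell as gpadmin:
#   limit_at_least "$(ulimit -n)" 65536  || echo "nofile limit too low"
#   limit_at_least "$(ulimit -u)" 131072 || echo "nproc limit too low"
```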
Node deployment
Setting environment variables
source /usr/local/gpdb/greenplum_path.sh
Configure passwordless SSH login
sudo pip install psutil
gpssh-exkeys -f /home/gpadmin/conf/hostlist
Create the data directories
cd /home/gpadmin
mkdir gpdata
cd gpdata
mkdir segmentdata1 masterdata
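The same layout can be created in one step with mkdir -p, which also makes the step safe to repeat. A minimal sketch; make_data_dirs is a hypothetical helper name.

```shell
# make_data_dirs BASE
# Creates BASE/gpdata/masterdata and BASE/gpdata/segmentdata1.
# mkdir -p creates missing parents and does not fail if they already exist.
make_data_dirs() {
    mkdir -p "$1/gpdata/masterdata" "$1/gpdata/segmentdata1"
}

# Example:
#   make_data_dirs /home/gpadmin
```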
Configure environment variables
vim ~/.bash_profile
Add the following
source /usr/local/gpdb/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/home/gpadmin/gpdata/masterdata/gpseg-1
export PGPORT=2345
export PGDATABASE=testDB
source ~/.bash_profile
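After sourcing the profile, a quick check that the variables are really set can save a confusing error later. The sketch below prints the name of any variable that is unset or empty; missing_vars is a hypothetical helper name.

```shell
# missing_vars NAME...
# Prints the name of every listed variable that is unset or empty.
missing_vars() {
    for name in "$@"; do
        eval "value=\${$name:-}"   # indirect lookup that works in plain sh
        [ -n "$value" ] || echo "$name"
    done
}

# Example:
#   missing_vars MASTER_DATA_DIRECTORY PGPORT PGDATABASE
```

An empty output means all three variables are available in the current shell.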
Write the database initialization parameter file
cd /home/gpadmin/conf
mkdir gpconfigs
cd gpconfigs
vim gpinitsystem_config
Add the following (the contents of the initialization parameter file)
ARRAY_NAME="Greenplum"
SEG_PREFIX=gpseg
PORT_BASE=40000
declare -a DATA_DIRECTORY=(/home/gpadmin/gpdata/segmentdata1 /home/gpadmin/gpdata/segmentdata1)
MASTER_HOSTNAME=mdw
MASTER_DIRECTORY=/home/gpadmin/gpdata/masterdata
##### Port number for the master instance.
MASTER_PORT=2345
##### Shell utility used to connect to remote hosts.
TRUSTED_SHELL=/usr/bin/ssh
##### Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
#MIRROR_PORT_BASE=50000
REPLICATION_PORT_BASE=41000
#MIRROR_REPLICATION_PORT_BASE=51000
#declare -a MIRROR_DATA_DIRECTORY=(/home/gpadmin/gpdata/segmentmirror1 /home/gpadmin/gpdata/segmentmirror)
MACHINE_LIST_FILE=/home/gpadmin/conf/seg_hosts
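MASTER_PORT here and PGPORT in ~/.bash_profile are both 2345, and they need to stay in step for psql to reach the master without an explicit port. The sketch below reads a value out of the parameter file so the two can be compared; config_value is a hypothetical helper name, and it assumes plain KEY=value lines with no leading whitespace.

```shell
# config_value FILE KEY
# Prints the value of the first uncommented KEY=value line in FILE.
config_value() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# Example:
#   cfg=/home/gpadmin/conf/gpconfigs/gpinitsystem_config
#   [ "$(config_value "$cfg" MASTER_PORT)" = "$PGPORT" ] || echo "port mismatch"
```

Commented-out settings such as #MIRROR_PORT_BASE are ignored because the pattern is anchored at the start of the line.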
Initialize the database
gpinitsystem -c /home/gpadmin/conf/gpconfigs/gpinitsystem_config -a
Using the database
gpstart # Start the database service
gpstop # Stop the database service
gprecoverseg # Recover failed segments
gpstate -m # Check the node status
createdb testDB # Create a database
psql # Connect to the database
Run psql --help to see more options.
TPC-H test
To run the TPC-H benchmark on Greenplum, see https://yq.aliyun.com/articles/93?commentId=29