Basic Configuration Environment
Rely on | version |
---|---|
ClickHouse | 20.11.3.3-2 |
CentOS | 7.x |
Gcc | 4.8.5 |
Download and install ClickHouse
- Offline installation (we use download to install locally)
Wget wget HTTP: / / https://repo.yandex.ru/clickhouse/rpm/stable/x86_64/clickhouse-client-20.11.3.3-2.noarch.rpm https://repo.yandex.ru/clickhouse/rpm/stable/x86_64/clickhouse-common-static-20.11.3.3-2.x86_64.rpm wget https://repo.yandex.ru/clickhouse/rpm/stable/x86_64/clickhouse-common-static-dbg-20.11.3.3-2.x86_64.rpm wget https://repo.yandex.ru/clickhouse/rpm/stable/x86_64/clickhouse-server-20.11.3.3-2.noarch.rpm yum install - y clickhouse - *Copy the code
- Yum source installed online
sudo yum install yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
sudo yum install clickhouse-server clickhouse-client
Copy the code
If you want to use the latest version, replace stable with testing (recommended in your test environment)
Configuration ClickHouse
Clickhouse-server and Clickhouse-client are generated in /etc by default. Since we are installing the service, we need to modify the clickhouse-server configuration file
- Modify the
/etc/clickhouse-server/config.xml
<! --NOTE: User and query level settings are set up in "users.xml" file.
-->
<yandex>
<logger>
<! -- Possible levels: https://github.com/pocoproject/poco/blob/develop/Foundation/include/Poco/Logger.h#L105 -->
<level>trace</level>
<log>/data2/clickhouse/clickhouse-server.log</log>
<errorlog>/data2/clickhouse/clickhouse-server.err.log</errorlog>
<size>1000M</size>
<count>10</count>
<! -- <console>1</console> --> <! -- Default behavior is autodetection (log to console if not daemon mode and is tty) -->
</logger>
<! --display_name>production</display_name--> <! -- It is the name that will be shown in the client -->
<http_port>9123</http_port>
<tcp_port>9000</tcp_port>
<! -- For HTTPS and SSL over native protocol. -->
<! -- <https_port>8443</https_port> <tcp_port_secure>9440</tcp_port_secure> -->
<! -- Used with https_port and tcp_port_secure. Full ssl options list: https://github.com/ClickHouse-Extras/poco/blob/master/NetSSL_OpenSSL/include/Poco/Net/SSLManager.h#L71 -->
<openSSL>
<server> <! -- Used for https server AND secure tcp port -->
<! -- openssl req -subj "/CN=localhost" -new -newkey rsa:2048 -days 365 -nodes -x509 -keyout /etc/clickhouse-server/server.key -out /etc/clickhouse-server/server.crt -->
<certificateFile>/etc/clickhouse-server/server.crt</certificateFile>
<privateKeyFile>/etc/clickhouse-server/server.key</privateKeyFile>
<! -- openssl dhparam -out /etc/clickhouse-server/dhparam.pem 4096 -->
<dhParamsFile>/etc/clickhouse-server/dhparam.pem</dhParamsFile>
<verificationMode>none</verificationMode>
<loadDefaultCAFile>true</loadDefaultCAFile>
<cacheSessions>true</cacheSessions>
<disableProtocols>sslv2,sslv3</disableProtocols>
<preferServerCiphers>true</preferServerCiphers>
</server>
<client> <! -- Used for connecting to https dictionary source -->
<loadDefaultCAFile>true</loadDefaultCAFile>
<cacheSessions>true</cacheSessions>
<disableProtocols>sslv2,sslv3</disableProtocols>
<preferServerCiphers>true</preferServerCiphers>
<! -- Use for self-signed: <verificationMode>none</verificationMode> -->
<invalidCertificateHandler>
<! -- Use for self-signed: <name>AcceptCertificateHandler</name> -->
<name>RejectCertificateHandler</name>
</invalidCertificateHandler>
</client>
</openSSL>
<! -- Default root page on http[s] server. For example load UI from https://tabix.io/ when opening http://localhost:8123 -->
<! -- <http_server_default_response><! [CDATA[<html ng-app="SMI2"><head><base href="http://ui.tabix.io/"></head><body><div ui-view="" class="content-ui"></div><script src="http://loader.tabix.io/master.js"></script></body></html>]]></http_server_default_response> -->
<! -- Port for communication between replicas. Used for data exchange. -->
<interserver_http_port>9009</interserver_http_port>
<listen_host>0.0.0.0</listen_host>
<! -- Don't exit if ipv6 or ipv4 unavailable, but listen_host with this protocol specified -->
<! -- <listen_try>0</listen_try> -->
<! -- Allow listen on same address:port -->
<! -- <listen_reuse_port>0</listen_reuse_port> -->
<! -- <listen_backlog>64</listen_backlog> -->
<max_connections>4096</max_connections>
<keep_alive_timeout>3</keep_alive_timeout>
<! -- Maximum number of concurrent queries. -->
<max_concurrent_queries>100</max_concurrent_queries>
<! -- Set limit on number of open files (default: maximum). This setting makes sense on Mac OS X because getrlimit() fails to retrieve correct maximum value. -->
<! -- <max_open_files>262144</max_open_files> -->
<! -- Size of cache of uncompressed blocks of data, used in tables of MergeTree family. In bytes. Cache is single for server. Memory is allocated only on demand. Cache is used when 'use_uncompressed_cache' user setting turned on (off by default). Uncompressed cache is advantageous only for very short queries and in rare cases. -->
<uncompressed_cache_size>8589934592</uncompressed_cache_size>
<! -- Approximate size of mark cache, used in tables of MergeTree family. In bytes. Cache is single for server. Memory is allocated only on demand. You should not lower this value. -->
<mark_cache_size>5368709120</mark_cache_size>
<! -- Path to data directory, with trailing slash. -->
<path>/data2/clickhouse/data/</path>
<! -- Path to temporary data for processing hard queries. -->
<tmp_path>/data2/clickhouse/tmp/</tmp_path>
<! -- Directory with user provided files that are accessible by 'file' table function. -->
<user_files_path>/data2/clickhouse/user_files/</user_files_path>
<! -- Path to configuration file with users, access rights, profiles of settings, quotas. -->
<users_config>users.xml</users_config>
<! -- Default profile of settings. -->
<default_profile>default</default_profile>
<! -- System profile of settings. This settings are used by internal processes (Buffer storage, Distibuted DDL worker and so on). -->
<! -- <system_profile>default</system_profile> -->
<! -- Default database. -->
<default_database>default</default_database>
<mlock_executable>false</mlock_executable>
<zookeeper incl="zookeeper-servers" optional="true" />
<! -- Substitutions for parameters of replicated tables. Optional. If you don't use replicated tables, you could omit that. See https://clickhouse.yandex/docs/en/table_engines/replication/#creating-replicated-tables -->
<macros incl="macros" optional="true" />
<! -- Reloading interval for embedded dictionaries, in seconds. Default: 3600. -->
<builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
<! -- Maximum session timeout, in seconds. Default: 3600. -->
<max_session_timeout>3600</max_session_timeout>
<! -- Default session timeout, in seconds. Default: 60. -->
<default_session_timeout>60</default_session_timeout>
<! -- Query log. Used only for queries with setting log_queries = 1. -->
<query_log>
<! -- What table to insert data. If table is not exist, it will be created. When query log structure is changed after system update, then old table will be renamed and new table will be created automatically. -->
<database>system</database>
<table>query_log</table>
<! -- PARTITION BY expr https://clickhouse.yandex/docs/en/table_engines/custom_partitioning_key/ Example: event_date toMonday(event_date) toYYYYMM(event_date) toStartOfHour(event_time) -->
<partition_by>toYYYYMM(event_date)</partition_by>
<! -- Interval of flushing data. -->
<flush_interval_milliseconds>7500</flush_interval_milliseconds>
</query_log>
<! -- Trace log. Stores stack traces collected by query profilers. See query_profiler_real_time_period_ns and query_profiler_cpu_time_period_ns settings. -->
<trace_log>
<database>system</database>
<table>trace_log</table>
<partition_by>toYYYYMM(event_date)</partition_by>
<flush_interval_milliseconds>7500</flush_interval_milliseconds>
</trace_log>
<! -- Query thread log. Has information about all threads participated in query execution. Used only for queries with setting log_query_threads = 1. -->
<query_thread_log>
<database>system</database>
<table>query_thread_log</table>
<partition_by>toYYYYMM(event_date)</partition_by>
<flush_interval_milliseconds>7500</flush_interval_milliseconds>
</query_thread_log>
<dictionaries_config>*_dictionary.xml</dictionaries_config>
<! -- Uncomment if you want data to be compressed 30-100% better. Don't do that if you just started using ClickHouse. -->
<compression incl="clickhouse_compression">
</compression>
<! -- Allow to execute distributed DDL queries (CREATE, DROP, ALTER, RENAME) on cluster. Works only if ZooKeeper is enabled. Comment it if such functionality isn't required. -->
<distributed_ddl>
<! -- Path in ZooKeeper to queue with DDL queries -->
<path>/a8root/clickhouse/task_queue/ddl</path>
<! -- Settings from this profile will be used to execute DDL queries -->
<! -- <profile>default</profile> -->
</distributed_ddl>
<graphite_rollup_example>
<pattern>
<regexp>click_cost</regexp>
<function>any</function>
<retention>
<age>0</age>
<precision>3600</precision>
</retention>
<retention>
<age>86400</age>
<precision>60</precision>
</retention>
</pattern>
<default>
<function>max</function>
<retention>
<age>0</age>
<precision>60</precision>
</retention>
<retention>
<age>3600</age>
<precision>300</precision>
</retention>
<retention>
<age>86400</age>
<precision>3600</precision>
</retention>
</default>
</graphite_rollup_example>
<format_schema_path>/data2/clickhouse/format_schemas/</format_schema_path>
</yandex>
Copy the code
We generally modify the following configuration items:
logger
Example Change the log storage pathhttp_port
Modify the port that the browser accesses. The default port is8123
tcp_port
Example Modify the TCP transfer portopenSSL
Some SSL authentication configuration files, we do not do SSL authentication for the moment, do not deal with herelisten_host
If we want to use the service externally, we need to change this to0.0.0.0
remote_servers
Here is the configuration information related to the cluster, which will be explained later
Other custom configurations can be modified so that we can start a local ClickHouse service
- Create directories and permissions
mkdir -p /data2/clickhouse
chown -R clickhouse:clickhouse /data2/clickhouse
Copy the code
Operating ClickHouse
- Start the service
sudo -u clickhouse clickhouse-server --daemon --pid-file=/var/run/clickhouse-server/clickhouse-server.pid --config-file=/etc/clickhouse-server/config.xml
Copy the code
--daemon
Indicates that we want to start the service in the background--pid-file
Specifies the path to the progress file after the service is started--config-file
Specify the configuration file to start the service
If we use debugging, we can use the following command
sudo -u clickhouse clickhouse-server start
Copy the code
To boot as a separate user, the system will prompt you to boot as root
Debug the ClickHouse service
After ClickHouse is installed, the clickhouse-Server and Clickhouse-client directories will be generated, and we will use Clickhouse-Client to test the service
clickhouse-client -h clickhouse --port 9000 --multiquery --query="show databases"
Copy the code
Note: –port specifies the TCP port
Running the query database list returns information similar to the following
_temporary_and_external_tables
default
system
Copy the code
This syntax can use multiple SQL according to English; Just split each SQL
For more clickhouse-client information use clickhouse-client –help