I. Background

I work as a front-line technician. Our customer currently runs a hybrid architecture: the operations side uses the public cloud's managed Redis database product, while the business also runs Redis on cloud hosts and in Kubernetes containers. This note is a brief analysis and summary of how to monitor databases in this hybrid mode, and I hope it is of some help to you.

Redis is a key-value storage system. It is similar to Memcached, but it supports more value types and data persistence, and its operations are atomic. It also supports master/slave synchronization, clustering, and a publish/subscribe mechanism, which greatly improves the scalability of data operations and the safety of data through redundancy. Today, whether Redis runs on a public or private cloud, in a self-built cluster, in a Docker container, or as a SaaS product, there are diverse and complementary monitoring methods to cover the different scenarios.

II. Monitoring methods

2.1 Cloud Platform monitoring

Database products offered as SaaS currently work out of the box. Underneath, they rely on each cloud vendor's multiple replicas or clusters to protect data safety and service stability. Even so, after the several recent cloud outages, we should stay cautious about our data and back it up in several ways. There is no absolute safety: no matter how many nines the SLA promises, irretrievably lost or corrupted data is a 100 percent loss to its owner, and data is priceless. Data backup is therefore required:

  • The public cloud product's own backup policy
  • Verification that backups are usable and can be restored
  • Remote disaster recovery (DR) across availability zones
  • Multiple replicas and master/slave backup
  • Archive backups to local or object storage

For monitoring, you can use the database monitoring built into the public cloud, but it may offer only a few metrics, and many cloud vendors collect data only every 5 minutes, so it is not very real-time. We can use custom monitoring to report our own data to the cloud platform and extend the set of monitored items, and we can also ask the cloud provider whether the monitoring interval can be shortened. For background, see the cloud platform's custom monitoring documentation: Alibaba Cloud custom monitoring. This is only an example; other cloud vendors that offer a custom monitoring feature can be used in a similar way.

Representative Redis monitoring charts from two domestic public cloud vendors are shown here:

  • Tencent Cloud

  • Alibaba Cloud

2.2 Third-party Monitoring Tools

There are also many mature third-party monitoring tools, such as:

2.2.1 SmartEye of Anchang Network

Only its application monitoring is briefly described here:

  • It covers many types of service monitoring, including common applications and databases
  • It provides a complete set of service metrics from the perspective of operation and maintenance personnel
  • Installation is convenient: it can be installed and deployed with one command
  • Several monitoring modes are available: you can install the Agent directly or monitor through an intranet proxy
  • Its system resource footprint is minimal
  • The monitoring frequency is fine-grained: data is reported every minute
  • You can use custom scripts to add metrics or monitor other application services
  • Alarm policies can be classified by level, and the alarm method is selectable

For more details, see SmartEye's official website.

2.2.2 Jiankongbao (Monitoring Treasure)

Jiankongbao's application monitoring is also fairly detailed. To learn about it, refer to the previous article on Jiankongbao service performance monitoring.

2.3 Open source monitoring tools

2.3.1 RedisLive

RedisLive is an open source graphical monitoring tool written in Python. It is very lightweight: the core consists of just a web service and a monitoring service built on Redis's built-in INFO and MONITOR commands, so its design is very straightforward. It also supports monitoring multiple instances, makes it easy to switch between them, and is simple to configure. Monitoring data can be stored in Redis or persisted in SQLite. For details, see: RedisLive Monitors the Redis service

2.3.2 Zabbix monitoring

Zabbix is an open source monitoring system well known to operation and maintenance personnel. Technical staff favor it for its powerful functions and its variety of monitoring methods, and many companies have built their own customized monitoring systems on top of it through secondary development.

For this customer, we combine SaaS cloud platform monitoring with a private Zabbix monitoring system to build a complete monitoring setup in which the two parts complement each other and provide monitoring redundancy.

  • For the customer's SaaS Redis products, the cloud product's own monitoring plus custom monitoring items is sufficient;
  • For the customer's self-built Redis and the databases running in containers, use Zabbix monitoring;
  • For Redis on a cloud host, install the Agent on the host to collect data and report it to the server in passive mode. However, installing the Agent inside containers, or rolling it out across large-scale services that are already in full operation, is not very convenient. In that case, use custom scripts on the Zabbix server for monitoring.

A. You can obtain detailed information about Redis with the info argument of the redis-cli command, and then use a custom script to extract the fields you want to monitor.

# Server # Server information
redis_version:3.2.10        # Redis version
redis_git_sha1:00000000     # Git SHA1
redis_git_dirty:0       # Git dirty flag
redis_build_id:c8b45a0ec7dc67c6
redis_mode:standalone       # Redis run mode
os:Linux 3.10.0-514.26.2.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.5
process_id:1755
run_id:0ee1ada5aa030118000d991c39564e41d504a3a5     # Random identifier of the server (used by Sentinel and Cluster)
tcp_port:6379       # Listening port
uptime_in_seconds:5095759       # Uptime in seconds
uptime_in_days:58       # Uptime in days
hz:10
lru_clock:7308631       # Clock incrementing every minute, used for LRU management
executable:/usr/bin/redis-server
config_file:/etc/redis.conf

# Clients # Client information
connected_clients:4     # Number of connected clients
client_longest_output_list:0        # Longest output list among currently connected clients
client_biggest_input_buf:0      # Largest input buffer among currently connected clients
blocked_clients:0      # Number of blocked clients; raise an alarm if this is greater than 0

# Memory # Memory information
used_memory:1366580712      # Amount of memory allocated by the Redis allocator, in bytes
used_memory_human:1.27G     # Same value in human-readable format
used_memory_rss:1593454592      # Amount of memory allocated to Redis as seen by the OS, including fragmentation, in bytes
used_memory_rss_human:1.48G     # Same value in human-readable format
used_memory_peak:2856423024     # Peak memory consumed by Redis, in bytes
used_memory_peak_human:2.66G    # Same value in human-readable format
total_system_memory:16658460672     # Total system memory, in bytes
total_system_memory_human:15.51G    # Total system memory in human-readable format
used_memory_lua:37888       # Memory used by the Lua engine, in bytes
used_memory_lua_human:37.00K    # Same value in human-readable format
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
mem_fragmentation_ratio:1.17    # Memory fragmentation ratio (used_memory_rss / used_memory). If the value is less than 1, Redis has used the swap partition; raise an alarm
mem_allocator:jemalloc-3.6.0    # Memory allocator used by Redis, chosen at compile time. It can be libc, jemalloc, or tcmalloc

# Persistence # RDB and AOF related information
loading:0       # Flag indicating whether the server is loading a persistence file
rdb_changes_since_last_save:5270        # Number of changes since the last successful RDB save
rdb_bgsave_in_progress:0        # Flag indicating whether the server is creating an RDB file
rdb_last_save_time:1534035237       # UNIX timestamp of the last successful RDB save
rdb_last_bgsave_status:ok       # Whether the last RDB save succeeded or failed
rdb_last_bgsave_time_sec:15     # Duration of the last RDB save, in seconds
rdb_current_bgsave_time_sec:-1      # If an RDB save is in progress, the number of seconds it has taken so far
aof_enabled:0       # Whether AOF is enabled
aof_rewrite_in_progress:0       # Flag indicating whether an AOF rewrite is in progress
aof_rewrite_scheduled:0     # Flag indicating whether an AOF rewrite is scheduled to run once the current RDB save completes
aof_last_rewrite_time_sec:-1        # Duration of the last AOF rewrite, in seconds
aof_current_rewrite_time_sec:-1     # If an AOF rewrite is in progress, the number of seconds it has taken so far
aof_last_bgrewrite_status:ok        # Whether the last AOF rewrite succeeded or failed
aof_last_write_status:ok        # Status of the last write to the AOF file

# Stats
total_connections_received:14133        # Number of connections accepted by the server
total_commands_processed:7477479089     # Number of commands processed by the server
instantaneous_ops_per_sec:271       # Number of commands processed per second
total_net_input_bytes:599074234857      # Total bytes of inbound network traffic
total_net_output_bytes:2484573994623        # Total bytes of outbound network traffic
instantaneous_input_kbps:53.60      # Inbound network rate, in KB per second
instantaneous_output_kbps:107.04    # Outbound network rate, in KB per second
rejected_connections:0      # Number of connections rejected because of the maxclients limit
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:61703692       # Number of keys deleted automatically because they expired
evicted_keys:0      # Number of keys evicted because of the maxmemory limit
keyspace_hits:5121978022        # Number of successful key lookups
keyspace_misses:68376478        # Number of failed key lookups
pubsub_channels:0       # Number of channels currently in use
pubsub_patterns:0       # Number of patterns currently in use
latest_fork_usec:40503      # Time the last fork blocked the Redis process, in microseconds
migrate_cached_sockets:0

# Replication # Replication information
role:master     # Role of this instance
connected_slaves:0      # Number of connected slaves
master_repl_offset:0        # Master replication offset
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

# CPU
used_cpu_sys:169884.36      # System CPU consumed by the Redis server
used_cpu_user:85656.09      # User CPU consumed by the Redis server
used_cpu_sys_children:43671.30      # System CPU consumed by child processes
used_cpu_user_children:316353.91    # User CPU consumed by child processes

# Cluster
cluster_enabled:0       # Whether to enable cluster

# Keyspace
db0:keys=2353841,expires=2204380,avg_ttl=929326076  # Number of keys in the database and number of keys with an expiration set (it is normal for these to decrease)
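The comments in the listing above name two alarm conditions: blocked_clients greater than 0 and mem_fragmentation_ratio less than 1. The following is a minimal sketch of how they could be checked against INFO text, not part of the original setup. The function names info_field and check_redis_alarms are hypothetical, and the INFO text is assumed to arrive on standard input.

```shell
#!/bin/bash
# Sketch only: function names are hypothetical, INFO text is read from stdin.

# Print the value of one INFO field (exact match on the field name).
# redis-cli ends lines with CRLF, so carriage returns are stripped first.
info_field() {
    tr -d '\r' | awk -F':' -v key="$1" '$1 == key { print $2; exit }'
}

# Read INFO text on stdin and print one line per triggered alarm.
# Returns 0 when no alarm fired and 1 otherwise.
check_redis_alarms() {
    local info blocked frag status=0
    info=$(cat)
    blocked=$(printf '%s\n' "$info" | info_field blocked_clients)
    frag=$(printf '%s\n' "$info" | info_field mem_fragmentation_ratio)

    if [ "${blocked:-0}" -gt 0 ]; then
        echo "ALARM blocked_clients=$blocked"
        status=1
    fi
    # Bash has no floating-point arithmetic, so awk does the comparison.
    if [ -n "$frag" ] && awk -v r="$frag" 'BEGIN { exit !(r < 1) }'; then
        echo "ALARM mem_fragmentation_ratio=$frag"
        status=1
    fi
    return "$status"
}
```

With a reachable instance, this could be run as `redis-cli -h <host> -p <port> info | check_redis_alarms`.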

B. Select the metrics you want to monitor and extract them with a script

#!/bin/bash
# ------------------------------------------------------------------------------
# Print the value of one Redis INFO metric so that Zabbix can collect it.
# Usage: pass the metric name as the only argument.

REDIS_CLI_COMMAND="redis-cli"
REDIS_HOST="172.x.x.x"
REDIS_PORT="6379"

ARGS=1

# Exactly one argument (the metric name) is required.
if [ $# -ne "$ARGS" ]; then
    echo "Please input one argument:"
    exit 1
fi

case "$1" in
    connected_clients)
        # tr removes the carriage return that redis-cli appends to each line
        result=$($REDIS_CLI_COMMAND -h "$REDIS_HOST" -p "$REDIS_PORT" -a 'password' info | grep -w "connected_clients" | awk -F':' '{print $2}' | tr -d '\r')
        echo "$result"
        ;;
    used_memory_rss)
        # The value is in bytes, so no unit suffix has to be stripped
        result=$($REDIS_CLI_COMMAND -h "$REDIS_HOST" -p "$REDIS_PORT" -a 'password' info | grep -w "used_memory_rss" | awk -F':' '{print $2}' | tr -d '\r')
        echo "$result"
        ;;
    # Add the other monitoring items you are interested in here, following the pattern above
    *)
        echo "Unsupported metric: $1" >&2
        exit 1
        ;;
esac
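Some values worth monitoring are not printed by INFO directly and have to be derived from two fields. As an illustration, the sketch below computes the keyspace hit rate from keyspace_hits and keyspace_misses. Both the choice of this metric and the function name keyspace_hit_rate are assumptions and are not part of the original script; the INFO text is again assumed to arrive on standard input.

```shell
#!/bin/bash
# Sketch only: keyspace_hit_rate is a hypothetical helper.

# Read INFO text on stdin and print hits / (hits + misses) as a percentage
# with two decimals. Prints 0.00 when no lookups have been recorded yet.
keyspace_hit_rate() {
    tr -d '\r' | awk -F':' '
        $1 == "keyspace_hits"   { hits = $2 }
        $1 == "keyspace_misses" { misses = $2 }
        END {
            total = hits + misses
            if (total == 0) { print "0.00"; exit }
            printf "%.2f\n", hits * 100 / total
        }'
}
```

It could be wired into the script above as one more case branch that pipes the output of the info command into `keyspace_hit_rate`.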

Note that if multiple hosts are monitored, you can pass the host as a second parameter, or you can split the script into several scripts to handle different ports or passwords.

C. Finally, add it to the Agent configuration section on the Zabbix server host

# Allow user-defined scripts (permits special characters in the arguments passed to them)
UnsafeUserParameters=1
# The last two octets of the IP address in the key and script name can be used to tell instances apart
UserParameter=Redis.Info_0.2[*],/etc/zabbix/script/Redis/chk_redis.0.2.sh $1
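For the multi-host case mentioned in the note above, one possible shape is a single function that takes the host and port as optional second and third arguments. This is a sketch under assumptions, not the author's script: the function name redis_metric, the default host 127.0.0.1, and the REDIS_PASSWORD environment variable are all hypothetical. REDIS_CLI_COMMAND is kept overridable so the function can be tried without a live server.

```shell
#!/bin/bash
# Sketch only: redis_metric and REDIS_PASSWORD are hypothetical names.

REDIS_CLI_COMMAND="${REDIS_CLI_COMMAND:-redis-cli}"

# Usage: redis_metric METRIC [HOST] [PORT]
# Prints the value of METRIC from the INFO output of the given instance.
redis_metric() {
    local metric="$1" host="${2:-127.0.0.1}" port="${3:-6379}"
    # Pass -a only when a password has been provided.
    if [ -n "${REDIS_PASSWORD:-}" ]; then
        "$REDIS_CLI_COMMAND" -h "$host" -p "$port" -a "$REDIS_PASSWORD" info
    else
        "$REDIS_CLI_COMMAND" -h "$host" -p "$port" info
    fi | tr -d '\r' | awk -F':' -v key="$metric" '$1 == key { print $2; exit }'
}
```

A wrapper script that ends with `redis_metric "$@"` could then serve every instance from one flexible user parameter that passes `$1 $2 $3` instead of one entry per host. The exact key name and script path for such an entry are left open here.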

III. Summary

  • When everything runs on one platform, prefer the platform's unified monitoring and management. If the platform also offers custom monitoring, all the better: combined with scripts, it gives you fine-grained, multi-dimensional custom monitoring very conveniently;
  • For a hybrid model like this customer's, platform monitoring alone cannot meet most of the requirements. Combine platform-based monitoring with privately deployed monitoring so that they complement each other;
  • If you want both convenience and powerful functions, SmartEye is worth considering. Its monitoring types and metrics are complete, with fine granularity, high frequency, and support for customization and private deployment, which makes it a first choice for next-generation monitoring.