- Syslog: The Complete System Administrator Guide
- Original author: Schkn
- The Nuggets translation Project
- Permanent link to this article: github.com/xitu/gold-m…
- Translator: githubmnume
- Proofreader: TodayCoder001, Shixi-Li, Portandbridge
Syslog: The Complete System Administrator Guide
If you’re a system administrator, or just a regular Linux user, chances are you’ve used Syslog at least once.
On a Linux system, almost everything related to system logging is tied to the Syslog protocol.
Designed by Eric Allman (University of California, Berkeley) in the early 1980s, the protocol is a specification that defines a standard for message logging on any system.
Yes… Any system.
Syslog is not tied to Linux. It can also be used on Windows, or on any other operating system that implements the Syslog protocol.
If you want to learn more about syslog and Linux logging in general, this is probably the tutorial you should read.
Here’s everything you need to know about Syslog.
I – What is the purpose of Syslog?
Syslog is the standard for producing, forwarding, and collecting logs on Linux instances. Syslog defines severity levels and facility levels to help users better understand the logs produced on their machines. Logs can later be analyzed and visualized on servers referred to as Syslog servers.
Here are a few reasons why the Syslog protocol was originally designed:
- Define an architecture: more on this later, but since syslog is a protocol, it is likely to be part of a complete network architecture with multiple clients and servers. We therefore need to define roles. In short: does a node generate, forward, or receive data?
- Message format: Syslog defines how messages are formatted. This needs to be standardized, because logs are often parsed and stored in different storage engines. We therefore need to define what a syslog client can produce and what a syslog server can receive;
- Specify reliability: Syslog needs to define how it handles messages that cannot be delivered. As part of the TCP/IP stack, syslog also has to take a position on which underlying transport protocol (TCP or UDP) to use;
- Handle authentication and message authenticity: Syslog needs a reliable way to ensure that clients and servers communicate securely and that received messages have not been altered.
Now that we know why Syslog was developed in the first place, let’s look at how the Syslog architecture works.
II — What is Syslog Architecture?
When designing a logging architecture, such as a centralized logging server, it is likely that multiple instances will work together.
Some instances will generate log messages, which will be referred to as “devices” or “Syslog clients.”
Some simply forward incoming messages, which will be called “relays.”
Finally, some instances receive and store log data; these are called “collectors” or “syslog servers.”
With these concepts in mind, we can say that a standalone Linux machine acts as a syslog client and server at the same time: it generates log data, which is collected by Rsyslog and stored in the file system.
Here is a set of architectural examples around this principle.
In the first design, you have a device and a collector. This is the simplest form of logging architecture.
Add a few more clients to your infrastructure and you have the foundation for a centralized logging architecture.
Multiple clients are generating data and sending it to a centralized syslog server that aggregates and stores client data.
If we wanted to complicate our architecture, we could add a “relay”.
For example, a relay can be a Logstash instance, but it can also be an Rsyslog rule on the client side.
Most of these relays act as “content-based routers”: data is redirected to different destinations depending on the content of each log message. Data you are not interested in can be discarded entirely.
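To make the idea of a content-based router concrete, here is a minimal Python sketch. It is not how Logstash or Rsyslog implement routing; the rules and the destination names (`security-collector`, `kernel-collector`, `general-collector`) are hypothetical labels chosen for illustration.

```python
from typing import Callable, List, Optional, Tuple

# Each rule pairs a predicate on the message text with a destination name.
# Destination names are hypothetical labels, not real endpoints.
Rule = Tuple[Callable[[str], bool], Optional[str]]

RULES: List[Rule] = [
    (lambda msg: "sshd" in msg, "security-collector"),
    (lambda msg: "kernel:" in msg, "kernel-collector"),
    (lambda msg: "DEBUG" in msg, None),  # None means: discard the message
]

DEFAULT_DESTINATION = "general-collector"


def route(message: str) -> Optional[str]:
    """Return the destination for a message, or None if it is discarded."""
    for matches, destination in RULES:
        if matches(message):
            return destination  # first matching rule wins
    return DEFAULT_DESTINATION
```

Rules are evaluated in order and the first match wins, so more specific rules should come before broader ones.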
Now that we have covered the Syslog components in detail, let’s take a look at what Syslog messages look like.
III – What is the Syslog message format?
The Syslog format is divided into three parts:
- The PRI section: details message priorities (from debug messages to emergencies) and facility levels (mail, authorization, kernel);
- HEADER: consists of two fields: timestamp and host name. The host name is the name of the computer that sends logs.
- MSG section: This section contains actual information about the event that occurred. It is also divided into TAG and CONTENT fields.
Before describing the different parts of the syslog format in detail, let’s take a quick look at Syslog severity levels and Syslog facility levels.
A — What is Syslog Facility level?
Simply put, the facility level is used to identify the program or part of the system that generates logs.
By default, some parts of the system are assigned a facility level, such as the kernel, which uses the kern facility, or the mail system, which uses the mail facility.
If a third party wants to log messages, a set of facility levels from 16 to 23 is reserved for that purpose, called the **“local use” facility levels**.
Alternatively, they can use the “user-level” facility, which means the messages are logged on behalf of the user who executed the command.
In short, if my Apache server is run by an “apache” user, the logs will be stored in a file called “apache.log”, that is, a file named after that user.
The following table describes the Syslog facility levels:
Numerical Code | Keyword | Facility name |
---|---|---|
0 | kern | Kernel messages |
1 | user | User-level messages |
2 | mail | Mail system |
3 | daemon | System Daemons |
4 | auth | Security messages |
5 | syslog | Syslogd messages |
6 | lpr | Line printer subsystem |
7 | news | Network news subsystem |
8 | uucp | UUCP subsystem |
9 | cron | Clock daemon |
10 | authpriv | Security messages |
11 | ftp | FTP daemon |
12 | ntp | NTP subsystem |
13 | security | Security log audit |
14 | console | Console log alerts |
15 | solaris-cron | Scheduling logs |
16-23 | local0 to local7 | Locally used facilities |
Do these levels look familiar to you?
Yes! On Linux, by default, files are separated by facility names, which means you’ll have one file for authentication (auth.log), one file for the kernel (kern.log), and so on.
This is an example screenshot of my Debian 10 instance.
Now that we have seen syslog facility levels, let’s describe what syslog severity levels are.
B – What is the Syslog severity level?
Syslog severity levels indicate how severe a log event is, ranging from debug and informational messages up to emergencies.
Like facility levels, severity levels are numeric: they range from 0 to 7, with 0 being the most critical (emergency).
The following table describes the syslog severity levels:
Value | Severity | Keyword |
---|---|---|
0 | Emergency | emerg |
1 | Alert | alert |
2 | Critical | crit |
3 | Error | err |
4 | Warning | warning |
5 | Notice | notice |
6 | Informational | info |
7 | Debug | debug |
Even though logs are stored by facility name by default, you can choose to store them by event severity level instead.
If you use Rsyslog as your default syslog server, you can look at the Rsyslog properties to configure how logs are separated.
Now that you have a better understanding of facilities and severities, let’s go back to the syslog message format.
C – What is the PRI part?
The PRI block is the first part of a syslog message.
PRI stores the “priority value” between angle brackets.
Remember what you just learned about facilities and severity?
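If you take the facility number of the message, multiply it by 8, and add the severity level, you get the “priority value” of the syslog message. For example, a message from the auth facility (4) with critical severity (2) has a priority value of 4 × 8 + 2 = 34.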
Keep this in mind if you want to decode your syslog messages in the future.
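The calculation is easy to express in code. The following Python sketch encodes and decodes a priority value using the facility and severity numbers from the tables above; the function names `encode_pri` and `decode_pri` are made up for this example.

```python
from typing import Tuple


def encode_pri(facility: int, severity: int) -> int:
    """Combine a facility (0-23) and a severity (0-7) into a priority value."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be between 0 and 23")
    if not 0 <= severity <= 7:
        raise ValueError("severity must be between 0 and 7")
    return facility * 8 + severity


def decode_pri(pri: int) -> Tuple[int, int]:
    """Split a priority value back into (facility, severity)."""
    if not 0 <= pri <= 191:
        raise ValueError("priority value must be between 0 and 191")
    # Integer division recovers the facility, the remainder the severity.
    return divmod(pri, 8)


print(encode_pri(4, 2))   # auth + crit -> 34
print(decode_pri(165))    # -> (20, 5), i.e. local4 + notice
```

The upper bound of 191 follows from the tables: facility 23 with severity 7 gives 23 × 8 + 7.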
D – What is the HEADER part?
As mentioned earlier, the HEADER section consists of two key pieces of information: the TIMESTAMP field and the HOSTNAME field (which can sometimes be an IP address instead).
The HEADER section directly follows the PRI section, right after the closing angle bracket.
It is worth noting that the format of the TIMESTAMP section is “Mmm DD hh:mm:ss”, where “Mmm” is the first three letters of the month of the year.
As for HOSTNAME, it is usually the value returned when you run the `hostname` command. If no host name can be found, the IPv4 or IPv6 address of the host is used instead.
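Putting the PRI, HEADER, and MSG parts together, a message can be decoded with a short parser. The Python sketch below assumes the classic BSD layout described above (`<PRI>Mmm DD hh:mm:ss HOSTNAME TAG...`) and an alphanumeric TAG; real-world messages often deviate from this layout, so treat it as an illustration rather than a complete parser. The sample message is invented for the example.

```python
import re
from typing import Dict, Optional

# <PRI>, then "Mmm DD hh:mm:ss" (day padded with a space), then HOSTNAME, then MSG.
SYSLOG_LINE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) "
    r"(?P<msg>.*)$"
)
# TAG is assumed to be the leading alphanumeric run of MSG; CONTENT is the rest.
TAG = re.compile(r"^(?P<tag>[A-Za-z0-9]{1,32})(?P<content>.*)$")


def parse_message(line: str) -> Optional[Dict[str, object]]:
    """Parse one syslog line; return None if it does not match the layout."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        return None
    pri = int(match.group("pri"))
    if pri > 191:  # highest valid value: facility 23, severity 7
        return None
    facility, severity = divmod(pri, 8)
    msg = match.group("msg")
    tag_match = TAG.match(msg)
    tag = tag_match.group("tag") if tag_match else ""
    content = tag_match.group("content") if tag_match else msg
    return {
        "facility": facility,
        "severity": severity,
        "timestamp": match.group("timestamp"),
        "hostname": match.group("hostname"),
        "tag": tag,
        "content": content,
    }


parsed = parse_message("<34>Oct 11 22:14:15 mymachine su: authentication failure")
print(parsed)
```

Note that the CONTENT field keeps the separator that ends the TAG (here, the colon), since the TAG is terminated by the first non-alphanumeric character.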
IV — How does Syslog messaging work?
When sending Syslog messages, you want to make sure that log data is delivered in a reliable and secure way.
Syslog is opinionated on these points, and here are some answers to those questions.
A – What is syslog forwarding?
Syslog forwarding involves sending client logs to a remote server for centralized recording, facilitating log analysis and visualization.
Most of the time, instead of monitoring one machine, system administrators need to monitor dozens of machines both on-site and remotely.
Therefore, it is quite common to use different communication protocols, such as UDP or TCP, to send logs to a remote machine called a centralized log server.
B – Does Syslog use TCP or UDP?
According to RFC 3164, the syslog client uses UDP to send messages to the system log server.
In addition, Syslog uses port 514 for UDP communication.
However, recent syslog implementations, such as Rsyslog or syslog-ng, also let you use the Transmission Control Protocol (TCP) as a reliable communication channel.
For example, Rsyslog is often configured to use port 10514 for TCP communication, so that lost packets are retransmitted rather than silently dropped on the transport link.
In addition, you can use TLS/SSL on top of TCP to encrypt syslog packets, so that a man-in-the-middle cannot read your logs.
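As an illustration of the two transports, here is a minimal Python sketch that builds a BSD-style message and sends it over UDP or TCP using only the standard library. The function names are invented for this example, the TCP port must match whatever the receiving server is configured to listen on, and terminating each TCP message with a newline is an assumption about how the server frames messages.

```python
import socket
import time
from typing import Optional

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def format_message(facility: int, severity: int, hostname: str, tag: str,
                   content: str, when: Optional[time.struct_time] = None) -> bytes:
    """Build a BSD-style syslog message as bytes."""
    t = when if when is not None else time.localtime()
    pri = facility * 8 + severity
    # The day of the month is padded with a space, not a zero ("Oct  1").
    timestamp = (f"{MONTHS[t.tm_mon - 1]} {t.tm_mday:2d} "
                 f"{t.tm_hour:02d}:{t.tm_min:02d}:{t.tm_sec:02d}")
    return f"<{pri}>{timestamp} {hostname} {tag}: {content}".encode("utf-8")


def send_udp(message: bytes, host: str, port: int = 514) -> None:
    """Fire-and-forget delivery: no acknowledgement, messages can be lost."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (host, port))


def send_tcp(message: bytes, host: str, port: int) -> None:
    """Delivery over a TCP connection; the message ends with a newline."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(message + b"\n")
```

With a collector listening on a hypothetical host `logs.example.com`, `send_udp(format_message(4, 2, "web1", "su", "authentication failure"), "logs.example.com")` would send one message to UDP port 514. Neither function encrypts anything; TLS would have to be layered on top of the TCP connection.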
If you are interested in rsyslog, here is a tutorial on how to set up a complete centralized log server in a secure and reliable manner.
V – What are the current Syslog implementations?
Syslog is a specification, but not an actual implementation in Linux systems.
Here is a list of current Syslog implementations on Linux:
- Syslog daemon: released in 1980, the syslog daemon was probably the first implementation and supports only a limited set of features (such as UDP transport). On Linux, it is commonly known as the sysklogd daemon;
- Syslog-ng: released in 1998, syslog-ng extends the feature set of the original syslog daemon with TCP forwarding (and therefore better reliability), TLS encryption, and content-based filters. You can also store logs in a local database for further analysis;
- Rsyslog: released by Rainer Gerhards in 2004, Rsyslog is the default syslog implementation on most current Linux distributions (Ubuntu, RHEL, Debian, etc.). It provides the same forwarding features as syslog-ng, but also lets developers ingest data from more sources (such as Kafka, files, or Docker).
VI – What are logging best practices?
When it comes to operating system logging or building a complete logging architecture, you need to know some best practices:
- Use a reliable communication protocol unless you are willing to lose data. Choosing between UDP (an unreliable protocol) and TCP (a reliable protocol) really matters, so make this choice ahead of time;
- Configure your hosts to use NTP: when you debug with live logs, host clocks should be synchronized, otherwise it is hard to work out the exact order of events;
- Secure your logs: using TLS/SSL will have some performance impact on your instance, but if you are forwarding authentication or kernel logs, it is best to encrypt them so that no one else can read critical information;
- Avoid over-logging: a defined logging policy is crucial for your company. For example, you must decide whether you are interested in storing (and spending bandwidth on) informational or debug logs. You might only be interested in error logs;
- Back up your log data regularly: if you need to keep sensitive logs, or if you are audited regularly, you may want to back up your logs to an external drive or to a properly configured database;
- Set a log retention policy: once logs are too old, you may want to discard them, a process also known as “rotating” them. On Linux systems, this is done with the logrotate utility.
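As one way to picture the “avoid over-logging” advice, the sketch below keeps a message only if it is at least as severe as a chosen threshold, using the severity keywords from the table earlier in this article. The function name `should_keep` and the default threshold of `err` are assumptions made for this example; in practice this kind of filtering is normally configured in the syslog daemon itself.

```python
# Severity keywords indexed by their numeric value (0 = most severe).
SEVERITIES = ("emerg", "alert", "crit", "err",
              "warning", "notice", "info", "debug")


def should_keep(pri: int, threshold: str = "err") -> bool:
    """Return True if the message is at least as severe as the threshold."""
    severity = pri % 8  # the severity is the remainder of the priority value
    # Lower numbers are more severe, so "at least as severe" means <=.
    return severity <= SEVERITIES.index(threshold)


print(should_keep(34))             # auth + crit   -> True
print(should_keep(165))            # local4 + notice -> False
print(should_keep(165, "notice"))  # -> True
```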
VII – Conclusion
The Syslog protocol is essential knowledge for system administrators and Linux engineers who want to understand how logging works on a server.
However, there is a time for theory and a time for practice.
So what should you do? You have several options.
You can start by setting up a syslog server on your instance, such as Kiwi Syslog server, and start collecting data from it.
Or, if you have a larger infrastructure, you might want to set up a centralized logging architecture first and then monitor it using very modern tools such as the Kibana visualization tool.
I hope you learned something today.
Live in the moment and have fun as always.