From: juejin.cn/collection/…
Logging is a skill that developers have long neglected. Logs are like car insurance: nobody wants to pay for it, but everyone wants it there when something goes wrong. We tend to be careless when we print logs, and then curse whoever wrote them, ourselves included, when we actually have to read them. Take every log line you write seriously; sooner or later you will be the one who needs it.
Logs
What is a log?
According to Wikipedia, a log is a record of the activity of a computer device or piece of software, such as a server.
Log files provide an accurate record of the system, used to trace the details and root causes of errors. By nature a log describes discrete (discontinuous) events: for example, an application writes INFO or ERROR entries to a rolling file, and a log collection system ships them to a storage engine such as Elasticsearch.
What are logs used for?
We said above that logs provide an accurate record of the system for root cause analysis. In what concrete ways do they help?
- Print debugging: log variables or a piece of logic to record how the program ran, that is, which code was executed, which makes logic problems easier to troubleshoot.
- Problem location: quickly locate the cause when the program misbehaves or fails, so the problem can be fixed afterwards. The online production environment cannot be debugged, and simulating production in a test environment takes time and effort, so log information is essential for locating problems. Traffic can also be recorded, and traffic statistics can later be produced through ELK (or EFK).
- User behavior logs: record user operations for big-data analysis such as monitoring, risk control and recommendation. These logs are usually analysed by other teams, possibly several of them, so they should follow an agreed format that makes them easy to consume. What operations and actions to record is generally agreed in advance, so the developer mainly plays the implementing role.
- Root cause analysis (essential): log at key points, which makes it easy to locate problems across the systems involved. When someone claims your program is at fault, you can pull out your logs and say: look, my code ran to here and the state was correct. The other party then has to go and check their own code instead of everyone shifting the blame.
When to log?
When do you need to log?
- System initialization: the startup parameters of a system or service. The initialization of core modules or components depends on a few key configuration items and provides different services depending on them. Be sure to log at INFO level here, printing the parameters and a statement that the service has finished starting.
- Exceptions surfaced by the programming language: every mainstream language today ships an exception mechanism, and popular business frameworks have complete exception modules. Whatever lands in a catch block is the system telling the developer to pay attention; these are very high-quality errors. They should be logged appropriately, at WARN or ERROR level depending on the business situation.
- Business flows that do not meet expectations: besides platform and language exceptions, results in project code that deviate from expectations are also worth logging. Put simply, every process branch is a candidate; whether a given situation can be tolerated is up to the developer. Common suitable cases include invalid external parameters, or data-processing problems that produce return codes outside the reasonable range, and so on.
- Key actions of core system roles and components: service actions triggered by the core roles of the system deserve attention; they are important indicators of whether the system is running normally. INFO-level logging is recommended, for example the whole flow of a user in an e-commerce system from logging in to placing an order; interactions between microservices and service nodes; inserts and deletes on core data tables; operations of core components; and so on. If such logs would be very frequent or very large, keep only the key points at INFO and consider DEBUG level for the rest.
- Remote invocation of third-party services: an important rule in a microservice architecture is that the third party can never be trusted. It is recommended to print the request and response parameters of every remote call to a third-party service, so that problems can be located on either side without being left in the dark because the third party's logs are missing. A sketch follows below.
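For illustration, a minimal sketch of wrapping a third-party call with request/response logging; "inventory-service", the client interface and the fields below are hypothetical stand-ins, not part of the original article:

```java
import lombok.extern.slf4j.Slf4j;

// Illustrative only: "inventory-service" and the client interface are hypothetical stand-ins.
@Slf4j
public class InventoryGateway {

    /** Minimal stand-in for a remote third-party client. */
    public interface InventoryClient {
        String queryStock(String skuId);
    }

    private final InventoryClient client;

    public InventoryGateway(InventoryClient client) {
        this.client = client;
    }

    public String queryStock(String skuId) {
        // Print the request before the remote call ...
        log.info("calling inventory-service queryStock, skuId:[{}]", skuId);
        try {
            String response = client.queryStock(skuId);
            // ... and the response after it, so the interaction can be reconstructed
            // without depending on the third party's own logs.
            log.info("inventory-service queryStock returned, skuId:[{}], response:[{}]", skuId, response);
            return response;
        } catch (RuntimeException e) {
            log.error("inventory-service queryStock failed, skuId:[{}]", skuId, e);
            throw e;
        }
    }
}
```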
Printing logs
Slf4j & Logback
Slf4j is short for Simple Logging Facade for Java. As a facade it only defines the interface; the actual logging framework sits underneath. Projects code against Slf4j and can plug in whichever logging implementation they prefer.
Logback is the native implementation of Slf4j, written by the same author as Log4j, but with more features and better performance. Compared with Log4j, Logback runs faster: building on the earlier Log4j work, its internals were rewritten and are up to ten times faster in certain scenarios, and its components also need less memory.
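For reference, logging through the facade without Lombok looks like this; the class and method are illustrative, only the org.slf4j calls come from the real API:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// The code depends only on the Slf4j API; Logback (or another binding on the
// classpath) supplies the implementation at runtime.
public class OrderService {

    private static final Logger log = LoggerFactory.getLogger(OrderService.class);

    public void pay(long userId, long orderId) {
        log.info("order is paying with userId:[{}] and orderId:[{}]", userId, orderId);
    }
}
```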
Log files
Log files are stored in a fixed directory and named according to a template. Recommended log file names are as follows:
File currently being written: <application name>[-<function name>].log
Example: example-server-book-service-access.log

File already rolled into history: <application name>[-<function name>].yyyy-MM-dd-hh.[rolling index].log
Example: example-server-book-service-access.2019-12-01-10.1.log
Log variable Definition
It is recommended to use the Lombok (code generator) annotation @lombok.extern.slf4j.Slf4j to generate the log instance variable.
```xml
<!-- https://mvnrepository.com/artifact/org.projectlombok/lombok -->
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.18.10</version>
    <scope>provided</scope>
</dependency>
```
Code sample
```java
import lombok.extern.slf4j.Slf4j;

@Slf4j
public class LogTest {
    public static void main(String[] args) {
        log.info("this is log test");
    }
}
```
The log configuration
Logs of different levels should be separated by log file name, with each level written to its own file. If you have logs in a special format, such as access logs, give them a dedicated file, and take care to avoid double printing (set additivity="false" on the logger to prevent it); see the snippet below.
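A minimal Logback snippet for that case, with a hypothetical class name and appender name:

```xml
<!-- Sketch only: "com.example.logging.AccessLoggingFilter" and the "fileAccess" appender are
     hypothetical placeholders. additivity="false" keeps these events out of the root logger's
     appenders, so the access log is not printed twice. -->
<logger name="com.example.logging.AccessLoggingFilter" level="INFO" additivity="false">
    <appender-ref ref="fileAccess"/>
</logger>
```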
Parameter placeholder format
Use the parameterized {} placeholder and [] to delimit parameters. This is more readable, and the parameters are only rendered when the message is actually going to be printed.
```java
// Correct: always parameterize the message
log.debug("order is paying with userId:[{}] and orderId:[{}]", userId, orderId);

// Wrong: do not concatenate strings. It creates many intermediate strings, wastes space and
// hurts performance, and the concatenation runs even when the log level is higher than DEBUG.
log.debug("order is paying with userId: " + userId + " and orderId: " + orderId);
```
Basic log format
Log time
The date and time when the log entry was produced. This field is very important and is usually recorded down to the millisecond.
```
yyyy-MM-dd HH:mm:ss.SSS
```
Log levels
Log output is leveled: different levels are used in different settings and situations.
The following four levels are used:
- DEBUG: DEBUG-level logs are produced during development and testing and should be as detailed as possible. Developers can record parameter values, debugging details and return values at DEBUG level so that problems or exceptions can be analysed during development and testing.
- INFO: INFO mainly records the system's key information and is meant to preserve the key indicators of the system while it works normally. Developers can record system initialization and configuration, business state changes, or the steps of a user moving through a core business flow at INFO level, which helps daily operations and makes it possible to reconstruct the context when reproducing an error. After a feature is finished, it is recommended to run the test environment at INFO level and check whether the INFO logs let you understand how the application is being used and whether they provide useful information when something goes wrong.
- WARN: WARN mainly records warnings that are predictable and planned for, for example a method being called with an empty argument or with a value that does not satisfy the method's preconditions. Output enough detail at WARN level to support later log analysis.
- ERROR: ERROR-level logs are for unpredictable problems such as errors and exceptions, for example network or database connection exceptions caught in a catch block. (If such a problem has little impact on the overall flow, WARN may be enough.) When producing an ERROR log, output as much data as possible, such as the method's input parameters and the objects created during its execution; if an error or exception object is involved, output it as well.
Choosing between DEBUG and INFO
DEBUG is lower than INFO and carries more detailed information about the running system, such as the values of variables, for use while debugging. INFO is the default output level for online logs and reports the current state of the system; its output should be meaningful to the end user. Functionally, INFO output can be regarded as part of the software product and should therefore be treated with care. If an INFO log is printed very frequently or is rarely useful when correcting errors, consider downgrading it to DEBUG.
- Far more DEBUG logs are printed than INFO logs. For the sake of logging performance, if the code is on a hot path and the log statement executes very frequently, check whether the logging is well designed and whether it should be downgraded to DEBUG (see the sketch after this list).
- Pay attention to readability: review the log statements after writing the code, and check that they read well and provide genuinely meaningful information.
- Log output is shared by multiple threads. If another thread is writing at the same time, lines from different requests will be interleaved, so the output may not look the way you expect.
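A small sketch of the performance point above, with a hypothetical expensive serialization step; the isDebugEnabled() guard is standard Slf4j:

```java
import lombok.extern.slf4j.Slf4j;

// Illustrative only: the "order snapshot" and expensiveToJson() are hypothetical.
@Slf4j
public class OrderDebugLogging {

    public void logSnapshot(Object order) {
        // The {} placeholder already defers string concatenation, but the arguments are
        // still evaluated before the call; guard genuinely expensive ones explicitly.
        if (log.isDebugEnabled()) {
            log.debug("full order snapshot: [{}]", expensiveToJson(order));
        }
    }

    // Stand-in for an expensive serialization step.
    private String expensiveToJson(Object order) {
        return String.valueOf(order);
    }
}
```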
Choosing between WARN and ERROR
Use these levels when a method or piece of processing does not produce the expected result, or when the framework reports an error. Common ways of handling such situations include (see the sketch after this list):
- Add judgment logic and try to resolve it locally: handling the situation with an extra check, rather than raising an alarm, is always the preferred option
- Throw an exception and hand it over to higher-level logic
- Log and alert
- Wrap an error with a return code
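A hedged sketch of the four options above in one place; the coupon scenario, result type and error codes are invented purely for illustration:

```java
import lombok.extern.slf4j.Slf4j;

// Illustrative only: the coupon scenario, Result type and error codes are invented.
@Slf4j
public class CouponService {

    public Result applyCoupon(String couponCode) {
        if (couponCode == null || couponCode.isEmpty()) {
            // Option 1: add judgment logic and resolve the situation locally.
            log.warn("applyCoupon called with an empty couponCode, skipping the discount");
            return Result.ok("no coupon applied");
        }
        try {
            // ... call the coupon platform here ...
            return Result.ok("coupon applied");
        } catch (IllegalStateException e) {
            // Option 2: throw the exception and let higher-level logic decide.
            throw e;
        } catch (RuntimeException e) {
            // Option 3: log at ERROR, which is expected to trigger an alert ...
            log.error("coupon platform call failed for couponCode:[{}]", couponCode, e);
            // Option 4: ... and/or wrap the failure in a return code instead of propagating it.
            return Result.fail("COUPON_PLATFORM_ERROR");
        }
    }

    /** Minimal stand-in result type. */
    public static class Result {
        private final boolean success;
        private final String message;

        private Result(boolean success, String message) {
            this.success = success;
            this.message = message;
        }

        static Result ok(String message) {
            return new Result(true, message);
        }

        static Result fail(String code) {
            return new Result(false, code);
        }
    }
}
```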
Generally speaking, WARN-level logs do not trigger SMS alerts, while ERROR-level logs trigger SMS or even phone alerts. An ERROR log means something seriously wrong has happened in the system and must be handled immediately, for example the database being unavailable or a key business flow being unable to proceed. Logging everything as ERROR without judging how important the problem is, is simply irresponsible. A mature system has a complete error-reporting mechanism, and when the error notification actually goes out is often decided by the number of ERROR logs per unit of time.
Emphasis on ERROR alarms
- Printing an ERROR-level log usually triggers an alarm notification. An ERROR should correspond to impaired business functionality, that is, the kind of very serious problem described above that someone must look at immediately.
Goals of an ERROR log
- Direct, accurate information for whoever handles it: the ERROR message should form a closed loop on its own.
Problem location:
- What went wrong and what functions were affected
- Obtaining help: the help information itself, or where that information is stored
- Alarming: from the alarm, know the solution or whom to contact
Thread name
Generally, a synchronous request is handled by a single thread within an application, so printing the thread name groups the log lines produced by each request and distinguishes the current request context.
OpenTracing identifiers
In a distributed application, a single user request is completed by calls to several services, and those calls may be nested, so the logs for one request are not in one application's log file but scattered across the log files of different application nodes on different servers. These identifiers are used to stitch together the call logs of one request across the whole system.
- Unique string (Trace ID)
- Call hierarchy (SPAN ID)
By searching for the trace ID, you can find all the logs generated during the flow (processing) of the request identified by this trace ID throughout the system.
Business identifiers
In business development, logs are tied to the business. Sometimes you need to group logs by user or by business object, so if requests can be grouped by some identifier, print that identifier in the log as well. A sketch of a small MDC-based helper follows the list below.
- User ID
- Business Identification (BIZ ID)
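The LogStandardUtils calls in the code sample later in this article suggest a thin wrapper over Slf4j's MDC. A minimal sketch of what such a helper might look like; the class internals and the MDC key names are assumptions and must match the %X{...} conversion words in the Logback pattern:

```java
import org.slf4j.MDC;

// Sketch of a LogStandardUtils-style helper over Slf4j's MDC. The key names "user-id" and
// "biz-id" are assumptions and must match the %X{...} conversion words in the Logback pattern.
public final class LogStandardUtils {

    private LogStandardUtils() {
    }

    public static void putUserId(String userId) {
        MDC.put("user-id", userId);
    }

    public static void putBizId(String bizId) {
        MDC.put("biz-id", bizId);
    }

    /** Call in a finally block or a request filter so values do not leak across pooled threads. */
    public static void clear() {
        MDC.remove("user-id");
        MDC.remove("biz-id");
    }
}
```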
Logger name
The logger name is usually the class name. Printing just the class name in the log file is usually enough; whether to include the package name and line number depends on your needs. It is used to find which class produced the output when locating a fault.
Log contents
Do not use System.out.println or System.err.println.
Use parameter placeholders instead of string concatenation.
An object written to a log should override toString() with a fast implementation in its class; otherwise the default Object.toString() prints only the class name and hashCode, which tells you very little.
Null pointer prevention: do not call a method on an object inside a log statement to obtain a value unless you are sure the object is not null; otherwise the logging itself is very likely to throw a NullPointerException. See the sketch below.
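A small sketch of the null-pointer risk, with a hypothetical Order type:

```java
import lombok.extern.slf4j.Slf4j;

// Illustrative only: "Order" is a hypothetical domain object.
@Slf4j
public class OrderStatusLogging {

    public void logStatus(Order order) {
        // Risky: if order is null, evaluating the argument throws a NullPointerException,
        // so the log statement itself becomes the source of the failure:
        // log.info("order status: [{}]", order.getStatus());

        // Safer: guard the dereference (or pass the whole object and rely on its toString()).
        log.info("order status: [{}]", order == null ? "null" : order.getStatus());
    }

    /** Hypothetical domain object. */
    public static class Order {
        private final String status;

        public Order(String status) {
            this.status = status;
        }

        public String getStatus() {
            return status;
        }
    }
}
```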
The exception stack
The exception stack usually appears in ERROR or WARN logs. It contains the method call chain and the root cause of the exception. Stack-trace lines belong to the preceding log line, so the log collector must keep them attached to that line. A sketch of printing the stack follows.
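To get the stack into the log, pass the exception as the last argument; a short example (the payment scenario is invented):

```java
import lombok.extern.slf4j.Slf4j;

// Illustrative only: the payment scenario is invented.
@Slf4j
public class PaymentHandler {

    public void pay(long orderId) {
        try {
            chargeRemoteGateway(orderId);
        } catch (RuntimeException e) {
            // Passing the throwable as the last argument (with no placeholder for it) makes
            // Slf4j print the full multi-line stack trace under this log line.
            log.error("payment failed for orderId:[{}]", orderId, e);
        }
    }

    private void chargeRemoteGateway(long orderId) {
        throw new IllegalStateException("gateway timeout (simulated)");
    }
}
```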
Best practices
Log format
```
2019-12-01 00:00:00.000|pid|log-level|[svc-name,trace-id,span-id,user-id,biz-id]|thread-name|package-name.class-name : log message
```
- time
- pid: process ID
- log-level: log level
- svc-name: application name
- trace-id: call chain identifier
- span-id: call hierarchy identifier
- user-id: user identifier
- biz-id: business identifier
- thread-name: thread name
- package-name.class-name: logger name
- log message: log message body
Log Module Extension
The log module is extended based on the following technical points.
- The implementation principle of Slf4j MDC (not explained in detail here; reach out privately if you are interested)
- The principles of Opentracing Scope
In each tracing span, the context information held in the OpenTracing Scope is placed into the MDC, and the value of logging.pattern.level is extended through the Spring Boot Logging extension interface, as sketched below.
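A sketch of that bridge, assuming the OpenTracing 0.33 API and MDC key names of my own choosing; the logging.pattern.level value in the comment is modeled on how Spring Cloud Sleuth overrides it, not copied from it:

```java
import io.opentracing.Span;
import io.opentracing.Tracer;
import org.slf4j.MDC;

// Sketch only. The MDC keys "trace-id" and "span-id" are my own choice and must match the
// %X{...} conversion words in the Logback pattern. A matching Spring Boot override
// (an assumption modeled on Spring Cloud Sleuth, not copied from it) could be:
// logging.pattern.level=[${spring.application.name:-},%X{trace-id:-},%X{span-id:-},%X{user-id:-},%X{biz-id:-}]
public class TracingMdcBridge {

    private final Tracer tracer;

    public TracingMdcBridge(Tracer tracer) {
        this.tracer = tracer;
    }

    /** Call when a tracing scope is activated, e.g. from a request filter. */
    public void bind() {
        Span span = tracer.activeSpan();
        if (span != null) {
            MDC.put("trace-id", span.context().toTraceId());
            MDC.put("span-id", span.context().toSpanId());
        }
    }

    /** Call when the scope closes so identifiers do not leak across pooled threads. */
    public void unbind() {
        MDC.remove("trace-id");
        MDC.remove("span-id");
    }
}
```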
Related source code for reference: Spring Cloud Sleuth (github.com/spring-clou…)
Standardization can be achieved by changing the pattern of each appender in the Logback configuration file to the following default values.
```
%d{${LOG_DATEFORMAT_PATTERN:-yyyy-MM-dd HH:mm:ss.SSS}}|${PID:- }|%level|${LOG_LEVEL_PATTERN:-%5p}|%t|%-40.40logger{39} : %msg%n
```
logback.xml excerpt
```xml
<configuration>
    <property name="LOG_PATH"
              value="${LOG_PATH:-${LOG_TEMP:-${java.io.tmpdir:-/tmp}}}"/>
    <springProperty scope="context" name="APP_NAME"
                    source="spring.application.name" defaultValue="spring-boot-fusion"/>
    <!-- Universal pattern -->
    <property name="LOG_PATTERN"
              value="%d{${LOG_DATEFORMAT_PATTERN:-yyyy-MM-dd HH:mm:ss.SSS}}|${PID:- }|%level|${LOG_LEVEL_PATTERN:-%5p}|%t|%-40.40logger{39} : %msg%n"/>
    <!-- Rolling file appender for INFO-level logs -->
    <appender name="fileInfo" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <!-- File to write to; relative or absolute path, missing parent directories are created automatically -->
        <file>${LOG_PATH}/${APP_NAME}-info.log</file>
        <!-- Rolling policy: by time and size -->
        <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
            <!-- Time-based rolling (yyyy-MM-dd-HH, i.e. one file per hour) plus a size index %i -->
            <FileNamePattern>${LOG_PATH}/${APP_NAME}-info.%d{yyyy-MM-dd-HH}.%i.log</FileNamePattern>
            <!-- Keep at most 48 rolled files, 1 GB per file, 20 GB in total -->
            <MaxHistory>48</MaxHistory>
            <maxFileSize>1GB</maxFileSize>
            <totalSizeCap>20GB</totalSizeCap>
        </rollingPolicy>
        <!-- Layout converts the event to a string and formats the log output -->
        <layout class="ch.qos.logback.classic.PatternLayout">
            <pattern>${LOG_PATTERN}</pattern>
        </layout>
        <!-- Level filter: if the event's level equals the configured level,
             accept or reject it according to onMatch / onMismatch -->
        <filter class="ch.qos.logback.classic.filter.LevelFilter">
            <!-- Level to match -->
            <level>INFO</level>
            <!-- Action when the level matches -->
            <onMatch>ACCEPT</onMatch>
            <!-- Action when the level does not match -->
            <onMismatch>DENY</onMismatch>
        </filter>
    </appender>
</configuration>
```
Example usage in code:
```java
@Override
public Result<PagingObject<SimpleResponse>> page(@RequestParam(value = "page-num", defaultValue = "1") int pageNum,
                                                 @RequestParam(value = "page-size", defaultValue = "10") int pageSize) {
    LogStandardUtils.putUserId("userId123");
    LogStandardUtils.putBizId("bizId321");
    producerService.sendMsg("xxx");
    simpleClient.page(pageNum, pageSize);
    return new Result<>(simpleService.page(pageNum, pageSize));
}
```
Resulting log output:
```
2019-12-04 16:29:08.223|43546|INFO|[example-server-book-service,ac613cff04bac8b1,4a9adc10fdf0eb5,userId123,bizId321]|XNIO-1 task-4|c.n.u.concurrent.ShutdownEnabledTimer : Shutdown hook installed for: NFLoadBalancer-PingTimer-example-server-order-service
2019-12-04 16:29:08.224|43546|INFO|[example-server-book-service,ac613cff04bac8b1,4a9adc10fdf0eb5,userId123,bizId321]|XNIO-1 task-4|c.netflix.loadbalancer.BaseLoadBalancer : Client: example-server-order-service instantiated a LoadBalancer: DynamicServerListLoadBalancer:{NFLoadBalancer:name=example-server-order-service,current list of Servers=[],Load balancer stats=Zone stats: {},Server stats: []}ServerList:null
2019-12-04 16:29:08.234|43546|INFO|[example-server-book-service,ac613cff04bac8b1,4a9adc10fdf0eb5,userId123,bizId321]|XNIO-1 task-4|c.n.l.DynamicServerListLoadBalancer : Using serverListUpdater PollingServerListUpdater
2019-12-04 16:29:08.247|43546|INFO|[example-server-book-service,ac613cff04bac8b1,4a9adc10fdf0eb5,userId123,bizId321]|XNIO-1 task-4|c.n.l.DynamicServerListLoadBalancer : DynamicServerListLoadBalancer for client example-server-order-service initialized: DynamicServerListLoadBalancer:{NFLoadBalancer:name=example-server-order-service,current list of Servers=[],Load balancer stats=Zone stats: {},Server stats: []}ServerList:ConsulServerList{serviceId='example-server-order-service', tag=null}
2019-12-04 16:29:08.329|43546|WARN|[example-server-book-service,ac613cff04bac8b1,4a9adc10fdf0eb5,userId123,bizId321]|XNIO-1 task-4|c.p.f.l.ctl.common.rule.StrategyRule : No up servers available from load balancer: DynamicServerListLoadBalancer:{NFLoadBalancer:name=example-server-order-service,current list of Servers=[],Load balancer stats=Zone stats: {},Server stats: []}ServerList:ConsulServerList{serviceId='example-server-order-service', tag=null}
2019-12-04 16:29:08.334|43546|WARN|[example-server-book-service,ac613cff04bac8b1,4a9adc10fdf0eb5,userId123,bizId321]|XNIO-1 task-4|c.p.f.l.ctl.common.rule.StrategyRule : No up servers available from load balancer: DynamicServerListLoadBalancer:{NFLoadBalancer:name=example-server-order-service,current list of Servers=[],Load balancer stats=Zone stats: {},Server stats: []}ServerList:ConsulServerList{serviceId='example-server-order-service', tag=null}
2019-12-04 16:29:08.342|43546|ERROR|[example-server-book-service,ac613cff04bac8b1,4a9adc10fdf0eb5,userId123,bizId321]|XNIO-1 task-4|c.p.f.w.c.advice.ExceptionHandlerAdvice : the current program entered exception handling, the wrong url is: [http://127.0.0.1:10011/simples], the error parameters are: [{"querystring":"{}","payload":""}]
java.lang.RuntimeException: com.netflix.client.ClientException: Load balancer does not have available server for client: example-server-order-service
```
The log service
SLS log service
Log Service (SLS for short) is a one-stop service for log data, developed inside Alibaba Group and honed on a large number of big-data scenarios. Without writing any code you can collect, consume, ship, query and analyse log data, which improves operations efficiency and provides the capacity to process massive volumes of logs in the DT era.
project
A Project is the basic unit for managing logs. For business logs it is recommended to create one Project per environment, so that logging forms a closed loop within that environment and logs are produced along with the service calls inside it.
logstore
The log library. It is recommended to split Logstores by log type, for example access logs and INFO/WARN/ERROR logs, so that each specific log format can be given its own indexes and alarm settings.
Note: do not split Logstores by application service. In a microservice architecture a single request crosses several application services and its logs are scattered across all of them; splitting Logstores by service assumes developers fully understand the application landscape and call topology, which is often not the case.
Real-time acquisition and consumption
Function:
- Collect real-time log data (such as Metric, Event, BinLog, TextLog and Click data) from ECS, containers, mobile devices, open-source software and JavaScript.
- Provide a real-time consumption interface that connects to real-time computing and other services.
Purpose: Data cleaning (ETL), Stream Compute, monitoring and alarm, machine learning and iterative computing.
Query analysis
Index, query and analyze data in real time.
- Query: keywords, fuzzy matching, context, ranges.
- Statistics: SQL aggregation and other rich query means.
- Visualization: Dashboard + Reporting functionality.
- Interconnection: Grafana, JDBC/SQL92.
Purpose: DevOps/online operations, real-time analysis of log data, security diagnosis and analysis, operations and customer-service systems.
Consumption and shipping
Stable and reliable log shipping: deliver data from the log hub to storage services, with support for compression, custom partitions, and columnar storage.
Purpose: Data warehouse + data analysis, audit, recommendation system and user portrait.
Alarms
The alarm function of Log Service is built on the query charts of a dashboard. On the query page or dashboard page of the Log Service console you create alarm rules, specifying their configuration, check conditions and notification method. Once set, Log Service periodically runs the dashboard queries and sends an alarm notification whenever the results meet the preset conditions, giving real-time monitoring of service status.
Best practices
The log service is very powerful. To learn how to use it, see: help.aliyun.com/document_de…
ELK universal logging solution
ELK is the acronym of Elasticsearch, Logstash and Kibana, also known as the Elastic Stack. Elasticsearch is a near-real-time search platform built on Lucene, distributed and accessed through a RESTful interface; it can serve as the underlying engine for search products of the Baidu or Google kind and offers very powerful search capability. Logstash is ELK's central data-flow engine: it collects data in different formats from different sources (files, data stores, MQ) and, after filtering, forwards it to different destinations (files, MQ, Redis, Elasticsearch, Kafka, and so on). Kibana presents Elasticsearch data on friendly pages and provides real-time analysis.
Examples in practice
Plain-format logs
```
2019-11-26 15:01:03.332|1543|INFO|[example-server-book-service,28f019d57b8336ab,630697c7f34ca4fa,105,45982043]|XNIO-1 task-42|c.p.f.w.pay.PayServiceImpl : order is paying with userId: 105 and orderId: 45982043
```
The common log prefix is fixed, so fixed indexes can be built on it, which makes querying and analysis faster.
Specific format log
Take access logs as an example
```
2019-11-26 15:01:03.332|1543|INFO|[example-server-book-service,28f019d57b8336ab,630697c7f34ca4fa,105,45982043]|XNIO-1 task-42|c.p.f.w.logging.AccessLoggingFilter :
> url: http://liweichao.com:10011/actuator/health
> http-method: GET
> request-header: [Accept:"text/plain, text/*, */*", Connection:"close", User-Agent:"Consul Health Check", Host:"liweichao.com:10011", Accept-Encoding:"gzip"]
> request-time: 2019-11-26 15:01:03.309
> queryString: -
> payload: -
> extra-param: -
< response-time: 2019-11-26 15:01:03.332
< take-time: 23
< http-status: 200
< response-header: [content-type:"application/vnd.spring-boot.actuator.v2+json; charset=UTF-8", content-size:"15"]
< response-data: {"status":"UP"}
```
For logs with a specific format you can create dedicated indexes to make querying, analysis and alarming easier, for example indexes on the values of take-time, http-status and biz-code.
References
- Java logging best practices: www.jianshu.com/p/546e9aace…
- "Stop typing logs in Java code, this is the correct way to print logs!": mp.weixin.qq.com/s/hJvkRlt9x…
- Log Service: help.aliyun.com/product/289…
- Spring Boot Logging: docs.spring.io/spring-boot…
- Spring Cloud Sleuth: github.com/spring-clou…
- OpenTracing: github.com/opentracing