Foreword

Lately I have had to query large log files. Opening them with vim, cat and similar tools makes the terminal hang, yet I still need to know how many lines match a given condition. Here are some commonly used query commands.

Common search commands

1. Grep search

grep pattern filename | head        # show matches starting from the beginning of the file
grep pattern filename | wc -l       # count how many lines match
cat filename | grep 'pattern$'      # output the lines that end with the pattern
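
As a small illustration of these three forms, the sketch below builds a throwaway sample log (its contents are invented for the example) and runs each one. It also shows grep -c, which should give the same count as piping into wc -l. The $ anchor restricts a match to the end of a line.

```shell
# Build a tiny sample log; the lines are made up for illustration.
log=$(mktemp)
printf '%s\n' \
  'GET /index.html 200' \
  'GET /pixel.jpg?id=1 200' \
  'POST /login 302' \
  'GET /pixel.jpg?id=2 200' > "$log"

# Show only the first matching line instead of paging through the file.
first_match=$(grep 'pixel' "$log" | head -n 1)

# Count matching lines: wc -l on the pipe, or grep -c directly.
count_wc=$(grep 'pixel' "$log" | wc -l | tr -d ' ')
count_c=$(grep -c 'pixel' "$log")

# Anchor the pattern with $ to keep only lines that END with it.
ends_302=$(grep ' 302$' "$log")

echo "first match: $first_match"
echo "count (wc -l): $count_wc, count (grep -c): $count_c"
echo "lines ending in 302: $ends_302"
rm -f "$log"
```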

2. Examples

(1) Count the lines that match a specific string

cat /data/weblogs/xxx.access.log | grep "GET /pixel.jpg?" | wc -l
4102386

(2) Query with a partial regular expression

cat /data/weblogs/em.evony.com.access.log | grep "25/Nov/2019:15:[0-5][0-9]" | wc -l
120

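This counts the entries logged on 25/Nov/2019 between 15:00 and 15:59. A bracket expression matches a single character, so the minute field has to be written as [0-5][0-9]; [00-59] would match only one character from the set 0 to 5 plus 9.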
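
To make the difference between the two bracket expressions concrete, here is a minimal sketch on invented timestamp fragments. The sample values are assumptions chosen for the demonstration and are not taken from the real log.

```shell
# Invented sample values: two valid minute fields and one invalid one.
sample=$(printf '%s\n' '15:07' '15:59' '15:9x')

# [00-59] matches ONE character from {0,1,2,3,4,5,9}, so "15:9x" slips through.
loose=$(printf '%s\n' "$sample" | grep -c '15:[00-59]')

# [0-5][0-9] requires a real two-digit minute between 00 and 59.
strict=$(printf '%s\n' "$sample" | grep -c '15:[0-5][0-9]')

echo "loose pattern matched $loose lines, strict pattern matched $strict"
```

On real access logs the two patterns usually give the same count, because the first minute digit is always between 0 and 5, but only the second one says what is actually meant.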

(3) Chain several conditions with pipes to count the lines that satisfy all of them

cat /data/weblogs/xxx.log | grep "25/Nov/2019:15:[0-5][0-9]" | grep "GET /pixel.jpg?" | wc -l
120

(4) Count the lines that match condition 1 or condition 2

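cat /data/weblogs/xxx.log | grep -E "25/Nov/2019:15:[0-5][0-9]|GET /pixel\.jpg\?" | wc -l
4098135

General form: grep -E "exp1|exp2|exp3" filename | wc -l

Note that under -E the characters ? and . are regex operators, so they are escaped here to match literally.

Reference: https://blog.csdn.net/lijing742180/article/details/84959963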
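
The difference between chaining greps (AND) and using alternation (OR) can be sketched on a few invented log lines. The log format below is an assumption for illustration only. Under -E an unescaped ? makes the preceding character optional, which is why it is written as \? in the second command.

```shell
# Invented sample log lines for illustration.
log=$(mktemp)
printf '%s\n' \
  '[25/Nov/2019:15:04:10] "GET /pixel.jpg?id=1"' \
  '[25/Nov/2019:15:30:00] "GET /index.html"' \
  '[25/Nov/2019:16:01:00] "GET /pixel.jpg?id=2"' \
  '[25/Nov/2019:16:02:00] "GET /about.html"' > "$log"

# AND: each grep in the pipeline narrows the previous result.
both=$(grep '25/Nov/2019:15:[0-5][0-9]' "$log" | grep -c 'GET /pixel\.jpg?')

# OR: one extended regex with alternation; "?" must be escaped under -E.
either=$(grep -E -c '25/Nov/2019:15:[0-5][0-9]|GET /pixel\.jpg\?' "$log")

echo "both conditions: $both, either condition: $either"
rm -f "$log"
```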

3. grep matches substrings (fuzzy matching)

Because grep matches any line that contains the pattern, searching for a port number returns more than you want, as the following example shows:

netstat -anp | grep -i '80'
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 127.0.0.1:80          0.0.0.0:*           LISTEN      -
tcp        0      0 10.17.2.50:80         0.0.0.0:*           LISTEN      -
tcp        0      0 216.66.17.189:80      0.0.0.0:*           LISTEN      -
tcp        0      0 10.17.2.50:10050      10.17.13.2:33801    TIME_WAIT   -

To list only the connections whose local port is exactly 80, run the following command:

netstat -apn | awk '{n = split($4, arr, ":"); if (arr[n] == "80") print $0}'

This finds the processes using port 80 in a single step, which is very convenient.
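
To see why the field comparison is more precise than a substring search, the sketch below replays both approaches on two lines copied from the netstat output above, so it runs without netstat installed. The variable names are only illustrative.

```shell
# Sample lines in netstat's column layout (taken from the output above).
sample=$(printf '%s\n' \
  'tcp 0 0 127.0.0.1:80 0.0.0.0:* LISTEN -' \
  'tcp 0 0 10.17.2.50:10050 10.17.13.2:33801 TIME_WAIT -')

# Substring search: "80" also occurs inside 33801, so both lines match.
substring_hits=$(printf '%s\n' "$sample" | grep -c '80')

# Field comparison: split the local address ($4) on ":" and compare the
# last piece, so only a local port of exactly 80 passes.
exact_hits=$(printf '%s\n' "$sample" |
  awk '{ n = split($4, a, ":"); if (a[n] == "80") c++ } END { print c + 0 }')

echo "substring: $substring_hits, exact port: $exact_hits"
```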

Searching for IP addresses in a file

1. Match the IP address

grep -Eo '([^0-9]|\b)((1[0-9]{2}|2[0-4][0-9]|25[0-5]|[1-9][0-9]|[0-9])\.){3}(1[0-9][0-9]|2[0-4][0-9]|25[0-5]|[1-9][0-9]|[0-9])([^0-9]|\b)' xxx.log | sed -nr 's/([^0-9]|\b)(([0-9]{1,3}\.){3}[0-9]{1,3})([^0-9]|\b)/\2/p' | wc -l

31116275

2. Query the number of occurrences of each IP address

grep -E -o "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)" xxx.log | sort | uniq -c
      ... 99.203.87.103
      2 99.203.87.142
      4 99.203.87.145
      ... 99.203.87.153
      8 ...

The first column is the number of occurrences, followed by the IP address.
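
A common follow-up, sketched here on invented addresses, is to sort the counts numerically in descending order so that the busiest IPs come first. The addresses and the cut-off of two entries are assumptions for the example.

```shell
# Invented addresses standing in for IPs extracted from a log.
ips=$(printf '%s\n' 10.0.0.1 10.0.0.2 10.0.0.1 10.0.0.3 10.0.0.1 10.0.0.2)

# uniq -c only merges adjacent duplicates, hence the first sort.
# The second sort orders by count, highest first; head keeps the top 2.
# The final awk only normalizes the leading spaces that uniq -c prints.
top=$(printf '%s\n' "$ips" | sort | uniq -c | sort -rn | head -n 2 |
  awk '{ print $1, $2 }')

echo "$top"
```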

3. More accurate IP matching

grep -E -o "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)" xxx.log | wc -l

32929372

4. Fuzzy IP matching

grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}" xxx.log | wc -l

32930309
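
The gap between the two counts comes from strings that look like IP addresses but contain octets above 255. A minimal sketch on invented input shows the effect; the octet variable is just a shorthand introduced here to keep the stricter pattern readable.

```shell
# Invented input: one valid IPv4 address and one out-of-range look-alike.
sample=$(printf '%s\n' 'client 192.168.1.20 ok' 'bogus 999.300.1.1 here')

# Shorthand for one octet limited to 0-255 (same pattern as in section 3).
octet='(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'

# Fuzzy: any 1-3 digits per octet, so 999.300.1.1 is counted too.
fuzzy_count=$(printf '%s\n' "$sample" |
  grep -E -o '([0-9]{1,3}\.){3}[0-9]{1,3}' | wc -l | tr -d ' ')

# Stricter: every octet must be in range, so only 192.168.1.20 is counted.
strict_count=$(printf '%s\n' "$sample" |
  grep -E -o "$octet\.$octet\.$octet\.$octet" | wc -l | tr -d ' ')

echo "fuzzy: $fuzzy_count, strict: $strict_count"
```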

5. Query IPs under multiple conditions: first filter the lines by the conditions, then count the IP addresses in the result

cat xxx.log | grep "25/Nov/2019:15:[0-5][0-9]" | grep "GET /pixel.jpg?" | grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}" | wc -l
1110

None of these ways of counting IPs feels ideal. The log file keeps growing, so the results differ from run to run, and the queries are slow, probably because the file is so large. I am recording them here anyway, since they are bound to be useful at some point.
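
One possible way to make repeated queries comparable while the log keeps growing is to note the current line count once and have every later query read only that prefix. The sketch below simulates this on an invented log. Setting LC_ALL=C is an optional extra that often speeds up grep on plain ASCII logs, though the gain depends on the grep build and the data.

```shell
# Invented log that grows between two queries.
log=$(mktemp)
printf '%s\n' 'GET /pixel.jpg?a' 'GET /index.html' 'GET /pixel.jpg?b' > "$log"

# Remember how many lines exist now; later queries read only that prefix.
snapshot_lines=$(wc -l < "$log" | tr -d ' ')

first=$(head -n "$snapshot_lines" "$log" | LC_ALL=C grep -c 'pixel')

# Simulate the log growing while we work.
printf '%s\n' 'GET /pixel.jpg?c' >> "$log"

# Same prefix, same answer, even though the file is now longer.
second=$(head -n "$snapshot_lines" "$log" | LC_ALL=C grep -c 'pixel')
whole=$(LC_ALL=C grep -c 'pixel' "$log")

echo "snapshot: $first then $second; whole file now: $whole"
rm -f "$log"
```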

I hope the above is helpful.
