- grep (global search regular expression and print out the line): a text filtering tool
- sed (stream editor): a text editor
- awk (named after the initials of its three inventors; implemented on Linux as gawk): a text report generator that produces formatted text
Regular expressions:
A regular expression is a pattern written with special characters and literal text characters. Some of the characters do not stand for their literal meaning; they act as control or wildcard functions (metacharacters).
Regular expressions fall into two categories:
- Basic regular expressions (BRE)
- Extended regular expressions (ERE)
I. grep
- Function: a text filtering tool. It checks the target text line by line against the user-specified pattern (the filter condition) and prints the lines that match.
- Pattern: a filter condition written with regular-expression metacharacters and literal text characters.
grep [OPTIONS] PATTERN [FILE...]
grep [OPTIONS] [-e PATTERN | -f FILE] [FILE...]
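1. Options
Option | Mnemonic | Description |
---|---|---|
-i | ignore | Ignore case |
-o | only | Print only the matched part of each line |
-v | invert | Print the lines that do not match |
-q | quiet | Print nothing; report the result through the exit status |
-E | extended | Use extended regular expressions |
-a | all | Same as --text: do not ignore binary data, treat it as text |
-A # | after | Also print # lines after each match |
-B # | before | Also print # lines before each match |
-C # | context | Also print # lines before and after each match |
--color=auto | | Highlight the matched text |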
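The options above can be tried on a few lines of text. This is a minimal sketch: the sample lines and the helper name `sample` are invented for illustration and are not part of the original notes.

```shell
# Invented sample text, used only for illustration.
sample() { printf '%s\n' 'Root login' 'root shell' 'daemon'; }

ignore_case=$(sample | grep -i 'root')      # -i: matches "Root login" and "root shell"
inverted=$(sample | grep -i -v 'root')      # -v: keeps only the non-matching line
only_match=$(sample | grep -i -o 'root')    # -o: prints just the matched text, one per line
if sample | grep -q 'daemon'; then          # -q: no output, only the exit status is used
    quiet_result=found
else
    quiet_result=missing
fi

printf '%s\n' "$ignore_case" "$inverted" "$only_match" "$quiet_result"
```

Because `-q` reports through the exit status, it is the natural choice inside `if` tests in scripts.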
2. Regular expressions (PATTERN)
Basic regular expression metacharacters
2.1 Character Matching
Metacharacter | Meaning |
---|---|
. | Any single character |
[] | Any single character within the specified set |
[^] | Any single character outside the specified set |
[:digit:] | Digits |
[:lower:] | Lowercase letters |
[:upper:] | Uppercase letters |
[:alpha:] | Letters |
[:alnum:] | Letters and digits |
[:punct:] | Punctuation |
[:space:] | Whitespace characters (tabs, spaces, newlines, and so on) |
[root@localhost ~]# grep "r[[:alpha:]][[:alpha:]]t" /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
Note: in `[[:alpha:]]` the outer brackets match a single character, and the inner `[:alpha:]` says that this character must be a letter.
2.2 Number of matches
These metacharacters limit how many times the preceding character may occur. Matching is greedy by default (as many occurrences as possible). In BRE the quantifiers other than `*` are written with backslashes.
Metacharacter | Matches the preceding character |
---|---|
* | Any number of times: 0, 1, or many |
.* | Any characters, of any length |
\? | 0 or 1 time (the character is optional) |
\+ | 1 or more times (at least once) |
\{m\} | Exactly m times |
\{m,n\} | At least m times and at most n times |
\{0,n\} | At most n times |
\{m,\} | At least m times |
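A minimal sketch of the quantifiers above, assuming GNU grep (`\?` and `\+` in basic regular expressions are GNU extensions). The word list and the helper name `words` are made up for the example.

```shell
# Four invented test words with 0, 1, 2 and 3 copies of the letter a.
words() { printf '%s\n' ct cat caat caaat; }

star=$(words | grep -c 'ca*t')          # a occurs 0 or more times: all four lines
optional=$(words | grep -c 'ca\?t')     # a occurs 0 or 1 time: ct, cat
plus=$(words | grep -c 'ca\+t')         # a occurs at least once: cat, caat, caaat
exact=$(words | grep 'ca\{2\}t')        # a occurs exactly twice: caat
range=$(words | grep -c 'ca\{2,3\}t')   # a occurs 2 to 3 times: caat, caaat

echo "star=$star optional=$optional plus=$plus exact=$exact range=$range"
```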
2.3 Position anchors (beginning of line, end of line, word boundaries)
Metacharacter | Meaning |
---|---|
^ | Anchors the beginning of a line |
$ | Anchors the end of a line |
^PATTERN$ | PATTERN must match the entire line |
^[[:space:]]*$ | An empty line or a line containing only whitespace |
\< | Anchors the beginning of a word |
\> | Anchors the end of a word |
\<PATTERN\> | PATTERN must match a whole word |
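The anchors above can be compared on a handful of lines. This is an illustrative sketch that assumes GNU grep for `\<` and `\>`; the input lines and the helper name `lines` are invented.

```shell
# Invented input: the second line holds three spaces, the fifth is empty.
lines() { printf 'root\n   \nchroot jail\nrooted\n\nroot beer\n'; }

whole_line=$(lines | grep -c '^root$')        # only the line that is exactly "root"
line_start=$(lines | grep -c '^root')         # root, rooted, root beer
blank=$(lines | grep -c '^[[:space:]]*$')     # the whitespace-only line and the empty line
whole_word=$(lines | grep '\<root\>')         # "root" as a complete word

echo "whole_line=$whole_line line_start=$line_start blank=$blank"
printf '%s\n' "$whole_word"
```

Note that `chroot jail` and `rooted` are rejected by the word anchors because `root` is only part of a longer word there.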
Exercises:
1. grep -v "/bin/bash$" /etc/passwd
2. # Anchor both the beginning and the end of the word; otherwise the match is not limited to two- or three-digit numbers.
grep "\<[[:digit:]]\{2,3\}\>" /etc/passwd
3. grep "^[[:space:]]\{1,\}[^[:space:]]\{1,\}" /etc/grub2.cfg
load_env
set default="${next_entry}"
set next_entry=
save_env next_entry
set boot_once=true
set default="${saved_entry}"
menuentry_id_option="--id"
4. netstat -ant | grep "LISTEN[[:space:]]*$"
tcp 0 0 0.0.0.0:139 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:6379 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:445 0.0.0.0:* LISTEN
tcp6 0 0 :::3306 :::* LISTEN
tcp6 0 0 :::139 :::* LISTEN
tcp6 0 0 :::22 :::* LISTEN
tcp6 0 0 :::445 :::* LISTEN
tcp6 0 0 :::33060 :::* LISTEN
2.4 Grouping and back-references (a back-reference refers to the text matched by an earlier pair of parentheses)
Metacharacter | Meaning |
---|---|
\(\) | Binds one or more characters together so that they are treated as a whole |
\1 | The text matched by the group that starts at the first left parenthesis and ends at its matching right parenthesis |
\2 | The text matched by the second group |
\3 | The text matched by the third group |
The grep engine records the text matched by each group automatically in these internal variables.
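As a small illustration of `\1`, the sketch below keeps the lines whose first colon-separated field also ends the line. The three passwd-style lines and the helper name `passwd_sample` are invented sample data.

```shell
# Invented passwd-style lines used only for this example.
passwd_sample() {
    printf '%s\n' \
        'root:x:0:0:root:/root:/bin/bash' \
        'sync:x:5:0:sync:/sbin:/bin/sync' \
        'shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown'
}

# \( \) captures the user name; \1 requires the same text again at the end of the line.
same_name=$(passwd_sample | grep '^\([^:]\{1,\}\):.*\1$')

printf '%s\n' "$same_name"
```

The `root` line is dropped because it ends in `bash`, which differs from the captured user name.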
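3. egrep and fgrep
egrep: extended grep, uses extended regular expressions
fgrep: fast grep, treats the pattern as a fixed string and does not interpret regular expressions
The three behaviors can be selected with the grep options -E, -F and -G.
- Metacharacters
Note: in extended regular expressions `?`, `+`, `{}`, `()` and `|` are written without backslashes; the word anchors `\<` and `\>` still need them.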
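The difference between the three modes can be sketched as follows. The sample lines and the helper name `users` are invented, and `\|` in the basic-regex form is a GNU extension, so this assumes GNU grep.

```shell
# Invented sample lines.
users() { printf '%s\n' 'root:x:0' 'mysql:x:27' 'rooter:x:1000'; }

bre=$(users | grep '^\(root\|mysql\)\>')      # basic regex: group and alternation escaped
ere=$(users | grep -E '^(root|mysql)\>')      # extended regex: same pattern without backslashes

# -F takes the pattern literally, so the dot only matches a real dot.
fixed=$(printf '%s\n' 'a.c' 'abc' | grep -F 'a.c')
regex_dot=$(printf '%s\n' 'a.c' 'abc' | grep -c 'a.c')   # as a regex, the dot matches any character

printf '%s\n' "$bre" "$ere" "$fixed" "$regex_dot"
```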
Practice
1. grep -i "^s" /proc/meminfo
grep -i "^[sS]" /proc/meminfo
grep -E "^s|^S" /proc/meminfo
grep -E "^(s|S)" /proc/meminfo
2. # Note the word-end anchor
grep -E "^(root|mysql|samba)\>" /etc/passwd
3. grep -E "\<[[:alpha:]]*\>\(" /etc/rc.d/init.d/functions
grep "\<[[:alpha:]]*\>(" /etc/rc.d/init.d/functions
# Corrected:
grep -E "\<[_[:alnum:]]*\>\(\)" /etc/rc.d/init.d/functions
4. echo `pwd` | egrep -o "^/\<[[:alpha:]]+\>"
# Corrected: start locating from the end of the line
echo /root/mysite/layouts/ | grep -E -o "[^/]+/?$"
5. ifconfig | grep -E '\<[1,2]{0,1}[0-9]{0,1}[0-9]{1}\>'
# Corrected: build the ranges from the ones digit up to the hundreds digit: 1-9, 10-99, 100-199, 200-249, 250-255
ifconfig | grep -E '\<([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\>'
6. # Octet ranges: 1-255, 0-255, 0-255, 1-254
ifconfig | grep -E -o '\<([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\>\.\<([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\>\.\<([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\>\.\<([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-4])\>'
7. # Pay attention to the word anchors and to the beginning-of-line and end-of-line anchors
grep -E "^(\<[^:]+\>).*\1$" /etc/passwd
Supplement: wc, cut, sort, uniq, diff
1. wc: word count
Option | Description |
---|---|
-l | Count lines |
-w | Count words |
-c | Count bytes |
2. cut
Option | Description |
---|---|
-d | The field delimiter |
-f | Which fields (columns) to extract |
3. sort
Sorts the lines of its input.
Option | Description |
---|---|
-t CHAR | The field delimiter |
-k # | The field used as the sort key |
-n | Compare in numerical order |
-r | Reverse the order |
-f | Ignore case |
-u | Keep only one copy of duplicate lines (duplicate lines: adjacent and identical) |
4. uniq
Option | Description |
---|---|
-c | Show how many times each line is repeated |
-u | Show only the lines that are unique |
-d | Show only the lines that are repeated |
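These small tools are usually chained in a pipeline. The sketch below uses invented passwd-style lines (the helper name `accounts` and the data are assumptions) to show cut, sort, uniq and wc working together.

```shell
# Invented sample accounts.
accounts() {
    printf '%s\n' \
        'root:x:0:0:root:/root:/bin/bash' \
        'bin:x:1:1:bin:/bin:/sbin/nologin' \
        'daemon:x:2:2:daemon:/sbin:/sbin/nologin' \
        'alice:x:1000:1000::/home/alice:/bin/bash' \
        'bob:x:1001:1001::/home/bob:/bin/zsh'
}

total=$(accounts | wc -l | tr -d ' ')                      # number of lines
by_uid=$(accounts | sort -t: -k3,3nr | cut -d: -f1)        # names ordered by UID, highest first
repeated=$(accounts | cut -d: -f7 | sort | uniq -d)        # shells used by more than one account
single=$(accounts | cut -d: -f7 | sort | uniq -u)          # shells used by exactly one account

# uniq only compares adjacent lines, so the input is sorted first.
accounts | cut -d: -f7 | sort | uniq -c
echo "total=$total"
```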
5. diff
Compares two files line by line and reports the differences.
6. patch (forward and reverse)
diff old_file new_file > patch_file
Option | Description |
---|---|
diff -u | Use the unified format, which shows the context around each modified line; the default is three lines |
patch -R | Reverse mode: undo an applied patch, so a patch can be applied forward or in reverse |
Practice: obtain the IP address from the output of the ifconfig command.
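One possible approach is sketched below. Since real ifconfig output differs between systems, the sketch runs on an invented, simplified ifconfig-style text (the helper name `ifconfig_sample` and the addresses are placeholders), and the pattern only checks the dotted shape of the address, not the 0-255 range of each octet.

```shell
# Invented, simplified ifconfig-style output.
ifconfig_sample() {
    cat <<'EOF'
eth0: mtu 1500
        inet 192.168.1.10  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::1  prefixlen 64
lo: mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
EOF
}

# -o keeps only "inet ADDRESS"; cut then drops the leading keyword.
addresses=$(ifconfig_sample | grep -E -o 'inet [0-9]+(\.[0-9]+){3}' | cut -d' ' -f2)

printf '%s\n' "$addresses"
```

Matching on the `inet ` keyword keeps the netmask and broadcast values out of the result; a stricter octet pattern could be substituted if range validation is needed.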
II. sed
sed [OPTION]... {script} [input-file]...
1. Options
Option | Mnemonic | Description |
---|---|---|
-n | quiet | Do not print the pattern space automatically. Every line still enters the pattern space, but only what the editing commands in the script output is shown |
-e script | | Multi-point editing: apply several scripts to the input in one pass, e.g. -e script -e script |
-f file | file | Read the script from a file, one editing command per line |
-r | regular | Use extended regular expressions |
-i[SUFFIX] | in place | Edit the original file in place; if SUFFIX is given, a backup named file + SUFFIX is made first |
Comparing the output with and without -n (with -n, only what the script itself outputs is shown):
[root@localhost layouts]# cat -n /etc/fstab
1
2 #
3 # /etc/fstab
4 # Created by anaconda on Sun Nov 8 17:58:16 2020
5 #
6 # Accessible filesystems, by reference, are maintained under '/dev/disk'
7 # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
8 #
9 UUID=09780b2e-66e0-4fc3-8d91-7d9ddd350bbb / ext4 defaults 1 1
10 UUID=79e8f2c8-9f29-4248-9181-39209540d9a8 /boot xfs defaults 0 0
11 UUID=b822f2a7-37ba-4df5-9a1e-f82f9e2f9bfe swap swap defaults 0 0
[root@localhost layouts]# sed -n '1~2a\123' /etc/fstab
123
123
123
123
123
123
[root@localhost layouts]# sed '1~2a\123' /etc/fstab

123
#
# /etc/fstab
123
# Created by anaconda on Sun Nov 8 17:58:16 2020
#
123
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
123
#
UUID=09780b2e-66e0-4fc3-8d91-7d9ddd350bbb / ext4 defaults 1 1
123
UUID=79e8f2c8-9f29-4248-9181-39209540d9a8 /boot xfs defaults 0 0
UUID=b822f2a7-37ba-4df5-9a1e-f82f9e2f9bfe swap swap defaults 0 0
123
2. The script
A script is an address immediately followed by an editing command, with no separator between the two.
- Addresses
  - Empty address: the whole text is processed
  - Single address:
    - `#`: the line with that number
    - `/pattern/`: every line matched by the pattern
  - Address range:
    - `#,#`: from line # through line #
    - `#,+#`: from line # through the # lines that follow it
    - `#,/pattern/`: from line # through the next line matched by the pattern
    - `/pattern1/,/pattern2/`: from a line matched by pattern1 through the next line matched by pattern2
  - Step:
    - `1~2`: start at line 1 with step 2, i.e. all odd lines
    - `2~2`: start at line 2 with step 2, i.e. all even lines
- Editing commands
  - `d` (delete): delete the matched lines
  - `p` (print): print the contents of the pattern space
  - `a \text` (append): append text after the matched lines
  - `i \text` (insert): insert text before the matched lines
  - `c \text` (change): replace the matched lines with text
  - `w /WritePath` (write): save the lines matched in the pattern space to a file
  - `r /FilePath` (read): read a file and append its contents after the matched lines
  - `=`: print the line numbers of the matched lines
  - `!`: negation, written as address!command; the command applies to the lines the address does not match. For example, sed -n '2~2!=' test prints the odd line numbers (1, 3, ...)
  - `s///` (search and replace): the delimiter can be chosen freely, e.g. `s@@@` or `s###`
    - Replacement flags:
      - `g`: global replacement within the line
      - `w /WritePath`: save the lines where the replacement succeeded to a file
      - `p`: print the lines where the replacement succeeded
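The address forms and commands above can be tried on five short lines. This is a sketch that assumes GNU sed, because `#,+#` and `first~step` are GNU extensions; the helper name `lines` and the input are invented.

```shell
# Invented five-line input.
lines() { printf '%s\n' one two three four five; }

range=$(lines | sed -n '2,4p')               # lines 2 through 4
relative=$(lines | sed '2,+1d')              # delete line 2 and the one line after it
odd=$(lines | sed -n '1~2p')                 # every odd line
numbers=$(lines | sed -n '/two/,/four/=')    # line numbers between two pattern matches
negated=$(lines | sed '/three/!d')           # delete every line that does NOT match
global=$(lines | sed 's/o/0/g')              # replace every o on each line
first_only=$(lines | sed -n 's@e@E@p')       # other delimiter; print only lines where a replacement happened

printf '%s\n' "$range" "$relative" "$odd" "$numbers" "$negated" "$global" "$first_only"
```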
3. input-file
Several input files can be given; they are processed one after another.
Practice
1. sed -r '/^[[:space:]]+/s/^[[:space:]]+//p' /etc/grub2.cfg
2. sed -r -n '/^#/s/^#[[:space:]]*//p' /etc/fstab
3. echo '/home/xcg/desktop' | sed 's@[^/]*\/\?$@@'
4. Advanced editing commands
The advanced commands work with two buffers: the pattern space and the hold space.
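The notes do not list the advanced commands themselves, so the following is only a sketch using three standard sed commands that operate on the hold space: `h` (copy the pattern space to the hold space), `G` (append the hold space to the pattern space) and `$p` (print on the last line). The input letters are arbitrary.

```shell
# Reverse the order of the lines:
#   1!G  on every line except the first, append the hold space (the lines seen so far, reversed)
#   h    copy the pattern space to the hold space
#   $p   on the last line, print the accumulated result
reversed=$(printf '%s\n' a b c | sed -n '1!G;h;$p')

# G on its own appends the (empty) hold space, which adds a blank line after every line.
double_spaced=$(printf '%s\n' a b | sed 'G')

printf '%s\n' "$reversed" "$double_spaced"
```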
III. awk
awk (1) - pattern scanning and processing language
awk has been around since the 1970s; on Linux it is reimplemented as GNU awk (gawk).
It is a pattern scanning and processing language: an interpreted scripting language, and a programming language in its own right.
1. Basic usage
awk [options] 'program' file ...
The program supports conditional tests, loops, and variables.
A program has the form:
PATTERN{ACTION STATEMENTS}
Statements are separated by semicolons.
2. Options
Option | Meaning | Example |
---|---|---|
-F | The field separator | |
-v | Assign a variable (built-in or user-defined) | -v FS=':' (input field separator, defaults to whitespace); -v OFS=':' (output field separator, defaults to a single space) |
3. program
3.1 print
3.2 Variables
3.2.1 Built-in variables
Variable | Name | Description |
---|---|---|
FS | input field separator | Defaults to whitespace |
OFS | output field separator | Defaults to a single space |
RS | input record separator | Defaults to a newline; the input is split into records at this separator |
ORS | output record separator | Defaults to a newline |
NF | number of fields | Number of fields in the current record; {print NF} prints the field count, {print $NF} prints the last field |
NR | number of records | Number of records (lines) read so far |
FNR | file number of records | Like NR, but counted separately for each file |
FILENAME | | Name of the current file |
ARGC | | Number of command-line arguments |
ARGV | | Array that holds each argument given on the command line |
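A short sketch of the built-in variables in use. The passwd-style lines, the helper name `passwd_sample`, and the temporary file names are invented for the example.

```shell
# Invented sample records.
passwd_sample() {
    printf '%s\n' \
        'root:x:0:0:root:/root:/bin/bash' \
        'alice:x:1000:1000::/home/alice:/bin/zsh'
}

# -F sets FS, -v sets OFS; NR is the record number, NF the field count, $NF the last field.
fields=$(passwd_sample | awk -F: -v OFS='-' '{print NR, NF, $1, $NF}')

# NR keeps counting across files, FNR restarts for each file.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/first"
printf 'c\n'    > "$tmpdir/second"
counters=$(awk '{print NR, FNR}' "$tmpdir/first" "$tmpdir/second")
rm -rf "$tmpdir"

printf '%s\n' "$fields" "$counters"
```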
3.2.2 User-defined variables
- `-v var=value` on the command line (variable names are case-sensitive)
- Defined directly in the program
3.3 printf
printf "FORMAT", item1, item2, ...
- FORMAT must be given.
- printf does not add a line break; write an explicit \n where one is wanted.
- FORMAT needs one format specifier for each item that follows it.
- Format specifiers
  - `%c`: a single character (a numeric argument is taken as a character code)
  - `%d`, `%i`: decimal integer
  - `%e`, `%E`: scientific notation
  - `%f`: floating-point number
  - `%g`, `%G`: scientific notation or floating point, whichever is shorter
  - `%s`: string
  - `%u`: unsigned integer
  - `%%`: the % character itself
- Modifiers
  - `#[.#]`: the first # sets the display width, the second # sets the precision after the decimal point, e.g. %3.1f
  - `-`: left-align, e.g. %-15s
  - `+`: show the sign of a numeric value
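A minimal sketch of the width, precision, alignment and sign modifiers. The three input columns are made-up values.

```shell
# %-8s: left-aligned string in 8 columns; %5d: integer in 5 columns; %6.2f: width 6, 2 decimals.
table=$(printf '%s\n' 'root 0 3.14159' 'alice 1000 2.5' |
    awk '{printf "%-8s|%5d|%6.2f\n", $1, $2, $3}')

# %+d shows the sign; %% prints a literal percent sign.
signed=$(awk 'BEGIN{printf "%+d %d%%\n", 5, 80}')

printf '%s\n' "$table" "$signed"
```

The explicit `\n` in each format string matters: without it, all output would end up on one line.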
3.4 Operators
- Arithmetic operators
+ - * / ^ % -x +x
- String operator: there is no operator symbol; writing two strings next to each other concatenates them
- Assignment operators
= += -= *= /= ^= %= ++ --
- Comparison operators
> >= < <= != ==
- Pattern matching operators
~ (matches) !~ (does not match)
- Logical operators
&& || !
- Function call
function_name(argu1, argu2, ...)
- Conditional expression
selector ? if-true-expression : if-false-expression
Example: a user whose UID is greater than or equal to 1000 is a common user; otherwise it is a system user.
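The UID example can be sketched with the conditional expression, and the `~` operator with a second one-liner. The sample records and the helper name `passwd_sample` are invented.

```shell
# Invented sample records.
passwd_sample() {
    printf '%s\n' \
        'root:x:0:0:root:/root:/bin/bash' \
        'alice:x:1000:1000::/home/alice:/bin/zsh'
}

# Conditional expression: pick the label from the UID in field 3,
# then build the output line by string concatenation.
kinds=$(passwd_sample | awk -F: '{ kind = ($3 >= 1000) ? "common user" : "system user"; print $1 ": " kind }')

# Pattern matching operator: keep records whose last field ends in "bash".
bash_users=$(passwd_sample | awk -F: '$NF ~ /bash$/ {print $1}')

printf '%s\n' "$kinds" "$bash_users"
```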
3.5 PATTERN
A pattern plays the same role as an address in sed.
Pattern | Description |
---|---|
empty | Empty pattern: matches every line |
/regular expression/ | Only the lines matched by the regular expression are processed |
relational expression | A relational (comparison) expression; the line is processed only when the result is true, i.e. a non-zero number or a non-empty string |
line ranges | startline,endline written as /pattern1/,/pattern2/; a numeric form is not supported |
BEGIN/END | BEGIN{} runs only once, before the text in the file is processed; END{} runs only once, after text processing is complete |
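The pattern types above can be combined in one program, as in this sketch. The records and the helper name `records` are invented; the last command shows one common way to select lines by number with a relational expression on NR, since a numeric range cannot be written directly.

```shell
# Invented sample records.
records() { printf '%s\n' 'root:x:0' 'bin:x:1' 'alice:x:1000'; }

# BEGIN and END run once; the regex and the relational expression select records.
report=$(records | awk -F: '
BEGIN      { print "name uid" }
/^root/    { print $1, $3 }
$3 >= 1000 { print $1, $3 }
END        { print NR, "records" }')

# Range pattern: from a line matching START through the next line matching STOP.
block=$(printf '%s\n' a START b STOP c | awk '/START/,/STOP/')

# Selecting by line number with a relational expression on NR.
by_number=$(printf '%s\n' l1 l2 l3 l4 | awk 'NR >= 2 && NR <= 3')

printf '%s\n' "$report" "$block" "$by_number"
```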
3.6 Common Actions
Action | Description |
---|---|
expressions | Expressions |
control statements | Control statements such as if and while |
compound statements | Combined (grouped) statements |
input statements | Input statements |
output statements | Output statements |
3.7 Control Statements
if(condition) {statements}
if(condition) {statements} else {statements}
while(condition){statements}
do{statements} while (condition)
for(expr1; expr2; expr3){statements}
break
continue
delete array[index]
delete array
exit
{ statements }
- if-else
  Syntax: if(condition) {statements} else {statements}
- while
  Syntax: while(condition) {statements}
- do-while
  Syntax: do {statements} while(condition)
  The loop body executes at least once.
- for
  Syntax: for(expr1; expr2; expr3) {statements}
- switch
  Syntax: switch(expression) {case VALUE1 or /REGEXP1/: statement; case VALUE2 or /REGEXP2/: statement; ...}
- break and continue
- next
  Ends processing of the current line early and moves on to the next line. It resembles continue, but next works between lines while continue works inside a loop within one line.
- Arrays (commonly used for statistics)
  Associative array: array[index-expression]
  index-expression:
  - Any string can be used; string literals must be enclosed in double quotes.
  - If an array element does not exist yet, awk creates it automatically when it is referenced and initializes its value to the empty string.
  A typical use is counting the number of accesses per IP address.
Practice:
- Count the number of occurrences of each file system type in the /etc/fstab file.
~]# awk '/^UUID/{fs[$3]++}END{for(i in fs) {print i,fs[i]}}' /etc/fstab
swap 1
ext4 1
xfs 1
- Count the number of occurrences of each word in a specified file.
~]# cat word.txt
aaa bbb aaa ccc aaa eee bbb ccc
~]# awk '{for(i=1; i<=NF; i++) word[$i]++}END{for(i in word) {print i,word[i]}}' word.txt
aaa 3
ccc 2
eee 1
bbb 2
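The control statements listed in this section (if-else, for and next) can be seen together in one short program. This is an illustrative sketch; the input lines and the threshold of 12 are arbitrary choices.

```shell
sums=$(printf '%s\n' '# comment' '3 5 8' '10 1' | awk '
/^#/ { next }                          # skip comment lines and move on to the next line
{
    sum = 0
    for (i = 1; i <= NF; i++)          # loop over the fields of the current line
        sum += $i
    if (sum > 12) {
        print sum, "large"
    } else {
        print sum, "small"
    }
}')

printf '%s\n' "$sums"
```

Here `next` acts between lines (the comment line never reaches the second rule), while the `for` loop works within a single line.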
3.8 Functions
Built-in functions
Category | Function | Description |
---|---|---|
Numeric | rand() | Returns a random number between 0 and 1; in gawk the sequence is the same on every run unless srand() is called first |
String | sub(r,s,[t]) | In the string t, find the first match of the pattern r and replace it with s |
String | gsub(r,s,[t]) | Global replacement: in the string t, replace every match of the pattern r with s |
String | split(s,a[,r]) | Split the string s on the delimiter r and save the pieces in the array a |
Example of split:
# Split the fourth column on ":", then print the counts in a formatted layout
~]# netstat -nlptu | awk '/^tcp\>/{split($4,ip,":"); count[ip[1]]++}END{for(i in count) {printf "%-10s\t%d\n",i,count[i]}}'
127.0.0.1 1
0.0.0.0 3
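The string functions can also be tried without any input file by placing everything in a BEGIN block. The strings below are arbitrary examples; the rand() line only checks that the value lies in the expected range, because the actual number differs between runs and awk implementations.

```shell
result=$(awk 'BEGIN {
    s = "a-b-c"

    t = s; sub(/-/, "+", t)          # sub replaces only the first match
    print t

    u = s; n = gsub(/-/, "+", u)     # gsub replaces every match and returns the count
    print n, u

    m = split("127.0.0.1:6379", parts, ":")   # split returns the number of pieces
    print m, parts[1], parts[2]

    srand(); r = rand()              # seed first, then draw a number in [0, 1)
    print (r >= 0 && r < 1)
}')

printf '%s\n' "$result"
```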