  • Grep: (global search regular expression and print out the line) a text filtering tool

  • Sed: (stream editor) a line-oriented text editor

  • Awk: (named from the initials of its three inventors) implemented on Linux as gawk; a text report generator (formatted text output)

Regular expressions:

A pattern written with special characters and text characters. Some of these characters do not stand for their literal meaning but act as control or wildcard functions; they are called metacharacters.

Regular expressions fall into two categories according to their metacharacters:

  • Basic regular expressions (BRE)
  • Extended regular expressions (ERE)

1. Grep

  • Function: a text filtering tool that checks the target text line by line against the user-specified "pattern" (filter condition) and prints the lines that match;

  • Pattern: a filter condition written with regular expression metacharacters and text characters

    grep [OPTIONS] PATTERN [FILE...]
    grep [OPTIONS] [-e PATTERN | -f FILE] [FILE...]

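1. Options

-i             ignore case
-o             only print the matched part of each line
-v             invert: print the lines that do not match
-q             quiet: print nothing and report the result through the exit status
-E             use extended regular expressions
-a, --text     treat binary files as text
-A NUM         also print NUM lines after each match
-B NUM         also print NUM lines before each match
-C NUM         also print NUM lines of context before and after each match
--color=auto   highlight the matched text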
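
The options above can be tried on a small sample without touching any system file. This is only a sketch: the sample text and the variable names are made up for illustration, and GNU grep is assumed.

```shell
# Hypothetical sample input, kept in a variable so no file is needed.
sample='Root login
alpha
beta root
gamma'

# -i: case-insensitive, so both the "Root" and the "root" lines match.
ci=$(printf '%s\n' "$sample" | grep -i 'root')

# -v: invert the match, keeping lines without a lowercase "root".
inv=$(printf '%s\n' "$sample" | grep -v 'root')

# -o: print only the matched text instead of the whole line.
only=$(printf '%s\n' "$sample" | grep -io 'root')

# -q: print nothing; the exit status tells whether a match was found.
if printf '%s\n' "$sample" | grep -q 'gamma'; then found=yes; else found=no; fi

# -A 1: print each matching line plus the one line after it.
after=$(printf '%s\n' "$sample" | grep -A 1 'alpha')

printf '%s\n' "$ci" "$inv" "$only" "$found" "$after"
```

-q is the form usually used inside scripts, where only the exit status matters.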

2. Regular expression PATTERN

Basic regular expression metacharacters

2.1 Character Matching

.           any single character
[]          any single character within the specified range
[^]         any single character outside the specified range
[:digit:]   digits
[:lower:]   lowercase letters
[:upper:]   uppercase letters
[:alpha:]   letters
[:alnum:]   letters and digits
[:punct:]   punctuation
[:space:]   whitespace characters (including tabs, spaces, newlines, etc.)
[root@localhost ~]# grep "r[[:alpha:]][[:alpha:]]t" /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

Note:

[[:alpha:]]

The outer brackets match any single character from the set they enclose

The inner brackets, [:alpha:], name the character class of letters

2.2 Number of matches

Placed after a character to limit how many times that character may occur (greedy by default: as many as possible are matched). In BRE the quantifiers ?, +, { and } must be escaped with a backslash.

*           the preceding character any number of times: 0, 1, or more
.*          any character, any length
\?          0 or 1 times, that is, the preceding character is optional
\+          1 or more times, that is, at least once
\{m\}       exactly m times
\{m,n\}     at least m times and at most n times
\{0,n\}     at most n times
\{m,\}      at least m times
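
A minimal sketch of the quantifiers above, assuming GNU grep (the \? and \+ forms are GNU extensions to BRE). The word list is invented for the example.

```shell
# Hypothetical sample words, one per line.
words='ct
cat
caat
caaat'

# a* : "a" zero or more times, so all four lines match (-c counts them).
star=$(printf '%s\n' "$words" | grep -c '^ca*t$')

# a\? : "a" zero or one time.
opt=$(printf '%s\n' "$words" | grep '^ca\?t$')

# a\+ : "a" one or more times.
plus=$(printf '%s\n' "$words" | grep -c '^ca\+t$')

# a\{2,3\} : "a" at least 2 and at most 3 times.
range=$(printf '%s\n' "$words" | grep '^ca\{2,3\}t$')

printf '%s\n' "$star" "$opt" "$plus" "$range"
```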

2.3 Position anchoring (beginning of line, end of line, beginning of word, end of word)

^                  anchors the beginning of the line
$                  anchors the end of the line
^PATTERN$          PATTERN must match the entire line
^[[:space:]]*$     an empty line or a line containing only whitespace
\<                 anchors the beginning of a word
\>                 anchors the end of a word
\<PATTERN\>        PATTERN must match an entire word
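
The anchors above behave as follows on a small made-up sample. This is a sketch that assumes GNU grep, since \< and \> are GNU extensions.

```shell
# Hypothetical sample; the third line is empty on purpose.
text='root:x:0:0
use chroot

rooted tree
root'

# ^root : "root" at the beginning of the line.
bol=$(printf '%s\n' "$text" | grep '^root')

# root$ : "root" at the end of the line.
eol=$(printf '%s\n' "$text" | grep 'root$')

# \<root\> : "root" as a whole word, so "chroot" and "rooted" are skipped.
word=$(printf '%s\n' "$text" | grep '\<root\>')

# ^[[:space:]]*$ : empty or whitespace-only lines (counted with -c).
blank=$(printf '%s\n' "$text" | grep -c '^[[:space:]]*$')

printf '%s\n' "$bol" "$eol" "$word" "$blank"
```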

Exercises:

1. grep -v "/bin/bash$" /etc/passwd

2. # Note: anchor both the beginning and the end of the word, otherwise the match is not limited to two- or three-digit numbers.
grep "\<[[:digit:]]\{2,3\}\>" /etc/passwd

3. grep "^[[:space:]]\{1,\}[^[:space:]]\{1,\}" /etc/grub2.cfg
  load_env
  set default="${next_entry}"
  set next_entry=
  save_env next_entry
  set boot_once=true
  set default="${saved_entry}"
  menuentry_id_option="--id"

4. netstat -ant | grep "LISTEN[[:space:]]*$"
tcp        0      0 0.0.0.0:139             0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:6379          0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:445             0.0.0.0:*               LISTEN
tcp6       0      0 :::3306                 :::*                    LISTEN
tcp6       0      0 :::139                  :::*                    LISTEN
tcp6       0      0 :::22                   :::*                    LISTEN
tcp6       0      0 :::445                  :::*                    LISTEN
tcp6       0      0 :::33060                :::*                    LISTEN

2.4 Grouping and referencing (backreference: refers to the characters matched by an earlier pair of parentheses)

\(\)     binds one or more characters together so that they are treated as a whole
         The grep engine automatically records what each group matched in numbered variables:
\1       the content matched by the pattern between the first left parenthesis and its matching right parenthesis
\2       the content matched by the second group
\3       the content matched by the third group
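
A short sketch of grouping and backreferencing, assuming GNU grep. The three sentences are invented; only the last one repeats the same four-letter word matched by the group.

```shell
# Hypothetical sample sentences.
lines='He likes his lover
He loves his liker
She likes her liker'

# \(l..e\) captures a four-character string "l??e";
# \1 then requires exactly the same text to appear again later on the line.
bre=$(printf '%s\n' "$lines" | grep '\(l..e\).*\1')

# The same pattern in ERE: the grouping parentheses are not escaped.
ere=$(printf '%s\n' "$lines" | grep -E '(l..e).*\1')

printf '%s\n' "$bre" "$ere"
```

The first two lines contain both "like" and "love", but \1 only matches the exact text that the group captured, so they are not printed.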

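3. egrep, fgrep

egrep: extended grep, uses extended regular expressions (same as grep -E)

fgrep: fast grep, treats the pattern as a fixed string rather than a regular expression (same as grep -F)

The three modes can be selected with the -E, -F and -G options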

  • metacharacters

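Note: in ERE the grouping parentheses ( ) and the quantifiers ?, + and { } are written without backslashes, while the word anchors are still written \< and \>; to match a literal parenthesis, escape it as \(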

  • practice

1. grep -i "^s" /proc/meminfo
grep -i "^[sS]" /proc/meminfo
grep -E "^s|^S" /proc/meminfo
grep -E "^(s|S)" /proc/meminfo

2. # Note the word-end anchor
grep -E "^(root|mysql|samba)\>" /etc/passwd

3. grep -E "\<[[:alpha:]]*\>\(" /etc/rc.d/init.d/functions
grep "\<[[:alpha:]]*\>(" /etc/rc.d/init.d/functions
# After correction:
grep -E "\<[_[:alnum:]]*\>\(\)" /etc/rc.d/init.d/functions

4. echo `pwd` | egrep -o "^\/\<[[:alpha:]]+\>"
# After correction: note that the match is located from the end of the line
echo /root/mysite/layouts/ | grep -E -o "[^/]+/?$"

5. ifconfig | grep -E '\<[1,2]{0,1}[0-9]{0,1}[0-9]{1}\>'
# After correction: build up from the ones digit through the tens to the hundreds: 1-9, 10-99, 100-199, 200-249, 250-255
ifconfig | grep -E '\<([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\>'

6. # The four octets: 1-255, 0-255, 0-255, 1-254
ifconfig | grep -E -o '\<([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\>\.\<([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\>\.\<([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\>\.\<([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-4])\>'

7. # Note the word anchors, the beginning of the line and the end of the line
grep -E "^(\<[^:]+\>).*\1$" /etc/passwd

* Supplement: wc, cut, sort, uniq, diff


1. wc: word count

-l    count lines
-w    count words
-c    count bytes

2. cut

-d    the field delimiter
-f    which fields (columns) to select
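
A small sketch of wc and cut working on passwd-style text. The three records are sample data made up for the example, not read from a real /etc/passwd.

```shell
# Hypothetical passwd-style records.
passwd_sample='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
xcg:x:1000:1000::/home/xcg:/bin/bash'

# wc -l counts lines, wc -w counts whitespace-separated words.
nlines=$(printf '%s\n' "$passwd_sample" | wc -l)
nwords=$(printf 'aaa bbb ccc\n' | wc -w)

# cut -d sets the delimiter, -f picks the fields to keep.
names=$(printf '%s\n' "$passwd_sample" | cut -d: -f1)
pairs=$(printf '%s\n' "$passwd_sample" | cut -d: -f1,7)

printf '%s\n' "$nlines" "$nwords" "$names" "$pairs"
```

When several fields are selected, cut joins them with the same delimiter it split on, so -f1,7 gives lines such as root:/bin/bash.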

3. sort

Sorts lines of text

-t char    the field delimiter
-k#        the field used as the sort key
-n         compare numerically
-r         reverse the order
-f         ignore case
-u         keep only one copy of duplicate lines (after sorting, identical lines are adjacent)

4. uniq

-c    prefix each line with the number of times it occurs
-u    print only the lines that are not repeated
-d    print only the lines that are repeated
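
The sort and uniq options above are usually combined in a pipeline. The following is a sketch on invented data; note that uniq only compares adjacent lines, which is why the input is sorted first.

```shell
# Hypothetical list of login shells, with repeats.
shells='bash
nologin
bash
zsh
nologin
bash'

# sort | uniq -c counts each value; sort -rn puts the largest count first.
counts=$(printf '%s\n' "$shells" | sort | uniq -c | sort -rn)

# uniq -d keeps only repeated lines, uniq -u only lines that occur once.
dups=$(printf '%s\n' "$shells" | sort | uniq -d)
once=$(printf '%s\n' "$shells" | sort | uniq -u)

# sort -u sorts and removes duplicates in one step.
distinct=$(printf '%s\n' "$shells" | sort -u)

# -t sets the delimiter, -k2,2n sorts numerically on the second field.
by_uid=$(printf '%s\n' 'xcg:1000' 'root:0' 'bin:1' | sort -t: -k2,2n)

printf '%s\n' "$counts" "$dups" "$once" "$distinct" "$by_uid"
```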

5. diff

Compares two files line by line and shows the differences

6. patch (forward and reverse)

diff old_file new_file > patch_file

-u    (diff) use the unified format, which shows context around each changed line; the default is three lines.

-R    (patch) apply the patch in reverse, so a patch can be applied forward or rolled back

  • practice

    Obtain the IP address in the ifconfig command result


2. Sed

sed  [OPTION]... {script} [input-file]...

1. Options

-n, --quiet    do not automatically print the contents of the pattern space

(Every input line still enters the pattern space, but none of it is printed by default; the only output is what the editing commands in the script produce)
-e script      apply several scripts to the input at once (multi-point editing)

-e script -e script
-f file        read the script from a file, one editing command per line
-r             use extended regular expressions
-i[SUFFIX]     edit the original file in place; if SUFFIX is supplied, a backup named file + SUFFIX is kept

Comparison of the output with and without -n (the script only acts on the lines selected by its address):

[root@localhost layouts]# cat -n /etc/fstab 
     1
     2  #
     3  # /etc/fstab
     4  # Created by anaconda on Sun Nov  8 17:58:16 2020
     5  #
     6  # Accessible filesystems, by reference, are maintained under '/dev/disk'
     7  # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
     8  #
     9  UUID=09780b2e-66e0-4fc3-8d91-7d9ddd350bbb /                       ext4    defaults        1 1
    10  UUID=79e8f2c8-9f29-4248-9181-39209540d9a8 /boot                   xfs     defaults        0 0
    11  UUID=b822f2a7-37ba-4df5-9a1e-f82f9e2f9bfe swap                    swap    defaults        0 0
    
[root@localhost layouts]# sed -n '1~2a\123' /etc/fstab 
123
123
123
123
123
123
[root@localhost layouts]# sed '1~2a\123' /etc/fstab 

123
#
# /etc/fstab
123
# Created by anaconda on Sun Nov  8 17:58:16 2020
#
123
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
123
#
UUID=09780b2e-66e0-4fc3-8d91-7d9ddd350bbb /                       ext4    defaults        1 1
123
UUID=79e8f2c8-9f29-4248-9181-39209540d9a8 /boot                   xfs     defaults        0 0
UUID=b822f2a7-37ba-4df5-9a1e-f82f9e2f9bfe swap                    swap    defaults        0 0
123

2. The script

Address followed by an editing command (note that nothing separates the two)

  • Addresses

    • Empty address: the whole text is processed

    • Single address:

      #                        the line with that number
      /pattern/                every line matched by the pattern
    • Address range

      #,#                      [from line, to line]
      #,+#                     [from line, that line plus # more lines]
      #,/pattern/              [from line, first following line matched by pattern]
      /pattern1/,/pattern2/    [line matched by pattern1, line matched by pattern2]
    • Step

      1~2    start at line 1, step 2: all odd lines
      2~2    start at line 2, step 2: all even lines
  • Editing commands

    d              delete: delete the matching lines
    p              print: print the contents of the pattern space
    a \text        append: add text after the matching line
    i \text        insert: add text before the matching line
    c \text        change: replace the matching lines with text
    w /path/file   write: save the lines matched in the pattern space to the file
    r /path/file   read: read the file and append its contents after the matching line
    =              print the line number of each matching line
    !              negate the address: address!command, e.g. sed -n '2~2!=' test

    1

    3
    s///           search and replace; another delimiter may be used, e.g. s@@@ or s###

    Replacement flags: g replaces every occurrence on the line; w /path/file saves the lines on which a substitution succeeded to the file; p prints the lines on which a substitution succeeded
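
The address forms and editing commands listed above can be combined as in the following sketch. It assumes GNU sed (the first~step address and the one-line a\text form are GNU extensions), and the five-line input is made up for the example.

```shell
# Hypothetical five-line input.
input='one
two
three
four
five'

# Line range + d: delete lines 2 through 3.
del=$(printf '%s\n' "$input" | sed '2,3d')

# Step address + p with -n: print only the odd-numbered lines.
odd=$(printf '%s\n' "$input" | sed -n '1~2p')

# Pattern range + ! : print every line outside the range two..four.
neg=$(printf '%s\n' "$input" | sed -n '/two/,/four/!p')

# s with @ as delimiter and the g flag: replace every occurrence on the line.
sub=$(printf '%s\n' '/usr/bin:/usr/sbin' | sed 's@/usr@/opt@g')

# $ addresses the last line; a appends text after it.
app=$(printf '%s\n' "$input" | sed '$a\END')

printf '%s\n' "$del" "$odd" "$neg" "$sub" "$app"
```

Choosing @ as the delimiter avoids having to escape every / in a path.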

3. input-file

You can add multiple files and process them.

practice

1. sed -r '/^[[:space:]]+/s/^[[:space:]]+//p' /etc/grub2.cfg
2. sed -r -n '/^#/s/^#[[:space:]]*//p' /etc/fstab
3. echo '/home/xcg/desktop' | sed 's@[^/]*\/\?$@@'

4. Advanced editing commands

Pattern space, hold space
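
As one illustration of how the two spaces interact, the well-known one-liner below prints its input in reverse line order. It is only a sketch: the commands h (copy the pattern space into the hold space) and G (append the hold space to the pattern space) are standard sed commands, but they are not listed in the notes above, and the three-line input is made up.

```shell
# 1!G : on every line except the first, append the hold space
#       (the lines seen so far, already reversed) after the current line
# h   : copy the result back into the hold space
# $p  : on the last line, print the accumulated pattern space
rev=$(printf '%s\n' one two three | sed -n '1!G;h;$p')

printf '%s\n' "$rev"
```

The hold space works here as a buffer that survives from one input line to the next, which the pattern space does not.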



3. Awk

awk (1)              - pattern scanning and processing language

Awk first appeared in the 1970s; on Linux it is reimplemented as GNU awk (gawk).

It is a pattern scanning and processing language: a scripting language run by its own interpreter.

1. Basic usage

awk [options] 'program' file ...

Program supports conditional judgments, loops, and variables.

The program:

    PATTERN{ACTION STATEMENTS}

Statements are separated by semicolons

2. Options

-F fs           the input field separator
-v var=value    assign a variable before the program runs; this also sets built-in variables, e.g. -v FS=':' (input field separator, whitespace by default)

-v OFS=':' (output field separator, a single space by default)
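
A minimal sketch of -F and -v on passwd-style text. The two records and the " -> " separator are made up for the example.

```shell
# Hypothetical passwd-style records.
passwd_sample='root:x:0:0:root:/root:/bin/bash
xcg:x:1000:1000::/home/xcg:/bin/bash'

# -F: splits input fields on ":"; -v OFS sets what print puts between items.
with_F=$(printf '%s\n' "$passwd_sample" | awk -F: -v OFS=' -> ' '{print $1, $7}')

# -v FS=':' is equivalent to -F: here.
with_v=$(printf '%s\n' "$passwd_sample" | awk -v FS=':' -v OFS=' -> ' '{print $1, $7}')

printf '%s\n' "$with_F" "$with_v"
```

OFS is only inserted where the items in print are separated by commas; items written side by side are concatenated without it.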

3. program

3.1 print

3.2 Variables

3.2.1 Built-in variables
FS          input field separator; the default is whitespace
OFS         output field separator; the default is a single space
RS          input record separator; the default is a newline, so each line is one record
ORS         output record separator; the default is a newline
NF          number of fields in the current record
            {print $NF}  prints the last field
            {print NF}   prints the number of fields
NR          number of records read so far (the line number)
FNR         number of records read from the current file (counted separately for each file)
FILENAME    name of the current file
ARGC        number of command-line arguments
ARGV        array holding each argument given on the command line
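
The following sketch shows NR, NF, $NF and RS on a tiny made-up input; it uses only POSIX awk features.

```shell
# Hypothetical two-line input with a different number of fields per line.
data='a b c
d e'

# NR is the record number, NF the field count, $NF the last field.
per_line=$(printf '%s\n' "$data" | awk '{print NR, NF, $NF}')

# In END, NR holds the total number of records read.
total=$(printf '%s\n' "$data" | awk 'END{print NR}')

# RS="," makes each comma-separated item a record of its own.
records=$(printf 'a,b,c' | awk -v RS=',' '{print NR ": " $0}')

printf '%s\n' "$per_line" "$total" "$records"
```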
3.2.2 User-defined variables
  • -v var=value

Variable names are case-sensitive

  • Defined directly in the program

3.3 printf

printf "FORMAT", item1, item2, ...

  • FORMAT must be given

  • No newline is added automatically; an explicit \n is required

  • FORMAT needs one format specifier for each of the items that follow

    • Format specifiers

      %c        a character (a numeric item is treated as a character code)
      %d, %i    decimal integer
      %e, %E    scientific notation
      %f        floating-point number
      %g, %G    scientific notation or floating point, whichever is shorter
      %s        string
      %u        unsigned integer
      %%        a literal %
    • Modifiers

      #[.#]     the first number sets the display width, the second the precision after the decimal point, e.g. %3.1f
      -         left-align, e.g. %-15s
      +         always show the sign of a numeric value
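
A short sketch of the format specifiers and modifiers above. The input values are made up; the "|" characters are only there to make the column widths visible.

```shell
# %-6s  : string, left-aligned in 6 columns
# %5d   : integer, right-aligned in 5 columns
# %6.2f : float, 6 columns wide with 2 digits after the decimal point
table=$(printf '%s\n' 'root 0 3.14159' 'xcg 1000 2.5' |
        awk '{printf "%-6s|%5d|%6.2f\n", $1, $2, $3}')

# %+d shows the sign, %c turns the code 65 into its character, %% prints %.
misc=$(awk 'BEGIN{printf "%+d %c %%\n", 5, 65}')

printf '%s\n' "$table" "$misc"
```

Unlike print, printf ends the line only where the format contains \n.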

3.4 Operators

  • Arithmetic operators

    + - * / ^ %
    -x
    +x
  • String operator: there is no operator symbol; writing strings next to each other concatenates them

  • Assignment operators:

    = += -= *= /= ^= %=
    ++ --
  • Comparison operators

    > >= < <= != ==
  • Pattern matching operators

    ~     matches
    !~    does not match
  • Logical operators

    &&
    ||
    !
  • Function call

    function_name(argu1, argu2, ...)
  • Conditional expression

    selector ? if-true-expression : if-false-expression

    Example: a user whose UID is greater than or equal to 1000 is a common user; otherwise, a system user
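
The UID rule stated above can be sketched with the conditional expression as follows. The two passwd-style records and the variable name usertype are made up for the example.

```shell
# Hypothetical passwd-style records; field 3 is the UID.
passwd_sample='root:x:0:0:root:/root:/bin/bash
xcg:x:1000:1000::/home/xcg:/bin/bash'

# selector ? value-if-true : value-if-false
out=$(printf '%s\n' "$passwd_sample" |
      awk -F: '{ usertype = ($3 >= 1000) ? "common user" : "system user"
                 print $1 ": " usertype }')

printf '%s\n' "$out"
```

The print line also shows string concatenation: $1, ": " and usertype are simply written next to each other.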

3.5 PATTERN

Similar to the address in sed

empty                    empty pattern: matches every line
/regular expression/     only lines matched by the pattern are processed
relational expression    a relational (comparison) expression: the line is processed only if the result is true. True: a non-zero value or a non-empty string
line ranges              startline,endline written as /pattern1/,/pattern2/; a plain line-number range is not supported
BEGIN/END                BEGIN{} runs only once, before any text from the files is processed
                         END{} runs only once, after all text has been processed
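
The pattern types above can all appear in one program, as in this sketch. The input and the labels printed by each rule are made up; `$2 + 0` is used in the relational pattern to force a numeric comparison even when the second field is not a number.

```shell
# Hypothetical input mixing a comment, data lines and a START..STOP block.
data='# comment
alpha 5
beta 12
START
inside
STOP
gamma 7'

out=$(printf '%s\n' "$data" | awk '
BEGIN          { print "begin" }
/^#/           { print "regex: " $0 }
$2 + 0 > 6     { print "relational: " $1 }
/START/,/STOP/ { print "range: " $0 }
END            { print "end, records: " NR }')

printf '%s\n' "$out"
```

For every input line awk tries the rules from top to bottom, so one line can trigger more than one action.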

3.6 Common actions

expressions
control statements     such as if and while
compound statements    several statements grouped in braces
input statements
output statements

3.7 Control Statements

if(condition) {statements}

if(condition) {statements} else {statements}

while(condition){statements}

do{statements} while (condition)

for(expr1; expr2; expr3){statements}

break

continue

delete array[index]

delete array

exit

{ statements }
  • if-else

    Syntax: if(condition) {statements} else {statements}

  • while

    Syntax: while(condition) {statements}

  • do-while

    Syntax: do{statements} while (condition)

    The loop body is executed at least once

  • for loop

    Syntax: for(expr1; expr2; expr3){statements}

  • switch statement

    Syntax: switch(expression){case VALUE1 or /REGEXP1/: statement; case VALUE2 or /REGEXP2/: statement; ...; default: statement}

  • break and continue

  • next

    Ends processing of the current line early and moves on to the next line. Similar to continue, but next works between lines, while continue works inside a loop within one line.

  • Arrays (commonly used for statistics)

    Associative array: array[index-expression]

    index-expression:

    • Any string can be used. String constants need to be enclosed in double quotes.

    • If an array element does not exist yet, awk automatically creates it when it is referenced and initializes its value to the empty string

      Typical use: counting the number of accesses per IP address

      Practice:

      1. Count the number of occurrences of each file system type in the /etc/fstab file.

        ~]# awk '/^UUID/{fs[$3]++}END{for(i in fs) {print i,fs[i]}}' /etc/fstab 
        swap 1
        ext4 1
        xfs 1
      2. Count the number of occurrences of each word in a specified file

        ~]# cat word.txt 
        aaa bbb aaa ccc aaa eee bbb ccc
        ~]# awk '{for(i=1; i<=NF; i++) word[$i]++}END{for(i in word) {print i,word[i]}}' word.txt 
        aaa 3
        ccc 2
        eee 1
        bbb 2
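
The control statements listed above (next, for and if-else) can be combined as in this sketch. The score data and the pass mark of 60 are made up for the example.

```shell
# Hypothetical data: a name followed by one or more scores.
data='# scores
alice 90 80
bob 50 40
carol 70'

out=$(printf '%s\n' "$data" | awk '
/^#/ { next }                   # skip comment lines and go to the next line
{
    sum = 0
    for (i = 2; i <= NF; i++)   # loop over the score fields of this line
        sum += $i
    avg = sum / (NF - 1)
    if (avg >= 60)
        print $1, "pass"
    else
        print $1, "fail"
}')

printf '%s\n' "$out"
```

Because of next, the second rule never sees the comment line, so it needs no pattern of its own.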

3.8 Functions

Built-in functions
Numeric            rand()           returns a random number between 0 and 1; unless srand() is called first, the same sequence is returned on every run
String handling    sub(r,s,[t])     searches the string t for content matching the pattern r and replaces the first occurrence with s
                   gsub(r,s,[t])    global replacement: searches the string t for content matching the pattern r and replaces every occurrence with s
                   split(s,a[,r])   splits the string s using r as the delimiter and saves the pieces in the array a
split usage example:
# Split the fourth column on ":" and then print the formatted counts
~]# netstat -nlptu | awk '/^tcp\>/{split($4,ip,":"); count[ip[1]]++}END{for(i in count) {printf "%-10s\t%d\n",i,count[i]}}'
127.0.0.1       1
0.0.0.0         3
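
A minimal sketch of the built-in functions above on made-up strings. When the third argument of sub and gsub is omitted, they work on the whole record $0.

```shell
# sub replaces only the first match in the record.
first=$(echo 'a-b-c' | awk '{sub(/-/, "+"); print}')

# gsub replaces every match and returns how many replacements it made.
all=$(echo 'a-b-c' | awk '{n = gsub(/-/, "+"); print n, $0}')

# split returns the number of pieces; the array indexes start at 1.
parts=$(echo '127.0.0.1:6379' | awk '{n = split($0, part, ":"); print n, part[1], part[2]}')

# srand() seeds the generator (from the time of day by default),
# so that rand() differs between runs.
rnd=$(awk 'BEGIN{srand(); x = rand(); print (x >= 0 && x < 1) ? "in range" : "out of range"}')

printf '%s\n' "$first" "$all" "$parts" "$rnd"
```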