🎏 This is the 11th day of my participation in the Gwen Challenge. Check out the details: Gwen Challenge

0x00 📢 Preface

Python is a programming language that knows how not to interfere with your programming. It’s easy to learn and powerful enough to build Web applications and automate boring stuff.

This article summarizes and analyzes common string operations in detail. I hope it helps.

0x01 String (string)

Strings are the most common data type in Python, and they can be written with either single or double quotes. The interpreter echoes a string in single quotes even when you typed double quotes, unless the string itself contains a single quote.

>>> "Hello world!"
'Hello world!'

>>> 'Hello  world!'
'Hello  world!'

>>> "Let's go!"
"Let's go!"

>>> 'she said "Hello world!" '
'she said "Hello world!" '

Escaping quotes

The examples above can also be written by escaping the quotes with a backslash (\).

>>> 'Let\'s go!'
"Let's go!"

>>> "\"Hello, world!\" she said"
'"Hello, world!" she said'

Concatenating strings

A + sign is usually used to concatenate strings, like adding numbers.

>>> "she said " + '"Hello world!" '
'she said "Hello world!" '

>>> a = "she said "
>>> b = '"Hello world!" '
>>> a + b
'she said "Hello world!" '

Two string literals written one after the other are also concatenated automatically.

>>> "she said " '"Hello world!" '   
'she said "Hello world!" '

# This works only for string literals, not for variables.
>>> a = "she said "
>>> b = '"Hello world!" '
>>> a  b
  File "<stdin>", line 1
    a  b
       ^
SyntaxError: invalid syntax
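
As a minimal sketch of where adjacent-literal concatenation is handy, the snippet below splits one long literal across several lines inside parentheses. The variable names (message, prefix, quote, joined) are made up for illustration.

```python
# Adjacent string literals are joined by the parser, so a long
# literal can be split across lines inside parentheses.
message = (
    "she said "
    '"Hello world!"'
)

# Variables still have to be joined explicitly with +.
prefix = "she said "
quote = '"Hello world!"'
joined = prefix + quote

print(message)            # she said "Hello world!"
print(message == joined)  # True
```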

Long strings

You can use triple quotes to represent long strings (strings that span multiple lines).

>>> """like this"""
'like this'

>>> print('''long long ago!
"Hello world!"
she said.''')
long long ago!
"Hello world!"
she said.

Regular strings can also span multiple lines. If a line ends with a backslash, both the backslash and the newline that follows it are escaped, that is, ignored. The same trick works for expressions and statements.

>>> 1 + 2 + \
4 + 5
12

>>> print("Hello \
world!")
Hello world!

>>> print \
('Hello world')
Hello world

Indexing

A string literal can be indexed directly without first assigning it to a variable.

>>> 'Hello'[1]
'e'

If a function call returns a sequence, it can be indexed directly.

>>> yearnum = input('please input year: ')[3]
please input year: 2021
>>> yearnum
'1'

When the sequence is multiplied by the number n, the sequence is repeated n times to create a new sequence.

>>> 'python' * 3 
'pythonpythonpython'

The in operator

To check whether a particular value is contained in a sequence, use the in operator. For strings, it tests whether a substring occurs in the string.

>>> access_mode = 'rw+'
>>> 'w' in access_mode 
True
>>> 'x' in access_mode 
False

>>> subject = '$$$ Get rich now!!! $$$'
>>> '$$$' in subject 
True

Creating a list

Using the list function, you can quickly convert a string to a list of characters.

>>> somelist = list('Hello')
>>> somelist
['H', 'e', 'l', 'l', 'o']

To convert a list of characters back into a string, use join:

>>> ''.join(somelist)
'Hello'
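
The list/join round trip is mostly useful because strings are immutable. The sketch below shows a failed item assignment on a string and then the list-based workaround. The names word, chars, and result are illustrative only.

```python
# Strings are immutable, so item assignment on a str raises TypeError.
word = "Hello"
try:
    word[0] = "J"
except TypeError:
    assignment_failed = True
else:
    assignment_failed = False

# Workaround: edit a list of characters, then join it back together.
chars = list(word)
chars[0] = "J"
result = "".join(chars)

print(assignment_failed)  # True
print(result)             # Jello
```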

Slice assignment

>>> name = list('Perl')
>>> name
['P', 'e', 'r', 'l']

>>> name[2:] = list('ar')
>>> name
['P', 'e', 'a', 'r']

>>> name = list('Perl')
>>> name[1:] = list('ython')
>>> name
['P', 'y', 't', 'h', 'o', 'n']
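
Slice assignment can also insert or delete elements, because the replacement does not have to be the same length as the slice. A small sketch with a made-up list of numbers:

```python
numbers = [1, 5]

# Assigning to an empty slice inserts elements without replacing any.
numbers[1:1] = [2, 3, 4]
inserted = list(numbers)   # [1, 2, 3, 4, 5]

# Assigning an empty list to a slice deletes those elements.
numbers[1:4] = []
deleted = list(numbers)    # [1, 5]

print(inserted)
print(deleted)
```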

0x02 String formatting

The %s in a format string is called a conversion specifier; it marks where a value is to be inserted. The values to format go to the right of the % operator. You can supply a single value (such as a string or a number), a tuple (to format several values), or a dictionary. Tuples are the most common.

>>> format = "Hello, %s. %s !"
>>> values = ('world', 'python')
>>> format % values
'Hello, world. python !'
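
The tuple case is shown above. The other two forms the text mentions, a single value and a dictionary, might look like the following sketch. The keys greeting and name are invented for the example.

```python
# A single value needs no tuple.
single = "Hello, %s!" % "world"

# With a dictionary, each specifier names its key in parentheses.
from_dict = "%(greeting)s, %(name)s!" % {"greeting": "Hello", "name": "python"}

# Other conversion types work the same way, e.g. %.2f for two decimals.
price = "Price: %.2f" % 3.14159

print(single)     # Hello, world!
print(from_dict)  # Hello, python!
print(price)      # Price: 3.14
```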

Template strings

Template strings offer a simpler substitution mechanism. Arguments that contain an equal sign, such as param1="world" below, are called keyword arguments.

>>> from string import Template
>>> tmpl = Template("Hello, $param1! $param2 !")
>>> tmpl.substitute(param1="world", param2="Python") 
'Hello, world! Python !'
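
Besides keyword arguments, substitute also accepts a mapping, and safe_substitute leaves unknown placeholders in place instead of raising KeyError. A brief sketch, reusing the template from above with an illustrative values dictionary:

```python
from string import Template

tmpl = Template("Hello, $param1! $param2 !")

# A dictionary can be passed instead of keyword arguments.
values = {"param1": "world", "param2": "Python"}
from_mapping = tmpl.substitute(values)

# safe_substitute leaves missing placeholders untouched.
partial = tmpl.safe_substitute(param1="world")

print(from_mapping)  # Hello, world! Python !
print(partial)       # Hello, world! $param2 !
```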

The string method format

>>> "{}, {} and {}".format("first", "second", "third")
'first, second and third'
>>> "{0}, {1} and {2}".format("first", "second", "third")
'first, second and third'
>>> "{3} {0} {2} {1} {3} {0}".format("be", "not", "or", "to")
'to be or not to be'

>>> from math import pi
>>> "{name} is approximately {value:.2f}.".format(value=pi, name="π")
'π is approximately 3.14.'

If a variable has the same name as the replacement field, you can use a shorthand: the f-string, written by prefixing the string with f (Python 3.6+).

>>> from math import e
>>> f"Euler's constant is roughly {e}."  # equivalent to "Euler's constant is roughly {e}.".format(e=e)
"Euler's constant is roughly 2.718281828459045."
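
Replacement fields in an f-string accept the same format specifiers as format, and they may hold arbitrary expressions. A minimal sketch (the names rounded and doubled are illustrative):

```python
from math import e

# A format specifier after the colon works as it does with str.format.
rounded = f"Euler's constant is roughly {e:.3f}."

# The field may contain an expression, not just a variable name.
doubled = f"Twice e is about {2 * e:.2f}."

print(rounded)  # Euler's constant is roughly 2.718.
print(doubled)  # Twice e is about 5.44.
```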

0x03 How to set the format

A format string describes how its values are to be formatted, using the format specification mini-language. Each value is inserted into the string in place of a replacement field enclosed in curly braces. A replacement field consists of the following parts, each of which is optional.

  • Field name: An index or identifier that indicates which value to format and substitute for the field. Besides naming the value itself, you can select a specific part of it, such as an element of a list.
  • Conversion flag: An exclamation point followed by a single character. The supported characters are r (for repr), s (for str), and a (for ascii). If you specify a conversion flag, the object's own formatting mechanism is not used. Instead, the named function converts the object to a string, which is then formatted further.
  • Format specifier: A colon followed by an expression in the format specification mini-language. The format specifier lets you describe the final format in detail, including the format type (such as string, floating point, or hexadecimal), the field width and numeric precision, how to display signs and thousands separators, and various forms of alignment and padding.
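
The three parts of a replacement field can be seen separately and then combined in the sketch below. The argument names (name, value, items) are invented for the example.

```python
# Field name only: selects which argument to insert.
name_field = "{name}".format(name="π")

# Field name + conversion flag: apply repr before inserting.
conversion = "{name!r}".format(name="π")

# Field name + format specifier: width 10, two decimals.
spec = "{value:10.2f}".format(value=3.14159)

# All three: first list element, repr, right-aligned in 8 columns.
combined = "{items[0]!r:>8}".format(items=["ab", "cd"])

print(name_field)      # π
print(conversion)      # 'π'
print(repr(spec))      # '      3.14'
print(repr(combined))  # "    'ab'"
```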

Field names

In the simplest case, you pass unnamed arguments to format and use unnamed fields in the format string. The fields and arguments are then paired in order. You can also give an argument a name and use that name in the corresponding replacement field. The two approaches can be mixed.

>>> "{foo} {} {bar} {}".format(1, 2, bar=4, foo=3)
'3 1 4 2'

You can also use an index to say which unnamed argument a field should take, so that unnamed arguments can be used out of order.

>>> "{foo} {1} {bar} {0}".format(1, 2, bar=4, foo=3)
'3 2 4 1'

Rather than using only the supplied value itself, you can access parts of it: use indexing to pick out items, and dot notation to reach attributes such as the variables and functions of an imported module.

>>> fullname = ["Alfred", "Smoketoomuch"]
>>> "Mr {name[1]}".format(name=fullname)
'Mr Smoketoomuch'

>>> import math
>>> tmpl = "The {mod.__name__} module defines the value {mod.pi} for π"
>>> tmpl.format(mod=math)
'The math module defines the value 3.141592653589793 for π'

Conversion flags

The flags s, r, and a specify conversion with str, repr, and ascii, respectively. The str function usually creates a natural-looking string version of the value. The repr function tries to create a Python representation of the given value (in this case, a string literal). The ascii function creates a representation that contains only ASCII characters.

>>> print("{pi!s} {pi!r} {pi!a}".format(pi="π"))
π 'π' '\u03c0'

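Format specification

The format specification (that is, the part after the colon) sets the type of formatting. For example, the character f (for fixed point) formats the value as a decimal number, and b formats an integer in binary.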

>>> "The number is {num}".format(num=42) 
'The number is 42'
>>> "The number is {num:f}".format(num=42) 
'The number is 42.000000'
>>> "The number is {num:b}".format(num=42) 
'The number is 101010'
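
The format specification also covers width, alignment, padding, signs, and thousands separators, as described earlier. The following sketch shows a few of those options on sample values chosen only for illustration.

```python
right = "{:>10}".format("abc")      # right-align in 10 columns
centered = "{:*^11}".format("abc")  # center, padding with *
thousands = "{:,}".format(1234567)  # thousands separator
signed = "{:+.1f}".format(3.14159)  # always show the sign, 1 decimal
hexed = "{:x}".format(255)          # hexadecimal
percent = "{:.1%}".format(0.256)    # percentage with 1 decimal

print(repr(right))  # '       abc'
print(centered)     # ****abc****
print(thousands)    # 1,234,567
print(signed)       # +3.1
print(hexed)        # ff
print(percent)      # 25.6%
```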

0x04 String methods

Constants

The string module contains several useful constants:

  • string.digits: A string containing the digits 0 to 9.
  • string.ascii_letters: A string containing all ASCII letters (upper and lower case).
  • string.ascii_lowercase: A string containing all lowercase ASCII letters.
  • string.printable: A string containing all printable ASCII characters.
  • string.punctuation: A string containing all ASCII punctuation characters.
  • string.ascii_uppercase: A string containing all uppercase ASCII letters.
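
One way these constants might be used is for simple membership tests on characters. The helper is_all_digits below is a hypothetical function written for this sketch, not part of the string module.

```python
import string

def is_all_digits(text):
    """Return True if text is non-empty and made up only of 0-9."""
    return bool(text) and all(ch in string.digits for ch in text)

# ascii_letters is the lowercase letters followed by the uppercase ones.
letters_ok = string.ascii_letters == string.ascii_lowercase + string.ascii_uppercase

# Drop every ASCII punctuation character from a sentence.
no_punct = "".join(ch for ch in "Hello, world!" if ch not in string.punctuation)

print(is_all_digits("2021"))  # True
print(is_all_digits("20x1"))  # False
print(letters_ok)             # True
print(no_punct)               # Hello world
```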

Padding methods

The string methods that pad a string with a fill character are:

center, ljust, rjust, zfill
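
A short sketch of the four padding methods on sample strings. The fill characters chosen here are arbitrary; a space is used when none is given.

```python
centered = "spam".center(10, "*")  # pad both sides to width 10
left = "spam".ljust(8, ".")        # pad on the right
right = "spam".rjust(8)            # pad on the left with spaces
zero = "-42".zfill(6)              # pad with zeros after the sign

print(centered)     # ***spam***
print(left)         # spam....
print(repr(right))  # '    spam'
print(zero)         # -00042
```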

split

If no delimiter is specified, split breaks the string at runs of one or more consecutive whitespace characters (spaces, tabs, newlines, and so on). It is the inverse of join.

>>> seq = ['1', '2', '3', '4', '5']
>>> sep = '+'
>>> sep.join(seq) # Merge a list of strings
'1+2+3+4+5'

>>> '1+2+3+4+5'.split('+')
['1', '2', '3', '4', '5']
>>> 'Using the default'.split()
['Using', 'the', 'default']
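
The difference between the default whitespace split and an explicit delimiter shows up with repeated separators, and an optional second argument limits the number of splits. A small sketch on made-up inputs:

```python
# Default split: runs of whitespace count as one separator,
# and leading/trailing whitespace is ignored.
fields = "  a \t b\n c ".split()

# Explicit delimiter: consecutive delimiters yield empty strings.
explicit = "a,,b".split(",")

# The second argument (maxsplit) caps the number of splits.
limited = "usr/local/bin/python".split("/", 1)

print(fields)    # ['a', 'b', 'c']
print(explicit)  # ['a', '', 'b']
print(limited)   # ['usr', 'local/bin/python']
```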

strip

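The strip method removes whitespace from the beginning and end of a string. You can also specify which characters to remove by passing them in a string argument. Only characters at the beginning or end are removed; lstrip and rstrip do the same on the left or right side only.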

>>> '*** SPAM * for * everyone!!! ***'.strip(' *!')
'SPAM * for * everyone'
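
For comparison, here is a sketch of lstrip and rstrip on the same sample text, along with strip called without an argument:

```python
text = "*** SPAM * for * everyone!!! ***"

left_only = text.lstrip(" *!")   # strip the left end only
right_only = text.rstrip(" *!")  # strip the right end only
plain = "  padded  ".strip()     # no argument: strip whitespace

print(left_only)   # SPAM * for * everyone!!! ***
print(right_only)  # *** SPAM * for * everyone
print(plain)       # padded
```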