I once saw a well-known programmer joke that programmers do only two things every day, and one of them is working with strings. I suspect many readers will feel the same way.
Almost every programming language treats the string as one of its most basic and indispensable data types, and concatenating strings is a necessary skill. Today, I'm going to look at seven ways to concatenate strings in Python.
1. The % operator, inherited from C
print('%s %s' % ('Hello', 'world'))
>>> Hello world
Formatting strings with the % sign is inherited from C, and many programming languages have a similar implementation. In the example above, %s is a placeholder that stands for a string; it is not part of the concatenated content. The actual values are placed in a tuple after a single % sign.
Similar placeholders include %d (an integer), %f (a floating-point number), and %x (an integer shown in hexadecimal). The placeholders are both the defining feature of this method and its limitation: each one has a specific meaning, which makes the method cumbersome in practice.
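As a brief illustration of the other placeholders, here is a minimal sketch. The variable names count, price, and color are made up for this example and do not come from the original.

```python
# Hypothetical values, chosen only to illustrate each placeholder type.
count = 3
price = 9.5
color = 255

# %d formats an integer, %.2f a float with two decimals, %x an integer in hex.
line = '%d items cost %.2f, color code %x' % (count, price, color)
print(line)  # 3 items cost 9.50, color code ff

# A placeholder that does not match the value's type raises TypeError.
try:
    '%d' % 'three'
except TypeError as exc:
    print('TypeError:', exc)
```

The type check in the last step is the "specific meaning" of each placeholder at work: %d accepts numbers only, while %s accepts anything that can be converted to a string.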
2. The format() method
# concise version
s1 = 'Hello {}! My name is {}.'.format('World', 'Python cat')
print(s1)
>>>Hello World! My name is Python cat.
# versions with matched placeholders
s2 = 'Hello {0}! My name is {1}.'.format('World', 'Python cat')
s3 = 'Hello {name1}! My name is {name2}.'.format(name1='World', name2='Python cat')
print(s2)
>>>Hello World! My name is Python cat.
print(s3)
>>>Hello World! My name is Python cat.
Here, curly braces {} serve as placeholders, and the actual values are passed as arguments to the format() method. It is easy to see that this is an improvement on % formatting. The method was introduced in Python 2.6.
In the example above, the concise version has nothing inside the curly braces, and its disadvantage is that it is easy to get the order wrong. The matched version comes in two main forms: one uses index numbers, and the other uses key-value pairs. In practice we prefer the latter, which is more intuitive and readable and removes the risk of getting the order wrong.
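A short sketch of why the matched forms are harder to get wrong. The sentences are invented for illustration; only the standard str.format() behavior is assumed.

```python
# Indexed placeholders can be reordered and reused.
s_indexed = '{1} says hello to {0}; {0} waves back.'.format('World', 'Python cat')
print(s_indexed)  # Python cat says hello to World; World waves back.

# Keyword placeholders are matched by name, so argument order does not matter.
s_named = 'Hello {name1}! My name is {name2}.'.format(name2='Python cat', name1='World')
print(s_named)  # Hello World! My name is Python cat.
```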
3. The tuple-like () form
s_tuple = ('Hello', ' ', 'world')
s_like_tuple = ('Hello' ' ' 'world')
print(s_tuple)
>>>('Hello', ' ', 'world')
print(s_like_tuple)
>>>Hello world
type(s_like_tuple)
>>>str
Note that s_like_tuple is not a tuple, because there are no commas between the elements; the elements may be separated by spaces or written with no space at all. type() shows that it is a str. The reason is that Python merges adjacent string literals into a single string when it compiles the code; the parentheses only group the expression.
This may look convenient, but the elements inside the parentheses must be actual string literals and cannot be mixed with variables, so the method is not flexible.
# Variables are not supported when there are multiple elements
str_1 = 'Hello'
str_2 = (str_1 'world')
>>> SyntaxError: invalid syntax
str_3 = (str_1 str_1)
>>> SyntaxError: invalid syntax
# but the following is not an error
str_4 = (str_1)
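For reference, Python joins adjacent string literals while compiling the source, and this form is used mostly to spread a long literal over several lines. A minimal sketch follows; the names message and same_message are made up for this example.

```python
# Adjacent string literals are merged into one string, even across lines.
message = ('Hello'
           ' '
           'world')
print(message)  # Hello world

# Parentheses are not required when the literals sit on one line.
same_message = 'Hello' ' ' 'world'
print(same_message == message)  # True

# As soon as a variable is involved, an explicit operator is needed.
str_1 = 'Hello'
str_2 = str_1 + ' world'
print(str_2)  # Hello world
```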
4. Object-oriented Template concatenation
from string import Template
s = Template('${s1} ${s2}!')
print(s.safe_substitute(s1='Hello',s2='world'))
>>> Hello world!
To be honest, I don't like this implementation; it reeks of object-oriented thinking.
I won't go into it further.
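For readers who do want to try it, here is a minimal sketch of how safe_substitute() differs from the stricter substitute(). Both are methods of string.Template in the standard library; the missing s2 value is a deliberate choice for this illustration.

```python
from string import Template

s = Template('${s1} ${s2}!')

# safe_substitute() leaves placeholders without a value untouched.
partial = s.safe_substitute(s1='Hello')
print(partial)  # Hello ${s2}!

# substitute() raises KeyError when a placeholder has no value.
try:
    s.substitute(s1='Hello')
except KeyError as exc:
    print('missing placeholder:', exc)
```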
5. The commonly used + operator
str_1 = 'Hello world! '
str_2 = 'My name is Python cat.'
print(str_1 + str_2)
>>>Hello world! My name is Python cat.
print(str_1)
>>>Hello world!
This approach is the most common, the most intuitive, and the easiest to understand; it is the entry-level implementation. But it also has two pitfalls.
First, beginners tend to make mistakes with it. They do not realize that strings are immutable: concatenation produces a new string that occupies its own new block of memory, while the original strings stay unchanged. In the example above, there are two strings before the concatenation and three afterward.
Second, some experienced programmers mistakenly believe that + is faster than the other methods when no more than three strings are concatenated (PS: many Python tutorials suggest this), but there is no sound basis for the claim.
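The first pitfall can be sketched as follows. The extra names original and str_3 are introduced only for this illustration.

```python
str_1 = 'Hello world! '
str_2 = 'My name is Python cat.'
original = str_1            # a second name for the same string object

str_3 = str_1 + str_2       # builds a third, separate string
print(repr(str_1))          # 'Hello world! '
print(repr(str_3))          # 'Hello world! My name is Python cat.'

# "+=" does not modify the string either; it rebinds str_1 to a new object.
str_1 += str_2
print(str_1 is original)    # False
print(repr(original))       # 'Hello world! '
```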
In fact, when short literals are concatenated, CPython's constant folding feature merges them into a single literal: 'a'+'b'+'c' becomes 'abc', and 'Hello '+'world' becomes 'Hello world'. The conversion happens at compile time, so no concatenation takes place at run time, which speeds up the overall computation.
Constant folding has a limit: the concatenated result must not be longer than 20 characters. Therefore, when the final string is no longer than 20 characters, concatenating literals with + is much faster than join(), no matter how many + operators are used. Note that this threshold is an implementation detail of CPython's optimizer and can differ between versions.
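One way to observe constant folding is to compile a statement and look at the constants stored in the resulting code object. This is a sketch that assumes CPython; other interpreters, or other CPython versions, may store different constants. The names folded, unfolded, and prefix are made up for this example.

```python
# Compile a statement made only of literals and inspect its constants.
folded = compile("s = 'a' + 'b' + 'c'", '<example>', 'exec')
print(folded.co_consts)    # on CPython, the tuple contains 'abc'

# With a variable on the left, the compiler cannot fold anything:
# (prefix + 'b') must be evaluated at run time before 'c' is added.
unfolded = compile("s = prefix + 'b' + 'c'", '<example>', 'exec')
print(unfolded.co_consts)  # only the separate literals, no 'bc'
```

Nothing is executed here; compile() only produces the code object, which is why the undefined name prefix causes no error.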
Off topic: does the number 20 sound familiar? That's right: the earlier article "What is the 'privileged race' in Python?" also mentioned that the privileged race of the string class is limited to 20 as well. That article also has an example showing the difference between compile time and run time, and I recommend going back to it.
6. The join() method
str_list = ['Hello', 'world']
str_join1 = ' '.join(str_list)
str_join2 = '-'.join(str_list)
print(str_join1)
>>>Hello world
print(str_join2)
>>>Hello-world
The str object has a built-in join() method, which accepts a sequence and concatenates its elements. Any elements that are not strings must be converted first. As you can see, this method works well for concatenating the elements of a sequence object (such as a list) with a uniform separator.
This method is generally preferred when the concatenated length exceeds 20. Its disadvantage is that it is not suited to piecing together elements one by one when they are not already in a sequence.
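The conversion of non-string elements mentioned above can be sketched like this. The list values is hypothetical sample data.

```python
values = ['Hello', 2024, 3.5]   # hypothetical mixed-type list

# join() raises TypeError if any element is not a string.
try:
    ' '.join(values)
except TypeError as exc:
    print('TypeError:', exc)

# Convert each element with str() first.
joined = ' '.join(str(v) for v in values)
print(joined)  # Hello 2024 3.5
```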
7. The f-string
name = 'world'
myname = 'python_cat'
words = f'Hello {name}. My name is {myname}.'
print(words)
>>> Hello world. My name is python_cat.
The f-string comes from PEP 498 (Literal String Interpolation) and was introduced in Python 3.6. The string is marked with an f prefix, and other string variables are wrapped in curly braces {}.
This approach beats format() in readability and is as fast as join() when concatenating long strings.
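To make the readability comparison concrete, here is a small sketch that builds the same sentence both ways. The last two lines go slightly beyond the article to show that the braces also accept expressions and format specifiers.

```python
name = 'world'
myname = 'python_cat'

# Same result, but the f-string keeps each value next to where it appears.
with_format = 'Hello {}. My name is {}.'.format(name, myname)
with_fstring = f'Hello {name}. My name is {myname}.'
print(with_format == with_fstring)  # True

# The braces accept expressions and format specifiers, not just names.
print(f'{name.upper()} has {len(name)} letters')  # WORLD has 5 letters
print(f'{3.14159:.2f}')                            # 3.14
```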
Still, this approach is less elegant than what some other programming languages offer, because it requires the f prefix. Some languages are more concise, such as the shell:
name="world"
myname="python_cat"
words="Hello ${name}. My name is ${myname}."
echo $words
>>>Hello world. My name is python_cat.
To sum up, the "string concatenation" discussed above is defined by its result. In terms of how they work, these methods fall into three classes:
Formatting class: %, format(), Template
Joining class: +, (), join()
Interpolation class: f-string
Use join() when dealing with sequence structures such as lists of strings. When the concatenated length is no more than 20, use the + operator. When the length is greater than 20, use f-strings on newer Python versions, and format() or join(), depending on the situation, on older versions.
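Since speed depends on the interpreter version and on the data, it is worth measuring on your own machine before relying on any of these rules. Below is a minimal timing sketch using the standard timeit module. The sample strings and the repetition count are arbitrary choices made for this example, and no particular ranking is implied.

```python
import timeit

a, b = 'Hello', 'world'   # hypothetical sample strings

# Each candidate builds the same result, 'Hello world', in a different way.
candidates = {
    '+':        lambda: a + ' ' + b,
    '%':        lambda: '%s %s' % (a, b),
    'format()': lambda: '{} {}'.format(a, b),
    'f-string': lambda: f'{a} {b}',
    'join()':   lambda: ' '.join((a, b)),
}

for label, func in candidates.items():
    seconds = timeit.timeit(func, number=100_000)
    print(f'{label:10s} {seconds:.4f}s')
```

Variables are used instead of literals on purpose, so that compile-time constant folding does not skew the comparison.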