If you search the web for Python regular expressions, you'll find a lot of junk articles with code like this:
import re
pattern = re.compile('Regular expression')
text = 'A string'
result = pattern.findall(text)
The authors of these articles may have picked up the habit from other languages, or they may have copied it from other junk articles without thinking.
In Python, you really don’t need to use re.compile!
To demonstrate this, let’s look at the Python source code.
Enter in PyCharm:
import re
re.search
Windows users then hold down the Ctrl key and click search, while Mac users hold down the Command key and click search, and PyCharm jumps to the source of Python's re module. There you'll see that the regular expression functions, whether findall, search, sub, or match, are all written in this form:
_compile(pattern, flags).<corresponding method>(string)
For example:
def findall(pattern, string, flags=0):
    """Return a list of all non-overlapping matches in the string.

    If one or more capturing groups are present in the pattern, return
    a list of groups; this will be a list of tuples if the pattern
    has more than one group.

    Empty matches are included in the result."""
    return _compile(pattern, flags).findall(string)
Then we look at compile:
def compile(pattern, flags=0):
    "Compile a regular expression pattern, returning a Pattern object."
    return _compile(pattern, flags)
See the problem?
All of these regular expression functions already compile the pattern for you!
There is no need to call re.compile before calling a regular expression function.
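One quick way to check this equivalence is to run both forms on the same input. This is only a minimal sketch; the pattern and the sample text are made up for illustration and do not come from the re source:

```python
import re

# Hypothetical pattern and text, chosen only for illustration.
pattern_text = r'\d+'
text = 'order 66, room 101, gate 7'

# Module-level call: re.findall obtains the compiled pattern internally.
direct = re.findall(pattern_text, text)

# Explicit compile first, then call the method on the Pattern object.
compiled = re.compile(pattern_text).findall(text)

print(direct)              # ['66', '101', '7']
print(direct == compiled)  # True
```

Both calls go through the same _compile function, so they return the same matches.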
At this point, one might retort:
If I have a million strings and use a regular expression to match them, I can write code like this:
import re
texts = ['A string'] * 1_000_000  # stand-in for a list of one million strings
pattern = re.compile('Regular expression')
for text in texts:
    pattern.search(text)
At this point, re.compile only executes once, whereas if you write code like this:
import re
texts = ['A string'] * 1_000_000  # stand-in for a list of one million strings
for text in texts:
    re.search('Regular expression', text)
This would seem to be equivalent to running re.compile on the same regular expression a million times under the hood.
Talk is cheap, show me the code.
re.compile calls _compile, so let's look at the source code of _compile.
It shows that _compile comes with a cache. It automatically stores up to 512 entries, each keyed by type(pattern), pattern, and flags. As long as the same regular expression and flags are used, the second call to _compile reads the compiled pattern directly from the cache.
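The cache can also be observed from the outside, without reading the source. The sketch below assumes CPython, where the cache returns the very same Pattern object for a repeated pattern and flags; this is an implementation detail rather than a documented guarantee, and the pattern itself is hypothetical:

```python
import re

re.purge()  # clear the regular expression cache so the demo starts clean

# Hypothetical pattern, for illustration only.
first = re.compile(r'[a-z]+\d')
second = re.compile(r'[a-z]+\d')

# Same pattern and same flags: the second call returns the cached object.
print(first is second)  # True on CPython

# Different flags give a different cache key, hence a different object.
third = re.compile(r'[a-z]+\d', re.IGNORECASE)
print(first is third)   # False
```

The module-level functions such as re.search go through the same lookup, so repeated calls with an unchanged pattern reuse the already compiled object.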
So please stop calling re.compile by hand, a bad habit you've inherited from other languages (yes, I'm talking about Java).
If this article was helpful to you, please follow my WeChat official account: WEIwei Code (ID: itskingName)