30 Days of Python, Day 19: Regular Expressions
Regular expression
Regular expressions (regex/regexp) are a powerful concept available in nearly every programming language, but they are often confusing and difficult for beginners to understand. A regular expression is a pattern of characters that makes searching strings very efficient. Regular expressions have a wide range of uses when working with text, such as searching, validating, or replacing it.
Today I’ll explore how to use regular expressions in Python.
A regular expression (shortened as regex or regexp, also referred to as a rational expression) is a sequence of characters that defines a search pattern. Such patterns are typically used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation. It is a technique developed in theoretical computer science and formal language theory. (Wikipedia)
I'm already familiar with regular expressions from writing JavaScript programs, and there is plenty of material about them on the web. My goal today is to look at the syntax and the ways to use regular expressions in Python, because that knowledge will be very useful later when building projects. So in this post I've collected some great articles on regular expressions and a few practical coding exercises for my own future reference. It may also be useful to other regex enthusiasts. You don't need to memorize every regular expression rule; just look it up when you need it. Patterns for most common tasks can be found online, so most of the time you don't need to write your own.
However, knowing how to read regular expression patterns is a very useful skill, because it lets us understand what a borrowed pattern actually does.
Here are some resources for regular expressions specific to Python:
- This is a Python RegEx cheatsheet with examples
- A web cheat-sheet
- Another compact web-based cheat-sheet
Regex101 is a great site for practicing and testing patterns, as well as for generating equivalent Python regular expression patterns.
Regular expression functions in Python
To use regular expressions in Python, you need to import the built-in re module. This module provides several functions for working with regular expressions.
function | description |
---|---|
re.search | Checks whether the pattern matches anywhere in the input string. Returns a re.Match object (or None), so the result can be used directly in a conditional expression. Raw strings (r-strings) are the preferred way to write patterns. Python also maintains a small cache of recently used regular expressions |
re.fullmatch | Checks that the pattern matches the entire input string |
re.compile | Compiles a pattern for reuse and returns a re.Pattern object |
re.sub | Search and replace |
re.sub(r'pat', f, s) | The replacement is a function f that takes a re.Match object as its argument |
re.escape | Automatically escapes all metacharacters |
re.split | Splits the string at each match and returns the pieces as a list. Text matched by capture groups is part of the output; text matched by the pattern outside the groups is not |
re.findall | Returns all matches as a list. If one capture group is used, only the text matched by that group is returned; with more than one capture group, each element is a tuple of the groups. Text matched by the pattern outside the groups is not in the output |
re.finditer | Returns an iterator that yields a re.Match object for each match |
re.subn | Returns a tuple of the modified string and the number of substitutions made |
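To make the table more concrete, here is a small sketch that exercises most of these functions. The sample strings and variable names are my own, chosen only for illustration; the comments show the values I would expect from each call.

```python
import re

text = "3 apples and 5 pears"

# re.search: first match anywhere in the string, or None
first = re.search(r"\d+", text)
first_number = first.group() if first else None        # '3'

# re.fullmatch: the whole string has to match
is_date = bool(re.fullmatch(r"\d{4}-\d{2}-\d{2}", "2020-05-04"))       # True
not_date = bool(re.fullmatch(r"\d{4}-\d{2}-\d{2}", "on 2020-05-04"))   # False

# re.sub with a function: the function receives each re.Match object
def double(match):
    return str(int(match.group()) * 2)

doubled = re.sub(r"\d+", double, text)                 # '6 apples and 10 pears'

# re.subn: (modified string, number of substitutions)
masked = re.subn(r"\d+", "N", text)                    # ('N apples and N pears', 2)

# re.split: grouped text is kept, ungrouped text is dropped
plain_split = re.split(r",\s*", "a, b,c")              # ['a', 'b', 'c']
group_split = re.split(r"(,)\s*", "a, b,c")            # ['a', ',', 'b', ',', 'c']

# re.findall: full matches, or tuples when there are several groups
dates = "2020-05-04 and 2021-11-30"
whole = re.findall(r"\d{4}-\d{2}-\d{2}", dates)        # ['2020-05-04', '2021-11-30']
pieces = re.findall(r"(\d{4})-(\d{2})-\d{2}", dates)   # [('2020', '05'), ('2021', '11')]

# re.finditer: one re.Match object per match, reduced here to its span
spans = [m.span() for m in re.finditer(r"\d+", text)]  # [(0, 1), (13, 14)]

# re.escape: makes a literal string safe to embed in a pattern
literal = re.escape("1+1")
found = re.search(literal, "so 1+1=2") is not None     # True
```

Note how the day part of each date is missing from pieces: it is matched by the pattern but sits outside the capture groups.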
Code practice
Let's write some code for a few real-world regex use cases that come up when building a Python application.
Password validation
Prompt the user for a password and validate it.
Conditions:
- Contains at least 8 characters
- Contains only letters, digits, and @$!%*?&
- Has at least one uppercase letter
- Has at least one lowercase letter
- Has at least one special character
- Has at least one digit
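```python
import re


def password_checker():
    password = input('Please enter a password: ')
    # The lookaheads require a lowercase letter, an uppercase letter, a digit
    # and a special character; the final character class restricts the allowed
    # characters and enforces a minimum length of 8.
    password_pattern = re.compile(
        r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$"
    )
    result = password_pattern.fullmatch(password)
    if result:
        print('Valid password')
    else:
        print('Invalid password')


password_checker()
```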
Note: The code above could be made much more interactive by checking each condition individually and displaying a separate error for every condition that fails. If the regular expression above seems confusing, try pasting it into Regex101, which breaks the pattern into chunks and explains each one.
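One possible way to check each condition individually is sketched below. The names PASSWORD_RULES, ALLOWED and password_errors are hypothetical helpers I made up for this illustration, and the error messages are my own wording; treat it as a starting point rather than a finished validator.

```python
import re

# Hypothetical rule table: (pattern that must be found, message if it is not)
PASSWORD_RULES = [
    (r".{8,}", "must contain at least 8 characters"),
    (r"[a-z]", "must contain a lowercase letter"),
    (r"[A-Z]", "must contain an uppercase letter"),
    (r"\d", "must contain a digit"),
    (r"[@$!%*?&]", "must contain one of @$!%*?&"),
]

# Only these characters are allowed anywhere in the password
ALLOWED = re.compile(r"[A-Za-z\d@$!%*?&]*")


def password_errors(password):
    """Return one message per failed condition (an empty list means valid)."""
    errors = [
        message
        for pattern, message in PASSWORD_RULES
        if not re.search(pattern, password)
    ]
    if not ALLOWED.fullmatch(password):
        errors.append("may only contain letters, digits and @$!%*?&")
    return errors


for candidate in ["Passw0rd!", "short"]:
    problems = password_errors(candidate)
    print(candidate, "->", problems or "Valid password")
```

Splitting the single pattern into small rules makes each regex easier to read, at the cost of scanning the password several times, which hardly matters for input this short.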
I prefer to use re.compile to store the regular expression pattern in a variable so that it can be reused later. It returns a compiled pattern object (re.Pattern).
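As a small sketch of that reuse, the compiled object has its own search, findall and similar methods, so the pattern only needs to be written once. The pattern and sample lines here are invented for the example.

```python
import re

# Compile once, then call methods on the resulting re.Pattern object
year = re.compile(r"\b(?:19|20)\d{2}\b")

lines = ["invoice 2019", "no year here", "renewed in 2024"]

found = []
for line in lines:
    match = year.search(line)  # same pattern reused for every line
    found.append(match.group() if match else None)

print(found)  # ['2019', None, '2024']
```

Since the re module also caches recently used patterns, the main benefit in small scripts is probably readability: the pattern gets a name and lives in one place.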
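The r before the regular expression string tells Python that this is a raw string. In a raw string, backslashes are not treated as string escape characters, so sequences such as \d or \b reach the re module unchanged and do not need to be doubled. Regex metacharacters themselves still have to be escaped when you want to match them literally.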
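The difference is easy to see with \b, which means "backspace" in a normal Python string but "word boundary" in a regular expression. The snippet below is a minimal illustration with a made-up sample string.

```python
import re

# In a normal string "\b" becomes a single backspace character;
# in a raw string it stays as a backslash followed by the letter b.
normal = "\bcat\b"
raw = r"\bcat\b"

lengths = (len(normal), len(raw))
print(lengths)  # (5, 7)

sample = "cat concat cat"
print(re.findall(normal, sample))  # [] because the text has no backspaces
print(re.findall(raw, sample))     # ['cat', 'cat'], whole words only
```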
Extract numbers from a string
This program extracts all the numbers from a string.
```python
import re

string = 'Python was introduced in 1992. This is year 2020.'
pattern = r'\d+'  # one or more digits
result = re.findall(pattern, string)
print(result)  # ['1992', '2020']
```
These are some basic examples of how to use regular expressions in Python.
Here are some great articles for a closer look at regular expressions in Python:
- www.programiz.com/python-prog…
- Realpython.com/regex-pytho…
- Github.com/ziishaned/l… (This is my own favorite)
That’s all for today. Tomorrow I’ll dive into testing techniques in Python. I’m looking forward to it.
The original link
30 Days of Python – Day 19 – Regular Expressions