Pathlib is more straightforward and Pythonic than the usual os.path. But it’s not just about simplifying things, it’s about something bigger
An overview of the
Pathlib is a Python built-in library defined by the Python documentation as object-oriented filesystem paths. Pathlib provides classes that represent file system paths, with semantics that are applicable to different operating systems. The path class is divided between pure paths, which provide pure computation without I/O, and concrete paths, which inherit from pure paths but also provide I/O operations.
Sound a little convoluted? That’s right, it’s a literal translation after all, but that doesn’t stop us from liking it. Let’s look at a couple of examples
Take a chestnut
The Python3 standard library pathlib module’s path is easier to manipulate paths than the OS module’s path method.
Gets the current file path
When using the OS module, there are two ways to directly obtain the current file path:
import os
value1 = os.path.dirname(__file__)
value2 = os.getcwd()
print(value1)
print(value2)
Copy the code
How does pathlib get the current file path?
The official documentation gives recommendations for eye-piercing teleportation
Give it a try
import pathlib
value1 = pathlib.Path.cwd()
print(value1)
Copy the code
How is it implemented
As documented, it returns the path in the form of os.getcwd(). Let’s take a look at the source code (CTRL + left mouse click on Pycharm to follow up on the specified object)
Return a new path pointing to the current working directory by pointing to a new path pointing to the current working directory
It doesn’t look like anything special, but why did the authorities go out of their way to introduce it?
Other packages
Pathlib encapsulates a number of OS paths, as documented, such as:
# Relationship description
os.path.expanduser() --> pathlib.Path.home()
os.path.expanduser() --> pathlib.Path.expanduser()
os.stat() --> pathlib.Path.stat()
os.chmod() --> pathlib.Path.chmod()
Copy the code
Official website document screenshot:
For details, please check the official document: Eye transfer
Just a few more chestnuts
This example doesn’t show anything, but it gives us an idea of what pathlib is made of, so let’s get a feel for how convenient it is.
Gets the upper/upper level directory
Which is to get the name of its grandfather
import os
print(os.path.dirname(os.path.dirname(os.getcwd())))
Copy the code
If implemented with pathlib:
import pathlib
print(pathlib.Path.cwd().parent.parent)
Copy the code
Parent is done. Is this closer to Pythonic? Write code like English.
If you only need to find the dad, use it once:
import pathlib
print(pathlib.Path.cwd().parent)
Copy the code
You can also go back to your grandparents:
import pathlib
print(pathlib.Path.cwd().parent.parent.parent)
Copy the code
Is using parent much more convenient than using the multilayer OS.path. dirname used by previous OS modules?
Path joining together
If you want to concatenate paths from its grandparents, you need to write this long string of code:
import os
print(os.path.join(os.path.dirname(os.path.dirname(os.getcwd())), "Attention"."Wechat Official Account"."[attacking]"."Coder 】"))
Copy the code
When you use Pathlib, you must feel happy:
import pathlib
parts = ["Attention"."Wechat Official Account"."[attacking]"."Coder 】"]
print(pathlib.Path.cwd().parent.parent.joinpath(*parts))
Copy the code
You can also adjust the number of parents by increasing or decreasing the number of parents.
PurePath
Much of this is done through Path in Pathlib, which also has a module called PurePath.
PurePath is a PurePath object that provides path-handling operations without actually accessing the file system. There are three ways to access these classes, which we also call Flavor.
The above quote comes from an official document, but it sounds a bit convoluted, so let’s look at it through chestnuts
PurePath.match
Let’s determine if the current file path has a file that matches the ‘*.py’ rule
import pathlib
print(pathlib.PurePath(__file__).match('*.py'))
Copy the code
Obviously, the coder. Py that we wrote the code to conform to the rules, so the output is True.
Why am I using this as an example? Further thinking that pathlib.PurePath can be followed by match means that it should be an object, not a path string. To test this idea, change the code:
import pathlib
import os
os_path = os.path.dirname(__file__)
pure_path = pathlib.PurePath(__file__)
print(os_path, type(os_path))
print(pure_path, type(pure_path))
print(pathlib.PurePath(__file__).match('*.py'))
Copy the code
PurePosixPath
coder.py
So this is a little bit of a mystery, what exactly is PurePosixPath?
PurePosixPath and PureWindowsPath, but don’t worry, These classes are not specified to run on some operating system. You can instantiate all of them regardless of which system you are running, because they do not provide any operations for making system calls.
Does not provide any operation to make a system call. What is that? I’m really getting deeper and deeper
The document begins with this description:
Pure paths are useful in some special cases; for example: If you want to manipulate Windows paths on a Unix machine (or vice versa). You cannot instantiate a WindowsPath when running on Unix, but you can instantiate PureWindowsPath. You want to make sure that your code only manipulates paths without actually accessing the OS. In this case, instantiating one of the pure classes may be useful since those simply don’t have any OS-accessing operations.
Pure paths are useful in some special cases; For example, if you want to operate Windows paths on a Unix computer (and vice versa). WindowsPath cannot be instantiated when running on Unix, but PureWindowsPath can be instantiated. You want to ensure that your code only manipulates paths and does not actually access the operating system. In this case, it might be useful to instantiate one of the pure classes, because those just don’t have any operating system access operations.
Along with a picture:
I don’t quite understand what it means. Never mind, keep reading.
Corresponding relations between
As you can see from the examples above, it not only encapsulates common methods related to OS.path, but also integrates other modules of the OS, such as creating folder path.mkdir.
If you’re worried about forgetting it, it’s okay. It’s always there. And the document lists the corresponding relationship table for us
Basic usage
Path.iterdir() # Iterate over subdirectories or files of a directory
Path.is_dir() # check if it is a directory
Path.glob() # Filter directory (return generator)
Path.resolve() # Return absolute Path
Path.exists() # Determine whether a Path exists
Path.open() # Open file (with support)
Path.unlink() # Delete file or directory
Basic attributes
Path.parts # Splits the Path like os.path.split(), but returns a tuple
Path.drive # Returns the drive name
Root # Returns the root directory of the Path
Path.anchor # Automatically returns drive or root
Path.parents # returns a list of all parent directories
Change the path
Path.with_name() # Change Path name, change last level pathname
Path.with_suffix() # Change the Path suffix
Stitching path
Path.joinpath() # Joinpath
Path.relative_to() # Computes relative paths
Testing path
Path.match() # Tests whether the Path matches pattern
Path.is_dir() # Is a file
Path.is_absolute() # Is an absolute Path
Path.is_reserved() # Whether the Path is reserved
Path.exists() # Determine whether a Path actually exists
Other methods
Path.cwd() # return the Path object of the current directory
Path.home() # Return the home Path object of the current user
Path.stat() # return Path information, same as os.stat()
Path.chmod() # change Path permissions similar to os.chmod()
Path.expanduser() # expand ~ return the full Path object
Path.mkdir() # create directory
Path.rename() # Rename a Path
Path.rglob() # Recursively traverses files in all subdirectories
Pathlib review
From the above examples, we should have a general idea of pathlib. Let’s review the official definition of the Pathlib library:
This module offers classes representing filesystem paths with semantics appropriate for different operating systems. Path classes are divided between pure paths, which provide purely computational operations without I/O, and concrete paths, which inherit from pure paths but also provide I/O operations.
Definition: Pathlib provides classes that represent file system paths, with semantics that are applicable to different operating systems. The path class is divided between pure paths, which provide pure computation without I/O, and concrete paths, which inherit from pure paths but also provide I/O operations.
Go back to the diagram and re-understand Pathlib
If you’ve never used this module before, or are just not sure which class is right for your task, Path is probably what you need. It instantiates a concrete path for the platform on which the code runs.
Summary: PathLib does not simply encapsulate modules or methods in the OS, but is designed to be compatible with different operating systems. It defines interfaces for each type of operating system. You want to manipulate Windows paths on a UNIX machine, but you can’t do that directly, so PurePath is a set of interfaces that you can use to do that (and vice versa)