1. Introduction
Python's I/O API is a pleasant surprise the first time you meet it, especially when you compare it with the I/O stream APIs of other languages.
From both the user's and the designer's point of view, it is hard to beat.
Many people nearly give up while learning the I/O stream API in Java. There are too many classes, the relationships between them are complex, and the class hierarchy takes a long time to work out. It can feel as if the API designers were showing off.
Life is short; I use Python.
Using the open() function as the single starting point for everything is a fine example of lightweight design and of the concept of "high cohesion". In use, it achieves a great deal with very little effort.
The parameter design of open() also shows how far the open-closed design idea can be taken.
2. The open() function
2.1 Function Prototype
def open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None): ...
2.2 Functions
Opens a file at the specified location and returns an IO stream object.
2.3 Function Parameters
Tip: open() may seem to take a lot of arguments, but in practice most of them can be left at their defaults, which provide a sensible way of working.
- The file parameter: specifies the file location. It can be a file path given as a string, or a file descriptor of type int.
Tip: A string path can be absolute or relative.
Absolute path: the path starts from an absolute location. Windows uses the drive letter as the absolute starting point, while Linux uses the root directory "/".
file = open("d:/guoke.txt")
To run this code, make sure you have a file named guoke.txt on drive D of your system.
Relative path: a relative path is resolved from an existing reference directory, the current working directory. When a script is run from an IDE, this is usually the project directory. You can use the getcwd() function in the os module to get the current reference directory.
import os
print(os.getcwd())
# The test project for this code is placed under d:\myc; project name: filedmeo
# Output:
# D:\myc\filedmeo
The following code requires that "guoke.txt" exists in the project directory.
file = open("guoke.txt")
# At execution time, the relative name is resolved to the full path D:\myc\filedmeo\guoke.txt
The reference directory is not fixed; it can be changed.
Change the reference directory of the relative path:
import os
# Make the root of drive D the current directory
os.chdir("d:/")
print(os.getcwd())
file = open("guoke.txt")
# Python now looks for guoke.txt in the root directory of drive D
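The way a relative path follows the current working directory can be sketched as below. This is only an illustration: the temporary directory and the file name demo.txt are assumptions made for the example, not part of the project described above.

```python
import os
import tempfile

original_cwd = os.getcwd()

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create a file inside the temporary directory (hypothetical name).
    with open(os.path.join(tmp_dir, "demo.txt"), "w", encoding="utf-8") as f:
        f.write("hello")

    try:
        # After chdir(), the relative name "demo.txt" resolves inside tmp_dir.
        os.chdir(tmp_dir)
        with open("demo.txt", encoding="utf-8") as f:
            content = f.read()
        resolved = os.path.abspath("demo.txt")
    finally:
        # Restore the previous working directory so later code is unaffected.
        os.chdir(original_cwd)

print(content)
print(os.path.basename(resolved))
```

Restoring the working directory in a finally block keeps the change local, which matters because chdir() affects the whole process.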
Descriptors: when a file is opened with open(), the operating system assigns it an integer identifier, the file descriptor, which is unique among the files the process currently has open. You can pass this descriptor as the file argument of open().
file = open("guo_ke.txt")
# fileno() returns the file descriptor
file1 = open(file.fileno())
Tip: Descriptors for user files start at 3. 0, 1, and 2 are reserved for the standard streams.
- 0: the standard input (keyboard) descriptor.
file = open(0)
print("Please enter a number:")
res = file.readline()
print("Echo.", res)
# Output:
# Please enter a number:
# 88
- 1: the standard output (display) descriptor.
file = open(1, "w")
file.write("you are welcome!")
# Similar in effect to print("you are welcome!")
- 2: the standard error (display) descriptor.
file = open(2, "w")
file.write("you are welcome!")
# The output text is highlighted in red
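Descriptors do not have to come from fileno(). As a rough sketch, the example below obtains one from os.open() and hands it to open(); the file name fd_demo.txt and the temporary directory are assumptions made for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "fd_demo.txt")  # hypothetical file name

    # os.open() returns a raw integer descriptor from the operating system.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT)
    fd_is_int = isinstance(fd, int)

    # open() accepts that integer in place of a path. Closing the file
    # object also closes the descriptor, because closefd defaults to True.
    with open(fd, "w", encoding="utf-8") as f:
        f.write("written through a descriptor")
        same_fd = f.fileno() == fd

    with open(path, encoding="utf-8") as f:
        text = f.read()

print(fd_is_int, same_fd, text)
```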
- Mode: specifies the file operation mode. The default value is 'r', which means read-only.
| Mode | Description | Exception or side effect |
| --- | --- | --- |
| 'r' | Open the file in read-only mode | Raises FileNotFoundError if the file does not exist |
| 'r+' | Open the file for reading and writing | Raises FileNotFoundError if the file does not exist |
| 'w' | Open the file for writing | Creates an empty (0-byte) file if the file does not exist |
| 'w+' | Open the file for writing and reading (clears the original contents) | Creates an empty (0-byte) file if the file does not exist |
| 'a' | Open the file for appending | Creates an empty (0-byte) file if the file does not exist |
| 'a+' | Open the file for appending and reading | Creates an empty (0-byte) file if the file does not exist |
| 't' | Open the file as a text file (the default) | |
| 'b' | Open the file in binary mode | |
| 'x' | Create an empty file and open it for writing | Raises FileExistsError if the file exists |
As long as 'r' appears in the mode combination, the file must already exist:
file = open("guo_ke.txt")
file = open("guo_ke.txt", 'r')
file = open("guo_ke.txt", 'rt')
file = open("guo_ke.txt", 'r+t')
As long as 'w' appears in the mode combination, the file does not have to exist; if it does, its original contents are cleared.
# Writable
file = open("guo_ke.txt", 'w')
# Writable and readable
file = open("guo_ke.txt", 'w+')
As long as 'a' appears in the mode combination, the file does not need to exist; if it does, its existing contents are not cleared.
# Append
file = open("guo_ke.txt", 'a')
# Append, and readable
file = open("guo_ke.txt", 'a+')
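The mode rules above can be checked with a small experiment. The sketch below works in a temporary directory with a hypothetical file name, modes.txt, so it does not touch the guo_ke.txt file used in the text.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes.txt")  # hypothetical file name

    # 'r' requires the file to exist.
    try:
        open(path, "r")
        missing_raises = False
    except FileNotFoundError:
        missing_raises = True

    # 'w' creates the file, and truncates it if it already exists.
    with open(path, "w", encoding="utf-8") as f:
        f.write("first")
    with open(path, "w", encoding="utf-8") as f:
        f.write("second")
    with open(path, encoding="utf-8") as f:
        after_w = f.read()

    # 'a' keeps the existing content and writes at the end.
    with open(path, "a", encoding="utf-8") as f:
        f.write("+third")
    with open(path, encoding="utf-8") as f:
        after_a = f.read()

    # 'x' refuses to overwrite an existing file.
    try:
        open(path, "x")
        exists_raises = False
    except FileExistsError:
        exists_raises = True

print(missing_raises, after_w, after_a, exists_raises)
```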
- Buffering: sets the buffering policy. The value can be 0, 1, or an integer greater than 1.
- 0: disables buffering (allowed only in binary mode).
- 1: selects line buffering (usable only in text mode).
Line buffering: data is buffered one line at a time.
- An integer > 1: specifies the size of the buffer, in bytes.
If no buffering argument is given, the default buffering policy is used:
- Binary files are buffered in fixed-size chunks.
On many systems, the buffer is typically 4096 or 8192 bytes long.
- "Interactive" text files (files for which isatty() returns True) use line buffering. Other text files use the same buffering policy as binary files.
The isatty() method checks whether a file is connected to a terminal device.
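One way to see the buffering policy in action is to look at the type of object open() returns. The following is a minimal sketch that assumes a temporary file named buffering.bin; the io class names are the ones CPython exposes.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "buffering.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"data")

    # buffering=0: no buffer layer, so open() returns a raw FileIO object.
    with open(path, "rb", buffering=0) as raw:
        raw_type = type(raw)

    # Default buffering in binary mode: a BufferedReader wraps the raw file.
    with open(path, "rb") as buffered:
        buffered_type = type(buffered)

    # buffering=1 in text mode turns on line buffering.
    with open(path, "w", buffering=1, encoding="utf-8") as text:
        line_buffered = text.line_buffering

    # Unbuffered text I/O is rejected with ValueError.
    try:
        open(path, "w", buffering=0)
        unbuffered_text_ok = True
    except ValueError:
        unbuffered_text_ok = False

print(raw_type.__name__, buffered_type.__name__, line_buffered, unbuffered_text_ok)
```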
- Encoding: specifies the name of the encoding used to decode or encode the file.
It can only be used with text files. The platform's default encoding is used when it is omitted.
- Errors: specifies how encoding and decoding errors are handled. The options include:
- 'strict': raises ValueError if there is an encoding error. The default value None has the same effect.
- 'ignore': ignores errors. Data loss may occur.
- 'replace': inserts a replacement marker (such as '?') where there is malformed data.
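The three error handlers can be compared on the same malformed input. The sketch below writes a byte that is never valid in UTF-8 and then reads the file back with each handler; the file name bad_utf8.txt is an assumption made for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "bad_utf8.txt")  # hypothetical file name
    # The byte 0xFF can never appear in valid UTF-8.
    with open(path, "wb") as f:
        f.write(b"caf\xff")

    # 'strict' raises UnicodeDecodeError, a subclass of ValueError.
    try:
        with open(path, encoding="utf-8", errors="strict") as f:
            f.read()
        strict_result = "no error"
    except UnicodeDecodeError:
        strict_result = "UnicodeDecodeError"

    # 'ignore' silently drops the bad byte.
    with open(path, encoding="utf-8", errors="ignore") as f:
        ignored = f.read()

    # 'replace' substitutes U+FFFD when decoding ('?' is used when encoding).
    with open(path, encoding="utf-8", errors="replace") as f:
        replaced = f.read()

print(strict_result, repr(ignored), repr(replaced))
```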
- Newline: controls how line endings are handled when reading or writing text. The value can be None, '', '\n', '\r', or '\r\n'.
The line ending differs between operating systems: '\n' on Unix, '\r\n' on Windows.
- When reading from the stream, if newline is None, universal newlines mode is enabled: lines ending in '\n', '\r', or '\r\n' are all recognized and translated to '\n'.
- When writing to the stream, if newline is None, any '\n' characters written are converted to the system default line separator. If newline is '' or '\n', no translation takes place. If newline is any other legal value, any '\n' characters written are converted to the given string.
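The newline rules are easiest to follow by inspecting the raw bytes. This is a small sketch under the assumption of a temporary file named newlines.txt; it forces Windows-style endings on write so the result is the same on every platform.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "newlines.txt")  # hypothetical file name

    # On writing, newline="\r\n" converts every "\n" into "\r\n".
    with open(path, "w", encoding="utf-8", newline="\r\n") as f:
        f.write("a\nb\n")
    with open(path, "rb") as f:
        raw_bytes = f.read()

    # On reading, newline=None (the default) translates line endings to "\n".
    with open(path, encoding="utf-8", newline=None) as f:
        translated = f.read()

    # newline="" recognizes all line endings but returns them untranslated.
    with open(path, encoding="utf-8", newline="") as f:
        untranslated = f.read()

print(raw_bytes, repr(translated), repr(untranslated))
```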
- Closefd: controls whether the underlying file descriptor is closed when the file object is closed.
file = open("guo_ke.txt", closefd=False)
'''
Traceback (most recent call last):
  File "D:/myc/filedmeo/ filedmeo ", line 1, in <module>
    file = open("guo_ke.txt",closefd=False)
ValueError: Cannot use closefd=False with file name
'''
If the file is opened by a string path, closefd must be True (the default), or an error is raised.
file = open("guo_ke.txt")
# Open the same file through its file descriptor
file1 = open(file.fileno(), closefd=False)
file1.close()
print("First file closed:", file.closed)
print("Second file closed:", file1.closed)
'''
Output:
First file closed: False
Second file closed: True
'''
When file1 is opened with closefd=False, file remains open after file1 is closed.
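The same behavior can be observed at the descriptor level. In the sketch below, which assumes a temporary file named closefd.txt, the descriptor comes from os.open() and stays usable after the wrapping file object is closed.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "closefd.txt")  # hypothetical file name
    fd = os.open(path, os.O_RDWR | os.O_CREAT)

    # closefd=False: the file object borrows the descriptor without owning it.
    wrapper = open(fd, "w", encoding="utf-8", closefd=False)
    wrapper.write("still open")
    wrapper.close()  # flushes the data but leaves fd open
    wrapper_closed = wrapper.closed

    # The descriptor is still valid, so it can be queried and must be
    # closed explicitly by whoever owns it.
    size_after_close = os.fstat(fd).st_size
    os.close(fd)

print(wrapper_closed, size_after_close)
```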
- Opener: open() can be understood as a high-level wrapper. Underneath, it obtains the file through a low-level interface with real file-operation capability, and the opener parameter lets you supply that interface yourself.
import os

def opener(path, flags):
    return os.open(path, flags)

# open() calls opener(path, flags): path is the first argument of open(),
# and flags is the integer flag set derived from the mode
with open('guo_ke.txt', 'r', opener=opener) as f:
    print(f.read())
The default value of opener is None, which behaves much like passing os.open.
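To make the opener contract more concrete, the sketch below uses a hypothetical opener named logging_opener that records the flags it receives before delegating to os.open(). The function name, the 0o600 permission value, and the file name are all assumptions made for the illustration.

```python
import os
import tempfile

seen_flags = []

def logging_opener(path, flags):
    """Hypothetical opener: record the flags, then defer to os.open()."""
    seen_flags.append(flags)
    # 0o600 limits a newly created file to its owner where the OS supports it.
    return os.open(path, flags, 0o600)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "opener.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8", opener=logging_opener) as f:
        f.write("via opener")
    with open(path, encoding="utf-8") as f:
        text = f.read()

flags = seen_flags[0]
# Mode "w" is translated into integer flags that include O_CREAT and O_TRUNC.
print(isinstance(flags, int), bool(flags & os.O_CREAT), bool(flags & os.O_TRUNC))
```

The opener must return an open file descriptor, which open() then wraps exactly as if it had opened the file itself.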
3. Read/write operations
The open() function returns an IO stream object. IO stream objects provide general read and write related properties and methods.
class IO(Generic[AnyStr]):
    # Return the read/write mode of the file
    @abstractproperty
    def mode(self) -> str:
        pass
    # Return the name of the file
    @abstractproperty
    def name(self) -> str:
        pass
    # Close the file
    @abstractmethod
    def close(self) -> None:
        pass
    # Check whether the file is closed
    @abstractproperty
    def closed(self) -> bool:
        pass
    # Return the numeric file descriptor assigned when the file was opened
    @abstractmethod
    def fileno(self) -> int:
        pass
    # Flush the contents of the buffer
    @abstractmethod
    def flush(self) -> None:
        pass
    # Whether the file is connected to a terminal device
    @abstractmethod
    def isatty(self) -> bool:
        pass
    # If n is -1 or omitted, read the whole file at once; for a large file, pass n and read it in several calls
    # Returns an empty string ('') at the end of the file
    @abstractmethod
    def read(self, n: int = -1) -> AnyStr:
        pass
    # Whether the file is readable
    @abstractmethod
    def readable(self) -> bool:
        pass
    # Read one line from the file; the newline character (\n) is kept at the end of the string
    # An empty string means the end of the file has been reached
    # An empty line is returned as '\n'
    @abstractmethod
    def readline(self, limit: int = -1) -> AnyStr:
        pass
    # Read all lines and store them in a list
    # You can also use list(f)
    @abstractmethod
    def readlines(self, hint: int = -1) -> List[AnyStr]:
        pass
    # Move the read/write cursor to change the read/write position of the file
    # The position is calculated by adding offset to a reference point, which is specified by the whence parameter
    # whence = 0 counts from the beginning of the file, 1 uses the current file position, and 2 uses the end of the file
    # whence defaults to 0 if omitted, using the beginning of the file as the reference point
    @abstractmethod
    def seek(self, offset: int, whence: int = 0) -> int:
        pass
    # Whether the cursor can be moved
    @abstractmethod
    def seekable(self) -> bool:
        pass
    # Return the current position in the file
    @abstractmethod
    def tell(self) -> int:
        pass
    # Truncate the file to size bytes (to the current position if size is omitted)
    @abstractmethod
    def truncate(self, size: int = None) -> int:
        pass
    # Whether the file is writable
    @abstractmethod
    def writable(self) -> bool:
        pass
    # Write to the file
    @abstractmethod
    def write(self, s: AnyStr) -> int:
        pass
    # Write a list of lines to the file (no line separators are added)
    @abstractmethod
    def writelines(self, lines: List[AnyStr]) -> None:
        pass
Calling open() in text mode returns a TextIO object, which has several more properties, specific to text operations, than its parent class.
class TextIO(IO[str]):
    # The underlying binary buffer
    @abstractproperty
    def buffer(self) -> BinaryIO:
        pass
    # The encoding in use
    @abstractproperty
    def encoding(self) -> str:
        pass
    # The encoding/decoding error handling scheme
    @abstractproperty
    def errors(self) -> Optional[str]:
        pass
    # Whether line buffering is enabled
    @abstractproperty
    def line_buffering(self) -> bool:
        pass
    # The line endings encountered so far
    @abstractproperty
    def newlines(self) -> Any:
        pass
3.1 Text File Reading Operations
- Basic operation
file = open("guo_ke.txt", mode='r')
print("Read/write mode:", file.mode)
print("File name:", file.name)
print("File closed:", file.closed)
print("File descriptor:", file.fileno())
print("Is the file readable:", file.readable())
print("Is connected to a terminal:", file.isatty())
print("Is the file writable:", file.writable())
print("Underlying buffer:", file.buffer)
print("Default file encoding:", file.encoding)
print("Encoding error handling scheme:", file.errors)
print("Is line buffering set:", file.line_buffering)
print("Newlines seen so far:", file.newlines)
'''
Output:
Read/write mode: r
File name: guo_ke.txt
File closed: False
File descriptor: 3
Is the file readable: True
Is connected to a terminal: False
Is the file writable: False
Underlying buffer: <_io.BufferedReader name='guo_ke.txt'>
Default file encoding: cp936
Encoding error handling scheme: strict
Is line buffering set: False
Newlines seen so far: None
'''
cp936 refers to Windows code page 936, that is, GBK encoding.
- A variety of reading methods:
When reading or writing, you need to understand the concept of the file pointer (cursor), also known as the file position. Each read or write starts at the current position and moves it forward.
Prepare a text file in advance and write the following content in the file
You hide in my heart deeply.
Happiness! There is only you and I together time...
With you just I don't want to give anyone the chance.
Honey, can you marry me, I'll marry you!
Don't know love you count is a close reason?
- Use of the read() method
file = open("guo_ke.txt", "r")
print("---------- Read all contents --------------")
res = file.read()
print(res)
print("---------- Read part of the content --------------")
# Return to the start of the file
file.seek(0)
res = file.read(100)
print(res)
# Release the file resource
file.close()
'''
Output:
---------- Read all contents --------------
You hide in my heart deeply.
Happiness! There is only you and I together time...
With you just I don't want to give anyone the chance.
Honey, can you marry me, I'll marry you!
Don't know love you count is a close reason?
---------- Read part of the content --------------
You hide in my heart deeply.
Happiness! There is only you and I together time...
With you just I don
'''
Here’s one detail to note:
After the whole file has been read the first time, the read position is at the end of the file, so a further read() returns no data.
To read again, move the cursor back to the start of the file with the seek() method.
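The interplay of read(), tell(), and seek() can be sketched as follows. The example uses binary mode and a hypothetical ten-byte file, seek.txt, because text files only allow seeking relative to the start of the file (apart from seeking straight to the end).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "seek.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"0123456789")

    with open(path, "rb") as f:
        first = f.read()          # reads everything; the position is now at the end
        second = f.read()         # nothing left, so the result is empty
        end_pos = f.tell()

        f.seek(0)                 # whence=0: offset from the start
        head = f.read(3)

        f.seek(2, os.SEEK_CUR)    # whence=1: offset from the current position
        middle = f.read(2)

        f.seek(-3, os.SEEK_END)   # whence=2: offset from the end
        tail = f.read()

print(first, second, end_pos, head, middle, tail)
```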
- Use of the readline() method
file = open("guo_ke.txt", "r")
print("--------- Read a line --------")
res = file.readline()
print("Data length:", len(res))
print(res)
print("----------- Limit the content read -------------")
res = file.readline(10)
print("Data length:", len(res))
print(res)
print("----------- Read all data line by line -------------")
# Return to the start of the file
file.seek(0)
while True:
    res = file.readline()
    print(res)
    if res == "":
        break
file.close()
'''
Output:
--------- Read a line --------
Data length: 29
You hide in my heart deeply.

----------- Limit the content read -------------
Data length: 10
Happiness!
----------- Read all data line by line -------------
You hide in my heart deeply.

Happiness! There is only you and I together time...

With you just I don't want to give anyone the chance.

Honey, can you marry me, I'll marry you!

Don't know love you count is a close reason?

'''
When you read everything line by line, a blank line appears between the lines of output. The reason is that each line still ends with '\n', and print() then adds a newline of its own.
- readline() also has a sibling, readlines(), which reads all lines at once and stores them in a list.
file = open("guo_ke.txt", "r")
print("----------- Store the file's lines in a list ---------")
res = file.readlines()
print(res)
file.close()
'''
Output:
----------- Store the file's lines in a list ---------
['You hide in my heart deeply.\n', 'Happiness! There is only you and I together time...\n', "With you just I don't want to give anyone the chance.\n", "Honey, can you marry me, I'll marry you!\n", "Don't know love you count is a close reason?"]
'''
Keep the trailing newline characters in mind when using the data.
list(f) can also be used to read all lines.
file = open("guo_ke.txt", "r")
print(list(file))
- File objects support line-by-line iteration.
file = open("guo_ke.txt", "r")
print("----------- Output file contents iteratively ---------")
for f in file:
    print(f)
file.close()
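Because each line keeps its trailing newline, a common way to avoid the extra blank lines is to strip it while iterating. The sketch below assumes a small temporary file, lines.txt, created just for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "lines.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("first line\nsecond line\nthird line")

    with open(path, encoding="utf-8") as f:
        # Each iteration yields one line, including its trailing "\n";
        # rstrip("\n") removes it so print() does not double the line break.
        stripped = [line.rstrip("\n") for line in f]

for line in stripped:
    print(line)
```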
3.2 Text File Write Operations
If you write data in 'w' mode, the original data is lost. If you don't want this to happen, use 'a' mode for writing.
file = open("guo_ke_0.txt", "w")
file.write("this is a test")
# Add a line break
file.write("\n")
file.write("who are you?")
# Write the list data to the file in one call
lst = ["food\n", "fish\n", "cat\n"]
file.write("\n")
file.writelines(lst)
file.close()
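Two details of the write methods are easy to miss: write() returns the number of characters written, and writelines() adds no separators of its own. A minimal sketch, assuming a temporary file named write.txt:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "write.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        count = f.write("header\n")        # returns the number of characters written
        f.writelines(["a", "b", "c"])      # no separators are added: writes "abc"
        f.write("\n")
        # Any iterable of strings works; add "\n" yourself where needed.
        f.writelines(f"{item}\n" for item in ["x", "y"])

    with open(path, encoding="utf-8") as f:
        written = f.read()

print(count, repr(written))
```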
3.3 Coding problems
When reading and writing the same file, make sure the same encoding is used for both.
The following code raises UnicodeDecodeError. The text written must contain non-ASCII characters for this to happen, because ASCII-only text has the same bytes in UTF-8 and GBK.
file = open("guo_ke_1.txt", mode="w", encoding="utf-8")
# The string written here needs non-ASCII characters (for example Chinese text) to trigger the error
file.write("Hello! The nut...")
file.close()
file_ = open("guo_ke_1.txt", mode="r", encoding="gbk")
res = file_.read()
print(res)
'''
Traceback (most recent call last):
  File "D:/myc/filedmeo/ garble problem.py ", line 6, in <module>
    res = file_.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 16: illegal multibyte sequence
'''
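A mismatch does not always raise an error; sometimes it silently produces garbled text. The sketch below illustrates both outcomes with a one-word example of my own choosing ("café") and the ascii and latin-1 codecs, rather than the GBK case above.

```python
import os
import tempfile

text = "caf\u00e9"  # "café": the last character is non-ASCII

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "encoding.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Same encoding on both sides: the text round-trips unchanged.
    with open(path, encoding="utf-8") as f:
        same = f.read()

    # A codec that cannot represent the bytes raises UnicodeDecodeError.
    try:
        with open(path, encoding="ascii") as f:
            f.read()
        ascii_result = "no error"
    except UnicodeDecodeError:
        ascii_result = "UnicodeDecodeError"

    # A codec that accepts every byte decodes without error but garbles the text.
    with open(path, encoding="latin-1") as f:
        mojibake = f.read()

print(same == text, ascii_result, mojibake)
```

Passing encoding explicitly on every open() call is the simplest way to keep the two sides consistent, since the platform default differs between systems.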
3.4 Binary file operations
Calling open() with mode='rb' returns a BinaryIO object. This object provides reading and writing for binary files, and its use is not much different from that of text files.
Switching between text and binary file operations takes a change to a single parameter.
class BinaryIO(IO[bytes]):
    @abstractmethod
    def write(self, s: Union[bytes, bytearray]) -> int:
        pass
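In practice the main difference is the data type: binary files work with bytes rather than str. A short sketch, assuming a temporary file named data.bin:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.bin")  # hypothetical file name

    with open(path, "wb") as f:
        f.write(b"\x00\x01")            # bytes
        f.write(bytearray([2, 3]))      # bytearray is accepted as well

    with open(path, "rb") as f:
        payload = f.read()              # returns a bytes object, not str

    # Writing a str to a binary file is rejected with TypeError.
    try:
        with open(path, "ab") as f:
            f.write("text")
        str_write_ok = True
    except TypeError:
        str_write_ok = False

print(payload, type(payload).__name__, str_write_ok)
```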
4. Summary
The open() function is remarkably versatile. It works well for both text and binary files, and for both reading and writing. You have to admire how simple the Python designers kept it.
Opening a file through a file descriptor and customizing the underlying implementation with the opener parameter are particularly neat.
In addition to calling the close() method directly, you can use the with statement, which calls close() automatically.
with open("guo_ke.txt") as f:
    pass
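The with statement closes the file even when the body fails part-way through. The sketch below simulates such a failure with a deliberately raised RuntimeError; the file name and the exception are assumptions made for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "with.txt")  # hypothetical file name
    try:
        with open(path, "w", encoding="utf-8") as f:
            f.write("partial")
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass

    # The file object was closed on the way out of the with block.
    closed_after_error = f.closed

    # Closing flushed the buffer, so the data written before the error is on disk.
    with open(path, encoding="utf-8") as check:
        saved = check.read()

print(closed_after_error, saved)
```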