1. Introduction

Python's I/O API is a pleasant surprise the first time you use it, especially compared with the I/O stream APIs of other languages.

From both the user's point of view and the designer's, it is hard to beat.

Many people nearly give up while learning the I/O stream API in Java. There are too many classes, the relationships between them are complicated, and the class hierarchy takes a long time to work out. It can feel as if the API designers were showing off.

Life is short, so I learn Python.

Everything starts from the open() function, which makes common tasks quick and easy. It is a good example of lightweight design and of the "high cohesion" principle. Using it feels effortless, like the proverb about moving a thousand pounds with four ounces of force.

The parameter design of open() also shows the open-closed design idea taken to its extreme.

2. The open() function

2.1 Function Prototype

def open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None): ...

2.2 Purpose

Opens a file at the specified location and returns an IO stream object.

2.3 Function Parameters

Tip: open() may seem to take a lot of arguments, but in practice most of them can be left at their defaults, which provide a sensible working configuration.

  • file: specifies the file location. It can be a file path given as a string, or a file descriptor of type int.

    Tip: When a string is used, it can be an absolute path or a relative path.

    Absolute path: the path starts from an absolute location. Windows uses the logical drive letter as the absolute starting point, while Linux uses the root directory "/".

    file=open("d:/guoke.txt")

    To run this code, make sure you have a file named guoke.txt on drive D of your system.

    Relative path: a relative path starts from an existing reference directory (the current working directory). When a script is run from an IDE, this is usually the project directory. You can call the getcwd() function in the os module to find out the current reference directory.

    import os
    print(os.getcwd())
    # The test project for this code is placed under d:\myc; project name: filedmeo
    # Output:
    # D:\myc\filedmeo

    The following code needs to ensure that “guoke.txt” exists in the project directory

    file = open("guoke.txt")
    # At execution, the relative path is resolved to the full path D:\myc\filedmeo\guoke.txt

    The reference directory is not fixed; it can be changed.

    Change the reference directory of the relative path:

    import os
    # Make the root of drive D the current directory
    os.chdir("d:/")
    print(os.getcwd())
    file = open("guoke.txt")
    # Python now looks for guoke.txt in the root directory of drive D
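
The way a relative path follows the current working directory can be sketched as below. This is a minimal illustration, assuming a throwaway temporary directory (the names tmp_dir, resolved and found_in_tmp are hypothetical and not part of the original example) so that no real files are touched.

```python
import os
import tempfile

original_cwd = os.getcwd()

# Hypothetical scratch directory, used only for this illustration.
with tempfile.TemporaryDirectory() as tmp_dir:
    os.chdir(tmp_dir)  # tmp_dir becomes the reference directory
    try:
        # A relative name is resolved against the current working directory.
        with open("guoke.txt", "w") as f:
            f.write("hello")
        resolved = os.path.abspath("guoke.txt")
        found_in_tmp = os.path.exists(os.path.join(os.getcwd(), "guoke.txt"))
    finally:
        # Restore the previous directory before the temporary one is removed.
        os.chdir(original_cwd)

print(resolved)
print(found_in_tmp)
```

Restoring the working directory in a finally clause keeps the change from leaking into the rest of the program.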

    Descriptors: when you open a file with the open() function, the operating system assigns the open file a unique numeric identifier, the file descriptor. You can pass this descriptor as the file argument of open().

    file = open("guo_ke.txt")
    # fileno() gets the file descriptor
    file1 = open(file.fileno())

    Tip: Descriptors for user files usually start at 3. Descriptors 0, 1, and 2 are reserved by the system for the standard streams.

    • 0: the standard input (keyboard) descriptor.
    file = open(0)
    print("Please enter a number:")
    res = file.readline()
    print("Echo:", res)
    # Sample run: after the prompt "Please enter a number:", the user types 88
    • 1: the standard output (display) descriptor.
    file = open(1, "w")
    file.write("you are welcome!")
    # Has the same effect as print("you are welcome!")
    • 2: the standard error (display) descriptor.
    file = open(2, "w")
    file.write("you are welcome!")
    # The output text is highlighted in red
  • mode: indicates the file operation mode. The default value is 'r', meaning read-only mode.

    Mode  | Description                                                  | Notes
    'r'   | Open the file for reading only                               | Raises FileNotFoundError if the file does not exist
    'r+'  | Open the file for reading and writing                        | Raises FileNotFoundError if the file does not exist
    'w'   | Open the file for writing (clears the original contents)     | Creates an empty 0-byte file if the file does not exist
    'w+'  | Open the file for writing and reading (clears the original contents) | Creates an empty 0-byte file if the file does not exist
    'a'   | Open the file for appending                                  | Creates an empty 0-byte file if the file does not exist
    'a+'  | Open the file for appending and reading                      | Creates an empty 0-byte file if the file does not exist
    't'   | Open the file as a text file                                 | The default
    'b'   | Open the file in binary format                               |
    'x'   | Create a new empty file and open it for writing              | Raises FileExistsError if the file already exists

    As long as 'r' appears in the mode string, the file must already exist:

    file = open("guo_ke.txt")
    file = open("guo_ke.txt", 'r')
    file = open("guo_ke.txt", 'rt')
    file = open("guo_ke.txt", 'r+t')

    As long as 'w' appears in the mode string, the file does not have to exist; if it does, the contents of the original file are cleared.

    # Writable
    file = open("guo_ke.txt", 'w')
    # Writable and readable
    file = open("guo_ke.txt", 'w+')

    As long as 'a' appears in the mode string, the file does not need to exist; if it does, the contents of the original file are kept and new data is added at the end.

    # Append
    file = open("guo_ke.txt", 'a')
    # Append, and readable
    file = open("guo_ke.txt", 'a+')
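
The mode rules above can be checked with a short sketch. It assumes a temporary directory (tmp_dir and the result variable names are hypothetical) and only uses the modes described in the table.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "guo_ke.txt")

    # 'r' requires the file to exist.
    try:
        open(path, "r")
        missing_error = None
    except FileNotFoundError as exc:
        missing_error = type(exc).__name__

    with open(path, "w") as f:   # 'w' creates the file
        f.write("first\n")
    with open(path, "a") as f:   # 'a' keeps "first" and adds "second"
        f.write("second\n")
    with open(path) as f:
        after_append = f.read()

    with open(path, "w") as f:   # 'w' clears the existing contents
        f.write("third\n")
    with open(path) as f:
        after_truncate = f.read()

    # 'x' refuses to overwrite an existing file.
    try:
        open(path, "x")
        exists_error = None
    except FileExistsError as exc:
        exists_error = type(exc).__name__

print(missing_error, exists_error)
print(repr(after_append), repr(after_truncate))
```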
  • buffering: sets the buffering policy. The value can be 0, 1, or an integer greater than 1.

    • 0: disables buffering (allowed only in binary mode).

    • 1: uses line buffering (only usable in text mode).

      Line buffering: data is buffered one line at a time.

    • Integers >1: specify the size of the buffer, in bytes.

    If no buffering parameter is specified, the default buffering policy is provided:

    • Binary files are buffered in fixed-size chunks.

      On many systems, the buffer is typically 4096 or 8192 bytes long.

    • "Interactive" text files (files for which isatty() returns True) use line buffering. Other text files use the same buffering policy as binary files.

      The isatty() method checks whether a file is connected to a terminal device.
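
A small sketch of how the buffering argument changes the object that open() returns. It assumes a temporary file (tmp_dir and the result variable names are hypothetical); the class name printed is what CPython's io module reports.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "guo_ke.txt")

    # buffering=0 is only allowed in binary mode and yields a raw, unbuffered stream.
    with open(path, "wb", buffering=0) as raw:
        raw_type = type(raw).__name__
        raw.write(b"abc")

    # In text mode, buffering=0 is rejected.
    try:
        open(path, "w", buffering=0)
        text_unbuffered_error = None
    except ValueError as exc:
        text_unbuffered_error = type(exc).__name__

    # buffering=1 selects line buffering in text mode.
    with open(path, "w", buffering=1) as line_buffered:
        is_line_buffered = line_buffered.line_buffering

    # Default policy: a regular (non-terminal) text file is not line buffered.
    with open(path, "w") as default_buffered:
        default_line_buffering = default_buffered.line_buffering

print(raw_type, text_unbuffered_error, is_line_buffered, default_line_buffering)
```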

  • encoding: specifies the name of the encoding used to decode or encode the file.

    It can only be used for text files. The platform's default encoding is used when it is not given.

  • errors: specifies how errors raised during encoding and decoding are handled. The options include:

    • 'strict': raises ValueError if there is an encoding error. The default value None has the same effect.
    • 'ignore': ignores errors. Data loss may occur.
    • 'replace': inserts a replacement marker (such as '?') where there is bad data.
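
The three error handlers can be compared on a file that contains one invalid byte. This is a minimal sketch under the assumption that the file is decoded as UTF-8; the file name bad.txt and the variable names are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "bad.txt")
    with open(path, "wb") as f:
        f.write(b"ab\xffcd")  # 0xFF is never valid in UTF-8

    # 'strict' raises UnicodeDecodeError, a subclass of ValueError.
    try:
        with open(path, encoding="utf-8", errors="strict") as f:
            f.read()
        strict_result = "no error"
    except UnicodeDecodeError:
        strict_result = "UnicodeDecodeError"

    # 'ignore' silently drops the bad byte.
    with open(path, encoding="utf-8", errors="ignore") as f:
        ignored = f.read()

    # 'replace' substitutes U+FFFD (the replacement character) when decoding.
    with open(path, encoding="utf-8", errors="replace") as f:
        replaced = f.read()

print(strict_result, repr(ignored), repr(replaced))
```
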
  • newline: controls how line endings are handled when reading or writing text. The value can be None, '', '\n', '\r', or '\r\n'.

    The line-ending convention varies by operating system: lines end with '\n' on Unix and with '\r\n' on Windows.

    • When reading from the stream, if newline is None, universal newlines mode is enabled: lines ending in '\n', '\r', or '\r\n' are all translated to '\n' before being returned.
    • When writing to the stream, if newline is None, any '\n' characters written are converted to the system default line separator. If newline is '' or '\n', no translation takes place. If newline is any other legal value, any '\n' characters written are converted to the given string.
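
The newline translation rules can be observed by writing with one setting and reading the raw bytes back. A minimal sketch, assuming a temporary file (lines.txt and the variable names are hypothetical):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "lines.txt")

    # On write, each '\n' is translated to the given newline string.
    with open(path, "w", newline="\r\n") as f:
        f.write("a\nb\n")
    with open(path, "rb") as f:
        raw_bytes = f.read()

    # newline=None on read: '\r\n' comes back as '\n'.
    with open(path, "r", newline=None) as f:
        translated = f.read()

    # newline='' on read: line endings are returned untranslated.
    with open(path, "r", newline="") as f:
        untranslated = f.read()

print(raw_bytes, repr(translated), repr(untranslated))
```
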
  • closefd: controls whether the underlying file descriptor is closed when the file object is closed.

    file = open("guo_ke.txt", closefd=False)
    # Output:
    # Traceback (most recent call last):
    #   File "D:/myc/filedmeo/ filedmeo ", line 1, in <module>
    #     file = open("guo_ke.txt", closefd=False)
    # ValueError: Cannot use closefd=False with file name

    If the file is opened through a string path, closefd must be True (the default); otherwise an error is raised.

    file = open("guo_ke.txt")
    # Open the file again through its file descriptor
    file1 = open(file.fileno(), closefd=False)
    file1.close()
    print("File opened first is closed:", file.closed)
    print("File opened second is closed:", file1.closed)
    # Output:
    # File opened first is closed: False
    # File opened second is closed: True

    Because file1 was opened with closefd=False, closing file1 leaves the underlying descriptor open, so file remains open.

  • opener: open() can be seen as a high-level wrapper. Underneath, it obtains a file descriptor from a low-level interface, and the opener parameter lets you supply that interface yourself. The opener is called with (file, flags) and must return an open file descriptor.

    import os

    def opener(path, flags):
        return os.open(path, flags)
    # open() calls opener(path, flags): path is the first argument of open(),
    # and flags is an integer derived from the mode
    with open('guo_ke.txt', 'r', opener=opener) as f:
        print(f.read())

    Passing os.open as the opener gives behaviour similar to the default (opener=None).
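
To see what open() actually hands to an opener, the sketch below records the arguments before delegating to os.open(). The names logging_opener and seen are hypothetical helpers introduced for illustration, and the 0o600 permission value is an arbitrary choice.

```python
import os
import tempfile

seen = []  # records what open() passes to the opener

def logging_opener(path, flags):
    # flags is an integer bit mask derived from the mode, not the mode string.
    seen.append((os.path.basename(path), flags))
    # An opener must return an open file descriptor.
    return os.open(path, flags, 0o600)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "guo_ke.txt")
    with open(path, "w", opener=logging_opener) as f:
        f.write("hello")
    with open(path, "r", opener=logging_opener) as f:
        content = f.read()

print(content)
print(seen)
```

Because the opener receives integer flags, it can add or inspect os.O_* bits before the file is opened, which is where custom low-level behaviour would go.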

3. Read/write operations

The open() function returns an IO stream object. IO stream objects provide general read and write related properties and methods.

class IO(Generic[AnyStr]):
    # Return the read/write mode of the file
    @abstractproperty
    def mode(self) -> str:
        pass
    # Return the name of the file
    @abstractproperty
    def name(self) -> str:
        pass
    # Close the file
    @abstractmethod
    def close(self) -> None:
        pass
    # Check whether the file is closed
    @abstractproperty
    def closed(self) -> bool:
        pass
    # Return the numeric descriptor assigned when the file was opened
    @abstractmethod
    def fileno(self) -> int:
        pass
    # Flush the contents of the buffer
    @abstractmethod
    def flush(self) -> None:
        pass
    # Whether the file is connected to a terminal device
    @abstractmethod
    def isatty(self) -> bool:
        pass
    # If n is -1 or not passed, the whole file is read at once. If the file is large, pass n and read it in several calls
    # Returns an empty string ('') at the end of the file
    @abstractmethod
    def read(self, n: int = -1) -> AnyStr:
        pass
    # Whether the file is readable
    @abstractmethod
    def readable(self) -> bool:
        pass
    # Read one line from the file; the newline character (\n) is kept at the end of the string
    # An empty string means the end of the file has been reached
    # An empty line is represented by '\n'
    @abstractmethod
    def readline(self, limit: int = -1) -> AnyStr:
        pass
    # Read all lines and return them as a list
    # list(f) can be used as well
    @abstractmethod
    def readlines(self, hint: int = -1) -> List[AnyStr]:
        pass
    # Move the read/write cursor to change the read/write position in the file
    # The position is computed by adding offset to a reference point, which is chosen by the whence parameter
    # whence=0 counts from the beginning of the file, 1 uses the current position, and 2 uses the end of the file
    # whence defaults to 0 if omitted, using the beginning of the file as the reference point
    @abstractmethod
    def seek(self, offset: int, whence: int = 0) -> int:
        pass
    # Whether the cursor can be moved
    @abstractmethod
    def seekable(self) -> bool:
        pass
    # Return the current position in the file
    @abstractmethod
    def tell(self) -> int:
        pass
    # Truncate the file to at most size bytes (the current position if size is omitted)
    @abstractmethod
    def truncate(self, size: int = None) -> int:
        pass
    # Whether the file is writable
    @abstractmethod
    def writable(self) -> bool:
        pass
    # Write to the file
    @abstractmethod
    def write(self, s: AnyStr) -> int:
        pass
    # Write a list of lines to the file (no line separators are added)
    @abstractmethod
    def writelines(self, lines: List[AnyStr]) -> None:
        pass
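
The positioning methods listed above (seek(), tell(), truncate()) can be tried on a small file. This sketch uses binary mode, because text files only allow limited seeking relative to the current position or the end; the file name data.bin and the variable names are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.bin")
    with open(path, "wb") as f:
        f.write(b"0123456789")

    with open(path, "r+b") as f:
        f.seek(3)                   # whence=0: 3 bytes from the start
        from_start = f.read(2)      # reads 2 bytes, position is now 5
        f.seek(2, os.SEEK_CUR)      # whence=1: skip 2 bytes from the current position
        from_current = f.read(1)
        f.seek(-2, os.SEEK_END)     # whence=2: 2 bytes before the end
        position = f.tell()
        from_end = f.read()
        f.truncate(4)               # keep only the first 4 bytes
        f.seek(0)
        after_truncate = f.read()

print(from_start, from_current, position, from_end, after_truncate)
```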

Calling open() in text mode returns a TextIO object, which has several properties specific to text operations in addition to those of the parent class.

class TextIO(IO[str]):
    # The underlying binary buffer
    @abstractproperty
    def buffer(self) -> BinaryIO:
        pass
    # The encoding in use
    @abstractproperty
    def encoding(self) -> str:
        pass
    # The encoding error handling scheme
    @abstractproperty
    def errors(self) -> Optional[str]:
        pass
    # Whether line buffering is enabled
    @abstractproperty
    def line_buffering(self) -> bool:
        pass
    # The newline style(s) encountered so far
    @abstractproperty
    def newlines(self) -> Any:
        pass

3.1 Text File Reading Operations

  1. Basic operation
file = open("guo_ke.txt", mode='r')
print("Read/write mode:", file.mode)
print("File name:", file.name)
print("File closed:", file.closed)
print("File descriptor:", file.fileno())
print("Readable:", file.readable())
print("Connected to a terminal:", file.isatty())
print("Writable:", file.writable())
print("Underlying buffer:", file.buffer)
print("Default file encoding:", file.encoding)
print("Encoding error handling:", file.errors)
print("Line buffering:", file.line_buffering)
print("Newlines:", file.newlines)
# Output:
# Read/write mode: r
# File name: guo_ke.txt
# File closed: False
# File descriptor: 3
# Readable: True
# Connected to a terminal: False
# Writable: False
# Underlying buffer: <_io.BufferedReader name='guo_ke.txt'>
# Default file encoding: cp936
# Encoding error handling: strict
# Line buffering: False
# Newlines: None

cp936 is the system's code page 936, that is, GBK encoding.

  2. A variety of reading methods:

    When reading or writing, you need to understand the concept of the file pointer (cursor), also known as the file position. Each read or write starts at the current position and moves it forward.

    Prepare a text file in advance with the following content:

You hide in my heart deeply.
Happiness! There is only you and I together time...
With you just I don't want to give anyone the chance.
Honey, can you marry me, I'll marry you!
Don't know love you count is a close reason?
  • Use of the read() method

    file = open("guo_ke.txt", "r")
    print("---------- Read all contents --------------")
    res = file.read()
    print(res)
    print("---------- Read part of the content --------------")
    # Return to the beginning of the file
    file.seek(0)
    res = file.read(100)
    print(res)
    # Close the file resource
    file.close()
    # Output:
    # ---------- Read all contents --------------
    # You hide in my heart deeply.
    # Happiness! There is only you and I together time...
    # With you just I don't want to give anyone the chance.
    # Honey, can you marry me, I'll marry you!
    # Don't know love you count is a close reason?
    # ---------- Read part of the content --------------
    # You hide in my heart deeply.
    # Happiness! There is only you and I together time...
    # With you just I don

    One detail to note:

    After the first read() has consumed the whole file, the read position is at the end of the file, so a further read returns no data.

    Use the seek() method to move the cursor back to the beginning of the file.

  • Use of the readline() method

file = open("guo_ke.txt", "r")

print("--------- read a line --------")
res = file.readline()
print("Data length:", len(res))
print(res)
print("----------- restricted content -------------")
res = file.readline(10)
print("Data length:", len(res))
print(res)
print("----------- Read all data line by line -------------")
# Return to the beginning of the file
file.seek(0)
while True:
    res = file.readline()
    # An empty string means the end of the file has been reached
    if res == "":
        break
    print(res)

file.close()
# Output (the extra blank lines produced by print() are omitted):
# --------- read a line --------
# Data length: 29
# You hide in my heart deeply.
# ----------- restricted content -------------
# Data length: 10
# Happiness!
# ----------- Read all data line by line -------------
# You hide in my heart deeply.
# Happiness! There is only you and I together time...
# With you just I don't want to give anyone the chance.
# Honey, can you marry me, I'll marry you!
# Don't know love you count is a close reason?

When you read everything line by line and print it, a blank line appears between the lines of output. The reason is that each line already ends with '\n', and print() adds another newline.

  • readline() has a sibling, readlines(), which reads all lines into a list at once.
file = open("guo_ke.txt", "r")

print("----------- Store the lines of the file in a list ---------")
res = file.readlines()
print(res)
file.close()
# Output:
# ----------- Store the lines of the file in a list ---------
# ['You hide in my heart deeply.\n', 'Happiness! There is only you and I together time...\n', "With you just I don't want to give anyone the chance.\n", "Honey, can you marry me, I'll marry you!\n", "Don't know love you count is a close reason?"]

Keep in mind that each element still ends with its newline character when you use the data.

list(f) can also be used to read all lines.

file = open("guo_ke.txt", "r")
print(list(file))
  • File objects support iteration line by line.
file = open("guo_ke.txt", "r")
print("----------- Output file contents iteratively ---------")
for f in file:
    print(f)
file.close()

3.2 Text File Write Operations

If you write data in 'w' mode, the original data is lost. If you don't want this to happen, use 'a' mode for writing.

file = open("guo_ke_0.txt", "w")
file.write("this is a test")
# Add a new line
file.write("\n")
file.write("who are you?")
# Write the list data to the file at once
lst = ["food\n", "fish\n", "cat\n"]
file.write("\n")
file.writelines(lst)
file.close()

3.3 Encoding problems

When a file is written and later read, the same encoding must be used for both.

The following code raises UnicodeDecodeError.

file = open("guo_ke_1.txt", mode="w", encoding="utf-8")
# Note: the error only occurs when the written text contains non-ASCII characters
# (for example Chinese); ASCII-only text decodes identically under both encodings
file.write("Hello! The nut...")
file.close()

file_ = open("guo_ke_1.txt", mode="r", encoding="gbk")
res = file_.read()
print(res)
# Output:
# Traceback (most recent call last):
#   File "D:/myc/filedmeo/ garble problem.py ", line 6, in <module>
#     res = file_.read()
# UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 16: illegal multibyte sequence
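
A self-contained variant of the mismatch can be sketched as follows. To make the failure deterministic, this sketch reads with the ASCII codec instead of GBK (an assumption made for illustration, since whether GBK rejects a given UTF-8 byte sequence depends on the text); the sample string and variable names are hypothetical.

```python
import os
import tempfile

text = "caf\u00e9"  # contains one non-ASCII character

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "guo_ke_1.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Reading with a different codec fails (ASCII stands in for GBK here).
    try:
        with open(path, "r", encoding="ascii") as f:
            f.read()
        mismatch = "no error"
    except UnicodeDecodeError:
        mismatch = "UnicodeDecodeError"

    # Reading with the codec used for writing round-trips the text.
    with open(path, "r", encoding="utf-8") as f:
        round_trip = f.read()

print(mismatch, round_trip == text)
```

Passing encoding explicitly on both sides avoids depending on the platform default.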

3.4 Binary file operations

Calling open() with mode='rb' returns a BinaryIO object. This object reads and writes binary files and is used in much the same way as a text file, except that data is exchanged as bytes rather than str.

Switching between text file and binary file operations takes only a change to a single parameter.

class BinaryIO(IO[bytes]):

    @abstractmethod
    def write(self, s: Union[bytes, bytearray]) -> int:
        pass
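
A minimal sketch of binary reading and writing, assuming a temporary file (data.bin and the variable names are hypothetical). It also shows that a str is rejected when the file is opened in binary mode.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.bin")

    with open(path, "wb") as f:
        # write() returns the number of bytes written.
        written = f.write(bytes([0, 255, 16]))
        try:
            f.write("text")  # str is rejected in binary mode
            str_error = None
        except TypeError as exc:
            str_error = type(exc).__name__

    with open(path, "rb") as f:
        payload = f.read()   # bytes, no decoding applied

print(written, str_error, payload)
```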

4. Summary

The open() function is remarkably versatile. It works well for both text and binary files, and for both reading and writing. The simplicity the Python designers achieved is admirable.

Features such as opening a file through a file descriptor and customizing the underlying implementation with the opener parameter are especially neat.

In addition to calling the close() method directly, you may also use the with statement, which automatically calls close().

with open("guo_ke.txt") as f:
    pass
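
The automatic close can be verified with a short sketch. It assumes a temporary file and a deliberately raised RuntimeError (both hypothetical, for illustration) to show that close() runs even when the block fails.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "guo_ke.txt")
    with open(path, "w") as f:
        f.write("hello")

    try:
        with open(path) as f:
            closed_inside = f.closed   # still open while the block runs
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass
    closed_after = f.closed            # close() ran despite the error

print(closed_inside, closed_after)
```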