30 Days of Python 👨💻 - Day 18 - File I/O
File Operation Overview
Today I’ll explore how to work with files using Python. Over the past days I’ve explored and shared several Python concepts and some programming best practices. However, our programs often need to communicate with the outside world for a variety of reasons, such as reading data from Excel, CSV, or PDF files, converting and compressing images, extracting data from text files, reading data from databases, and many more. These interactions with the outside world are performed using I/O (input/output) operations.
Files let us store data permanently. While a program runs, its data lives temporarily in the machine’s RAM, which is wiped when the computer is shut down. To keep data for a long time, it needs to be stored in a database or on the file system so that it can be used later.
Files can be broadly classified into two categories based on their content:
- Binary (raw bytes, such as images, audio, or compressed archives)
- Text
If you’re interested in these two file types and want to know more about them, check out this great article
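To make the difference between the two categories concrete, here is a minimal sketch that writes one short string and reads it back in both text and binary mode. The file name sample.txt and the temporary directory are assumptions made only for this illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'sample.txt')  # hypothetical file name

    # Writing in text mode: the string is encoded to bytes with the given encoding
    with open(path, mode='w', encoding='utf-8') as f:
        f.write('café')

    # Reading in text mode: the bytes are decoded back into a str
    with open(path, mode='r', encoding='utf-8') as f:
        as_text = f.read()

    # Reading in binary mode: the raw bytes are returned without decoding
    with open(path, mode='rb') as f:
        as_bytes = f.read()

print(len(as_text))   # 4 (characters)
print(len(as_bytes))  # 5 (bytes, because 'é' takes two bytes in UTF-8)
print(as_bytes)       # b'caf\xc3\xa9'
```

The same file holds the same bytes in both cases. Only the way Python hands them back to us changes.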
Python provides a built-in open function to open arbitrary files. Any file needs to be opened before data can be read or written. Reading data from files is easy in Python.
I used the REPL as a platform to experiment with all the code snippets provided in this article.
Open the file
I created a test.txt file and wrote some fake data to test it.
# test.txt
I am learning python.
The contents of this file can now be read using Python, like this
main.py
content = open('test.txt')
output = content.read()
print(output) # I am learning python.
When we use the open function to open a file, we can also specify the mode to open it in. The default is r (read mode). We can also specify whether the file should be opened in text or binary mode.
Mode | Description |
---|---|
r | Open the file in read-only mode (default) |
w | Open in write mode. Create a new file if it doesn’t exist, and overwrite it if it does |
x | Create a new file, error if the file exists |
a | Open a file and append at the end, or create a new file if it doesn’t exist |
t | Open in text mode. (Default) |
b | Open in binary mode. |
+ | Open files and update (read and write) |
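We can also specify the encoding when opening a file in text mode. If no encoding is given, Python falls back to a platform-dependent default (often UTF-8 on Linux and macOS, but not always on Windows), so it is safer to pass encoding='utf-8' explicitly.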
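The modes in the table can be tried out with a short experiment. This is only a sketch: the file name notes.txt and the temporary directory are assumptions for the example, not something from the steps above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'notes.txt')  # hypothetical file name

    # 'x' creates a new file and fails if the file already exists
    with open(path, mode='x', encoding='utf-8') as f:
        f.write('first\n')

    try:
        open(path, mode='x', encoding='utf-8')
        created_twice = True
    except FileExistsError:
        created_twice = False

    # 'a' keeps the existing content and adds to the end
    with open(path, mode='a', encoding='utf-8') as f:
        f.write('second\n')

    # 'r+' opens an existing file for both reading and writing
    with open(path, mode='r+', encoding='utf-8') as f:
        before = f.read()   # reading moves the position to the end of the file
        f.write('third\n')  # so this write lands after the existing text

    with open(path, mode='r', encoding='utf-8') as f:
        final_content = f.read()

print(created_twice)        # False
print(repr(before))         # 'first\nsecond\n'
print(repr(final_content))  # 'first\nsecond\nthird\n'
```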
Close the file
It is important to close a file after working with it. Closing flushes any buffered writes to disk and releases the system resources associated with the file.
main.py
content = open('test.txt', mode='r')
output = content.read()
print(output)
content.close()
A try/except/finally statement can be added to the above code to ensure that the file is closed even if an error occurs during the operation.
main.py
content = None
try:
    content = open('test.txt', mode='r')
    output = content.read()
    print(output)
except FileNotFoundError as error:
    print(f'file not found {error}')
finally:
    # Only close the file if open() actually succeeded
    if content is not None:
        content.close()
Python provides a handy syntax for performing file-opening operations, using the with statement. It automatically closes the file once it’s done.
main.py
with open('test.txt', mode='r') as content:
    output = content.read()
    print(output) # I am learning python.
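One way to convince ourselves that the with statement really closes the file is to check the closed attribute of the file object. The sketch below assumes a throwaway test.txt created in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'test.txt')  # throwaway copy for this example
    with open(path, mode='w', encoding='utf-8') as f:
        f.write('I am learning python.')

    with open(path, mode='r', encoding='utf-8') as content:
        closed_inside = content.closed  # still open inside the block
        output = content.read()
    closed_after = content.closed       # closed as soon as the block ends

    # The file is closed even when the block exits because of an exception
    try:
        with open(path, mode='r', encoding='utf-8') as content:
            raise ValueError('something went wrong')
    except ValueError:
        closed_after_error = content.closed

print(closed_inside)       # False
print(closed_after)        # True
print(closed_after_error)  # True
```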
Write to the file
Python provides a write method to write data to a file. The file needs to be opened in w mode for writing. Note that w mode overwrites the existing contents of the file. If you need to add content to the end instead, use a mode. In both modes, the file is created first if it does not exist.
main.py
with open('test.txt', mode='w', encoding='utf-8') as my_file:
    my_file.write('This is the first line\n') # \n is used for line breaks
    my_file.write('This is the second line\n')
    my_file.write('This is the third line')
main.py
with open('test.txt', mode='a', encoding='utf-8') as my_file:
    my_file.write('This text will be appended')
The alternative is to use the writelines method, which accepts a list of strings. It does not add line breaks itself, so they have to be included in the list.
with open('test.txt', mode='w', encoding='utf-8') as my_file:
    my_file.writelines(['First line', '\n', 'Second Line'])
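Because writelines does not insert line separators, it is easy to end up with everything on one line. A small sketch of the difference, using a hypothetical lines.txt in a temporary directory:

```python
import os
import tempfile

lines = ['First line', 'Second line', 'Third line']

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'lines.txt')  # hypothetical file name

    # Without explicit '\n', the strings are written back to back
    with open(path, mode='w', encoding='utf-8') as my_file:
        my_file.writelines(lines)
    with open(path, mode='r', encoding='utf-8') as my_file:
        joined = my_file.read()

    # Adding '\n' to each item gives one line per string
    with open(path, mode='w', encoding='utf-8') as my_file:
        my_file.writelines(line + '\n' for line in lines)
    with open(path, mode='r', encoding='utf-8') as my_file:
        separated = my_file.read()

print(repr(joined))     # 'First lineSecond lineThird line'
print(repr(separated))  # 'First line\nSecond line\nThird line\n'
```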
Read the file
Python provides several ways to read files. The file needs to be opened in r mode, or r+ mode if we need to both read and write. The read method takes an optional size argument: the maximum number of characters to read in text mode (or bytes in binary mode). If size is not provided, the entire file is read.
main.py
with open('test.txt', mode='r', encoding='utf-8') as my_file:
    content = my_file.read()
    print(content)
There is also a tell method that returns the current cursor position in the file we are reading.
The seek method is used to move the cursor to a specified position in the file.
main.py
with open('test.txt', mode='r', encoding='utf-8') as my_file:
    my_file.seek(0) # Move the cursor to the beginning of the file
    print(my_file.tell()) # Output the cursor position
    content = my_file.read()
    print(content)
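The size argument of read and the tell/seek pair work well together. The sketch below reads a file in two pieces, remembers a position, and jumps back to it. It assumes a throwaway test.txt in a temporary directory. In text mode the value returned by tell should be treated as an opaque bookmark to hand back to seek, not as a character count.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'test.txt')  # throwaway copy for this example
    with open(path, mode='w', encoding='utf-8') as my_file:
        my_file.write('I am learning python.')

    with open(path, mode='r', encoding='utf-8') as my_file:
        first_part = my_file.read(4)  # read at most 4 characters
        bookmark = my_file.tell()     # remember where we are
        rest = my_file.read()         # read everything that is left

        my_file.seek(bookmark)        # jump back to the bookmark
        rest_again = my_file.read()

        my_file.seek(0)               # jump back to the very beginning
        whole = my_file.read()

print(repr(first_part))  # 'I am'
print(repr(rest))        # ' learning python.'
print(rest == rest_again)  # True
print(repr(whole))       # 'I am learning python.'
```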
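If the file has many lines, looping over the file object is the most memory-efficient way to read it, because the lines are read one at a time instead of all at once.
main.py
with open('test.txt', mode='r', encoding='utf-8') as my_file:
    for line in my_file:
        # each line already ends with \n, so tell print not to add another
        print(line, end='')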
In addition, Python provides two methods, readline and readlines.
readline reads one line at a time, stopping after the next newline (\n).
readlines returns a list of all the lines.
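A short sketch of how these two methods behave, next to the loop shown earlier. The three-line file numbers.txt is an assumption made for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'numbers.txt')  # hypothetical file name
    with open(path, mode='w', encoding='utf-8') as my_file:
        my_file.write('one\ntwo\nthree\n')

    with open(path, mode='r', encoding='utf-8') as my_file:
        first_line = my_file.readline()  # one line, newline included
        remaining = my_file.readlines()  # the rest of the lines as a list
        at_end = my_file.readline()      # empty string once the end is reached

    # Looping over the file and stripping the trailing newline from each line
    with open(path, mode='r', encoding='utf-8') as my_file:
        stripped = [line.rstrip('\n') for line in my_file]

print(repr(first_line))  # 'one\n'
print(remaining)         # ['two\n', 'three\n']
print(repr(at_end))      # ''
print(stripped)          # ['one', 'two', 'three']
```

Note that readlines only returns the lines that have not been read yet, which is why 'one' is missing from the second result.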
Python file methods
Here is a list of the main file methods in Python
Method | Description |
---|---|
close() | Closes the open file. Calling it on an already closed file has no effect |
detach() | Separates the underlying binary buffer from the TextIOBase and returns it |
fileno() | Returns the integer file descriptor of the file |
flush() | Flushes the internal write buffer |
isatty() | Returns True if the file stream is interactive (connected to a terminal) |
read(n) | Reads at most n characters (bytes in binary mode). If n is negative or None, reads until the end of the file |
readable() | Returns True if the file stream can be read from |
readline(n=-1) | Reads and returns one line from the file. If n is specified, at most n characters (bytes in binary mode) are read |
readlines(n=-1) | Reads and returns a list of lines from the file. If n is specified, no more lines are read once their total size exceeds n |
seek(offset, whence=SEEK_SET) | Moves the file position to offset, relative to whence (start, current position, or end) |
seekable() | Returns True if the file stream supports random access |
tell() | Returns the current file position |
truncate(size=None) | Resizes the file to size bytes. If size is not specified, resizes to the current position |
writable() | Returns True if the file stream can be written to |
write(s) | Writes the string s to the file and returns the number of characters written |
writelines(lines) | Writes a list of lines to the file. Line separators are not added |
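A few of the less common methods from the table can be tried in a quick experiment. This is a sketch only: the file name demo.txt is hypothetical, and the w+ mode (write plus read) is used so that one file object supports every call.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical file name

    # 'w+' opens the file for writing and reading
    with open(path, mode='w+', encoding='utf-8') as my_file:
        can_read = my_file.readable()
        can_write = my_file.writable()
        can_seek = my_file.seekable()

        written = my_file.write('hello world')  # number of characters written
        my_file.truncate(5)                     # keep only the first 5 bytes
        my_file.seek(0)
        truncated = my_file.read()

    # A file opened in plain read mode is not writable
    with open(path, mode='r', encoding='utf-8') as my_file:
        read_only_writable = my_file.writable()

print(can_read, can_write, can_seek)  # True True True
print(written)                        # 11
print(repr(truncated))                # 'hello'
print(read_only_writable)             # False
```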
Fun exercise
Let’s try to build a language translator that reads a file with English content and creates a new translation of that file in a different language.
For this exercise, we’ll use a third-party package on PyPI called translate. With the help of this package, we can translate text from Python. Note that it relies on an online translation service by default, so it needs a network connection.
First, the package needs to be installed. Because I am using the REPL, I will add it to the Packages section of the REPL. In a local project, we can install it from the console using pip.
Create a file called quote.txt and write a quote:
quote.txt
If you can't make it good, at least make it look good. - Bill Gates
Now let’s generate two translations of this quote. One is in Spanish with a file named quote-es.txt, and the other is in French with a file named quote-fr.txt
main.py
from translate import Translator

spanish_translate = Translator(to_lang="es")
french_translate = Translator(to_lang="fr")

try:
    with open('quote.txt', mode='r', encoding='utf-8') as quote_file:
        # read the file
        quote = quote_file.read()
        # do the translations
        quote_spanish = spanish_translate.translate(quote)
        quote_french = french_translate.translate(quote)
        # create the translated files
        try:
            with open('quote-es.txt', mode='w', encoding='utf-8') as quote_es:
                quote_es.write(quote_spanish)
            with open('quote-fr.txt', mode='w', encoding='utf-8') as quote_fr:
                quote_fr.write(quote_french)
        except IOError as error:
            print('An error occurred')
            raise error
except FileNotFoundError as error:
    print('File not found')
    raise error
Two translation files for this quote will be automatically generated. It does look great!
Built-in file manipulation module
Python provides a module called pathlib as part of the standard library. It provides classes that represent file system paths with semantics appropriate for different operating systems. The module was introduced in Python 3.4 and is well suited to working with many files and directories.
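As a small taste of the module, here is a sketch that builds a path, writes and reads a file, and inspects parts of the path. The folder name notes and the file name quote.txt are assumptions chosen for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    # The / operator joins path parts using the right separator for the OS
    folder = Path(tmp_dir) / 'notes'  # hypothetical folder name
    folder.mkdir()
    file_path = folder / 'quote.txt'  # hypothetical file name

    # write_text and read_text open, use, and close the file in one call
    file_path.write_text('I am learning python.', encoding='utf-8')
    text = file_path.read_text(encoding='utf-8')

    name = file_path.name      # file name with extension
    stem = file_path.stem      # file name without extension
    suffix = file_path.suffix  # extension, including the dot
    exists = file_path.exists()
    txt_files = [p.name for p in folder.glob('*.txt')]

print(text)       # I am learning python.
print(name)       # quote.txt
print(stem)       # quote
print(suffix)     # .txt
print(exists)     # True
print(txt_files)  # ['quote.txt']
```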
Here are some resources that explain the pathlib module
- Realpython.com/python-path…
- Docs.python.org/3/library/p…
- www.geeksforgeeks.org/pathlib-mod…
We will use the pathlib module later when we create the project.
That’s all for today. Tomorrow I plan to explore the use of regular expressions in Python and some examples.
The original link
30 Days of Python 👨💻 – Day 18 – File I/O