1 Gracefully obtain the file name extension

import os
file_ext = os.path.splitext('./data/py/test.py')
front, ext = file_ext
In [5]: front
Out[5]: './data/py/test'

In [6]: ext
Out[6]: '.py'
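A few edge cases of os.path.splitext are worth knowing. The following is a minimal sketch (the sample file names are made up for illustration) showing that only the last suffix is split off and that a leading dot does not count as an extension.

```python
import os

# Hypothetical sample names, chosen only to illustrate edge cases
samples = ['archive.tar.gz', '.bashrc', 'README', './data/py/test.py']

# Map each name to the (front, ext) tuple returned by splitext
results = {name: os.path.splitext(name) for name in samples}

for name, (front, ext) in results.items():
    print(f"{name!r:22} front={front!r:18} ext={ext!r}")
```

If a compound suffix such as .tar.gz has to be treated as one unit, it needs separate handling, since splitext only ever removes the final component.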

2 Change file suffixes in batches

This example uses Python's os and argparse modules to change every file with the suffix old_ext in the working directory work_dir to the suffix new_ext.

It also gives you a feel for the main uses of the argparse module.

Import the modules

import argparse
import os

Defining script parameters

def get_parser():
    parser = argparse.ArgumentParser(
        description='File name suffix change in working directory')
    parser.add_argument('work_dir', metavar='WORK_DIR', type=str, nargs=1,
                        help='Directory whose file suffixes will be changed')
    parser.add_argument('old_ext', metavar='OLD_EXT', type=str, nargs=1, help='Original suffix')
    parser.add_argument('new_ext', metavar='NEW_EXT', type=str, nargs=1, help='New suffix')
    return parser
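Because every argument is declared with nargs=1, argparse stores each value as a one-element list rather than a plain string. The sketch below illustrates this with a hypothetical stand-in parser (build_demo_parser is not part of the original script) and an explicit argument list instead of sys.argv.

```python
import argparse


def build_demo_parser():
    """Hypothetical stand-in for get_parser() with the same three positionals."""
    parser = argparse.ArgumentParser(description='Demo parser')
    for name in ('work_dir', 'old_ext', 'new_ext'):
        parser.add_argument(name, type=str, nargs=1)
    return parser


# Parse an explicit argument list instead of reading sys.argv
args = vars(build_demo_parser().parse_args(['./data', 'txt', 'md']))
print(args)

# For comparison: without nargs the value is a plain string
plain = argparse.ArgumentParser()
plain.add_argument('work_dir')
plain_args = vars(plain.parse_args(['./data']))
print(plain_args)
```

This is why code that consumes these arguments has to index each value with [0].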

Change the name extension in batches

def batch_rename(work_dir, old_ext, new_ext):
    """Rename every file in work_dir whose suffix is old_ext so that it ends in new_ext."""
    for filename in os.listdir(work_dir):
        # Split the file name into (stem, suffix)
        split_file = os.path.splitext(filename)
        file_ext = split_file[1]
        # Locate files whose suffix is old_ext
        if old_ext == file_ext:
            # Full name of the file after the change
            newfile = split_file[0] + new_ext
            # Perform the rename
            os.rename(os.path.join(work_dir, filename),
                      os.path.join(work_dir, newfile))
    print("Rename done")
    print(os.listdir(work_dir))

Implement main

def main():
    """Main function"""
    # Parse the command-line arguments
    parser = get_parser()
    args = vars(parser.parse_args())
    # nargs=1 stores each value as a one-element list, hence the [0]
    work_dir = args['work_dir'][0]
    old_ext = args['old_ext'][0]
    if old_ext[0] != '.':
        old_ext = '.' + old_ext
    new_ext = args['new_ext'][0]
    if new_ext[0] != '.':
        new_ext = '.' + new_ext

    batch_rename(work_dir, old_ext, new_ext)
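To try the renaming logic without touching real files, it can be run against a temporary directory. This is a minimal sketch under stated assumptions: rename_suffix is a hypothetical variant of batch_rename that returns the new names instead of printing them, and the file names are invented for the demo.

```python
import os
import tempfile


def rename_suffix(work_dir, old_ext, new_ext):
    """Rename files in work_dir from old_ext to new_ext; return the new names."""
    renamed = []
    for filename in os.listdir(work_dir):
        front, ext = os.path.splitext(filename)
        if ext == old_ext:
            newfile = front + new_ext
            os.rename(os.path.join(work_dir, filename),
                      os.path.join(work_dir, newfile))
            renamed.append(newfile)
    return sorted(renamed)


with tempfile.TemporaryDirectory() as tmp:
    # Create three empty demo files
    for name in ('a.txt', 'b.txt', 'c.md'):
        open(os.path.join(tmp, name), 'w').close()
    renamed = rename_suffix(tmp, '.txt', '.log')
    remaining = sorted(os.listdir(tmp))

print(renamed, remaining)
```

Only the two .txt files are renamed; c.md is left alone because its suffix does not match.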

3 Extract the file name from a path

In [11]: import os
    ...: file_ext = os.path.split('./data/py/test.py')
    ...: ipath, ifile = file_ext
    ...:

In [12]: ipath
Out[12]: './data/py'

In [13]: ifile
Out[13]: 'test.py'
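The two halves returned by os.path.split are also available individually through os.path.dirname and os.path.basename, and pathlib exposes similar pieces as attributes. A short sketch for comparison, using PurePosixPath so that the result does not depend on the operating system:

```python
import os
from pathlib import PurePosixPath

path = './data/py/test.py'

# dirname and basename return the two halves of os.path.split
head, tail = os.path.dirname(path), os.path.basename(path)

# PurePosixPath offers the same pieces as attributes.
# Note that pathlib drops the leading './' when it normalizes the path.
p = PurePosixPath(path)
parts = (str(p.parent), p.name, p.stem, p.suffix)

print(head, tail, parts)
```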

4 Search for the file with the specified file name extension

import os

def find_file(work_dir,extension='jpg'):
    lst = []
    for filename in os.listdir(work_dir):
        print(filename)
        splits = os.path.splitext(filename)
        ext = splits[1]  # get the extension
        if ext == '.' + extension:
            lst.append(filename)
    return lst

r = find_file('.', 'md')
print(r)  # the .md files in the current directory (subdirectories are not searched)
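find_file relies on os.listdir, so it only looks at the top level of work_dir. If subdirectories should be searched as well, os.walk can be used instead. The following is a sketch, not the original author's code: find_files_recursive is a hypothetical name, and the case-insensitive comparison is an added assumption.

```python
import os
import tempfile


def find_files_recursive(work_dir, extension='jpg'):
    """Return paths (relative to work_dir) of files ending in .extension."""
    wanted = '.' + extension.lower()
    found = []
    for root, _dirs, files in os.walk(work_dir):
        for filename in files:
            # Compare suffixes case-insensitively (an assumption of this sketch)
            if os.path.splitext(filename)[1].lower() == wanted:
                full = os.path.join(root, filename)
                found.append(os.path.relpath(full, work_dir))
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'docs'))
    for rel in ('a.md', 'b.txt', os.path.join('docs', 'c.MD')):
        open(os.path.join(tmp, rel), 'w').close()
    md_files = find_files_recursive(tmp, 'md')

print(md_files)
```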

5 Convert XLS files to XLSX in batches

# Batch convert files from xls to xlsx
import win32com.client as win32
import os.path
import os


def xls2xlsx():
    rootdir = r"C:\Users\CQ375\Desktop\temp1"  # directory of the XLS files to convert
    rootdir1 = r"C:\Users\CQ375\Desktop\ex"  # directory for the converted files
    files = os.listdir(rootdir)  # list every file in the directory
    num = len(files)  # number of files
    for i in range(num):
        # splitext returns a (f_name, f_extension) tuple
        kname = os.path.splitext(files[i])[1]
        if kname == '.xls':
            fname = rootdir + '\\' + files[i]  # path and name of the file to convert
            fname1 = rootdir1 + '\\' + files[i]  # path and name of the converted file
            excel = win32.gencache.EnsureDispatch('Excel.Application')
            wb = excel.Workbooks.Open(fname)  # open the workbook
            wb.SaveAs(fname1 + "x", FileFormat=51)  # FileFormat=51 saves as .xlsx
            wb.Close()
            excel.Application.Quit()


if __name__ == '__main__':
    xls2xlsx()

6 Modification times of all files in a directory

import os
import datetime
print(f"Current time: {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
for root, dirs, files in os.walk(r"D:\works"):  # walk D:\works and its subdirectories
    for file in files:
        absPathFile = os.path.join(root, file)
        modefiedTime = datetime.datetime.fromtimestamp(os.path.getmtime(absPathFile))
        now = datetime.datetime.now()
        diffTime = now - modefiedTime
        if diffTime.days < 20:  # keep only files modified within the last 20 days
            # Print the path, the modification time and the elapsed time
            print(f"{absPathFile:<27s} modified at [{modefiedTime.strftime('%Y-%m-%d %H:%M:%S')}] "
                  f"[{diffTime.days:3d} days {diffTime.seconds//3600:2d} h {diffTime.seconds%3600//60:2d} min] ago")
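The same filtering can be wrapped in a function so that it can be checked on a throwaway directory. This is a sketch under stated assumptions: recent_files and its max_days parameter are hypothetical names, and os.utime is used only to backdate one demo file.

```python
import datetime
import os
import tempfile


def recent_files(top, max_days=20, now=None):
    """Yield (path, age) for files under top modified less than max_days ago."""
    now = now or datetime.datetime.now()
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            mtime = datetime.datetime.fromtimestamp(os.path.getmtime(path))
            age = now - mtime
            if age.days < max_days:
                yield path, age


with tempfile.TemporaryDirectory() as tmp:
    fresh = os.path.join(tmp, 'fresh.txt')
    stale = os.path.join(tmp, 'stale.txt')
    for p in (fresh, stale):
        open(p, 'w').close()
    # Backdate one file by 30 days; os.utime takes (atime, mtime) in seconds
    old = datetime.datetime.now().timestamp() - 30 * 24 * 3600
    os.utime(stale, (old, old))
    names = sorted(os.path.basename(p) for p, _age in recent_files(tmp))

print(names)
```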

7 Compress folders and files in batches

import zipfile  # import zipfile, a Python module used for compression and decompression
import os
import time


def batch_zip(start_dir):
    start_dir = start_dir  # path of the folder to compress
    file_news = start_dir + '.zip'

    z = zipfile.ZipFile(file_news, 'w', zipfile.ZIP_DEFLATED)
    for dir_path, dir_names, file_names in os.walk(start_dir):
        # Strip the root prefix so the archive stores paths relative to start_dir
        f_path = dir_path.replace(start_dir, '')
        f_path = f_path and f_path + os.sep
        # Compress the current folder and all files contained in it
        for filename in file_names:
            z.write(os.path.join(dir_path, filename), f_path + filename)
    z.close()
    return file_news


batch_zip('./data/ziptest')
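One way to check what ended up in an archive is to read the entry names back with ZipFile.namelist. The sketch below is a variant rather than the code above: zip_tree is a hypothetical name, it computes archive names with os.path.relpath instead of string replacement, and it builds its own small demo tree in a temporary directory.

```python
import os
import tempfile
import zipfile


def zip_tree(start_dir):
    """Zip start_dir into start_dir + '.zip', storing paths relative to it."""
    archive = start_dir + '.zip'
    with zipfile.ZipFile(archive, 'w', zipfile.ZIP_DEFLATED) as z:
        for dir_path, _dir_names, file_names in os.walk(start_dir):
            for filename in file_names:
                full = os.path.join(dir_path, filename)
                # Archive name is the path relative to the folder being zipped
                z.write(full, os.path.relpath(full, start_dir))
    return archive


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'ziptest')
    os.makedirs(os.path.join(src, 'sub'))
    for rel in ('a.txt', os.path.join('sub', 'b.txt')):
        with open(os.path.join(src, rel), 'w') as f:
            f.write('demo')
    # Read the entry names back from the finished archive
    with zipfile.ZipFile(zip_tree(src)) as z:
        names = sorted(z.namelist())

print(names)
```

Entry names inside a zip archive always use forward slashes, regardless of the operating system.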

8 Read files

import os


# Create a folder
def mkdir(path):
    isexists = os.path.exists(path)
    if not isexists:
        os.mkdir(path)


# Read the contents of a file
def openfile(filename):
    f = open(filename)
    fllist = f.read()
    f.close()
    return fllist  # return the contents that were read
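The same two helpers can be written a little more defensively. This is a sketch with hypothetical names (ensure_dir, read_text): os.makedirs with exist_ok=True also creates missing parent folders, and a with block closes the file even if reading fails. The UTF-8 default is an assumption of the sketch.

```python
import os
import tempfile


def ensure_dir(path):
    """Create path (and missing parents); do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)


def read_text(filename, encoding='utf-8'):
    """Return the whole file as a string; the file is closed on exit."""
    with open(filename, encoding=encoding) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'nested', 'dir')
    ensure_dir(target)
    ensure_dir(target)  # calling it again is harmless
    sample = os.path.join(target, 'note.txt')
    with open(sample, 'w', encoding='utf-8') as f:
        f.write('line one\nline two\n')
    content = read_text(sample)

print(content)
```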

9 Write files

# example1
# "w" mode: overwrite the file if it exists, create it if it does not
f = open(r"./data/test.txt", "w", encoding="utf-8")
print(f.write("Test file write"))
f.close()

# example2
# "a" mode: append to the file if it exists, create it if it does not
f = open(r"./data/test.txt", "a", encoding="utf-8")
print(f.write("Test file write"))
f.close()

# example3
# with keyword: the file is closed automatically, even if an exception occurs
with open(r"./data/test.txt", "w") as f:
    f.write("hello world!")
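The difference between the "w" and "a" modes is easiest to see by reading the file back after each step. A minimal sketch, using a temporary directory and an invented file name so that no existing file is touched:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'test.txt')

    with open(path, 'w', encoding='utf-8') as f:
        written = f.write('first')   # write() returns the number of characters written

    with open(path, 'a', encoding='utf-8') as f:
        f.write(' second')           # appended after the existing text

    with open(path, encoding='utf-8') as f:
        after_append = f.read()

    with open(path, 'w', encoding='utf-8') as f:
        f.write('reset')             # "w" truncates the old contents first

    with open(path, encoding='utf-8') as f:
        after_rewrite = f.read()

print(written, after_append, after_rewrite)
```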

10 Segment words and save the result to a file

pkuseg is an open-source Chinese word segmentation toolkit from Peking University. It achieves high segmentation accuracy on multiple word segmentation datasets and is reported to outperform the commonly used jieba segmenter.

Below, pkuseg's cut function segments a passage, the 10 most frequent words are counted, and all words are written to the file cut_words.csv in descending order of frequency.

Here is the passage to segment:

mystr = """The Python language Reference describes the syntax and semantics of the Python language, and this library reference describes the standard library that ships with Python. It also describes some optional components that are typically included in Python distributions. The Python standard library is very large and provides a wide range of components, as the following contents table shows. The library contains multiple built-in modules (written in C) that Python programmers must rely on to implement system-level functions such as file I/O, as well as a large number of modules written in Python that provide standard solutions to many problems in everyday programming. Some of these modules are specifically designed to encourage and enhance the portability of Python programs by abstracting platform-specific functionality into platform-neutral apis. Windows versions of Python installers typically contain the entire standard library, often with many additional components. For Unix-like operating systems, Python is typically divided into a series of packages, so you may need to use the package management tools provided by the operating system to obtain some or all of the optional components."""

The whole task takes only a few lines of code:

from pkuseg import pkuseg
from collections import Counter

seg = pkuseg()
words = seg.cut(mystr)
frequency_sort = Counter(words).most_common()
with open('./data/cut_words.csv', 'w') as f:
    for line in frequency_sort:
        f.write(str(line[0]) + ', ' + str(line[1]) + "\n")

print('writing done')
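Since punctuation such as a comma can itself be a token, joining the fields by hand can produce rows that a CSV reader splits incorrectly. One option is to let the csv module handle the quoting. The sketch below makes two assumptions: a hand-written token list stands in for the output of seg.cut, and io.StringIO stands in for the output file so that nothing is written to disk.

```python
import csv
import io
from collections import Counter

# Stand-in for seg.cut(mystr): a hand-written token list (assumption)
words = ['Python', ',', 'library', ',', 'Python', ',']

frequency_sort = Counter(words).most_common()

# io.StringIO stands in for the output file
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator='\n')
writer.writerow(['word', 'count'])
writer.writerows(frequency_sort)

csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back recovers the comma token intact
rows = list(csv.reader(io.StringIO(csv_text)))
```

The writer wraps the comma token in quotes, so the file still has exactly two columns per row.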

The 10 most frequent words:

Counter(words).most_common(10)
# [('the', 12), (',', 11), ('Python', 10), ('.', 7), ('了', 5), ('contains', 4), ('components', 4), ('Standard library', 3), ('normally', 3), ('what', 3)]
