This article was first published on the public account “AntDream”. You are welcome to search “AntDream” on wechat or scan the QR code at the bottom of this article to follow it. We will make progress every day
One great use of Python is data analysis, and data analysis often involves dealing with Excel data. Here’s a summary of how Python handles Excel data, and it works for the most part. Excel data processing in Python will not be difficult in the future!
Python processes Excel data using two libraries: XLWT and XLRD. The XLWT library is responsible for importing data into the generated Excel spreadsheet file, while the XLRD library is responsible for pulling the data out of the Excel spreadsheet.
xlwt
The library imports data into Excel
Write the data to an Excel file
wb = xlwt.Workbook()
Add a table
ws = wb.add_sheet('test')
# 3 arguments are row number, column number, and content
Note that both the row and column numbers start at 0
ws.write(0, 0, 'Column 1')
ws.write(0, 1, 'Column 2')
ws.write(0, 2, 'Column 3')
Save the Excel file
wb.save('./test.xls')
Copy the code
As you can see, using the XLWT library is very simple, basically in three steps:
- To open a
Workbook
Object and useadd_sheet
Method to add a table - And then you use
write
Method to write data - In the end,
save
Methods to save
Note that the rows and columns defined by the XLWT library surface are counted from 0
Customize Excel table styles
Table styles typically have several pieces of content: font, alignment, border, background color, width, and special content such as hyperlinks, dates and times. Let’s take a look at how to customize each of these styles using the XLWT library.
The font
XLWT library also supports many font attributes, which are as follows:
To set the Font, XFStyle class and Font class of XLWT library are used, and the code template is as follows:
style = xlwt.XFStyle()
# set font size
font = xlwt.Font()
For example, set font bold and underline
font.bold = True
font.underline = True
style.font = font
# then apply
ws.write(2, 1, 'test', style)
Copy the code
The following properties are set similarly in 4 steps:
- get
XFStyle
- Get the properties you need, like this one right here
Font
object - Sets the specific property value
- And then at the end
write
Method is applied when writing data
Cell alignment
So how do we set cell alignment
# Cell alignment
alignment = xlwt.Alignment()
Horizontal alignment and vertical alignment
alignment.horz = xlwt.Alignment.HORZ_CENTER
alignment.vert = xlwt.Alignment.VERT_CENTER
# wrap
alignment.wrap = 1
style.alignment = alignment
# then apply
ws.write(2, 1, 'test', style)
Copy the code
The wrap attribute above is useful because most of the time the data will be long, and it is best to use the width attribute together with the cell, so that the overall style will be much better
Cell width Settings:
Set the cell width, that is, the width of a column
ws.col(0).width = 6666
Copy the code
The background color of the cell
The attribute corresponding to the background color is Pattern
# the background color
pattern = xlwt.Pattern()
pattern.pattern = xlwt.Pattern.SOLID_PATTERN
The background color is yellow
# 0 = Black, 1 = White, 2 = Red, 3 = Green, 4 = Blue, 5 = Yellow, 6 = Magenta,
# 7 = Cyan, 16 = Maroon, 17 = Dark Green, 18 = Dark Blue, 19 = Dark Yellow ,
# almost brown), 20 = Dark Magenta, 21 = Teal, 22 = Light Gray, 23 = Dark Gray
#...
pattern.pattern_fore_colour = 5
style.pattern = pattern
# then apply
ws.write(2, 1, 'test', style)
Copy the code
Cell border
The border property is Borders
There are two types of cell borders: color and border line styles
You can set the color and style of the top, bottom, left and right borders separately
# border
borders = xlwt.Borders()
# border can be set to top, bottom, left, right respectively
You can set color and line style for each border: solid line, dashed line, none
# color Settings, other similar
borders.left_colour = 0x40
# set dotted lines, otherwise similar
borders.bottom = xlwt.Borders.DASHED
style.borders = borders
# then apply
ws.write(2, 1, 'test', style)
Copy the code
Special content, such as hyperlinks and formulas
There are a few main types of specific content that you will encounter: hyperlinks, formulas, and dates and times
We need formulas to handle these special things
# hyperlinks
link = 'HYPERLINK("http://www.baidu.com";" Baidu")'
formula = xlwt.Formula(link)
ws.write(2, 0, formula)
The formula is similar
ws.write(1, 1, xlwt.Formula('SUM(A1,B1)'))
# time
style.num_format_str = 'M/D/YY'
ws.write(2, 1, datetime.datetime.now(), style)
Copy the code
That’s it for writing data to Excel in Python. Now let’s look at how to read data in Excel.
xlrd
The library reads data in Excel
Reading Excel files
It is also easy to read Excel data with XLRD library. Let’s take a look at the implementation code
Open a file first
wb = xlrd.open_workbook(file_path)
Get the first table
sheet1 = wb.sheet_by_index(0)
# total number of rows
nrows = sheet1.nrows
The total number of columns
ncols = sheet1.ncols
We can loop through the data
# the data
for i in range(nrows):
for j in range(ncols):
The # cell_value method retrieves row I and column J
value = sheet1.cell_value(i, j)
print(value)
Copy the code
To sum up, it is divided into the following steps:
- First of all by
xlrd
The libraryopen_workbook
Method to open an Excel file - Then through
sheet_by_index
Method to obtain table - Then get the number of rows and columns of the table, respectively, for later iteration
- Based on the number of columns and rows, loop through
cell_value
Method gets the data in each cell
Work sheet related operations
There are several ways to get a worksheet
# by index
sheet1 = wb.sheets()[0]
sheet1 = wb.sheet_by_index(0)
# by name
sheet1 = wb.sheet_by_name('test')
Get all table names
# sheet_names = wb.sheet_names()
Copy the code
Gets all data for a row or column
Get all data in the row, return a list
tabs = sheet1.row_values(rowx=0, start_colx=0, end_colx=None)
# return how much data there is in a row
len_value = sheet1.row_len(rowx=0)
Copy the code
The three arguments to row_values are the row number, the starting column, and the ending column, where the ending column None means that all data is retrieved from the beginning column to the end column
Similarly, get the data for a column
cols = sheet1.col_values(colx=0, start_rowx=0, end_rowx=None)
Copy the code
Processing time data
The time data is special and is obtained directly through the cell_value method above. We need to convert it to a timestamp first, and then format it to the format we want.
For example, you want to obtain the time data in the format of 2019/8/13 20:46:35 in an Excel table
# Fetch time
time_value = sheet1.cell_value(3, 0)
Get the timestamp
time_stamp = int(xlrd.xldate.xldate_as_datetime(time_value, 0).timestamp())
print(time_stamp)
# format date
time_rel = time.strftime("%Y/%m/%d", time.localtime(time_stamp))
print(time_rel)
Copy the code
Basically, there are three steps:
- through
cell_value
Method to get the time value - Then through
xldate_as_datetime
Method to get the timestamp - And then format it
conclusion
Excel files are a common scenario when dealing with data in Python. With XLWT and XLRD, you can import and export Excel data very quickly. You can bookmark this article for future reference when dealing with Excel files.
Welcome to pay attention to my public number to view more wonderful articles!Copy the code