This article was first published on the public account “AntDream”. You are welcome to search “AntDream” on wechat or scan the QR code at the bottom of this article to follow it. We will make progress every day

One great use of Python is data analysis, and data analysis often involves dealing with Excel data. Here’s a summary of how Python handles Excel data, and it works for the most part. Excel data processing in Python will not be difficult in the future!

Python processes Excel data using two libraries: XLWT and XLRD. The XLWT library is responsible for importing data into the generated Excel spreadsheet file, while the XLRD library is responsible for pulling the data out of the Excel spreadsheet.

xlwtThe library imports data into Excel

Write the data to an Excel file

wb = xlwt.Workbook()
Add a table
ws = wb.add_sheet('test')


# 3 arguments are row number, column number, and content
Note that both the row and column numbers start at 0
ws.write(0, 0, 'Column 1')
ws.write(0, 1, 'Column 2')
ws.write(0, 2, 'Column 3')

Save the Excel file
wb.save('./test.xls')

Copy the code

As you can see, using the XLWT library is very simple, basically in three steps:

  1. To open aWorkbookObject and useadd_sheetMethod to add a table
  2. And then you usewriteMethod to write data
  3. In the end,saveMethods to save

Note that the rows and columns defined by the XLWT library surface are counted from 0

Customize Excel table styles

Table styles typically have several pieces of content: font, alignment, border, background color, width, and special content such as hyperlinks, dates and times. Let’s take a look at how to customize each of these styles using the XLWT library.

The font

XLWT library also supports many font attributes, which are as follows:

To set the Font, XFStyle class and Font class of XLWT library are used, and the code template is as follows:

style = xlwt.XFStyle()

# set font size
font = xlwt.Font()
For example, set font bold and underline
font.bold = True
font.underline = True
style.font = font

# then apply
ws.write(2, 1, 'test', style)
Copy the code

The following properties are set similarly in 4 steps:

  1. getXFStyle
  2. Get the properties you need, like this one right hereFontobject
  3. Sets the specific property value
  4. And then at the endwriteMethod is applied when writing data
Cell alignment

So how do we set cell alignment

# Cell alignment
alignment = xlwt.Alignment()

Horizontal alignment and vertical alignment
alignment.horz = xlwt.Alignment.HORZ_CENTER
alignment.vert = xlwt.Alignment.VERT_CENTER
# wrap
alignment.wrap = 1
style.alignment = alignment

# then apply
ws.write(2, 1, 'test', style)
Copy the code

The wrap attribute above is useful because most of the time the data will be long, and it is best to use the width attribute together with the cell, so that the overall style will be much better

Cell width Settings:

Set the cell width, that is, the width of a column
ws.col(0).width = 6666
Copy the code
The background color of the cell

The attribute corresponding to the background color is Pattern

# the background color
pattern = xlwt.Pattern()
pattern.pattern = xlwt.Pattern.SOLID_PATTERN

The background color is yellow
# 0 = Black, 1 = White, 2 = Red, 3 = Green, 4 = Blue, 5 = Yellow, 6 = Magenta,
# 7 = Cyan, 16 = Maroon, 17 = Dark Green, 18 = Dark Blue, 19 = Dark Yellow ,
# almost brown), 20 = Dark Magenta, 21 = Teal, 22 = Light Gray, 23 = Dark Gray
#...
pattern.pattern_fore_colour = 5
style.pattern = pattern

# then apply
ws.write(2, 1, 'test', style)
Copy the code
Cell border

The border property is Borders

There are two types of cell borders: color and border line styles

You can set the color and style of the top, bottom, left and right borders separately

# border
borders = xlwt.Borders()

# border can be set to top, bottom, left, right respectively
You can set color and line style for each border: solid line, dashed line, none
# color Settings, other similar
borders.left_colour = 0x40
# set dotted lines, otherwise similar
borders.bottom = xlwt.Borders.DASHED
style.borders = borders

# then apply
ws.write(2, 1, 'test', style)
Copy the code
Special content, such as hyperlinks and formulas

There are a few main types of specific content that you will encounter: hyperlinks, formulas, and dates and times

We need formulas to handle these special things

# hyperlinks
link = 'HYPERLINK("http://www.baidu.com";" Baidu")'
formula = xlwt.Formula(link)
ws.write(2, 0, formula)

The formula is similar
ws.write(1, 1, xlwt.Formula('SUM(A1,B1)'))

# time
style.num_format_str = 'M/D/YY'
ws.write(2, 1, datetime.datetime.now(), style)
Copy the code

That’s it for writing data to Excel in Python. Now let’s look at how to read data in Excel.


xlrdThe library reads data in Excel

Reading Excel files

It is also easy to read Excel data with XLRD library. Let’s take a look at the implementation code

Open a file first
wb = xlrd.open_workbook(file_path)
Get the first table
sheet1 = wb.sheet_by_index(0)

# total number of rows
nrows = sheet1.nrows
The total number of columns
ncols = sheet1.ncols

We can loop through the data
# the data
for i in range(nrows):
    for j in range(ncols):
        The # cell_value method retrieves row I and column J
        value = sheet1.cell_value(i, j)
        print(value)
Copy the code

To sum up, it is divided into the following steps:

  1. First of all byxlrdThe libraryopen_workbookMethod to open an Excel file
  2. Then throughsheet_by_indexMethod to obtain table
  3. Then get the number of rows and columns of the table, respectively, for later iteration
  4. Based on the number of columns and rows, loop throughcell_valueMethod gets the data in each cell

Work sheet related operations

There are several ways to get a worksheet

# by index
sheet1 = wb.sheets()[0]
sheet1 = wb.sheet_by_index(0)

# by name
sheet1 = wb.sheet_by_name('test')

Get all table names
# sheet_names = wb.sheet_names()
Copy the code

Gets all data for a row or column

Get all data in the row, return a list
tabs = sheet1.row_values(rowx=0, start_colx=0, end_colx=None)
# return how much data there is in a row
len_value = sheet1.row_len(rowx=0)
Copy the code

The three arguments to row_values are the row number, the starting column, and the ending column, where the ending column None means that all data is retrieved from the beginning column to the end column

Similarly, get the data for a column

cols = sheet1.col_values(colx=0, start_rowx=0, end_rowx=None)
Copy the code

Processing time data

The time data is special and is obtained directly through the cell_value method above. We need to convert it to a timestamp first, and then format it to the format we want.

For example, you want to obtain the time data in the format of 2019/8/13 20:46:35 in an Excel table

# Fetch time
time_value = sheet1.cell_value(3, 0)

Get the timestamp
time_stamp = int(xlrd.xldate.xldate_as_datetime(time_value, 0).timestamp())
print(time_stamp)

# format date
time_rel = time.strftime("%Y/%m/%d", time.localtime(time_stamp))
print(time_rel)
Copy the code

Basically, there are three steps:

  1. throughcell_valueMethod to get the time value
  2. Then throughxldate_as_datetimeMethod to get the timestamp
  3. And then format it

conclusion

Excel files are a common scenario when dealing with data in Python. With XLWT and XLRD, you can import and export Excel data very quickly. You can bookmark this article for future reference when dealing with Excel files.


Welcome to pay attention to my public number to view more wonderful articles!Copy the code