Speaking of Excel reading and writing, we can’t help but think of POI, a respected predecessor. After years of changes, although there have been many up-and-comers such as EasyExcel and EasyPOI, POI, the predecessor, still has a place in Excel’s rivers and rivers, and always move towards us, and still shine. To be fair, POI did not achieve satisfactory performance and memory resource consumption due to the constraints of the initial architecture, but it is undeniable that POI is still an excellent Excel toolkit, and its rich and flexible API is also lacking in other toolkits such as EasyPOI.

POI’s operations on Excel are based on a set of hairlike models, the basic structure of which is shown in Figure 1 below. But success is nothing, failure is nothing. This overly detailed model provides more flexibility, but it is also a double-edged sword, limiting POI performance and increasing resource consumption such as memory. If it is a large amount of data of simple Excel report reading and writing, choose EasyExcel, EasyPOI will be a better choice. POI can also be a good choice for custom formatting and rich presentation.

Figure 1 Excel model diagram of POI

As shown in the figure above, POI performs operations on Excel files through four key model objects and their associated apis. They are:

  • A Workbook is a container that contains information about the entire Excel file. It is also the source of POI operations on Excel, including obtaining a Sheet, defining cell styles, defining font styles, and input and output of files.

  • A Sheet, which can be called a worksheet or table, is a table container that contains multiple rows and columns.

  • Row is a collection container for a single Row of data and cells.

  • A Cell, or Cell, is a container for individual data.

In addition to the four key model objects mentioned above, POI also has CellStyle table styles, Font fonts, and other model objects such as XSSFComment. Some objects support only certain versions of Excel, such as XSSFComment, which requires Excel 2007+. With the Excel model of POI, we can expand the operations of Excel in detail below.

Basic operation

The POI must have four basic elements of Workbook, Sheet, Row and Cell in order to perform detailed operations on Excel. So the basic operation is to get and create the four elements and read and write the cell values.

Workbook operation

If we need to read data from an Excel file, we first build a Workbook instance object through the file InputStream InputStream. We need to build different types of Workbooks for different versions of Excel. For example, for Excel 2003, we need a WorkBook of the HSSFWorkbook type. For higher versions, we can use XSSFWorkbook. Here we use XSSFWorkbook as an example:

Workbook book = new XSSFWorkbook(stream);
Copy the code

When we need to write a new Excel file, we can directly instance a Workbook object as follows:

Workbook book = new XSSFWorkbook();
Copy the code

Of course, after writing the data, you also need to export the Workbook data to a file or target output stream. Also, you need to shut down the Workbook after doing this, or you may end up running out of memory. The specific code is as follows:

book.write(output);
book.close();
Copy the code

Sheet operation

Once we have a Workbook instance of Excel, we can move on to the worksheet. When we need to read data from a particular Sheet, we can get the instance object of the Sheet by index position:

book.getSheetAt(0)
Copy the code

Of course, we can also obtain the worksheet name by way of:

book.getSheet(name)
Copy the code

In addition, there is a more secure way to obtain it by traversing:

For (Sheet Sheet: book) {// to do more...... }Copy the code

We need to get the Sheet instance when we read data, and we need to create the Sheet instance when we write a new Excel file. If you want to name your worksheet, you can use the following code:

This book. The createSheet (WorkbookUtil. CreateSafeSheetName (data. The getName (), '_'));Copy the code

Excel has the following rules for naming a Sheet: 1. Do not use the same name. 2. The value cannot contain special characters. For the second by WorkbookUtil. CreateSafeSheetName this tips can help you effectively circumvent the problem.

The Row operation

Row is relatively easy to manipulate compared to other elements. Get Row instance objects by index or traversal when reading data:

int count = sheet.getLastRowNum(); for (int i = sheet.getFirstRowNum(); i <= count; i++) { Row excelRow = sheet.getRow(i); // To do STH. }Copy the code

Create Row instance object by index number when writing:

this.sheet.createRow(row.getIndex());
Copy the code

Cell operation

For us, the most important thing about Cell operations is not to get or create Cell instance objects, but to be able to read or write data. Of course, when we get the Cell instance and read its value, we can’t simply read the data in it. ** Before reading data, it is important to determine the type of cell data and select the appropriate interface for reading. The ** code is as follows:

Cell excelCell = row.getCell(i); private Object getValue(Cell cell){ Object value = null; CellType type = cell.getCellType(); switch(type){ case BOOLEAN:{ value = cell.getBooleanCellValue(); break; } case STRING:{ value = cell.getStringCellValue(); break; } case NUMERIC:{ CellStyle style= cell.getCellStyle(); if(! ObjectUtils.isEmpty(style) && ! StringUtils.isEmpty(style.getDataFormatString())){ String dateFormat = style.getDataFormatString(); if(dateFormat.contains(":")){ value = LocalDateTimeUtils.from(DateUtil.getJavaDate(cell.getNumericCellValue())); } else if(dateFormat.toLowerCase().contains("y")){ value = LocalDateUtils.from(DateUtil.getJavaDate(cell.getNumericCellValue())); } else{ value = cell.getNumericCellValue(); } } else{ value = cell.getNumericCellValue(); } break; } case FORMULA: case ERROR: case BLANK: case _NONE:{ break; } default:break; } return value; }Copy the code

Especially when reading time type data, you need to be careful. POI usually identifies time-typed data in Excel as NUMERIC, so it’s easy to confuse it with a NUMERIC value when reading it. The only way to remedy this is through the data format part of it. In addition, dateutil.getJavadate () converts Excel values to JavaDate types when retrieving time values. If you need to convert to another type, you need to write your own logic.

Writing cell data is easier than reading:

this.cell = this.row.createCell(data.getIndex()); private void writeValue(Object value) { if (ObjectUtils.isEmpty(value)) { this.cell.setBlank(); } else if (value instanceof String) { this.cell.setCellValue((String) value); } else if (value instanceof Boolean) { this.cell.setCellValue((boolean) value); } else if (value instanceof Double || value instanceof Float) { if (! Double.isNaN((double) value) && ! Double.isInfinite((double) value)) { this.cell.setCellValue( BigDecimal.valueOf((double) value).setScale(2, RoundingMode.HALF_UP).doubleValue()); } } else if (value instanceof Integer || value instanceof Short) { this.cell.setCellValue((int) value); } else if (value instanceof Long) { this.cell.setCellValue((long) value); } else if (value instanceof Date) { this.cell.setCellValue((Date) value); } else if (value instanceof LocalDate) { this.cell.setCellValue(LocalDateUtils.to((LocalDate) value)); } else if (value instanceof LocalDateTime) { this.cell.setCellValue(LocalDateTimeUtils.to((LocalDateTime) value)); }}Copy the code

If you need to customize the time format, this technique will be covered below and not explained here.

SAO operation

In the basic operations above, we talked about the general operations of the four basic elements of Workbook, Sheet, Row and Cell. These operations are sufficient for ordinary daily reading and writing needs. However, these operations do not meet some personalized needs, such as merging cells, adding background colors, and even adding annotations. These operations are explained here.

Grid lines

By default, Excel worksheets have dark gray dividing lines between the columns and columns, called grid lines. Grid lines make tables easier to read. However, grid lines can also affect aesthetics, so many personalized Excel reports will hide grid lines, as shown in the following figure:

Figure 2 Excel grid lines and reports

So how do you hide grid lines? It’s really simple:

this.sheet.setDisplayGridlines(false);
Copy the code

Freeze panes

The freeze pane is used to specify the part of the area where certain rows and columns do not scroll along with the scroll bar. When we have many rows and columns of data, it is often difficult to remember which row and which item the data currently belongs to when scrolling. Frozen Windows lock header rows and columns, making it easy to see which row and which entry the data belongs to as you scroll.

Figure 3 Freezing window

So how does a POI freeze Windows? Also relatively simple as follows:

this.sheet.createFreezePane(data.getFreezen().x, data.getFreezen().y);
Copy the code

A frozen region containing the upper and left regions of the cell can be generated by specifying the lower right cell of the frozen window.

Sheet hiding

Most of the time, worksheets don’t necessarily need to be displayed. For example, when we use a custom Excel template file, it usually contains worksheets that hold configuration items and raw data for the report. These worksheets provide data support only and are not intended for presentation. This is when we usually talk about hiding it.

Figure 4 Hides the worksheet

This is also a relatively simple operation, but requires the Workbook and Sheet to operate together:

this.book.setSheetHidden(this.book.getSheetIndex(this.sheet), true);
Copy the code

Automatic formula validation

There is a special class of cells within cells. It does not have a direct value, but is computed by a formula that references data from other cells and changes as the value of the referenced cell changes. In theory, if cells are configured with formulas, we don’t need to manage them. But the reality is pretty bleak. ** When we write through POI to generate Excel file, we open Excel and find that the cells of the configuration formula do not change with the data written. ** is not hard to imagine! But that’s the reality.

Figure 5. Cell with formula

So how do you make this formula work? The solution is also surprisingly simple, and poIS already provide ready-made interfaces. After filling in the data, call the following code:

workBook.setForceFormulaRecalculation(true);
Copy the code

Cell style

When we need to do personalized report customization, it is inevitable to use the cell style, such as cell background color, font and so on. This is where CellStyle, Font and other objects are needed. Let’s break it down a little bit. On a side note, CellStyle and Font are both globally reusable objects, so it’s best to create an instance cache pool and cache these objects for better memory usage.

The background color

Setting the background color is the basic requirement of personalized report. So how do you add a background color to a cell? First we must have the CellStyle instance object, then configure the background color in this instance object, and finally set the previous style to the Cell, code as follows:

CellStyle style = this.book.createCellStyle();
	
style.setFillPattern(FillPatternType.SOLID_FOREGROUND);
style.setFillForegroundColor(data.getBackground());
	
this.cell.setCellStyle(style);
Copy the code

Including setFillForegroundColor color value can be set by IndexedColors many Excel’s built-in color, for example BLUE can IndexedColors. BLUE. GetIndex () specified.

alignment

Like the background color, alignment is a style of cell, which also requires a CellStyle instance object. Alignment can be divided into vertical alignment and horizontal alignment.

Figure 6 Excel alignment

So how to do this with POI? Let’s look at the code:

style.setAlignment(HorizontalAlignment.CENTER);
style.setVerticalAlignment(VerticalAlignment.CENTER);
Copy the code

The code above is horizontally centered and vertically centered, respectively. Adjusting the HorizontalAlignment and VerticalAlignment parameters can also achieve left and right alignment

A border

Borders are another common personalization of cells. Typically in custom reports, we add a thick border to the title and form header to highlight certain content blocks.

Figure 7 Setting the border

POI also has a set of border processing, of course, still can not leave CellStyle old buddy:

style.setBorderLeft(BorderStyle.THIN);
style.setLeftBorderColor(IndexedColors.BLACK.getIndex());

style.setBorderRight(BorderStyle.THIN);
style.setRightBorderColor(IndexedColors.BLACK.getIndex());

style.setBorderTop(BorderStyle.THIN);
style.setTopBorderColor(IndexedColors.BLACK.getIndex());
 
style.setBorderBottom(BorderStyle.THIN);
style.setBottomBorderColor(IndexedColors.BLACK.getIndex());
Copy the code

The date format

In the chapter on cell data writing, I also said that for date formats, we can customize date formats. POI uses CellStyle to customize the date format:

DataFormat format = this.book.createDataFormat();
style.setDataFormat(format.getFormat("yyyy-MM-dd"));
Copy the code

First we need to build the DataFormat from the Workbook object, and then we need to build a stereotyped date format from the DataFormat. Of course, we can also build other date formats by tweaking the parameters.

The font

The cell font is also one of the Settings that we frequently change in custom reports and use relatively frequently. Like borders, they are often used to highlight parts of content, such as headings that need to be enlarged and bold.

Figure 8. Various fonts in Excel

What about font POI? First we need to build a Font object with the Workbook instance object. The Font instance object basically contains all of the Font Settings such as color, size, thickness, and so on. You then need to configure Font to CellStyle, and finally apply it to the cell with CellStyle. The specific code is as follows:

Font font = this.book.createFont(); font.setColor(style.getColor()); / / color of the font. SetFontHeight (style.css. GetSize ()); Font. SetBold (true); / / in bold font. SetItalic (true); / / italic font. SetUnderline (font. U_SINGLE); // Underline style.setfont (font);Copy the code

notation

A comment is an area of text floating above a cell, usually used for notes, notes, or notes. For example, calibration notes to a paragraph of the article

Figure 9 notation

For the XSSFWorkbook type, you can refer to the following code:

XSSFDrawing p = (XSSFDrawing) this.sheet.createDrawingPatriarch();
XSSFComment comment = p.createCellComment(new 		XSSFClientAnchor(0, 0, 0, 0, firstCell, firstRow, lastCell, lastRow));
comment.setString(new XSSFRichTextString(data.getComments()));
this.cell.setCellComment(comment);
Copy the code

For Excel, the first step in creating an annotation is to get a drawing area, XSSFDrawing. XSSFDrawing determines the anchor point of XSSFComment and the display range roughly equivalent to how many cells. Set the content by annotating the instance object XSSFComment, and finally bind it to the corresponding cell.

Merged cell

For a lot of custom table headers, we usually need to sort, and in this case we need to use merged cells, which represent which rows or columns of data are a set of data.

First we need to define the range of cells we want to merge with CellRangeAddress:

CellRangeAddress address = new CellRangeAddress(range.getFirstCell().y, range.getLastCell().y,
range.getFirstCell().x, range.getLastCell().x);
Copy the code

In addition, we can set borders and other operations on the cells that need to be merged:

RegionUtil.setBorderLeft(BorderStyle.THIN, address, this.sheet);
          RegionUtil.setLeftBorderColor(IndexedColors.BLACK.getIndex(), address, this.sheet);
Copy the code

Finally, let the cell merge work:

this.sheet.addMergedRegion(address);
Copy the code

Note that merging cells not only does not work if there are less than or equal to 1 cell, but also throws an exception, requiring additional detection.

Figure 10 merging cells

Table style

Excel also has built-in table styles, so strictly speaking a Sheet is just a worksheet, with a table-like model definition inside the Sheet. It is XSSFTable.

Figure 11 table style

The setting of the table style is somewhat similar to the generation of comments. The code is as follows:

if (this.sheet instanceof XSSFSheet) {
	XSSFSheet sheet = (XSSFSheet) this.sheet;

	Row firstRow = this.sheet.getRow(tableSheet.getFirstRow());
	Cell firstCell = firstRow.getCell(tableSheet.getFirstCell());

	Row lastRow = this.sheet.getRow(tableSheet.getLastRow());
	Cell lastCell = lastRow.getCell(tableSheet.getLastCell());

	AreaReference reference = new AreaReference(new CellReference(firstCell), new CellReference(lastCell),
					SpreadsheetVersion.EXCEL2007);
	XSSFTable table = sheet.createTable(reference);
	table.setDisplayName(tableSheet.getTableName());
	table.setStyleName(XSSFBuiltinTableStyle.TableStyleMedium2.name());
}
Copy the code

Template export

Although POI can do a lot of manipulation, there are still many advanced Excel functions that POI cannot do, such as chart functions such as bar charts. But that’s not to say it can’t be done. Excel template files allow POI to generate very advanced reports.

First we need to build an Excel template file that contains two types of Sheets. A class of sheets contains raw data and provides data support. The other kind of sheet contains bar charts and corresponding statistical structures. Data from type 2 sheets usually comes from type 1 sheets. This way we can automatically generate very nice reports in the second type sheet just by writing data to the first type sheet. The code is as simple as assigning the Excel template file input stream to the Workbook as if it were reading a file:

Workbook book = new XSSFWorkbook(stream);
Copy the code

Case of operation

The operations and apis provided by the POI are detailed and cumbersome in consideration of universality. However, in the product, we usually do not need such complicated and meticulous operations, only a part of the operation can be. So the best way to use POI is to build a simplified shell around it.

Model mapping

Mapping through simplified models is one way. The core of this approach is to use the POI as a shadow system, and then define a simplified data model of the Workbook, Sheet, Row, and Cell elements and the corresponding mapping shell API to interact with the POI. Doing so simplifies the operation of the POI while retaining some flexibility.

For example, for Row, we define a DataRow mapping data model:

public class DataRow { protected int index; protected List<DataCell> cells = new ArrayList<>(); * @return */ public int getIndex() {return index; } public void setIndex(int index) { this.index = index; } @datacell */ public List<DataCell> getCells() {return cells; } public void setCells(List<DataCell> cells) { this.cells = cells; } @return */ public Boolean isEmpty(){Boolean result = objectutils.isempty (this.cells); if(! result){ for(DataCell cell: cells){ result = cell.isEmpty() && result; } } return result; }}Copy the code

Then remote POI operation via shell API:

class DefaultExcelRowWriter implements ExcelRowWriter { private Sheet sheet; private ExcelContext context; private Row row; public DefaultExcelRowWriter(Sheet sheet, ExcelContext context) { this.sheet = sheet; this.context = context; } @param index */ private void addEmptyCell(int index) {this.row.createcell (index, cellType.blank); } @Override public void write(DataRow row) throws Exception { this.row = this.sheet.createRow(row.getIndex()); int cellIndex = 0; for (DataCell cell : row.getCells()) { for (; cellIndex < cell.getIndex(); cellIndex++) { this.addEmptyCell(cellIndex); } ExcelCellWriter cellWriter = new DefaultExcelCellWriter(this.sheet, this.row, this.context); cellWriter.write(cell); cellIndex = cell.getIndex() + 1; }}}Copy the code

annotations

Although model mapping and shell apis can simplify POI operations to some extent, they still can’t avoid a lot of memory waste and tedious operation. So for simple Excel reports or custom reports, annotations are probably the best solution right now. Annotations also use the principles of mapping and shells. It can be said that annotations are a different approach to implementation. But it can be simpler. Annotations are not expanded here, refer to EasyExcel and EasyPOI if necessary. These are both excellent works of the craft.

conclusion

Indeed, as time goes by, POI is getting old and getting a little senile. And it’s still an excellent tool. POI actually have a lot of SAO operation, space is limited no longer tired. I hope it can be rejuvenated with the progress of The Times.