Background — Ask a good question
In the development process, we often encounter the situation of Excel export, especially in the enterprise development, involving customer information, financial statements, market analysis, etc., there are many scenarios. Most of the development process involves writing a separate set of code for each export. As more and more exports came in, I wondered: Is there something universal enough that we can implement Excel exports without having to write so much code?
With this question in mind, I started my road of “ExcelUtil”. During this process, I mainly contacted easyPOi, but I was still not satisfied. Because EasyPOI does what most Java libraries do: write configurations based on fields. This is not to say that this is not a good idea. There are many libraries that do this, such as Fastjson, Jackson, etc., which annotate a field to describe its information or function. But for Excel exports, I’ve always felt there’s a more general approach.
After a period of exploration and exploration, I found inspiration in the table label of the front end, and I think this way is very good and very good. The table tag itself contains a lot of descriptive information, such as rows, columns, merged rows, and merged columns, which are “strikingly similar” to Excel sheets. Coupled with the development of the three front-end frameworks in recent years, In particular, the way the Angular and Vue frameworks customize attributes on tags further inspired me in writing ExcelUtil.
Introduction to the
“ExcelUtil” – same as RunnerUtil (GitHub) – was written in May/June of this year and has been rewritten recently. .
ExcelUtil defines its scope according to the hierarchy of Excel files, sheet pages, row rows and cell cells. Each scope can be customized to a certain extent, etc. The scope does not affect each other. This is consistent with the scope structure of Java, JavaScript, and other languages.
The difference is that ExcelUtil is used much more often than RunnerUtil. The original idea behind the ExcelUtil export was the Java ScriptEngine. However, the built-in scripting engine is so inefficient that it freezes when the amount of data is slightly (not too large). However, RunnerUtil is fully functional and performs well, running all sorts of complex Java string formulas, and can be used entirely on its own.
Used to introduce
-
The first thing you need to prepare before using ExcelUtil is data. There is no special format requirement for data. It can be any Java type data, such as Collection, Iterable, or Iterator. Can be used for large Excel export, although later failed, but still very grateful to the interviewer! , Map, array, POJO, Number, etc.
-
The second step is to do “annotation programming” on the way to generate Workbook locations — yes, Java annotations are powerful enough to be used both internally and as a “programming language” within Java (which is to say, writing a simple parser).
// Do declarative annotation programming on the method where you export
// First declare that this is an Excel, using type to specify XLS or XLSX
@TableExcel(type = TableExcel.Type.XLS, value = {
/* * value contains information about all sheet pages * from sheet down, each sheet can be checked, looped, etc. * sheetName specifies the sheet name. See RunnerUtil * for more information, because everything is parsed with RunnerUtil and needs to conform to its format */
@TableSheet(sheetName = "' Personal Information '", value = {
/* * declares an array named NAMES here, which is used as the title */
@TableRow(var = "Names = {' serial number ',' Name ',' Gender ',' age ',' phone ',' home address ',' Remarks '}", value = {
/* * Iteration is used here, iterating over names declared on row * this iteration will generate the number and content of cells */ for names
@TableCell(var = "name:names", value = name)
}),
/* * In is supported in homage to the source of inspiration (front end) * But in is not the keyword. Can still be used as a normal variable * except that in has at least one space on each end * Iterable data types more on this in a moment */
@TableRow(var = "($rowData, index) in collect", value = {
@TableCell("index + 1"), / / serial number
@TableCell("$rowData.name"), / / name
@TableCell("$rowData.sex"), / / gender
@TableCell("$rowData.age"), / / age
@TableCell("$rowData.mobile"), / / phone
@TableCell("$rowData.address"), // Home address
// The cell will be created when index == 0
// Colspan is a merge column
@TableCell(when = "index == 0", rowspan = "data.size()")})})})public Workbook exportExcel(Object data){
/* * After writing the annotations, just call this method to get a Workbook * * Where to call the render method */
return ExcelUtil.render(data);
}
Copy the code
- ExcelUtil.render(data); Iterable data for in (or colon:) in rendering are:
- Number (integer) var = “$item in 10”;
- RunnerUtil iterates over each character in the string, but char is not supported, so the iterated over is a single character string
- Collection, Iterable, List, Set, etc.
- Map, iterating over the value of each key-value pair;
- Pojos, ordinary Java objects that iterate over field names, iterate over field values
- The return value of the expression after when must be of Boolean type
- The colSPAN and RowSPAN expressions must return an int
- Heigit, width, etc. must also be int
Effect:
- Corresponding Excel renderings generated
The performance test
Paste a 10-column Excel performance test table exported by this tool (local environment I7-8700K 16G Win10)
Number of lines (ten thousand lines) | Data Generation Time (ms) | Write to file Time (ms) | Total Time (ms) |
---|---|---|---|
100 | 6182 | 5565 | 11747 |
300 | 14800 | 16693 | 31493 |
500 | 25876 | 27317 | 53193 |
700 | 36121 | 42171 | 78292 |
999 | 53532 | 54745 | 108277 |
4000 | 240453 | 271832 | 512285 |
6000 | 366987 | 423351 | 790338 |
8000 | 528654 | 498490 | 1027144 |
It can be seen from this data that as the amount of data increases, the relationship between time and data is positively correlated, close to a linear relationship. It takes 6s (12s in total) to generate a Workbook for 1 million rows of data, which can meet the time requirements in normal business scenarios.
Other instructions
- When the amount of Excel data exceeds 1.5 million rows, it is not recommended to use XLS format (this data should be different on different machines, the machine can export 1.5 million rows of XLS normally, 1.8 million rows of OOM).
- When the amount of data exceeds 5 million rows (depending on the environment), the type value of TableExcel should be SUPER (type = TableExcel.type.super), which corresponds to the XLSX format. But SUPER is designed to support very large data exports;
- 1.5 million rows and 5 million rows are basically normal business limits.
- Currently, only export is supported, but not import.