Working with Word in Java: merging documents, replacing placeholders, inserting rich text, and generating watermarks

Article overview

Introduction to the POI library and points to note
Merge multiple Word documents
Replace placeholders in documents, including paragraph placeholders, table placeholders
Insert rich text into Word, with caveats
Generate a watermark for Word
Links
Acknowledgements

Introduction to the POI library and points to note

This Java utility library for working with Word is built on Apache POI 4.1.0. The official POI API documentation can be read with the browser's built-in full-page translation, which is very convenient. Note that everything in this article operates on files with the .docx suffix, that is, the Word 2007 format. If you need to work with Word 2003 (.doc) files, convert them first.
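Since the utilities only accept .docx input, it can help to check what kind of file you have before handing it to POI. The following is a minimal sketch using only the JDK, not code from the project: the class name WordFormatSketch and its methods are made up for illustration. It relies on two facts: a .docx file is a ZIP container (its first bytes are 50 4B 03 04), and a .doc file is an OLE2 compound file (its first bytes are D0 CF 11 E0). A ZIP signature alone does not prove the file is a Word document, only that it is not the old binary format.

```java
class WordFormatSketch {
    /** Container type guessed from the first bytes of a file (hypothetical enum). */
    enum Container { ZIP, OLE2, UNKNOWN }

    // ZIP local file header signature "PK\003\004"; .docx files are ZIP archives.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};
    // First four bytes of the OLE2 compound file signature used by .doc files.
    private static final byte[] OLE2_MAGIC = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0};

    /** Classifies a file by its leading bytes; pass at least the first four bytes. */
    static Container detect(byte[] head) {
        if (startsWith(head, ZIP_MAGIC)) {
            return Container.ZIP;
        }
        if (startsWith(head, OLE2_MAGIC)) {
            return Container.OLE2;
        }
        return Container.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] magic) {
        if (data == null || data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // In practice the bytes would come from the start of the uploaded file.
        System.out.println(detect(new byte[]{0x50, 0x4B, 0x03, 0x04})); // prints ZIP
    }
}
```

A file detected as OLE2 would need converting to .docx before any of the utilities below are applied to it.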

Later updates will cover writing table data read from Excel into Word, and reading a template table from another Word document into the current one. Every feature in the project code comes with a test class: pull the code, change the file paths to your own, and run it directly.

Now to the main topic. The article shows only the key code; please pull the complete code from GitHub through the links at the end. If you find it helpful, please give the project a star on GitHub. Writing code is not easy, so please credit the source when reposting. Thank you.

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.corey</groupId>
    <artifactId>wordtools</artifactId>
    <version>1.0-SNAPSHOT</version>

    <dependencies>
        <!-- POI dependencies -->
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi</artifactId>
            <version>4.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-scratchpad</artifactId>
            <version>4.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>4.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml-schemas</artifactId>
            <version>4.1.0</version>
        </dependency>
        <!-- end of POI dependencies -->

        <!-- docx4j and jsoup -->
        <dependency>
            <groupId>org.docx4j</groupId>
            <artifactId>docx4j</artifactId>
            <version>3.3.6</version>
        </dependency>
        <dependency>
            <groupId>org.docx4j</groupId>
            <artifactId>docx4j-ImportXHTML</artifactId>
            <version>3.3.6</version>
        </dependency>
        <dependency>
            <groupId>org.docx4j</groupId>
            <artifactId>docx4j-export-fo</artifactId>
            <version>3.3.6</version>
        </dependency>
        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
            <version>1.11.2</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.springframework/spring-core -->
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-core</artifactId>
            <version>5.2.1.RELEASE</version>
        </dependency>
        <dependency>
            <groupId>commons-io</groupId>
            <artifactId>commons-io</artifactId>
            <version>2.5</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/javax.servlet/javax.servlet-api -->
        <dependency>
            <groupId>javax.servlet</groupId>
            <artifactId>javax.servlet-api</artifactId>
            <version>4.0.1</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>
</project>

Merge multiple Word documents

The basic idea of merging documents with POI is that a Word (.docx) document is, at its core, XML. The body of each source document is serialized to XML, the xmlns namespace declarations of all bodies are merged and deduplicated, the child elements are spliced together, and the result is wrapped in a fixed root tag to form a new XML body that is written out as a new Word file. See the project's magerword directory for more code.

package magerword;

import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTBody;
import org.springframework.util.ObjectUtils;

import java.io.*;
import java.util.*;

/**
 * @description: Merges Word documents
 * @author: Corey
 * @create: 2020-04-29 19:04
 **/
public class MagerUtil {

    /**
     * Merges all the given Word documents into the first one.
     *
     * @param filepaths paths of the documents to merge, in order
     * @throws Exception
     */
    public static void mergeDoc(String... filepaths) throws Exception {
        List<CTBody> ctBodyList = new ArrayList<>();
        List<XWPFDocument> srcDocuments = new ArrayList<>();
        for (String filepath : filepaths) {
            InputStream in = null;
            OPCPackage srcPackage = null;
            try {
                in = new FileInputStream(filepath);
                srcPackage = OPCPackage.open(in);
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                closeStream(in);
            }
            XWPFDocument srcDocument = new XWPFDocument(srcPackage);
            CTBody srcBody = srcDocument.getDocument().getBody();
            ctBodyList.add(srcBody);
            srcDocuments.add(srcDocument);
        }
        if (!ObjectUtils.isEmpty(ctBodyList)) {
            appendBody(ctBodyList);
            // Output path: replace it with a path on your own machine.
            try (OutputStream dest = new FileOutputStream("/Users/Corey/Desktop/temp/wordtools/merge document 3.docx")) {
                srcDocuments.get(0).write(dest);
            }
        }
    }

    /**
     * Splices all bodies together and stores the result in the first body.
     *
     * @param ctBodyList bodies of the source documents
     * @throws Exception
     */
    private static void appendBody(List<CTBody> ctBodyList) throws Exception {
        // All xmlns declarations
        StringBuilder allXmlns = new StringBuilder();
        // All child elements
        StringBuilder allElement = new StringBuilder();
        ctBodyList.forEach(ct -> {
            // The full XML string of each document body
            String appentString = ct.xmlText();
            // Everything inside the opening tag, starting at the first xmlns
            allXmlns.append(appentString.substring(appentString.indexOf("xmlns"), appentString.indexOf(">")));
            // Everything between the opening and the closing tag
            allElement.append(appentString.substring(appentString.indexOf(">") + 1, appentString.lastIndexOf("</")));
        });
        // Deduplicate the xmlns declarations
        String distinctPrefix = distinctXmlns(allXmlns.toString());
        CTBody makeBody = CTBody.Factory.parse(distinctPrefix + allElement.toString() + "</xml-fragment>");
        ctBodyList.get(0).set(makeBody);
    }

    /**
     * Deduplicates the merged xmlns declarations and builds the opening tag.
     *
     * @param prefix all xmlns declarations, concatenated
     * @return the opening xml-fragment tag with one declaration per prefix
     */
    public static String distinctXmlns(String prefix) {
        Set<String> declarations = new HashSet<>();
        int start = prefix.indexOf("xmlns");
        while (start >= 0) {
            int end = prefix.indexOf("xmlns", start + 1);
            // The last declaration has no "xmlns" after it, so take the rest of the string.
            String declaration = end > 0 ? prefix.substring(start, end) : prefix.substring(start);
            declarations.add(declaration.trim());
            start = end;
        }
        StringBuilder sb = new StringBuilder("<xml-fragment");
        for (Map.Entry<String, String> entry : distinctXmlns(declarations).entrySet()) {
            // Separate attributes with a space so the tag stays well-formed.
            sb.append(' ').append(entry.getKey()).append('=').append(entry.getValue());
        }
        sb.append(">");
        return sb.toString();
    }

    /**
     * The same xmlns prefix may appear with different URIs; keep one entry per prefix.
     *
     * @param set declarations of the form prefix="uri"
     * @return map from prefix to quoted URI
     */
    public static Map<String, String> distinctXmlns(Set<String> set) {
        Map<String, String> map = new HashMap<>();
        for (String xmlns : set) {
            map.put(xmlns.substring(0, xmlns.indexOf("=")), xmlns.substring(xmlns.indexOf("=") + 1));
        }
        return map;
    }

    /**
     * Closes the streams.
     * This could live in a shared utility class, and the parameter type could be Closeable.
     *
     * @param inputStreams streams to close
     */
    public static void closeStream(InputStream... inputStreams) {
        for (InputStream i : inputStreams) {
            if (i != null) {
                try {
                    i.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
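To see the namespace-merging step in isolation, here is a minimal sketch that works on plain strings and needs only the JDK. It is an illustration under assumptions, not the project's code: the class name XmlnsMergeSketch, the regular expression, and the sample namespaces (urn:w, urn:r) are invented, and unlike the utility above it keeps the first URI seen for each prefix.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class XmlnsMergeSketch {
    // Matches declarations such as xmlns:w="..." or a default xmlns="...".
    private static final Pattern XMLNS = Pattern.compile("(xmlns(?::\\w+)?)=\"([^\"]*)\"");

    /** Merges fragments shaped like <xml-fragment xmlns:a="...">children</xml-fragment>. */
    static String merge(String... fragments) {
        Map<String, String> namespaces = new LinkedHashMap<>();
        StringBuilder children = new StringBuilder();
        for (String fragment : fragments) {
            int headEnd = fragment.indexOf('>');
            Matcher m = XMLNS.matcher(fragment.substring(0, headEnd));
            while (m.find()) {
                // Keep one declaration per prefix; the first one seen wins in this sketch.
                namespaces.putIfAbsent(m.group(1), m.group(2));
            }
            // Everything between the opening tag and the final closing tag.
            children.append(fragment, headEnd + 1, fragment.lastIndexOf("</"));
        }
        StringBuilder out = new StringBuilder("<xml-fragment");
        namespaces.forEach((prefix, uri) ->
                out.append(' ').append(prefix).append("=\"").append(uri).append('"'));
        return out.append('>').append(children).append("</xml-fragment>").toString();
    }

    public static void main(String[] args) {
        String first = "<xml-fragment xmlns:w=\"urn:w\" xmlns:r=\"urn:r\"><w:p>one</w:p></xml-fragment>";
        String second = "<xml-fragment xmlns:w=\"urn:w\"><w:p>two</w:p></xml-fragment>";
        System.out.println(merge(first, second));
    }
}
```

The merged string declares each prefix once and contains the child elements of both inputs in order, which is the shape that CTBody.Factory.parse receives in the utility above.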

Replace placeholders in documents, including paragraph placeholders, table placeholders

The idea of replacing placeholders is to first traverse all paragraphs and tables in the document and then match the placeholders with the parameters you need to replace. In Word, paragraphs are XWPFParagraph objects and tables are XWPFTable objects. See the project’s replacemark directory for more code.

package replacemark;

import org.apache.poi.xwpf.usermodel.*;
import org.springframework.util.StringUtils;

import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Replaces paragraph and table placeholders in documents
 * @author Corey
 * @version 1.0
 * @date 2020/5/9 9:14 AM
 */
public class ReplaceUtil {

    /**
     * Replaces placeholders in all paragraphs of the document.
     *
     * @param doc    document to process
     * @param params replacements: key = placeholder, value = actual value
     */
    public static void replaceInPara(XWPFDocument doc, Map<String, Object> params) {
        Iterator<XWPFParagraph> iterator = doc.getParagraphsIterator();
        XWPFParagraph para;
        while (iterator.hasNext()) {
            para = iterator.next();
            if (!StringUtils.isEmpty(para.getParagraphText())) {
                replaceInPara(para, params);
            }
        }
    }

    /**
     * Replaces placeholders in a single paragraph.
     *
     * @param para   paragraph to process
     * @param params replacements: key = placeholder, value = actual value
     */
    public static void replaceInPara(XWPFParagraph para, Map<String, Object> params) {
        // Text of the current paragraph
        String sourceText = para.getParagraphText();
        // Whether anything was replaced
        boolean replace = false;
        for (Map.Entry<String, Object> entry : params.entrySet()) {
            String key = entry.getKey();
            if (sourceText.indexOf(key) != -1) {
                Object value = entry.getValue();
                if (value instanceof String) {
                    // Replace the text placeholder
                    sourceText = sourceText.replace(key, value.toString());
                    replace = true;
                }
            }
        }
        if (replace) {
            // Remove all runs of the paragraph, last to first, then write the new text as one run
            List<XWPFRun> runList = para.getRuns();
            for (int i = runList.size() - 1; i >= 0; i--) {
                para.removeRun(i);
            }
            para.createRun().setText(sourceText);
        }
    }

    /**
     * Replaces placeholders in all tables of the document.
     *
     * @param doc    document to process
     * @param params replacements: key = placeholder, value = actual value
     */
    public static void replaceTable(XWPFDocument doc, Map<String, Object> params) {
        Iterator<XWPFTable> iterator = doc.getTablesIterator();
        XWPFTable table;
        List<XWPFTableRow> rows;
        List<XWPFTableCell> cells;
        List<XWPFParagraph> paras;
        while (iterator.hasNext()) {
            table = iterator.next();
            if (table.getRows().size() > 1) {
                // Only process tables whose text contains a ${...} placeholder
                if (matcher(table.getText()).find()) {
                    rows = table.getRows();
                    for (XWPFTableRow row : rows) {
                        cells = row.getTableCells();
                        for (XWPFTableCell cell : cells) {
                            paras = cell.getParagraphs();
                            for (XWPFParagraph para : paras) {
                                replaceInPara(para, params);
                            }
                        }
                    }
                }
            }
        }
    }

    /**
     * Matches ${...} placeholders in a string.
     *
     * @param str text to search
     * @return matcher over the text
     */
    private static Matcher matcher(String str) {
        Pattern pattern = Pattern.compile("\\$\\{(.+?)\\}", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(str);
        return matcher;
    }
}
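The matching and replacement logic can be tried without any Word file. The sketch below uses only the JDK and is an illustration, not the project's code: the class name PlaceholderSketch and the method fill are hypothetical, and it assumes the keys of the parameter map are written in full placeholder form such as ${name}. As in the utility above, only String values are substituted and anything else is left in place.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderSketch {
    // Same expression as the matcher() helper: a non-greedy ${...} placeholder.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(.+?)\\}");

    /** Replaces every placeholder that has a String value in params. */
    static String fill(String text, Map<String, Object> params) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // group(0) is the whole placeholder, including "${" and "}".
            Object value = params.get(m.group(0));
            String replacement = value instanceof String ? (String) value : m.group(0);
            // quoteReplacement keeps "$" and "\" in the value from being read as group references.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put("${name}", "Corey");
        System.out.println(fill("Dear ${name}, your order ${orderNo} has shipped.", params));
    }
}
```

Because the paragraph-level method rebuilds the paragraph as a single run, the same string-level replacement applies even when Word has split a placeholder across several runs; the trade-off is that formatting applied to individual runs is not preserved.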

Insert rich text into Word, with caveats

The idea behind inserting rich text into Word: rich text is itself an HTML string. You could write that string into Word directly as a paragraph, but the HTML styling would be lost, so the HTML tags have to be recognized and replaced with the corresponding Word elements. This is the difficult part, and it calls for a set of style-conversion utilities. In my current project I only convert h1, h2, h3, paragraphs, tables, and images whose img src is a URL (base64 images are too large to recognize inside rich text). These converters could also be designed with the chain-of-responsibility pattern, which I have not done yet. More code is in the project's insertword directory.

package insertword;

import org.apache.poi.util.Units;
import org.apache.poi.xwpf.usermodel.*;
import org.apache.xmlbeans.XmlCursor;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.springframework.util.ObjectUtils;
import org.springframework.util.StringUtils;

import java.io.*;

/**
 * Html tools.
 * ElementEnum, CommonConStant, ParagraphStyleUtil and TitleFontEnum are project classes (see the repository).
 * @author Corey
 * @version 1.0
 * @date 2020/5/5 9:36 PM
 */
public class HtmlUtil {

    /**
     * Marks every recognized element in the document with a common attribute.
     *
     * @param document parsed HTML document
     */
    public static void addElement(Document document) {
        if (ObjectUtils.isEmpty(document)) {
            throw new NullPointerException("Adding elements to empty objects is not allowed");
        }
        Elements elements = document.getAllElements();
        for (Element e : elements) {
            String attrName = ElementEnum.getValueByCode(e.tag().getName());
            if (!StringUtils.isEmpty(attrName)) {
                e.attr(CommonConStant.COMMONATTR, attrName);
            }
        }
    }

    /**
     * Writes the contents of rich text to Word.
     * Rich text styles are too varied to enumerate, so only some tags are handled.
     *
     * @param ritchText rich text (HTML string)
     * @param doc       target document
     * @param paragraph paragraph that marks the insert position
     */
    public static void resolveHtml(String ritchText, XWPFDocument doc, XWPFParagraph paragraph) {
        // Note: the second argument of parseBodyFragment is the base URI.
        Document document = Jsoup.parseBodyFragment(ritchText, "UTF-8");
        try {
            // Mark the recognized elements
            HtmlUtil.addElement(document);
            Elements elements = document.select("[" + CommonConStant.COMMONATTR + "]");
            for (Element em : elements) {
                XmlCursor xmlCursor = paragraph.getCTP().newCursor();
                switch (em.attr(CommonConStant.COMMONATTR)) {
                    case "title":
                        break;
                    case "subtitle":
                        break;
                    case "imgurl":
                        String url = em.attr("src");
                        InputStream inputStream = new FileInputStream(url);
                        XWPFParagraph imgurlparagraph = doc.insertNewParagraph(xmlCursor);
                        // Center the image
                        ParagraphStyleUtil.setImageCenter(imgurlparagraph);
                        imgurlparagraph.createRun().addPicture(inputStream, XWPFDocument.PICTURE_TYPE_PNG, "picture.jpeg", Units.toEMU(200), Units.toEMU(200));
                        closeStream(inputStream);
                        break;
                    case "imgbase64":
                        break;
                    case "table":
                        XWPFTable xwpfTable = doc.insertNewTbl(xmlCursor);
                        addTable(xwpfTable, em);
                        // Center the table
                        ParagraphStyleUtil.setTableLocation(xwpfTable, "center");
                        // Center the cell contents
                        ParagraphStyleUtil.setCellLocation(xwpfTable, "CENTER", "center");
                        break;
                    case "h1":
                        XWPFParagraph h1paragraph1 = doc.insertNewParagraph(xmlCursor);
                        XWPFRun xwpfRun_1 = h1paragraph1.createRun();
                        xwpfRun_1.setText(em.text());
                        // Set the font
                        ParagraphStyleUtil.setTitle(xwpfRun_1, TitleFontEnum.H1.getTitle());
                        break;
                    case "h2":
                        XWPFParagraph h2paragraph = doc.insertNewParagraph(xmlCursor);
                        XWPFRun xwpfRun_2 = h2paragraph.createRun();
                        xwpfRun_2.setText(em.text());
                        // Set the font
                        ParagraphStyleUtil.setTitle(xwpfRun_2, TitleFontEnum.H2.getTitle());
                        break;
                    case "h3":
                        XWPFParagraph h3paragraph = doc.insertNewParagraph(xmlCursor);
                        XWPFRun xwpfRun_3 = h3paragraph.createRun();
                        xwpfRun_3.setText(em.text());
                        // Set the font
                        ParagraphStyleUtil.setTitle(xwpfRun_3, TitleFontEnum.H3.getTitle());
                        break;
                    case "paragraph":
                        XWPFParagraph paragraphd = doc.insertNewParagraph(xmlCursor);
                        // Indent the paragraph by four spaces
                        paragraphd.createRun().setText("    " + em.text());
                        break;
                    default:
                        break;
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Reads the contents of a TXT file.
     *
     * @param file the file to read
     * @return the file contents
     */
    public static String txt2String(File file) {
        StringBuilder result = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String s = null;
            // Read one line at a time
            while ((s = br.readLine()) != null) {
                result.append(System.lineSeparator() + s);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return result.toString();
    }

    /**
     * Converts an HTML table into the Word table, row by row.
     */
    private static void addTable(XWPFTable xwpfTable, Element table) {
        Elements trs = table.getElementsByTag("tr");
        // Row counter
        int rownum = 0;
        for (Element tr : trs) {
            addTableTr(xwpfTable, tr, rownum);
            rownum++;
        }
    }

    /**
     * Converts one HTML table row (th or td cells) into a Word table row.
     */
    private static void addTableTr(XWPFTable xwpfTable, Element tr, int rownum) {
        Elements tds = tr.getElementsByTag("th").isEmpty() ? tr.getElementsByTag("td") : tr.getElementsByTag("th");
        XWPFTableRow row_1 = null;
        for (int i = 0, j = tds.size(); i < j; i++) {
            if (rownum == 0) {
                // First row: reuse the row and first cell that the new table already has
                XWPFTableRow row_0 = xwpfTable.getRow(0);
                if (i == 0) {
                    row_0.getCell(0).setText(tds.get(i).text());
                } else {
                    row_0.addNewTableCell().setText(tds.get(i).text());
                }
            } else {
                if (i == 0) {
                    // Create a new row
                    row_1 = xwpfTable.createRow();
                    row_1.getCell(i).setText(tds.get(i).text());
                } else {
                    row_1.getCell(i).setText(tds.get(i).text());
                }
            }
        }
    }

    /**
     * Closes the streams.
     *
     * @param closeables streams to close
     */
    public static void closeStream(Closeable... closeables) {
        for (Closeable c : closeables) {
            if (c != null) {
                try {
                    c.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
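The chain-of-responsibility design mentioned above, which the project has not implemented, might look roughly like the following. This is only a sketch with invented names (TagHandlerChainSketch, TagHandler, convert): each handler either converts a tag or declines, and the first handler that accepts wins. To keep the example runnable with the JDK alone, the handlers return a description string; in a real version each handler would presumably write a paragraph, table, or picture into the XWPFDocument instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class TagHandlerChainSketch {
    /** One link of the chain: returns a result if it handles the tag, otherwise empty. */
    interface TagHandler {
        Optional<String> handle(String tagName, String text);
    }

    // Handles h1, h2 and h3.
    static final TagHandler HEADING = (tag, text) ->
            tag.matches("h[1-3]") ? Optional.of("HEADING(" + tag + "): " + text) : Optional.empty();

    // Handles plain paragraphs.
    static final TagHandler PARAGRAPH = (tag, text) ->
            tag.equals("p") ? Optional.of("PARAGRAPH: " + text) : Optional.empty();

    // Supporting a new tag means adding a handler here; the switch statement goes away.
    static final List<TagHandler> CHAIN = Arrays.asList(HEADING, PARAGRAPH);

    /** Passes the tag along the chain until a handler accepts it. */
    static String convert(String tagName, String text) {
        for (TagHandler handler : CHAIN) {
            Optional<String> result = handler.handle(tagName, text);
            if (result.isPresent()) {
                return result.get();
            }
        }
        return "SKIPPED: " + tagName;
    }

    public static void main(String[] args) {
        System.out.println(convert("h2", "Quarterly report"));
        System.out.println(convert("video", ""));
    }
}
```

Compared with the switch in resolveHtml, unsupported tags fall through to a single place, and each conversion can be tested on its own.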

Generate a watermark for Word

The idea behind adding a watermark to Word: use an XWPFHeader object to create a page header, add the watermark text to the header, and set its font, size, color, and rotation angle. The code is in the project's insertword directory.

package insertword;

import com.microsoft.schemas.office.office.CTLock;
import com.microsoft.schemas.vml.*;
import org.apache.poi.wp.usermodel.HeaderFooterType;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFHeader;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;

import java.util.stream.Stream;

/**
 * @desc Adds a watermark
 * @author Corey
 * @version 1.0
 * @date 2020/5/5 10:07 PM
 */
public class WatermarkUtil {

    // Watermark font
    private static final String fontName = "宋体";
    // Font size
    private static final String fontSize = "0.2pt";
    // Font color
    private static final String fontColor = "#d0d0d0";
    // Width of one character; the total shape width is text length * this value
    private static final Integer widthPerWord = 10;
    // Distance from the top of the page; changed for every watermark row
    private static Integer styleTop = 0;
    // Text rotation angle
    private static final String styleRotation = "45";

    /**
     * Covers the whole page with watermark rows.
     *
     * @param doc        docx document object to process
     * @param customText watermark text
     */
    public static void waterMarkDocXDocument(XWPFDocument doc, String customText) {
        // Watermark the whole page
        for (int lineIndex = -5; lineIndex < 20; lineIndex++) {
            styleTop = 100 * lineIndex;
            waterMarkDocXDocument_0(doc, customText);
        }
    }

    /**
     * Adds one row of watermark text to the document.
     *
     * @param doc        docx document object to process
     * @param customText watermark text to add
     */
    public static void waterMarkDocXDocument_0(XWPFDocument doc, String customText) {
        // Pad the text with spaces, then repeat it to fill one row
        customText = customText + repeatString(" ", 8);
        customText = repeatString(customText, 10);
        XWPFHeader header = doc.createHeader(HeaderFooterType.DEFAULT);
        int size = header.getParagraphs().size();
        if (size == 0) {
            header.createParagraph();
        }
        CTP ctp = header.getParagraphArray(0).getCTP();
        byte[] rsidr = doc.getDocument().getBody().getPArray(0).getRsidR();
        byte[] rsidrdefault = doc.getDocument().getBody().getPArray(0).getRsidRDefault();
        ctp.setRsidP(rsidr);
        ctp.setRsidRDefault(rsidrdefault);
        CTPPr ppr = ctp.addNewPPr();
        ppr.addNewPStyle().setVal("Header");
        CTR ctr = ctp.addNewR();
        CTRPr ctrpr = ctr.addNewRPr();
        ctrpr.addNewNoProof();
        CTGroup group = CTGroup.Factory.newInstance();
        CTShapetype shapetype = group.addNewShapetype();
        CTTextPath shapeTypeTextPath = shapetype.addNewTextpath();
        shapeTypeTextPath.setOn(STTrueFalse.T);
        shapeTypeTextPath.setFitshape(STTrueFalse.T);
        CTLock lock = shapetype.addNewLock();
        lock.setExt(STExt.VIEW);
        CTShape shape = group.addNewShape();
        shape.setId("PowerPlusWaterMarkObject");
        shape.setSpid("_x0000_s102");
        shape.setType("#_x0000_t136");
        // Set the shape style (rotation, position, relative path, etc.)
        shape.setStyle(getShapeStyle(customText));
        shape.setFillcolor(fontColor);
        // Render the text as solid, without an outline
        shape.setStroked(STTrueFalse.FALSE);
        // The path along which the text is drawn
        CTTextPath shapeTextPath = shape.addNewTextpath();
        // Set the text font and size
        shapeTextPath.setStyle("font-family:" + fontName + ";font-size:" + fontSize);
        shapeTextPath.setString(customText);
        CTPicture pict = ctr.addNewPict();
        pict.set(group);
    }

    /**
     * Builds the style argument of the shape.
     *
     * @param customText watermark text
     * @return the style string
     */
    private static String getShapeStyle(String customText) {
        StringBuilder sb = new StringBuilder();
        // Text path style
        sb.append("position: ").append("absolute");
        // Text width (number of characters * width of one character)
        sb.append("; width: ").append(customText.length() * widthPerWord).append("pt");
        // Font height
        sb.append("; height: ").append("20pt");
        sb.append("; z-index: ").append("251654144");
        sb.append("; mso-wrap-edited: ").append("f");
        // Spacing between watermark rows. This is a pitfall: top does not work, it must be margin-top.
        sb.append("; margin-top: ").append(styleTop);
        sb.append("; mso-position-horizontal-relative: ").append("page");
        sb.append("; mso-position-vertical-relative: ").append("page");
        sb.append("; mso-position-vertical: ").append("left");
        sb.append("; mso-position-horizontal: ").append("center");
        sb.append("; rotation: ").append(styleRotation);
        return sb.toString();
    }

    /**
     * Repeats the given string.
     *
     * @param pattern string to repeat
     * @param repeats number of repetitions
     * @return the repeated string
     */
    private static String repeatString(String pattern, int repeats) {
        StringBuilder buffer = new StringBuilder(pattern.length() * repeats);
        Stream.generate(() -> pattern).limit(repeats).forEach(buffer::append);
        return new String(buffer);
    }
}
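The row layout can be checked without generating a document. The sketch below, which uses only the JDK, recomputes the margin-top values for the 25 watermark rows (lineIndex from -5 to 19, 100 units apart) and passes the offset as a parameter instead of going through the shared static styleTop field. The class and method names are invented for illustration, and the style string is shortened to the properties that vary or matter for positioning.

```java
import java.util.ArrayList;
import java.util.List;

class WatermarkStyleSketch {
    /** Builds a shortened shape style; marginTop is passed in rather than read from a static field. */
    static String shapeStyle(String text, int widthPerChar, int marginTop, String rotation) {
        return "position: absolute"
                + "; width: " + (text.length() * widthPerChar) + "pt"
                + "; height: 20pt"
                + "; margin-top: " + marginTop
                + "; rotation: " + rotation;
    }

    /** One style per watermark row, using the same loop bounds as waterMarkDocXDocument. */
    static List<String> pageStyles(String text) {
        List<String> styles = new ArrayList<>();
        for (int lineIndex = -5; lineIndex < 20; lineIndex++) {
            styles.add(shapeStyle(text, 10, 100 * lineIndex, "45"));
        }
        return styles;
    }

    public static void main(String[] args) {
        List<String> styles = pageStyles("abc");
        System.out.println(styles.size());
        System.out.println(styles.get(0));
    }
}
```

Passing the offset explicitly keeps two concurrent calls from overwriting each other's row position, which the static styleTop field cannot guarantee.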

Links

GitHub address

Gitee (码云)

Acknowledgements

Thanks to my colleagues on the project for their suggestions on improving the Word operations, which allowed this code to be delivered and to run smoothly. Thanks to all the bloggers who shared their source code. Thank you for taking the time out of your busy schedule to read, like, and bookmark this article, and please give the project a star on GitHub.
