  • Packages required to modify Word documents: poi-3.8-20120326.jar, poi-examples-3.8-20120326.jar, poi-excelant-3.8-20120326.jar, poi-ooxml-3.8-20120326.jar, poi-ooxml-schemas-3.8-20120326.jar, poi-scratchpad-3.8-20120326....
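  • Does POI have a bug when modifying Word documents?

    2019-03-12 10:42:05
    After using HWPFDocument from POI 3.9 to modify a Word document and save it as a new file, I found that some documents end up with fewer bytes and can no longer be opened, while others are fine. Has anyone run into something similar, and how did you solve it? ...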
  • Maven dependencies and a wrapper class for replacing text in Word documents with POI
    // Maven dependencies
    <!-- All POI artifacts should share one version; mixing 3.8-beta4 with 3.10-FINAL can cause class mismatches -->
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-scratchpad</artifactId>
        <version>3.10-FINAL</version>
    </dependency>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi</artifactId>
        <version>3.10-FINAL</version>
    </dependency>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>3.10-FINAL</version>
    </dependency>
    
    // Wrapper methods
    package com.glprop.util;
    
    import java.io.File;  
    import java.io.FileInputStream;  
    import java.io.FileNotFoundException;  
    import java.io.FileOutputStream;  
    import java.io.IOException;  
    import java.io.InputStream;  
    import java.util.ArrayList;  
    import java.util.HashMap;  
    import java.util.Iterator;  
    import java.util.List;  
    import java.util.Map;  
    import java.util.Map.Entry;  
    import java.util.regex.Matcher;  
    import java.util.regex.Pattern;  
    
    import org.apache.poi.POIXMLDocument;  
    import org.apache.poi.hwpf.HWPFDocument;  
    import org.apache.poi.hwpf.usermodel.Range;  
    import org.apache.poi.xwpf.usermodel.XWPFDocument;  
    import org.apache.poi.xwpf.usermodel.XWPFParagraph;  
    import org.apache.poi.xwpf.usermodel.XWPFRun;  
    import org.apache.poi.xwpf.usermodel.XWPFTable;  
    import org.apache.poi.xwpf.usermodel.XWPFTableCell;  
    import org.apache.poi.xwpf.usermodel.XWPFTableRow; 
    
    public class PoiUtil {  
        // Returns the placeholders in the Word document that need replacing, without duplicates
        // Recommended regex argument: "\\$\\{[^{}]+\\}"
        public ArrayList<String> getReplaceElementsInWord(String filePath, String regex) {  
            String[] p = filePath.split("\\.");  
            if (p.length > 1) {// split() always returns at least one element, so an extension needs a second one
                // Compare the file extension
                if (p[p.length - 1].equalsIgnoreCase("doc")) {  
                    ArrayList<String> al = new ArrayList<>();  
                    File file = new File(filePath);  
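                    HWPFDocument document = null;
                    // try-with-resources closes the stream even if parsing fails
                    try (InputStream is = new FileInputStream(file)) {
                        document = new HWPFDocument(is);
                    } catch (IOException e) {
                        // Also covers FileNotFoundException; return early instead of hitting a null document below
                        e.printStackTrace();
                        return al;
                    }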
                    Range range = document.getRange();  
                    String rangeText = range.text();  
                    CharSequence cs = rangeText.subSequence(0, rangeText.length());  
                    Pattern pattern = Pattern.compile(regex);  
                    Matcher matcher = pattern.matcher(cs);  
                    int startPosition = 0;  
                    while (matcher.find(startPosition)) {  
                        if (!al.contains(matcher.group())) {  
                            al.add(matcher.group());  
                        }  
                        startPosition = matcher.end();  
                    }  
                    return al;  
                } else if (p[p.length - 1].equalsIgnoreCase("docx")) {  
                    ArrayList<String> al = new ArrayList<>();  
                    XWPFDocument document = null;
                    try {
                        document = new XWPFDocument(POIXMLDocument.openPackage(filePath));
                    } catch (IOException e) {
                        e.printStackTrace();
                        // Return early instead of hitting a null document below
                        return al;
                    }
                    // Iterate over the paragraphs
                    Iterator<XWPFParagraph> itPara = document.getParagraphsIterator();  
                    while (itPara.hasNext()) {  
                        XWPFParagraph paragraph = (XWPFParagraph) itPara.next();  
                        String paragraphString = paragraph.getText();  
                        CharSequence cs = paragraphString.subSequence(0,paragraphString.length());  
                        Pattern pattern = Pattern.compile(regex);  
                        Matcher matcher = pattern.matcher(cs);  
                        int startPosition = 0;  
                        while (matcher.find(startPosition)) {  
                            if (!al.contains(matcher.group())) {  
                                al.add(matcher.group());  
                            }  
                            startPosition = matcher.end();  
                        }  
                    }  
                    // Iterate over the tables
                    Iterator<XWPFTable> itTable = document.getTablesIterator();  
                    while (itTable.hasNext()) {  
                        XWPFTable table = (XWPFTable) itTable.next();  
                        int rcount = table.getNumberOfRows();  
                        for (int i = 0; i < rcount; i++) {  
                            XWPFTableRow row = table.getRow(i);  
                            List<XWPFTableCell> cells = row.getTableCells();  
                            for (XWPFTableCell cell : cells) {  
                                String cellText = "";  
                                cellText = cell.getText();  
                                CharSequence cs = cellText.subSequence(0,  
                                        cellText.length());  
                                Pattern pattern = Pattern.compile(regex);  
                                Matcher matcher = pattern.matcher(cs);  
                                int startPosition = 0;  
                                while (matcher.find(startPosition)) {  
                                    if (!al.contains(matcher.group())) {  
                                        al.add(matcher.group());  
                                    }  
                                    startPosition = matcher.end();  
                                }  
                            }  
                        }  
                    }  
                    return al;  
                } else {  
                    return null;  
                }  
            } else {  
                return null;  
            }  
        }  
        // Replaces the placeholders in the Word document and writes the result to destPath
        public static boolean replaceAndGenerateWord(String srcPath, String destPath, Map<String, String> map) {  
            String[] sp = srcPath.split("\\.");  
            String[] dp = destPath.split("\\.");  
            if ((sp.length > 1) && (dp.length > 1)) {// both paths must contain an extension
                // Compare the file extensions
                if (sp[sp.length - 1].equalsIgnoreCase("docx")) {  
                    try {  
                        XWPFDocument document = new XWPFDocument(POIXMLDocument.openPackage(srcPath));  
                        // Replace the given text in paragraphs
                        Iterator<XWPFParagraph> itPara = document.getParagraphsIterator();  
                        while (itPara.hasNext()) {  
                            XWPFParagraph paragraph = (XWPFParagraph) itPara.next();  
                            List<XWPFRun> runs = paragraph.getRuns();  
                            for (int i = 0; i < runs.size(); i++) {
                                // getText(0) reads the run's first text element; it is null for runs without text
                                String oneparaString = runs.get(i).getText(0);
                                if (oneparaString == null) {
                                    continue;
                                }
                                for (Map.Entry<String, String> entry : map.entrySet()) {
                                    oneparaString = oneparaString.replace(entry.getKey(), entry.getValue());
                                }
                                runs.get(i).setText(oneparaString, 0);
                            }
                        }  
    
                        // Replace the given text in tables
                        Iterator<XWPFTable> itTable = document.getTablesIterator();  
                        while (itTable.hasNext()) {  
                            XWPFTable table = (XWPFTable) itTable.next();  
                            int rcount = table.getNumberOfRows();  
                            for (int i = 0; i < rcount; i++) {  
                                XWPFTableRow row = table.getRow(i);  
                                List<XWPFTableCell> cells = row.getTableCells();  
                                for (XWPFTableCell cell : cells) {  
                                    String cellTextString = cell.getText();  
                                    for (Entry<String, String> e : map.entrySet()) {  
                                        if (cellTextString.contains(e.getKey()))  
                                            cellTextString = cellTextString.replace(e.getKey(),e.getValue());  
                                    }  
                                    cell.removeParagraph(0);  
                                    cell.setText(cellTextString);  
                                }  
                            }  
                        }  
                        FileOutputStream outStream = null;  
                        outStream = new FileOutputStream(destPath);  
                        document.write(outStream);  
                        outStream.close();  
                        return true;  
                    } catch (Exception e) {  
                        e.printStackTrace();  
                        return false;  
                    }  
    
                } else  
                // A .doc source can only produce .doc; writing it as .docx fails
                if ((sp[sp.length - 1].equalsIgnoreCase("doc"))  
                        && (dp[dp.length - 1].equalsIgnoreCase("doc"))) {  
                    HWPFDocument document = null;  
                    try {  
                        document = new HWPFDocument(new FileInputStream(srcPath));  
                        Range range = document.getRange();  
                        for (Map.Entry<String, String> entry : map.entrySet()) {  
                            range.replaceText(entry.getKey(), entry.getValue());  
                        }  
                        FileOutputStream outStream = null;  
                        outStream = new FileOutputStream(destPath);  
                        document.write(outStream);  
                        outStream.close();  
                        return true;  
                    } catch (FileNotFoundException e) {  
                        e.printStackTrace();  
                        return false;  
                    } catch (IOException e) {  
                        e.printStackTrace();  
                        return false;  
                    }  
                } else {  
                    return false;  
                }  
            } else {  
                return false;  
            }  
        }  
    
        public static void main(String[] args) {  
            /*String filepathString = "E:/xxx.doc";  
            String destpathString = "E:/xxx22.doc";  
            Map<String, String> map = new HashMap<String, String>();  
            map.put("$name$", "可好");
            map.put("$idcard$", "324324324444444444444444432");
            map.put("$remarks$", "购买装备");
    
            System.out.println(replaceAndGenerateWord(filepathString,destpathString, map));*/  
        }  
    }  
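The extension checks above rely on `split("\\.")`, which also splits on dots inside directory names. A minimal sketch of a stricter helper follows, using only the standard library. The names `FileExt`, `extensionOf`, and `canConvert` are hypothetical and not part of the original class; `canConvert` only mirrors the rule stated in the comments above (a .doc source must be written to a .doc target).

```java
import java.util.Locale;

// Hypothetical helper, not part of the original PoiUtil class.
final class FileExt {

    // Returns the lower-cased extension without the dot, or "" when there is none.
    static String extensionOf(String path) {
        int slash = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        int dot = path.lastIndexOf('.');
        // A dot inside a directory name, a leading dot, or a trailing dot is not an extension.
        if (dot <= slash + 1 || dot == path.length() - 1) {
            return "";
        }
        return path.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Mirrors the rule in the code above: .docx sources are accepted as-is,
    // while .doc sources must be written to a .doc target.
    static boolean canConvert(String srcPath, String destPath) {
        String src = extensionOf(srcPath);
        if (src.equals("docx")) {
            return true;
        }
        return src.equals("doc") && extensionOf(destPath).equals("doc");
    }
}
```

With this helper, a path such as `E:/my.dir/report` is reported as having no extension instead of being matched against `dir/report`.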
    
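  • Generating Word documents with Java POI

    2017-10-28 13:54:46
    Generating Word documents with Java POI, with support for inserting images. The key is modifying the XML part. Tested and working.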
  • Generating Word documents with POI

    2017-09-27 15:40:10

    POI is one of the better open-source Java libraries for handling Office documents. Its Excel support is the strongest, and that part is documented in detail. Many users find its Word support considerably weaker, but I still chose POI for the document generation described here.

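    Required features

    1. Configure a Word template file, including tables
    2. Parse the configured Word document and return the special tags configured in it
    3. Build the data, replace the configured tags, and generate the tables

    Configuring the Word template

    Tags are written as ${xx}. For a table, put the table name in the cell at the first row and first column.

    Note that when two adjacent characters in a Word document have different styles, Word stores them in separate run elements by default. As a result, many people have to save the configured template as a separate file and then merge the split tags back into a single run by hand. For a large file that is painful, so this article focuses on that problem. My solution is to keep only the style of the first run and discard the others.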
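The idea of keeping only the first run can be illustrated without POI. The sketch below is a simplified model under stated assumptions: a paragraph is represented as a plain `List<String>`, one element per run, and the names `RunMerger` and `replaceAcrossRuns` are hypothetical. It shows the bookkeeping only, not the POI calls.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a paragraph: each list element models the text of one run.
final class RunMerger {

    // Replaces the first occurrence of tag with value, even when the tag spans several runs.
    // Returns true if a replacement was made.
    static boolean replaceAcrossRuns(List<String> runs, String tag, String value) {
        if (tag.isEmpty()) {
            return false;
        }
        StringBuilder joined = new StringBuilder();
        List<Integer> runStart = new ArrayList<>(); // offset of each run in the joined text
        for (String run : runs) {
            runStart.add(joined.length());
            joined.append(run);
        }
        int hit = joined.indexOf(tag);
        if (hit < 0) {
            return false;
        }
        int hitEnd = hit + tag.length(); // exclusive
        int first = runIndexAt(runStart, hit);
        int last = runIndexAt(runStart, hitEnd - 1);
        // The text of all affected runs, with the tag replaced, goes into the first run.
        int spanStart = runStart.get(first);
        int spanEnd = (last + 1 < runs.size()) ? runStart.get(last + 1) : joined.length();
        String merged = joined.substring(spanStart, hit) + value + joined.substring(hitEnd, spanEnd);
        runs.set(first, merged);
        // The other runs of the span are emptied rather than removed, so indexes stay stable.
        for (int i = first + 1; i <= last; i++) {
            runs.set(i, "");
        }
        return true;
    }

    // Index of the last run whose start offset is not beyond the given offset.
    private static int runIndexAt(List<Integer> runStart, int offset) {
        int index = 0;
        for (int i = 0; i < runStart.size(); i++) {
            if (runStart.get(i) <= offset) {
                index = i;
            }
        }
        return index;
    }
}
```

Because the merged text lands in the first run, it inherits that run's style, which matches the trade-off described above: formatting differences inside the tag are lost.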

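    Parsing the Word template

    First, convert the file into an XWPFDocument object. You can do this through a stream or through an OPCPackage. With the OPCPackage approach, however, the file you open and the file you finally write cannot be the same file, so I use a file stream here.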

        public XWPFDocument openDocument() {
            XWPFDocument xdoc = null;
            // try-with-resources closes the stream once the document has been read
            try (InputStream is = new FileInputStream(saveFile)) {
                xdoc = new XWPFDocument(is);
            } catch (IOException e) {
                e.printStackTrace();
            }
            // null when the file could not be opened
            return xdoc;
        }
    

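    Getting the non-list tags: the XWPFDocument object holds all paragraphs and tables of the document. Nested tables are not considered here. The text of each paragraph is available through p.getText(). The tags configured in the paragraphs are collected as follows:

        // Collect the tags from the text of all paragraphs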
        public List<TagInfo> getWordTag(XWPFDocument doc, String regex) {
            List<TagInfo> tags = new ArrayList<TagInfo>();
            // Ordinary paragraphs
            List<XWPFParagraph> pars = doc.getParagraphs();
            for (int i = 0; i < pars.size(); i++) {
                XWPFParagraph p = pars.get(i);
                setTagInfoList(tags, p, regex);
            }
            // Paragraphs inside tables
            List<XWPFTable> commTables = getDocTables(doc, false, regex);
            for (XWPFTable table : commTables) {
                List<XWPFParagraph> tparags = getTableParagraph(table);
                for (int i = 0; i < tparags.size(); i++) {
                    XWPFParagraph p = tparags.get(i);
                    setTagInfoList(tags, p, regex);
                }
            }
            return tags;
        }
    

    Once the text is retrieved, it is parsed with a regular expression and each match is stored in a TagInfo

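        // Add the tags parsed from a paragraph to the tag list
        private void setTagInfoList(List<TagInfo> list, XWPFParagraph p,
                String regex) {
            // Compare string contents; == only compares references
            if (regex == null || regex.isEmpty())
                regex = defaultRegex;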
            Pattern pattern = Pattern.compile(regex);
            Matcher matcher = pattern.matcher(p.getText());
            int startPosition = 0;
            while (matcher.find(startPosition)) {
                String match = matcher.group();
                if (!list.contains(new TagInfo(match, match, ""))) {
                    list.add(new TagInfo(match, match, ""));
                }
                startPosition = matcher.end();
            }
        }
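    The tag extraction step can be tried on a plain string without POI. The following is a minimal sketch, not the author's code: `TagScanner` and `findTags` are hypothetical names, and the pattern is the same one the article uses as `defaultRegex`. A `LinkedHashSet` takes the place of the `contains` check to drop duplicates while keeping first-seen order.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that works on plain text instead of an XWPFParagraph.
final class TagScanner {

    // Same pattern as defaultRegex in the article: "${", anything except braces, "}".
    static final String DEFAULT_REGEX = "\\$\\{[^{}]+\\}";

    // Returns the distinct tags found in text, in order of first appearance.
    static List<String> findTags(String text, String regex) {
        // Compare string contents with isEmpty(); "==" would compare object references.
        if (regex == null || regex.isEmpty()) {
            regex = DEFAULT_REGEX;
        }
        Set<String> tags = new LinkedHashSet<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        // find() resumes after the previous match, so no manual start position is needed.
        while (matcher.find()) {
            tags.add(matcher.group());
        }
        return new ArrayList<>(tags);
    }
}
```

    Calling `TagScanner.findTags(paragraphText, "")` falls back to the default pattern, which is the behavior the `regex == ""` check in the article is meant to provide.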
    

    Parsing tables

        // Get the tag configuration of the list tables
        public Map<String, List<List<TagInfo>>> getTableTag(XWPFDocument doc,
                String regex) {
            Map<String, List<List<TagInfo>>> mapList = new HashMap<String, List<List<TagInfo>>>();
            List<XWPFTable> lstTables = getDocTables(doc, true, regex);
            for (XWPFTable table : lstTables) {
                // Read the first cell of each table (its name) and the last row (its template row)
                String strTableName = getTableListName(table, regex);
                List<List<TagInfo>> list = new ArrayList<List<TagInfo>>();
                List<TagInfo> lstTag = new ArrayList<TagInfo>();
                int rowSize = table.getRows().size();
                XWPFTableRow lastRow = table.getRow(rowSize - 1);
                for (XWPFTableCell cell : lastRow.getTableCells()) {
                    for (XWPFParagraph p : cell.getParagraphs()) {
                        // Skip empty paragraphs
                        if (p.getText() != null && p.getText().length() > 0) {
                            setTagInfoList(lstTag, p, regex);
                        }
                    }
                }
                list.add(lstTag);
                // Add to the result map
                mapList.put(strTableName, list);
            }
            return mapList;
        }
    

    Generating the Word document

    The hard part: replacing tags
    The input data contains three formTags and one tableTag:

    {"formTags":
    [{"TagName":"${xxxx}","TagText":"${xxxx}","TagValue":""},
    {"TagName":"${123}","TagText":"${123}","TagValue":""},
    {"TagName":"${ddd}","TagText":"${ddd}","TagValue":""}],
    "tableTags":{
    "${table}":[
    [{"TagName":"${COL1}","TagText":"${COL1}","TagValue":""},{"TagName":"${COL2}","TagText":"${COL2}","TagValue":""}]
    ]}
    }
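The article does not show the TagInfo class. Since the parsing code removes duplicates with `list.contains(new TagInfo(...))`, the class needs a value-based `equals`. Below is a hypothetical sketch of such a class, named `TagInfoSketch` to make clear it is an assumption. The field names follow the JSON keys above, and treating TagText as the identity is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical model of TagInfo; the real class is not shown in the article.
final class TagInfoSketch {
    final String TagName;
    final String TagText;
    String TagValue;

    TagInfoSketch(String tagName, String tagText, String tagValue) {
        this.TagName = tagName;
        this.TagText = tagText;
        this.TagValue = tagValue;
    }

    // Assumption: two entries describe the same placeholder when their TagText matches.
    @Override
    public boolean equals(Object other) {
        return other instanceof TagInfoSketch
                && Objects.equals(TagText, ((TagInfoSketch) other).TagText);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(TagText);
    }

    // Builds a duplicate-free tag list the same way setTagInfoList does, via contains().
    static List<TagInfoSketch> fromMatches(List<String> matches) {
        List<TagInfoSketch> tags = new ArrayList<>();
        for (String match : matches) {
            TagInfoSketch candidate = new TagInfoSketch(match, match, "");
            if (!tags.contains(candidate)) {
                tags.add(candidate);
            }
        }
        return tags;
    }
}
```

Without an `equals` override, `contains` falls back to reference equality and every match is added, including repeats.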

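    Ordinary document generation preserves the configured styles. It relies mainly on the searchText method provided by POI, which returns the runs that a tag occupies. It compares one character at a time and starts counting from the first matching character, so under this approach a tag such as $${xxx} cannot be replaced.
    Replacing ordinary text tags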

        public void ReplaceInParagraph(List<TagInfo> tagList, XWPFParagraph para,
                String regex) {
            // Compare string contents; == only compares references
            if (regex == null || regex.isEmpty())
                regex = defaultRegex;
            List<XWPFRun> runs = para.getRuns();
            boolean replacedAny = false;
            for (TagInfo ti : tagList) {
                String find = ti.TagText;
                String replValue = ti.TagValue;
                TextSegement found = para.searchText(find,
                        new PositionInParagraph());
                if (found != null) {
                    replacedAny = true;
                    // Check whether the match lies within a single run
                    if (found.getBeginRun() == found.getEndRun()) {
                        XWPFRun run = runs.get(found.getBeginRun());
                        // getText(0) reads the run's first text element, the one setText(..., 0) writes
                        String runText = run.getText(0);
                        String replaced = runText.replace(find, replValue);
                        run.setText(replaced, 0);
                    } else {
                        // The match spans several runs
                        StringBuilder sb = new StringBuilder();
                        for (int runPos = found.getBeginRun(); runPos <= found
                                .getEndRun(); runPos++) {
                            XWPFRun run = runs.get(runPos);
                            String part = run.getText(0);
                            // Runs without text return null; skip them instead of appending "null"
                            if (part != null)
                                sb.append(part);
                        }
                        String connectedRuns = sb.toString();
                        String replaced = connectedRuns.replace(find, replValue);
                        XWPFRun firstRun = runs.get(found.getBeginRun());
                        firstRun.setText(replaced, 0);
                        // Clear the runs after the first one
                        for (int runPos = found.getBeginRun() + 1; runPos <= found
                                .getEndRun(); runPos++) {
                            // Empty the text of the remaining runs
                            XWPFRun partNext = runs.get(runPos);
                            partNext.setText("", 0);
                        }
                    }
                }
            }
            // After the first pass, check whether tags remain in the paragraph.
            // Recurse only if this pass changed something; otherwise a tag that is
            // missing from tagList would cause endless recursion.
            Pattern pattern = Pattern.compile(regex);
            Matcher matcher = pattern.matcher(para.getText());
            if (replacedAny && matcher.find()) {
                ReplaceInParagraph(tagList, para, regex);
            }
        }
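    The limitation with $${xxx} noted above can be reproduced with a small model. The sketch below is not POI code: `NaiveSearch.find` is a hypothetical, simplified single-pass matcher that, as the article describes for searchText, compares one character at a time and resets its counter on a mismatch without re-examining the current character.

```java
// Hypothetical, simplified model of a single-pass search that never backtracks.
final class NaiveSearch {

    // Returns the start index of the match, or -1 if the scan finds none.
    static int find(String text, String searched) {
        if (searched.isEmpty()) {
            return -1;
        }
        int matched = 0; // number of characters of searched matched so far
        int start = -1;  // index where the current candidate match began
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (matched == 0 && c == searched.charAt(0)) {
                start = i;
            }
            if (c == searched.charAt(matched)) {
                matched++;
                if (matched == searched.length()) {
                    return start;
                }
            } else {
                // Reset the counter; the current character is not checked again
                // as a possible new start, which is why "$${x}" is missed.
                matched = 0;
            }
        }
        return -1;
    }
}
```

    In this model, `find("a ${x}", "${x}")` succeeds, while `find("$${x}", "${x}")` fails even though `String.indexOf` locates the tag at index 1. The second `$` breaks the candidate that started at the first `$` and is then not reconsidered as a new start.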
    

    Tables are handled mainly by copying the template row and then modifying the content of the copy
    Copying a text run

        private void CopyRun(XWPFRun target, XWPFRun source) {
            target.getCTR().setRPr(source.getCTR().getRPr());
            // Set the text
            target.setText(source.text());
        }
    

    Copying a paragraph (XWPFParagraph)

        private void copyParagraph(XWPFParagraph target, XWPFParagraph source) {
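            // Copy the paragraph style
            target.getCTP().setPPr(source.getCTP().getPPr());
            // Remove the existing runs, iterating backwards so the indexes stay valid
            for (int pos = target.getRuns().size() - 1; pos >= 0; pos--) {
                target.removeRun(pos);
            }
            // Add the runs from the source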
            for (XWPFRun s : source.getRuns()) {
                XWPFRun targetrun = target.createRun();
                CopyRun(targetrun, s);
            }
        }
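    When a loop removes elements by index, the direction of iteration matters: after each removal the remaining elements shift left, so a forward loop skips every other element. The sketch below demonstrates this on a plain `List`; `RemoveAll` and its methods are hypothetical names used only for illustration.

```java
import java.util.List;

// Hypothetical demo of index-based removal on an ordinary list.
final class RemoveAll {

    // Forward removal: each removal shifts the rest left, so every other element survives.
    static <T> void removeForward(List<T> items) {
        for (int pos = 0; pos < items.size(); pos++) {
            items.remove(pos);
        }
    }

    // Backward removal: elements below pos are unaffected by removing pos.
    static <T> void removeBackward(List<T> items) {
        for (int pos = items.size() - 1; pos >= 0; pos--) {
            items.remove(pos);
        }
    }
}
```

    Applied to a four-element list, the forward loop leaves the second and fourth elements in place, while the backward loop empties the list. The same reasoning applies to clearing runs from a paragraph or paragraphs from a cell by position.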
    

    Copying a table cell (XWPFTableCell)

        private void copyTableCell(XWPFTableCell target, XWPFTableCell source) {
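            // Copy the cell properties
            target.getCTTc().setTcPr(source.getCTTc().getTcPr());
            // Remove all paragraphs from the target cell, iterating backwards so the indexes stay valid
            for (int pos = target.getParagraphs().size() - 1; pos >= 0; pos--) {
                target.removeParagraph(pos);
            }
            // Add the paragraphs from the source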
            for (XWPFParagraph sp : source.getParagraphs()) {
                XWPFParagraph targetP = target.addParagraph();
                copyParagraph(targetP, sp);
            }
        }
    

    Copying a row (XWPFTableRow)

        private void CopytTableRow(XWPFTableRow target, XWPFTableRow source) {
            // Copy the row style
            target.getCtRow().setTrPr(source.getCtRow().getTrPr());
            // Copy the cells
            for (int i = 0; i < target.getTableCells().size(); i++) {
                copyTableCell(target.getCell(i), source.getCell(i));
            }
        }

    Complete code

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.nio.channels.FileChannel;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    import org.apache.poi.xwpf.usermodel.PositionInParagraph;
    import org.apache.poi.xwpf.usermodel.TextSegement;
    import org.apache.poi.xwpf.usermodel.XWPFDocument;
    import org.apache.poi.xwpf.usermodel.XWPFParagraph;
    import org.apache.poi.xwpf.usermodel.XWPFRun;
    import org.apache.poi.xwpf.usermodel.XWPFTable;
    import org.apache.poi.xwpf.usermodel.XWPFTableRow;
    import org.apache.poi.xwpf.usermodel.XWPFTableCell;

    public class WordAnalysis {

        private final String defaultRegex = "\\$\\{[^{}]+\\}";
        private String tempFile;
        private String saveFile;

        // Copy the template file to the output file
        private void CopyFile() throws IOException {
            File tFile = new File(saveFile);
            // Remove a stale output file now; deleteOnExit() would instead
            // delete the generated document when the JVM exits
            if (tFile.exists()) {
                tFile.delete();
            }
            File parent = tFile.getParentFile();
            if (parent != null && !parent.exists()) {
                // The target directory does not exist yet
                parent.mkdirs();
            }
            try (FileInputStream inStream = new FileInputStream(tempFile);
                    FileOutputStream outStream = new FileOutputStream(tFile);
                    FileChannel inC = inStream.getChannel();
                    FileChannel outC = outStream.getChannel()) {
                long position = 0;
                long size = inC.size();
                // transferTo may move fewer bytes than requested, so loop until done
                while (position < size) {
                    position += inC.transferTo(position, size - position, outC);
                }
            }
        }

        public WordAnalysis(String tempFile) {
            this.tempFile = tempFile;
            this.saveFile = tempFile;
        }

        public WordAnalysis(String tempFile, String saveFile) {
            this.tempFile = tempFile;
            this.saveFile = saveFile;
            // Copy the template file to the output file
            try {
                CopyFile();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        // Open the document
        // With a stream, the document can be opened and saved to the same file
        // With OPCPackage, it must be saved to a different file
        public XWPFDocument openDocument() throws IOException {
            // try-with-resources closes the stream once the document has been read
            try (InputStream is = new FileInputStream(saveFile)) {
                return new XWPFDocument(is);
            }
        }

        // Close the document
        public void closeDocument(XWPFDocument document) {
            try {
                document.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        // Save the document
        public void saveDocument(XWPFDocument document) {
            // try-with-resources closes the stream even if write() fails
            try (OutputStream os = new FileOutputStream(saveFile)) {
                document.write(os);
            } catch (IOException e) {
                // Also covers FileNotFoundException
                e.printStackTrace();
            }
            closeDocument(document);
        }

        // Copy a run
        private void CopyRun(XWPFRun target, XWPFRun source) {
            target.getCTR().setRPr(source.getCTR().getRPr());
            // Set the text
            target.setText(source.text());
        }

        // Copy a paragraph
        private void copyParagraph(XWPFParagraph target, XWPFParagraph source) {
            // Copy the paragraph style
            target.getCTP().setPPr(source.getCTP().getPPr());
            // Remove the existing runs, iterating backwards so the indexes stay valid
            for (int pos = target.getRuns().size() - 1; pos >= 0; pos--) {
                target.removeRun(pos);
            }
            // Add the runs from the source
            for (XWPFRun s : source.getRuns()) {
                XWPFRun targetrun = target.createRun();
                CopyRun(targetrun, s);
            }
        }

        // Copy a table cell
        private void copyTableCell(XWPFTableCell target, XWPFTableCell source) {
            // Copy the cell properties
            target.getCTTc().setTcPr(source.getCTTc().getTcPr());
            // Remove all paragraphs from the target cell, iterating backwards so the indexes stay valid
            for (int pos = target.getParagraphs().size() - 1; pos >= 0; pos--) {
                target.removeParagraph(pos);
            }
            // Add the paragraphs from the source
            for (XWPFParagraph sp : source.getParagraphs()) {
                XWPFParagraph targetP = target.addParagraph();
                copyParagraph(targetP, sp);
            }
        }

        // Copy a row
        private void CopytTableRow(XWPFTableRow target, XWPFTableRow source) {
            // Copy the row style
            target.getCtRow().setTrPr(source.getCtRow().getTrPr());
            // Copy the cells
            for (int i = 0; i < target.getTableCells().size(); i++) {
                copyTableCell(target.getCell(i), source.getCell(i));
            }
        }

        // Get all non-empty paragraphs in a table
        public List<XWPFParagraph> getTableParagraph(XWPFTable table) {
            List<XWPFParagraph> paras = new ArrayList<XWPFParagraph>();
            List<XWPFTableRow> rows = table.getRows();
            for (XWPFTableRow row : rows) {
                for (XWPFTableCell cell : row.getTableCells()) {
                    for (XWPFParagraph p : cell.getParagraphs()) {
                        // Skip empty paragraphs
                        if (p.getText() != null && p.getText().length() > 0) {
                            paras.add(p);
                        }
                    }
                }
            }
            return paras;
        }

        // An empty result means an ordinary table; otherwise the table is a list table
        private String getTableListName(XWPFTable table, String regex) {
            // Compare string contents; == only compares references
            if (regex == null || regex.isEmpty())
                regex = defaultRegex;
            XWPFTableRow firstRow = table.getRow(0);
            XWPFTableCell firstCell = firstRow.getCell(0);
            Matcher matcher = Pattern.compile(regex).matcher(firstCell.getText());
            // Only the first match in the first cell names the table
            return matcher.find() ? matcher.group() : "";
        }

        // Get all tables in the document, excluding nested tables
        // listTable: false returns ordinary tables, true returns list tables
        public List<XWPFTable> getDocTables(XWPFDocument doc, boolean listTable,
                String regex) {
            List<XWPFTable> lstTables = new ArrayList<XWPFTable>();
            for (XWPFTable table : doc.getTables()) {
                String tbName = getTableListName(table, regex);
                boolean isListTable = tbName != null && !tbName.isEmpty();
                if (listTable == isListTable) {
                    lstTables.add(table);
                }
            }
            return lstTables;
        }

        // Add the tags parsed from a paragraph to the tag list
        private void setTagInfoList(List<TagInfo> list, XWPFParagraph p,
                String regex) {
            // Compare string contents; == only compares references
            if (regex == null || regex.isEmpty())
                regex = defaultRegex;
            Pattern pattern = Pattern.compile(regex);
            Matcher matcher = pattern.matcher(p.getText());
            int startPosition = 0;
            while (matcher.find(startPosition)) {
                String match = matcher.group();
                if (!list.contains(new TagInfo(match, ""))) {
                    list.add(new TagInfo(match, ""));
                }
                startPosition = matcher.end();
            }
        }

        // Collect the tags from the text of all paragraphs
        public List<TagInfo> getWordTag(XWPFDocument doc, String regex) {
            List<TagInfo> tags = new ArrayList<TagInfo>();
            // Ordinary paragraphs
            List<XWPFParagraph> pars = doc.getParagraphs();
            for (int i = 0; i < pars.size(); i++) {
                XWPFParagraph p = pars.get(i);
                setTagInfoList(tags, p, regex);
            }
            // Paragraphs inside tables
            List<XWPFTable> commTables = getDocTables(doc, false, regex);
            for (XWPFTable table : commTables) {
                List<XWPFParagraph> tparags = getTableParagraph(table);
                for (int i = 0; i < tparags.size(); i++) {
                    XWPFParagraph p = tparags.get(i);
                    setTagInfoList(tags, p, regex);
                }
            }
            return tags;
        }

        // Get the tag configuration of the list tables
        public Map<String, List<List<TagInfo>>> getTableTag(XWPFDocument doc,
                String regex) {
            Map<String, List<List<TagInfo>>> mapList = new HashMap<String, List<List<TagInfo>>>();
            List<XWPFTable> lstTables = getDocTables(doc, true, regex);
            for (XWPFTable table : lstTables) {
                // Read the first cell of each table (its name) and the last row (its template row)
                String strTableName = getTableListName(table, regex);
                List<List<TagInfo>> list = new ArrayList<List<TagInfo>>();
                List<TagInfo> lstTag = new ArrayList<TagInfo>();
                int rowSize = table.getRows().size();
                XWPFTableRow lastRow = table.getRow(rowSize - 1);
                for (XWPFTableCell cell : lastRow.getTableCells()) {
                    for (XWPFParagraph p : cell.getParagraphs()) {
                        // Skip empty paragraphs
                        if (p.getText() != null && p.getText().length() > 0) {
                            setTagInfoList(lstTag, p, regex);
                        }
                    }
                }
                list.add(lstTag);
                // Add to the result map
                mapList.put(strTableName, list);
            }
            return mapList;
        }

        // Replace text; tags that span several runs are handled.
        // Note: the document must not contain text like $${\w+}, because searchText compares
        // character by character and starts counting at the first matching character.
        public void ReplaceInParagraph(List<TagInfo> tagList, XWPFParagraph para,
                String regex) {
            // Compare string contents, not references
            if (regex == null || regex.isEmpty())
                regex = defaultRegex;
            List<XWPFRun> runs = para.getRuns();
            for (TagInfo ti : tagList) {
                String find = ti.TagText;
                String replValue = ti.TagValue;
                TextSegement found = para.searchText(find,
                        new PositionInParagraph());
                if (found != null) {
                    // Check whether the match lies within a single run
                    if (found.getBeginRun() == found.getEndRun()) {
                        XWPFRun run = runs.get(found.getBeginRun());
                        // Read text element 0, the same element setText(text, 0) overwrites
                        // (getTextPosition() is the vertical offset, not a text index)
                        String runText = run.getText(0);
                        String replaced = runText.replace(find, replValue);
                        run.setText(replaced, 0);
                    } else {
                        // The match spans several runs: join their text first
                        StringBuilder sb = new StringBuilder();
                        for (int runPos = found.getBeginRun(); runPos <= found
                                .getEndRun(); runPos++) {
                            XWPFRun run = runs.get(runPos);
                            String part = run.getText(0);
                            if (part != null) {
                                sb.append(part);
                            }
                        }
                        String connectedRuns = sb.toString();
                        String replaced = connectedRuns.replace(find, replValue);
                        XWPFRun firstRun = runs.get(found.getBeginRun());
                        firstRun.setText(replaced, 0);
                        // Blank out the remaining runs of the match
                        for (int runPos = found.getBeginRun() + 1; runPos <= found
                                .getEndRun(); runPos++) {
                            XWPFRun partNext = runs.get(runPos);
                            partNext.setText("", 0);
                        }
                    }
                }
            }
            // After the first pass, check whether every tag in the paragraph has been replaced.
            // TODO 2016-06-14: I no longer remember why this code was added
            // Pattern pattern = Pattern.compile(regex);
            // Matcher matcher = pattern.matcher(para.getText());
            // boolean find = matcher.find();
            // if (find) {
            // ReplaceInParagraph(tagList, para, regex);
            // find = false;
            // }
        }

        // Fill a list table: the last row is the template, one new row is added per record
        public void ReplaceInTable(List<List<TagInfo>> tagList, XWPFTable table,
                String regex) {
            int tempRowIndex = table.getRows().size() - 1;
            XWPFTableRow tempRow = table.getRow(tempRowIndex);
            for (List<TagInfo> lst : tagList) {
                table.createRow();
                XWPFTableRow newRow = table.getRow(table.getRows().size() - 1);
                CopytTableRow(newRow, tempRow);
                List<XWPFTableCell> nCells = newRow.getTableCells();
                for (int i = 0; i < nCells.size(); i++) {
                    XWPFTableCell cell = newRow.getCell(i);
                    for (XWPFParagraph p : cell.getParagraphs()) {
                        if (p.getText() != null && p.getText().length() > 0) {
                            ReplaceInParagraph(lst, p, regex);
                        }
                    }
                }
            }
            // Remove the template row
            table.removeRow(tempRowIndex);
        }

        // Replace all tags
        public void ReplaceAllTag(XWPFDocument doc, List<TagInfo> formTagList,
                Map<String, List<List<TagInfo>>> tableTagList, String regex) {
            // Replace tags in ordinary paragraphs
            for (XWPFParagraph p : doc.getParagraphs()) {
                ReplaceInParagraph(formTagList, p, regex);
            }
            // Replace tags in the paragraphs of ordinary tables
            List<XWPFTable> listCommTable = getDocTables(doc, false, regex);
            for (XWPFTable t : listCommTable) {
                List<XWPFParagraph> lstable = getTableParagraph(t);
                for (XWPFParagraph pt : lstable) {
                    ReplaceInParagraph(formTagList, pt, regex);
                }
            }
            // Fill the list tables: clear the table-name tag in the first cell, then add the data rows
            List<XWPFTable> listTable = getDocTables(doc, true, regex);
            for (XWPFTable table : listTable) {
                String tableName = getTableListName(table, regex);
                List<TagInfo> tableNameTags = new ArrayList<TagInfo>();
                tableNameTags.add(new TagInfo(tableName, ""));
                XWPFTableCell firstCell = table.getRow(0).getCell(0);
                List<XWPFParagraph> cellParas = firstCell.getParagraphs();
                for (XWPFParagraph pt : cellParas) {
                    ReplaceInParagraph(tableNameTags, pt, regex);
                }
                List<List<TagInfo>> targetTableList = tableTagList.get(tableName);
                ReplaceInTable(targetTableList, table, regex);
            }
        }
    }
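
    The multi-run branch of ReplaceInParagraph can be hard to picture, so here is a minimal sketch of the same idea with plain strings standing in for runs. It does not use POI; the class name RunMergeSketch, the method replaceAcrossRuns, and the sample `${name}` tag are made up for illustration. The joined text of the affected runs is replaced, written into the first of them, and the rest are blanked.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in: each String plays the role of one run's text.
class RunMergeSketch {

    // Replaces `find` even when it is split across several neighbouring runs.
    static List<String> replaceAcrossRuns(List<String> runs, String find, String value) {
        List<String> result = new ArrayList<>(runs);
        StringBuilder all = new StringBuilder();
        for (String r : result) {
            all.append(r);
        }
        int start = all.indexOf(find);
        if (start < 0) {
            return result; // nothing to replace
        }
        int end = start + find.length() - 1; // index of the last matched character

        // Locate the runs that contain the first and the last matched character.
        int beginRun = -1;
        int endRun = -1;
        int offset = 0;
        for (int i = 0; i < result.size(); i++) {
            int len = result.get(i).length();
            if (beginRun < 0 && start < offset + len) {
                beginRun = i;
            }
            if (end < offset + len) {
                endRun = i;
                break;
            }
            offset += len;
        }

        // Join the affected runs, replace, keep the result in the first run.
        StringBuilder joined = new StringBuilder();
        for (int i = beginRun; i <= endRun; i++) {
            joined.append(result.get(i));
        }
        result.set(beginRun, joined.toString().replace(find, value));
        // Blank the other affected runs so the text is not duplicated.
        for (int i = beginRun + 1; i <= endRun; i++) {
            result.set(i, "");
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> runs = Arrays.asList("Dear ${na", "me}", ", hello");
        System.out.println(replaceAcrossRuns(runs, "${name}", "Alice"));
    }
}
```

    As in the original method, the replaced text takes on the formatting of the first run, because the later runs are emptied rather than removed.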
    Source: http://www.cnblogs.com/yfrs/p/wordpoi.html

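    Anyone who builds OA (office automation) systems has probably run into all kinds of odd customer requests. It used to be enough to implement permissions and features, but more and more customers now ask for things like exporting Excel files or printing data. These look simple, but people in our line of work know they are hard to build, mainly because such features are niche and few developers know the APIs for manipulating documents. So today I am sharing an API that can modify the data in a Word document.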

      For details, see:

      https://stackoverflow.com/questions/22268898/replacing-a-text-in-apache-poi-xwpf/22269035#22269035

      Step 1: add the poi-ooxml dependency:

            <!-- https://mvnrepository.com/artifact/org.apache.poi/poi -->
            <!--
            <dependency>
                <groupId>org.apache.poi</groupId>
                <artifactId>poi</artifactId>
                <version>3.9</version>
            </dependency>
            -->
    
            <!-- https://mvnrepository.com/artifact/org.apache.poi/poi-ooxml -->
            <dependency>
                <groupId>org.apache.poi</groupId>
                <artifactId>poi-ooxml</artifactId>
                <version>3.9</version>
            </dependency>

      Step 2: write the utility class:

    package com.poi.word.util;
    
    import org.apache.poi.POIXMLDocument;
    import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
    import org.apache.poi.xwpf.usermodel.*;
    
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    
    public class DocWriter1 {
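        public static void writer(String inputSrc, String outSrc, Map<String,String> map) {
    
            try {
                XWPFDocument doc = new XWPFDocument(POIXMLDocument.openPackage(inputSrc));
                /**
                 * Replace the specified text in body paragraphs
                 */
                for(XWPFParagraph p : doc.getParagraphs()){
                    List<XWPFRun> runs = p.getRuns();
                    if(runs != null){
                        for(XWPFRun r : runs){
                            // Text of the run that may need replacing
                            String text = r.getText(0);
                            // Replace it when it matches one of the keys
                            for(String key : map.keySet()){
                                if(text != null && text.equals(key)){
                                    // setText takes two arguments: the new text and the
                                    // index of the text element to overwrite. 0 overwrites
                                    // the run's first text element; the one-argument
                                    // overload appends after the existing text instead.
                                    r.setText(map.get(key),0);
                                }
                            }
                        }
                    }
                }
                /**
                 * Replace the specified text in tables
                 */
                for(XWPFTable tab : doc.getTables()){
                    for(XWPFTableRow row : tab.getRows()){
                        for(XWPFTableCell cell : row.getTableCells()){
                            // Do not skip getParagraphs(): a cell can hold several
                            // paragraphs with text to replace, and without this loop
                            // that text is never reached
                            for(XWPFParagraph p : cell.getParagraphs()){
                                for(XWPFRun r : p.getRuns()){
                                    String text = r.getText(0);
                                    for(String key : map.keySet()){
                                        // A run without text returns null, so guard against it
                                        if(text != null && text.equals(key)){
                                            r.setText(map.get(key),0);
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
                // Close the output stream once the document has been written
                FileOutputStream out = new FileOutputStream(outSrc);
                try {
                    doc.write(out);
                } finally {
                    out.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }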
    
        public static void main(String[] args) throws IOException, InvalidFormatException {
            Map<String, String> map = new HashMap<String, String>();
            map.put("people", "people");
            for(int i =0; i<3;i++){
                if(i ==0){
                    map.put("beginTime", "2018-01-01");
                    map.put("endTime", "2018-01-02");
                    map.put("${how}", "实施");
                    map.put("address", "南屏一中");
                    map.put("day", "1");
                    map.put("traffic", "滴滴");
                    map.put("zhusu", "100");
                    map.put("buzu", "50");
                    map.put("xiche", "30");
                    map.put("tingche", "50");
                    map.put("guoqiao", "50");
                    map.put("another", "20");
                    map.put("remark", "agree");
                }else{
                    map.put("how"+i+"", "实施");
                    map.put("address"+i+"", "南平一中");
                    map.put("day"+i+"", "1");
                    map.put("traffic"+i+"", "滴滴");
                    map.put("zhusu"+i+"", "100");
                    map.put("buzu"+i+"", "50");
                    map.put("xiche"+i+"", "50");
                    map.put("tingche"+i+"", "20");
                    map.put("guoqiao"+i+"", "60");
                    map.put("another"+i+"", "40");
                    map.put("remark"+i+"", "agree");
                }
            }
            map.put("bankAddress", "斗门交通银行支行");
            map.put("bankNum", "46898566446464646898565");
            map.put("people1", "people1");
            map.put("people2", "people2");
            map.put("people3", "people3");
            map.put("sumMoney", "265");
            map.put("isAgree", "agree");
            map.put("writeTime", "2019-10-12");
            map.put("remarkpro", "hello");
          
            // Path of the template file
            String srcPath = "D:\\word\\needle.docx";
            // Path of the new file produced by the replacement
            String destPath = "D:\\word\\output.docx";
            writer(srcPath, destPath, map);
        }
    }
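
    The writer method above only replaces a run whose entire text equals a key. For templates where a placeholder sits inside a longer sentence, a regex-based substitution on the text is one alternative. The following is a minimal sketch on plain strings, not POI code; the `${name}` placeholder syntax and the names PlaceholderSketch and fill are assumptions made for illustration (the sample map above mostly uses bare keys).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: fills ${name} placeholders in a piece of text.
class PlaceholderSketch {

    // Assumed placeholder syntax: ${name}, where name is letters, digits or underscore.
    private static final Pattern TAG = Pattern.compile("\\$\\{(\\w+)\\}");

    static String fill(String text, Map<String, String> values) {
        Matcher m = TAG.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            // Placeholders without a value are left untouched.
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in the value from being read as group references.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("beginTime", "2018-01-01");
        values.put("endTime", "2018-01-02");
        System.out.println(fill("From ${beginTime} to ${endTime} (${unknown})", values));
    }
}
```

    With POI, the result of such a call would still have to be written back with setText(text, 0), and a placeholder that Word has split across several runs would need the runs joined first.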

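    Since this was only a test and is not used in a real project, the paths are hardcoded. If needed, the result can be written straight to the response for the client to download, or kept on the server and downloaded on demand.

    Downloading was covered in the previous post. The same approach can also be used to implement printing.

      To print, replace the placeholders with the incoming data, generate the new file, and then pass that file to the print function.

      Since this API really is rather niche, I am posting the Word template as well.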

     

    Reposted from: https://www.cnblogs.com/MyReM/p/9109919.html

  • Opening Word documents on Android with POI

    2017-06-15 10:54:20

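    I recently used Apache's POI library to parse and open Word documents and ran into quite a few problems, with all sorts of errors. After two days of struggling I found that the main culprit is the WordToHtmlConverter.java class (for a project of this size the code is surprisingly careless; even the most basic checks are missing).
    I tried overriding it and repackaging the jar in various ways, and none of that worked. In the end I rewrote the class myself, following its original approach. I am recording it here in the hope that it helps others.
    Without further ado, here is the code.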

    1. Layout: just a WebView

    <?xml version="1.0" encoding="utf-8"?>
    <android.support.constraint.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
        xmlns:app="http://schemas.android.com/apk/res-auto"
        xmlns:tools="http://schemas.android.com/tools"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        tools:context="com.hfga.docview.DocActivity">
    
        <WebView
            android:id="@+id/docview"
            android:layout_width="match_parent"
            android:layout_height="match_parent"></WebView>
    
    </android.support.constraint.ConstraintLayout>
    

    2. Activity

    import android.os.Bundle;
    import android.support.v7.app.AppCompatActivity;
    import android.webkit.WebSettings;
    import android.webkit.WebView;
    
    import org.apache.poi.hwpf.HWPFDocument;
    import org.apache.poi.hwpf.converter.PicturesManager;
    import org.apache.poi.hwpf.usermodel.Picture;
    import org.apache.poi.hwpf.usermodel.PictureType;
    import org.w3c.dom.Document;
    
    import java.io.BufferedWriter;
    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.util.List;
    
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
    
    
    public class DocActivity extends AppCompatActivity {
    
        private WebView webView;
    
        private String docPath = "/mnt/sdcard/Document/";
        private String docName = "test.doc";
        private String savePath = "/mnt/sdcard/Document/temp/";
    
        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_doc);
            initView();
        }
    
        private void initView() {
            webView = (WebView) findViewById(R.id.docview);
            String name = docName.substring(0, docName.indexOf("."));
            if (!(new File(savePath + name).exists())) {
                new File(savePath + name).mkdirs();
            }
            try {
                convert2Html(docPath + docName, savePath + name + ".html");
            } catch (TransformerException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } catch (ParserConfigurationException e) {
                e.printStackTrace();
            }
            WebSettings webSettings = webView.getSettings();
            webSettings.setLoadWithOverviewMode(true);
            webSettings.setSupportZoom(true);
            webSettings.setBuiltInZoomControls(true);
            webView.loadUrl("file://" + savePath + name + ".html");
        }
    
        /**
         * Converts a Word document to HTML
         */
        public void convert2Html(String fileName, String outPutFile)
                throws TransformerException, IOException,
                ParserConfigurationException {
            HWPFDocument wordDocument = new HWPFDocument(new FileInputStream(fileName));
            WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
                    DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
    
            // Set the path used for pictures referenced from the HTML
            wordToHtmlConverter.setPicturesManager(new PicturesManager() {
                public String savePicture(byte[] content,
                                          PictureType pictureType, String suggestedName,
                                          float widthInches, float heightInches) {
                    String name = docName.substring(0, docName.indexOf("."));
                    return name + "/" + suggestedName;
                }
            });
    
            // Save the pictures, closing each output stream after writing
            List<Picture> pics = wordDocument.getPicturesTable().getAllPictures();
            if (pics != null) {
                String name = docName.substring(0, docName.indexOf("."));
                for (Picture pic : pics) {
                    System.out.println(pic.suggestFullFileName());
                    FileOutputStream picOut = null;
                    try {
                        picOut = new FileOutputStream(savePath + name + "/"
                                + pic.suggestFullFileName());
                        pic.writeImageContent(picOut);
                    } catch (FileNotFoundException e) {
                        e.printStackTrace();
                    } finally {
                        if (picOut != null) {
                            picOut.close();
                        }
                    }
                }
            }
            wordToHtmlConverter.processDocument(wordDocument);
            Document htmlDocument = wordToHtmlConverter.getDocument();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            DOMSource domSource = new DOMSource(htmlDocument);
            StreamResult streamResult = new StreamResult(out);
    
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer serializer = tf.newTransformer();
            serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
            serializer.setOutputProperty(OutputKeys.INDENT, "yes");
            serializer.setOutputProperty(OutputKeys.METHOD, "html");
            serializer.transform(domSource, streamResult);
            out.close();
            // Save the HTML file; decode with the same charset the serializer wrote
            writeFile(new String(out.toByteArray(), "utf-8"), outPutFile);
        }
    
        /**
         * Saves the HTML content to the SD card
         */
        public void writeFile(String content, String path) {
            FileOutputStream fos = null;
            BufferedWriter bw = null;
            try {
                File file = new File(path);
                if (!file.exists()) {
                    file.createNewFile();
                }
                fos = new FileOutputStream(file);
                bw = new BufferedWriter(new OutputStreamWriter(fos, "utf-8"));
                bw.write(content);
            } catch (FileNotFoundException fnfe) {
                fnfe.printStackTrace();
            } catch (IOException ioe) {
                ioe.printStackTrace();
            } finally {
                try {
                    if (bw != null)
                        bw.close();
                    if (fos != null)
                        fos.close();
                } catch (IOException ie) {
                    ie.printStackTrace();
                }
            }
        }
    }
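
    The activity derives the picture folder and the HTML file name from docName with substring(0, docName.indexOf(".")), which throws for a name without a dot and cuts at the first dot of a name such as report.v2.doc. Below is a minimal sketch of a more forgiving derivation; DocPathSketch, baseName, htmlFile and pictureDir are hypothetical names, not part of the original activity.

```java
import java.io.File;

// Hypothetical helper for deriving output paths from a document name.
class DocPathSketch {

    // Returns the file name without its extension; names without a dot are returned unchanged.
    static String baseName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    // HTML file written next to the picture folder, e.g. <saveDir>/test.html
    static File htmlFile(String saveDir, String docName) {
        return new File(saveDir, baseName(docName) + ".html");
    }

    // Folder that receives the extracted pictures, e.g. <saveDir>/test
    static File pictureDir(String saveDir, String docName) {
        return new File(saveDir, baseName(docName));
    }

    public static void main(String[] args) {
        String saveDir = "/mnt/sdcard/Document/temp/";
        System.out.println(htmlFile(saveDir, "test.doc"));
        System.out.println(pictureDir(saveDir, "report.v2.doc"));
    }
}
```

    Computing the base name once and reusing it also removes the three separate substring calls in initView, savePicture and the picture loop.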
    

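    3. The most important part: the WordToHtmlConverter class
    This class is a copy of the one in the POI jar; the package-private methods it cannot access from outside are written directly into it. The method that breaks when the jar is used as-is is compactChildNodesR(Element parentElement, String childTagName). I performed a small bit of surgery on it and marked the changes in the code, so you can take a look.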

    
    import org.apache.poi.hpsf.SummaryInformation;
    import org.apache.poi.hwpf.HWPFDocument;
    import org.apache.poi.hwpf.HWPFDocumentCore;
    import org.apache.poi.hwpf.converter.AbstractWordConverter;
    import org.apache.poi.hwpf.converter.AbstractWordUtils;
    import org.apache.poi.hwpf.converter.FontReplacer.Triplet;
    import org.apache.poi.hwpf.converter.HtmlDocumentFacade;
    import org.apache.poi.hwpf.converter.WordToHtmlUtils;
    import org.apache.poi.hwpf.usermodel.Bookmark;
    import org.apache.poi.hwpf.usermodel.CharacterRun;
    import org.apache.poi.hwpf.usermodel.OfficeDrawing;
    import org.apache.poi.hwpf.usermodel.Paragraph;
    import org.apache.poi.hwpf.usermodel.Picture;
    import org.apache.poi.hwpf.usermodel.Range;
    import org.apache.poi.hwpf.usermodel.Section;
    import org.apache.poi.hwpf.usermodel.Table;
    import org.apache.poi.hwpf.usermodel.TableCell;
    import org.apache.poi.hwpf.usermodel.TableRow;
    import org.apache.poi.util.Beta;
    import org.apache.poi.util.POILogFactory;
    import org.apache.poi.util.POILogger;
    import org.apache.poi.util.XMLHelper;
    import org.w3c.dom.Attr;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NamedNodeMap;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    import org.w3c.dom.Text;
    
    import java.io.File;
    import java.io.IOException;
    import java.util.Deque;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.Set;
    import java.util.TreeSet;
    
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
    
    import static org.apache.poi.hwpf.converter.AbstractWordUtils.TWIPS_PER_INCH;
    
    /**
     * Converts Word files (95-2007) into HTML files.
     * <p>
     * This implementation doesn't create images or links to them. This can be
     * changed by overriding {@link #processImage(Element, boolean, Picture)}
     * method.
     */
    @Beta
    public class WordToHtmlConverter extends AbstractWordConverter {
        private static final POILogger logger = POILogFactory.getLogger(WordToHtmlConverter.class);
        private final Deque<BlockProperies> blocksProperies = new LinkedList<BlockProperies>();
        private final HtmlDocumentFacade htmlDocumentFacade;
        private Element notes;
    
        /**
         * Creates new instance of {@link WordToHtmlConverter}. Can be used for
         * output several {@link HWPFDocument}s into single HTML document.
         *
         * @param document XML DOM Document used as HTML document
         */
        public WordToHtmlConverter(Document document) {
            this.htmlDocumentFacade = new HtmlDocumentFacade(document);
        }
    
        public WordToHtmlConverter(HtmlDocumentFacade htmlDocumentFacade) {
            this.htmlDocumentFacade = htmlDocumentFacade;
        }
    
        private static String getSectionStyle(Section section) {
            float leftMargin = section.getMarginLeft() / TWIPS_PER_INCH;
            float rightMargin = section.getMarginRight() / TWIPS_PER_INCH;
            float topMargin = section.getMarginTop() / TWIPS_PER_INCH;
            float bottomMargin = section.getMarginBottom() / TWIPS_PER_INCH;
    
            String style = "margin: " + topMargin + "in " + rightMargin + "in "
                    + bottomMargin + "in " + leftMargin + "in;";
    
            if (section.getNumColumns() > 1) {
                style += "column-count: " + (section.getNumColumns()) + ";";
                if (section.isColumnsEvenlySpaced()) {
                    float distance = section.getDistanceBetweenColumns()
                            / TWIPS_PER_INCH;
                    style += "column-gap: " + distance + "in;";
                } else {
                    style += "column-gap: 0.25in;";
                }
            }
            return style;
        }
    
        /**
         * Java main() interface to interact with {@link WordToHtmlConverter}<p>
         * <p>
         * Usage: WordToHtmlConverter infile outfile<p>
         * <p>
         * Where infile is an input .doc file ( Word 95-2007) which will be rendered
         * as HTML into outfile
         */
        public static void main(String[] args)
                throws IOException, ParserConfigurationException, TransformerException {
            if (args.length < 2) {
                System.err.println("Usage: WordToHtmlConverter <inputFile.doc> <saveTo.html>");
                return;
            }
    
            System.out.println("Converting " + args[0]);
            System.out.println("Saving output to " + args[1]);
    
            Document doc = WordToHtmlConverter.process(new File(args[0]));
    
            DOMSource domSource = new DOMSource(doc);
            StreamResult streamResult = new StreamResult(new File(args[1]));
    
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer serializer = tf.newTransformer();
            // TODO set encoding from a command argument
            serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            serializer.setOutputProperty(OutputKeys.INDENT, "yes");
            serializer.setOutputProperty(OutputKeys.METHOD, "html");
            serializer.transform(domSource, streamResult);
        }
    
        static Document process(File docFile) throws IOException, ParserConfigurationException {
            final HWPFDocumentCore wordDocument = AbstractWordUtils.loadDoc(docFile);
            WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
                    XMLHelper.getDocumentBuilderFactory().newDocumentBuilder()
                            .newDocument());
            wordToHtmlConverter.processDocument(wordDocument);
            return wordToHtmlConverter.getDocument();
        }
    
        static boolean equals(String str1, String str2) {
            return str1 == null ? str2 == null : str1.equals(str2);
        }
    
        @Override
        protected void afterProcess() {
            if (notes != null) {
                htmlDocumentFacade.getBody().appendChild(notes);
            }
    
            htmlDocumentFacade.updateStylesheet();
        }
    
        @Override
        public Document getDocument() {
            return htmlDocumentFacade.getDocument();
        }
    
        @Override
        protected void outputCharacters(Element pElement,
                                        CharacterRun characterRun, String text) {
            Element span = htmlDocumentFacade.getDocument().createElement("span");
            pElement.appendChild(span);
    
            StringBuilder style = new StringBuilder();
            BlockProperies blockProperies = this.blocksProperies.peek();
            Triplet triplet = getCharacterRunTriplet(characterRun);
    
            if ((triplet.fontName != null && !triplet.fontName.equals(""))
                    && (!triplet.fontName.equals(
                    blockProperies.pFontName))) {
                style.append("font-family:" + triplet.fontName + ";");
            }
            if (characterRun.getFontSize() / 2 != blockProperies.pFontSize) {
                style.append("font-size:" + characterRun.getFontSize() / 2 + "pt;");
            }
            if (triplet.bold) {
                style.append("font-weight:bold;");
            }
            if (triplet.italic) {
                style.append("font-style:italic;");
            }
    
            WordToHtmlUtils.addCharactersProperties(characterRun, style);
            if (style.length() != 0) {
                htmlDocumentFacade.addStyleClass(span, "s", style.toString());
            }
    
            Text textNode = htmlDocumentFacade.createText(text);
            span.appendChild(textNode);
        }
    
        @Override
        protected void processBookmarks(HWPFDocumentCore wordDocument,
                                        Element currentBlock, Range range, int currentTableLevel,
                                        List<Bookmark> rangeBookmarks) {
            Element parent = currentBlock;
            for (Bookmark bookmark : rangeBookmarks) {
                Element bookmarkElement = htmlDocumentFacade
                        .createBookmark(bookmark.getName());
                parent.appendChild(bookmarkElement);
                parent = bookmarkElement;
            }
    
            if (range != null) {
                processCharacters(wordDocument, currentTableLevel, range, parent);
            }
        }
    
        @Override
        protected void processDocumentInformation(
                SummaryInformation summaryInformation) {
            if (isNotEmpty(summaryInformation.getTitle())) {
                htmlDocumentFacade.setTitle(summaryInformation.getTitle());
            }
    
            if (isNotEmpty(summaryInformation.getAuthor())) {
                htmlDocumentFacade.addAuthor(summaryInformation.getAuthor());
            }
    
            if (isNotEmpty(summaryInformation.getKeywords())) {
                htmlDocumentFacade.addKeywords(summaryInformation.getKeywords());
            }
    
            if (isNotEmpty(summaryInformation.getComments())) {
                htmlDocumentFacade.addDescription(summaryInformation.getComments());
            }
        }
    
        private boolean isNotEmpty(String s) {
            return s != null && !s.isEmpty();
        }
    
        @Override
        public void processDocumentPart(HWPFDocumentCore wordDocument, Range range) {
            super.processDocumentPart(wordDocument, range);
            afterProcess();
        }
    
        @Override
        protected void processDropDownList(Element block,
                                           CharacterRun characterRun, String[] values, int defaultIndex) {
            Element select = htmlDocumentFacade.createSelect();
            for (int i = 0; i < values.length; i++) {
                select.appendChild(htmlDocumentFacade.createOption(values[i],
                        defaultIndex == i));
            }
            block.appendChild(select);
        }
    
        @Override
        protected void processDrawnObject(HWPFDocument doc,
                                          CharacterRun characterRun, OfficeDrawing officeDrawing,
                                          String path, Element block) {
            Element img = htmlDocumentFacade.createImage(path);
            block.appendChild(img);
        }
    
        @Override
        protected void processEndnoteAutonumbered(HWPFDocument wordDocument,
                                                  int noteIndex, Element block, Range endnoteTextRange) {
            processNoteAutonumbered(wordDocument, "end", noteIndex, block,
                    endnoteTextRange);
        }
    
        @Override
        protected void processFootnoteAutonumbered(HWPFDocument wordDocument,
                                                   int noteIndex, Element block, Range footnoteTextRange) {
            processNoteAutonumbered(wordDocument, "foot", noteIndex, block,
                    footnoteTextRange);
        }
    
        @Override
        protected void processHyperlink(HWPFDocumentCore wordDocument,
                                        Element currentBlock, Range textRange, int currentTableLevel,
                                        String hyperlink) {
            Element basicLink = htmlDocumentFacade.createHyperlink(hyperlink);
            currentBlock.appendChild(basicLink);
    
            if (textRange != null) {
                processCharacters(wordDocument, currentTableLevel, textRange,
                        basicLink);
            }
        }
    
        @Override
        protected void processImage(Element currentBlock, boolean inlined,
                                    Picture picture, String imageSourcePath) {
            final int aspectRatioX = picture.getHorizontalScalingFactor();
            final int aspectRatioY = picture.getVerticalScalingFactor();
    
            StringBuilder style = new StringBuilder();
    
            final float imageWidth;
            final float imageHeight;
    
            final float cropTop;
            final float cropBottom;
            final float cropLeft;
            final float cropRight;
    
            if (aspectRatioX > 0) {
                imageWidth = picture.getDxaGoal() * aspectRatioX / 1000.f
                        / TWIPS_PER_INCH;
                cropRight = picture.getDxaCropRight() * aspectRatioX / 1000.f
                        / TWIPS_PER_INCH;
                cropLeft = picture.getDxaCropLeft() * aspectRatioX / 1000.f
                        / TWIPS_PER_INCH;
            } else {
                imageWidth = picture.getDxaGoal() / TWIPS_PER_INCH;
                cropRight = picture.getDxaCropRight() / TWIPS_PER_INCH;
                cropLeft = picture.getDxaCropLeft() / TWIPS_PER_INCH;
            }
    
            if (aspectRatioY > 0) {
                imageHeight = picture.getDyaGoal() * aspectRatioY / 1000.f
                        / TWIPS_PER_INCH;
                cropTop = picture.getDyaCropTop() * aspectRatioY / 1000.f
                        / TWIPS_PER_INCH;
                cropBottom = picture.getDyaCropBottom() * aspectRatioY / 1000.f
                        / TWIPS_PER_INCH;
            } else {
                imageHeight = picture.getDyaGoal() / TWIPS_PER_INCH;
                cropTop = picture.getDyaCropTop() / TWIPS_PER_INCH;
                cropBottom = picture.getDyaCropBottom() / TWIPS_PER_INCH;
            }
    
            Element root;
            if (Math.abs(cropTop) + Math.abs(cropRight) + Math.abs(cropBottom) + Math.abs(cropLeft) > 0) {
                float visibleWidth = Math
                        .max(0, imageWidth - cropLeft - cropRight);
                float visibleHeight = Math.max(0, imageHeight - cropTop
                        - cropBottom);
    
                root = htmlDocumentFacade.createBlock();
                htmlDocumentFacade.addStyleClass(root, "d",
                        "vertical-align:text-bottom;width:" + visibleWidth
                                + "in;height:" + visibleHeight + "in;");
    
                // complex
                Element inner = htmlDocumentFacade.createBlock();
                htmlDocumentFacade.addStyleClass(inner, "d",
                        "position:relative;width:" + visibleWidth + "in;height:"
                                + visibleHeight + "in;overflow:hidden;");
                root.appendChild(inner);
    
                Element image = htmlDocumentFacade.createImage(imageSourcePath);
                htmlDocumentFacade.addStyleClass(image, "i",
                        "position:absolute;left:-" + cropLeft + "in;top:-" + cropTop
                                + "in;width:" + imageWidth + "in;height:"
                                + imageHeight + "in;");
                inner.appendChild(image);
    
                style.append("overflow:hidden;");
            } else {
                root = htmlDocumentFacade.createImage(imageSourcePath);
                root.setAttribute("style", "width:" + imageWidth + "in;height:"
                        + imageHeight + "in;vertical-align:text-bottom;");
            }
    
            currentBlock.appendChild(root);
        }
    
        @Override
        protected void processImageWithoutPicturesManager(Element currentBlock,
                                                          boolean inlined, Picture picture) {
            // no default implementation -- skip
            currentBlock.appendChild(htmlDocumentFacade.getDocument()
                    .createComment("Image link to '"
                            + picture.suggestFullFileName() + "' can be here"));
        }
    
        @Override
        protected void processLineBreak(Element block, CharacterRun characterRun) {
            block.appendChild(htmlDocumentFacade.createLineBreak());
        }
    
        protected void processNoteAutonumbered(HWPFDocument doc, String type,
                                               int noteIndex, Element block, Range noteTextRange) {
            final String textIndex = String.valueOf(noteIndex + 1);
            final String textIndexClass = htmlDocumentFacade.getOrCreateCssClass(
                    "a", "vertical-align:super;font-size:smaller;");
            final String forwardNoteLink = type + "note_" + textIndex;
            final String backwardNoteLink = type + "note_back_" + textIndex;
    
            Element anchor = htmlDocumentFacade.createHyperlink("#"
                    + forwardNoteLink);
            anchor.setAttribute("name", backwardNoteLink);
            anchor.setAttribute("class", textIndexClass + " " + type
                    + "noteanchor");
            anchor.setTextContent(textIndex);
            block.appendChild(anchor);
    
            if (notes == null) {
                notes = htmlDocumentFacade.createBlock();
                notes.setAttribute("class", "notes");
            }
    
            Element note = htmlDocumentFacade.createBlock();
            note.setAttribute("class", type + "note");
            notes.appendChild(note);
    
            Element bookmark = htmlDocumentFacade.createBookmark(forwardNoteLink);
            bookmark.setAttribute("href", "#" + backwardNoteLink);
            bookmark.setTextContent(textIndex);
            bookmark.setAttribute("class", textIndexClass + " " + type
                    + "noteindex");
            note.appendChild(bookmark);
            note.appendChild(htmlDocumentFacade.createText(" "));
    
            Element span = htmlDocumentFacade.getDocument().createElement("span");
            span.setAttribute("class", type + "notetext");
            note.appendChild(span);
    
            this.blocksProperies.add(new BlockProperies("", -1));
            try {
                processCharacters(doc, Integer.MIN_VALUE, noteTextRange, span);
            } finally {
                this.blocksProperies.pop();
            }
        }
    
        @Override
        protected void processPageBreak(HWPFDocumentCore wordDocument, Element flow) {
            flow.appendChild(htmlDocumentFacade.createLineBreak());
        }
    
        @Override
        protected void processPageref(HWPFDocumentCore hwpfDocument,
                                      Element currentBlock, Range textRange, int currentTableLevel,
                                      String pageref) {
            Element basicLink = htmlDocumentFacade.createHyperlink("#" + pageref);
            currentBlock.appendChild(basicLink);
    
            if (textRange != null) {
                processCharacters(hwpfDocument, currentTableLevel, textRange,
                        basicLink);
            }
        }
    
        @Override
        protected void processParagraph(HWPFDocumentCore hwpfDocument,
                                        Element parentElement, int currentTableLevel, Paragraph paragraph,
                                        String bulletText) {
            final Element pElement = htmlDocumentFacade.createParagraph();
            parentElement.appendChild(pElement);
    
            StringBuilder style = new StringBuilder();
            WordToHtmlUtils.addParagraphProperties(paragraph, style);
    
            final int charRuns = paragraph.numCharacterRuns();
    
            if (charRuns == 0) {
                return;
            }
    
            {
                final String pFontName;
                final int pFontSize;
                final CharacterRun characterRun = paragraph.getCharacterRun(0);
                if (characterRun != null) {
                    Triplet triplet = getCharacterRunTriplet(characterRun);
                    pFontSize = characterRun.getFontSize() / 2;
                    pFontName = triplet.fontName;
                    WordToHtmlUtils.addFontFamily(pFontName, style);
                    WordToHtmlUtils.addFontSize(pFontSize, style);
                } else {
                    pFontSize = -1;
                    pFontName = "";
                }
                blocksProperies.push(new BlockProperies(pFontName, pFontSize));
            }
            try {
                if (isNotEmpty(bulletText)) {
                    if (bulletText.endsWith("\t")) {
                        /*
                         * We don't know how to handle all cases in HTML, but at
                         * least simplest case shall be handled
                         */
                        final float defaultTab = TWIPS_PER_INCH / 2;
                        // char have some space
                        float firstLinePosition = paragraph.getIndentFromLeft()
                                + paragraph.getFirstLineIndent() + 20f;
    
                        float nextStop = (float) (Math.ceil(firstLinePosition
                                / defaultTab) * defaultTab);
    
                        final float spanMinWidth = nextStop - firstLinePosition;
    
                        Element span = htmlDocumentFacade.getDocument()
                                .createElement("span");
                        htmlDocumentFacade
                                .addStyleClass(span, "s",
                                        "display: inline-block; text-indent: 0; min-width: "
                                                + (spanMinWidth / TWIPS_PER_INCH)
                                                + "in;");
                        pElement.appendChild(span);
    
                        Text textNode = htmlDocumentFacade.createText(bulletText
                                .substring(0, bulletText.length() - 1)
                                + UNICODECHAR_ZERO_WIDTH_SPACE
                                + UNICODECHAR_NO_BREAK_SPACE);
                        span.appendChild(textNode);
                    } else {
                        Text textNode = htmlDocumentFacade.createText(bulletText
                                .substring(0, bulletText.length() - 1));
                        pElement.appendChild(textNode);
                    }
                }
    
                processCharacters(hwpfDocument, currentTableLevel, paragraph,
                        pElement);
            } finally {
                blocksProperies.pop();
            }
    
            if (style.length() > 0) {
                htmlDocumentFacade.addStyleClass(pElement, "p", style.toString());
            }
    
            compactSpans(pElement);
            return;
        }
    
        private void compactSpans(Element pElement) {
            compactChildNodesR(pElement, "span");
        }
    
    
        private void compactChildNodesR(Element parentElement, String childTagName) {
            NodeList childNodes = parentElement.getChildNodes();
            // Changed from the original method: added a null check on childNodes
            if (childNodes != null) {
                for (int i = 0; i < childNodes.getLength() - 1; i++) {
                    Node child1 = childNodes.item(i);
                    Node child2 = childNodes.item(i + 1);
                    if (!canBeMerged(child1, child2, childTagName))
                        continue;
    
                    // merge
                    while (child2.getChildNodes().getLength() > 0)
                        child1.appendChild(child2.getFirstChild());
                    // Added: only remove child2 if it is still attached to a parent
                    if (child2.getParentNode() != null) {
                        child2.getParentNode().removeChild(child2);
                        i--;
                    }
    
                }
            }
    
            childNodes = parentElement.getChildNodes();
            if (childNodes != null) {
                // Recurse into every child element, including the last one
                for (int i = 0; i < childNodes.getLength(); i++) {
                    Node child = childNodes.item(i);
                    if (child instanceof Element) {
                        compactChildNodesR((Element) child, childTagName);
                    }
                }
            }
        }
    
        private boolean canBeMerged(Node node1, Node node2, String requiredTagName) {
            if (node1.getNodeType() != Node.ELEMENT_NODE
                    || node2.getNodeType() != Node.ELEMENT_NODE)
                return false;
    
            Element element1 = (Element) node1;
            Element element2 = (Element) node2;
    
            if (!equals(requiredTagName, element1.getTagName())
                    || !equals(requiredTagName, element2.getTagName()))
                return false;
    
            NamedNodeMap attributes1 = element1.getAttributes();
            NamedNodeMap attributes2 = element2.getAttributes();
    
            if (attributes1.getLength() != attributes2.getLength())
                return false;
    
            for (int i = 0; i < attributes1.getLength(); i++) {
                final Attr attr1 = (Attr) attributes1.item(i);
                final Attr attr2;
                if (isNotEmpty(attr1.getNamespaceURI()))
                    attr2 = (Attr) attributes2.getNamedItemNS(
                            attr1.getNamespaceURI(), attr1.getLocalName());
                else
                    attr2 = (Attr) attributes2.getNamedItem(attr1.getName());
    
                if (attr2 == null
                        || !equals(attr1.getTextContent(), attr2.getTextContent()))
                    return false;
            }
    
            return true;
        }
    
        @Override
        protected void processSection(HWPFDocumentCore wordDocument,
                                      Section section, int sectionCounter) {
            Element div = htmlDocumentFacade.createBlock();
            htmlDocumentFacade.addStyleClass(div, "d", getSectionStyle(section));
            htmlDocumentFacade.getBody().appendChild(div);
    
            processParagraphes(wordDocument, div, section, Integer.MIN_VALUE);
        }
    
        @Override
        protected void processSingleSection(HWPFDocumentCore wordDocument,
                                            Section section) {
            htmlDocumentFacade.addStyleClass(htmlDocumentFacade.getBody(), "b",
                    getSectionStyle(section));
    
            processParagraphes(wordDocument, htmlDocumentFacade.getBody(), section,
                    Integer.MIN_VALUE);
        }
    
        @Override
        protected void processTable(HWPFDocumentCore hwpfDocument, Element flow,
                                    Table table) {
            Element tableHeader = htmlDocumentFacade.createTableHeader();
            Element tableBody = htmlDocumentFacade.createTableBody();
    
            final int[] tableCellEdges = buildTableCellEdgesArray(table);
            final int tableRows = table.numRows();
    
            int maxColumns = Integer.MIN_VALUE;
            for (int r = 0; r < tableRows; r++) {
                maxColumns = Math.max(maxColumns, table.getRow(r).numCells());
            }
    
            for (int r = 0; r < tableRows; r++) {
                TableRow tableRow = table.getRow(r);
    
                Element tableRowElement = htmlDocumentFacade.createTableRow();
                StringBuilder tableRowStyle = new StringBuilder();
                WordToHtmlUtils.addTableRowProperties(tableRow, tableRowStyle);
    
                // index of current element in tableCellEdges[]
                int currentEdgeIndex = 0;
                final int rowCells = tableRow.numCells();
                for (int c = 0; c < rowCells; c++) {
                    TableCell tableCell = tableRow.getCell(c);
    
                    if (tableCell.isVerticallyMerged()
                            && !tableCell.isFirstVerticallyMerged()) {
                        currentEdgeIndex += getNumberColumnsSpanned(
                                tableCellEdges, currentEdgeIndex, tableCell);
                        continue;
                    }
    
                    Element tableCellElement;
                    if (tableRow.isTableHeader()) {
                        tableCellElement = htmlDocumentFacade
                                .createTableHeaderCell();
                    } else {
                        tableCellElement = htmlDocumentFacade.createTableCell();
                    }
                    StringBuilder tableCellStyle = new StringBuilder();
                    WordToHtmlUtils.addTableCellProperties(tableRow, tableCell,
                            r == 0, r == tableRows - 1, c == 0, c == rowCells - 1,
                            tableCellStyle);
    
                    int colSpan = getNumberColumnsSpanned(tableCellEdges,
                            currentEdgeIndex, tableCell);
                    currentEdgeIndex += colSpan;
    
                    if (colSpan == 0) {
                        continue;
                    }
    
                    if (colSpan != 1) {
                        tableCellElement.setAttribute("colspan",
                                String.valueOf(colSpan));
                    }
    
                    final int rowSpan = getNumberRowsSpanned(table,
                            tableCellEdges, r, c, tableCell);
                    if (rowSpan > 1) {
                        tableCellElement.setAttribute("rowspan",
                                String.valueOf(rowSpan));
                    }
    
                    processParagraphes(hwpfDocument, tableCellElement, tableCell,
                            table.getTableLevel());
    
                    if (!tableCellElement.hasChildNodes()) {
                        tableCellElement.appendChild(htmlDocumentFacade
                                .createParagraph());
                    }
                    if (tableCellStyle.length() > 0) {
                        htmlDocumentFacade.addStyleClass(tableCellElement,
                                tableCellElement.getTagName(),
                                tableCellStyle.toString());
                    }
    
                    tableRowElement.appendChild(tableCellElement);
                }
    
                if (tableRowStyle.length() > 0) {
                    tableRowElement.setAttribute("class", htmlDocumentFacade
                            .getOrCreateCssClass("r", tableRowStyle.toString()));
                }
    
                if (tableRow.isTableHeader()) {
                    tableHeader.appendChild(tableRowElement);
                } else {
                    tableBody.appendChild(tableRowElement);
                }
            }
    
            final Element tableElement = htmlDocumentFacade.createTable();
            tableElement
                    .setAttribute(
                            "class",
                            htmlDocumentFacade
                                    .getOrCreateCssClass("t",
                                            "table-layout:fixed;border-collapse:collapse;border-spacing:0;"));
            if (tableHeader.hasChildNodes()) {
                tableElement.appendChild(tableHeader);
            }
            if (tableBody.hasChildNodes()) {
                tableElement.appendChild(tableBody);
                flow.appendChild(tableElement);
            } else {
                logger.log(POILogger.WARN, "Table without body starting at [",
                        Integer.valueOf(table.getStartOffset()), "; ",
                        Integer.valueOf(table.getEndOffset()), ")");
            }
        }
    
        private int[] buildTableCellEdgesArray(Table table) {
            Set<Integer> edges = new TreeSet<Integer>();
    
            for (int r = 0; r < table.numRows(); r++) {
                TableRow tableRow = table.getRow(r);
                for (int c = 0; c < tableRow.numCells(); c++) {
                    TableCell tableCell = tableRow.getCell(c);
    
                    edges.add(Integer.valueOf(tableCell.getLeftEdge()));
                    edges.add(Integer.valueOf(tableCell.getLeftEdge()
                            + tableCell.getWidth()));
                }
            }
    
            Integer[] sorted = edges.toArray(new Integer[edges.size()]);
            int[] result = new int[sorted.length];
            for (int i = 0; i < sorted.length; i++) {
                result[i] = sorted[i].intValue();
            }
    
            return result;
        }
    
        /**
         * Holds properties values, applied to current <tt>p</tt> element. Those
         * properties shall not be doubled in children <tt>span</tt> elements.
         */
        private static class BlockProperies {
            final String pFontName;
            final int pFontSize;
    
            public BlockProperies(String pFontName, int pFontSize) {
                this.pFontName = pFontName;
                this.pFontSize = pFontSize;
            }
        }
    
    }
    
    

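    Sometimes text goes missing from the converted document for no apparent reason. The text is in fact read from the file; a bug in the post-processing step makes the logic go wrong. The code can be modified as follows: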

    private void compactChildNodesR(Element parentElement, String childTagName) {
        NodeList childNodes = parentElement.getChildNodes();
        // Changed from the original method: added a null check on childNodes
        if (childNodes != null) {
            for (int i = 0; i < childNodes.getLength() - 1; i++) {
                Node child1 = childNodes.item(i);
                Node child2 = childNodes.item(i + 1);
                if (!canBeMerged(child1, child2, childTagName))
                    continue;

                // merge: move every child of child2 into child1
                while (child2.getChildNodes() != null && child2.getChildNodes().getLength() > 0) {
                    child1.appendChild(child2.getFirstChild());
                }
                // Added: only remove child2 if it is still attached to a parent
                if (child2.getParentNode() != null) {
                    child2.getParentNode().removeChild(child2);
                    // Added: re-fetch the child list to work around child2's parentNode becoming null
                    childNodes = parentElement.getChildNodes();
                    i--;
                }
            }
        }

        childNodes = parentElement.getChildNodes();
        if (childNodes != null) {
            // Recurse into every child element, including the last one
            for (int i = 0; i < childNodes.getLength(); i++) {
                Node child = childNodes.item(i);
                if (child instanceof Element) {
                    compactChildNodesR((Element) child, childTagName);
                }
            }
        }
    }
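
The merge step above can also be tried in isolation. What follows is a minimal sketch, not POI's code: it uses only the JDK's DOM API, walks siblings with `getNextSibling()` instead of indexing into a `NodeList`, and replaces `canBeMerged` with a simplified, hypothetical `sameSpan` check (same tag name and same `class` attribute). The class name `SpanMergeSketch` and its helper names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class SpanMergeSketch {

    // Hypothetical, simplified stand-in for canBeMerged:
    // both nodes must be elements with the given tag and equal class attributes.
    static boolean sameSpan(Node a, Node b, String tag) {
        if (!(a instanceof Element) || !(b instanceof Element)) {
            return false;
        }
        Element x = (Element) a;
        Element y = (Element) b;
        return tag.equals(x.getTagName()) && tag.equals(y.getTagName())
                && x.getAttribute("class").equals(y.getAttribute("class"));
    }

    // Merges runs of adjacent mergeable siblings, then recurses into children.
    static void compact(Element parent, String tag) {
        Node current = parent.getFirstChild();
        while (current != null) {
            Node next = current.getNextSibling();
            if (next != null && sameSpan(current, next, tag)) {
                // appendChild moves the node, so next loses one child per iteration
                while (next.getFirstChild() != null) {
                    current.appendChild(next.getFirstChild());
                }
                parent.removeChild(next);
                // stay on current: its new next sibling may be mergeable too
            } else {
                current = next;
            }
        }
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) {
                compact((Element) c, tag);
            }
        }
    }

    // Parses an XML fragment and returns its root element.
    static Element parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element p = parse("<p><span class=\"s1\">Hel</span><span class=\"s1\">lo </span>"
                + "<span class=\"s2\">wor</span><span class=\"s2\">ld</span></p>");
        compact(p, "span");
        // prints "2 spans, text: Hello world"
        System.out.println(p.getChildNodes().getLength() + " spans, text: " + p.getTextContent());
    }
}
```

Because the loop holds node references rather than list indices, removing a merged sibling cannot shift the position being examined, and the text content before and after compaction can be compared directly to confirm that nothing was dropped.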
    
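    1. The code in the Activity follows someone else's example (copied over and used as-is): http://www.cnblogs.com/esrichina/p/3347454.html
      Thanks to the author for providing the method.
    2. POI official site: http://poi.apache.org. I used the latest version; the JARs can be downloaded from that site.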

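  • Reading Word documents with POI in Java is indeed easy, but the versions differ considerably. So far the only version I have found that does not produce garbled output after editing Word content is poi-3.8-beta4; the latest official releases all produce garbled text, and I don't know why. Code posted: FileInputStream ...
  • Working with Word documents using POI

    2020-05-28 19:59:50
    // The output is a .doc document; this can be changed to .docx template.writeToFile("D:/漏洞问题.doc"); } } ## Problem entity class package com.example.demo.Test_poi; import org.springframework.stereotype.Component; public ...
  • This is a complete project that converts Word documents to HTML with POI; import it into Eclipse and it runs. I wouldn't cheat you out of a few points.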
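  • Parsing Word documents with POI

    10,000+ views 2018-07-26 15:50:02
    import ... import org.apache.log4j.Logger; import org.apache.poi.hwpf.HWPFDocument; import org.apache.poi.hwpf.model.PicturesTable; import org.apache.poi...
  • Reading Word document content with POI

    10,000+ views 2017-05-14 22:59:33
    Two ways to read Word .doc files. 1.1 Reading the file through WordExtractor (when WordExtractor reads the information internally...The hwpf module of Apache POI is dedicated to reading and writing Word .doc files. In hwpf, HWPFDocument is used to represent a Word doc
  • Reading and writing Word documents with POI and downloading the file. This article mainly describes how to import data into a Word document template, assemble columns dynamically as needed, and download the result. 1) First, add the dependency <dependency> <groupId>org.apache.poi</groupId> &...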
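  • Working with Word documents using POI

    2019-09-29 01:49:18
    Keywords: POI, Java, comments, total page count, total character count. 1. Getting to know POI: Apache POI is an open-source project for reading and writing Microsoft OLE2 compound documents such as Excel and Word in Java. The latest version, 3.5, brings many improvements, including support for the OOXML formats of Office 2007, such as xlsx, docx, ...
  • Replacing tags in a Word document with POI

    2020-07-31 22:30:01
    The Maven dependencies are as follows: <!-- ...org.apache.poi</groupId> <artifactId>poi</artifactId> <version>3.13</version> </dependen
  • When POI processes a Word document, the newly generated document is blank ...... So far I have only observed the symptoms above; the preliminary conclusion is that an overly large table affects the generated Word file. I have not yet found a way to fix it.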
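  • Table operations on Word documents with POI

    1,000+ views 2017-03-27 09:33:08
    A recent project needed a Java program that generates Word documents automatically. Simple operations on text and images are fairly easy to implement; the hard part is tables. It took me a long time to find relevant material, so I have collected the JARs and code involved. For the JARs, I recommend downloading the latest ones from the POI website, so that ...
  • Replacing bookmarks in a Word document with POI

    1,000+ views 2019-08-22 12:03:51
    Word files come in two formats by version, doc and docx, and POI requires a different API for each. For doc, use HWPFDocument; the idea is to find the start and end positions of the bookmark and replace the text within that range: public static void docOperate(InputStream ...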
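  • A first look at editing Word documents with POI

    2017-09-15 16:27:40
    1. Determine your POI version and your Word version. In my tests, POI 3.10-FINAL can work with Word 2013; other combinations are untested. The Maven dependencies are listed. 2. About ooxml-schemas: thanks to this fellow for solving the problem ... With the environment sorted out, on to the code.
  • Scenario: after reading a docx document, insert data into it and set styles such as headings ... I. Reading a Word document with POI: InputStream is = null; is = new FileInputStream("path to the docx file"); XWPFDocument doc = new XWPFDocument(is); // doc is the ...
  • With the Office installed on my computer, whether the Word template file is newly created or an edited older one, the new Word file generated after modifying it with POI cannot be opened, and the following error is shown: The backend code reports no error; Word simply cannot open the file. I tried every method I found online without success. Solution: use another ...
  • Replacing tags in a Word document with POI in Java <dependency> <groupId>org.apache.poi</groupId> <artifactId>poi</artifactId> <version>3.13</version> </dependency> ...
  • Projects often require exporting Excel and Word documents. Excel export is more common, but sometimes a Word document (with tables) really has to be exported. Nothing for it but to get on with it, so I started searching. After a day and a night it was finally done. I am recording it here so that others can avoid the detours, ...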
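  • Summary of working with Word documents using POI

    10,000+ views 2014-06-05 17:18:43
    Generating a pure Word dynamic template paragraph by paragraph with POI and importing data. To export data, you can save the Word file as an XML-format ftl file, write variables as ${variableName}, and then replace the variables in a class via FreeMarker. ...Later I found a way to read a pure Word document paragraph by paragraph. You need to find the Word ...
  • [Java utility class] Editing text and images in a Word document template with POI 1. WordUtil import java.io.ByteArrayInputStream; import java.io.IOException; import java.io.InputStream; import java.util.Iterator; import java.util....
  • Reading tables in a Word document with POI in Java

    10,000+ views, hot discussion 2018-09-14 17:32:44
    Read tables from a document with POI; when there are several tables you can specify which one to read, and both docx and doc formats are supported. The POI JARs need to be added. The test document is shown in the figure below: Program code: package com.fise19.read; import java.io....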
