zoukankan      html  css  js  c++  java
  • 解析xlsx文件---Java读取Excel2007

    关于Java读取Excel2007的文章在Google、百度上搜索一下,没有太好的例子,实现的也不算太好。查看了一下Poi,最新的 POI 3.5 beta 4 支持读写 Excel2007PPT2007(XLSX and PPTX),自己来实现Java读取Excel2007了。

    一、老版本案例,08年的poi.jar;下面案例只适用于与解析格式较小的文件,对于较大的文件会出现内存溢出情况。

    1,下载 POI 3.5 beta 4 解压,把其中的jar包导入项目文件。以我的读取为例,导入了以下jar包。


    没有配置 log4j,测试时报告警报信息,应该为加载顺序导致的初始化问题造成(暂时没有找原因)

    2,建立读取 Excel2007 方法

    Java代码

    // 构造 XSSFWorkbook 对象,strPath 传入文件路径

    XSSFWorkbook xwb = new XSSFWorkbook(path);  //已过时,被废弃,使用会出错

    XSSFWorkbook xwb = new XSSFWorkbook(new FileInputStream(file));

    // 读取第一章表格内容

    XSSFSheet sheet = xwb.getSheetAt(0);

    // 定义 rowcell

    XSSFRow row;

    String cell;

    // 循环输出表格中的内容

    for (int i = sheet.getFirstRowNum(); i < sheet.getPhysicalNumberOfRows(); i++) {

    row = sheet.getRow(i);

    for (int j = row.getFirstCellNum(); j < row.getPhysicalNumberOfCells(); j++) {

    // 通过 row.getCell(j).toString() 获取单元格内容,

    cell = row.getCell(j).toString();

    System.out.print(cell + " ");

    }

    System.out.println("");

    }

            // 构造 XSSFWorkbook 对象,strPath 传入文件路径

            XSSFWorkbook xwb = new XSSFWorkbook(strPath);

            // 读取第一章表格内容

            XSSFSheet sheet = xwb.getSheetAt(0);

            // 定义 rowcell

            XSSFRow row;

            String cell;

            // 循环输出表格中的内容

            for (int i = sheet.getFirstRowNum(); i < sheet.getPhysicalNumberOfRows(); i++) {

                row = sheet.getRow(i);

                for (int j = row.getFirstCellNum(); j < row.getPhysicalNumberOfCells(); j++) {

                    // 通过 row.getCell(j).toString() 获取单元格内容,

                    cell = row.getCell(j).toString();

                    System.out.print(cell + " ");

                }

                System.out.println("");

            }

    此过程直接传入文件所在路径,其他一些过程已经被自动实现。

    3,测试,文件为83563行数据,以文本方式读取内容,

    Java代码

    public static void main(String[] args) {

    SimpleDateFormat sdf = new SimpleDateFormat("HH:mm:ss:SS");

    TimeZone t = sdf.getTimeZone();

    t.setRawOffset(0);

    sdf.setTimeZone(t);

    Long startTime = System.currentTimeMillis();

    String fileName = "F:\我的文档\学生缴费代码.xlsx";

    // 检测代码

    try {

    PoiReadExcel er = new PoiReadExcel();

    // 读取excel2007

    er.testPoiExcel2007(fileName);

    } catch (Exception ex) {

    Logger.getLogger(FastexcelReadExcel.class.getName()).log(Level.SEVERE, null, ex);

    }

    Long endTime = System.currentTimeMillis();

    System.out.println("用时:" + sdf.format(new Date(endTime - startTime)));

    }

        public static void main(String[] args) {

            SimpleDateFormat sdf = new SimpleDateFormat("HH:mm:ss:SS");

            TimeZone t = sdf.getTimeZone();

            t.setRawOffset(0);

            sdf.setTimeZone(t);

            Long startTime = System.currentTimeMillis();

            String fileName = "F:\我的文档\学生缴费代码.xlsx";

            // 检测代码

            try {

                PoiReadExcel er = new PoiReadExcel();

                // 读取excel2007

                er.testPoiExcel2007(fileName);

            } catch (Exception ex) {

                Logger.getLogger(FastexcelReadExcel.class.getName()).log(Level.SEVERE, null, ex);

            }

            Long endTime = System.currentTimeMillis();

            System.out.println("用时:" + sdf.format(new Date(endTime - startTime)));

        }

    读取所用时间为:4.172秒。

    二、java解析大数据分析

    . Apache POI 简介( http://poi.apache.org/

    使用Java程序读写Microsoft Office,提供了下面这几种类型:

        HSSF-提供读写Microsoft Excel XLS格式档案的功能。

        XSSF-提供读写Microsoft Excel OOXML XLSX格式档案的功能。

        HWPF-提供读写Microsoft Word DOC格式档案的功能。

        HSLF-供读写Microsoft PowerPoint格式档案的功能。

        HDGF-提供读Microsoft Visio格式档案的功能。

        HPBF-提供读Microsoft Publisher格式档案的功能。

    二、POI操作Excel

        1. 官方快速帮助:http://poi.apache.org/spreadsheet/quick-guide.html

        2. 导入包:poi-3.6.jar

    参考:

        1. http://www.blogjava.net/vwpolo/archive/2009/09/16/295243.html

        2. http://hacker-zxf.javaeye.com/blog/746546

        3. http://zmx.javaeye.com/blog/622536

        4. http://canfly2010.javaeye.com/blog/701726

     

    [java] view plaincopyprint?

    package excel.poi.input; 

    import java.io.File; 

    import java.io.FileInputStream; 

    import java.io.IOException; 

    import java.io.InputStream; 

    import java.util.Iterator; 

     

    import org.apache.poi.POITextExtractor; 

    import org.apache.poi.extractor.ExtractorFactory; 

    import org.apache.poi.hssf.usermodel.HSSFCell; 

    import org.apache.poi.hssf.usermodel.HSSFRow; 

    import org.apache.poi.hssf.usermodel.HSSFSheet; 

    import org.apache.poi.hssf.usermodel.HSSFWorkbook; 

    import org.apache.poi.openxml4j.exceptions.InvalidFormatException; 

    import org.apache.poi.openxml4j.exceptions.OpenXML4JException; 

    import org.apache.poi.poifs.filesystem.POIFSFileSystem; 

    import org.apache.poi.xssf.usermodel.XSSFRow; 

    import org.apache.poi.xssf.usermodel.XSSFSheet; 

    import org.apache.poi.xssf.usermodel.XSSFWorkbook; 

    import org.apache.xmlbeans.XmlException; 

    public class ReadExcel { 

        /**

         * 读取office 2003 xls

         * @param filePath

         */ 

     @SuppressWarnings({ "unchecked", "deprecation" }) 

    public void loadXls(String filePath){ 

          try { 

               InputStream input = new FileInputStream("D://test.xls"); 

               POIFSFileSystem fs = new POIFSFileSystem(input); 

               HSSFWorkbook wb = new HSSFWorkbook(fs); 

               HSSFSheet sheet = wb.getSheetAt(0); 

               // Iterate over each row in the sheet  

               Iterator rows = sheet.rowIterator(); 

               while (rows.hasNext()) { 

                HSSFRow row = (HSSFRow) rows.next(); 

                System.out.println("Row #" + row.getRowNum()); 

                // Iterate over each cell in the row and print out the cell"s  

                // content  

                Iterator cells = row.cellIterator(); 

                while (cells.hasNext()) { 

                 HSSFCell cell = (HSSFCell) cells.next(); 

                 System.out.println("Cell #" + cell.getCellNum()); 

                 switch (cell.getCellType()) { 

                 case HSSFCell.CELL_TYPE_NUMERIC: 

                  System.out.println(cell.getNumericCellValue()); 

                  break; 

                 case HSSFCell.CELL_TYPE_STRING: 

                  System.out.println(cell.getStringCellValue()); 

                  break; 

                 case HSSFCell.CELL_TYPE_BOOLEAN: 

                  System.out.println(cell.getBooleanCellValue()); 

                  break; 

                 case HSSFCell.CELL_TYPE_FORMULA: 

                  System.out.println(cell.getCellFormula()); 

                  break; 

                 default: 

                  System.out.println("unsuported sell type"); 

                  break; 

                 } 

                } 

               } 

              } catch (IOException ex) { 

               ex.printStackTrace(); 

              } 

     } 

     /**

      * 读取xlsx文本

      * @param filePath

      */ 

     public void loadXlsxText(String filePath){ 

         File inputFile = new File("D://test.xlsx");    

         try { 

            POITextExtractor extractor = ExtractorFactory.createExtractor(inputFile); 

            System.out.println(extractor.getText()); 

        } catch (InvalidFormatException e) { 

            e.printStackTrace(); 

        } catch (IOException e) { 

            e.printStackTrace(); 

        } catch (OpenXML4JException e) { 

            e.printStackTrace(); 

        } catch (XmlException e) { 

            e.printStackTrace(); 

        }    

     } 

     /**

      * 读取office 2007 xlsx

      * @param filePath

      */ 

     public void loadXlsx(String filePath){ 

         // 构造 XSSFWorkbook 对象,strPath 传入文件路径     

        XSSFWorkbook xwb = null; 

        try { 

            xwb = new XSSFWorkbook("D://test.xlsx"); 

        } catch (IOException e) { 

            System.out.println("读取文件出错"); 

            e.printStackTrace(); 

        }    

         // 读取第一章表格内容     

         XSSFSheet sheet = xwb.getSheetAt(0);    

         xwb.getSheetAt(1); 

        // 定义 rowcell     

         XSSFRow row;    

         String cell;     

         // 循环输出表格中的内容     

         for (int i = sheet.getFirstRowNum(); i < sheet.getPhysicalNumberOfRows(); i++) {    

             row = sheet.getRow(i);    

              

             for (int j = row.getFirstCellNum(); j < row.getPhysicalNumberOfCells(); j++) {     

                // 通过 row.getCell(j).toString() 获取单元格内容,     

                cell = row.getCell(j).toString();   

                System.out.print(cell + "/t");    

            }    

             System.out.println("");    

         }    

     } 

     public static void main(String[] args) { 

         ReadExcel readExcel =new ReadExcel(); 

         readExcel.loadXlsx(""); 

     } 

     

    package excel.poi.input;

    import java.io.File;

    import java.io.FileInputStream;

    import java.io.IOException;

    import java.io.InputStream;

    import java.util.Iterator;

    import org.apache.poi.POITextExtractor;

    import org.apache.poi.extractor.ExtractorFactory;

    import org.apache.poi.hssf.usermodel.HSSFCell;

    import org.apache.poi.hssf.usermodel.HSSFRow;

    import org.apache.poi.hssf.usermodel.HSSFSheet;

    import org.apache.poi.hssf.usermodel.HSSFWorkbook;

    import org.apache.poi.openxml4j.exceptions.InvalidFormatException;

    import org.apache.poi.openxml4j.exceptions.OpenXML4JException;

    import org.apache.poi.poifs.filesystem.POIFSFileSystem;

    import org.apache.poi.xssf.usermodel.XSSFRow;

    import org.apache.poi.xssf.usermodel.XSSFSheet;

    import org.apache.poi.xssf.usermodel.XSSFWorkbook;

    import org.apache.xmlbeans.XmlException;

    public class ReadExcel {

           /**

            * 读取office 2003 xls

            * @param filePath

            */

     @SuppressWarnings({ "unchecked", "deprecation" })

    public void loadXls(String filePath){

             try {

                     InputStream input = new FileInputStream("D://test.xls");

                     POIFSFileSystem fs = new POIFSFileSystem(input);

                     HSSFWorkbook wb = new HSSFWorkbook(fs);

                     HSSFSheet sheet = wb.getSheetAt(0);

                     // Iterate over each row in the sheet

                     Iterator rows = sheet.rowIterator();

                     while (rows.hasNext()) {

                      HSSFRow row = (HSSFRow) rows.next();

                      System.out.println("Row #" + row.getRowNum());

                      // Iterate over each cell in the row and print out the cell"s

                      // content

                      Iterator cells = row.cellIterator();

                      while (cells.hasNext()) {

                       HSSFCell cell = (HSSFCell) cells.next();

                       System.out.println("Cell #" + cell.getCellNum());

                       switch (cell.getCellType()) {

                       case HSSFCell.CELL_TYPE_NUMERIC:

                        System.out.println(cell.getNumericCellValue());

                        break;

                       case HSSFCell.CELL_TYPE_STRING:

                        System.out.println(cell.getStringCellValue());

                        break;

                       case HSSFCell.CELL_TYPE_BOOLEAN:

                        System.out.println(cell.getBooleanCellValue());

                        break;

                       case HSSFCell.CELL_TYPE_FORMULA:

                        System.out.println(cell.getCellFormula());

                        break;

                       default:

                        System.out.println("unsuported sell type");

                        break;

                       }

                      }

                     }

                    } catch (IOException ex) {

                     ex.printStackTrace();

                    }

     }

     /**

      * 读取xlsx文本

      * @param filePath

      */

     public void loadXlsxText(String filePath){

            File inputFile = new File("D://test.xlsx");  

            try {

                  POITextExtractor extractor = ExtractorFactory.createExtractor(inputFile);

                  System.out.println(extractor.getText());

           } catch (InvalidFormatException e) {

                  e.printStackTrace();

           } catch (IOException e) {

                  e.printStackTrace();

           } catch (OpenXML4JException e) {

                  e.printStackTrace();

           } catch (XmlException e) {

                  e.printStackTrace();

           }  

     }

     /**

      * 读取office 2007 xlsx

      * @param filePath

      */

     public void loadXlsx(String filePath){

            // 构造 XSSFWorkbook 对象,strPath 传入文件路径  

           XSSFWorkbook xwb = null;

           try {

                  xwb = new XSSFWorkbook("D://test.xlsx");

           } catch (IOException e) {

                  System.out.println("读取文件出错");

                  e.printStackTrace();

           }  

            // 读取第一章表格内容  

            XSSFSheet sheet = xwb.getSheetAt(0);  

            xwb.getSheetAt(1);

           // 定义 rowcell  

            XSSFRow row;  

            String cell;  

            // 循环输出表格中的内容  

            for (int i = sheet.getFirstRowNum(); i < sheet.getPhysicalNumberOfRows(); i++) {  

                row = sheet.getRow(i);  

               

                for (int j = row.getFirstCellNum(); j < row.getPhysicalNumberOfCells(); j++) {  

                   // 通过 row.getCell(j).toString() 获取单元格内容,  

                   cell = row.getCell(j).toString(); 

                   System.out.print(cell + "/t");  

               }  

                System.out.println("");  

            }  

     }

     public static void main(String[] args) {

            ReadExcel readExcel =new ReadExcel();

            readExcel.loadXlsx("");

     }

    }

    [java] view plaincopyprint?

    <SPAN style="COLOR: #ff0000">但是:Workbook wb = WorkbookFactory.create(new FileInputStream(FILE_URL));</SPAN> 

     

    但是:Workbook wb = WorkbookFactory.create(new FileInputStream(FILE_URL));[java] view plaincopyprint?

    <SPAN style="COLOR: #ff0000">查看Create函数的源代码:</SPAN> 

     

    三、java解析excel大数据;临时表的形式

    此篇是紧接上篇《大数据excel文件读取——2003及之前版本》

    excel2007文件格式与之前版本不同,之前版本采用的是微软自己的存储格式。07版内容的存储采用XML格式,所以,理所当然的,对大数据量的xlsx文件的读取采用的也是XML的处理方式SAX

    同之前的版本一样,大数据量文件的读取采用的是事件模型eventusermodelusermodel模式需要将文件一次性全部读到内存中,07版的既然采用的存储模式是xml,解析用的DOM方式也是如此,这种模式操作简单,容易上手,但是对于大量数据占用的内存也是相当可观,在Eclipse中经常出现内存溢出。

    下面就是采用eventusermodel07excel文件读取。

    同上篇,我将当前行的单元格数据存储到List中,抽象出 optRows 方法,该方法会在每行末尾时调用,方法参数为当前行索引curRow(int)及存有行内单元格数据的List。继承类只需实现该行级方法即可。

     

    补充:今天发现读取2007的脚本存在存在一处问题,在遇到空单元格时会跳过该单元格,由于工作紧张没有时间去解决该问题,这里给出一个暂时的处理办法。打开文件,在开始菜单中选择"查找和选择""定位条件",选择"空值",确定,这时会找出所有的空单元格,直接按空格,然后Ctrl+enter,就会将所有空单元格填入一个空格,保存即可。

    2010.6.10补充:空单元格的问题已经解决,在2007的文档中空单元格是不存储的,单元格的r属性是单元格位置信息,格式为[A-Z]+[0-9]+。字母部分是列索引,数字部分是行索引。

    抽象类:XxlsAbstract ,作用:遍历excel文件,提供行级操作方法 optRows

    Java代码

    package com.gaosheng.util.xls;

     

    import java.io.InputStream;

    import java.sql.SQLException;

    import java.util.ArrayList;

    import java.util.Iterator;

    import java.util.List;

    import org.apache.poi.xssf.eventusermodel.XSSFReader;

    import org.apache.poi.xssf.model.SharedStringsTable;

    import org.apache.poi.xssf.usermodel.XSSFRichTextString;

    import org.apache.poi.openxml4j.opc.OPCPackage;

    import org.xml.sax.Attributes;

    import org.xml.sax.InputSource;

    import org.xml.sax.SAXException;

    import org.xml.sax.XMLReader;

    import org.xml.sax.helpers.DefaultHandler;

    import org.xml.sax.helpers.XMLReaderFactory;

     

    /**

    * XSSF and SAX (Event API)

    */

    public abstract class XxlsAbstract extends DefaultHandler {

    private SharedStringsTable sst;

    private String lastContents;

    private boolean nextIsString;

     

    private int sheetIndex = -1;

    private List<String> rowlist = new ArrayList<String>();

    private int curRow = 0; //当前行

    private int curCol = 0; //当前列索引

    private int preCol = 0; //上一列列索引

    private int titleRow = 0; //标题行,一般情况下为0

    private int rowsize = 0; //列数

     

    //excel记录行操作方法,以行索引和行元素列表为参数,对一行元素进行操作,元素为String类型

    // public abstract void optRows(int curRow, List<String> rowlist) throws SQLException ;

     

    //excel记录行操作方法,以sheet索引,行索引和行元素列表为参数,对sheet的一行元素进行操作,元素为String类型

    public abstract void optRows(int sheetIndex,int curRow, List<String> rowlist) throws SQLException;

     

    //只遍历一个sheet,其中sheetId为要遍历的sheet索引,从1开始,1-3

    public void processOneSheet(String filename,int sheetId) throws Exception {

    OPCPackage pkg = OPCPackage.open(filename);

    XSSFReader r = new XSSFReader(pkg);

    SharedStringsTable sst = r.getSharedStringsTable();

     

    XMLReader parser = fetchSheetParser(sst);

     

    // rId2 found by processing the Workbook

    // 根据 rId# rSheet# 查找sheet

    InputStream sheet2 = r.getSheet("rId"+sheetId);

    sheetIndex++;

    InputSource sheetSource = new InputSource(sheet2);

    parser.parse(sheetSource);

    sheet2.close();

    }

     

    /**

    * 遍历 excel 文件

    */

    public void process(String filename) throws Exception {

    OPCPackage pkg = OPCPackage.open(filename);

    XSSFReader r = new XSSFReader(pkg);

    SharedStringsTable sst = r.getSharedStringsTable();

     

    XMLReader parser = fetchSheetParser(sst);

     

    Iterator<InputStream> sheets = r.getSheetsData();

    while (sheets.hasNext()) {

    curRow = 0;

    sheetIndex++;

    InputStream sheet = sheets.next();

    InputSource sheetSource = new InputSource(sheet);

    parser.parse(sheetSource);

    sheet.close();

    }

    }

     

    public XMLReader fetchSheetParser(SharedStringsTable sst)

    throws SAXException {

    XMLReader parser = XMLReaderFactory

    .createXMLReader("org.apache.xerces.parsers.SAXParser");

    this.sst = sst;

    parser.setContentHandler(this);

    return parser;

    }

     

    public void startElement(String uri, String localName, String name,

    Attributes attributes) throws SAXException {

    // c => 单元格

    if (name.equals("c")) {

    // 如果下一个元素是 SST 的索引,则将nextIsString标记为true

    String cellType = attributes.getValue("t");

    String rowStr = attributes.getValue("r");

    curCol = this.getRowIndex(rowStr);

    if (cellType != null && cellType.equals("s")) {

    nextIsString = true;

    } else {

    nextIsString = false;

    }

    }

    // 置空

    lastContents = "";

    }

     

    public void endElement(String uri, String localName, String name)

    throws SAXException {

    // 根据SST的索引值的到单元格的真正要存储的字符串

    // 这时characters()方法可能会被调用多次

    if (nextIsString) {

    try {

    int idx = Integer.parseInt(lastContents);

    lastContents = new XSSFRichTextString(sst.getEntryAt(idx))

    .toString();

    } catch (Exception e) {

     

    }

    }

     

    // v => 单元格的值,如果单元格是字符串则v标签的值为该字符串在SST中的索引

    // 将单元格内容加入rowlist中,在这之前先去掉字符串前后的空白符

    if (name.equals("v")) {

    String value = lastContents.trim();

    value = value.equals("")?" ":value;

    int cols = curCol-preCol;

    if (cols>1){

    for (int i = 0;i < cols-1;i++){

    rowlist.add(preCol,"");

    }

    }

    preCol = curCol;

    rowlist.add(curCol-1, value);

    }else {

    //如果标签名称为 row ,这说明已到行尾,调用 optRows() 方法

    if (name.equals("row")) {

    int tmpCols = rowlist.size();

    if(curRow>this.titleRow && tmpCols<this.rowsize){

    for (int i = 0;i < this.rowsize-tmpCols;i++){

    rowlist.add(rowlist.size(), "");

    }

    }

    try {

    optRows(sheetIndex,curRow,rowlist);

    } catch (SQLException e) {

    e.printStackTrace();

    }

    if(curRow==this.titleRow){

    this.rowsize = rowlist.size();

    }

    rowlist.clear();

    curRow++;

    curCol = 0;

    preCol = 0;

    }

    }

    }

     

    public void characters(char[] ch, int start, int length)

    throws SAXException {

    //得到单元格内容的值

    lastContents += new String(ch, start, length);

    }

     

    //得到列索引,每一列c元素的r属性构成为字母加数字的形式,字母组合为列索引,数字组合为行索引,

    //AB45,表示为第(A-A+1*26+B-A+1*26列,45

    public int getRowIndex(String rowStr){

    rowStr = rowStr.replaceAll("[^A-Z]", "");

    byte[] rowAbc = rowStr.getBytes();

    int len = rowAbc.length;

    float num = 0;

    for (int i=0;i<len;i++){

    num += (rowAbc[i]-'A'+1)*Math.pow(26,len-i-1 );

    }

    return (int) num;

    }

     

    public int getTitleRow() {

    return titleRow;

    }

     

    public void setTitleRow(int titleRow) {

    this.titleRow = titleRow;

    }

    }

    package com.gaosheng.util.xls;

     

    import java.io.InputStream;

    import java.sql.SQLException;

    import java.util.ArrayList;

    import java.util.Iterator;

    import java.util.List;

     

    import org.apache.poi.xssf.eventusermodel.XSSFReader;

    import org.apache.poi.xssf.model.SharedStringsTable;

    import org.apache.poi.xssf.usermodel.XSSFRichTextString;

    import org.apache.poi.openxml4j.opc.OPCPackage;

    import org.xml.sax.Attributes;

    import org.xml.sax.InputSource;

    import org.xml.sax.SAXException;

    import org.xml.sax.XMLReader;

    import org.xml.sax.helpers.DefaultHandler;

    import org.xml.sax.helpers.XMLReaderFactory;

     

    /**

     * XSSF and SAX (Event API)

     */

    public abstract class XxlsAbstract extends DefaultHandler {

           private SharedStringsTable sst;

           private String lastContents;

           private boolean nextIsString;

     

           private int sheetIndex = -1;

           private List<String> rowlist = new ArrayList<String>();

           private int curRow = 0;         //当前行

           private int curCol = 0;          //当前列索引

           private int preCol = 0;          //上一列列索引

           private int titleRow = 0; //标题行,一般情况下为0

           private int rowsize = 0;  //列数

          

           //excel记录行操作方法,以行索引和行元素列表为参数,对一行元素进行操作,元素为String类型

    //     public abstract void optRows(int curRow, List<String> rowlist) throws SQLException ;

          

           //excel记录行操作方法,以sheet索引,行索引和行元素列表为参数,对sheet的一行元素进行操作,元素为String类型

           public abstract void optRows(int sheetIndex,int curRow, List<String> rowlist) throws SQLException;

          

           //只遍历一个sheet,其中sheetId为要遍历的sheet索引,从1开始,1-3

           public void processOneSheet(String filename,int sheetId) throws Exception {

                  OPCPackage pkg = OPCPackage.open(filename);

                  XSSFReader r = new XSSFReader(pkg);

                  SharedStringsTable sst = r.getSharedStringsTable();

                 

                  XMLReader parser = fetchSheetParser(sst);

     

                  // rId2 found by processing the Workbook

                  // 根据 rId# rSheet# 查找sheet

                  InputStream sheet2 = r.getSheet("rId"+sheetId);

                  sheetIndex++;

                  InputSource sheetSource = new InputSource(sheet2);

                  parser.parse(sheetSource);

                  sheet2.close();

           }

     

           /**

            * 遍历 excel 文件

            */

           public void process(String filename) throws Exception {

                  OPCPackage pkg = OPCPackage.open(filename);

                  XSSFReader r = new XSSFReader(pkg);

                  SharedStringsTable sst = r.getSharedStringsTable();

     

                  XMLReader parser = fetchSheetParser(sst);

     

                  Iterator<InputStream> sheets = r.getSheetsData();

                  while (sheets.hasNext()) {

                         curRow = 0;

                         sheetIndex++;

                         InputStream sheet = sheets.next();

                         InputSource sheetSource = new InputSource(sheet);

                         parser.parse(sheetSource);

                         sheet.close();

                  }

           }

     

           public XMLReader fetchSheetParser(SharedStringsTable sst)

                         throws SAXException {

                  XMLReader parser = XMLReaderFactory

                                .createXMLReader("org.apache.xerces.parsers.SAXParser");

                  this.sst = sst;

                  parser.setContentHandler(this);

                  return parser;

           }

     

           public void startElement(String uri, String localName, String name,

                         Attributes attributes) throws SAXException {

                  // c => 单元格

                  if (name.equals("c")) {

                         // 如果下一个元素是 SST 的索引,则将nextIsString标记为true

                         String cellType = attributes.getValue("t");

                         String rowStr = attributes.getValue("r");

                         curCol = this.getRowIndex(rowStr);

                         if (cellType != null && cellType.equals("s")) {

                                nextIsString = true;

                         } else {

                                nextIsString = false;

                         }

                  }

                  // 置空

                  lastContents = "";

           }

     

           public void endElement(String uri, String localName, String name)

                         throws SAXException {

                  // 根据SST的索引值的到单元格的真正要存储的字符串

                  // 这时characters()方法可能会被调用多次

                  if (nextIsString) {

                         try {

                                int idx = Integer.parseInt(lastContents);

                                lastContents = new XSSFRichTextString(sst.getEntryAt(idx))

                                              .toString();

                         } catch (Exception e) {

     

                         }

                  }

     

                  // v => 单元格的值,如果单元格是字符串则v标签的值为该字符串在SST中的索引

                  // 将单元格内容加入rowlist中,在这之前先去掉字符串前后的空白符

                  if (name.equals("v")) {

                         String value = lastContents.trim();

                         value = value.equals("")?" ":value;

                         int cols = curCol-preCol;

                         if (cols>1){

                                for (int i = 0;i < cols-1;i++){

                                       rowlist.add(preCol,"");

                                }

                         }

                         preCol = curCol;

                         rowlist.add(curCol-1, value);

                  }else {

                         //如果标签名称为 row ,这说明已到行尾,调用 optRows() 方法

                         if (name.equals("row")) {

                                int tmpCols = rowlist.size();

                                if(curRow>this.titleRow && tmpCols<this.rowsize){

                                       for (int i = 0;i < this.rowsize-tmpCols;i++){

                                              rowlist.add(rowlist.size(), "");

                                       }

                                }

                                try {

                                       optRows(sheetIndex,curRow,rowlist);

                                } catch (SQLException e) {

                                       e.printStackTrace();

                                }

                                if(curRow==this.titleRow){

                                       this.rowsize = rowlist.size();

                                }

                                rowlist.clear();

                                curRow++;

                                curCol = 0;

                                preCol = 0;

                         }

                  }

           }

     

           public void characters(char[] ch, int start, int length)

                         throws SAXException {

                  //得到单元格内容的值

                  lastContents += new String(ch, start, length);

           }

          

           //得到列索引,每一列c元素的r属性构成为字母加数字的形式,字母组合为列索引,数字组合为行索引,

           //AB45,表示为第(A-A+1*26+B-A+1*26列,45

           public int getRowIndex(String rowStr){

                  rowStr = rowStr.replaceAll("[^A-Z]", "");

                  byte[] rowAbc = rowStr.getBytes();

                  int len = rowAbc.length;

                  float num = 0;

                  for (int i=0;i<len;i++){

                         num += (rowAbc[i]-'A'+1)*Math.pow(26,len-i-1 );

                  }

                  return (int) num;

           }

     

           public int getTitleRow() {

                  return titleRow;

           }

     

           public void setTitleRow(int titleRow) {

                  this.titleRow = titleRow;

           }

    }

     

    继承类:XxlsBig,作用:将数据转出到数据库临时表

    Java代码

    package com.gaosheng.util.examples.xls;

     

    import java.io.FileInputStream;

    import java.io.IOException;

    import java.sql.Connection;

    import java.sql.DriverManager;

    import java.sql.PreparedStatement;

    import java.sql.SQLException;

    import java.sql.Statement;

    import java.util.List;

    import java.util.Properties;

     

    import com.gaosheng.util.xls.XxlsAbstract;

     

    public class XxlsBig extends XxlsAbstract {

    public static void main(String[] args) throws Exception {

    XxlsBig howto = new XxlsBig("temp_table");

    howto.processOneSheet("F:/new.xlsx",1);

    howto.process("F:/new.xlsx");

    howto.close();

    }

     

    public XxlsBig(String tableName) throws SQLException{

    this.conn = getNew_Conn();

    this.statement = conn.createStatement();

    this.tableName = tableName;

    }

     

    private Connection conn = null;

    private Statement statement = null;

    private PreparedStatement newStatement = null;

     

    private String tableName = "temp_table";

    private boolean create = true;

     

    public void optRows(int sheetIndex,int curRow, List<String> rowlist) throws SQLException {

    if (sheetIndex == 0 && curRow == 0) {

    StringBuffer preSql = new StringBuffer("insert into " + tableName

    + " values(");

    StringBuffer table = new StringBuffer("create table " + tableName

    + "(");

    int c = rowlist.size();

    for (int i = 0; i < c; i++) {

    preSql.append("?,");

    table.append(rowlist.get(i));

    table.append(" varchar2(100) ,");

    }

     

    table.deleteCharAt(table.length() - 1);

    preSql.deleteCharAt(preSql.length() - 1);

    table.append(")");

    preSql.append(")");

    if (create) {

    statement = conn.createStatement();

    try{

    statement.execute("drop table "+tableName);

    }catch(Exception e){

     

    }finally{

    System.out.println(" "+tableName+" 删除成功");

    }

    if (!statement.execute(table.toString())) {

    System.out.println("创建表 "+tableName+" 成功");

    // return;

    } else {

    System.out.println("创建表 "+tableName+" 失败");

    return;

    }

    }

    conn.setAutoCommit(false);

    newStatement = conn.prepareStatement(preSql.toString());

     

    } else if(curRow>0) {

    // 一般行

    int col = rowlist.size();

    for (int i = 0; i < col; i++) {

    newStatement.setString(i + 1, rowlist.get(i).toString());

    }

    newStatement.addBatch();

    if (curRow % 1000 == 0) {

    newStatement.executeBatch();

    conn.commit();

    }

    }

    }

     

    private static Connection getNew_Conn() {

    Connection conn = null;

    Properties props = new Properties();

    FileInputStream fis = null;

     

    try {

    fis = new FileInputStream("D:/database.properties");

    props.load(fis);

    DriverManager.registerDriver(new oracle.jdbc.driver.OracleDriver());

    // String jdbcURLString =

    // "jdbc:oracle:thin:@192.168.0.28:1521:orcl";

    StringBuffer jdbcURLString = new StringBuffer();

    jdbcURLString.append("jdbc:oracle:thin:@");

    jdbcURLString.append(props.getProperty("host"));

    jdbcURLString.append(":");

    jdbcURLString.append(props.getProperty("port"));

    jdbcURLString.append(":");

    jdbcURLString.append(props.getProperty("database"));

    conn = DriverManager.getConnection(jdbcURLString.toString(), props

    .getProperty("user"), props.getProperty("password"));

    } catch (Exception e) {

    e.printStackTrace();

    } finally {

    try {

    fis.close();

    } catch (IOException e) {

    e.printStackTrace();

    }

    }

    return conn;

    }

     

    public int close() {

    try {

    newStatement.executeBatch();

    conn.commit();

    System.out.println("数据写入完毕");

    this.newStatement.close();

    this.statement.close();

    this.conn.close();

    return 1;

    } catch (SQLException e) {

    return 0;

    }

    }

    }

    package com.gaosheng.util.examples.xls;

     

    import java.io.FileInputStream;

    import java.io.IOException;

    import java.sql.Connection;

    import java.sql.DriverManager;

    import java.sql.PreparedStatement;

    import java.sql.SQLException;

    import java.sql.Statement;

    import java.util.List;

    import java.util.Properties;

     

    import com.gaosheng.util.xls.XxlsAbstract;

     

    public class XxlsBig extends XxlsAbstract {

           public static void main(String[] args) throws Exception {

                  XxlsBig howto = new XxlsBig("temp_table");

                  howto.processOneSheet("F:/new.xlsx",1);

                  howto.process("F:/new.xlsx");

                  howto.close();

           }

          

           public XxlsBig(String tableName) throws SQLException{

                  this.conn = getNew_Conn();

                  this.statement = conn.createStatement();

                  this.tableName = tableName;

           }

     

           private Connection conn = null;

           private Statement statement = null;

           private PreparedStatement newStatement = null;

     

           private String tableName = "temp_table";

           private boolean create = true;

          

           public void optRows(int sheetIndex,int curRow, List<String> rowlist) throws SQLException {

                  if (sheetIndex == 0 && curRow == 0) {

                         StringBuffer preSql = new StringBuffer("insert into " + tableName

                                       + " values(");

                         StringBuffer table = new StringBuffer("create table " + tableName

                                       + "(");

                         int c = rowlist.size();

                         for (int i = 0; i < c; i++) {

                                preSql.append("?,");

                                table.append(rowlist.get(i));

                                table.append("  varchar2(100) ,");

                         }

     

                         table.deleteCharAt(table.length() - 1);

                         preSql.deleteCharAt(preSql.length() - 1);

                         table.append(")");

                         preSql.append(")");

                         if (create) {

                                statement = conn.createStatement();

                                try{

                                       statement.execute("drop table "+tableName);

                                }catch(Exception e){

                                      

                                }finally{

                                       System.out.println(" "+tableName+" 删除成功");

                                }

                                if (!statement.execute(table.toString())) {

                                       System.out.println("创建表 "+tableName+" 成功");

                                       // return;

                                } else {

                                       System.out.println("创建表 "+tableName+" 失败");

                                       return;

                                }

                         }

                         conn.setAutoCommit(false);

                         newStatement = conn.prepareStatement(preSql.toString());

     

                  } else if(curRow>0) {

                         // 一般行

                         int col = rowlist.size();

                         for (int i = 0; i < col; i++) {

                                newStatement.setString(i + 1, rowlist.get(i).toString());

                         }

                         newStatement.addBatch();

                         if (curRow % 1000 == 0) {

                                newStatement.executeBatch();

                                conn.commit();

                         }

                  }

           }

          

        private static Connection getNew_Conn() {

            Connection conn = null;

            Properties props = new Properties();

            FileInputStream fis = null;

     

            try {

                fis = new FileInputStream("D:/database.properties");

                props.load(fis);

                DriverManager.registerDriver(new oracle.jdbc.driver.OracleDriver());

                // String jdbcURLString =

                // "jdbc:oracle:thin:@192.168.0.28:1521:orcl";

                StringBuffer jdbcURLString = new StringBuffer();

                jdbcURLString.append("jdbc:oracle:thin:@");

                jdbcURLString.append(props.getProperty("host"));

                jdbcURLString.append(":");

                jdbcURLString.append(props.getProperty("port"));

                jdbcURLString.append(":");

                jdbcURLString.append(props.getProperty("database"));

                conn = DriverManager.getConnection(jdbcURLString.toString(), props

                        .getProperty("user"), props.getProperty("password"));

            } catch (Exception e) {

                e.printStackTrace();

            } finally {

                try {

                    fis.close();

                } catch (IOException e) {

                    e.printStackTrace();

                }

            }

            return conn;

        }

       

           public int close() {

                  try {

                         newStatement.executeBatch();

                         conn.commit();

                         System.out.println("数据写入完毕");

                         this.newStatement.close();

                         this.statement.close();

                         this.conn.close();

                         return 1;

                  } catch (SQLException e) {

                         return 0;

                  }

           }

    }

    继承类:XxlsPrint,作用:将数据输出到控制台

    Java代码

    package com.gaosheng.util.examples.xls;

     

    import java.sql.SQLException;

    import java.util.List;

     

    import com.gaosheng.util.xls.XxlsAbstract;

     

    public class XxlsPrint extends XxlsAbstract {

     

    @Override

    public void optRows(int sheetIndex,int curRow, List<String> rowlist) throws SQLException {

    for (int i = 0; i < rowlist.size(); i++) {

    System.out.print("'" + rowlist.get(i) + "',");

    }

    System.out.println();

    }

     

    public static void main(String[] args) throws Exception {

    XxlsPrint howto = new XxlsPrint();

    howto.processOneSheet("F:/new.xlsx",1);

    // howto.processAllSheets("F:/new.xlsx");

    }

    }

    package com.gaosheng.util.examples.xls;

     

    import java.sql.SQLException;

    import java.util.List;

     

    import com.gaosheng.util.xls.XxlsAbstract;

     

    public class XxlsPrint extends XxlsAbstract {

     

           @Override

           public void optRows(int sheetIndex,int curRow, List<String> rowlist) throws SQLException {

                  for (int i = 0; i < rowlist.size(); i++) {

                         System.out.print("'" + rowlist.get(i) + "',");

                  }

                  System.out.println();

           }

     

           public static void main(String[] args) throws Exception {

                  XxlsPrint howto = new XxlsPrint();

                  howto.processOneSheet("F:/new.xlsx",1);

    //            howto.processAllSheets("F:/new.xlsx");

           }

    }

    源代码在附件中,还包含了说明文件、数据库配置文件、以及整合xls文件和xlsx文件读取的类:Xls2Do

    excelutil.zip (87.6 KB)

    下载次数: 65

     

  • 相关阅读:
    springMVC的@RequestMapping的参数和用法
    springMVC的helloword
    eclipse 导入MyEclipse创建的web项目该如何启动?
    解决mysql中文乱码
    java核心技术 卷1 基础知识—对象与继承
    Java核心技术 卷1 基础知识—概述、环境
    python爬虫框架之scrapy的快速上手
    如何评测软件工程知识技能水平
    金融文本挖掘的业务领域建模
    金融文本挖掘中的针对业务的用例建模
  • 原文地址:https://www.cnblogs.com/yangkai-cn/p/4017287.html
Copyright © 2011-2022 走看看