Generic excel file processing with Apache POI

I'm assuming that you already have a working knowledge of Apache POI. If not, here are some good starting points -

Generic?
Generic here stands for a single piece of code that can read both xls (Excel 2003 and prior) and xlsx (Excel 2007 and onwards) files.

How?
POI provides different components/models for working with xls and xlsx files - HSSF for xls and XSSF for xlsx. These components are further divided into models which differ in the way they process files. A comparision between these is shown in the image below (borrowed from here) -


The SS model lies on top of HSSF and XSSF models and provides the genericity we're talking about here. POI code written using the SS model will work for both xls (HSSF) and xlsx (XSSF) files.

Code
There's just one catch that needs to be taken care of - loading the excel file generically. For this we'll use the SS model's WorkbookFactory class which automatically detects the type of excel file and loads it.

(Note - Code borrowed from Java Dev Tips )

public List<List<String>> parseSpreadSheet(InputStream inputStream) {
    Workbook workBook = null;
    try {
        workBook = WorkbookFactory.create(inputStream);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
    Sheet sheet = workBook.getSheetAt(0);   
    List<List<String>> rowHolder = new ArrayList<List<String>>();
    int cellNum = sheet.getRow(1).getLastCellNum();   
 
    for (int i = 0; i <= sheet.getLastRowNum(); i++) {
        Row row = sheet.getRow(i);
        List<String> cellHolder = new ArrayList<String>();
 
        for (int j = 0; j < row.getLastCellNum(); j++) {
            Cell cell = row.getCell(j);          
            String cellValue = parseCellValue(workBook, cell);
            cellHolder.add(cellValue);
        }
 
        //add empty cells to the end if required
        while (cellHolder.size() < cellNum) {
            cellHolder.add(null);                  
        }
        rowHolder.add(cellHolder);   
    }
    return rowHolder;
}

No comments:

Post a Comment