I have 2 excel files and i wanted to compare the contents and highlight the differences. For example:
first file...
name|age
abc|123
def|456
second f
Like DaDaDom said Apache POI is what you are looking for. You can download it from this page. Mind that POI project is not fully independent and you may need to download some extra libraries. Follow the instructions on Apache POI website. This is how you use it:
InputStream myxls = new FileInputStream("workbook.xls"));
HSSFWorkbook wb = new HSSFWorkbook(myxls); // for *.xlsx use XSSFWorkbook
If it's a new file you might need to create sheet before proceeding, but in this case the files are already created.
HSSFSheet sheet = wb.getSheetAt(0); // first sheet
HSSFRow row = sheet.getRow(0); // first row
HSSFCell cell = row.getCell((short)0); // first cell
To get value from the cell use:
String value = cell.getStringCellValue();
However if the type stored in cell is numeric you would get an error. In case of numbers use:
Int value = cell.getCellValue();
This is a method I wrote to deal with different cell data types:
public String getValue(int x, int y){
Row row = this.activeSheet.getRow(y);
if(row==null) return "";
Cell cell = row.getCell(x);
if(cell==null) return "";
int type = cell.getCellType();
switch(type){
case 0:
return cell.getNumericCellValue() + "";
case 1:
return cell.getStringCellValue();
case 2:
return cell.getCellFormula();
case 3:
return "";
case 4:
return cell.getBooleanCellValue() + "";
case 5:
return cell.getErrorCellValue() + "";
default:
return "";
}
}
I hope this quick introduction into Apache POI will help you with your project :)
From this question, my answer partially duplicated below.
My project simple-excel which provides a bunch of Hamcrest Matchers and wraps up Apache POI's syntax.
When you do something like the following,
assertThat(actual, WorkbookMatcher.sameWorkbook(expected));
You'd see, for example,
java.lang.AssertionError:
Expected: entire workbook to be equal
but: cell at "C14" contained <"bananas"> expected <nothing>,
cell at "C15" contained <"1,850,000 EUR"> expected <"1,850,000.00 EUR">,
cell at "D16" contained <nothing> expected <"Tue Sep 04 06:30:00">
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
Read a blog post about it
I would use epplus to load both documents into datatables and then iterate over them to find differences. Depending on how you want to highlight differences, you could simply color the cells with epplus and save them back to the files..