I need to convert pdf to byte array and vice versa.
Can any one help me?
This is how I am converting to byte array
public static byte[] conve
I have implemented similiar behaviour in my Application too without fail. Below is my version of code and it is functional.
byte[] getFileInBytes(String filename) {
File file = new File(filename);
int length = (int)file.length();
byte[] bytes = new byte[length];
try {
BufferedInputStream reader = new BufferedInputStream(new
FileInputStream(file));
reader.read(bytes, 0, length);
System.out.println(reader);
// setFile(bytes);
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return bytes;
}
You can do it by using Apache Commons IO
without worrying about internal details.
Use org.apache.commons.io.FileUtils.readFileToByteArray(File file)
which return data of type byte[]
.
Click here for Javadoc
public static void main(String[] args) throws FileNotFoundException, IOException {
File file = new File("java.pdf");
FileInputStream fis = new FileInputStream(file);
//System.out.println(file.exists() + "!!");
//InputStream in = resource.openStream();
ByteArrayOutputStream bos = new ByteArrayOutputStream();
byte[] buf = new byte[1024];
try {
for (int readNum; (readNum = fis.read(buf)) != -1;) {
bos.write(buf, 0, readNum); //no doubt here is 0
//Writes len bytes from the specified byte array starting at offset off to this byte array output stream.
System.out.println("read " + readNum + " bytes,");
}
} catch (IOException ex) {
Logger.getLogger(genJpeg.class.getName()).log(Level.SEVERE, null, ex);
}
byte[] bytes = bos.toByteArray();
//below is the different part
File someFile = new File("java2.pdf");
FileOutputStream fos = new FileOutputStream(someFile);
fos.write(bytes);
fos.flush();
fos.close();
}
To convert pdf to byteArray :
public byte[] pdfToByte(String filePath)throws JRException {
File file = new File(<filePath>);
FileInputStream fileInputStream;
byte[] data = null;
byte[] finalData = null;
ByteArrayOutputStream byteArrayOutputStream = null;
try {
fileInputStream = new FileInputStream(file);
data = new byte[(int)file.length()];
finalData = new byte[(int)file.length()];
byteArrayOutputStream = new ByteArrayOutputStream();
fileInputStream.read(data);
byteArrayOutputStream.write(data);
finalData = byteArrayOutputStream.toByteArray();
fileInputStream.close();
} catch (FileNotFoundException e) {
LOGGER.info("File not found" + e);
} catch (IOException e) {
LOGGER.info("IO exception" + e);
}
return finalData;
}
PDFs may contain binary data and chances are it's getting mangled when you do ToString. It seems to me that you want this:
FileInputStream inputStream = new FileInputStream(sourcePath);
int numberBytes = inputStream .available();
byte bytearray[] = new byte[numberBytes];
inputStream .read(bytearray);
None of these worked for us, possibly because our inputstream
was byte
s from a rest call, and not from a locally hosted pdf file. What worked was using RestAssured
to read the PDF as an input stream, and then using Tika pdf reader to parse it and then call the toString()
method.
import com.jayway.restassured.RestAssured;
import com.jayway.restassured.response.Response;
import com.jayway.restassured.response.ResponseBody;
import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.sax.BodyContentHandler;
import org.apache.tika.parser.Parser;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
InputStream stream = response.asInputStream();
Parser parser = new AutoDetectParser(); // Should auto-detect!
ContentHandler handler = new BodyContentHandler();
Metadata metadata = new Metadata();
ParseContext context = new ParseContext();
try {
parser.parse(stream, handler, metadata, context);
} finally {
stream.close();
}
for (int i = 0; i < metadata.names().length; i++) {
String item = metadata.names()[i];
System.out.println(item + " -- " + metadata.get(item));
}
System.out.println("!!Printing pdf content: \n" +handler.toString());
System.out.println("content type: " + metadata.get(Metadata.CONTENT_TYPE));