I need to convert pdf to byte array and vice versa.
Can any one help me?
This is how I am converting to byte array
public static byte[] conve
Are'nt you creating the pdf file but not actually writing the byte array back? Therefore you cannot open the PDF.
out = new FileOutputStream("D:/ABC_XYZ/1.pdf");
out.Write(b, 0, b.Length);
out.Position = 0;
out.Close();
This is in addition to correctly reading in the PDF to byte array.
This worked for me. I haven't used any third-party libraries. Just the ones that are shipped with Java.
import java.io.*;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
public class PDFUtility {
public static void main(String[] args) throws IOException {
/**
* Converts byte stream into PDF.
*/
PDFUtility pdfUtility = new PDFUtility();
byte[] byteStreamPDF = pdfUtility.convertPDFtoByteStream();
FileOutputStream fileOutputStream = new FileOutputStream("C:\\Users\\aseem\\Desktop\\BlaFolder\\BlaFolder2\\aseempdf.pdf");
fileOutputStream.write(byteStreamPDF);
fileOutputStream.close();
System.out.println("File written successfully");
}
/**
* Creates PDF to Byte Stream
*
* @return
* @throws IOException
*/
protected byte[] convertPDFtoByteStream() throws IOException {
Path path = Paths.get("C:\\Users\\aseem\\aaa.pdf");
return Files.readAllBytes(path);
}
}
The problem is that you are calling toString()
on the InputStream
object itself. This will return a String
representation of the InputStream
object not the actual PDF document.
You want to read the PDF only as bytes as PDF is a binary format. You will then be able to write out that same byte
array and it will be a valid PDF as it has not been modified.
e.g. to read a file as bytes
File file = new File(sourcePath);
InputStream inputStream = new FileInputStream(file);
byte[] bytes = new byte[file.length()];
inputStream.read(bytes);
You basically need a helper method to read a stream into memory. This works pretty well:
public static byte[] readFully(InputStream stream) throws IOException
{
byte[] buffer = new byte[8192];
ByteArrayOutputStream baos = new ByteArrayOutputStream();
int bytesRead;
while ((bytesRead = stream.read(buffer)) != -1)
{
baos.write(buffer, 0, bytesRead);
}
return baos.toByteArray();
}
Then you'd call it with:
public static byte[] loadFile(String sourcePath) throws IOException
{
InputStream inputStream = null;
try
{
inputStream = new FileInputStream(sourcePath);
return readFully(inputStream);
}
finally
{
if (inputStream != null)
{
inputStream.close();
}
}
}
Don't mix up text and binary data - it only leads to tears.
Calling toString()
on an InputStream
doesn't do what you think it does. Even if it did, a PDF contains binary data, so you wouldn't want to convert it to a string first.
What you need to do is read from the stream, write the results into a ByteArrayOutputStream
, then convert the ByteArrayOutputStream
into an actual byte
array by calling toByteArray()
:
InputStream inputStream = new FileInputStream(sourcePath);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
int data;
while( (data = inputStream.read()) >= 0 ) {
outputStream.write(data);
}
inputStream.close();
return outputStream.toByteArray();
This works for me:
try(InputStream pdfin = new FileInputStream("input.pdf");OutputStream pdfout = new FileOutputStream("output.pdf")){
byte[] buffer = new byte[1024];
int bytesRead;
while((bytesRead = pdfin.read(buffer))!=-1){
pdfout.write(buffer,0,bytesRead);
}
}
But Jon's answer doesn't work for me if used in the following way:
try(InputStream pdfin = new FileInputStream("input.pdf");OutputStream pdfout = new FileOutputStream("output.pdf")){
int k = readFully(pdfin).length;
System.out.println(k);
}
Outputs zero as length. Why is that ?