PDF to byte array and vice versa

独厮守ぢ 2020-11-27 04:19

I need to convert a PDF to a byte array and vice versa.

Can anyone help me?

This is how I am converting it to a byte array:

public static byte[] conve         


        
13 answers
  • 2020-11-27 04:46

    I have implemented similar behaviour in my application without any problems. Below is my version of the code, and it works.

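        import java.io.BufferedInputStream;
        import java.io.File;
        import java.io.FileInputStream;
        import java.io.IOException;

        byte[] getFileInBytes(String filename) {
            File file = new File(filename);
            int length = (int) file.length(); // assumes the file is smaller than 2 GB
            byte[] bytes = new byte[length];
            // try-with-resources closes the stream even if reading fails
            try (BufferedInputStream reader = new BufferedInputStream(new FileInputStream(file))) {
                int offset = 0;
                // read() may return fewer bytes than requested, so loop until the array is full
                while (offset < length) {
                    int count = reader.read(bytes, offset, length - offset);
                    if (count == -1) {
                        break; // the file ended earlier than expected
                    }
                    offset += count;
                }
            } catch (IOException e) {
                // also covers FileNotFoundException, which is a subclass of IOException
                e.printStackTrace();
            }

            return bytes;
        }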
    
  • 2020-11-27 04:53

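    You can do it with Apache Commons IO without worrying about the internal details.

    Use org.apache.commons.io.FileUtils.readFileToByteArray(File file), which returns the file's contents as a byte[]. For the reverse direction, FileUtils.writeByteArrayToFile(File file, byte[] data) writes a byte array back to a file.

    See the FileUtils Javadoc for details.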
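
    For comparison, the same round trip can be sketched with only the JDK, using java.nio.file.Files instead of Commons IO. The class name PdfBytes, its method names, and the command-line paths are made up for illustration; treat this as a minimal sketch rather than the library's approach.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class; the name and methods are illustrative only.
class PdfBytes {

    // Reads the whole file into memory; suitable for files that fit in a byte[].
    static byte[] toBytes(Path pdf) throws IOException {
        return Files.readAllBytes(pdf);
    }

    // Writes the bytes to the target path, creating or overwriting the file.
    static void toFile(byte[] data, Path target) throws IOException {
        Files.write(target, data);
    }

    public static void main(String[] args) throws IOException {
        // Assumed usage: source and target paths are passed on the command line.
        if (args.length < 2) {
            System.err.println("usage: PdfBytes <source.pdf> <target.pdf>");
            return;
        }
        byte[] bytes = toBytes(Paths.get(args[0]));
        toFile(bytes, Paths.get(args[1]));
        System.out.println("copied " + bytes.length + " bytes");
    }
}
```

    Both calls work on raw bytes, so the PDF content is never decoded as text along the way.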

  • 2020-11-27 04:54
    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    public static void main(String[] args) throws IOException {
        File file = new File("java.pdf");

        // Read the PDF into memory in 1 KB chunks
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try (FileInputStream fis = new FileInputStream(file)) {
            for (int readNum; (readNum = fis.read(buf)) != -1;) {
                // write only the readNum bytes actually read, starting at offset 0
                bos.write(buf, 0, readNum);
                System.out.println("read " + readNum + " bytes,");
            }
        }
        byte[] bytes = bos.toByteArray();

        // Reverse direction: write the byte array back out as a PDF
        File someFile = new File("java2.pdf");
        try (FileOutputStream fos = new FileOutputStream(someFile)) {
            fos.write(bytes);
        }
    }
    
  • 2020-11-27 04:55

    To convert a PDF to a byte array:

    // LOGGER is assumed to be a logger field defined on the enclosing class
    public byte[] pdfToByte(String filePath) {
        File file = new File(filePath);
        byte[] finalData = null;

        // try-with-resources closes both streams even if reading fails
        try (FileInputStream fileInputStream = new FileInputStream(file);
             ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int count;
            // a single read() may return fewer bytes than the file holds, so loop to end of stream
            while ((count = fileInputStream.read(buffer)) != -1) {
                byteArrayOutputStream.write(buffer, 0, count);
            }
            finalData = byteArrayOutputStream.toByteArray();
        } catch (FileNotFoundException e) {
            LOGGER.info("File not found" + e);
        } catch (IOException e) {
            LOGGER.info("IO exception" + e);
        }

        return finalData; // null if the file could not be read
    }
    
    
  • 2020-11-27 04:55

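    PDFs may contain binary data, and chances are it is getting mangled when you call toString(). It seems to me that you want this:

            FileInputStream inputStream = new FileInputStream(sourcePath);
    
            // available() is only an estimate of the bytes readable without blocking;
            // for a local file it is usually the remaining length, but that is not guaranteed
            int numberBytes = inputStream.available();
            byte[] bytearray = new byte[numberBytes];
    
            inputStream.read(bytearray);
            inputStream.close();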
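
    To see why the string round trip is a problem, here is a small sketch that pushes a few bytes through new String(...) and getBytes(...). The class name BinaryStringDemo and the sample bytes are made up for illustration: the first four are the "%PDF" signature and the rest are arbitrary values that are not valid UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not taken from the original answers.
class BinaryStringDemo {

    // Decodes the bytes as UTF-8 text, then encodes the text back to bytes.
    static byte[] throughString(byte[] data) {
        String text = new String(data, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] original = {0x25, 0x50, 0x44, 0x46, (byte) 0xFF, (byte) 0xFE, (byte) 0x80};
        byte[] roundTripped = throughString(original);
        System.out.println("original length:      " + original.length);
        System.out.println("round-tripped length: " + roundTripped.length);
        System.out.println("identical: " + Arrays.equals(original, roundTripped));
    }
}
```

    Bytes that do not form valid text in the chosen charset are replaced during decoding, so the array that comes back no longer matches the file. Keeping the data as byte[] from end to end avoids this.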
    
  • 2020-11-27 04:56

    None of these worked for us, possibly because our input stream came from a REST call rather than from a locally hosted PDF file. What worked was using RestAssured to read the PDF as an input stream, parsing it with Tika, and then calling toString() on the content handler to get the text.

    import com.jayway.restassured.RestAssured;
    import com.jayway.restassured.response.Response;
    import com.jayway.restassured.response.ResponseBody;
    
    import org.apache.tika.exception.TikaException;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.parser.AutoDetectParser;
    import org.apache.tika.parser.ParseContext;
    import org.apache.tika.sax.BodyContentHandler;
    import org.apache.tika.parser.Parser;
    import org.xml.sax.ContentHandler;
    import org.xml.sax.SAXException;
    
                // response is assumed to be the RestAssured Response returned by the REST call
                InputStream stream = response.asInputStream();
                Parser parser = new AutoDetectParser(); // Should auto-detect!
                ContentHandler handler = new BodyContentHandler();
                Metadata metadata = new Metadata();
                ParseContext context = new ParseContext();
    
                try {
                    parser.parse(stream, handler, metadata, context);
                } finally {
                    stream.close();
                }
                for (int i = 0; i < metadata.names().length; i++) {
                    String item = metadata.names()[i];
                    System.out.println(item + " -- " + metadata.get(item));
                }
    
                System.out.println("!!Printing pdf content: \n" +handler.toString());
                System.out.println("content type: " + metadata.get(Metadata.CONTENT_TYPE));
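
    If the raw bytes are needed rather than the extracted text, a stream of unknown length (such as a REST response body) can be drained into a byte array with a read loop. This is a minimal sketch that assumes nothing about the HTTP client: the class StreamBytes and the method drain are hypothetical names, and a ByteArrayInputStream stands in for the network stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class; the name and method are illustrative only.
class StreamBytes {

    // Copies everything from the stream into memory until end of stream.
    // The caller remains responsible for closing the stream.
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int count;
        while ((count = in.read(buffer)) != -1) {
            out.write(buffer, 0, count);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data; a real caller would pass the response stream instead.
        byte[] sample = new byte[20000];
        for (int i = 0; i < sample.length; i++) {
            sample[i] = (byte) i;
        }
        try (InputStream in = new ByteArrayInputStream(sample)) {
            byte[] copy = drain(in);
            System.out.println("drained " + copy.length + " bytes");
        }
    }
}
```

    On Java 9 and later, InputStream.readAllBytes() does the same job in a single call.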
    