Launching wkhtmltopdf from Runtime.getRuntime().exec(): never terminates?

清歌不尽 2021-01-12 06:20

I'm launching wkhtmltopdf from within my Java app (part of a Tomcat server, running in debug mode within Eclipse Helios on Win7 64-bit): I'd like to wait for it to comple

4 answers
  • 2021-01-12 06:41

    I had the same exact issue as you and I solved it. Here are my findings:

    For some reason, the output from wkhtmltopdf goes to STDERR of the process and NOT STDOUT. I have verified this by calling wkhtmltopdf from Java as well as perl

    So, for example, in Java you would have to do:

    // ProcessBuilder (available since Java 5) is generally preferred over
    // Runtime.getRuntime().exec(): it takes the arguments as a list and can redirect streams.
    ProcessBuilder pb = new ProcessBuilder("wkhtmltopdf.exe", htmlFilePath, pdfFilePath);
    Process process = pb.start();
    
    // Read the child's stderr here, not "process.getInputStream()" (which is its stdout)
    BufferedReader errStreamReader = new BufferedReader(new InputStreamReader(process.getErrorStream()));
    String line = errStreamReader.readLine();
    while (line != null)
    {
        System.out.println(line); // or whatever else
        line = errStreamReader.readLine();
    }
    process.waitFor(); // returns once wkhtmltopdf has exited
    

    On a side note, if you spawn a process from Java, you MUST read from its stdout and stderr streams (even if you do nothing with the data). Otherwise the pipe buffer between the two processes can fill up, the child blocks on its next write, and it never returns.
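    If the child's output is not needed at all, one alternative to draining the pipes is to not create them in the first place. The following is a minimal sketch, assuming Java 9 or later (for `ProcessBuilder.Redirect.DISCARD`). The class and method names are made up for illustration, and `java -version` is used as a stand-in command so the example runs without wkhtmltopdf installed.

```java
import java.io.IOException;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class DiscardOutputExample {

    // Runs the command with stdout and stderr discarded, so there is no pipe
    // buffer that could fill up. Requires Java 9+ for Redirect.DISCARD.
    static int runDiscardingOutput(List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);
        pb.redirectError(ProcessBuilder.Redirect.DISCARD);
        Process process = pb.start();
        return process.waitFor(); // blocks until the child exits, returns its exit code
    }

    public static void main(String[] args) throws Exception {
        // Stand-in command (assumption): the JVM that is running this example.
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        int exitCode = runDiscardingOutput(Arrays.asList(javaBin, "-version"));
        System.out.println("exit code: " + exitCode);
    }
}
```

    The trade-off is that any error message from the child is lost, so this only fits cases where the exit code alone is enough.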

    To futureproof your code, just in case the devs of wkhtmltopdf decide to write to stdout, you can redirect stderr of the child process to stdout and read only one stream like this:

    ProcessBuilder pb = new ProcessBuilder("wkhtmltopdf.exe", htmlFilePath, pdfFilePath); 
    pb.redirectErrorStream(true); 
    Process process = pb.start(); 
    BufferedReader inStreamReader = new BufferedReader(new  InputStreamReader(process.getInputStream())); 
    

    Actually, I do this in all the cases where I have to spawn an external process from Java. That way I don't have to read two streams.
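    For completeness, here is a sketch of the merged-stream approach carried through to the end: read until the stream closes, then call `waitFor()` for the exit code. The `MergedOutputExample` class, its `Result` holder and the `runAndCapture` method are hypothetical names, and `java -version` (which, like wkhtmltopdf, writes to stderr) stands in for the real command.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergedOutputExample {

    // Exit code plus every line the child wrote to stdout or stderr.
    static final class Result {
        final int exitCode;
        final List<String> lines;

        Result(int exitCode, List<String> lines) {
            this.exitCode = exitCode;
            this.lines = lines;
        }
    }

    static Result runAndCapture(List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout
        Process process = pb.start();

        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            // readLine() returns null once the child has closed its output
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        int exitCode = process.waitFor(); // the output is drained, so this cannot hang on a full pipe
        return new Result(exitCode, lines);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in command (assumption): "java -version" prints its banner to stderr.
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        Result result = runAndCapture(Arrays.asList(javaBin, "-version"));
        result.lines.forEach(System.out::println);
        System.out.println("exit code: " + result.exitCode);
    }
}
```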

    You should also read the streams of the spawned process in separate threads if you don't want your main thread to block, since reading from a stream is a blocking call.
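    A rough sketch of the threaded variant, keeping stdout and stderr separate: each stream gets its own reader thread, and the caller only waits for the exit code and then joins the readers. `StreamDrainExample`, `drain` and `run` are illustrative names, not part of any library, and `java -version` is again a stand-in command.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class StreamDrainExample {

    // Starts a background thread that reads 'in' to the end and appends each line to 'sink'.
    static Thread drain(InputStream in, StringBuilder sink) {
        Thread t = new Thread(() -> {
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    sink.append(line).append('\n');
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        t.setDaemon(true); // do not keep the JVM alive because of a reader thread
        t.start();
        return t;
    }

    static int run(List<String> command, StringBuilder out, StringBuilder err)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).start();
        Thread outThread = drain(process.getInputStream(), out); // child's stdout
        Thread errThread = drain(process.getErrorStream(), err); // child's stderr
        int exitCode = process.waitFor();
        // join() makes sure both buffers are complete before the caller reads them
        outThread.join();
        errThread.join();
        return exitCode;
    }

    public static void main(String[] args) throws Exception {
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        StringBuilder out = new StringBuilder();
        StringBuilder err = new StringBuilder();
        int exitCode = run(Arrays.asList(javaBin, "-version"), out, err);
        System.out.println("exit code: " + exitCode);
        System.out.print("stderr:\n" + err);
    }
}
```

    Each `StringBuilder` is written by exactly one thread and only read after `join()`, so no extra synchronization is needed here.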

    Hope this helps.

    UPDATE: I raised this issue on the project page and was told that this is by design, because wkhtmltopdf supports writing the actual PDF output to STDOUT. Please see the link for more details and Java code.

  • 2021-01-12 06:47
        final Semaphore semaphore = new Semaphore(numOfThreads);
        final String whktmlExe = tmpwhktmlExePath;
        int doccount = 0;
        try{
            File fileObject = new File(inputDir);
            for(final File f : fileObject.listFiles()) {
    
                if(f.getAbsolutePath().endsWith(".html")) {
                    doccount ++;
                    if(doccount >500 ) {
                        LOG.info(" done with conversion of 500 docs exiting ");
                        break;
                    }
                    System.out.println(" inside for before "+semaphore.availablePermits());
                    semaphore.acquire();
                    System.out.println(" inside for after "+semaphore.availablePermits() + " ---" +f.getName());
                    new java.lang.Thread() {
                        public void run() {
                            try {
                                // replaceAll takes a regex: escape the dot and anchor to the end of the name
                                String F_ =  f.getName().replaceAll("\\.html$", ".pdf") ;
                                ProcessBuilder pb = new ProcessBuilder(whktmlExe , f.getAbsolutePath(), outPutDir + F_ .replaceAll(" ", "_") );//"wkhtmltopdf.exe", htmlFilePath, pdfFilePath);
                                pb.redirectErrorStream(true);
                                Process process = pb.start();
                                BufferedReader errStreamReader = new BufferedReader(new  InputStreamReader(process.getInputStream()));  
                                String line = errStreamReader.readLine(); 
                                while(line != null) 
                                { 
                                    System.err.println(line); //or whatever else
                                    line = errStreamReader.readLine(); 
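                                }
                                int exitCode = process.waitFor(); // wait for wkhtmltopdf to exit
    
                                System.out.println("after completion for " + f.getName() + ", exit code " + exitCode);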
                            } catch (Exception e) {
                                e.printStackTrace();
                            }finally {
                                System.out.println(" in finally releasing ");
                            semaphore.release();
                            }
                      }
                    }.start();
                }
            }
        }catch (Exception ex) {
            LOG.error(" *** Error in pdf generation *** ", ex);
        }
    
        while (semaphore.availablePermits() < numOfThreads) {//till all threads finish 
            LOG.info( " Waiting for all threads to exit "+ semaphore.availablePermits() + " --- " +( numOfThreads - semaphore.availablePermits()));
            java.lang.Thread.sleep(10000);
        }
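    The same "at most N conversions at a time, then wait for all of them" pattern can also be sketched with a fixed-size thread pool, which avoids the semaphore bookkeeping and the polling loop. This is an alternative, not the code above: `BatchRunExample`, `runOne` and `runAll` are hypothetical names, and the commands are passed in as plain argument lists so the sketch does not depend on wkhtmltopdf being installed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BatchRunExample {

    // Runs one command, drains its merged output, and returns the exit code.
    static int runOne(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);
        Process process = pb.start();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            while (reader.readLine() != null) {
                // discard the output; it only needs to be consumed
            }
        }
        return process.waitFor();
    }

    // Runs all commands with at most numOfThreads child processes alive at once.
    static List<Integer> runAll(List<List<String>> commands, int numOfThreads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(numOfThreads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (List<String> command : commands) {
                futures.add(pool.submit(() -> runOne(command)));
            }
            List<Integer> exitCodes = new ArrayList<>();
            for (Future<Integer> future : futures) {
                exitCodes.add(future.get()); // blocks until that task is done
            }
            return exitCodes;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in commands (assumption): three runs of "java -version".
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        List<String> cmd = Arrays.asList(javaBin, "-version");
        System.out.println(runAll(Arrays.asList(cmd, cmd, cmd), 2));
    }
}
```

    `Future.get()` rethrows a failure from a worker as an `ExecutionException`, so a failed conversion is not silently swallowed the way a bare `printStackTrace()` in a thread would be.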
    
  • 2021-01-12 06:50

    You should read from the streams in a different thread.

  • 2021-01-12 06:51

    A process has 3 streams: input, output and error. You can read both the output and error streams at the same time using separate threads. See this question and its accepted answer, and also this one, for example.
