Question
I'm using Cloud Dataflow to import data from Pub/Sub messages into BigQuery tables. I'm using DynamicDestinations, since these messages can be routed to different tables.
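For context, a simplified sketch of this kind of DynamicDestinations write (the routing field "event_type", the "my-project:my_dataset" table names, and the schemaFor/writeToBigQuery helpers are placeholders, not my actual code):

    import com.google.api.services.bigquery.model.TableRow;
    import com.google.api.services.bigquery.model.TableSchema;
    import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
    import org.apache.beam.sdk.io.gcp.bigquery.DynamicDestinations;
    import org.apache.beam.sdk.io.gcp.bigquery.TableDestination;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.beam.sdk.values.ValueInSingleWindow;

    // rows: the PCollection decoded from the Pub/Sub messages upstream.
    static void writeToBigQuery(PCollection<TableRow> rows) {
      rows.apply("Write Avros to BigQuery Table",
          BigQueryIO.writeTableRows()
              .to(new DynamicDestinations<TableRow, String>() {
                @Override
                public String getDestination(ValueInSingleWindow<TableRow> element) {
                  // Route each message on a payload field ("event_type" is a placeholder).
                  return (String) element.getValue().get("event_type");
                }

                @Override
                public TableDestination getTable(String destination) {
                  // One table per destination key; project/dataset names are placeholders.
                  return new TableDestination("my-project:my_dataset." + destination, null);
                }

                @Override
                public TableSchema getSchema(String destination) {
                  return schemaFor(destination); // hypothetical helper returning the per-table schema
                }
              })
              .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
              .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));
    }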
I've recently noticed that the process started consuming all resources, and messages stating that the process is stuck started showing up:
Processing stuck in step Write Avros to BigQuery Table/StreamingInserts/StreamingWriteTables/StreamingWrite for at least 26h45m00s without outputting or completing in state finish
  at sun.misc.Unsafe.park(Native Method)
  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
  at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
  at java.util.concurrent.FutureTask.get(FutureTask.java:191)
  at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:765)
  at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:829)
  at org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.flushRows(StreamingWriteFn.java:131)
  at org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.finishBundle(StreamingWriteFn.java:103)
  at org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn$DoFnInvoker.invokeFinishBundle(Unknown Source)
Currently, simply cancelling the pipeline and restarting it temporarily solves the problem, but I can't pinpoint the reason the process is getting stuck.
The pipeline is using beam-runners-google-cloud-dataflow-java version 2.8.0 and google-cloud-bigquery version 1.56.0.
Answer 1:
This log message may look scary, but on its own it is not indicative of a problem. What it is trying to convey is that your pipeline has been performing the same operation for a while.
This is not necessarily a problem: your writes may simply be large enough that they take a while to complete. If you've arrived at this question because you're seeing these messages, consider what kind of pipeline you've got and whether it's plausible that some of its steps are just slow.
In your case, however, the pipeline has been stuck on a write for 26 HOURS, so this is certainly a problem. I believe it is related to a deadlock introduced by a library dependency in older versions of Beam. It should not occur in more recent ones (e.g. 2.15.0).
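If you're on Maven, the upgrade is just a matter of bumping the Beam artifacts already in your pom; 2.15.0 below simply mirrors the version mentioned above, and any other Beam modules you depend on (e.g. the GCP IO module that provides BigQueryIO) should be kept on the same version:

    <!-- Upgrade the Dataflow runner; keep all other Beam modules on the same version. -->
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
      <version>2.15.0</version>
    </dependency>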
Answer 2:
Maybe I'm late to the party, but it might help someone. I also faced similar errors, and that was on Beam version 2.22. It turned out that wasn't actually the problem: before the exceptions were thrown, there were errors that were passed over silently at INFO level:
BigQuery insertAll error, retrying, Not found: Dataset <projectname>:<datasetname>
With that error, the pipeline goes on running for days.
Once I fixed the error above, things started working fine. So you might have other errors creeping in silently as well.
True Story!
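As a side note, one way to catch this class of failure before it silently stalls the pipeline is to verify that the target dataset exists up front, e.g. with the google-cloud-bigquery client already mentioned in the question (the project and dataset names below are placeholders):

    import com.google.cloud.bigquery.BigQuery;
    import com.google.cloud.bigquery.BigQueryOptions;
    import com.google.cloud.bigquery.Dataset;
    import com.google.cloud.bigquery.DatasetId;

    public class VerifyDataset {
      public static void main(String[] args) {
        // Client with application-default credentials.
        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

        // Placeholder project/dataset; substitute the ones the pipeline writes to.
        DatasetId datasetId = DatasetId.of("my-project", "my_dataset");

        // getDataset returns null when the dataset does not exist.
        Dataset dataset = bigquery.getDataset(datasetId);
        if (dataset == null) {
          // Streaming inserts into a missing dataset only surface as retried
          // "Not found: Dataset ..." INFO messages, so fail fast here instead.
          throw new IllegalStateException("Dataset not found: " + datasetId);
        }
      }
    }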
Source: https://stackoverflow.com/questions/54716332/processing-stuck-when-writing-to-bigquery