Question
My Beam Dataflow pipeline tries to read data from GCS and write it to Pub/Sub.
However, the pipeline hangs with the following error:
{
job: "2019-11-04_03_53_38-5223486841492484115"
logger: "org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer"
message: "20 streaming Windmill RPC errors for a stream, last was: org.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException: ABORTED: The operation was aborted. with status Status{code=ABORTED, description=The operation was aborted., cause=null}"
thread: "36"
worker: "gcs-to-pubsub-job14-11040353-a72j-harness-xrg3"
}
What causes this error, and how can I fix it?
The firewall rule is configured as:
gcloud compute firewall-rules create data-flow-test-firewall \
--network dataflow-test \
--action allow \
--direction ingress \
--target-tags dataflow \
--source-tags dataflow \
--priority 0 \
--rules tcp:12345-12346
and the Dataflow launch parameters are:
-Dexec.mainClass=com.beam.test.beamPubSubV2 -Dexec.args="--project=pid
--runner=DataflowRunner --stagingLocation=gs://bucket/stage/
--tempLocation=gs://bucket/temp/ --jobName=gcs-to-pubsub-job14
--network=dataflow-test --enableStreamingEngine --maxNumWorkers=15
--autoscalingAlgorithm=THROUGHPUT_BASED" -Pdataflow-runner
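For context, a minimal sketch of what a GCS-to-Pub/Sub streaming pipeline like the one described might look like. This is not the asker's actual beamPubSubV2 class; the class name, bucket path, and topic name below are placeholders.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.options.StreamingOptions;

public class GcsToPubSubSketch {
  public static void main(String[] args) {
    // Parse --project, --runner, --network, etc. from the command line.
    StreamingOptions options = PipelineOptionsFactory.fromArgs(args)
        .withValidation()
        .as(StreamingOptions.class);
    options.setStreaming(true); // run as a streaming job

    Pipeline p = Pipeline.create(options);
    p.apply("ReadFromGcs", TextIO.read().from("gs://bucket/input/*.txt"))
     .apply("WriteToPubSub",
            PubsubIO.writeStrings().to("projects/pid/topics/my-topic"));
    p.run();
  }
}
```

With Streaming Engine enabled (--enableStreamingEngine), streaming state is handled by the Windmill service rather than on the worker VMs, which is where the GrpcWindmillServer RPC errors above originate.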
Beam version: 2.16.0
Source: https://stackoverflow.com/questions/58693680/dataflow-streaming-windmill-rpc-errors-for-a-stream