py4j.protocol.Py4JJavaError: An error occurred while calling o788.save. : com.mongodb.MongoTimeoutException, WritableServerSelector

Submitted by 偶尔善良 on 2020-01-25 08:59:26

Question


Pyspark version: 2.4.4
MongoDB version: 4.2.0
RAM: 64 GB
CPU cores: 32
Running script: spark-submit --executor-memory 8G --driver-memory 8G --packages org.mongodb.spark:mongo-spark-connector_2.11:2.3.1 demographic.py

When I run the code, I get this error:

"py4j.protocol.Py4JJavaError: An error occurred while calling o764.save. : com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches WritableServerSelector. Client view of cluster state is {type=REPLICA_SET, servers=[{address=172...*:27017, type=REPLICA_SET_SECONDARY, roundTripTime=34.3 ms, state=CONNECTED}]"

I am trying to read a MongoDB collection from a replica server that requires authentication, and I can read from that server using:

df_ipapp = spark.read.format('com.mongodb.spark.sql.DefaultSource').option('uri', '{}/{}.IpAppointment?authSource={}'.format(mongo_url, mongo_db,auth_source)).load()

This works fine. But after processing the data frame, I write it to another MongoDB instance, which has no authentication and is located on the machine where I do the processing, using:

df.write.format('com.mongodb.spark.sql.DefaultSource').mode('overwrite').option('uri', '{}/{}.demographic'.format(mongo_final_url, mongo_final_db)).save()
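For context, both URIs above are built by plain string formatting. A minimal, self-contained sketch of that construction (the host names, credentials, database names, and authSource below are hypothetical placeholders, not the asker's real values):

```python
# Hypothetical connection parameters -- substitute your own values.
mongo_url = "mongodb://user:password@172.0.0.1:27017"  # authenticated replica member (read side)
mongo_db = "hmis"
auth_source = "admin"
mongo_final_url = "mongodb://localhost:27017"          # local, unauthenticated (write side)
mongo_final_db = "analytics"

# Read URI, as passed to spark.read ... .option('uri', ...)
read_uri = '{}/{}.IpAppointment?authSource={}'.format(mongo_url, mongo_db, auth_source)

# Write URI, as passed to df.write ... .option('uri', ...)
write_uri = '{}/{}.demographic'.format(mongo_final_url, mongo_final_db)

print(read_uri)
print(write_uri)
```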

Every time, the write fails with this error:

  File "/home/svr_data_analytic/hmis-analytics-data-processing/src/main/python/sales/demographic.py", line 297, in save_n_rename
    .option('uri', '{}/{}.demographic'.format(mongo_url, mongo_final_db)).save()
  File "/home/svr_data_analytic/spark/spark-2.4.4-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 736, in save
  File "/home/svr_data_analytic/spark/spark-2.4.4-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/home/svr_data_analytic/spark/spark-2.4.4-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/home/svr_data_analytic/spark/spark-2.4.4-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o788.save.
: com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches WritableServerSelector. Client view of cluster state is {type=REPLICA_SET, servers=[{address=172.*.*.*:27017, type=REPLICA_SET_SECONDARY, roundTripTime=0.8 ms, state=CONNECTED}]

Reading from the replica server:

df_bills = spark.read.format('com.mongodb.spark.sql.DefaultSource').option('uri', '{}/{}.Bills?authSource={}'.format(mongo_url, mongo_db, auth_source)).load()

Writing to the local MongoDB:

df.write.format('com.mongodb.spark.sql.DefaultSource').mode('overwrite').option('uri', '{}/{}.demographic'.format(mongo_final_url, mongo_final_db)).save()
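The timeout message itself is telling: the driver's view of the write target contains only a REPLICA_SET_SECONDARY, and WritableServerSelector needs a primary, so no writable member can ever be selected before the 30000 ms timeout. One commonly suggested adjustment is to name the replica set in the connection URI so the driver performs replica-set discovery and finds the primary. This is a sketch, assuming the write target really is a replica set whose name you know; `rs0` and the host/database names are hypothetical placeholders:

```python
# Hypothetical values -- replace with your own.
mongo_final_url = "mongodb://localhost:27017"
mongo_final_db = "analytics"
replica_set = "rs0"  # hypothetical replica-set name; check with rs.status() in the mongo shell

# Same write URI as above, plus the replicaSet option so the driver
# discovers the other members (including the primary) instead of
# timing out against a lone SECONDARY.
write_uri = '{}/{}.demographic?replicaSet={}'.format(
    mongo_final_url, mongo_final_db, replica_set)

print(write_uri)
```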

I want to read from a replica-set MongoDB server that requires authentication, process the data frame, and write the result to the local MongoDB. Thanks in advance.

Source: https://stackoverflow.com/questions/58624903/py4j-protocol-py4jjavaerror-an-error-occurred-while-calling-o788-save-com-mo
