PySpark print to console

二次信任 submitted on 2019-12-02 01:26:14

Printing or logging inside a transform ends up in the Spark executor logs rather than on the driver's console. You can reach those logs through your application's AppMaster or the HistoryServer, via the YARN ResourceManager Web UI.

Alternatively, you could return the information you want to print alongside your output (e.g. in a dict or tuple) and inspect it once the results are back on the driver. You could also stash it away in an accumulator and then print it from the driver.

If you are doing a lot of print-statement debugging, you might find it faster to SSH into your master node and use the pyspark REPL or IPython to experiment with your code. This would also allow you to use the --master local flag, which runs everything on that one machine so your print statements appear in stdout.
