Check if two pyspark Rows are equal

Posted by 你。 on 2019-12-11 15:46:17

Question


I am writing unit tests for a Spark job, and some of the outputs are named tuples: pyspark.sql.Row

How can I assert their equality?

from pyspark.sql import Row

actual = get_data(df)  # returns a pyspark.sql.Row
expected = Row(total=4, unique_ids=2)
self.assertEqual(actual, expected)

When I do this, the values are rearranged in an order I cannot determine.


Answer 1:


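Your code should work as written because, according to the docs, when a Row is created with keyword arguments:

the fields will be sorted by names.

Note that this holds for Spark versions before 3.0. From Spark 3.0 onward, fields created from keyword arguments keep the order in which they were entered.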

Nevertheless, another way is to use the asDict() method of pyspark.sql.Row and compare the results as dictionaries, since dictionary equality does not depend on field order:

actual = get_data(df)
expected = Row(total=4, unique_ids=2)
# Dictionaries compare by key, so the field order of each Row does not matter
self.assertEqual(actual.asDict(), expected.asDict())


Source: https://stackoverflow.com/questions/49519475/check-if-two-pyspark-rows-are-equal
