Pyspark: filter dataframe by regex with string formatting?


长发绾君心 2021-02-01 06:35

I've read several posts on using the "like" operator to filter a Spark dataframe by the condition of containing a string/expression, but was wondering if the following is a ...

3 Answers

北恋 (OP)

2021-02-01 07:01
From neeraj's hint, it seems like the correct way to do this in pyspark is:
```
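# Regex: "Arizona" followed, anywhere later in the string, by "hot"
expr = "Arizona.*hot"
# rlike keeps rows whose "keyword" value contains a match for the regex
dk = dx.filter(dx["keyword"].rlike(expr))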
```
Note that `dx.filter($"keyword" ...)` did not work for me: the `$"..."` column shorthand is Scala syntax (it comes from `spark.implicits._`) and is not valid in PySpark, so the column has to be referenced as `dx["keyword"]` instead.
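Since the title asks about string formatting, here is a minimal sketch of building the same pattern from variables. The helper names `build_pattern` and `filter_by_keywords` are made up for illustration and are not part of PySpark. It assumes the two pieces are meant as literal text, so they are passed through `re.escape`. Keep in mind that `rlike` evaluates the pattern with Java regex rules, so checking it with Python's `re` is only a rough sanity test.

```python
import re


def build_pattern(first: str, second: str) -> str:
    """Return a regex matching `first` followed, anywhere later, by `second`.

    Hypothetical helper: both arguments are treated as literal text.
    """
    # re.escape stops characters such as "." or "(" from acting as regex syntax
    return "{}.*{}".format(re.escape(first), re.escape(second))


def filter_by_keywords(df, column: str, first: str, second: str):
    """Keep rows of `df` whose `column` contains a match for the built pattern.

    Hypothetical helper: `df` is assumed to be a PySpark DataFrame.
    """
    return df.filter(df[column].rlike(build_pattern(first, second)))


# Quick local check of the pattern with Python's own regex engine
pattern = build_pattern("Arizona", "hot")
print(pattern)
print(bool(re.search(pattern, "Arizona is hot today")))
```

With the dataframe from the answer above, the call would look like `dk = filter_by_keywords(dx, "keyword", "Arizona", "hot")`.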