PySpark: filter a DataFrame by regex with string formatting?

长发绾君心 2021-02-01 06:35

I've read several posts on using the "like" operator to filter a Spark DataFrame by whether a column contains a string or expression, but was wondering if the following is a ...

3 answers
  •  抹茶落季
    2021-02-01 06:39

    I used the following regex to filter on timestamps:

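    # Timestamps of the form YYYY-MM-DD HH:MM:SS
    expression = r'[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1]) (2[0-3]|[01][0-9]):[0-5][0-9]:[0-5][0-9]'
    # Keep only the rows whose 'eta' value contains a match for the pattern
    df1 = df.filter(df['eta'].rlike(expression))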
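
Since the question asks about string formatting, here is a minimal sketch of building the same pattern with `str.format`. The helper names `build_timestamp_pattern` and `filter_by_pattern`, and the `year_digits` and `anchored` parameters, are hypothetical and not part of PySpark. The main catch is that regex quantifier braces such as `{4}` must be doubled, otherwise `str.format` reads them as placeholders. The sanity checks use Python's `re` module on the assumption that it agrees with Spark's Java regex engine for a pattern this simple.

```python
import re


def build_timestamp_pattern(year_digits=4, anchored=True):
    """Hypothetical helper: build the timestamp regex via str.format."""
    # Doubled braces ({{ and }}) come out as literal braces, so "{{{n}}}"
    # becomes "{4}" when n=4. The second literal contains no braces at all.
    body = (
        r"[0-9]{{{n}}}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1]) "
        r"(2[0-3]|[01][0-9]):[0-5][0-9]:[0-5][0-9]"
    ).format(n=year_digits)
    # rlike succeeds if the pattern is found anywhere in the value, so add
    # ^ and $ when the whole value must be a timestamp.
    return "^" + body + "$" if anchored else body


def filter_by_pattern(df, column, pattern):
    """Hypothetical helper: keep rows of a Spark DataFrame matching pattern."""
    return df.filter(df[column].rlike(pattern))


# Quick sanity checks that run without a Spark session.
pattern = build_timestamp_pattern()
assert re.search(pattern, "2021-02-01 06:35:00")
assert not re.search(pattern, "2021-13-01 06:35:00")  # month 13 is rejected
```

With a DataFrame like the one in the answer, the call would be `df1 = filter_by_pattern(df, 'eta', build_timestamp_pattern())`. Passing `anchored=False` reproduces the unanchored expression used above.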
    
