How to implement Like-condition in SparkSQL?

时光说笑 2021-01-11 17:36

How do I write a SQL statement to achieve something like the following:

SELECT * FROM table t WHERE t.a LIKE '%'||t.b||'%'
1 Answer
  • 2021-01-11 18:25

    spark.sql.Column provides a like method, but as of Spark 1.6.0 / 2.0.0 it works only with string literals, not with another column. You can still use raw SQL:

    import org.apache.spark.sql.hive.HiveContext
    val sqlContext = new HiveContext(sc) // Make sure you use HiveContext
    import sqlContext.implicits._ // Optional, just to be able to use toDF
    
    val df = Seq(("foo", "bar"), ("foobar", "foo"), ("foobar", "bar")).toDF("a", "b")
    
    df.registerTempTable("df")
    sqlContext.sql("SELECT * FROM df WHERE a LIKE CONCAT('%', b, '%')").show()
    
    // +------+---+
    // |     a|  b|
    // +------+---+
    // |foobar|foo|
    // |foobar|bar|
    // +------+---+
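
    For contrast, the literal-only form that the Column API does accept looks like this (a minimal sketch; the %bar% pattern is just an illustration, not part of the original question):

    // like takes a SQL pattern as a plain string literal only
    df.where($"a".like("%bar%"))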
    

    or expr / selectExpr (note that this selectExpr projects a boolean column rather than filtering rows):

    df.selectExpr("a like CONCAT('%', b, '%')")
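
    To filter with the same SQL fragment instead, it can be wrapped in expr and passed to where (a small sketch under the same setup):

    import org.apache.spark.sql.functions.expr

    // expr parses the SQL fragment into a Column usable as a filter predicate
    df.where(expr("a LIKE CONCAT('%', b, '%')"))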
    

    In Spark 1.5 it will require HiveContext. If for some reason a Hive context is not an option, you can use a custom UDF:

    import org.apache.spark.sql.functions.udf

    // Plain substring containment, the equivalent of LIKE '%' || b || '%'
    val simple_like = udf((s: String, p: String) => s.contains(p))
    df.where(simple_like($"a", $"b"))

    // Interprets column b as a regular expression rather than a LIKE pattern
    val regex_like = udf((s: String, p: String) =>
      new scala.util.matching.Regex(p).findFirstIn(s).nonEmpty)
    df.where(regex_like($"a", $"b"))
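
    If the UDF should also be usable from raw SQL, it can be registered on the context first (a sketch; the name simple_like is arbitrary):

    // Register the function so SQL statements can reference it by name
    sqlContext.udf.register("simple_like", (s: String, p: String) => s.contains(p))
    sqlContext.sql("SELECT * FROM df WHERE simple_like(a, b)")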
    