I am trying to write a subquery in a WHERE clause, as shown below, but I am getting "Correlated column is not allowed in a non-equality predicate:"
SELECT *, holidays
FROM (
    SELECT *, s.holidays, s.entity
    FROM transit_t tt
    WHERE (
        SELECT Count(thedate) AS holidays
        FROM fact_ent_rt
        WHERE entity = tt.awborigin
          AND (Substring(thedate,1,10)) BETWEEN (Substring(awbpickupdate,1,10)) AND (Substring(deliverydate,1,10))
          AND (nholidayflag = true OR weekendflag = true)
    )
) s
Is there anything wrong with this query? I thought Spark versions above 2.0 supported subqueries in the WHERE clause. Any suggestions would be appreciated. Thanks.
The input is the pickup date and delivery date from the transit table. We need to find whether any weekends fall between these dates (this data is available in fact_ent_rt) and take the count of the number of holidays.
The output I am getting is pyspark.sql.utils.AnalysisException: u"Correlated column is not allowed in a non-equality predicate:\nAggregate
Sample inputs:
Input 1:
+---------+-------------------+-------------------+
|AWBOrigin|      AWBPickupDate|       DeliveryDate|
+---------+-------------------+-------------------+
|      LON|2018-09-01 08:52:00|2018-09-12 13:57:00|
|      DHA|2018-09-04 11:47:00|2018-09-08 07:30:00|
|      NIC|2009-01-01 01:47:00|2009-01-09 11:37:00|
+---------+-------------------+-------------------+
Input 2 (fact_ent)
+------+-------------------+-----------+------------+
|Entity|            TheDate|WeekendFlag|NHolidayFlag|
+------+-------------------+-----------+------------+
|   NIC|2009-01-01 00:00:00|      False|       False|
|   NIC|2009-01-02 00:00:00|      False|       False|
|   NIC|2009-01-03 00:00:00|       True|       False|
|   NIC|2009-01-04 00:00:00|       True|       False|
|   NIC|2009-01-05 00:00:00|      False|       False|
|   NIC|2009-01-06 00:00:00|      False|       False|
|   NIC|2009-01-07 00:00:00|      False|       False|
|   NIC|2009-01-08 00:00:00|      False|       False|
|   NIC|2009-01-09 00:00:00|      False|       False|
|   NIC|2009-01-10 00:00:00|       True|       False|
|   NIC|2009-01-11 00:00:00|       True|       False|
|   NIC|2009-01-12 00:00:00|      False|       False|
|   NIC|2009-01-13 00:00:00|      False|       False|
|   NIC|2009-01-14 00:00:00|      False|       False|
|   NIC|2009-01-15 00:00:00|      False|       False|
|   NIC|2009-01-16 00:00:00|      False|       False|
|   NIC|2009-01-17 00:00:00|       True|       False|
|   NIC|2009-01-18 00:00:00|       True|       False|
|   NIC|2009-01-19 00:00:00|      False|       False|
|   NIC|2009-01-20 00:00:00|      False|       False|
+------+-------------------+-----------+------------+
Expected output:
+---------+-------------------+-------------------+--------+
|AWBOrigin|      AWBPickupDate|       DeliveryDate|Holidays|
+---------+-------------------+-------------------+--------+
|      LON|2018-09-01 08:52:00|2018-09-12 13:57:00|      NA|
|      DHA|2018-09-04 11:47:00|2018-09-08 07:30:00|      NA|
|      NIC|2009-01-01 01:47:00|2009-01-09 11:37:00|       2|
+---------+-------------------+-------------------+--------+
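
For reference, the requirement described above (count the flagged dates between pickup and delivery for each origin) can also be written as a LEFT JOIN plus aggregation instead of a correlated subquery. The sketch below is only an illustration under assumptions: it runs the SQL against Python's built-in sqlite3 rather than Spark, stores the boolean flags as 0/1 integers, and uses substr in place of Substring. The same join shape should carry over to Spark SQL, but that has not been verified here.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for the Spark tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transit_t (awborigin TEXT, awbpickupdate TEXT, deliverydate TEXT)"
)
conn.execute(
    "CREATE TABLE fact_ent_rt "
    "(entity TEXT, thedate TEXT, weekendflag INTEGER, nholidayflag INTEGER)"
)

# Sample rows from Input 1.
conn.executemany(
    "INSERT INTO transit_t VALUES (?, ?, ?)",
    [
        ("LON", "2018-09-01 08:52:00", "2018-09-12 13:57:00"),
        ("DHA", "2018-09-04 11:47:00", "2018-09-08 07:30:00"),
        ("NIC", "2009-01-01 01:47:00", "2009-01-09 11:37:00"),
    ],
)

# Sample rows from Input 2: NIC, 2009-01-01 .. 2009-01-20.
# Flags are stored as 0/1 (assumption: SQLite has no boolean type).
weekend_days = {3, 4, 10, 11, 17, 18}
conn.executemany(
    "INSERT INTO fact_ent_rt VALUES (?, ?, ?, ?)",
    [
        ("NIC", f"2009-01-{day:02d} 00:00:00", int(day in weekend_days), 0)
        for day in range(1, 21)
    ],
)

# The correlation moves into the join condition. The CASE has no ELSE, so an
# origin with no calendar rows at all sums only NULLs and stays NULL ("NA").
QUERY = """
SELECT tt.awborigin,
       tt.awbpickupdate,
       tt.deliverydate,
       SUM(CASE WHEN f.nholidayflag OR f.weekendflag THEN 1
                WHEN f.entity IS NOT NULL THEN 0
           END) AS holidays
FROM transit_t tt
LEFT JOIN fact_ent_rt f
       ON f.entity = tt.awborigin
      AND substr(f.thedate, 1, 10)
          BETWEEN substr(tt.awbpickupdate, 1, 10)
              AND substr(tt.deliverydate, 1, 10)
GROUP BY tt.awborigin, tt.awbpickupdate, tt.deliverydate
ORDER BY tt.awborigin
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

On the sample data this gives 2 for NIC and NULL for LON and DHA, which matches the expected output if NA is read as NULL. An origin that has calendar rows in range but no flagged dates would get 0 rather than NULL.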