Split Spark DataFrame into two DataFrames (70% and 30%) based on id column while preserving order
Question

I have a Spark DataFrame that looks like this:

id    start_time    feature
1     01-01-2018    3.567
1     01-02-2018    4.454
1     01-03-2018    6.455
2     01-02-2018    343.4
2     01-08-2018    45.4
3     02-04-2018    43.56
3     02-07-2018    34.56
3     03-07-2018    23.6

I want to split this into two DataFrames based on the id column. So I should group by the id column, sort by start_time, and take 70% of the rows into one DataFrame and the remaining 30% into another, preserving the order. The result should look like:

Dataframe1: id
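
To pin down the rule being asked for, here is a minimal plain-Python sketch (not Spark code) run on the sample rows above. Two points are assumptions, because the question does not settle them: the dates are read as month-day-year, and the 70% cut point in each group is rounded down, so a group of 3 rows sends 2 to the first part and a group of 2 rows sends 1. The names `split_by_id`, `percent` and `date_format` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows copied from the question: (id, start_time, feature).
ROWS = [
    (1, "01-01-2018", 3.567),
    (1, "01-02-2018", 4.454),
    (1, "01-03-2018", 6.455),
    (2, "01-02-2018", 343.4),
    (2, "01-08-2018", 45.4),
    (3, "02-04-2018", 43.56),
    (3, "02-07-2018", 34.56),
    (3, "03-07-2018", 23.6),
]


def split_by_id(rows, percent=70, date_format="%m-%d-%Y"):
    """Split rows per id: the earliest `percent` of each group, then the rest."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)

    first, second = [], []
    for key in sorted(groups):
        # Sort on the parsed date rather than the raw string, so the order
        # is chronological. The date format is an assumption.
        ordered = sorted(
            groups[key], key=lambda r: datetime.strptime(r[1], date_format)
        )
        # Assumption: round the cut point down, using integer arithmetic
        # to avoid floating-point surprises.
        cut = len(ordered) * percent // 100
        first.extend(ordered[:cut])
        second.extend(ordered[cut:])
    return first, second


first, second = split_by_id(ROWS)
for name, part in (("first", first), ("second", second)):
    print(name, [(r[0], r[1]) for r in part])
```

With very small groups, as here, the per-group split can only approximate 70/30, so the rounding choice matters and should be stated explicitly. In Spark itself the same rule would typically be expressed with a window partitioned by `id` and ordered by a parsed `start_time`, for example by comparing `row_number` against a per-group count, or by filtering on `percent_rank`.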