I have a DataFrame df that has the following structure:

+-----+-----+-----+-------+
|  s  |col_1|col_2|col_...|
+-----+-----+-----+-------+
| f1  | 0.0 | 0.6 |  ...  |
| f2  | 0.6 | 0.7 |  ...  |
+-----+-----+-----+-------+

How can I transpose it, so that the rows become columns?
If the data is small enough to be transposed (as opposed to pivoted with aggregation), you can just convert it to a Pandas DataFrame:
df = sc.parallelize([
    ("f1", 0.0, 0.6, 0.5),
    ("f2", 0.6, 0.7, 0.9)
]).toDF(["s", "col_1", "col_2", "col_3"])

# Collect to the driver as a Pandas DataFrame and transpose locally
df.toPandas().set_index("s").transpose()
s f1 f2
col_1 0.0 0.6
col_2 0.6 0.7
col_3 0.5 0.9
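
If you need the transposed result back as a Spark DataFrame, you can feed the Pandas frame into createDataFrame. A minimal sketch, assuming an active SparkSession named spark:

pdf = df.toPandas().set_index("s").transpose()

# reset_index() turns the col_1..col_3 index into a regular column
# (named "index" by default) so Spark can ingest it
transposed = spark.createDataFrame(pdf.reset_index())
transposed.show()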
If it is too large for this, Spark won't help. A Spark DataFrame distributes data by row (although it uses columnar storage locally), so the size of an individual row is limited to local memory.
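
When a true transpose is out of reach, one common workaround is to keep the data in a long (s, column, value) format, so that no single row ever has to hold a whole transposed column. A sketch of that idea, hard-coding the three column names from the example above:

from pyspark.sql.functions import expr

# Unpivot the wide frame: each (s, column) pair becomes its own row
long_df = df.select(
    "s",
    expr("stack(3, 'col_1', col_1, 'col_2', col_2, 'col_3', col_3)")
        .alias("column", "value")
)
long_df.show()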