I have a Spark job (on 1.4.1) receiving a stream of Kafka events. I would like to save them continuously as Parquet on Tachyon.
// zkQuorum, the consumer group, and the topic map below are placeholders
val lines = KafkaUtils.createStream(ssc, zkQuorum, "events-group", Map("events" -> 1)).map(_._2)
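For the write side, a minimal sketch of what I have in mind, assuming a SQLContext built on the streaming context's SparkContext; the tachyon:// address and the "line" column name are placeholders of mine, not confirmed details:

import org.apache.spark.sql.{SQLContext, SaveMode}

val sqlContext = new SQLContext(ssc.sparkContext)
import sqlContext.implicits._

lines.foreachRDD { rdd =>
  if (!rdd.isEmpty()) {
    // Append each micro-batch to a single Parquet dataset on Tachyon.
    rdd.toDF("line")
      .write
      .mode(SaveMode.Append)
      .parquet("tachyon://tachyon-master:19998/events.parquet")
  }
}

ssc.start()
ssc.awaitTermination()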
Setting "parquet.enable.summary-metadata" as text ("false", not the boolean false) seems to work for us.
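For reference, one place that string-valued setting can go is the Hadoop configuration on the SparkContext (a sketch; sc is assumed to be your SparkContext):

// Hadoop Configuration.set takes strings, hence "false" rather than false
sc.hadoopConfiguration.set("parquet.enable.summary-metadata", "false")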
By the way, Spark does use the _common_metadata file (we copy it over manually for repetitive jobs).
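A sketch of that manual copy using the Hadoop FileSystem API; both paths here are hypothetical:

import org.apache.hadoop.fs.{FileUtil, Path}

val src = new Path("tachyon://tachyon-master:19998/prev-run/_common_metadata")
val dst = new Path("tachyon://tachyon-master:19998/events.parquet/_common_metadata")
val fs = src.getFileSystem(sc.hadoopConfiguration)
// Copy without deleting the source so the previous run's metadata stays intact.
FileUtil.copy(fs, src, fs, dst, false, sc.hadoopConfiguration)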