Question
I am writing a script that accepts either plain-text or Parquet files. If the input is a Parquet file, the script reads it into a DataFrame and does a few other things. On the cluster I am working on, the easiest first solution was to check whether the file's extension is .parquet:
if (parquetD(1) == "parquet") {
  if (args.length != 2) {
    println(usage2)
    // Print the arguments before exiting; after System.exit this line would never run
    println(args.mkString(" "))
    System.exit(1)
  }
}
In that case it would read the file in with the DataFrame. The problem is that I have a bunch of files that other people created with no extension. The first solution that comes to mind, which is not ideal, is to read the first few bytes of the file and check for the Parquet magic string PAR1.
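For reference, the magic-byte check could look roughly like the sketch below. It assumes the file is on the local filesystem, and the names ParquetDetect and looksLikeParquet are made up for illustration. It relies only on the fact that a Parquet file starts with the 4-byte ASCII marker PAR1.

```scala
import java.io.{DataInputStream, EOFException, FileInputStream}
import java.nio.charset.StandardCharsets
import java.util.Arrays

// Hypothetical helper, not part of Spark or Parquet.
object ParquetDetect {
  // Parquet files begin (and end) with the 4-byte ASCII magic "PAR1".
  private val Magic: Array[Byte] = "PAR1".getBytes(StandardCharsets.US_ASCII)

  /** Returns true if the local file at `path` starts with the Parquet magic bytes. */
  def looksLikeParquet(path: String): Boolean = {
    val in = new DataInputStream(new FileInputStream(path))
    try {
      val header = new Array[Byte](Magic.length)
      in.readFully(header) // throws EOFException if the file has fewer than 4 bytes
      Arrays.equals(header, Magic)
    } catch {
      case _: EOFException => false // too short to be a Parquet file
    } finally {
      in.close()
    }
  }
}
```

With this, the extension test could become something like if (ParquetDetect.looksLikeParquet(args(0))) { ... }. For files on HDFS the same comparison should work on a stream opened through the Hadoop FileSystem API instead of FileInputStream, though I have not tried that here.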
Are there any other solutions?
Source: https://stackoverflow.com/questions/32569678/how-to-detect-parquet-files