Covid Death Predictions gone wrong [closed]

邮差的信 submitted on 2020-11-30 02:01:04

Question


I'm trying to write code that predicts fatalities in Toronto due to Covid-19, with no luck. I'm sure there is an easy fix that I'm overlooking, but I'm too new to Spark to know what it is. Does anyone have any insight on making this code runnable?

The data set is here: https://open.toronto.ca/dataset/covid-19-cases-in-toronto/

Here is my code:

// Set the Environment - Spark shell

spark-shell --master yarn --jars commons-csv-1.5.jar,spark-csv_2.10-1.5.0.jar

//-- Just a bunch of import statements

import org.apache.spark.sql.functions._
import org.apache.spark.ml.feature.{VectorAssembler}
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.regression.{LinearRegression}
import org.apache.spark.ml.tuning.{CrossValidator, CrossValidatorModel, ParamGridBuilder}
import org.apache.spark.ml.evaluation.{RegressionEvaluator}
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.sql.types.{DoubleType}


// SQLContext to deal with CSV files in Spark 1.6 and lower. If you ever end up working in
// Spark 2.0 and above, the commands to load a CSV will be slightly different

import org.apache.spark.sql.SQLContext
val sqlContext = new SQLContext(sc)

//Load the Data - The Applied options are for CSV files

df = spark.read.format("csv") 
.option("inferSchema","true") 
.option("header","true") 
.option("sep",",") 
.load(FILE LOCATION)

// Load training data

val training = spark.read.format("libsvm") .load("FILE LOCATION")

val lr = new LinearRegression() .setMaxIter(10) .setRegParam(0.3) .setElasticNetParam(0.8)

// Fit the model
 val lrModel = lr.fit(training)

//Next we setup our cross validator

val cross_validator = new CrossValidator() .setEstimator(pipeline) .setEvaluator(evaluator) .setEstimatorParamMaps(new ParamGridBuilder().build) .setNumFolds(3)

// Next we call fit on the cross validator passing our training dataset

val cvModel = cross_validator.fit(trainingData)

val predictions = cvModel.transform(testData)


// Print the coefficients and intercept for linear regression

println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")


// Summarize the model over the training set and print out some metrics

val trainingSummary = lrModel.summary
println(s"numIterations: ${trainingSummary.totalIterations}")
println(s"objectiveHistory: [${trainingSummary.objectiveHistory.mkString(",")}]")
trainingSummary.residuals.show()
println(s"RMSE: ${trainingSummary.rootMeanSquaredError}")
println(s"r2: ${trainingSummary.r2}")

val r2 = evaluator.evaluate(predictions)

println("r-squared on test data = " + r2)
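
As posted, the code refers to `pipeline`, `evaluator`, `trainingData` and `testData` without ever defining them, assigns `df` without `val`, and loads the training set as "libsvm" although the Toronto data set is a CSV. Below is a minimal sketch of how those pieces might be wired together in a Spark 2.x or later spark-shell (paste it with `:paste`). It is not a confirmed fix. The file path, the column names "Reported Date" and "Outcome", the value "FATAL", and the choice of features (a day index and the daily case count) are all assumptions made for illustration and should be checked against the downloaded file.

```scala
import org.apache.spark.ml.{Pipeline, PipelineModel}
import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.regression.{LinearRegression, LinearRegressionModel}
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
import org.apache.spark.sql.functions._

// Hypothetical path: replace with the location of the downloaded CSV.
val csvPath = "COVID19_cases.csv"

// Spark 2.x+ reads CSV natively, so no extra --jars are needed.
val df = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv(csvPath)

// Assumed columns: "Reported Date" (a date) and "Outcome" (with value "FATAL").
// Aggregate to one row per day: total cases and number of fatal cases (the label).
val daily = df
  .withColumn("date", to_date(col("Reported Date")))
  .filter(col("date").isNotNull)
  .groupBy("date")
  .agg(
    count(lit(1)).cast("double").as("cases"),
    sum(when(col("Outcome") === "FATAL", 1).otherwise(0)).cast("double").as("label"))
  .withColumn("dayIndex",
    datediff(col("date"), to_date(lit("2020-01-01"))).cast("double"))

// LinearRegression expects a single vector column named "features".
val assembler = new VectorAssembler()
  .setInputCols(Array("dayIndex", "cases"))
  .setOutputCol("features")

val lr = new LinearRegression()
  .setMaxIter(10)
  .setRegParam(0.3)
  .setElasticNetParam(0.8)

// These are the names the cross validator in the question relies on.
val pipeline = new Pipeline().setStages(Array(assembler, lr))

val evaluator = new RegressionEvaluator()
  .setLabelCol("label")
  .setPredictionCol("prediction")
  .setMetricName("r2")

val Array(trainingData, testData) = daily.randomSplit(Array(0.8, 0.2), seed = 42L)

val crossValidator = new CrossValidator()
  .setEstimator(pipeline)
  .setEvaluator(evaluator)
  .setEstimatorParamMaps(new ParamGridBuilder().build())
  .setNumFolds(3)

val cvModel = crossValidator.fit(trainingData)
val predictions = cvModel.transform(testData)

// The fitted regression is the last stage of the best pipeline.
val lrModel = cvModel.bestModel.asInstanceOf[PipelineModel]
  .stages.last.asInstanceOf[LinearRegressionModel]

println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")
println(s"r-squared on test data = ${evaluator.evaluate(predictions)}")
```

With an empty parameter grid the cross validator only estimates how well the single configuration generalises. Adding values with `ParamGridBuilder.addGrid` (for example on `lr.regParam`) would turn it into an actual hyperparameter search.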

Source: https://stackoverflow.com/questions/65013233/covid-death-predictions-gone-wrong
