问题
I have data from the USGS Nation Water Data website. I am currently trying to plot and fit curves to the data to use in prediction for different measurements taken within the dataset (dissolved oxygen, pH, gage height and temp) all in relation to discharge rate. I used the "nls" command and I am using a book of equations to find which curve to use...for this example I have specifically used Schumacher’s equation (p.48 in the book).
Find the link to data:
curve book: http://www.for.gov.bc.ca/hfd/pubs/docs/bio/bio04.htm
data I used: http://waterdata.usgs.gov/mi/nwis/uv?referred_module=qw&search_station_nm=River%20Rouge%20at%20Detroit%20MI&search_station_nm_match_type=anywhere&index_pmcode_00065=1&index_pmcode_00060=1&index_pmcode_00300=1&index_pmcode_00400=1&index_pmcode_00095=1&index_pmcode_00010=1&group_key=NONE&sitefile_output_format=html_table&column_name=agency_cd&column_name=site_no&column_name=station_nm&range_selection=date_range&begin_date=2013-11-18&end_date=2013-12-18&format=html_table&date_format=YYYY-MM-DD&rdb_compression=file&list_of_search_criteria=search_station_nm,realtime_parameter_selection
My problem is that I CANNOT get nls to predict new values once I picked a curve coded it... I also can't quite figure out how to plot it...I'm guessing this can be with the residuals? In the code I have used "aggregate" to extract means of the listed measurements and the corresponding discharge rates, now I just need to get R to predict for me. I got as far as getting what I think are fitted values... but I'm not sure and I hit a wall with "?nls."
##Create new dataframes with means given date for each constituent
ph <- aggregate(Discharge~pH, data=River.Data, mean)
##pH models
pH <- ph$pH
disch <- ph$Discharge
phm <- nls(disch~exp(a+(b/pH)), data=ph, trace=T, start=list(a=-47.06 ,b=400.2))
newph<- data.frame(ph=c(3.0,4.0,5.0,6.0,7.0,8.0,9.0))
predict(phm, newdata=newph)
回答1:
Seems like you've already got your answer(??), but:
ph <- aggregate(Discharge~pH, data=River.Data, mean)
phm <- nls(Discharge~exp(a+(b/pH)), data=ph, trace=T, start=list(a=-47.06 ,b=400.2))
newph <- data.frame(pH=seq(3,9,by=0.1))
Discharge.pred <- predict(phm, newdata=newph)
plot(ph$pH, ph$Discharge, xlim=c(3,9), ylim=c(0,1000))
par(new=t)
plot(newph$pH,Discharge.pred, xlab="", ylab="", axes=F, xlim=c(3,9), ylim=c(0,1000), type="l")
The problem is that your data is for pH in [7.5,8.2] but you are trying to predict in [3,9]. The model you've chosen is not stable for pH that far outside the range.
回答2:
Jarrod, try this. Cheers, Robert.
#Try this
#pH <- ph$pH # you don't need this
#disch <- ph$Discharge # you don't need this
phm <- nls(Discharge~exp(a+(b/pH)), data=ph, trace=T, start=list(a=-47.06 ,b=400.2))
newph<- data.frame(pH=seq(3,9,0.1)) # it'll be smoother with a sequence in increments of 0.1
plot(newph,predict(phm, data=newph,type="l"))
回答3:
This answer is based on @Carl Witthoft's comment. I'm highlighting this in a separate answer, because I skimmed over the comments and didn't really take in what Carl was proposing. Hoping it will help others who also find themselves here.
From the ?predict.nls
file, the new data needs to be a "named list or dataframe". Which is where I have fallen down in the past; the list or dataframe column needs to have a name. And as Carl states, this needs to be exactly the same name as that used in the model. I think the OP's issue may be because they used pH
in the model, but ph
in the creation of new data.
来源:https://stackoverflow.com/questions/20721030/using-predict-in-nls