I have trained a model using sklearn and exported it into a pmml format using sklearn2pmml. Is there a way to convert that pmml file back into something that can be imported and
You don't need to test the correctness of sklearn2pmml generated models. It's based on the JPMML-SkLearn library, which has full coverage with integration tests - Scikit-Learn predictions and PMML predictions are provably identical.
Your real issue is that you want to apply models outside of their intended "applicability domain". It's a bad idea, because a model's behaviour is not specified in that case: garbage input, garbage predictions.
However, if you insist that you must be able to feed garbage to your models in a production environment, then simply disable PMML value bounds checking. There are several ways to accomplish this:
1. Remove the Value and Interval child elements from the /PMML/DataDictionary/DataField elements.
2. Update the Value and Interval child elements so that those previously unseen values would be recognized as valid values. For example, you can define the margins of the Interval element to include all values [-Inf, +Inf]. See the explanation of Value and Interval elements in the PMML specification for correct syntax.
3. Change the invalidValueTreatment attribute value of all /PMML/<Model>/MiningSchema/MiningField elements from "returnInvalid" to "asIs". If this attribute is missing, then it defaults to "returnInvalid", so you'd need to insert invalidValueTreatment="asIs" there.

I would recommend option #3. You can automate the process using the JPMML-Model library:
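import org.dmg.pmml.InvalidValueTreatmentMethod;
import org.dmg.pmml.MiningField;
import org.dmg.pmml.VisitorAction;

// loadFromFile and saveToFile are placeholders for your own PMML I/O code
org.dmg.pmml.PMML pmml = loadFromFile(..);
org.dmg.pmml.Visitor mfUpdater = new org.jpmml.model.visitors.AbstractVisitor(){
    @Override
    public VisitorAction visit(MiningField miningField){
        // Pass out-of-range values through instead of rejecting them
        miningField.setInvalidValueTreatment(InvalidValueTreatmentMethod.AS_IS);
        return VisitorAction.CONTINUE;
    }
};
// Walk the whole class model and update every MiningField
mfUpdater.applyTo(pmml);
saveToFile(pmml, ...);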
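
If adding JPMML-Model as a dependency is not practical, the attribute change from option #3 can probably also be done with the DOM API that ships with the JDK. The following is only a minimal sketch under that assumption: the class name MiningFieldPatcher and the method relaxInvalidValueTreatment are hypothetical names made up for illustration, and the PMML string in main is a stripped-down fragment rather than a complete, valid model.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of sklearn2pmml or JPMML
class MiningFieldPatcher {

    // Returns the PMML text with invalidValueTreatment="asIs" set on every MiningField element
    static String relaxInvalidValueTreatment(String pmmlXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(pmmlXml)));

            // "*" matches any namespace, so the PMML schema version does not matter
            NodeList miningFields = document.getElementsByTagNameNS("*", "MiningField");
            for (int i = 0; i < miningFields.getLength(); i++) {
                Element miningField = (Element) miningFields.item(i);
                // Adds the attribute if it is missing, overwrites it otherwise
                miningField.setAttribute("invalidValueTreatment", "asIs");
            }

            // Serialize the modified DOM tree back to text
            StringWriter writer = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(document), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not rewrite PMML document", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample input: a minimal fragment, not a complete PMML model
        String pmml = "<PMML xmlns=\"http://www.dmg.org/PMML-4_3\" version=\"4.3\">"
            + "<RegressionModel functionName=\"regression\"><MiningSchema>"
            + "<MiningField name=\"x1\"/>"
            + "<MiningField name=\"x2\" invalidValueTreatment=\"returnInvalid\"/>"
            + "</MiningSchema></RegressionModel></PMML>";
        System.out.println(relaxInvalidValueTreatment(pmml));
    }
}
```

Unlike the visitor approach, this works on the raw XML and knows nothing about the PMML class model, so it is best suited to a one-off patch of an exported file.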