Question
I am new to SPSS Modeler. I am trying to create a simple data transformation with Python on dummy data that I generated.
The dummy data is created as expected (see at the bottom). I am trying to access and modify the data with Python, using the example that I found on the IBM website:
import spss.pyspark.runtime
from pyspark.sql.types import *

cxt = spss.pyspark.runtime.getContext()

if cxt.isComputeDataModelOnly():
    _schema = cxt.getSparkInputSchema()
    cxt.setSparkOutputSchema(_schema)
else:
    _structType = cxt.getSparkInputSchema()
    df = cxt.getSparkInputData()
    _newDF = df.sample(False, 0.01, 1)
    cxt.setSparkOutputData(_newDF)
When I press Preview to see the result, I get two errors:
- Can not get data model: null
- No record was received
(https://www.ibm.com/support/knowledgecenter/da/SS3RA7_18.0.0/modeler_r_nodes_ddita/clementine/r_pyspark_api_examples.html)
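A side note on the `df.sample(False, 0.01, 1)` line in the script: in PySpark the arguments are `withReplacement`, `fraction` and `seed`, so the call keeps each row with a probability of roughly 1%, and the number of rows returned is not guaranteed. The plain-Python sketch below needs no Spark; `bernoulli_sample` and the 10-row input are hypothetical, chosen only for illustration. It mimics that behaviour to show how few rows a small dummy dataset can be expected to yield. Whether this is related to the "No record was received" message is an assumption, not something confirmed in this thread.

```python
import random


def bernoulli_sample(rows, fraction, seed):
    """Keep each row independently with probability `fraction`.

    A rough stand-in for sampling without replacement; it does not
    reproduce Spark's actual random stream.
    """
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < fraction]


rows = list(range(10))   # hypothetical 10-row dummy dataset
fraction = 0.01          # same fraction as in the script
expected_rows = len(rows) * fraction
kept = bernoulli_sample(rows, fraction, seed=1)

print("expected rows on average:", expected_rows)
print("rows kept in this run:", len(kept))
```

If the dummy data is small, temporarily raising the fraction (or passing the DataFrame through unchanged) is a cheap way to rule this out.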
The whole setup looks like this
Answer 1:
I'd like to comment, but I don't have enough reputation, so I have to ask in an answer.
Are you using the correct syntax tab?
When I use it like that, I get the output I'd expect.
This code should just return your DataFrame and print "Hello World" to the Console Output tab:
import spss.pyspark.runtime
from pyspark.sql.types import *

cxt = spss.pyspark.runtime.getContext()

if cxt.isComputeDataModelOnly():
    _schema = cxt.getSparkInputSchema()
    cxt.setSparkOutputSchema(_schema)
else:
    df = cxt.getSparkInputData()
    print("Hello World")
    cxt.setSparkOutputData(df)
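The script above has two branches: one that only reports the output schema and one that handles the actual data. To check that branching outside Modeler, something like the following sketch may help. `FakeContext` and `run_node` are hypothetical names; the class only mimics the five context methods used in this thread and is not part of the SPSS API, and the list and tuple values are placeholders for a Spark schema and DataFrame.

```python
class FakeContext:
    """Hypothetical stand-in for the object returned by getContext().

    It only mimics the five methods used in the scripts above so the
    branching can be exercised without SPSS Modeler or Spark.
    """

    def __init__(self, schema, data, data_model_only):
        self._schema = schema
        self._data = data
        self._data_model_only = data_model_only
        self.output_schema = None
        self.output_data = None

    def isComputeDataModelOnly(self):
        return self._data_model_only

    def getSparkInputSchema(self):
        return self._schema

    def setSparkOutputSchema(self, schema):
        self.output_schema = schema

    def getSparkInputData(self):
        return self._data

    def setSparkOutputData(self, data):
        self.output_data = data


def run_node(cxt):
    """Same branching as the pass-through script above."""
    if cxt.isComputeDataModelOnly():
        cxt.setSparkOutputSchema(cxt.getSparkInputSchema())
    else:
        df = cxt.getSparkInputData()
        print("Hello World")
        cxt.setSparkOutputData(df)


schema = ["id", "value"]      # placeholder for a Spark StructType
data = [(1, "a"), (2, "b")]   # placeholder for a Spark DataFrame

# Schema-only pass: only the output schema should be set.
schema_pass = FakeContext(schema, data, data_model_only=True)
run_node(schema_pass)

# Data pass: only the output data should be set.
data_pass = FakeContext(schema, data, data_model_only=False)
run_node(data_pass)

print(schema_pass.output_schema, data_pass.output_data)
```

If the indentation is lost when pasting the script, so that neither branch sets its output, this kind of check fails immediately, which is easier to spot than an error in the Modeler preview.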
Answer 2:
You can also try using the legacy mode in the same script tab. I always use the legacy mode, and the code is similar to Clementine (the old version of SPSS Modeler).
Ref from IBM
Source: https://stackoverflow.com/questions/51028999/spss-modeler-extension-transform-python