SPSS Modeler Extension Transform - Python

Question

I am new to SPSS Modeler. I am trying to create a simple data transformation with Python on some dummy data.

The dummy data is created as expected (see the bottom of the post). I am trying to access and modify the data with Python, using the example that I found on the IBM website:

import spss.pyspark.runtime
from pyspark.sql.types import *

cxt = spss.pyspark.runtime.getContext() 

if  cxt.isComputeDataModelOnly():   
        _schema = cxt.getSparkInputSchema()   
        cxt.setSparkOutputSchema(_schema)
else:   
        _structType = cxt.getSparkInputSchema()
        df = cxt.getSparkInputData()   
        _newDF = df.sample(False, 0.01, 1)
        cxt.setSparkOutputData(_newDF)

When I press Preview to see the result, I get two errors:

- Can not get data model: null
- No record was received

(https://www.ibm.com/support/knowledgecenter/da/SS3RA7_18.0.0/modeler_r_nodes_ddita/clementine/r_pyspark_api_examples.html)

The whole setup looks like this

Answer 1:

I'd like to comment, but I don't have enough reputation, so I have to ask in an answer.

Are you using the correct syntax tab?

When I use it like that, I get the output I'd expect.

This code should just return your dataframe unchanged and print "Hello World" to the Console Output tab:

import spss.pyspark.runtime
from pyspark.sql.types import *

cxt = spss.pyspark.runtime.getContext()

if cxt.isComputeDataModelOnly():
    # Schema-only pass: hand the input schema through unchanged.
    _schema = cxt.getSparkInputSchema()
    cxt.setSparkOutputSchema(_schema)
else:
    # Data pass: return the input DataFrame unchanged.
    df = cxt.getSparkInputData()
    print("Hello World")
    cxt.setSparkOutputData(df)
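To check the script's two branches without opening Modeler, the same control flow can be run against a stand-in context. This is only a minimal sketch: `FakeContext` and `run_transform` are hypothetical names made up for illustration and are not part of the SPSS API. Only the five method names are taken from the script above, and plain Python lists stand in for the Spark schema and DataFrame.

```python
# Hypothetical stand-in for the object returned by
# spss.pyspark.runtime.getContext(); not part of the SPSS API.
class FakeContext:
    def __init__(self, schema, data, schema_only):
        self._schema = schema
        self._data = data
        self._schema_only = schema_only
        self.output_schema = None
        self.output_data = None

    def isComputeDataModelOnly(self):
        return self._schema_only

    def getSparkInputSchema(self):
        return self._schema

    def setSparkOutputSchema(self, schema):
        self.output_schema = schema

    def getSparkInputData(self):
        return self._data

    def setSparkOutputData(self, data):
        self.output_data = data


def run_transform(cxt):
    """Same control flow as the pass-through script above."""
    if cxt.isComputeDataModelOnly():
        # Schema-only branch: only the output schema is set.
        cxt.setSparkOutputSchema(cxt.getSparkInputSchema())
    else:
        # Data branch: only the output data is set.
        cxt.setSparkOutputData(cxt.getSparkInputData())


schema = ["id", "value"]      # placeholder for a Spark StructType
rows = [(1, "a"), (2, "b")]   # placeholder for a Spark DataFrame

schema_pass = FakeContext(schema, rows, schema_only=True)
run_transform(schema_pass)

data_pass = FakeContext(schema, rows, schema_only=False)
run_transform(data_pass)
```

Each branch sets exactly one of the two outputs, so a script that handles only one of them leaves the other pass without a result.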

Answer 2:

You can also try using the legacy mode in the same script tab. I always use the legacy mode, and the code is similar to Clementine (the old version of SPSS Modeler).

Ref from IBM

Source: https://stackoverflow.com/questions/51028999/spss-modeler-extension-transform-python

Tags

python

spss-modeler