SPSS Modeler Extension Transform - Python

Submitted by ♀尐吖头ヾ on 2021-01-29 09:31:51

Question


I am new to SPSS Modeler. I am trying to create a simple data transformation with Python on dummy data that I generated.

The dummy data is created as expected (see the bottom of the post). I am trying to access and modify the data with Python, using the example that I found on the IBM website:

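import spss.pyspark.runtime
from pyspark.sql.types import *

cxt = spss.pyspark.runtime.getContext()

if cxt.isComputeDataModelOnly():
    # Schema-only pass: the output fields are the same as the input fields
    _schema = cxt.getSparkInputSchema()
    cxt.setSparkOutputSchema(_schema)
else:
    # Data pass: read the input rows as a Spark DataFrame
    _structType = cxt.getSparkInputSchema()
    df = cxt.getSparkInputData()
    # sample(withReplacement=False, fraction=0.01, seed=1): keep about 1% of the rows
    _newDF = df.sample(False, 0.01, 1)
    cxt.setSparkOutputData(_newDF)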

When I press Preview to see the result, I get two errors: "Can not get data model: null" and "No record was received".

(https://www.ibm.com/support/knowledgecenter/da/SS3RA7_18.0.0/modeler_r_nodes_ddita/clementine/r_pyspark_api_examples.html)

The whole setup looks like this


Answer 1:


I'd like to comment, but I don't have enough reputation, so I have to ask in an answer.

Are you using the correct syntax tab?

When I use it like that, I get the output I would expect.


This code should just return your dataframe and print "Hello World" to the Console Output tab:

import spss.pyspark.runtime
from pyspark.sql.types import *

cxt = spss.pyspark.runtime.getContext()

if cxt.isComputeDataModelOnly():
    # Schema-only pass: declare the output fields, unchanged from the input
    _schema = cxt.getSparkInputSchema()
    cxt.setSparkOutputSchema(_schema)
else:
    # Data pass: read the input rows and hand them back unmodified
    df = cxt.getSparkInputData()
    print("Hello World")
    cxt.setSparkOutputData(df)
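
The two branches in the script above can be tried outside Modeler with a small stand-in. This is only a sketch of the control flow: `FakeContext` and `run_script` are hypothetical names invented here, not part of IBM's API, and plain Python lists stand in for the Spark schema and DataFrame. It assumes, as the method name `isComputeDataModelOnly()` suggests, that the script may be run once just to report the output fields and once to process the rows.

```python
# Hypothetical stand-in for the object returned by
# spss.pyspark.runtime.getContext(). It is NOT the real IBM class; it only
# mimics the five methods that the scripts above call.
class FakeContext:
    def __init__(self, schema, rows, schema_only):
        self._schema = schema            # list of field names (really a StructType)
        self._rows = rows                # list of tuples (really a Spark DataFrame)
        self._schema_only = schema_only  # True simulates the schema-only pass
        self.output_schema = None
        self.output_data = None

    def isComputeDataModelOnly(self):
        return self._schema_only

    def getSparkInputSchema(self):
        return self._schema

    def getSparkInputData(self):
        return self._rows

    def setSparkOutputSchema(self, schema):
        self.output_schema = schema

    def setSparkOutputData(self, data):
        self.output_data = data


def run_script(cxt):
    """Same branching as the pass-through script, written against the stand-in."""
    if cxt.isComputeDataModelOnly():
        # Schema-only pass: declare the output fields, touch no rows.
        cxt.setSparkOutputSchema(cxt.getSparkInputSchema())
    else:
        # Data pass: read the rows and hand them back unchanged.
        cxt.setSparkOutputData(cxt.getSparkInputData())
    return cxt


schema = ["id", "value"]
rows = [(1, 10.0), (2, 20.0)]

schema_pass = run_script(FakeContext(schema, rows, schema_only=True))
data_pass = run_script(FakeContext(schema, rows, schema_only=False))

print("schema-only pass:", schema_pass.output_schema, schema_pass.output_data)
print("data pass:", data_pass.output_schema, data_pass.output_data)
```

In the sketch, each pass sets only one of the two outputs, which is why a script needs both branches: the schema-only branch is the only place the output fields get declared, and the data branch is the only place rows are returned.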



Answer 2:


You can also try using legacy mode in the same script tab. I always use legacy mode, and the code is similar to Clementine (the old version of SPSS Modeler).

Ref from IBM



Source: https://stackoverflow.com/questions/51028999/spss-modeler-extension-transform-python
