问题
tl;dr: I need to first dichotomize a set of variables to 0/1, then sum up these values. I need to do this for 14x8 variables, so I am looking for a way to to this in a loop.
Hi guys,
I have a very specific problem I need your help with:
Description of problem: In my dataset I have 14 sets of 8 variables each (e.g. a1 to a8, b1 to b8, c1 to c8, etc.) with scores ranging from 1 to 6. Note that the variables are non-contiguous, with string variables in between them (which I need for a different purpose).
I know want to compute scores for each set of these variables (e.g. scoreA, scoreB, scoreC). The score should be computed according the following rule:
scoreA = 0.
If a1 > 1 then increment scoreA by 1.
If a2 > 1 then increment scoreA by 1.
... etc.
Example: Dataset:
1 5 6 3 2 1 1 5
1 1 1 3 4 6 2 3
scores:
5
5
My previous attempts: I know I could do this task by first recoding the variables to dichotomize them, and then sum up these values. This has two large drawbacks for me: Firstly it creates a lot of new variables which I don't need. Secondly it is a very tedious and repetitive task since I have multiple sets of variables (which have different variable names) with which I need to do the same task.
I took a look at the DO REPEAT and LOOP with VECTOR commands, but I seem to not fully understand how they work. I was not able to transfer solutions from other examples I read online to my problem. I would be happy with a solution that only loops through one set of variables and does the task, then I would adjust the syntax appropriately for my other 13 sets of variables. Hope you can help me out.
Kind regards, David
回答1:
See two solutions: one loops over each of the sets, the second is a macro which loops over a list of sets:
* creating some sample data.
DATA LIST list/a1 to a8 b1 to b8 c1 to c8 hello1 to hello8.
BEGIN DATA
1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 2 1 1 1 1 1 3 3 3 1 1 1 1 4 4 4 4
1 1 1 1 2 3 4 5 1 1 1 2 3 4 1 0 0 0 0 0 1 2 1 2 3 2 1 2 3 2 1 6
END DATA.
* solution 1: a loop for each set (example for sets a, b and c).
compute scoreA=0.
compute scoreB=0.
compute scoreC=0.
do repeat
a=a1 a2 a3 a4 a5 a6 a7 a8
/b=b1 b2 b3 b4 b5 b6 b7 b8
/c=c1 c2 c3 c4 c5 c6 c7 c8./* if variable names are consecutive replace with "a1 to a8" etc'.
compute scoreA=scoreA+(a>1).
compute scoreB=scoreB+(b>1).
compute scoreC=scoreC+(c>1).
end repeat.
execute.
Doing this for 14 different sets is no fun, so assuming your sets are always named $1 to $8, you can use the following macro:
define DoSets (SetList=!cmdend)
!do !set !in (!SetList)
compute !concat("Score_",!set)=0.
do repeat !set=!concat(!set,"1") !concat(!set,"2") !concat(!set,"3") !concat(!set,"4") !concat(!set,"5") !concat(!set,"6") !concat(!set,"7") !concat(!set,"8").
compute !concat("Score_",!set)=!concat("Score_",!set)+(!set>1).
end repeat.
!doend
execute.
!enddefine.
* now call the macro and list all set names.
DoSets SetList= a b c hello.
回答2:
The do repeat loop above works perfectly, but with a lot of sets of variables, it would be tedious to create. Using Python programmability, this can be generated automatically without regard to the variable order. The code below assumes an unlimited number of variables with names of the form lowercase letter digit that occur in sets of 8 and generates and runs the do repeat. For simplicity it generates one loop for each output variable, but these will all be executed on a single data pass. If the name pattern is different, this code could be adjusted if you say what it is.
begin program.
import spss, spssaux
vars = sorted(spssaux.VariableDict(pattern="[a-z]\d").variables)
cmd = """compute %(score)s = 0.
do repeat index = %(vlist)s.
compute %(score)s = %(score)s + (index > 1).
end repeat."""
if len(vars) % 8 != 0:
raise ValueError("Number of input variables not a multiple of 8")
for v in range(0, len(vars),8):
score = "score" + vars[v][0]
vlist = " ".join(vars[v:v+8])
spss.Submit(cmd % locals())
end program.
execute.
来源:https://stackoverflow.com/questions/37456141/spss-summing-up-multiple-variable-scores-depending-on-their-score