I have a data file consisting of 57 variables. I want to transform about 12 of them into z-scores because they are measured on different scales. I looked up internet resources an
mu <- mean(myRow)              # sample mean
sigma <- sqrt(var(myRow))      # standard deviation, same as sd(myRow)
myRow <- (myRow - mu) / sigma  # divide by sigma itself, not sqrt(sigma)
You simply forgot the brackets in: x - mean(x)/sd(x)
Without them, only mean(x) is divided by sd(x), and that ratio is then subtracted from x.
The correct code is: (x - mean(x)) / sd(x)
scale() is the correct choice here:
> x <- 1:10
> scale(x)
[,1]
[1,] -1.4863011
[2,] -1.1560120
[3,] -0.8257228
[4,] -0.4954337
[5,] -0.1651446
[6,] 0.1651446
[7,] 0.4954337
[8,] 0.8257228
[9,] 1.1560120
[10,] 1.4863011
attr(,"scaled:center")
[1] 5.5
attr(,"scaled:scale")
[1] 3.02765
> (x - mean(x)) / sd(x)
[1] -1.4863011 -1.1560120 -0.8257228 -0.4954337 -0.1651446
[6] 0.1651446 0.4954337 0.8257228 1.1560120 1.4863011
> mean(x)
[1] 5.5
> sd(x)
[1] 3.02765
Notice how the attributes in the object returned from scale() are the mean and SD of the input data.
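Since the question is about standardising roughly 12 of 57 variables rather than a single vector, here is a minimal sketch of how that might look. The data frame `dat`, its columns, and the `cols` vector are all made up for illustration; substitute your own data frame and the names of the variables you want to transform.

```r
# Hypothetical data: 'dat' and its column names are invented for this example.
set.seed(1)
dat <- data.frame(
  age    = rnorm(20, mean = 40, sd = 10),
  income = rnorm(20, mean = 50000, sd = 8000),
  group  = rep(c("a", "b"), each = 10)
)

# Names of the columns to standardise (in the question, the ~12 variables).
cols <- c("age", "income")

# Apply scale() to each selected column; as.numeric() drops the matrix
# shape and the "scaled:*" attributes so plain numeric vectors are stored.
dat[cols] <- lapply(dat[cols], function(v) as.numeric(scale(v)))

# Check: each standardised column should have mean ~0 and SD 1.
colMeans(dat[cols])
sapply(dat[cols], sd)
```

The untouched columns (here `group`) are left exactly as they were, so the other 45 variables in your file would not be affected.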
You don't show the actual code you used to compute "V5-mean/st.dev", but if you typed it exactly like that, operator precedence may have caught you out. This, for example, doesn't return the correct z-scores:
> x - mean(x) / sd(x)
[1] -0.8165902 0.1834098 1.1834098 2.1834098 3.1834098
[6] 4.1834098 5.1834098 6.1834098 7.1834098 8.1834098
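To make the precedence point concrete, the following sketch checks how R groups the unbracketed expression and compares the bracketed version against scale(). The variable names `unbracketed`, `implicit`, and `z` are just illustrative.

```r
x <- 1:10

# Division binds more tightly than subtraction, so the unbracketed
# expression is evaluated as x - (mean(x) / sd(x)).
unbracketed <- x - mean(x) / sd(x)
implicit    <- x - (mean(x) / sd(x))
all.equal(unbracketed, implicit)

# The intended z-score groups the subtraction first.
z <- (x - mean(x)) / sd(x)

# as.numeric() strips the matrix shape and attributes that scale() adds,
# so the two results can be compared value by value.
all.equal(z, as.numeric(scale(x)))
```

A quick sanity check for any z-score computation is that the result has mean 0 and SD 1 (up to floating-point error); the unbracketed version fails that check because it only shifts x by a constant.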