I'm quite confused about when to use factor(educ) or factor(agegroup) in R. Is it used for ordered categorical data? or can I just use to i
You can flag a factor as ordered by creating it with ordered(x) or with factor(x, ordered=TRUE). The "Details" section of ?factor explains that:
Ordered factors differ from factors only in their class, but methods and the model-fitting functions treat the two classes quite differently.
You can confirm the first part of that quote (that they differ only in their class) by comparing the attributes of these two objects:
f <- factor(letters[3:1], levels=letters[3:1])
of <- ordered(letters[3:1], levels=letters[3:1])
attributes(f)
# $levels
# [1] "c" "b" "a"
#
# $class
# [1] "factor"
attributes(of)
# $levels
# [1] "c" "b" "a"
#
# $class
# [1] "ordered" "factor"
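As a further check, here is a minimal sketch that strips the class attribute from both objects and compares what is left. It only uses base R, and it re-creates f and of so that it can be run on its own:

```r
# Re-create the two objects from above so this snippet is self-contained
f  <- factor(letters[3:1], levels = letters[3:1])
of <- ordered(letters[3:1], levels = letters[3:1])

# Both store the same underlying integer codes ...
codes_f  <- as.integer(f)
codes_of <- as.integer(of)

# ... and the same levels, so once the class attribute is dropped
# the two objects should be indistinguishable
same_without_class <- identical(unclass(f), unclass(of))

# The class vector itself is the only place they differ
class_diff <- setdiff(class(of), class(f))
```

If the quote from ?factor holds, same_without_class should be TRUE and class_diff should contain just "ordered".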
Various factor-handling R functions (the "methods and model-fitting functions" of the second part of that quote) will then use is.ordered() to test for the presence of that "ordered" class indicator, taking it as a directive to treat an ordered factor differently than an unordered one. Here are a couple of examples:
## The print method for factors. (Type 'print.factor' to see the function's code)
print(f)
# [1] c b a
# Levels: c b a
print(of)
# [1] c b a
# Levels: c < b < a
## The contrasts function. (Type 'contrasts' to see the function's code.)
contrasts(of)
#                 .L         .Q
# [1,] -7.071068e-01  0.4082483
# [2,]  4.350720e-18 -0.8164966
# [3,]  7.071068e-01  0.4082483
contrasts(f)
#   b a
# c 0 0
# b 1 0
# a 0 1
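To tie this back to something like factor(educ), here is a rough sketch of how the same variable behaves in a model when it is ordered versus unordered. The data are made up purely for illustration (the educ values and the wage column are hypothetical), and it assumes R's default contrasts option, i.e. contr.treatment for unordered and contr.poly for ordered factors:

```r
# Hypothetical data: level names and wage values are invented for illustration
educ_levels <- c("primary", "secondary", "tertiary")
educ <- c("primary", "tertiary", "secondary", "tertiary", "primary", "secondary")
wage <- c(10, 22, 15, 25, 11, 17)

educ_f  <- factor(educ, levels = educ_levels)                  # unordered
educ_of <- factor(educ, levels = educ_levels, ordered = TRUE)  # ordered

# is.ordered() is the test that methods and modelling functions rely on
is.ordered(educ_f)
is.ordered(educ_of)

# Comparison operators are defined for ordered factors;
# the same comparison on educ_f would return NA with a warning
below_tertiary <- educ_of < "tertiary"

# Same data, different coding in the design matrix:
# dummy (treatment) columns for educ_f, polynomial (.L, .Q) columns for educ_of
fit_f  <- lm(wage ~ educ_f)
fit_of <- lm(wage ~ educ_of)
names(coef(fit_f))
names(coef(fit_of))
```

Both fits describe the same group means, so the fitted values agree; only the parameterisation, and therefore the interpretation of the individual coefficients, changes. If you want the levels treated as plain categories, the unordered factor() is usually the easier one to interpret.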