I have a data frame in R:
word positive.polarity negative.polarity
1 interesting 1 0
2
Not knowing what your "special character" is, I'm going to use the pattern "[o]{2}|[y]$" as the condition.
In basic terms:
if the word contains two consecutive o's OR ends in a 'y', multiply by 3; if not, divide by 3.
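To make the condition concrete, here is a minimal base R check of what the pattern matches. The four test words are picked only for illustration:

```r
# Illustrative words only; swap in your own
test_words <- c("too", "only", "so", "not")

# TRUE when a word contains "oo" or ends in "y"
matches <- grepl("[o]{2}|[y]$", test_words)
print(matches)
# [1]  TRUE  TRUE FALSE FALSE
```

Note that "so" does not match: `{2}` requires the two o's to be adjacent.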
This uses the tm package for the stop words and dplyr for the pipeline:
library(dplyr)  # provides %>% and mutate()

# Create some data to mimic yours; the polarities are random, so values vary per run
var_df <- data.frame(word = tm::stopwords(),
                     stringsAsFactors = FALSE) %>%
  mutate(positive.polarity = sample(0:1, nrow(.), TRUE)) %>%
  mutate(negative.polarity = ifelse(positive.polarity == 1, 0, 1)) %>%
  # Apply the condition: multiply by 3 on a match, otherwise divide by 3
  mutate(
    positive.ponderate.polarity = ifelse(
      grepl("[o]{2}|[y]$", word),
      positive.polarity * 3,
      positive.polarity / 3)
  )
tail(var_df, 10)
    word positive.polarity negative.polarity positive.ponderate.polarity
165   no                 0                 1                   0.0000000
166  nor                 0                 1                   0.0000000
167  not                 1                 0                   0.3333333
168 only                 1                 0                   3.0000000
169  own                 1                 0                   0.3333333
170 same                 1                 0                   0.3333333
171   so                 0                 1                   0.0000000
172 than                 1                 0                   0.3333333
173  too                 1                 0                   3.0000000
174 very                 1                 0                   3.0000000
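
Once you know your actual special character, you can swap it in for the pattern. Below is a rough sketch in base R only, assuming the character is a literal "!" (a guess on my part, since the question doesn't say). The helper name `ponderate` and the example rows are made up for illustration. `fixed = TRUE` makes `grepl` treat the pattern as a literal string, so characters that are regex metacharacters need no escaping.

```r
# Hypothetical helper: multiply positive.polarity by 3 when `word` matches
# `pattern`, otherwise divide by 3. With fixed = TRUE the pattern is literal.
ponderate <- function(df, pattern, fixed = FALSE) {
  hit <- grepl(pattern, df$word, fixed = fixed)
  df$positive.ponderate.polarity <- ifelse(hit,
                                           df$positive.polarity * 3,
                                           df$positive.polarity / 3)
  df
}

# Made-up data; "!" stands in for the unspecified special character
example_df <- data.frame(
  word = c("interesting", "great!", "too"),
  positive.polarity = c(1, 1, 0),
  negative.polarity = c(0, 0, 1),
  stringsAsFactors = FALSE
)

result <- ponderate(example_df, "!", fixed = TRUE)
print(result)
```

Here only "great!" is multiplied by 3; "interesting" is divided by 3, and "too" stays at 0 because its positive polarity is 0.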