How to display coefficients in scientific notation with stargazer

Submitted by 南楼画角 on 2019-12-23 07:13:26

Question


I want to compare the results of different models (lm, glm, plm, pglm) in a table in R using stargazer or a similar tool. However, I can't find a way to display the coefficients in scientific notation. This is a problem because the intercept is rather large (about a million) while the other coefficients are small (about e-7), which results in lots of useless zeros that make the table harder to read.

I found a similar question here: Format model display in texreg or stargazer R as scientific. But the results there require rescaling the variables and since I use count data I wouldn't want to rescale it.

I am grateful for any suggestions.


Answer 1:


Here's a reproducible example:

m1 <- lm(Sepal.Length ~ Petal.Length*Sepal.Width,
         transform(iris, Sepal.Length = Sepal.Length+1e6,
                   Petal.Length=Petal.Length*10, Sepal.Width=Sepal.Width*100))
# Coefficients:
#              (Intercept)              Petal.Length               Sepal.Width  Petal.Length:Sepal.Width  
#                1.000e+06                 7.185e-02                 8.500e-03                -7.701e-05  

I don't believe stargazer has easy support for this. You could try alternatives like xtable or any of the many options here (I have not tried them all):

library(xtable)
xtable(m1, display=rep('g', 5)) # or there's `digits` too; see `?xtable`

Or if you're using knitr or pandoc I quite like pander, which has automagic scientific notation already (note: this is pandoc output which looks like markdown, not tex output, and then you knit or pandoc to latex/pdf):

library(pander)
pander(m1)



Answer 2:


It's probably worth making a feature request to the package maintainer to include this option.

In the meantime, you can replace numbers in the output with scientific notation auto-magically. There are a few things to be careful about when replacing numbers. It is important not to reformat numbers that are part of the latex encoding. Also, be careful not to replace characters that are part of variable names. For example the . in Sepal.Width could easily be mistaken for a number by regex. The following code should deal with most common situations. But, if someone, for example, calls their variable X_123456789 it might rename this to X_1.23e+09 depending on the scipen setting. So some caution is needed and a more robust solution probably will need to be implemented within the stargazer package.
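To see why a bare number pattern is risky, here is a small sketch in base R. The table row is a made-up fragment in the style of LaTeX table output, not real stargazer output, and X_123456789 is the hypothetical variable name from the paragraph above.

```r
# Hypothetical fragment resembling one row of a LaTeX regression table
line <- "X_123456789 & 0.00000007 \\\\[-1.8ex]"

# A naive pattern for "anything made of digits and dots"
naive <- "[0-9.]+"

# Everything the naive pattern would hand to a reformatting function
hits <- regmatches(line, gregexpr(naive, line))[[1]]
print(hits)
# Alongside the coefficient, this picks up the digits inside the variable
# name and the 1.8 from the LaTeX spacing command.
```

Only the middle hit is a coefficient, so any replacement function needs some way of leaving the other two alone.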

Here's an example stargazer table to demonstrate on (shamelessly copied from @mathematical.coffee):

library(stargazer)
library(gsubfn)
m1 <- lm(Sepal.Length ~ Petal.Length*Sepal.Width,
  transform(iris, Sepal.Length = Sepal.Length+1e6,
    Petal.Length=Petal.Length*10, Sepal.Width=Sepal.Width*100))    
star = stargazer(m1, header = FALSE, digit.separator = '')

Now a helper function to reformat the numbers. You can play around with the digits and scipen parameters to control the output format. If you want to force scientific format more often use a smaller (more negative) scipen. Otherwise we can have it automatically use scientific format only for very small or large numbers by using a larger scipen. The cutoff parameter is there to prevent reformatting of numbers represented by only a few characters.

replace_numbers = function(x, cutoff=4, digits=3, scipen=-7) {
  ifelse(nchar(x) < cutoff, x, prettyNum(as.numeric(x), digits=digits, scientific=scipen))
}

And apply that to the stargazer output using gsubfn::gsubfn:

gsubfn("([0-9.]+)", ~replace_numbers(x), star)
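As a rough illustration of how the scipen penalty steers the helper above, the sketch below formats one value with a negative and a positive penalty. The value 0.0001234 and the variable names are chosen only for the demonstration.

```r
x <- 0.0001234

# Negative penalty: scientific notation is preferred
as_sci <- prettyNum(x, digits = 3, scientific = -7)

# Positive penalty: fixed notation is preferred
as_fixed <- prettyNum(x, digits = 3, scientific = 7)

print(c(as_sci, as_fixed))
```

The first result should contain an exponent and the second should not, which is the behaviour the helper relies on when you tune scipen.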




Answer 3:


Another robust way to get scientific notation using stargazer is to hack the decimal.mark parameter. This option lets the user specify the character that separates the integer and fractional parts of a number (a period . in most locales). We can usurp this parameter to insert a uniquely identifiable string into every number that we want to find using regex. The advantage of searching this way is that we only find numbers that correspond to numeric values in the stargazer output. That is, there is no possibility of also matching numbers that are part of variable names (e.g. X_12345) or part of the LaTeX formatting code (e.g. \hline \\[-1.8ex]). In the following I use the string ::::, but any unique character string (such as a hash) that does not occur elsewhere in the table will do. It's best to avoid special regex characters in the marker, as these would need escaping.

Using the example model m1 from the answers above:

mark  = '::::'
star = stargazer(m1, header = FALSE, decimal.mark  = mark, digit.separator = '')

replace_numbers = function(x, low=0.01, high=1e3, digits = 3, scipen=-7, ...) {
  x = gsub(mark,'.',x)
  x.num = as.numeric(x)
  ifelse(
    (abs(x.num) >= low) & (abs(x.num) < high), 
    round(x.num, digits = digits), 
    prettyNum(x.num, digits=digits, scientific = scipen, ...)
  )
}    

reg = paste0("([0-9.\\-]+", mark, "[0-9.\\-]+)")
cat(gsubfn(reg, ~replace_numbers(x), star), sep='\n')
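The marker idea can also be tried without stargazer or gsubfn. The sketch below uses only base R on three made-up table rows in which :::: already stands in for the decimal point; the rows, the to_sci helper and the two-digit precision are assumptions for illustration, not real stargazer output.

```r
mark <- "::::"

# Hypothetical rows, as if produced with decimal.mark = mark
table_lines <- c(
  "Petal.Length & 0::::072$^{***}$ \\\\",
  "X_12345 & $-$0::::0000770 \\\\",
  "Constant & 1000000::::000 \\\\[-1.8ex]"
)

# Only digit runs joined by the marker count as numeric values
pattern <- paste0("[0-9]+", mark, "[0-9]+")

# Restore the decimal point, then format in scientific notation
to_sci <- function(tokens) {
  nums <- as.numeric(sub(mark, ".", tokens, fixed = TRUE))
  sprintf("%.2e", nums)
}

out <- table_lines
m <- gregexpr(pattern, out)
regmatches(out, m) <- lapply(regmatches(out, m), to_sci)
cat(out, sep = "\n")
```

Because the pattern requires the marker, the digits in X_12345 and in [-1.8ex] are left untouched.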

Update: If you want to ensure that trailing zeros are retained in the scientific notation, you can use sprintf instead of prettyNum, like this:

replace_numbers = function(x, low=0.01, high=1e3, digits = 3) {
  x = gsub(mark,'.',x)
  x.num = as.numeric(x)
  form = paste0('%.', digits, 'e')
  ifelse(
    (abs(x.num) >= low) & (abs(x.num) < high), 
    round(x.num, digits = digits), 
    sprintf(form, x.num) 
  )
}
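A quick check of the difference, using an arbitrary value of 5e-05: prettyNum drops trailing zeros from the mantissa, while sprintf with a %.3e format always prints three decimals.

```r
x <- 5e-05

short  <- prettyNum(x, digits = 3, scientific = TRUE)
padded <- sprintf("%.3e", x)

print(short)   # "5e-05"
print(padded)  # "5.000e-05"
```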



Source: https://stackoverflow.com/questions/31551822/how-to-display-coefficients-in-scientific-notation-with-stargazer
