How can I predict memory usage and time based on historical values

Front-end · Unresolved · 1 answer · 649 views

春和景丽 2021-01-27 03:05

A maths problem really I think... I have some historical data for some spreadsheet outputs along with the number of rows and columns.

What I'd like to do is use this da

1 Answer
  •  别那么骄傲
    2021-01-27 03:29

    You could fit a linear regression model.

    Since this is a programming site, here is some R code:

    > d <- read.table("data.tsv", sep="\t", header=T)
    > summary(lm(log(Bytes.RAM) ~ log(Rows) + log(Columns), d))
    
    Call:
    lm(formula = log(Bytes.RAM) ~ log(Rows) + log(Columns), data = d)
    
    Residuals:
        Min      1Q  Median      3Q     Max 
    -0.4800 -0.2409 -0.1618  0.1729  0.6827 
    
    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
    (Intercept)  12.42118    0.61820  20.093 8.72e-09 ***
    log(Rows)     0.51032    0.09083   5.618 0.000327 ***
    log(Columns)  0.58200    0.07821   7.441 3.93e-05 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
    
    Residual standard error: 0.4052 on 9 degrees of freedom
    Multiple R-squared: 0.9062, Adjusted R-squared: 0.8853 
    F-statistic: 43.47 on 2 and 9 DF,  p-value: 2.372e-05 
    

    This model fits the data well (the adjusted R² is 0.89) and suggests the following power-law relationship between the size of the spreadsheet and its memory usage:

    Bytes.RAM = exp(12.42 + 0.51 * log(Rows) + 0.58 * log(Columns))
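    The fitted formula can be evaluated directly in R. A small sketch, with the coefficients copied from the summary above (the 1000 × 20 spreadsheet size is just a made-up example input):

```r
# Predicted RAM usage from the fitted log-log model.
# Coefficients are taken from the regression summary above.
predict_bytes <- function(rows, columns) {
  exp(12.42118 + 0.51032 * log(rows) + 0.58200 * log(columns))
}

predict_bytes(1000, 20)  # roughly 48 MB for a 1000 x 20 sheet
```

    Equivalently, since exp(12.42) ≈ 2.5e5, the model says Bytes.RAM ≈ 250000 * Rows^0.51 * Columns^0.58, i.e. memory grows roughly with the square root of each dimension.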
    

    A similar model can be used to predict the execution time (the Seconds column). There, the R² is 0.998.
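    A minimal sketch of that second model, with synthetic numbers standing in for the real Seconds column (which is not shown in the answer):

```r
# Fit the same log-log model for execution time (Seconds).
# The data frame here is synthetic; in practice, reuse the one
# read from data.tsv, which is assumed to contain a Seconds column.
set.seed(42)
d <- expand.grid(Rows = c(100, 1000, 10000), Columns = c(5, 20, 80))
d$Seconds <- 0.001 * d$Rows^0.9 * d$Columns^0.8 * exp(rnorm(9, sd = 0.05))

m_time <- lm(log(Seconds) ~ log(Rows) + log(Columns), data = d)
coef(m_time)  # intercept plus the two fitted exponents

# Predict run time for a new 2000 x 30 sheet; exp() undoes the log transform.
exp(predict(m_time, newdata = data.frame(Rows = 2000, Columns = 30)))
```

    As with the memory model, exponentiating the prediction converts it back from log-seconds to seconds.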
