How to select standard deviation within a row? (in SQL - or R :)

时光取名叫无心 2020-12-21 04:41

I wonder whether there is a way to select the standard deviation from several integer fields in MySQL within the same row. Obviously, if I use



        
4 answers
  • 2020-12-21 05:12

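    Have you tried using UNION ALL to effectively put all your column values into separate rows? Plain UNION would drop duplicate values and distort the result. Something like this, maybe:

    -- `table` and requiredID are placeholders for your table name and row id
    SELECT STDDEV(allcols)
    FROM (
        SELECT col1 AS allcols FROM `table` WHERE id = requiredID
        UNION ALL
        SELECT col2 FROM `table` WHERE id = requiredID
        UNION ALL
        SELECT col3 FROM `table` WHERE id = requiredID
        UNION ALL
        SELECT col4 FROM `table` WHERE id = requiredID
        UNION ALL
        SELECT col5 FROM `table` WHERE id = requiredID
    ) AS vals  -- MySQL requires an alias on every derived table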
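
Whether the subquery uses UNION or UNION ALL matters whenever a row contains repeated values, because UNION removes duplicates before the aggregate sees them. The following is only a minimal sketch of that effect: it uses Python's sqlite3 module instead of MySQL, and the table name `scores` and the sample row are made up for illustration.

```python
import sqlite3
import statistics

conn = sqlite3.connect(":memory:")
# Hypothetical table with five integer columns, standing in for the real one.
conn.execute(
    "CREATE TABLE scores (id INTEGER, col1 INTEGER, col2 INTEGER, "
    "col3 INTEGER, col4 INTEGER, col5 INTEGER)"
)
conn.execute("INSERT INTO scores VALUES (1, 2, 4, 4, 4, 5)")


def unpivot(set_operator):
    """Return the five column values of row id=1 as separate rows."""
    parts = [f"SELECT col{i} AS v FROM scores WHERE id = 1" for i in range(1, 6)]
    query = f" {set_operator} ".join(parts)
    return sorted(row[0] for row in conn.execute(query))


deduplicated = unpivot("UNION")    # repeated values collapse into one row
all_values = unpivot("UNION ALL")  # one row per column, duplicates kept

# Population standard deviation, the same definition MySQL's STDDEV() uses.
sd_union = statistics.pstdev(deduplicated)
sd_union_all = statistics.pstdev(all_values)
print(deduplicated, sd_union)
print(all_values, sd_union_all)
```

With this sample row the two variants see different sets of values, so the two standard deviations differ; only the UNION ALL variant describes the whole row.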
    
  • 2020-12-21 05:13

    I found two solutions on my own:

    1) Normalize the database. I end up with two tables:

    table one uid | information1 | metainformation2

    table two uid | col | result_of_col

    Then I can easily use the standard STDDEV function.
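
As a rough sketch of what the normalized layout buys you, the example below stores one value per row and groups by uid. It is an illustration under assumptions, not the poster's actual schema: the table name `results` and the data are invented, and because SQLite has no built-in STDDEV, a population standard deviation aggregate is registered by hand. In MySQL the built-in STDDEV would take its place.

```python
import math
import sqlite3


class PopulationStdDev:
    """User-defined aggregate: population standard deviation (divisor n)."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Skip NULLs, as SQL aggregates conventionally do.
        if value is not None:
            self.values.append(value)

    def finalize(self):
        if not self.values:
            return None
        n = len(self.values)
        mean = sum(self.values) / n
        return math.sqrt(sum((v - mean) ** 2 for v in self.values) / n)


conn = sqlite3.connect(":memory:")
conn.create_aggregate("stddev_pop", 1, PopulationStdDev)

# Hypothetical "table two": one measured value per (uid, col) pair.
conn.execute("CREATE TABLE results (uid INTEGER, col TEXT, result_of_col INTEGER)")
rows = [
    (1, "col1", 2), (1, "col2", 4), (1, "col3", 4), (1, "col4", 4), (1, "col5", 5),
    (2, "col1", 3), (2, "col2", 3), (2, "col3", 3),
]
conn.executemany("INSERT INTO results VALUES (?, ?, ?)", rows)

# One standard deviation per original row, keyed by uid.
per_uid = dict(
    conn.execute("SELECT uid, stddev_pop(result_of_col) FROM results GROUP BY uid")
)
print(per_uid)
```

The GROUP BY query is the part that carries over to MySQL; the aggregate class exists only to make the sketch runnable in SQLite.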

    2) Use R. The data is in a de-normalized format because it is meant for statistical analysis, so it is easy to load into R and use the following code.

    sd(t(dataset[1:4,3:8]))

    Note that I take only the numeric part of this data.frame by selecting columns 3-8, and I use only the first few rows so as not to get hit by too much data. t() transposes the data, which is necessary because sd() only works with columns. This relies on sd() treating a matrix column by column, as older versions of R did; current versions of R treat the matrix as one vector and return a single number, so there apply(dataset[1:4,3:8], 1, sd) gives one value per row.

    There is a function rowSds in the vsn package that is supposed to work analogously to rowMeans and rowSums, but it may be deprecated. At least this package was not available on the Swiss CRAN mirror ;) (vsn is distributed through Bioconductor rather than CRAN).

    HTH someone else.

  • 2020-12-21 05:17

    For simplicity, assume you have n columns, named A, B, C, ...:

    SELECT SQRT(  
      (A*A + B*B + C*C + ...)/n  - (A+B+C+...)*(A+B+C+...)/n/n) AS sd
      FROM table;
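
The expression above is the population standard deviation (divisor n), which is also what MySQL's STDDEV() returns, whereas R's sd() uses the sample form (divisor n - 1). The sketch below checks the sums-of-squares formula against Python's statistics module on a made-up row and shows the conversion between the two; the function name `row_sd_from_sums` is hypothetical.

```python
import math
import statistics


def row_sd_from_sums(values):
    """Population SD via the identity: mean of squares minus squared mean."""
    n = len(values)
    sum_of_squares = sum(v * v for v in values)
    total = sum(values)
    variance = sum_of_squares / n - (total * total) / (n * n)
    # Guard against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(variance, 0.0))


row = [2, 4, 4, 4, 5]  # illustrative values for columns A..E
n = len(row)

population_sd = row_sd_from_sums(row)
# Rescale to the sample SD (divisor n - 1), the definition R's sd() uses.
sample_sd = population_sd * math.sqrt(n / (n - 1))

print(population_sd, statistics.pstdev(row))
print(sample_sd, statistics.stdev(row))
```

If the SQL result has to match what R reports for the same row, the same rescaling factor can be applied inside the query.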
    
  • 2020-12-21 05:23

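    With R:

    df <- your.pull
    apply(df[sapply(df, is.numeric)], 1, sd)

    Pull the data with RMySQL or RODBC, keep only the numeric columns, and apply sd to each row. (The original form, sd(t(df[sapply(df, is.numeric)])), relied on older versions of R computing sd() column by column on a matrix; current versions return one number for the whole matrix.)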
