How can I do standard deviation in Ruby?

一个人的身影 2021-01-30 15:45

I have several records with a given attribute, and I want to find the standard deviation.

How do I do that?

9 Answers
  • 2021-01-30 16:30

    It appears that Angela may have wanted an existing library. After playing with statsample, array-statistics, and a few others, I'd recommend the descriptive_statistics gem if you're trying to avoid reinventing the wheel.

    gem install descriptive_statistics
    
    $ irb
    1.9.2 :001 > require 'descriptive_statistics'
     => true 
    1.9.2 :002 > samples = [1, 2, 2.2, 2.3, 4, 5]
     => [1, 2, 2.2, 2.3, 4, 5] 
    1.9.2 :003 > samples.sum
     => 16.5 
    1.9.2 :004 > samples.mean
     => 2.75 
    1.9.2 :005 > samples.variance
     => 1.7924999999999998 
    1.9.2 :006 > samples.standard_deviation
     => 1.3388427838995882 
    

    I can't speak to its statistical correctness or to your comfort with monkey-patching Enumerable, but it's easy to use and easy to contribute to.
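
Judging only from the irb session above, the gem's figures match the population formulas (dividing by n) rather than the sample formulas (dividing by n - 1) used elsewhere in this thread. A minimal plain-Ruby sketch to check that without the gem; the helper names population_variance and sample_variance are made up for this illustration:

```ruby
# Plain-Ruby check, no gem required. Helper names are hypothetical.
def population_variance(values)
  mean = values.inject(0.0) { |acc, v| acc + v } / values.length
  values.inject(0.0) { |acc, v| acc + (v - mean)**2 } / values.length
end

def sample_variance(values)
  mean = values.inject(0.0) { |acc, v| acc + v } / values.length
  values.inject(0.0) { |acc, v| acc + (v - mean)**2 } / (values.length - 1)
end

samples = [1, 2, 2.2, 2.3, 4, 5]

population_sd = Math.sqrt(population_variance(samples))  # divides by n
sample_sd     = Math.sqrt(sample_variance(samples))      # divides by n - 1

puts population_sd.round(4)  # => 1.3388, the figure in the irb session
puts sample_sd.round(4)      # => 1.4666
```

Which divisor is appropriate depends on whether the records are the whole population or a sample drawn from it.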

  • 2021-01-30 16:31

    If the records at hand are of type Integer or Rational, you may want to compute the variance using Rational instead of Float to avoid errors introduced by rounding.

    For example:

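    # Population variance, computed exactly with Rational arithmetic.
    def variance(list)
      mean = list.reduce(:+) / list.length.to_r  # to_r keeps the division exact
      sum_of_squared_differences = list.map { |i| (i - mean)**2 }.reduce(:+)
      sum_of_squared_differences / list.length
    end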
    

    (It would be prudent to add special-case handling for empty lists and other edge cases.)

    The standard deviation is then the square root of the variance:

    def std_dev(list)
      Math.sqrt(variance(list))
    end
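
The edge-case handling mentioned above might look like the following. This is only a sketch: the names exact_variance and exact_std_dev, and the choice to raise ArgumentError on an empty list, are assumptions rather than part of the answer.

```ruby
# Exact population variance using Rational arithmetic.
# Raising on an empty list is one possible policy (an assumption);
# returning nil would be another.
def exact_variance(list)
  raise ArgumentError, "list must not be empty" if list.empty?

  mean = list.reduce(:+) / list.length.to_r
  list.map { |i| (i - mean)**2 }.reduce(:+) / list.length
end

def exact_std_dev(list)
  # Math.sqrt returns a Float, so exactness ends at this step.
  Math.sqrt(exact_variance(list))
end

p exact_variance([1, 2, 3, 4])  # => (5/4)
p exact_std_dev([1, 2, 3, 4])
```

The variance stays a Rational for Integer or Rational input; only the final square root falls back to floating point.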
    
  • 2021-01-30 16:34

    I'm not a big fan of adding methods to Enumerable, since there could be unwanted side effects. It also gives methods that only make sense for an array of numbers to every class that includes Enumerable, which is inappropriate in most cases.

    While this is fine for tests, scripts or small apps, it's risky for larger applications, so here's an alternative based on @tolitius' answer which was already perfect. This is more for reference than anything else:

    module MyApp::Maths
      def self.sum(a)
        a.inject(0){ |accum, i| accum + i }
      end
    
      def self.mean(a)
        sum(a) / a.length.to_f
      end
    
      def self.sample_variance(a)
        m = mean(a)
        sum = a.inject(0){ |accum, i| accum + (i - m) ** 2 }
        sum / (a.length - 1).to_f
      end
    
      def self.standard_deviation(a)
        Math.sqrt(sample_variance(a))
      end
    end
    

    And then you use it as such:

    2.0.0p353 > MyApp::Maths.standard_deviation([1,2,3,4,5])
    => 1.5811388300841898
    
    2.0.0p353 :007 > a = [ 20, 23, 23, 24, 25, 22, 12, 21, 29 ]
     => [20, 23, 23, 24, 25, 22, 12, 21, 29]
    
    2.0.0p353 :008 > MyApp::Maths.standard_deviation(a)
     => 4.594682917363407
    
    2.0.0p353 :043 > MyApp::Maths.standard_deviation([1,2,2.2,2.3,4,5])
     => 1.466628787389638
    

    The behavior is the same as the Enumerable version, but it avoids the overhead and risks of adding methods to Enumerable.
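
To make the concern about Enumerable concrete, here is a small sketch. The method name demo_mean is hypothetical and exists only for this illustration. Once it is defined on Enumerable, hashes and ranges pick it up too, even though it only makes sense for collections of numbers:

```ruby
# Hypothetical method added to Enumerable, for illustration only.
module Enumerable
  def demo_mean
    inject(0) { |accum, i| accum + i } / count.to_f
  end
end

numbers = [1, 2, 3]
range   = (1..3)
pairs   = { a: 1 }

# Every class that includes Enumerable now responds to it.
puts numbers.respond_to?(:demo_mean)  # => true
puts range.respond_to?(:demo_mean)    # => true
puts pairs.respond_to?(:demo_mean)    # => true

# On a Hash the call fails, because each element is a [key, value] pair
# and an Integer cannot be added to an Array.
begin
  pairs.demo_mean
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```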

  • 2021-01-30 16:36
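    module Enumerable
    
        # Ruby 2.4+ already ships Enumerable#sum; this is only needed on older versions.
        def sum
          self.inject(0){|accum, i| accum + i }
        end
    
        # Uses count rather than length, because Enumerable itself does not define length.
        def mean
          self.sum/self.count.to_f
        end
    
        # Sample variance: divides by n - 1.
        def sample_variance
          m = self.mean
          sum = self.inject(0){|accum, i| accum +(i-m)**2 }
          sum/(self.count - 1).to_f
        end
    
        def standard_deviation
          Math.sqrt(self.sample_variance)
        end
    
    end 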
    

    Testing it:

    a = [ 20, 23, 23, 24, 25, 22, 12, 21, 29 ]
    a.standard_deviation  
    # => 4.594682917363407
    

    Edit (01/17/2012): fixed "sample_variance", thanks to Dave Sag.

  • 2021-01-30 16:37

    Or how about:

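    class Stats
        # Expose the results; without readers the computed values cannot be retrieved.
        attr_reader :avg, :stdev

        # Population standard deviation (divides by n). Array#sum needs Ruby 2.4+.
        def initialize( a )
            @avg = a.count > 0 ? a.sum / a.count.to_f : 0.0
            @stdev = a.count > 0 ? ( a.reduce(0){ |sum, v| sum + (@avg - v) ** 2 } / a.count ) ** 0.5 : 0.0
        end
    end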
    
  • 2021-01-30 16:41

    As a simple function, given a list of numbers:

    def standard_deviation(list)
      mean = list.inject(:+) / list.length.to_f
      var_sum = list.map{|n| (n-mean)**2}.inject(:+).to_f
      sample_variance = var_sum / (list.length - 1)
      Math.sqrt(sample_variance)
    end
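
Since the question is about records with a given attribute, one way to apply a function like this is to collect the attribute values into an array first. A minimal sketch, assuming a hypothetical Reading record with a value attribute (neither name comes from the question); the function is repeated so the example runs on its own:

```ruby
# Same sample standard deviation as in the answer (divides by n - 1).
def standard_deviation(list)
  mean = list.inject(:+) / list.length.to_f
  var_sum = list.map { |n| (n - mean)**2 }.inject(:+).to_f
  Math.sqrt(var_sum / (list.length - 1))
end

# Hypothetical record type standing in for the asker's records.
Reading = Struct.new(:value)
records = [1, 2, 3, 4, 5].map { |v| Reading.new(v) }

# Collect the attribute from each record, then compute the deviation.
values = records.map(&:value)
puts standard_deviation(values).round(4)  # => 1.5811
```

Note that this version still assumes at least two records; with fewer, the n - 1 divisor is zero or negative.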
    