standard-deviation | 易学教程

StDev() function returns Null when table contains only one row


Question: I am trying to use the StDev function and am getting blank results. I am using it as

    SELECT StDev(fldMean) FROM myTable

where fldMean contains a value of 2.3. I expected this to evaluate to 0, but I get an empty result instead. I can't work out how expressions are meant to be used in the function, and Microsoft's manual didn't help.

Answer: SELECT StDev(fldMean) FROM myTable returns Null if [myTable] has only one row, because the sample standard deviation (which divides by n - 1) cannot be computed from a single observation. You need at least two rows in that table before you can get a meaningful result. If
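The single-row behaviour described above follows from the n - 1 denominator of the sample formula. Here is a minimal Python sketch of that reasoning. The function names are hypothetical and this only mirrors the arithmetic; it is not how Access implements StDev.

```python
import math
from typing import Optional, Sequence


def sample_sd_or_none(values: Sequence[float]) -> Optional[float]:
    """Sample standard deviation (n - 1 denominator).

    Returns None for fewer than two observations, mirroring the Null
    that the answer describes for a one-row table.
    """
    n = len(values)
    if n < 2:
        return None  # n - 1 would be 0, so the estimate is undefined
    mean = sum(values) / n
    squared_dev = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_dev / (n - 1))


def population_sd(values: Sequence[float]) -> float:
    """Population standard deviation (n denominator); needs at least one value."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / n)


print(sample_sd_or_none([2.3]))   # None: one observation is not enough
print(population_sd([2.3]))       # 0.0: the value the asker expected
```

The 0 the asker expected is what the population formula gives. Access exposes that form as StDevP, so whether it is the right choice depends on whether the rows are a sample or the whole population.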

Removing outliers easily in R


Question: I have data with discrete x-values, such as

    x = c(3,8,13,8,13,3,3,8,13,8,3,8,8,13,8,13,8,3,3,8,13,8,13,3,3)
    y = c(4,5,4,6,7,20,1,4,6,2,6,8,2,6,7,3,2,5,7,3,2,5,7,3,2)

How can I generate a new dataset of x and y values that excludes every pair whose y-value is more than 2 standard deviations above the mean for its bin? For example, in the x=3 bin, 20 is more than 2 SDs above the mean, so that data point should be removed.

Answer 1: You want something like:

    # assumption: dat is a data frame built from the x and y vectors above
    dat <- data.frame(x, y)
    # within each x bin, keep the y-values below mean + 2 SD
    by(dat, dat$x, function(z) z$y[z$y < mean(z$y) + 2*sd(z$y)])

    dat$x: 3
    [1] 4 1 6 5 7 3 2
    ---------------------------------------------------------------
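The per-bin rule in the question (drop a pair when its y-value lies more than 2 sample SDs above the mean of its x bin) can also be sketched in plain Python. The function name `drop_high_outliers` and the parameter `n_sd` are hypothetical, chosen only for illustration. The sketch returns the surviving (x, y) pairs in their original order.

```python
from collections import defaultdict
from statistics import mean, stdev


def drop_high_outliers(xs, ys, n_sd=2.0):
    """Return (x, y) pairs whose y is not above mean + n_sd * SD of its x bin."""
    bins = defaultdict(list)
    for x, y in zip(xs, ys):
        bins[x].append(y)

    cutoff = {}
    for x, vals in bins.items():
        if len(vals) > 1:
            cutoff[x] = mean(vals) + n_sd * stdev(vals)
        else:
            # A single-point bin has no sample SD, so nothing is dropped from it.
            cutoff[x] = float("inf")

    return [(x, y) for x, y in zip(xs, ys) if y <= cutoff[x]]


x = [3, 8, 13, 8, 13, 3, 3, 8, 13, 8, 3, 8, 8, 13, 8, 13, 8, 3, 3, 8, 13, 8, 13, 3, 3]
y = [4, 5, 4, 6, 7, 20, 1, 4, 6, 2, 6, 8, 2, 6, 7, 3, 2, 5, 7, 3, 2, 5, 7, 3, 2]

kept = drop_high_outliers(x, y)
print(len(kept))  # 24: only the pair (3, 20) is removed
```

In the x=3 bin the mean and the sample SD are both 6, so the cutoff is 18 and only the value 20 exceeds it. The other two bins lose nothing.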

Calculate variation of IP addresses column using MySQL


Question: I'm trying to detect people using proxies to abuse my website. They often change proxies, but there is a clear pattern of them using one proxy address many times, far more than is normal for legitimate visitors. Most access to my website comes from unique IP addresses that have visited only once or a few times, not repeatedly. Let's say I have these IP addresses in a column:

    89.46.74.56
    89.46.74.56
    89.46.74.56
    91.14.37.249
    104.233.103.6

That would mean there are 3 uniques out of 5, giving a "uniqueness score" of 60%. How would I calculate this efficiently using
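The "uniqueness score" in the question is the number of distinct addresses divided by the total number of rows. A minimal Python sketch of that arithmetic follows. The function name `uniqueness_score` is hypothetical, and this illustrates the ratio only, not the MySQL query the asker is after.

```python
from collections import Counter


def uniqueness_score(addresses):
    """Fraction of distinct values among all values (0.0 for an empty list)."""
    if not addresses:
        return 0.0
    return len(set(addresses)) / len(addresses)


ips = [
    "89.46.74.56",
    "89.46.74.56",
    "89.46.74.56",
    "91.14.37.249",
    "104.233.103.6",
]

score = uniqueness_score(ips)
print(f"{score:.0%}")  # 60%

# The most repeated address is usually the interesting one for abuse detection.
print(Counter(ips).most_common(1))  # [('89.46.74.56', 3)]
```

In SQL the same ratio is typically written as COUNT(DISTINCT col) / COUNT(*) over the relevant rows, where `col` stands for whatever the address column is called.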

Detect major events in signal data?


Question: If I have a signal like the one below, how would I go about finding the beginning and end of the two "major events" (illustrated by a green arrow where the event begins, and a red arrow where it ends)? I've tried the method suggested in this answer, but no matter how much I play around with the lag, threshold and influence variables, it either reacts to the tiny changes at the beginning, middle and end of the graph (where there are no major events), or it doesn't react at all. I can't simply determine if the signal is above a fixed threshold, as the strength of the signal can

Function that converts a vector of numbers to a vector of standard units


Question: Is there a function that, given a vector of numbers, returns another vector with the standard units corresponding to each value? A standard unit is how many SDs a value lies above or below the mean. Example:

    x <- c(1,3,4,5,7)  # note: mean = 4, sd = 2
    foo(x)
    [1] -1.5 -0.5 0.0 0.5 1.5

Is this fictitious "foo" function already included in a package?

Answer 1: Yes, scale():

    x <- c(1,3,4,5,7)
    scale(x)

Answer 2: The function you are looking for is scale.

    scale(x)
         [,1]
    [1,] -1.3416408
    [2,] -0.4472136
    [3,] 0
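The asker's expected output (-1.5, ..., 1.5) and the scale() output (-1.34, ..., 1.34) differ because the question's "sd = 2" is the population SD (divide by n), while scale() divides by the sample SD (n - 1 denominator, here sqrt(5)). The Python sketch below reproduces both sets of numbers. The function name `standard_units` and its `sample` flag are hypothetical.

```python
from statistics import mean, pstdev, stdev


def standard_units(values, sample=True):
    """Convert values to z-scores: (value - mean) / SD.

    sample=True uses the n - 1 (sample) SD; sample=False uses the
    n (population) SD.
    """
    m = mean(values)
    sd = stdev(values) if sample else pstdev(values)
    return [(v - m) / sd for v in values]


x = [1, 3, 4, 5, 7]

# Population SD is 2, which gives the asker's expected values.
print(standard_units(x, sample=False))

# Sample SD is sqrt(5), which gives the values shown for scale(x).
print([round(z, 7) for z in standard_units(x, sample=True)])
```

Which divisor is appropriate depends on whether the vector is treated as the whole population or as a sample from it.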

How to efficiently calculate a moving Standard Deviation


Question: Below you can see my C# method to calculate Bollinger Bands for each point (moving average, upper band, lower band). As you can see, this method uses 2 for loops to calculate the moving standard deviation using the moving average. It used to contain an additional loop to calculate the moving average over the last n periods. I was able to remove that loop by adding the new point value to total_average at the beginning of the loop and removing the i - n point value at the end of the loop. My question now
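The running-total trick the asker describes for the moving average extends to the moving standard deviation by keeping a second running total of squared values. The Python sketch below shows that idea. It assumes the population SD (divide by n), and the names `rolling_mean_sd`, `total` and `total_sq` are hypothetical. It is not the asker's C# method, which is not shown in the excerpt.

```python
import math
from collections import deque


def rolling_mean_sd(values, n):
    """Moving mean and population SD over the last n points, in one pass.

    Returns a list aligned with `values`; entries are None until a full
    window of n points is available.
    """
    window = deque()
    total = 0.0      # running sum of the values in the window
    total_sq = 0.0   # running sum of their squares
    out = []
    for v in values:
        window.append(v)
        total += v
        total_sq += v * v
        if len(window) > n:
            old = window.popleft()  # drop the point that left the window
            total -= old
            total_sq -= old * old
        if len(window) == n:
            m = total / n
            # Var = E[x^2] - (E[x])^2; clamp tiny negatives from rounding.
            var = max(total_sq / n - m * m, 0.0)
            out.append((m, math.sqrt(var)))
        else:
            out.append(None)
    return out


print(rolling_mean_sd([1, 3, 4, 5, 7], 5)[-1])  # (4.0, 2.0)
```

Bollinger Bands would then be mean ± k × SD for a chosen multiplier k. The sum-of-squares form can lose precision when values are large relative to their spread, in which case a Welford-style update is the usual alternative.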

How can I do standard deviation in Ruby?


Question: I have several records with a given attribute, and I want to find the standard deviation. How do I do that?

Answer 1:

    module Enumerable
      # Sum of all elements
      def sum
        self.inject(0){|accum, i| accum + i }
      end

      # Arithmetic mean, forced to floating point
      def mean
        self.sum/self.length.to_f
      end

      # Sample variance (n - 1 denominator)
      def sample_variance
        m = self.mean
        sum = self.inject(0){|accum, i| accum +(i-m)**2 }
        sum/(self.length - 1).to_f
      end

      def standard_deviation
        Math.sqrt(self.sample_variance)
      end
    end

Testing it:

    a = [ 20, 23, 23, 24, 25, 22, 12, 21, 29 ]
    a.standard_deviation # => 4.594682917363407