binning | 易学教程

R - faster alternative to hist(XX, plot=FALSE)$count

Question: I am looking for a faster alternative to R's hist(x, breaks=XXX, plot=FALSE)$count, as I don't need any of the other output it produces. I want to use it inside an sapply call, which requires 1 million iterations of this function. For example:

    x = runif(100000000, 2.5, 2.6)
    bincounts = hist(x, breaks=seq(0,3,length.out=100), plot=FALSE)$count

Any thoughts?

Answer 1: A first attempt using table and cut:

    table(cut(x, breaks=seq(0,3,length.out=100)))

It avoids the extra

Applying my custom function to a data frame python

Question: I have a dataframe with a column called Signal. I want to add a new column to that dataframe by applying a custom function I've built. I'm very new at this, and I seem to be having trouble passing the values from a dataframe column into a function, so any help with my syntax errors or reasoning would be greatly appreciated!

    Signal
    3.98 3.78 -6.67 -17.6 -18.05 -14.48 -12.25 -13.9 -16.89 -13.3 -13.19 -18.63 -26.36 -26.23 -22.94 -23.23 -15.7

This is my simple

Hexbin: apply function for every bin

Question: I would like to build a hexbin plot where, for every bin, the "ratio between class 1 and class 2 points falling into this bin" is plotted (either log or not).

    x <- rnorm(10000)
    y <- rnorm(10000)
    h <- hexbin(x,y)
    plot(h)
    l <- as.factor(c( rep(1,2000), rep(2,8000) ))

Any suggestions on how to implement this? Is there a way to apply a function to every bin based on bin statistics?

Answer 1: @cryo111's answer has the most important ingredient - IDs = TRUE. After that it's just a matter of

Binning data into a hexagonal grid in Google Maps

Question: I'm trying to display geospatial data in a hexagonal grid on a Google Map. To do so, given a hexagon tile grid size X, I need to be able to convert {lat, lng} coordinates into the {lat, lng} centers of the hexagon grid tiles that contain them. In the end, I would like to be able to display data on a Google Map like this: Does anybody have any insight into how this is done? I've tried porting this Python hexagon binning script, binner.py, to JavaScript but it doesn't seem to be
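The point-to-tile-center conversion described in the question can be sketched as follows. This is a minimal illustration, not a port of binner.py: it assumes a pointy-top hexagon grid, treats lng as x and lat as y on a flat plane (so it ignores map projection distortion), and the names `hex_center` and `size` (center-to-corner distance, in degrees) are hypothetical.

```python
import math


def _cube_round(q, r):
    """Round fractional axial hex coordinates to the nearest hexagon."""
    x, z = q, r
    y = -x - z
    rx, ry, rz = round(x), round(y), round(z)
    dx, dy, dz = abs(rx - x), abs(ry - y), abs(rz - z)
    # Recompute the component with the largest rounding error so x + y + z == 0.
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return rx, rz


def hex_center(lat, lng, size):
    """Return the (lat, lng) center of the hexagon tile containing the point.

    Assumption: planar coordinates with x = lng and y = lat.
    """
    x, y = lng, lat
    # Planar point -> fractional axial coordinates (pointy-top layout).
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    q, r = _cube_round(q, r)
    # Axial coordinates -> center of that hexagon.
    cx = size * math.sqrt(3) * (q + r / 2)
    cy = size * 1.5 * r
    return cy, cx
```

Points that map to the same returned center fall into the same tile, so the center can serve as a grouping key when counting points per hexagon. For real map data, projecting to a planar coordinate system before binning would likely give more uniform tiles.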

Binning of data along one axis in numpy

Question: I have a large two-dimensional array arr which I would like to bin over the second axis using numpy. Because np.histogram flattens the array, I'm currently using a for loop:

    import numpy as np
    arr = np.random.randn(100, 100)
    nbins = 10
    binned = np.empty((arr.shape[0], nbins))
    for i in range(arr.shape[0]):
        binned[i,:] = np.histogram(arr[i,:], bins=nbins)[0]

I feel like there should be a more direct and more efficient way to do that within numpy, but I failed to find one.

Answer 1: You could use np
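The answer above is cut off, so the following is only one possible loop-free approach, not necessarily the one the answerer had in mind. It assumes all rows share the same bin edges, which differs from the loop in the question, where np.histogram picks a separate range for every row. The helper name `histogram_rows` is hypothetical.

```python
import numpy as np


def histogram_rows(arr, edges):
    """Histogram each row of a 2D array using one shared set of bin edges."""
    n_rows = arr.shape[0]
    n_bins = len(edges) - 1
    # Bin index of every element; bins are half-open [left, right).
    idx = np.searchsorted(edges, arr, side="right") - 1
    # Like np.histogram, the last bin also includes its right edge.
    idx[arr == edges[-1]] = n_bins - 1
    # Drop values that fall outside the edges.
    valid = (idx >= 0) & (idx < n_bins)
    rows = np.broadcast_to(np.arange(n_rows)[:, None], arr.shape)
    # Give every (row, bin) pair a unique flat index and count occurrences.
    flat = rows[valid] * n_bins + idx[valid]
    counts = np.bincount(flat, minlength=n_rows * n_bins)
    return counts.reshape(n_rows, n_bins)


# Usage mirroring the question, with edges spanning the whole array.
arr = np.random.randn(100, 100)
nbins = 10
edges = np.linspace(arr.min(), arr.max(), nbins + 1)
binned = histogram_rows(arr, edges)  # shape (100, 10)
```

If per-row ranges are required, as in the original loop, this sketch does not apply directly; np.apply_along_axis can wrap np.histogram in that case, though it still iterates row by row internally.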

Numpy rebinning a 2D array

Question: I am looking for a fast formulation for numerically binning a 2D numpy array. By binning I mean calculating submatrix averages or cumulative values. For example, x = numpy.arange(16).reshape(4, 4) would be split into 4 submatrices of 2x2 each, giving numpy.array([[2.5,4.5],[10.5,12.5]]), where 2.5=numpy.average([0,1,4,5]) etc. How can I perform such an operation efficiently? I don't really have any idea how to do this. Many thanks.

Answer 1: You can use a higher
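The answer above is truncated, so the following is a sketch of one common technique rather than a reconstruction of it: reshape the array into a 4D view whose extra axes index positions inside each block, then reduce over those axes. The function name `rebin` and its parameters are hypothetical, and the sketch assumes the array shape divides evenly by the block shape.

```python
import numpy as np


def rebin(a, block_rows, block_cols, func=np.mean):
    """Reduce non-overlapping block_rows x block_cols blocks of a 2D array."""
    rows, cols = a.shape
    if rows % block_rows or cols % block_cols:
        raise ValueError("array shape must be divisible by the block shape")
    # Axes 1 and 3 run over the positions inside each block.
    view = a.reshape(rows // block_rows, block_rows,
                     cols // block_cols, block_cols)
    return func(view, axis=(1, 3))


x = np.arange(16).reshape(4, 4)
averages = rebin(x, 2, 2)            # [[2.5, 4.5], [10.5, 12.5]]
totals = rebin(x, 2, 2, func=np.sum)  # cumulative value per block
```

Passing np.sum instead of np.mean gives the cumulative values the question also mentions.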

Create binned variable from results of class interval determination

Question: I want to create a binned variable out of a continuous variable. I want 10 bins, with break points taken from the results of a Jenks classification. How do I assign each value to one of these 10 bins?

    # dataframe w/ values (AllwdAmt)
    df <- structure(list(X = c(2078L, 2079L, 2080L, 2084L, 2085L, 2086L, 2087L, 2092L, 2093L, 2094L, 2095L, 4084L, 4085L, 4086L, 4087L, 4088L, 4089L, 4091L, 4092L, 4093L, 4094L, 4095L, 4096L, 4097L, 4098L, 4099L, 4727L, 4728L, 4733L, 4734L, 4739L, 4740L, 4741L,

How to Plot a Pre-Binned Histogram In R

Question: I have a pre-binned frequency table for a rather large dataset: a single column vector of bins and a single column vector of counts associated with those bins. I'd like R to plot a histogram of this data by binning further and summing the existing counts. For example, if in the pre-binned data I have something like [(0.01, 5000), (0.02, 231), (0.03, 948)], where the first number is the bin and the second is the count, and I choose 0.04 as the new bin width, I'd expect to get [
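The re-binning step described above (group the existing bins into wider ones and sum their counts) can be sketched independently of the plotting. The question is about R; this Python sketch only illustrates the logic. It assumes each bin label is a value inside the bin, that the new bins are [k*width, (k+1)*width), and it adds a small tolerance against floating-point division error. The name `rebin_counts` is hypothetical, and the result is not claimed to match the output the question expects, which is cut off.

```python
import math
from collections import defaultdict


def rebin_counts(pairs, width):
    """Sum pre-binned counts into wider bins of the given width.

    pairs: iterable of (bin_value, count).
    Returns a sorted list of (new_bin_left_edge, summed_count).
    """
    totals = defaultdict(int)
    for value, count in pairs:
        # Index of the wider bin; the tolerance guards against results
        # like 2.9999999999999996 when value is a multiple of width.
        k = math.floor(value / width + 1e-9)
        totals[k] += count
    return [(k * width, totals[k]) for k in sorted(totals)]


pre_binned = [(0.01, 5000), (0.02, 231), (0.03, 948)]
rebinned = rebin_counts(pre_binned, 0.04)
```

With these assumptions, all three example bins land in the first new bin. A different convention for what the bin label means (for instance, the right edge of the bin) would shift values that sit exactly on a boundary.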