Creating Probability/Frequency Axis Grid (Irregularly Spaced) with Matplotlib

深忆病人 2020-12-15 11:10

I'm trying to create a frequency curve plot, and I'm having trouble manipulating the axis to get the plot I want.

Here is an example of the desired grid/plot I am

2 Answers
  •  有刺的猬
    2020-12-15 12:01

    You can define a custom scale for the x-axis, which you can use instead of 'log'. Unfortunately, it's complicated and you'll need to figure out a function that lets you transform the numbers you give for the x-axis into something more linear. See http://matplotlib.org/examples/api/custom_scale_example.html.
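
As a lighter-weight variation on the custom-scale idea, matplotlib 3.1 and later also accept a forward/inverse function pair through `set_xscale('function', ...)`. The sketch below is not the code from the linked example. It assumes the x values are percentages strictly between 0 and 100 and uses a logit as the transform; the helper names `forward` and `inverse` are illustrative, and whether logit spacing matches the desired grid is itself an assumption (a normal-quantile function could be swapped in).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def forward(p):
    """Map a percentage in (0, 100) to its logit (assumed transform)."""
    p = np.asarray(p, dtype=float)
    # Silence warnings for values outside the open interval (0, 100).
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.log(p / (100.0 - p))


def inverse(z):
    """Map a logit position back to a percentage."""
    z = np.asarray(z, dtype=float)
    return 100.0 / (1.0 + np.exp(-z))


points = [.2, .5, 1, 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 98]
x = np.linspace(0.2, 98, 500)

fig, ax = plt.subplots()
ax.plot(x, x)
# Plot first, then switch the scale, so autoscaling only sees valid values.
ax.set_xscale("function", functions=(forward, inverse))
ax.set_xlim(0.2, 98)
ax.set_xticks(points)   # put grid lines at the requested probabilities
ax.grid(True)
fig.canvas.draw()       # force a render to exercise the transform
```

This route avoids writing a `ScaleBase` subclass, at the cost of choosing the transform by hand rather than fitting it to the tick list.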

    Edit to add:

    The problem was so interesting I decided to figure out if I could make the custom axis myself. I altered the code from the link to work with your example. I'd be interested to see whether it works the way you want.

    Edit: New and improved(?) code! The spacing isn't quite as even as before, but it's now done automatically when you pass a list of points to plt.gca().set_xscale (see near the end of the code for example). It does a curve fit to fit those points to a logistic function and uses the resulting parameters as the basis for the transformation. I get a warning when I run this code (Warning: converting a masked element to nan). I still haven't figured out what's going on there, but it doesn't seem to be causing problems. Here's the figure that I generated:

    [Figure: matplotlib figure with custom axis]

    import numpy as np
    from numpy import ma
    from matplotlib import scale as mscale
    from matplotlib import transforms as mtransforms
    from matplotlib.ticker import FixedLocator
    from scipy.optimize import curve_fit
    
    def logistic(x, L, k, x0):
        """Logistic function (s-curve)."""
        return L / (1 + np.exp(-k * (x - x0)))
    
    class ProbabilityScale(mscale.ScaleBase):
        """
        Scales data so that points along a logistic curve become evenly spaced.
        """
    
        # The scale class must have a member ``name`` that defines the
        # string used to select the scale.  For example,
        # ``gca().set_yscale("probability")`` would be used to select this
        # scale.
        name = 'probability'
    
    
        def __init__(self, axis, **kwargs):
            """
            Any keyword arguments passed to ``set_xscale`` and
            ``set_yscale`` will be passed along to the scale's
            constructor.
    
            lower_bound: Minimum value of x. Defaults to .01.
            upper_bound_dist: L - upper_bound_dist is the maximum value
            of x. Defaults to lower_bound.
    
            """
            # matplotlib >= 3.1 expects the axis to be passed to the base class.
            super().__init__(axis)
            lower_bound = kwargs.pop("lower_bound", .01)
            if lower_bound <= 0:
                raise ValueError("lower_bound must be greater than 0")
            self.lower_bound = lower_bound
            upper_bound_dist = kwargs.pop("upper_bound_dist", lower_bound)
            self.points = kwargs['points']
            # Determine the logistic parameters by curve fitting.
            x = np.linspace(0, 1, len(self.points))
            # Initial guess for the parameters (L, k, x0).
            p0 = [max(self.points), 1, .5]
            popt, pcov = curve_fit(logistic, x, self.points, p0=p0)
            self.L, self.k, self.x0 = popt
            self.upper_bound = self.L - upper_bound_dist
    
        def get_transform(self):
            """
            Override this method to return a new instance that does the
            actual transformation of the data.
    
            The ProbabilityTransform class is defined below as a
            nested class of this one.
            """
            return self.ProbabilityTransform(self.lower_bound, self.upper_bound, self.L, self.k, self.x0)
    
        def set_default_locators_and_formatters(self, axis):
            """
            Override to set up the locators and formatters to use with the
            scale.  This is only required if the scale requires custom
            locators and formatters.  Writing custom locators and
            formatters is rather outside the scope of this example, but
            there are many helpful examples in ``ticker.py``.
            """
    
    
            axis.set_major_locator(FixedLocator(self.points))
    
        def limit_range_for_scale(self, vmin, vmax, minpos):
            """
            Override to limit the bounds of the axis to the domain of the
            transform.  In this case, the bounds are clamped to
            ``lower_bound`` and ``upper_bound``.  Unlike the
            autoscaling provided by the tick locators, this range limiting
            will always be adhered to, whether the axis range is set
            manually, determined automatically or changed through panning
            and zooming.
            """
            return max(vmin, self.lower_bound), min(vmax, self.upper_bound)
    
        class ProbabilityTransform(mtransforms.Transform):
            # There are two value members that must be defined.
            # ``input_dims`` and ``output_dims`` specify number of input
            # dimensions and output dimensions to the transformation.
            # These are used by the transformation framework to do some
            # error checking and prevent incompatible transformations from
            # being connected together.  When defining transforms for a
            # scale, which are, by definition, separable and have only one
            # dimension, these members should always be set to 1.
            input_dims = 1
            output_dims = 1
            is_separable = True
    
            def __init__(self, lower_bound, upper_bound, L, k, x0):
                mtransforms.Transform.__init__(self)
                self.lower_bound = lower_bound
                self.L = L
                self.k = k
                self.x0 = x0
                self.upper_bound = upper_bound
            def transform_non_affine(self, a):
                """
                This transform takes an Nx1 ``numpy`` array and returns a
                transformed copy.  Since the range of the scale
                is limited to ``lower_bound``..``upper_bound``, the input
                array must be masked to contain only valid values.
                ``matplotlib`` will handle masked arrays and remove the
                out-of-range data from the plot.  Importantly, the
                ``transform`` method *must* return an array that is the
                same shape as the input array, since these values need to
                remain synchronized with values in the other dimension.
                """
                masked = ma.masked_where((a < self.lower_bound) | (a > self.upper_bound), a)
                return ma.log((self.L - masked) / masked) / -self.k + self.x0
    
            def inverted(self):
                """
                Override this method so matplotlib knows how to get the
                inverse transform for this transform.
                """
                return ProbabilityScale.InvertedProbabilityTransform(self.lower_bound, self.upper_bound, self.L, self.k, self.x0)
    
        class InvertedProbabilityTransform(mtransforms.Transform):
            input_dims = 1
            output_dims = 1
            is_separable = True
    
            def __init__(self, lower_bound, upper_bound, L, k, x0):
                mtransforms.Transform.__init__(self)
                self.lower_bound = lower_bound
                self.L = L
                self.k = k
                self.x0 = x0
                self.upper_bound = upper_bound
    
            def transform_non_affine(self, a):
                return self.L / (1 + np.exp(-self.k * (a - self.x0)))
            def inverted(self):
                return ProbabilityScale.ProbabilityTransform(self.lower_bound, self.upper_bound, self.L, self.k, self.x0)
    
    # Now that the Scale class has been defined, it must be registered so
    # that ``matplotlib`` can find it.
    mscale.register_scale(ProbabilityScale)
    
    
    if __name__ == '__main__':
        import matplotlib.pyplot as plt
        x = np.linspace(.1, 100, 1000)
        points = np.array([.2,.5,1,2,5,10,20,30,40,50,60,70,80,90,95,98])
    
        plt.plot(x, x)
        # The constructor reads ``lower_bound``; a ``vmin`` keyword would be silently ignored.
        plt.gca().set_xscale('probability', points=points, lower_bound=.01)
        plt.grid(True)
    
        plt.show()
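
The pair of formulas used by `ProbabilityTransform` and `InvertedProbabilityTransform` can be checked in isolation. The sketch below is a minimal standalone version that needs only NumPy. The parameter values `L`, `k`, and `x0` are illustrative assumptions rather than the values `curve_fit` returns, and the helper names `to_axis` and `to_data` are hypothetical.

```python
import numpy as np

# Illustrative parameters only (assumed, not the fitted values).
L, k, x0 = 100.0, 7.0, 0.55


def to_axis(p):
    """Data value -> axis position (inverse of the logistic function)."""
    p = np.asarray(p, dtype=float)
    return x0 - np.log((L - p) / p) / k


def to_data(u):
    """Axis position -> data value (the logistic function itself)."""
    u = np.asarray(u, dtype=float)
    return L / (1.0 + np.exp(-k * (u - x0)))


points = np.array([.2, .5, 1, 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 98])
positions = to_axis(points)

# Axis positions increase with the data values...
print(np.all(np.diff(positions) > 0))
# ...and mapping back through the logistic recovers the original values.
print(np.allclose(to_data(positions), points))
```

Values at or beyond `0` and `L` have no finite position under this mapping, which is why the scale masks everything outside `lower_bound` and `upper_bound` before applying the logarithm.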
    
