Draw a plot in which the Y-axis is text data (not numeric) and the X-axis is numeric data

说谎 2021-01-29 09:59

I can create a simple column chart in matplotlib from a 'simple' dictionary:

import matplotlib.pyplot as plt
D = {u'Label1':26, u'Label2'


        
3 Answers
  • 2021-01-29 10:40

    You may use numpy to convert the dictionary to an array with two columns, which can be plotted.

    import matplotlib.pyplot as plt
    import numpy as np
    
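    T_OLD = {'10' : 'need1', '11':'need2', '12':'need1', '13':'need2','14':'need1'}
    # 2 x N array of strings: row 0 holds the keys, row 1 the values
    x = np.array(list(zip(*T_OLD.items())))
    # sort the columns numerically by key (the dictionary need not be in key order), then transpose to N x 2
    x = x[:, np.argsort(x[0].astype(float))].T
    # let the second column be 1 if "need2", else 0
    x[:,1] = (x[:,1] == "need2").astype(int)
    # the array still holds strings; convert it so that numbers, not categories, are plotted
    x = x.astype(float)
    
    # plot the two columns of the array
    plt.plot(x[:,0], x[:,1])
    # set the labels accordingly
    plt.gca().set_yticks([0,1])
    plt.gca().set_yticklabels(['need1', 'need2'])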
    
    plt.show()
    

    The following version does not depend on the actual content of the dictionary; the only assumption is that the keys can be converted to floats.

    import matplotlib.pyplot as plt
    import numpy as np
    
    T_OLD = {'10': 'run', '11': 'tea', '12': 'mathematics', '13': 'run', '14' :'chemistry'}
    x = np.array(list(zip(*T_OLD.items())))
    # u holds the sorted distinct labels, ind the index of each value within u
    u, ind = np.unique(x[1,:], return_inverse=True)
    x[1,:] = ind
    # convert to numbers first so the keys are sorted numerically, then transpose to N x 2
    x = x.astype(float)
    x = x[:,np.argsort(x[0])].T
    
    # plot the two columns of the array
    plt.plot(x[:,0], x[:,1])
    # set the labels accordingly
    plt.gca().set_yticks(range(len(u)))
    plt.gca().set_yticklabels(u)
    
    plt.show()
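
    The encoding step in the second version can also be written without numpy. The block below is only a sketch: `encode_labels` is a hypothetical helper name, not part of the answer or of any library, and it mirrors what `np.unique(..., return_inverse=True)` does for a one-dimensional list of strings.

```python
# Hypothetical helper (not from the answer): encode text labels as integer codes.
def encode_labels(values):
    """Return (codes, labels): labels are the sorted distinct values,
    and codes[i] is the index of values[i] within labels."""
    labels = sorted(set(values))
    position = {label: i for i, label in enumerate(labels)}
    codes = [position[v] for v in values]
    return codes, labels


T_OLD = {'10': 'run', '11': 'tea', '12': 'mathematics', '13': 'run', '14': 'chemistry'}

# sort the items numerically by key so that the x values increase
items = sorted(T_OLD.items(), key=lambda kv: float(kv[0]))
xs = [float(k) for k, _ in items]
codes, labels = encode_labels([v for _, v in items])
```

    `xs` and `codes` can then be passed to `plt.plot`, with `range(len(labels))` and `labels` handed to `set_yticks` and `set_yticklabels` as in the code above.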
    

  • 2021-01-29 10:40

    Use numeric values for your y-axis ticks, and then map them to desired strings with plt.yticks():

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    
    # example data
    times = pd.date_range(start='2017-10-17 00:00', end='2017-10-17 5:00', freq='h')
    data = np.random.choice([0,1], size=len(times))
    data_labels = ['need1','need2']
    
    fig, ax = plt.subplots()
    ax.plot(times, data, marker='o', linestyle="None")
    # one tick per category: positions 0 and 1, labelled need1 and need2
    plt.yticks([0,1], data_labels)
    plt.xlabel("time")
    plt.show()
    

    Note: It's generally not a good idea to use a line graph to represent categorical changes over time (e.g. from need1 to need2). A line gives the visual impression of a continuum between time points, which may not be accurate. Here, I changed the plotting style to points instead of lines. If for some reason you need the lines, just remove linestyle="None" from the call to ax.plot().

    UPDATE
    (per comments)

    To make this work with a y-axis category set of arbitrary length, use ax.set_yticks() and ax.set_yticklabels() to map to y-axis values.

    For example, given a set of potential y-axis labels, let N be the size of a subset of those labels (here we'll set it to 4, but it could be any size).

    Then draw a random sample of y values and plot it against time, labeling the y-axis ticks from the full set of labels. Note that we still call set_yticks() first with numerical positions, and then replace them with our category labels using set_yticklabels().

    labels = np.array(['A','B','C','D','E','F','G'])
    N = 4
    
    # example data
    times = pd.date_range(start='2017-10-17 00:00', end='2017-10-17 5:00', freq='h')
    # draw only from the first N categories; the ticks below still show the full set
    data = np.random.choice(np.arange(N), size=len(times))
    
    fig, ax = plt.subplots(figsize=(15,10))
    ax.plot(times, data, marker='o', linestyle="None")
    ax.set_yticks(np.arange(len(labels)))
    ax.set_yticklabels(labels)
    plt.xlabel("time")
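
    If the observations arrive as strings rather than as integer codes, the same idea applies once each label is mapped to its tick position. The following is a minimal sketch under that assumption; the `observed` list and the name `tick_of` are made up for illustration.

```python
# Full set of categories, in the order they should appear on the y-axis (bottom to top).
labels = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
tick_of = {label: pos for pos, label in enumerate(labels)}

# Hypothetical observations that use only some of the categories.
observed = ['C', 'A', 'A', 'D', 'B', 'C']

# Numeric y values to plot, and one tick position per category in the full set.
y = [tick_of[label] for label in observed]
ticks = list(range(len(labels)))
```

    Plotting `y` against the time values and then calling `ax.set_yticks(ticks)` followed by `ax.set_yticklabels(labels)` keeps every category on the axis, including those that never occur in the data.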
    
  • 2021-01-29 10:53

    This gives the exact desired plot:

    import matplotlib.pyplot as plt
    from collections import OrderedDict
    
    T_OLD = {'10' : 'need1', '11':'need2', '12':'need1', '13':'need2','14':'need1'}
    T_SRT = OrderedDict(sorted(T_OLD.items(), key=lambda t: t[0]))
    
    plt.plot(map(int, T_SRT.keys()), map(lambda x: int(x[-1]), T_SRT.values()),'r')
    
    plt.ylim([0.9,2.1])
    ax = plt.gca()
    ax.set_yticks([1,2])
    ax.set_yticklabels(['need1', 'need2'])
    
    plt.title('T_OLD')
    plt.xlabel('time')
    plt.ylabel('need')
    
    plt.show()
    

    For Python 3.x, the plotting line needs to convert the map() output to lists explicitly:

    plt.plot(list(map(int, T_SRT.keys())), list(map(lambda x: int(x[-1]), T_SRT.values())),'r')
    

    as in Python 3.X map() returns an iterator as opposed to a list in Python 2.7.
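
    The difference is easy to see in isolation. This small sketch (variable names are illustrative only) shows that a Python 3 map object is a one-shot iterator, which is why it is wrapped in list() before being handed to the plotting call:

```python
keys = ['10', '11', '12']

lazy = map(int, keys)          # Python 3: an iterator, nothing is converted yet
first_pass = list(lazy)        # consumes the iterator
second_pass = list(lazy)       # the iterator is already exhausted, so this is empty

x_val = list(map(int, keys))   # materialise once, then reuse freely
```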

    The plot uses the dictionary keys converted to ints, and the last character of need1 or need2, also converted to an int. This relies on the particular structure of your data; if the values were need1 and need3, it would need a couple more operations.

    After plotting and changing the axes limits, the program simply modifies the tick labels at y positions 1 and 2. It then also adds the title and the x and y axis labels.

    The important part is that the input data has to be sorted. One way to do that is to use an OrderedDict. Here T_SRT is an OrderedDict built from the items of T_OLD sorted by key.

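    The output is a red line plot titled 'T_OLD', with 'time' on the x-axis and 'need' (ticks need1 and need2) on the y-axis. [Figure not reproduced.]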
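
    A side note on the sorting step: the key function `lambda t: t[0]` compares the keys as strings, which works for '10' to '14' but would misorder keys with different numbers of digits. A hedged sketch with made-up keys, sorting by the integer value instead:

```python
from collections import OrderedDict

# Hypothetical data: keys with different numbers of digits.
T = {'9': 'need1', '10': 'need2', '100': 'need1'}

# String comparison puts '10' and '100' before '9'.
by_string = [k for k, _ in sorted(T.items(), key=lambda t: t[0])]

# Comparing the integer value of each key gives the intended order.
by_number = [k for k, _ in sorted(T.items(), key=lambda t: int(t[0]))]

T_SRT = OrderedDict(sorted(T.items(), key=lambda t: int(t[0])))
```

    Since Python 3.7 a plain dict also preserves insertion order, so `dict(sorted(...))` would serve the same purpose there.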

    This is a more general case for more values/labels in T_OLD. It assumes that the label is always 'needX', where X is any number. The same can readily be done for any string preceding the number, though it would require more processing:

    import matplotlib.pyplot as plt
    from collections import OrderedDict
    import re
    
    T_OLD = {'10' : 'need1', '11':'need8', '12':'need11', '13':'need1','14':'need3'}
    T_SRT = OrderedDict(sorted(T_OLD.items(), key=lambda t: t[0]))
    
    x_val = list(map(int, T_SRT.keys()))
    y_val = list(map(lambda x: int(re.findall(r'\d+', x)[-1]), T_SRT.values()))
    
    plt.plot(x_val, y_val,'r')
    
    plt.ylim([0.9*min(y_val),1.1*max(y_val)])
    ax = plt.gca()
    y_axis = list(set(y_val))
    ax.set_yticks(y_axis)
    ax.set_yticklabels(['need' + str(i) for i in y_axis])
    
    plt.title('T_OLD')
    plt.xlabel('time')
    plt.ylabel('need')
    
    plt.show()
    

    This solution finds the number at the end of the label using re.findall, to accommodate multi-digit numbers. The previous solution just took the last character of the string because the numbers were single digits. It still assumes that the number giving the plotting position is the last number in the string, hence the [-1]. Again, for Python 3.x the map output is explicitly converted to a list, a step that is not necessary in Python 2.7.
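
    To make the difference between the two extraction approaches concrete, here is a small sketch. `trailing_number` is a hypothetical helper name, and the label 'v2_need8' is invented to show the "last number wins" behaviour.

```python
import re


def trailing_number(label):
    """Return the last run of digits in label as an int (hypothetical helper)."""
    return int(re.findall(r'\d+', label)[-1])


single_digit = int('need3'[-1])        # earlier approach: last character only
wrong = int('need11'[-1])              # the last character of 'need11' is just '1'
right = trailing_number('need11')      # the regex keeps the whole number
mixed = trailing_number('v2_need8')    # with several numbers, the last one is used
```

    Note that a label without any digits makes `re.findall` return an empty list, so the `[-1]` would raise an IndexError.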

    The labels are now generated by first selecting the unique y-values using set, and then building each label by concatenating the string 'need' with the corresponding integer.

    The limits of the y-axis are set to 0.9 times the minimum value and 1.1 times the maximum value. The rest of the formatting is as before.

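    The result for this test case is a red line plot titled 'T_OLD', with 'time' on the x-axis and 'need' (ticks need1, need3, need8 and need11) on the y-axis. [Figure not reproduced.]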
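
    On the tick labels and axis limits described above: the multiplicative limits work for this data, but scaling gives no margin when the minimum is 0 and moves in the wrong direction for negative values. The sketch below computes the ticks as in the answer and, as an assumption of mine rather than part of the answer, shows a range-based padding (`padded_limits` is a hypothetical helper).

```python
y_val = [1, 8, 11, 1, 3]   # positions extracted from the labels in the example

ticks = sorted(set(y_val))                      # unique positions, in increasing order
tick_labels = ['need' + str(i) for i in ticks]  # one label per tick


def padded_limits(values, pad=0.1):
    """Hypothetical alternative: pad by a fraction of the data range,
    which also works when the minimum is 0 or negative."""
    low, high = min(values), max(values)
    margin = pad * (high - low) if high > low else 1.0
    return low - margin, high + margin


scaled = (0.9 * min(y_val), 1.1 * max(y_val))   # the answer's approach
padded = padded_limits(y_val)                   # range-based alternative
```

    Either pair can be passed to `plt.ylim`, and `ticks` with `tick_labels` go to `set_yticks` and `set_yticklabels`.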
