Plotting two variables then coloring by a third variable

问题

I have a dataset from an aircraft flight and I am trying to plot the position of the plane (longitude x latitude) then color that line by the altitude of the plan at those coordinates. My code looks like this:

lat_data = np.array( [ 39.916294, 39.87139 , 39.8005  , 39.70801 , 39.64645 , 39.58172 ,
       39.537853, 39.55141 , 39.6787  , 39.796528, 39.91702 , 40.008347,
       40.09513 , 40.144157, 40.090584, 39.96447 , 39.838924, 39.712112,
       39.597103, 39.488377, 39.499096, 39.99354 , 40.112175, 39.77281 ,
       39.641186, 39.51512 , 39.538853, 39.882736, 39.90413 , 39.811333,
       39.73279 , 39.65676 , 39.584026, 39.5484  , 39.54484 , 39.629486,
       39.96    , 40.07143 , 40.187405, 40.304718, 40.423153, 40.549305,
       40.673313, 40.794548, 40.74402 , 40.755558, 40.770306, 40.73574 ,
       40.795086, 40.774628] )

long_data = np.array( [ -105.13034 , -105.144104, -105.01132 , -104.92708 , -104.78505 ,
       -104.6449  , -104.49255 , -104.36578 , -104.32623 , -104.31285 ,
       -104.32199 , -104.41774 , -104.527435, -104.673935, -104.81152 ,
       -104.82184 , -104.81882 , -104.81314 , -104.74657 , -104.78108 ,
       -104.93442 , -104.98039 , -105.0168  , -105.04967 , -105.056564,
       -105.03639 , -105.13429 , -105.05214 , -105.17435 , -105.070526,
       -104.93587 , -104.80029 , -104.65973 , -104.50339 , -104.33972 ,
       -104.21634 , -103.96216 , -103.84808 , -103.72534 , -103.60455 ,
       -103.48926 , -103.376495, -103.25937 , -103.10858 , -103.08469 ,
       -103.24878 , -103.4169  , -103.53073 , -103.23694 , -103.41254 ] )

altitude_data = np.array( [1.6957603e+00,  1.9788861e+00,  1.8547169e+00,  1.8768315e+00,
        1.9633590e+00,  2.0504241e+00,  2.1115899e+00,  2.1085002e+00,
        1.8621666e+00,  1.8893014e+00,  1.8268168e+00,  1.7574688e+00,
        1.7666028e+00,  1.7682364e+00,  1.8120643e+00,  1.7637002e+00,
        1.8054264e+00,  1.9149075e+00,  2.0173934e+00,  2.0875392e+00,
        2.1486480e+00,  1.8622510e+00,  1.7937366e+00,  1.8748144e+00,
        1.9063262e+00,  1.9397615e+00,  2.1261981e+00,  2.0180094e+00,
        1.9827688e+00, -9.9999990e+06,  1.8933343e+00,  1.9615903e+00,
        2.1000245e+00,  2.1989927e+00,  2.3200927e+00, -9.9999990e+06,
        4.0542388e+00,  4.0591464e+00,  4.0597038e+00,  4.3395977e+00,
        4.6702847e+00,  5.0433373e+00,  5.2824092e+00,  5.2813010e+00,
        5.2735353e+00,  5.2784677e+00,  5.2784038e+00,  5.2795196e+00,
        4.9482727e+00,  4.2531524e+00] )

import matplotlib as plt    

fig, ax1 = plt.subplots( figsize = ( 10, 10 ) )
ax1.plot( long_data, lat_data, alpha = .4)
ax1.scatter( long_data, lat_data, c = altitude_data )
plt.show()

Which gives us this track: .

Is there a way to consolidate the data into one line that plots the location of the aircraft and adjusts the color for the elevation?

While plotting a line and a scatter together works, it does not look very good when I put in all the data (n = 2400 ). Thanks!

回答1:

Update
As discussed, here now the code without a for loop and including a fourth category, e.g., acceleration. Now the code uses Line3DCollection to generate the trajectory and a custom-made color map with LinearSegmentedColormap to indicate the fourth category (acceleration):

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d.art3d import Line3DCollection
from matplotlib.colors import LinearSegmentedColormap

fig = plt.figure(figsize=(12,12))
ax = fig.gca(projection='3d')

#rolling average between two acceleration data points
aver_accel = np.convolve(acceleration_data, np.ones((2,))/2, mode='valid')     

#custom colour map to visualize acceleartion and decelaration
cmap_bgr = LinearSegmentedColormap.from_list("bluegreyred", ["red", "lightgrey", "lightgrey", "blue"])

#creating the trajectory as line segments
points = np.transpose([lat_data, long_data, altitude_data])
window = (2, 3)
view_shape = (len(points) - window[0] + 1,) + window 
segments = np.lib.stride_tricks.as_strided(points, shape = view_shape, strides = (points.itemsize,) + points.strides)
trajectory = Line3DCollection(segments, cmap=cmap_bgr, linewidth=3)
#set the colour according to the acceleration data
trajectory.set_array(aver_accel)
#add line collection and plot color bar for acceleration
cb = ax.add_collection(trajectory)
cbar = plt.colorbar(cb, shrink=0.5)
cbar.set_label("acceleration", rotation=270)

#let's call it "autoscale"
ax.set_xlim(min(lat_data), max(lat_data))
ax.set_ylim(min(long_data), max(long_data))
ax.set_zlim(min(altitude_data), max(altitude_data))

ax.set_xlabel("latitude")
ax.set_ylabel("longitude")
ax.set_zlabel("altitude")

plt.show()

Sample output (with arbitrary acceleration data):

Thanks to the tailored colormap, one can clearly see acceleration and deceleration phases. Since we directly use the array, a colorbar for calibration can be easily added. Mind you, you still have the variable linewidth that also takes an array (for instance for velocity), although this will probably then be difficult to read. There is also substantial time gain in the generation of large-scale 3D line collections thanks to this marvellous answer.

For comparison, here the 2D view as produced by the other answers:

Original answer
Since you have 3D data, why not create a 3D projection? You can always move the view into a 2D projection if you feel like it. To avoid the problem that the color is defined by the first point of each line (i.e., a steep ascent would look different from a steep descent), this program determines the middle point of each line for the color-coded altitude calculation. Disadvantages: Uses a slow for loop, and the altitude colors are normalized between 0 and 1 (which doesn't matter here because altitude is overdetermined in this 3D projection but will become a problem if you want to color-code another parameter).

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm

fig = plt.figure(figsize=(10,10))
ax = fig.gca(projection='3d')

min_alt = np.min(altitude_data)
max_alt = np.max(altitude_data)
#generate normalized altitude array for colour code
#the factor 0.95 filters out the end of this colormap
cols_raw = 0.95 * (altitude_data-min_alt) / (max_alt-min_alt) 
#rolling average between two data point colors
cols = np.convolve(cols_raw, np.ones((2,))/2, mode='valid')     

for i, col in enumerate(cols):
    ax.plot(lat_data[i:i+2], long_data[i:i+2], altitude_data[i:i+2], c=cm.gnuplot(col))

ax.set_xlabel("latitude")
ax.set_ylabel("longitude")
ax.set_zlabel("altitude")

plt.show()

The sample data for the above outputs:

lat_data = np.array( [ 39.916294, 39.87139 , 39.8005  , 39.70801 , 39.64645 , 39.58172 ,
     39.537853, 39.55141 , 39.6787  , 39.796528, 39.91702 , 40.008347,
     40.09513 , 40.144157, 40.090584, 39.96447 , 39.838924, 39.712112,
     39.597103, 39.488377, 39.499096, 39.99354 , 40.112175, 39.77281 ,
     39.641186, 39.51512 , 39.538853, 39.882736, 39.90413 , 39.811333,
     39.73279 , 39.65676 , 39.584026, 39.5484  , 39.54484 , 39.629486,
     39.96    , 40.07143 , 40.187405, 40.304718, 40.423153, 40.549305,
     40.673313, 40.794548, 40.74402 , 40.755558, 40.770306, 40.73574 ,
     40.795086, 40.774628] )
  
long_data = np.array( [ -105.13034 , -105.144104, -105.01132 , -104.92708 , -104.78505 ,
       -104.6449  , -104.49255 , -104.36578 , -104.32623 , -104.31285 ,
       -104.32199 , -104.41774 , -104.527435, -104.673935, -104.81152 ,
       -104.82184 , -104.81882 , -104.81314 , -104.74657 , -104.78108 ,
       -104.93442 , -104.98039 , -105.0168  , -105.04967 , -105.056564,
       -105.03639 , -105.13429 , -105.05214 , -105.17435 , -105.070526,
       -104.93587 , -104.80029 , -104.65973 , -104.50339 , -104.33972 ,
       -104.21634 , -103.96216 , -103.84808 , -103.72534 , -103.60455 ,
       -103.48926 , -103.376495, -103.25937 , -103.10858 , -103.08469 ,
       -103.24878 , -103.4169  , -103.53073 , -103.23694 , -103.41254 ] )

altitude_data = np.array( [1.6957603e+00,  1.9788861e+00,  1.8547169e+00,  1.8768315e+00,
        1.9633590e+00,  2.0504241e+00,  2.1115899e+00,  2.1085002e+00,
        1.8621666e+00,  1.8893014e+00,  1.8268168e+00,  1.7574688e+00,
        1.7666028e+00,  1.7682364e+00,  1.8120643e+00,  1.7637002e+00,
        1.8054264e+00,  1.9149075e+00,  2.0173934e+00,  2.0875392e+00,
        2.1486480e+00,  1.8622510e+00,  1.7937366e+00,  1.8748144e+00,
        1.9063262e+00,  1.9397615e+00,  2.1261981e+00,  2.0180094e+00,
        1.9827688e+00,  1.9999990e+00,  1.8933343e+00,  1.9615903e+00,
        2.1000245e+00,  2.1989927e+00,  2.3200927e+00,  2.9999990e+00,
        4.0542388e+00,  4.0591464e+00,  4.0597038e+00,  4.3395977e+00,
        4.6702847e+00,  5.0433373e+00,  5.2824092e+00,  5.2813010e+00,
        5.2735353e+00,  5.2784677e+00,  5.2784038e+00,  5.2795196e+00,
        4.9482727e+00,  4.2531524e+00] )

acceleration_data = np.array( 
    [1,   2,   2,   3,
     3,   3,   2,   2,
     2,   2,   4,   5,
     4,   3,   4,   3,
     3,   3,   3,   4,
     3,   3,   4,   5,
     4,   4,   4,   5,
     4,   15,  26,  49,
     67,  83,  89,  72,
     77,  63,  75,  82,
     69,  37,  5,  -29,
     -37, -27, -29, -14,
     9,   4] )

回答2:

So, I have something that is pretty close. there will be some missing/averaging of altitude data though.

from matplotlib import pyplot as plt
import matplotlib
import matplotlib.cm as cm
#... define arrays ...

fig, ax1 = plt.subplots( figsize = ( 10, 10 ) )
minima = min(altitude_data)
maxima = max(altitude_data)

norm = matplotlib.colors.Normalize(vmin=0, vmax=maxima, clip=True)
mapper = cm.ScalarMappable(norm=norm, cmap=cm.summer)

pointsPerColor = 2

for x in range(len(lat_data)//pointsPerColor):
    startIndex = x * pointsPerColor
    stopIndex = startIndex + pointsPerColor + 1

    #get color for this section
    avgAltitude = sum(altitude_data[startIndex:stopIndex])/pointsPerColor
    rbga = mapper.to_rgba(avgAltitude)

    #plot section (leng)
    ax1.plot( long_data[startIndex:stopIndex], 
            lat_data[startIndex:stopIndex], 
            alpha=.7,color=rbga )

plt.show()

So what's happening in order is..

get min & max of your altitude & use that to make a color mapper there's several color options
determine interval. need atleast 2 points to make a line obviously
loop for (number of points)/pointsPerColor (need to do integer division) a. get average color b. plot segment with color

thats it!.. I probably could've done this a lil prettier but it works also.. those super low values messed the mapping..so I just set min to 0

line plot with color scale of altitude data

回答3:

It looks like if you want to use a Line2D object, you're stuck with a single color per object. As a workaround, you could plot each line segment as a set of (first order linearly) interpolated segments and color each of those by its corresponding infinitesimal value.

It looks like this functionality is contained in a LineCollection instance, however I just went for a more quick and dirty approach below.

For extra credit, since we're talking about geospatial data here, why not use cartopy to plot your data? That way you can have a "basemap" which gives you some reference. After all, if it's worth plotting, it's worth plotting beautifully.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import cartopy
import cartopy.crs as ccrs

import numpy as np
import scipy
from scipy import interpolate

import matplotlib
#matplotlib.use('Agg')
import matplotlib.pyplot as plt

### clean data
filter_inds   = np.where(np.abs(altitude_data) < 100)
lat_data      = lat_data[filter_inds]
long_data     = long_data[filter_inds]
altitude_data = altitude_data[filter_inds]

# =============== plot

plt.close('all')
plt.style.use('dark_background') ## 'default'
fig = plt.figure(figsize=(1500/100, 1000/100))
#ax1 = plt.gca()

lon_center = np.mean(long_data); lat_center = np.mean(lat_data)

ax1 = plt.axes(projection=ccrs.Orthographic(central_longitude=lon_center, central_latitude=lat_center))
ax1.set_aspect('equal')

scale = 3 ### 'zoom' with smaller numbers
ax1.set_extent((lon_center-((0.9*scale)), lon_center+((0.7*scale)), lat_center-(0.5*scale), lat_center+(0.5*scale)), crs=ccrs.PlateCarree())

### states
ax1.add_feature(cartopy.feature.NaturalEarthFeature(category='cultural', scale='10m', facecolor='none', name='admin_1_states_provinces_shp'), zorder=2, linewidth=1.0, edgecolor='w')

ax1.add_feature(cartopy.feature.RIVERS.with_scale('10m'), zorder=2, linewidth=1.0, edgecolor='lightblue')
ax1.add_feature(cartopy.feature.LAKES.with_scale('10m'), zorder=2, linewidth=1.0, edgecolor='gray')

### download counties from https://prd-tnm.s3.amazonaws.com/StagedProducts/Small-scale/data/Boundaries/countyl010g_shp_nt00964.tar.gz
### untar with : tar -xzf countyl010g_shp_nt00964.tar.gz

try:
    reader = cartopy.io.shapereader.Reader('countyl010g.shp')
    counties = list(reader.geometries())
    COUNTIES = cartopy.feature.ShapelyFeature(counties, ccrs.PlateCarree())
    ax1.add_feature(COUNTIES, facecolor='none', alpha=0.5, zorder=2, edgecolor='gray')
except:
    pass

#norm = matplotlib.colors.Normalize(vmin=altitude_data.min(), vmax=altitude_data.max())
norm = matplotlib.colors.Normalize(vmin=1.0, vmax=6.0)
cmap = matplotlib.cm.viridis
mappableCmap = matplotlib.cm.ScalarMappable(norm=norm, cmap=cmap)

# ===== plot line segments individually for gradient effect

for i in range(long_data.size-1):
    long_data_this_segment = long_data[i:i+2]
    lat_data_this_segment  = lat_data[i:i+2]
    altitude_data_this_segment  = altitude_data[i:i+2]
    
    ### create linear interp objects
    ### scipy doesnt like when the data isn't ascending (hence the flip)
    
    try:
        spl_lon = scipy.interpolate.splrep(altitude_data_this_segment, long_data_this_segment, k=1)
        spl_lat = scipy.interpolate.splrep(altitude_data_this_segment, lat_data_this_segment,  k=1)
    except:
        long_data_this_segment = np.flip(long_data_this_segment)
        lat_data_this_segment = np.flip(lat_data_this_segment)
        altitude_data_this_segment = np.flip(altitude_data_this_segment)
        spl_lon = scipy.interpolate.splrep(altitude_data_this_segment, long_data_this_segment, k=1)
        spl_lat = scipy.interpolate.splrep(altitude_data_this_segment, lat_data_this_segment,  k=1)
    
    ### linearly resample on each segment
    nrsmpl=100
    altitude_data_this_segment_rsmpl = np.linspace(altitude_data_this_segment[0],altitude_data_this_segment[1],nrsmpl)
    long_data_this_segment_rsmpl = scipy.interpolate.splev(altitude_data_this_segment_rsmpl, spl_lon)
    lat_data_this_segment_rsmpl = scipy.interpolate.splev(altitude_data_this_segment_rsmpl, spl_lat)
    
    for j in range(long_data_this_segment_rsmpl.size-1):
        
        long_data_this_segment_2 = long_data_this_segment_rsmpl[j:j+2]
        lat_data_this_segment_2  = lat_data_this_segment_rsmpl[j:j+2]
        altitude_data_this_segment_2  = altitude_data_this_segment_rsmpl[j:j+2]
        
        ax1.plot(long_data_this_segment_2, lat_data_this_segment_2, transform=ccrs.PlateCarree(), c=mappableCmap.to_rgba(np.mean(altitude_data_this_segment_2)), zorder=3, linestyle='solid', alpha=0.8, lw=5.0)

# =====

### plot the actual data points as a scatter plot
pts = ax1.scatter(long_data, lat_data, transform=ccrs.PlateCarree(), alpha=1.0, marker='o', c=mappableCmap.to_rgba(altitude_data), edgecolor='w', zorder=4)

cbar = fig.colorbar(mappable=mappableCmap, ax=ax1, orientation='vertical', fraction=0.046, pad=0.04)
cbar.set_label(r'$Altitude$ [units]', fontsize=20)
cbar.ax.tick_params(labelsize=16)
cbar.set_ticks(np.linspace(1.0, 6.0, 5+1), update_ticks=True)
cbar.set_ticklabels([ ('%0.1f' % x) for x in cbar.get_ticks() ])

fig.tight_layout()
fig.savefig('flightPath.png',dpi=100)
plt.show()

回答4:

Here is my solution using Plotly's ScatterGeo object as well as Pandas and NumPy to load in the data. I chose this package since you could then have an interactive plot (with zoom and hover data) and also see which states the plane flew over :).

# Import packages
import pandas as pd
import numpy as np
import plotly.graph_objects as go

# Load your data into a Pandas DataFrame object
d = {'Lat': lat_data, 'Long': long_data, 'Altitude': altitude_data}
df = pd.DataFrame(data=d)

# Create scatterGeo object with the proper data 
scatterMapData = go.Scattergeo(lon = df['Long'], lat = df['Lat'], text=df['Altitude'],
                               mode = 'markers+lines', marker_color = df['Altitude'],
                               marker = dict(colorscale = 'Viridis', cmin = 0, 
                                             cmax = df['Altitude'].max(),
                                             colorbar_title = "Altitude",
                                             #line = dict(width=1, color='black')
                                            )
                               )

# Load scatterMapData object into Plotly Figure
# and configure basic options for title and scoping
fig = go.Figure(data=scatterMapData)
fig.update_layout(title = 'Plane Flight Data', geo_scope = 'usa',
                  geo = dict(scope = 'usa',
                             #projection_scale = 5,
                             center={'lat': np.median(df['Lat']), 'lon': np.median(df['Long'])})
                 )

# Finally show the plot
fig.show()

Here is a zoomed in version of the plot:

I just want to point out that you can change to mode='marker' in the scattergeo object for just a scatter plot and mode='lines' for just a line plot connecting each of the locations.

来源：https://stackoverflow.com/questions/64599117/plotting-two-variables-then-coloring-by-a-third-variable

标签

python

matplotlib

scipy

data-visualization