How to add a legend describing the region labels to a regional map using ggplot2?

离开以前 2021-02-04 20:46

SpatialPoly Data: SpatialData

Yield Data: Yield Data

Code:

    ## Loading packages
    library(rgdal)
    library(plyr)
    library(maps)
    lib         


        
1 Answer
  •  隐瞒了意图╮
    2021-02-04 21:02

    First, let me apologize for taking so long to get back - I missed your comment among all the others. Is this what you had in mind?

    This was produced with the following code. Before getting into an explanation, you should be aware that creating a legend is the least of your problems. Note how the colors are different in the two maps. Your code above does not assign CO2 changes to the correct regions. For example, according to MoroccoYields.csv, the largest change (improvement?) was -0.205 in Region 4, but on your map the largest (darkest red) is at the northeastern tip of Morocco, which is actually l'Oriental (Region 6). An explanation follows the code.

    ## Loading packages
    library(rgdal)
    library(plyr)
    library(maps)
    library(maptools)
    library(mapdata)
    library(ggplot2)
    library(RColorBrewer)
    library(foreign)  
    library(sp)
    
    # get.centroids: function to extract polygon ID and centroid from shapefile
    get.centroids = function(x){
      poly = MoroccoReg@polygons[[x]]
      ID   = poly@ID
      centroid = as.numeric(poly@labpt)
      return(c(id=ID, long=centroid[1], lat=centroid[2]))
    }
    #setwd("Directory where shapefile and Yields are stored")
    ## Loading shapefiles and .csv files
    MoroccoReg        <- readOGR(dsn=".", layer="Morocco_adm1")
    MoroccoYield      <- read.csv(file = "Morocco_Yield.csv", header=TRUE, sep=",", na.string="NA", dec=".", strip.white=TRUE)
    MoroccoYield$ID_1 <- substr(MoroccoYield$ID_1,3,10)
    
    ## Reorder the data in the shapefile based on the category variable "ID_1" and change to dataframe
    MoroccoReg    <- MoroccoReg[order(MoroccoReg$ID_1), ]
    MoroccoYield  <- cbind(id=rownames(MoroccoReg@data),MoroccoYield)
    #  build table of labels for annotation (legend).
    labs          <- do.call(rbind,lapply(1:14,get.centroids))
    labs          <- merge(labs,MoroccoYield[,c("id","ID_1","Label")],by="id")
    labs[,2:3]    <- sapply(labs[,2:3],function(x){as.numeric(as.character(x))})
    labs$sort <- as.numeric(as.character(labs$ID_1))
    labs          <- labs[order(labs$sort),]
    
    MoroccoReg.df <- fortify(MoroccoReg)
    ## This does NOT work...
    ## Add the yield impacts column to shapefile from the .csv file by "ID_1"
    ## Note that in the .csv file, I just added the column "ID_1" to match it with the shapefile
    #MoroccoReg.df <- cbind(MoroccoReg.df,MoroccoYield,by = 'ID_1')
    ## Do it this way...
    MoroccoReg.df <- merge(MoroccoReg.df,MoroccoYield, by="id")
    
    ## Check the structure and contents of shapefile
    attributes(MoroccoReg.df)
    ## Plotting 
    
    MoroccoRegMap1 <- ggplot(data = MoroccoReg.df, aes(long, lat, group=id)) 
    MoroccoRegMap1 <- MoroccoRegMap1 + geom_polygon(aes(fill = A2Med_noCO2))
    MoroccoRegMap1 <- MoroccoRegMap1 + geom_path(colour = 'gray', linetype = 2)  # linetype, not linestyle
    MoroccoRegMap1 <- MoroccoRegMap1 + scale_fill_gradient2(name = "%Change in yield",low = "#CC0000",mid = "#FFFFFF",high = "#006600")
    MoroccoRegMap1 <- MoroccoRegMap1 + labs(title="SRES_A2, noCO2 Effect")
    MoroccoRegMap1 <- MoroccoRegMap1 + coord_equal() #+ theme_map()
    MoroccoRegMap1 <- MoroccoRegMap1 + geom_text(data=labs, aes(x=long, y=lat, label=ID_1), size=4)
    MoroccoRegMap1 <- MoroccoRegMap1 + annotate("text", x=max(labs$long)-5, y=min(labs$lat)+3-0.5*(1:14),
                                                label=paste(labs$ID_1,": ",labs$Label,sep=""),
                                                size=3, hjust=0)
    MoroccoRegMap1
    

    Explanation:

    First, on merging your yield data with the map regions: you use

    MoroccoReg.df <- cbind(MoroccoReg.df,MoroccoYield,by = 'ID_1')
    

    This is not how cbind(...) works. cbind(...) merely combines its arguments column-wise. It is not a merge function. So you had a data frame, MoroccoReg.df, with 107,800 rows (one row for every line endpoint on your map), and you are combining it with MoroccoYield, which has 14 rows (one for every Region). So cbind(...) replicates those 14 rows 7,700 times to fill out the 107,800 rows it needs. The expression by="ID_1" merely adds another column named "by" with "ID_1" replicated 107,800 times. Run the statement above, type head(MoroccoReg.df), and look at the last column.
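
    To see this behaviour in isolation, here is a minimal sketch on two tiny made-up data frames (map_df and yield_df are hypothetical stand-ins, not your actual data). It only illustrates the recycling and the stray "by" column described above.

```r
# Toy stand-ins (hypothetical data, not the Morocco files)
map_df   <- data.frame(id = rep(c("0", "1"), each = 3), long = 1:6)  # 6 "vertex" rows
yield_df <- data.frame(ID_1 = c("1", "2"), yield = c(-0.2, 0.1))    # 2 "region" rows

# cbind() glues columns side by side; it does not match rows on a key
combined <- cbind(map_df, yield_df, by = "ID_1")

nrow(combined)       # same row count as map_df: yield_df was recycled to fit
names(combined)      # note the extra trailing column called "by"
unique(combined$by)  # by= was treated as data, not as a join key
```

    The yield value attached to each row here depends only on row position, which is why the colours ended up on the wrong regions.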

    So how to do the merge? There are a number of functions in R that are supposed to make this easy, but I couldn't get any of them to work. This is what did work:

    Every polygon in the shapefile has an ID. There is also an "ID_1" field in the shapefile data section, but these are different and unrelated. [BTW: The ID_1 field in the shapefile data section, and the ID_1 field in your csv file were also different: the latter has "TR" prepended to the region number; so that had to be dealt with as well]. Reordering the shapefile with:

    MoroccoReg    <- MoroccoReg[order(MoroccoReg$ID_1), ]
    

    will change the order of the polygons, but will not change their IDs. It turns out that the polygon ID matches the row name in the data section of the shapefile, so I prepended that (using cbind(...)!) to your MoroccoYield data frame.

    MoroccoYield  <- cbind(id=rownames(MoroccoReg@data),MoroccoYield)
    

    So now MoroccoYield has an id field which maps to the polygon ID, and an ID_1 field, which identifies the Region. Now we can fortify(...) and merge(...). merge(...) does take a by= argument.

    MoroccoReg.df <- fortify(MoroccoReg)
    MoroccoReg.df <- merge(MoroccoReg.df,MoroccoYield, by="id")
    

    This appends all of your MoroccoYield columns to the appropriate rows of MoroccoReg.df.
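
    As a rough illustration of why merge(...) is the right tool, here is a sketch on made-up data (map_df and yield_df are hypothetical names). The lookup table is deliberately in a different order from the map rows, and the values still land on the right ids.

```r
# Toy stand-ins (hypothetical data): 6 vertex rows, 2 regions
map_df   <- data.frame(id = rep(c("0", "1"), each = 3), long = 1:6)
yield_df <- data.frame(id = c("1", "0"), Label = c("B", "A"))  # deliberately out of order

# merge() matches rows on the key column named in by=
merged <- merge(map_df, yield_df, by = "id")

nrow(merged)                     # still one row per vertex
table(merged$id, merged$Label)   # each id is paired with exactly one Label
```

    One thing to watch, as far as I know: merge(...) sorts its result by the key, so if the polygons ever look scrambled after merging, re-sorting the data frame by the order column that fortify(...) produces is a common remedy.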

    Creating the legend:

    The obvious question is how to position the labels. Ideally, we would place the Region number from MoroccoYield$ID_1 at the centroid of each region, and then create a legend that identifies the Regions, based on MoroccoYield$Label.

    So where to find the centroids? These are stored in an obscure location in the polygon section of the shapefile (the labpt slot of each polygon). To make a long story short, I created a utility function get.centroids(...) which extracts the ID and centroid from a polygon. Then I applied that function to all the polygons to produce a table of centroids with corresponding polygon ID. Then I merged that with the labels in MoroccoYield. This created a data frame labs which has the following columns:

    id:    polygon ID
    long:  centroid longitude
    lat:   centroid latitude
    ID_1:  region ID
    Label: region name
    sort:  a sortable (numeric) version of ID_1
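
    For reference, the centroid extraction can be sketched on a self-contained toy object. This assumes the sp package is installed; the two unit squares and the names sq, regions and centroids are made up for illustration. It reads the same ID and labpt slots as get.centroids(...), but builds a data frame directly so the coordinates stay numeric.

```r
library(sp)

# Build one hypothetical square "region" with its lower-left corner at (x0, 0)
sq <- function(x0, id) {
  xy <- cbind(c(x0, x0 + 1, x0 + 1, x0, x0), c(0, 0, 1, 1, 0))
  Polygons(list(Polygon(xy)), ID = id)
}
regions <- SpatialPolygons(list(sq(0, "a"), sq(2, "b")))

# Pull the ID and label point (centroid) out of each Polygons object
centroids <- do.call(rbind, lapply(regions@polygons, function(p) {
  data.frame(id = p@ID, long = p@labpt[1], lat = p@labpt[2])
}))
centroids
```

    As far as I can tell, coordinates(regions) returns the same label points as a matrix, which may be simpler if the IDs are not needed.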
    

    Then, adding the following code to your ggplot...

    ...
    MoroccoRegMap1 <- MoroccoRegMap1 + geom_text(data=labs, aes(x=long, y=lat, label=ID_1), size=4)
    MoroccoRegMap1 <- MoroccoRegMap1 + annotate("text", x=max(labs$long)-5, y=min(labs$lat)+3-0.5*(1:14),
                                                label=paste(labs$ID_1,": ",labs$Label,sep=""),
                                                size=3, hjust=0)
    

    ...creates the map. There's no clean way, that I could find, to do this with a formal "ggplot legend", so I had to use annotate(...). Positioning the annotation is kind of a hack, but it seems to work.

    Edit: In response to @smailov83's comment, if you change the code to create the ggplot to this...

    MoroccoRegMap1 <- ggplot(data = MoroccoReg.df, aes(long, lat, group=group)) 
    MoroccoRegMap1 <- MoroccoRegMap1 + geom_polygon(aes(fill = A2Med_noCO2))
    MoroccoRegMap1 <- MoroccoRegMap1 + geom_path(colour = 'gray', linetype = 2)  # linetype, not linestyle
    MoroccoRegMap1 <- MoroccoRegMap1 + scale_fill_gradient2(name = "%Change in yield",low = "#CC0000",mid = "#FFFFFF",high = "#006600")
    MoroccoRegMap1 <- MoroccoRegMap1 + labs(title="SRES_A2, noCO2 Effect")
    MoroccoRegMap1 <- MoroccoRegMap1 + coord_equal() #+ theme_map()
    MoroccoRegMap1 <- MoroccoRegMap1 + geom_text(data=labs, aes(x=long, y=lat, label=ID_1, group=ID_1), size=4)
    MoroccoRegMap1 <- MoroccoRegMap1 + annotate("text", x=max(labs$long)-5, y=min(labs$lat)+3-0.5*(1:14),
                                                label=paste(labs$ID_1,": ",labs$Label,sep=""),
                                                size=3, hjust=0)
    

    ...you get this:

    Which I believe fixes the problem. The reason for the extra lines in your map was that the ggplot must be grouped by the column MoroccoReg.df$group (so, aes(..., group=group), not aes(..., group=id)). When you do this, however, ggplot tries to group by "group" in all layers. In geom_text(...), where we are using a new, local dataset (the labs data frame), there is no group column. To deal with this, we must explicitly set group to something else in geom_text(...). Bottom line: this seems to work.
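
    An alternative to overriding group in the text layer, sketched here on made-up data (poly_df and lab_df are hypothetical), is to stop that layer from inheriting the plot-level aesthetics at all with inherit.aes = FALSE. I have not run this against your shapefile, so treat it as a sketch of the idea rather than a drop-in fix.

```r
library(ggplot2)

# Hypothetical toy data: two squares and one label per square
poly_df <- data.frame(
  long  = c(0, 1, 1, 0,  2, 3, 3, 2),
  lat   = c(0, 0, 1, 1,  0, 0, 1, 1),
  group = rep(c("0.1", "1.1"), each = 4)
)
lab_df <- data.frame(long = c(0.5, 2.5), lat = c(0.5, 0.5), ID_1 = c("1", "2"))

p <- ggplot(poly_df, aes(long, lat, group = group)) +
  geom_polygon(fill = "grey80", colour = "grey40") +
  # lab_df has no group column, so this layer must not inherit group = group
  geom_text(data = lab_df, aes(x = long, y = lat, label = ID_1),
            inherit.aes = FALSE)
p
```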
