Monday, March 11, 2019

Projecting a phylogeny onto a geographic map showing species ranges in R

In the past couple of days I have posted a few times (1, 2, 3) about projecting a phylogeny to a geographic map in which we had multiple observations per species.

Today, I thought it might be neat to demonstrate what we can do if we have lots of observations per species and what to project a tree onto estimated range areas for each species. Note that to duplicate this post exactly it is necessary to update phytools from GitHub.

First, let's start with a fictional dataset from Brazil:

library(phytools)
library(mapdata)
Brazil.tree
## 
## Phylogenetic tree with 8 tips and 7 internal nodes.
## 
## Tip labels:
##  I.ike, I.zawsgyqmd, C.mkgiraxh, I.nqxrdes, T.zfrbt, F.uhlvwd, ...
## 
## Rooted; includes branch lengths.
head(Brazil)
##                  lat      long
## C.mkgiraxh -5.923737 -50.30310
## C.mkgiraxh -6.961451 -51.40991
## C.mkgiraxh -6.105770 -51.44966
## C.mkgiraxh -6.133681 -49.31009
## C.mkgiraxh -7.022131 -53.61715
## C.mkgiraxh -4.509163 -49.00638
tail(Brazil)
##                lat      long
## T.zfrbt  -9.086615 -49.16497
## T.zfrbt  -8.874021 -52.17523
## T.zfrbt  -7.510258 -52.21743
## T.zfrbt  -8.656923 -51.57321
## T.zfrbt -10.740220 -51.95592
## T.zfrbt  -8.993945 -50.38208

Now, let's plot these data just as we did yesterday. Again, I'll use the viridis package (which I just discovered) for colors:

library(viridis)
cols<-setNames(sample(viridis(n=Ntip(Brazil.tree))),
    Brazil.tree$tip.label)
cols
##       I.ike I.zawsgyqmd  C.mkgiraxh   I.nqxrdes     T.zfrbt    F.uhlvwd 
## "#277F8EFF" "#46337EFF" "#9FDA3AFF" "#1FA187FF" "#440154FF" "#FDE725FF" 
##      T.etng   K.jzwgtku 
## "#365C8DFF" "#4AC16DFF"
obj<-phylo.to.map(Brazil.tree,Brazil,database="worldHires",
    regions="Brazil",plot=FALSE,direction="rightwards")
## objective: 24
## objective: 12
## objective: 12
## objective: 6
## objective: 4
## objective: 2
## objective: 2
plot(obj,direction="rightwards",colors=cols,cex.points=c(0,1),
    lwd=c(3,1),ftype="i")

plot of chunk unnamed-chunk-3

That's actually pretty cool looking, but maybe instead of plotting all the points we'd like to plot shapes. One way we could do this is by estimating a convex hull (MCP) for each taxon, and then overlay those at the end:

for(i in 1:Ntip(Brazil.tree)){
    spr<-Brazil[which(rownames(Brazil)==Brazil.tree$tip.label[i]),]
    mcp<-if(i==1) spr[chull(spr),] else rbind(mcp,spr[chull(spr),])
}
plot(obj,direction="rightwards",colors=cols,cex.points=c(0,0.6),
    lwd=c(3,1),ftype="i")
for(i in 1:Ntip(Brazil.tree)){
    ii<-which(rownames(mcp)==Brazil.tree$tip.label[i])
    polygon(mcp[ii,2:1],col=make.transparent(cols[Brazil.tree$tip.label[i]],0.8),
        border="darkgrey")
}

plot of chunk unnamed-chunk-4

Or, alternatively, we could plot the shapes without the points:

obj<-phylo.to.map(Brazil.tree,mcp,database="worldHires",regions="Brazil",
    plot=FALSE,direction="rightwards")
## objective: 24
## objective: 12
## objective: 12
## objective: 6
## objective: 4
## objective: 2
## objective: 2
plot(obj,direction="rightwards",colors=sapply(cols,make.transparent,0.4),
    pts=FALSE,ftype="off",cex.points=c(0,0),ftype="off",lwd=c(3,1))
for(i in 1:Ntip(Brazil.tree)){
    ii<-which(rownames(mcp)==Brazil.tree$tip.label[i])
    polygon(mcp[ii,2:1],col=make.transparent(cols[Brazil.tree$tip.label[i]],0.8),
        border="darkgrey")
}

plot of chunk unnamed-chunk-5

That's not bad.

15 comments:

  1. First, thank you for creating and demonstrating this wonderful package. Great visualization of data!

    I am trying to implement this pipeline myself. I have a phylogeny and coordinates for each species in the phylogeny. Unfortunately, I am getting an error with phylo.to.map():
    objective: 0
    objective: 0
    objective: 0
    objective: 0
    objective: 0
    objective: 0
    objective: 0
    objective: 0
    objective: 0
    Error in coords[cw$tip.label, 2:1] : subscript out of bounds

    I am sure the names in the phylogeny match the names in the coordinate data set and they are all characters. Any advice would be helpful.

    Thank you for your time,
    Kyle

    ReplyDelete
    Replies
    1. I figured out my problem! I need a matrix for the coordinates, not a data frame.

      Delete
    2. @kwgray Hello, How did you resolve it?

      Delete
    3. Dr. Revell, thank you for your answer, I appreciate very much your work, It's amazing! Thank you
      as.matrix Didn't work :(
      > lat.long=read.table(file.choose(), sep=',', head=T, rownames=NULL)
      > head(lat.long)
      X lat long X.1 X.2 X.3 X.4 X.5 X.6 X.7 X
      1 acutus 19.36117 -101.5737 NA NA NA NA NA NA NA
      2 acutus 19.51342 -101.6574 NA NA NA NA NA NA NA
      3 acutus 19.51342 -101.6574 NA NA NA NA NA NA NA
      4 acutus 19.46861 -101.5619 NA NA NA NA NA NA NA
      5 acutus 19.46861 -101.5619 NA NA NA NA NA NA NA
      6 acutus 19.46861 -101.5619 NA NA NA NA NA NA NA
      >> x=as.vector(lat.long$X)
      > x
      [1] "acutus" "acutus" "acutus" "acutus" "acutus"
      [6] "acutus" "acutus" "acutus" "acutus" "acutus"
      [11] "acutus" "acutus" "acutus" "acutus" "acutus"
      > data=as.matrix(lat.long[2:3])
      > head(data)
      lat long
      [1,] 19.36117 -101.5737
      [2,] 19.51342 -101.6574
      [3,] 19.51342 -101.6574
      [4,] 19.46861 -101.5619
      [5,] 19.46861 -101.5619
      [6,] 19.46861 -101.5619
      > is.matrix(data)
      [1] TRUE
      > coords=cbind(x,data)
      > head(coords)
      x lat long
      [1,] "acutus" "19.3611667" "-101.5736667"
      [2,] "acutus" "19.5134167" "-101.6574444"
      [3,] "acutus" "19.5134167" "-101.6574444"
      [4,] "acutus" "19.4686056" "-101.56195"
      [5,] "acutus" "19.4686056" "-101.56195"
      [6,] "acutus" "19.4686056" "-101.56195"
      > is.matrix(coords)
      [1] TRUE
      >objplot(obj,direction="rightwards",colors=cols,cex.points=c(0,1),lwd=c(3,1),ftype="i")
      Error in coords[cw$tip.label, 2:1] : subscript out of bounds

      Delete
    4. The problem may be that your matrix should have the species names as row names, not as a separate column. Try assigning the species names as the row names of your matrix & see if that works.

      Delete
    5. Thank you, Dr Revell, but How I can do that?

      Delete
    6. In your example:

      coords<-as.matrix(lat.long[2:3])
      rownames(coords)<-x

      All the best, Liam

      Delete
    7. Hi Liam. I'm having a similar problem. I have multiple coordinates per species, so when I try to assign the rownames() to the matrix it ads a sequential number to the names (e.g. species.1, especies.2, ect.), this ends in a "objective: 0". Any recommendation to deal with the duplicates in the row names of a matrix?

      Delete
    8. Andrés, did you try converting your data frame of coordinates to a matrix as I advised above (this can be done using the function as.matrix)? This is most likely your problem

      Delete
  2. Hi Liam, those polygons are simply beautiful!
    I have some questions about phylo.to.map object and its posterior plotting:
    I am trying to plot single locations of some primate species on a map, but I'd like to plot just Africa, Asia and some parts of Oceania. I've seen your example pruning Isla de Pascua from a Chile map, but that method doesn't seem to be working. I've checked every region included in the map and its respective code, and unsuccessfully tried to supress the ones I didn't need. Do you know how could I eliminate America and Europe from the world map? (Virtually speaking of course haha). This is some example of what I've been trying:

    tree<-read.tree("arbol30sp.nwk")
    plotTree(tree, ftype = "i", lwd = 0.5)

    geo<-read.csv("Locations.csv", h = T, row.names = 1)
    geo<-as.matrix(geo)

    match<-match.phylo.data(tree, geo)
    #[1] "Dropping tips from the tree because they are not present in the data:"
    #[1] "Homo_sapiens"*
    sp29<-match$phy

    obj<-phylo.to.map(sp29,geo,regions=".",direction="rightwards", plot = FALSE)

    ii<-which(obj$map$x<c(2,3,4,11,12,13,14,15,16,21,22,23,24,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,182,183,184,185,186,188,189,190,191,192,193,194,195,196,198,240,241,242,243,244,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,439,440,441,442,443,444,445,446,447,448,449,471,486,496,497,498,499,516,569,593,595,596,597,598,599,600,601,602,603,604,605,606,668,669,670,692,693,694,695,696,697,698,699,700,701,702,703,704,705,706,707,708,709,710,711,712,713,714,715,716,717,718,719,720,721,722,723,724,725,726,727,728,729,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,746,747,748,749,750,751,752,753,754,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,771,772,773,774,775,776,777,778,779,780,781,782,783,784,785,786,787,788,789,790,791,792,793,794,795,796,797,798,799,800,801,802,803,804,805,806,807,808,809,810,811,812,813,814,815,816,817,818,819,820,821,822,823,824,826,827,828,829,830,831,832,833,834,835,836,837,838,839,846,847,848,849,850,851,862,863,864,865,866,867,868,869,870,871,872,873,874,875,876,877,878,879,880,881,882,883,884,885,886,887,888,889,890,891,892,893,894,895,896,897,898,899,900,901,902,903,904,905,906,907,908,909,910,932,933,934,935,936,937,938,939,940,941,942,944,945,946,947,948,949,952,953,954,955,960,962,963,964,965,966,967,968,969,992,995,996,997,998,999,1000,1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1011,1012,1013,1015,1022,1023,1024,1025,1026,1028,1029,1030,1031,1032,1033,1034,1035,1036,1037,1038,1039,1040,1041,1048,1050,1051,1098,1113,1114,1115,1116,1117,1125,1126,1127,1128,1129,1130,1131,1132 #And so on#))
    obj$map$x<-obj$map$x[ii]
    obj$map$y<-obj$map$y[ii]

    plot(obj,direction="rightwards",fsize = 0.4, ftype="i", lwd = 0.3)

    ReplyDelete
  3. Great contribution, as usual, Liam. Have you had any requests to apply colors across clades, rather than across tips? From the figures I've seen so far, each tip receives a unique color. In the polygon figure above, the individual coordinate info is lost due to the collapsing of the clades. Cheers.

    ReplyDelete
    Replies
    1. This should already be possible. I'll try to remember to post a tutorial showing how when I get a chance.

      Delete
  4. Hi Liam! thanks for your great contribution! I was trying to use phylo.to.map with different map projections (E.g. Lambert) available in the package maps, with no success. Is phylo.to.map using a particular projection as a default?

    ReplyDelete

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.