Friday, March 8, 2019

Projecting a phylogenetic tree onto a map with multiple geographic points per taxon

In response to a recent user request I recently added the capability to map more than one geographic location per taxon to the phytools plotting function phylo.to.map. To see the code click here. A phytools version containing this update can already be installed via GitHub using devtools.

In the following demo I'll use some fictional data to illustrate how it can be used - first on a world map projecting from the top, then on a single-country map projecting from the left:

library(phytools)
packageVersion("phytools")
## [1] '0.6.71'
library(mapdata)
## here are my data
World.tree
## 
## Phylogenetic tree with 26 tips and 25 internal nodes.
## 
## Tip labels:
##  B.ijyqhp, J.wjbgkp, I.zgm, U.otngy, E.bgzoej, L.cyl, ...
## 
## Rooted; includes branch lengths.
head(World,20)
##                      lat       long
## B.ijyqhp    -53.75945233   89.82698
## B.ijyqhp    -49.27526159   92.08920
## B.iownj      -1.78488475 -162.17161
## B.iownj      -5.28332361 -160.89541
## B.iownj      -1.15704991 -169.82635
## B.xiwt      -18.32161500  -39.76940
## B.xiwt      -21.81437265  -42.49659
## B.xiwt      -16.38054009  -42.98694
## E.bgzoej    -33.49914588  167.24997
## E.bgzoej    -34.83267341  168.81525
## E.bgzoej    -33.50117255  164.69732
## F.lbfqkh      6.04544745  -99.61004
## F.lbfqkh     -0.06604802  -95.04093
## F.lbfqkh      3.37645461 -101.15848
## G.jltpcivk  -16.65924503  -49.84077
## G.jltpcivk  -14.68628460  -48.19405
## G.thjdsean   10.95726052  -73.92540
## G.thjdsean   10.30423882  -72.88593
## G.thjdsean    9.79876619  -70.26079
## G.xqpudsjri   4.33975876   27.89086
obj<-phylo.to.map(World.tree,World,plot=FALSE)
## objective: 82
## objective: 82
## objective: 72
## objective: 72
## objective: 72
## objective: 72
## objective: 70
## objective: 70
## objective: 70
## objective: 70
## objective: 68
## objective: 46
## objective: 46
## objective: 46
## objective: 40
## objective: 40
## objective: 40
## objective: 38
## objective: 38
## objective: 38
## objective: 38
## objective: 38
## objective: 38
## objective: 36
## objective: 34
plot(obj,ftype="i")

plot of chunk unnamed-chunk-1

If we like, it is very easy to plot the points & linking lines to our species to be shown in different colors. Here's a simple example:

cols<-setNames(sample(rainbow(n=Ntip(World.tree))),
    World.tree$tip.label)
plot(obj,colors=cols,ftype="i",fsize=0.6,cex.points=c(0.7,1.2))

plot of chunk unnamed-chunk-2

This also should work for sideways plotted trees. For the following it's necessary to install both mapdata (for the higher resolution country map) and viridis (for the color palette that I used).

library(mapdata)
library(viridis)
Argentina.tree
## 
## Phylogenetic tree with 12 tips and 11 internal nodes.
## 
## Tip labels:
##  J.zwilcgka, Z.higpu, X.usrfnti, H.noswc, K.jgquihs, P.xdwzvfj, ...
## 
## Rooted; includes branch lengths.
Argentina
##                   lat      long
## J.zwilcgka  -49.22635 -70.70463
## Z.higpu     -48.56640 -68.50478
## X.usrfnti   -47.35648 -70.04468
## H.noswc     -44.05671 -70.53964
## K.jgquihs   -42.90179 -67.73484
## P.xdwzvfj   -40.31697 -65.47999
## G.xydpt     -36.19226 -67.29487
## N.yinzpjo   -33.44245 -68.22980
## H.hpsycu    -31.46258 -64.05009
## Q.fstnuprx  -29.70271 -60.42035
## B.pnhma     -27.17288 -62.01523
## L.lfbpxwutn -24.25808 -64.98503
## J.zwilcgka  -49.92165 -70.27778
## Z.higpu     -47.80319 -69.42856
## X.usrfnti   -48.19909 -71.25053
## X.usrfnti   -48.46782 -70.90932
## H.noswc     -44.78242 -70.02675
## H.noswc     -43.45621 -71.00061
## K.jgquihs   -44.22061 -65.83883
## K.jgquihs   -42.57006 -67.32508
## G.xydpt     -35.57724 -67.16105
## N.yinzpjo   -33.73010 -68.13654
## H.hpsycu    -30.38485 -61.98099
## Q.fstnuprx  -28.14004 -60.40019
## Q.fstnuprx  -29.56328 -60.08278
## B.pnhma     -26.15989 -62.03782
## B.pnhma     -25.90764 -62.64044
## L.lfbpxwutn -24.41996 -65.55994
obj<-phylo.to.map(Argentina.tree,Argentina,database="worldHires",
    regions="Argentina",plot=FALSE)
## objective: 22
## objective: 22
## objective: 22
## objective: 20
## objective: 18
## objective: 18
## objective: 16
## objective: 16
## objective: 16
## objective: 14
## objective: 12
cols<-setNames(sample(viridis(n=Ntip(Argentina.tree))),
    Argentina.tree$tip.label)
plot(obj,direction="rightwards",colors=cols,ftype="off",cex.points=c(0,1.5),
    pts=FALSE)

plot of chunk unnamed-chunk-3

This is brand new & involved me working on code that I haven't really touched in a few years so please let me know if you come across any bugs!

For those interested in such things, here's how the data for this example were generated:

World.tree<-pbtree(n=26,scale=100)
World.tree$tip.label<-replicate(Ntip(World.tree),
    paste(sample(LETTERS,1),".",
    paste(sample(letters,round(runif(n=1,min=3,max=10))),
    collapse=""),
    sep=""))
lat<-fastBM(World.tree,sig2=10,bounds=c(-90,90))
long<-fastBM(World.tree,sig2=80,bounds=c(-180,180))
World<-cbind(lat,long)
for(i in 1:Ntip(World.tree)){
    ni<-sample(0:2,1)
    for(j in 1:ni){ 
        World<-rbind(World,c(World[i,1]+rnorm(n=1,sd=4),
            World[i,2]+rnorm(n=1,sd=4)))
        rownames(World)[nrow(World)]<-rownames(World)[i]
    }
}
Argentina<-locator(n=12) ## with world map plotted
Argentina<-cbind(Argentina$y,Argentina$x)
colnames(Argentina)<-c("lat","long")
Argentina.tree<-pbtree(n=nrow(Argentina),scale=100)
Argentina.tree$tip.label<-replicate(Ntip(Argentina.tree),
    paste(sample(LETTERS,1),".",
    paste(sample(letters,round(runif(n=1,min=3,max=10))),
    collapse=""),
    sep=""))
rownames(Argentina)<-Argentina.tree$tip.label
for(i in 1:Ntip(Argentina.tree)){
    ni<-sample(0:2,1)
    if(ni>0){
        for(j in 1:ni){
            Argentina<-rbind(Argentina,c(Argentina[i,1]+rnorm(n=1),
                Argentina[i,2]+rnorm(n=1)))
            rownames(Argentina)[nrow(Argentina)]<-rownames(Argentina)[i]
        }
    }
}

34 comments:

  1. I am having trouble with your artificial data for Argentina
    "Argentina<-locator(n=12)"
    kicks back the following error: "Error in locator(n = 12) : plot.new has not been called yet"

    ReplyDelete
    Replies
    1. Update: I think this was an issue with my World.map fake data, but now that it works and I have built my map, running the argentina lines cause R to think way to hard about that first line "Argentina<-locator(n=12)"

      Delete
  2. Is there any way to color two points coming from the same tip differently?

    For instance, I would like to have one map point colored blue and another red, but both should connect to the same tip of the tree.

    ReplyDelete
    Replies
    1. Hi Alex. Yes. The easiest way to do this is by simply adding your colored points at the end. There coordinates on the plot are just their longitudes (x) and latitudes (y) on a decimal scale. - Liam

      Delete
  3. Hi Liam,

    I keep getting the error "objective: 0
    objective: 0
    objective: 0
    objective: 0
    objective: 0
    objective: 0
    objective: 0
    objective: 0
    Error in xy.coords(x, y) : 'x' and 'y' lengths differ" when using the map.to.phylo function.

    My tree is a neighbour joining tree, with 10 tips, and I have a matrix with those same 10 tip names and the corresponding latitiude and longitutde coordinates.

    Do you know what might be wrong?

    ReplyDelete
    Replies
    1. My suspicion is that your data are in a data frame, not a matrix. Is that true?

      Delete
  4. Hi Liam,

    I have problem for create the phylo.to.map. When I run the code, the error is:

    tree4 = read.tree(file="Newick Export")
    coords3 = as.matrix(coord3)
    phylo.to.map(tree4,coords = coord3, plots = FALSE)

    Error in aggregate.data.frame(as.data.frame(x), ...) :
    arguments must have same length

    My tree4 run good, the plot is correct. My coords has the same number that $tip.label...

    What could be my problem?

    PD. Congratulations by your blog, is very interesting.

    ReplyDelete
    Replies
    1. Hi Liam. I'm having this same problem.

      Tree has 88 tips, and my data frame is a complete 3x88 matrix with the tip names, lats and longs. Any idea why this might happen?

      Delete
    2. Hi -- that's easy! The object coords should be a 88x2 (rows x columans, for 88 tips) matrix (not a data frame) with row names as species names and latitude & longitude in columns 1 & 2, respectively.

      Delete
    3. Hello,
      I am having a related issue. While the coordinate file "may include more than one set per species", forcing them to become the row names gives me this error:
      Error in `.rowNamesDF<-`(x, value = value) :
      duplicate 'row.names' are not allowed
      How did you get around this in your coordinates files used the examples?
      Thank you!

      Delete
    4. Hello,

      I am having a related issue. While it seems that the coordinates data frame may include more than one set of coords per species, I am not able to get around this error when trying to specify the row names as species:
      "Error in `.rowNamesDF<-`(x, value = value) :
      duplicate 'row.names' are not allowed"

      How did you get around this in the coordinate data frames in your examples?

      Thank you so much for your help!

      Delete
  5. Dear Liam, thanks for your great posts all the time. I have a question for today, is it possible to map two phylogenies on the same map? How it could be done?

    ReplyDelete
    Replies
    1. Dear Karen,
      Have you found a solution to map two phylogenies, for example on the right and left of the map?
      I would be very interested in your solution,

      Al the best

      Delete
    2. Hi! Despyr, unfortunately I never found a solution.

      Delete
    3. Thank you Karen for your reply! Perhaps the feature will be implemented one day,
      Best

      Delete
  6. Dear Liam,

    does "phylo.to.map" work with unrooted trees? I cannot make it to work.

    Cheers Josef

    ReplyDelete
    Replies
    1. Hi Josef. phylo.to.map is pretty primitive, so it doesn't like "phylo" objects with multifurcations (including unrooted trees). A work-around might be:

      object<-phylo.to.map(tree,data,plot=FALSE,rotate=FALSE)
      plot(object,type="direct")

      which then gives you a direct projection of the tree onto the map.

      Does that work?

      - Liam

      Delete
    2. Hi Liam,
      Thanks for the great tutorial,
      what if I want to manually set colours to each tip? I want to be able to manually test and compare when colouring based on species/population and when based on clades, in this case I have to write down all tip names one by one and then assign the colour for them, but do you have a template code for that?

      Delete
  7. Liam hello,
    Do you how I could paint the lines and dots based on a Lat(latitude) value? That is, have the same color for the same latitude value.

    Regards

    ReplyDelete
  8. Liam hello,
    Do you know how I could color the branches based on groups of ids (ids of the tips)?
    I am working with SARS-CoV-2 making a phylo.to.map with the phylogeny of sequences mapped over a Latin America map and I want to add the colores to the branches based on names to lineages of SARS-CoV-2 ( data that comes with each sequence).

    Many Regards

    ReplyDelete
  9. Hi Liam,

    pardon me, but I was busy with other data. My apologies. I'm now having a bifurcating tree, however I am encountering the same problem as gerardo (see above). My tree and my location file have the same number of entries, but I get following error:

    "> phylo.to.map(tree,location,ylim=c(-40,-10), xlim=c(110,155))
    error in aggregate.data.frame(as.data.frame(x), ...) :
    arguments must have same length"

    I have checked that my location file is a matrix using "as.matrix".


    Cheers Josef

    ReplyDelete
    Replies
    1. Hi Josef. Can you send me a reproducible example: preferably with your saved workspace? You can email it to me at liam.revell@umb.edu. Thanks! Liam

      Delete
    2. Hi,
      I have the same problem. Any hint possible solutions?

      Delete
  10. Hi,
    I found this very interesting. I've built a NJ tree based on MAHALANOBIS distances (not phylogenetic tree) as I'm investigating intraspecific variability and I've only conservative genes ... is it possible to project non-phylogenetic distances like Mahalanobis onto a map ?? I'd be very happy if possible It would give nice resolution to my data .. Thank's in advance !

    ReplyDelete
  11. Hello Liam,
    Do you know how I could have a map of Buenos Aires insted of the whole country? Should I create it by my own?

    Regards

    ReplyDelete
  12. Hi Liam,

    Is there a way to change pch of points on tree tips to match those I chose for the map?

    Thank you very much!
    Best regards,
    Jon

    ReplyDelete
  13. Hello Liam,

    I am trying this function. I have read my tree and specified my coordinates data, but when I try phylogenetic.to.map

    object<-phylo.to.map(tree01,marathrum_coor,database="worldHires",
    regions="Colombia",plot(FALSE))

    I get this:
    Error in rotate && type == "phylogram" : invalid 'x' type in 'x && y'

    my final intention is to map this to a raster file hat I have used to produce plots of northern South America in R, but wanted to try with something more similar to the example you provided first.

    Any guidance would be much appreciated.

    Ana

    ReplyDelete
  14. Hello Liam,

    I ha ve read my tree, have my coordinates data, but when I try the phylogenetic.to.plot function

    object<-phylo.to.map(tree01,marathrum_coor,database="worldHires",
    regions="Colombia",plot(FALSE))

    I get this error:
    Error in rotate && type == "phylogram" : invalid 'x' type in 'x && y'

    My intention is to plot the phylogeny onto a custom map using a raster file that I have used to plot maps in R before, but wanted to try with code that is more similar to your example first.

    Any guidance would be much appreciated.

    Best,

    Ana

    ReplyDelete
  15. I tried
    plot(object,direction="rightwards",colors=cols,ftype="off",cex.points=c(0,1.5),pts=FALSE)

    and when I inspect object this is what I get
    Object of class "phylo.to.map" containing:

    (1) A phylogenetic tree with 42 tips and 39 internal nodes.

    (2) A geographic map with range:
    -4.22N, 13.38N
    -81.72W, -66.87W.

    (3) A table containing 42 geographic coordinates (may include
    more than one set per species).

    The nodes of the tree may not have yet been rotated to maximize
    alignment between the phylogeny & the map.

    However, when I use the plot function
    plot<-plot(object,direction="rightwards",colors=cols,ftype="off",cex.points=c(0,1.5),pts=FALSE)

    It plots the phylogeny and the map, but no connecting lines, and 43 warnings are raised, all of them are the same: In xy.coords(x, y) : NAs introduced by coercion

    I am not really sure how to solve this. Any help would be much appreciated

    ReplyDelete
    Replies
    1. Ana. Try resolving polytomies using ape::multi2di and see if that works. - Liam

      Delete
  16. Hello Liam,

    Thank you very much for this tutorial, it's really very useful and clear.

    I was wondering, in your last example, with the “mapdata” package, you focus on Argentina only, is it possible to have a restriction to several countries? or only the worldwide or one country choices are possible?

    I'd like to restrict my analysis to the East African Rift.

    Thank you very much,
    Best,
    Achille

    ReplyDelete
    Replies
    1. Yes. I believe that you can just provide the names of the various shapes (countries) you want in your plot in a vector, i.e., regions = c( "Argentina" , "Chile" )

      Delete
    2. I had tried but without satisfactory results, now it works, so maybe something was wrong with my script before, thank you

      Delete

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.