Wednesday, July 15, 2020

Projecting a continuous character onto a tree using different edge widths

Earlier today, I received the following message from a phytools user:

“I am calculating rates of change of a character trait and then plotting that change on a phylogenetic tree using the function you created in R, plotBranchbyTrait(). While, I managed to show the differences in rates of change using different colors, I was wondering if you have also created a function that represents branch thickness in each branch of the tree proportionally to the value of that branch (e.g. branches evolving at a faster rate will be represented with a thicker line etc.). If not, do you have any advice of how I would be able to do that?”

Tempted as I was to show him how to do this using phytools (which is possible, though complicated), the more responsible response was to point him towards the help page in ape::plot.phylo which documents the argument edge.width as follows:

edge.width – a numeric vector giving the width of the branches of the plotted phylogeny. These are taken to be in the same order than the component edge of phy. If fewer widths are given than the length of edge, then these are recycled.

I also cautioned that (depending on our plotting device) R might render these as only whole integer (I know that's redundant - but just to be clear) values: i.e., 0.5 to 1.5 as 1; 1.5 to 2.5 as 2; and so on.

Nonetheless, I thought it could be interesting to illustrate this method using a real example, in which I set the width of each plotted edge of the tree to match the “mean” (average of parent & daughter known or reconstructed values) of a continuous trait on the tree.

Let's go:

library(phytools)
data(mammal.tree)
## check our tree
plotTree(mammal.tree,fsize=0.8,ftype="i")

plot of chunk unnamed-chunk-1

data(mammal.data)
## check our data
mammal.data
##                    bodyMass homeRange
## U_maritimus          265.00   115.600
## U_arctos             251.30    82.800
## U_americanus          93.40    56.800
## N_narica               4.40     1.050
## P_locor                7.00     1.140
## M_mephitis             2.50     2.500
## M_meles               11.60     0.870
## C_lupus               35.30   202.800
## C_latrans             13.30    45.000
## L_pictus              20.00   160.000
## C_aureus               8.80     9.100
## U_cinereoargenteus     3.70     1.100
## V_fulva                4.80     3.870
## H_hyaena              26.80   152.800
## C_crocuta             52.00    25.000
## A_jubatus             58.80    62.100
## P_pardus              52.40    23.200
## P_tigris             161.00    69.600
## P_leo                155.80   236.000
## T_bairdii            250.00     2.000
## C_simum             2000.00     6.650
## D_bicornis          1200.00    15.600
## E_hemionus           200.00    35.000
## E_caballus           350.00    22.500
## E_burchelli          235.00   165.000
## L_guanicoe            95.00     0.500
## C_dromedarius        550.00   100.000
## G_camelopardalis    1075.00    84.600
## S_caffer            6210.00   138.000
## B_bison              865.00   133.000
## T_oryx               511.00    87.500
## G_granti              62.50    20.000
## G_thomsonii           20.50     5.300
## A_cervicapra          37.50     6.500
## M_kirki                5.00     0.043
## O_americanus         113.50    22.750
## O_canadensis          85.00    14.330
## H_equinus            226.50    80.000
## A_melampus            53.25     3.800
## C_taurinus           216.00    75.000
## D_lunatus            130.00     2.200
## A_buselaphus         136.00     5.000
## A_americana           50.00    10.000
## C_canadensis         300.00    12.930
## D_dama                55.00     1.300
## A_alces              384.00    16.100
## R_tarandus           100.00    30.000
## O_virginianus         57.00     1.960
## O_hemionus            74.00     2.850
## pull out body mass on log scale
lnBodyMass<-setNames(log(mammal.data$bodyMass),
    rownames(mammal.data))
## reconstruct body mass on the tree
a<-fastAnc(mammal.tree,lnBodyMass)
a
## Ancestral character estimates using fastAnc:
##       50       51       52       53       54       55       56       57       58       59       60 
## 4.640573 3.941365 3.580030 3.378235 5.008602 5.417042 2.506860 1.783328 2.075654 2.092616 2.102696 
##       61       62       63       64       65       66       67       68       69       70       71 
## 2.427963 2.879837 2.935833 4.003930 3.660685 4.315032 4.607959 4.760302 4.873642 5.317474 5.559303 
##       72       73       74       75       76       77       78       79       80       81       82 
## 7.078588 5.517309 5.552671 5.192977 5.370860 5.348416 5.312073 5.325223 5.429285 6.920095 4.688839 
##       83       84       85       86       87       88       89       90       91       92       93 
## 3.685079 3.636850 3.590634 4.698303 4.603683 4.874816 4.790504 4.882241 4.886430 5.262581 5.009917 
##       94       95       96       97 
## 4.889860 4.940759 4.842665 4.247903

OK, now for the hard part. Our edge.width vector needs to be in the order of the edges of mammal.tree$edge. As such, what I'm going to do is go through each edge of this matrix, find the observed or reconstructed value of the parent and daughter nodes, and average them into a single vector. I'm going to do this with apply, but I could have also done it using a for loop.

First, though, I will sort all of my trait values into a single long vector with an order that corresponds to the node indices.

## get node values for all nodes (including tips)
node.values<-c(lnBodyMass[mammal.tree$tip.label],
    unclass(a))
node.values
##        U_maritimus           U_arctos       U_americanus           N_narica            P_locor 
##          5.5797298          5.5266474          4.5368913          1.4816045          1.9459101 
##         M_mephitis            M_meles            C_lupus          C_latrans           L_pictus 
##          0.9162907          2.4510051          3.5638830          2.5877640          2.9957323 
##           C_aureus U_cinereoargenteus            V_fulva           H_hyaena          C_crocuta 
##          2.1747517          1.3083328          1.5686159          3.2884019          3.9512437 
##          A_jubatus           P_pardus           P_tigris              P_leo          T_bairdii 
##          4.0741419          3.9589066          5.0814044          5.0485731          5.5214609 
##            C_simum         D_bicornis         E_hemionus         E_caballus        E_burchelli 
##          7.6009025          7.0900768          5.2983174          5.8579332          5.4595855 
##         L_guanicoe      C_dromedarius   G_camelopardalis           S_caffer            B_bison 
##          4.5538769          6.3099183          6.9800759          8.7339162          6.7627295 
##             T_oryx           G_granti        G_thomsonii       A_cervicapra            M_kirki 
##          6.2363696          4.1351666          3.0204249          3.6243409          1.6094379 
##       O_americanus       O_canadensis          H_equinus         A_melampus         C_taurinus 
##          4.7318028          4.4426513          5.4227449          3.9749978          5.3752784 
##          D_lunatus       A_buselaphus        A_americana       C_canadensis             D_dama 
##          4.8675344          4.9126549          3.9120230          5.7037825          4.0073332 
##            A_alces         R_tarandus      O_virginianus         O_hemionus                 50 
##          5.9506426          4.6051702          4.0430513          4.3040651          4.6405728 
##                 51                 52                 53                 54                 55 
##          3.9413651          3.5800304          3.3782346          5.0086025          5.4170421 
##                 56                 57                 58                 59                 60 
##          2.5068605          1.7833278          2.0756539          2.0926160          2.1026959 
##                 61                 62                 63                 64                 65 
##          2.4279633          2.8798366          2.9358328          4.0039298          3.6606852 
##                 66                 67                 68                 69                 70 
##          4.3150319          4.6079589          4.7603022          4.8736421          5.3174738 
##                 71                 72                 73                 74                 75 
##          5.5593032          7.0785882          5.5173085          5.5526712          5.1929774 
##                 76                 77                 78                 79                 80 
##          5.3708596          5.3484157          5.3120732          5.3252231          5.4292850 
##                 81                 82                 83                 84                 85 
##          6.9200950          4.6888392          3.6850793          3.6368500          3.5906336 
##                 86                 87                 88                 89                 90 
##          4.6983032          4.6036828          4.8748164          4.7905038          4.8822412 
##                 91                 92                 93                 94                 95 
##          4.8864297          5.2625810          5.0099171          4.8898599          4.9407594 
##                 96                 97 
##          4.8426647          4.2479034
## get edge values by averaging parent & daughter nodes
## for each edge
edge.values<-apply(mammal.tree$edge,1,function(e,nv)
    mean(nv[e]),nv=node.values)
edge.values
##  [1] 4.290969 3.760698 3.479132 4.193419 5.212822 5.498386 5.471845 4.772747 2.942548 2.145094 1.632466
## [12] 1.864619 2.291257 1.495972 2.263329 2.836323 2.097656 2.265330 2.653900 2.907835 3.249858 2.761798
## [23] 2.937784 2.301358 1.705514 1.830616 3.972647 3.832308 3.474544 3.805964 4.159481 4.194587 4.461495
## [34] 4.283433 4.684131 4.920853 4.904438 4.757107 5.095558 5.438389 5.540382 6.318946 7.339745 7.084333
## [45] 5.417391 5.407813 5.534990 5.705302 5.506128 5.033310 5.281918 4.962368 5.840389 5.270697 6.164246
## [56] 5.330244 5.318648 5.377254 6.174690 7.827006 6.841412 5.832827 5.007031 4.186959 3.660965 3.613742
## [67] 3.862900 3.305529 3.630595 2.647259 4.693571 4.650993 4.667743 4.523167 4.786560 5.148781 4.832660
## [78] 4.382751 4.836372 5.128760 4.884335 4.876982 4.899542 5.287327 4.587302 5.136249 4.949888 5.296821
## [89] 4.448597 4.975338 5.445701 4.891712 4.723917 4.545284 4.145477 4.275984

OK, now we're ready. I'm going to round my edge widths to have values between 1 & 10. I'll do this using a simple rescale of my current values:

edge.widths<-edge.values-min(edge.values)
edge.widths<-edge.widths*(9/max(edge.widths))+1

OK, now let's plot:

par(lend=2)
plot(mammal.tree,no.margin=TRUE,cex=0.7,edge.width=edge.widths,label.offset=1)

plot of chunk unnamed-chunk-4

Lastly, for fun why don't we try to add a legend!

par(lend=2)
plot(mammal.tree,no.margin=TRUE,cex=0.7,edge.width=edge.widths,
    label.offset=1,edge.col="darkgrey")
leg.length<-30
nsegs<-20
segments(x0=seq(0,(1-1/nsegs)*leg.length,length.out=nsegs),
    y0=rep(Ntip(mammal.tree),nsegs),
    x1=seq(1/nsegs*leg.length,leg.length,length.out=nsegs),
    y1=rep(Ntip(mammal.tree),nsegs),
    lwd=seq(1,10,length.out=nsegs),col="darkgrey")
text(0,Ntip(mammal.tree),round(min(edge.values),2),pos=1,
    cex=0.8)
text(leg.length,Ntip(mammal.tree),round(max(edge.values),2),
    pos=1,cex=0.8)
text(leg.length/2,Ntip(mammal.tree),"log(body size kg)",
    pos=1,cex=0.8)

plot of chunk unnamed-chunk-5

That's not bad, right?

No comments:

Post a Comment

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.