Phylogenetic Tools for Comparative Biology

Monday, September 8, 2025

Ancestral states for a discrete trait when some nodes are known

Following the Evolution meeting this summer, a grad student reached out to me with a question. Unfortunately, it’s taken me longer than I would’ve hoped to respond to him – but he followed up recently, so I thought I’d give it a quick stab.

The question was as follows:

We met at the Evolution conference last week, because I mentioned having an issue with ancestral reconstruction in my talk. You suggested that I email you after the conference. In short, I want to perform a categorical reconstruction using multiple internal nodes having fixed values.

This is straightforward enough. Unfortunately, there is no “black-box” way to do this – we’re gonna need to code it ourselves.

The trick is that to fix one or any number of internal nodes all we need to do is attach a zero length tip to that node and then assign the known state.

To see how to go about doing that, I’m going to need some data first!

## load phytools
library(phytools)
## simulate a small tree
tree<-pbtree(n=26,tip.label=LETTERS,scale=1)
## set a value of Q
Q<-matrix(c(-1,1,0,1,-2,1,0,1,-1),3,3,
  dimnames=list(letters[1:3],letters[1:3]))
## simulate data
xx<-sim.Mk(tree,Q,internal=TRUE)
xx

##  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z 27 28 29 30 31 32 33 34 35 36 37 
##  a  b  b  c  c  c  b  c  c  c  c  c  c  c  b  b  b  b  c  c  c  c  c  c  c  b  b  b  b  b  b  c  c  c  c  c  c 
## 38 39 40 41 42 43 44 45 46 47 48 49 50 51 
##  c  c  c  c  c  c  b  b  b  c  c  c  c  c 
## Levels: a b c

Here’s our observed data and all of the known states at internal states graphed onto the tree.

plotTree(tree,offset=1,direction="upwards",
  lwd=1)
pp<-get("last_plot.phylo",envir=.PlotPhyloEnv)
cols<-setNames(palette.colors(3,"Polychrome 36"),
  letters[1:3])
points(pp$xx,pp$yy,pch=16,col=cols[xx],cex=1.5)

plot of chunk unnamed-chunk-97

Our data include the states at all nodes. This is an unrealistic approximation of our scenario in which it’s more likely that we have a sample of nodes. Since our tree contains 25 internal nodes, let’s imagine that we have values for just 5 of these, as follows.

nn<-xx[sort(sample(1:tree$Nnode+Ntip(tree),10))]
nn

## 29 32 34 35 37 38 42 44 48 49 
##  b  c  c  c  c  c  c  b  c  c 
## Levels: a b c

These data are labeled according to the node indices of our "phylo" object, as follows.

plotTree(tree,offset=1,direction="upwards",
  lwd=1)
nodelabels(cex=0.6,bg="white")

plot of chunk unnamed-chunk-99

Our tip data, on the other hand, can be subsampled to only the species in our tree.

x<-xx[tree$tip.label]
x

## A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
## a b b c c c b c c c c c c c b b b b c c c c c c c b 
## Levels: a b c

For our next step, we’re going to go through and add zero length tips to each of the internal nodes from nn. The only problem is that our node indices change with every new tip that we add! The trick here is to use phytools::matchNodes to update the node we’re adding the tip to after every new taxon is added.

nntree<-tree
for(i in 1:length(nn)){
  M<-matchNodes(tree,nntree,method="distances")
  nntree<-bind.tip(nntree,names(nn)[i],edge.length=0,
    where=M[which(M[,1]==as.numeric(names(nn)[i])),2])
}

Let’s check our result.

par(mfrow=c(1,2))
plotTree(tree,fsize=0.8,lwd=1)
nodelabels(cex=0.6)
plotTree(nntree,fsize=0.8,lwd=1)

plot of chunk unnamed-chunk-102

Really, that looks exactly right.

Next, we can fit our desired Mk model, and we should do so to the full tree & internal states – since, genuinely, known internal states provide us with information about the evolutionary process of our trait. I’m going to fit a custom model that matches our known generating process for the trait.

MODEL<-matrix(c(0,1,0,1,0,1,0,1,0),3,3)
mk_fit<-fitMk(nntree,c(x,nn),model=MODEL)
mk_fit

## Object of class "fitMk".
## 
## Fitted (or set) value of Q:
##           a         b         c
## a -0.513978  0.513978  0.000000
## b  0.513978 -1.027956  0.513978
## c  0.000000  0.513978 -0.513978
## 
## Fitted (or set) value of pi:
##        a        b        c 
## 0.333333 0.333333 0.333333 
## due to treating the root prior as (a) flat.
## 
## Log-likelihood: -15.088024 
## 
## Optimization method used was "nlminb"
## 
## R thinks it has found the ML solution.

Now to get our ancestral states, we just need to pass this fitted model object to phytools::ancr.

anc_mk<-ancr(mk_fit)
anc_mk

## Marginal ancestral state estimates:
##           a        b        c
## 37 0.150557 0.629894 0.219549
## 38 0.019778 0.705542 0.274680
## 39 0.000000 1.000000 0.000000
## 40 0.000199 0.953008 0.046794
## 41 0.002164 0.981934 0.015902
## 42 0.000000 0.000000 1.000000
## ...
## 
## Log-likelihood = -15.088024

We can plot our results.

plot(anc_mk,
  args.plotTree=list(direction="upwards"),
  args.tiplabels=list(cex=0.4))

plot of chunk unnamed-chunk-105

For fun, let’s compare this result to the true values and to those that we would’ve obtained without the known internal states.

par(mfrow=c(1,3))
plotTree(tree,offset=0.5,lwd=1,ftype="i",
  mar=c(0.1,1.1,2.1,0.1))
pp<-get("last_plot.phylo",envir=.PlotPhyloEnv)
cols<-setNames(palette.colors(3,"Polychrome 36"),
  letters[1:3])
points(pp$xx,pp$yy,pch=16,col=cols[xx],cex=1.5)
mtext("a) true states",line=0,adj=0,cex=0.7)
plot(anc_mk,
  args.plotTree=list(mar=c(0.1,1.1,2.1,0.1),fsize=0.8),
  args.tiplabels=list(cex=0.8),
  args.nodelabels=list(cex=0.8),
  legend=FALSE)
mtext("b) estimates with some known nodes",line=0,adj=0,
  cex=0.7)
plot(ancr(fitMk(tree,x,model=MODEL)),
  args.plotTree=list(mar=c(0.1,1.1,2.1,0.1),fsize=0.8),
  args.tiplabels=list(cex=0.8),
  args.nodelabels=list(cex=0.8),
  legend=FALSE)
mtext("c) estimates based only on tip data",line=0,adj=0,
  cex=0.7)

plot of chunk unnamed-chunk-106

In general, we should find that our internal nodes are more accurate, and more confident – particularly when close to known nodes of the tree – than when only tip data is used.

That’s pretty much it!

Thursday, July 17, 2025

A Pagel '94 type correlation model with one binary trait and one continuous character for phytools

Just yesterday I posted code illustrating how to fit a Pagel (1994) type correlational trait evolution model, but for one binary and a second continuous trait using the discretized diffusion approximation of Boucher & Démery (2016).

The basic idea is that the “forward” (i.e., \(0 \rightarrow 1\)) and backward (i.e., \(1 \rightarrow 0\)) transition rates for our discrete character are determined as some function of our continuous trait. I arbitrarily focused on a sigmoid curve described by four parameters: the lower and upper asymptotes; the inflection point; and the “steepness” of the curve.

Just like the Pagel (1994) model, this needn’t correspond to a correlation and will only do so (in a meaningful way) if certain character combinations tend to disproportionately accumulate under the evolutionary process: e.g., large things in state a and small things in state b, or similar. This model also might be a good explanation for data that arise with a high back-and-forth transition rate for the discrete trait when the continuous character value is low and the converse.

I have now implemented this method in a new phytools function, fitcontMk, which can be obtained by updating phytools from GitHub. For the full code of this function and its S3 methods (which should be considered very \(\beta\)) you can see GitHub.

library(phytools)
packageVersion("phytools")

## [1] '2.5.0'

To illustrate use of this new method, we (of course) must have data that arose under our hypothesized process. In my earlier post I demonstrated this simulation, so I won’t reiterate that explanation here.

## number of taxa in my tree
N<-500
tree<-pbtree(n=N,scale=10)
## number of steps in simulation
nn<-400
## this creates a tree with many singleton
## nodes
h<-max(nodeHeights(tree))
tt<-map.to.singleton(
  make.era.map(tree,
    limits=seq(0,h-h/nn,length.out=nn))
)
## simulate Brownian evolution
y<-fastBM(tt,internal=TRUE,sig2=0.25)
## re-center to zero
y<-y-mean(y)
## compute the trait mean on each edge
edge_y<-rowMeans(
  matrix(y[tt$edge],nrow(tt$edge),
    ncol(tt$edge)))
## here's our sigmoid functions
q01<-2+(0.01-2)/(1+exp(-5*(edge_y-0.1)))
q10<-0.01+(2-0.01)/(1+exp(-5*(edge_y+0.1)))
## visualize them
par(mfrow=c(2,1),mar=c(5.1,4.1,1.1,1.1),
  lend=1)
ii<-order(edge_y)
forward<-q01[ii]
plot(edge_y[ii],forward,type="l",
  xlab="continuous trait",
  ylab="transition rate (a to b)",
  las=1,bty="n",cex.axis=0.6,
  ylim=c(0,max(c(q01,q10))))
box(col="grey")
grid()
backward<-q10[ii]
plot(edge_y[ii],backward,type="l",
  xlab="continuous trait",
  ylab="transition rate (b to a)",
  las=1,bty="n",cex.axis=0.6,
  ylim=c(0,max(c(q01,q10))))
box(col="grey")
grid()

plot of chunk unnamed-chunk-8

Now, let’s simulate data for our discrete trait given the scenario illustrated above:

x<-vector(length=Ntip(tt)+Nnode(tt))
x[Ntip(tt)+1]<-sample(1:2,1)
for(i in 1:nrow(tt$edge)){
  Q<-matrix(
    c(-q01[i],q01[i],q10[i],-q10[i]),2,2,
    byrow=TRUE)
  P<-expm::expm(Q*tt$edge.length[i])
  x[tt$edge[i,2]]<-sample(1:2,1,
    prob=P[x[tt$edge[i,1]],])
}
COLS<-hcl.colors(n=100)
cols<-COLS[round((edge_y-min(edge_y))/
    diff(range(edge_y))*99)+1]
par(mfrow=c(1,2),mar=c(1.1,1.1,1.1,1.1))
plot(tt,edge.color=cols,edge.width=1,
  show.tip.label=FALSE,y.lim=c(-0.1*N,N))
add.color.bar(0.75*max(nodeHeights(tree)),
  cols=COLS,title="continuous trait",
  lims=round(range(y),2),digits=2,
  prompt=FALSE,outline=FALSE,
  x=0.125*max(nodeHeights(tree)),y=-0.1*N,
  subtitle="")
COLS<-hcl.colors(n=2)
cols=COLS[x[tt$edge[,2]]]
plot(tt,edge.color=cols,edge.width=1,
  show.tip.label=FALSE,direction="leftwards",
  y.lim=c(-0.1*N,N))
add.simmap.legend(leg=letters[1:2],
  colors=setNames(COLS,letters[1:2]),prompt=FALSE,
  vertical=FALSE,x=0.5*max(nodeHeights(tree)),
  y=-0.1*N)

plot of chunk unnamed-chunk-9

This simulation condition should’ve produced a fairly strong associating between low values of the continuous character and level b of the discrete trait, and high values of the continuous character with level a of our discrete trait.

Now let’s peel out the tip values from our simulation.

## continuous character
y<-y[1:Ntip(tree)]
## discrete trait
x<-setNames(letters[1:2][x[1:Ntip(tt)]],
  tt$tip.label)
rm(list=setdiff(ls(),c("tree","x","y")))

Awesome.

Finally, we can fit our model using fitcontMk. For users testing this, I recommend you turn on the argument trace=2 so that you can get updates of the optimization as it runs. (Otherwise just be prepared to stare at a blank screen for several hours!)

fitted_model<-fitcontMk(tree,x,y,parallel=TRUE,
  ncores=10,levs=100,maxit=5000)
fitted_model

## Object of class "fitcontMk" based on
##     a discretization with k = 100 levels.
## 
## Fitted sigmoidal continuous-trait dependent
##     M2 model parameters:
## 
##  states: [ a, b ]
##   sigsq:  0.2694 
##     q01: [ 1.1872, 0.0192 ]
##     q10: [ 0, 2.2724 ]
##       B: [ 20.0787, 4.7859 ]
##       M: [ 0.5481, -0.068 ]
## 
## log(L): -629.3899 (df = 9)
## 
## R thinks it has found the ML solution.

Remember, our generating values for simulation were \(\sigma^2 = 0.25\), \(q_{0,1} = [2,0.01]\), \(q_{1,0} = [0.01,2]\), \(B_{0,1} = B_{1,0} = 5.0\), \(M_{0,1} = 0.1\), and \(M_{1,0} = -0.1\). Did we come close?

Let’s plot and compare.

par(lend=1)
obj<-plot(fitted_model)
xx<-seq(min(y),max(y),length.out=100)
q01<-2+(0.01-2)/(1+exp(-5*(xx-0.1)))
q10<-0.01+(2-0.01)/(1+exp(-5*(xx+0.1)))
par(mfg=obj$mfg[[1]])
par(usr=obj$usr[[1]])
lines(xx,q01,col=make.transparent("blue",0.5),lwd=5)
legend("topleft",legend=c("estimated","generating"),
  lwd=c(1,5),col=c("black",make.transparent("blue",0.5)))
par(mfg=obj$mfg[[2]])
par(usr=obj$usr[[2]])
lines(xx,q10,col=make.transparent("blue",0.5),lwd=5)
legend("topleft",legend=c("estimated","generating"),
  lwd=c(1,5),col=c("black",make.transparent("blue",0.5)))

plot of chunk unnamed-chunk-1

Could be worse!

Sunday, July 13, 2025

How to create a traitgram with the horizontal axis reversed.... (It's pretty easy.)

A phytools user recently asked about flipping the horizontal (“time”) axis on a traitgram: projection of the tree into phenotype space using phytools::phenogram.

This is quite easy to do, so I thought I’d through up a quick demo, as follows:

library(phytools)

## load data from Garland et al. (1992)
data(mammal.tree)
data(mammal.data)

## extract character of interest
ln.bodyMass<-log(setNames(mammal.data$bodyMass,
  rownames(mammal.data)))

## get total tree height: we'll use this later
max(nodeHeights(mammal.tree))

## [1] 70

## plot traitgram, no axes
par(mar=c(5.1,4.1,1.1,1.1)) ## update margins
phenogram(mammal.tree,ln.bodyMass,ftype="i",
  spread.cost=c(1,0),fsize=0.7,color=palette()[4],
  axes=FALSE,quiet=TRUE)

## add horizontal axis running backwards
axis(1,at=seq(0,70,by=10),labels=seq(70,0,by=-10))
mtext("mybp",1,line=3,at=35)

## add vertical axis running forwards
axis(2,las=1)
title(ylab="log(body mass, kg)")

plot of chunk unnamed-chunk-13

That’s all there is to it.