A *phytools* user asks:

*“Question – can make.simmap constrain the root state of a phylogeny to a particular state? As in, if we
want to assume the ancestor of primates was a frugivore, can this be done using the current setup given both
frugivorous and folivorous species at the tips?”*

Answer – yes!

This is done using the argument `pi`

which sets prior probabilities for the global root node of the tree.
(Fixing values for other nodes in the tree is also possible, if a tiny bit more complicated.)

Here's an example using piscivory vs. non-piscivory in sunfishes. In this case, the phylogeny is originally from Near et al. (2005).

First, let's load *phytools* plus our tree & data:

```
library(phytools)
data(sunfish.tree)
sunfish.tree<-as.phylo(sunfish.tree)
sunfish.tree
```

```
##
## Phylogenetic tree with 28 tips and 27 internal nodes.
##
## Tip labels:
## Acantharchus_pomotis, Lepomis_gibbosus, Lepomis_microlophus, Lepomis_punctatus, Lepomis_miniatus, Lepomis_auritus, ...
##
## Rooted; includes branch lengths.
```

```
data(sunfish.data)
head(sunfish.data)
```

```
## feeding.mode gape.width buccal.length
## Acantharchus_pomotis pisc 0.114 -0.009
## Lepomis_gibbosus non -0.133 -0.009
## Lepomis_microlophus non -0.151 0.012
## Lepomis_punctatus non -0.103 -0.019
## Lepomis_miniatus non -0.134 0.001
## Lepomis_auritus non -0.222 -0.039
```

Let's get our feeding mode trait out:

```
fmode<-setNames(sunfish.data$feeding.mode,rownames(sunfish.data))
```

Now let's do our stochastic mapping using the default, *flat* prior at the root:

```
flat.prior<-make.simmap(sunfish.tree,fmode,model="ARD",nsim=100)
```

```
## make.simmap is sampling character histories conditioned on
## the transition matrix
##
## Q =
## non pisc
## non -6.087789 6.087789
## pisc 3.048905 -3.048905
## (estimated using likelihood);
## and (mean) root node prior probabilities
## pi =
## non pisc
## 0.5 0.5
```

```
## Done.
```

```
flat.prior
```

```
## 100 phylogenetic trees with mapped discrete characters
```

We can plot the posterior probabilities at all the nodes of the tree using the `summary`

method for the
object class:

```
flat.summary<-summary(flat.prior)
flat.summary
```

```
## 100 trees with a mapped discrete character with states:
## non, pisc
##
## trees have 7.32 changes between states on average
##
## changes are of the following types:
## non,pisc pisc,non
## x->y 4.45 2.87
##
## mean total time spent in each state is:
## non pisc total
## raw 0.7443479 0.9507313 1.695079
## prop 0.4391228 0.5608772 1.000000
```

```
cols<-setNames(palette()[c(4,2)],levels(fmode))
plot(flat.summary,ftype="i",fsize=0.9,colors=cols)
legend("topleft",c("non-piscivorous","piscivorous"),
pch=21,pt.cex=2,pt.bg=cols)
```

Now, let's say we want to fix the root state to *non-piscivory* – for instance, because there's other
evidence external to that from our comparative data indicating that ancestral sunfish were
non-piscivorous rather than piscivorous. We can fix that using the argument `pi`

as follows:

```
constrained<-make.simmap(sunfish.tree,fmode,model="ARD",pi=c(1,0),nsim=100)
```

```
## make.simmap is sampling character histories conditioned on
## the transition matrix
##
## Q =
## non pisc
## non -6.011052 6.011052
## pisc 0.000000 0.000000
## (estimated using likelihood);
## and (mean) root node prior probabilities
## pi =
## [1] 1 0
```

```
## Done.
```

```
constrained
```

```
## 100 phylogenetic trees with mapped discrete characters
```

One thing that the reader might notice is that this new constraint *changes* our inferred substitution
model. (In this case, it is a large change – often the effect is much more subtle.)

```
constrained.summary<-summary(constrained)
plot(constrained.summary,ftype="i",fsize=0.9,colors=cols)
legend("topleft",c("non-piscivorous","piscivorous"),
pch=21,pt.cex=2,pt.bg=cols)
```

Neat.

We can also visualize our result using `densityMap`

:

```
densityMap(constrained,lwd=6,outline=TRUE,fsize=0.9,res=200)
```

```
## sorry - this might take a while; please be patient
```

This is very different from when we left the root node unconstrained!

```
densityMap(flat.prior,lwd=6,outline=TRUE,fsize=0.9,res=200)
```

```
## sorry - this might take a while; please be patient
```

That's it.

Hi Liam, this is not so related to the post but I'm currently working on some phylogenetic analysis using continuous data (length in mm). I have a tree made based on available positioning of species, and I want to use the my length data to adjust branch length. However I am having issues as I have 5 length measurements (some negative some positive) for each of my 90 taxa. How can I use those measurements to return only a single value for each of my specimens, I've tried doing MLE but it only allows me to use a single column not all 5. Any help would be appreciated

ReplyDelete