I recently received an email with the following inquiry:

*“I wondered if you had any insight or advice on how the make.simmap function in phytools works when the input data have multiple rows per species? It certainly does run, but I am unclear on how exactly it's dealing with this. Specifically, I'm working with a dataset of a 3-state categorical trait, but for which I have between 1 and 42 (mean 3) datapoints per species …. I initially fed the trait data directly to make.simmap, and it did run fine and give reasonable-looking estimates for internal nodes. However, it also allocated a single state (without exception) to the tips, so I'm not sure if this is choosing the modal value, the first value, or something else here?*

*As an alternative, I converted the raw data to a matrix of proportions of each state per species (in a similar manner as incorporating uncertainty in tip state, as you described here), based on the observed proportions for each species. This gives very similar estimates for ancestral states, perhaps within the bounds of the inherent stochasticity of the method, though different enough in places that I was suspicious that it's doing something different from my first approach. It also (of course) gives tip states that are as variable as the data they derived from, as you'd expect. However, I wonder whether this approach, which essentially equates uncertainty in tip state with intraspecific variability, is a valid way of treating these data.*

*In summary, do you have anything you can add concerning the appropriateness of either of these approaches (or any other suggestions) for dealing with N-state categorical traits which are variable within species?”*

First, in the former case `make.simmap` is most definitely *not* taking intraspecific polymorphism into account. Rather, as the user suspected, it is most likely just using the first observation for each species.
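One plausible mechanism, easy to verify in base R: when a named vector contains duplicated names, indexing by name returns only the first matching element, so multiple rows per species would silently collapse to the first one. (The vector `x` below is made up for illustration.)

```r
# Indexing a named vector by name returns only the FIRST element with that
# name, so duplicated per-species entries collapse to the first observation.
x <- setNames(c("A", "B", "B"), c("sp1", "sp1", "sp1"))
x["sp1"]
## sp1
## "A"
```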

The second alternative is more interesting. Although it may seem reasonable, it is also not correct. Saying that a particular tip taxon has a prior probability of being in state *A* of P(*A*) = 0.5 and P(*B*) = 0.5 implies that the taxon could be *either* in state *A* or in state *B* with equal probability, not that (say) 50% of the population is in state *A* and the other 50% is in state *B*.
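To make this distinction concrete, here is a minimal base-R sketch (with hypothetical tips `t1`–`t3`) of the kind of tip prior matrix that `make.simmap` accepts: each row sums to 1 and gives the probability that the tip is in each single state, not the frequency of each state within the species.

```r
# Tip prior probabilities for make.simmap: rows are tips, columns are states.
# Row t2 says "t2 is EITHER A or B, with equal probability" -- it does NOT
# say "half of t2's population is in A and half in B".
tips <- c("t1", "t2", "t3")
states <- c("A", "B")
priors <- matrix(0, length(tips), length(states),
    dimnames = list(tips, states))
priors["t1", ] <- c(1, 0)     # t1 observed in state A
priors["t2", ] <- c(0.5, 0.5) # t2 uncertain: A or B
priors["t3", ] <- c(0, 1)     # t3 observed in state B
priors
```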

Though this may be somewhat unsatisfying, the simplest way to deal with intraspecific polymorphism is to re-code our character as an
additional state. That is, instead of having character states *A* and *B*, we instead have *A*, *A+B*, and *B*.
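For instance, one simple way to collapse raw per-species observations into such polymorphic codes in base R (the list `obs` and the species names are hypothetical):

```r
# Collapse multiple observations per species into a single state code:
# monomorphic species keep their state; polymorphic ones get "A+B", etc.
obs <- list(sp1 = c("A", "A", "A"),
    sp2 = c("A", "B", "A", "B"),
    sp3 = c("B"))
poly <- sapply(obs, function(x) paste(sort(unique(x)), collapse = "+"))
poly
## sp1   sp2   sp3
## "A" "A+B"   "B"
```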

Unfortunately, `phytools::make.simmap` presently doesn't automatically know how to handle data in this form. The function `fitpolyMk` does, however, and it is relatively straightforward to integrate it with `make.simmap`. I'll demonstrate how next.

Note that I just pushed a very small update to `fitpolyMk` to make this easier, so to follow this demo exactly it is necessary to update *phytools* from GitHub.

Let's get some data & a tree for our demo. Here are mine:

```
library(phytools)
packageVersion("phytools")
```

```
## [1] '0.7.2'
```

```
poly.data<-read.csv(file="http://www.phytools.org/TS2019/ex/4c/polymorphic-data.csv",
row.names=1)
y<-setNames(poly.data[,1],rownames(poly.data))
y
```

```
## t70 t71 t47 t169 t170 t67 t68 t62 t59 t60 t173 t174 t19 t20 t21
## C+D C+D C+D C+D C+D C+D B+C C+D C C B B B B+C B
## t15 t50 t109 t123 t124 t104 t105 t52 t53 t167 t168 t141 t74 t66 t48
## B B+C B+C B+C B+C B+C B+C B+C B+C C C C C C+D C
## t181 t182 t2 t39 t40 t4 t27 t191 t192 t28 t145 t146 t77 t8 t193
## C+D C+D C C C D C B+C C D C+D C+D C+D D D
## t194 t96 t45 t86 t87 t13 t6 t125 t183 t184 t16 t17 t135 t136 t147
## D D D D D C+D C+D C+D C C C C C C C
## t148 t83 t88 t89 t78 t79 t24 t187 t188 t36 t29 t99 t189 t190 t115
## C C B+C C C C C C C C D C+D C C C
## t119 t120 t121 t122 t46 t49 t165 t166 t112 t5 t106 t126 t142 t143 t98
## C C C C C C C+D C+D C+D B+C B+C B+C B+C B+C C
## t179 t180 t144 t113 t114 t18 t30 t155 t156 t11 t128 t129 t127 t64 t65
## B+C B+C B+C B+C B+C B+C C C C B+C C C+D C C+D C+D
## t153 t154 t185 t186 t80 t81 t33 t43 t151 t152 t82 t97 t132 t133 t61
## C C C C B B C D C+D C+D C+D D D D D
## t3 t102 t103 t35 t69 t149 t150 t7 t92 t93 t110 t111 t197 t198 t161
## A+B B+C B+C B+C A+B B B C+D C+D C+D C+D C+D C C C
## t157 t158 t12 t9 t41 t42 t25 t37 t38 t26 t10 t90 t94 t95 t54
## C C B+C C C C C C C C B+C C+D C+D C+D C
## t55 t91 t171 t172 t162 t163 t51 t32 t177 t178 t63 t84 t85 t44 t56
## C D C C C+D C+D C+D C C C B+C C C C C
## t199 t200 t107 t108 t130 t131 t31 t137 t138 t22 t139 t140 t118 t72 t73
## C C C C C C C C+D C C C+D C+D C+D C+D C
## t76 t159 t160 t14 t75 t134 t195 t196 t164 t34 t100 t101 t116 t117 t23
## C+D C+D C+D D C+D C+D C+D C+D C+D D D D C+D C C+D
## t175 t176 t57 t58 t1
## C+D C+D B+C B+C C+D
## Levels: A+B B B+C C C+D D
```

```
poly.tree<-read.tree(file="http://www.phytools.org/TS2019/ex/4c/polymorphic-tree.phy")
```

Just for fun, and to show ourselves that `fitpolyMk` can handle it, let's flip the order of some of our polymorphic conditions. In other words, we will make state *A + B* a mixture of taxa coded as `A+B` and `B+A`, and so on.

```
y<-setNames(as.character(y),names(y))
ii<-which(y=="A+B")
y[sample(ii,round(0.5*length(ii)))]<-"B+A"
ii<-which(y=="C+D")
y[sample(ii,round(0.5*length(ii)))]<-"D+C"
y<-as.factor(y)
y
```

```
## t70 t71 t47 t169 t170 t67 t68 t62 t59 t60 t173 t174 t19 t20 t21
## D+C C+D C+D C+D D+C C+D B+C C+D C C B B B B+C B
## t15 t50 t109 t123 t124 t104 t105 t52 t53 t167 t168 t141 t74 t66 t48
## B B+C B+C B+C B+C B+C B+C B+C B+C C C C C C+D C
## t181 t182 t2 t39 t40 t4 t27 t191 t192 t28 t145 t146 t77 t8 t193
## C+D D+C C C C D C B+C C D D+C C+D D+C D D
## t194 t96 t45 t86 t87 t13 t6 t125 t183 t184 t16 t17 t135 t136 t147
## D D D D D C+D C+D C+D C C C C C C C
## t148 t83 t88 t89 t78 t79 t24 t187 t188 t36 t29 t99 t189 t190 t115
## C C B+C C C C C C C C D C+D C C C
## t119 t120 t121 t122 t46 t49 t165 t166 t112 t5 t106 t126 t142 t143 t98
## C C C C C C D+C C+D C+D B+C B+C B+C B+C B+C C
## t179 t180 t144 t113 t114 t18 t30 t155 t156 t11 t128 t129 t127 t64 t65
## B+C B+C B+C B+C B+C B+C C C C B+C C D+C C C+D C+D
## t153 t154 t185 t186 t80 t81 t33 t43 t151 t152 t82 t97 t132 t133 t61
## C C C C B B C D C+D D+C D+C D D D D
## t3 t102 t103 t35 t69 t149 t150 t7 t92 t93 t110 t111 t197 t198 t161
## A+B B+C B+C B+C B+A B B D+C D+C D+C D+C D+C C C C
## t157 t158 t12 t9 t41 t42 t25 t37 t38 t26 t10 t90 t94 t95 t54
## C C B+C C C C C C C C B+C D+C C+D C+D C
## t55 t91 t171 t172 t162 t163 t51 t32 t177 t178 t63 t84 t85 t44 t56
## C D C C D+C C+D D+C C C C B+C C C C C
## t199 t200 t107 t108 t130 t131 t31 t137 t138 t22 t139 t140 t118 t72 t73
## C C C C C C C D+C C C D+C C+D C+D D+C C
## t76 t159 t160 t14 t75 t134 t195 t196 t164 t34 t100 t101 t116 t117 t23
## D+C C+D D+C D C+D C+D C+D C+D D+C D D D D+C C D+C
## t175 t176 t57 t58 t1
## D+C D+C B+C B+C D+C
## Levels: A+B B B+A B+C C C+D D D+C
```

Plot our data:

```
plotTree(poly.tree,ftype="off",lwd=1,type="fan")
X<-strsplit(setNames(as.character(y),names(y)),"+",fixed=TRUE)
pies<-matrix(0,Ntip(poly.tree),4,dimnames=list(poly.tree$tip.label,
    c("A","B","C","D")))
for(i in 1:Ntip(poly.tree))
    pies[poly.tree$tip.label[i],X[[poly.tree$tip.label[i]]]]<-
        rep(1/length(X[[poly.tree$tip.label[i]]]),
            length(X[[poly.tree$tip.label[i]]]))
tiplabels(pie=pies,piecol=c("black","yellow","red","blue"),
    cex=0.35)
legend(x="topleft",legend=c("A","B","C","D"),pt.cex=2,pch=21,
    pt.bg=c("black","yellow","red","blue"))
```

Now we'll go ahead and fit our polymorphic evolution model. Here, I'm going to use a symmetric-rates ordered model, just because I happen to know that this model fits the data well:

```
fit<-fitpolyMk(poly.tree,y,model="SYM",ordered=TRUE)
```

```
##
## This is the design matrix of the fitted model. Does it make sense?
##
## A A+B B B+C C C+D D
## A 0 1 0 0 0 0 0
## A+B 1 0 2 0 0 0 0
## B 0 2 0 3 0 0 0
## B+C 0 0 3 0 4 0 0
## C 0 0 0 4 0 5 0
## C+D 0 0 0 0 5 0 6
## D 0 0 0 0 0 6 0
```

```
plot(fit,lwd=2)
```

Now, how can we apply this to stochastic character mapping? Well, this is easier than you'd think, simply because `fitpolyMk` (now) returns both the design matrix of the model and a data matrix of the input data as components of the fitted model object: `$index.matrix` and `$data`, respectively. Let's examine these and then use them as follows:

```
fit$index.matrix
```

```
## A A+B B B+C C C+D D
## A 0 1 0 0 0 0 0
## A+B 1 0 2 0 0 0 0
## B 0 2 0 3 0 0 0
## B+C 0 0 3 0 4 0 0
## C 0 0 0 4 0 5 0
## C+D 0 0 0 0 5 0 6
## D 0 0 0 0 0 6 0
```

```
fit$data
```

```
## A A+B B B+C C C+D D
## t70 0 0 0 0 0 1 0
## t71 0 0 0 0 0 1 0
## t47 0 0 0 0 0 1 0
## t169 0 0 0 0 0 1 0
## t170 0 0 0 0 0 1 0
## t67 0 0 0 0 0 1 0
## t68 0 0 0 1 0 0 0
## t62 0 0 0 0 0 1 0
## t59 0 0 0 0 1 0 0
## t60 0 0 0 0 1 0 0
## t173 0 0 1 0 0 0 0
## t174 0 0 1 0 0 0 0
## t19 0 0 1 0 0 0 0
## t20 0 0 0 1 0 0 0
## t21 0 0 1 0 0 0 0
## t15 0 0 1 0 0 0 0
## t50 0 0 0 1 0 0 0
## t109 0 0 0 1 0 0 0
## t123 0 0 0 1 0 0 0
## t124 0 0 0 1 0 0 0
## t104 0 0 0 1 0 0 0
## t105 0 0 0 1 0 0 0
## t52 0 0 0 1 0 0 0
## t53 0 0 0 1 0 0 0
## t167 0 0 0 0 1 0 0
## t168 0 0 0 0 1 0 0
## t141 0 0 0 0 1 0 0
## t74 0 0 0 0 1 0 0
## t66 0 0 0 0 0 1 0
## t48 0 0 0 0 1 0 0
## t181 0 0 0 0 0 1 0
## t182 0 0 0 0 0 1 0
## t2 0 0 0 0 1 0 0
## t39 0 0 0 0 1 0 0
## t40 0 0 0 0 1 0 0
## t4 0 0 0 0 0 0 1
## t27 0 0 0 0 1 0 0
## t191 0 0 0 1 0 0 0
## t192 0 0 0 0 1 0 0
## t28 0 0 0 0 0 0 1
## t145 0 0 0 0 0 1 0
## t146 0 0 0 0 0 1 0
## t77 0 0 0 0 0 1 0
## t8 0 0 0 0 0 0 1
## t193 0 0 0 0 0 0 1
## t194 0 0 0 0 0 0 1
## t96 0 0 0 0 0 0 1
## t45 0 0 0 0 0 0 1
## t86 0 0 0 0 0 0 1
## t87 0 0 0 0 0 0 1
## t13 0 0 0 0 0 1 0
## t6 0 0 0 0 0 1 0
## t125 0 0 0 0 0 1 0
## t183 0 0 0 0 1 0 0
## t184 0 0 0 0 1 0 0
## t16 0 0 0 0 1 0 0
## t17 0 0 0 0 1 0 0
## t135 0 0 0 0 1 0 0
## t136 0 0 0 0 1 0 0
## t147 0 0 0 0 1 0 0
## t148 0 0 0 0 1 0 0
## t83 0 0 0 0 1 0 0
## t88 0 0 0 1 0 0 0
## t89 0 0 0 0 1 0 0
## t78 0 0 0 0 1 0 0
## t79 0 0 0 0 1 0 0
## t24 0 0 0 0 1 0 0
## t187 0 0 0 0 1 0 0
## t188 0 0 0 0 1 0 0
## t36 0 0 0 0 1 0 0
## t29 0 0 0 0 0 0 1
## t99 0 0 0 0 0 1 0
## t189 0 0 0 0 1 0 0
## t190 0 0 0 0 1 0 0
## t115 0 0 0 0 1 0 0
## t119 0 0 0 0 1 0 0
## t120 0 0 0 0 1 0 0
## t121 0 0 0 0 1 0 0
## t122 0 0 0 0 1 0 0
## t46 0 0 0 0 1 0 0
## t49 0 0 0 0 1 0 0
## t165 0 0 0 0 0 1 0
## t166 0 0 0 0 0 1 0
## t112 0 0 0 0 0 1 0
## t5 0 0 0 1 0 0 0
## t106 0 0 0 1 0 0 0
## t126 0 0 0 1 0 0 0
## t142 0 0 0 1 0 0 0
## t143 0 0 0 1 0 0 0
## t98 0 0 0 0 1 0 0
## t179 0 0 0 1 0 0 0
## t180 0 0 0 1 0 0 0
## t144 0 0 0 1 0 0 0
## t113 0 0 0 1 0 0 0
## t114 0 0 0 1 0 0 0
## t18 0 0 0 1 0 0 0
## t30 0 0 0 0 1 0 0
## t155 0 0 0 0 1 0 0
## t156 0 0 0 0 1 0 0
## t11 0 0 0 1 0 0 0
## t128 0 0 0 0 1 0 0
## t129 0 0 0 0 0 1 0
## t127 0 0 0 0 1 0 0
## t64 0 0 0 0 0 1 0
## t65 0 0 0 0 0 1 0
## t153 0 0 0 0 1 0 0
## t154 0 0 0 0 1 0 0
## t185 0 0 0 0 1 0 0
## t186 0 0 0 0 1 0 0
## t80 0 0 1 0 0 0 0
## t81 0 0 1 0 0 0 0
## t33 0 0 0 0 1 0 0
## t43 0 0 0 0 0 0 1
## t151 0 0 0 0 0 1 0
## t152 0 0 0 0 0 1 0
## t82 0 0 0 0 0 1 0
## t97 0 0 0 0 0 0 1
## t132 0 0 0 0 0 0 1
## t133 0 0 0 0 0 0 1
## t61 0 0 0 0 0 0 1
## t3 0 1 0 0 0 0 0
## t102 0 0 0 1 0 0 0
## t103 0 0 0 1 0 0 0
## t35 0 0 0 1 0 0 0
## t69 0 1 0 0 0 0 0
## t149 0 0 1 0 0 0 0
## t150 0 0 1 0 0 0 0
## t7 0 0 0 0 0 1 0
## t92 0 0 0 0 0 1 0
## t93 0 0 0 0 0 1 0
## t110 0 0 0 0 0 1 0
## t111 0 0 0 0 0 1 0
## t197 0 0 0 0 1 0 0
## t198 0 0 0 0 1 0 0
## t161 0 0 0 0 1 0 0
## t157 0 0 0 0 1 0 0
## t158 0 0 0 0 1 0 0
## t12 0 0 0 1 0 0 0
## t9 0 0 0 0 1 0 0
## t41 0 0 0 0 1 0 0
## t42 0 0 0 0 1 0 0
## t25 0 0 0 0 1 0 0
## t37 0 0 0 0 1 0 0
## t38 0 0 0 0 1 0 0
## t26 0 0 0 0 1 0 0
## t10 0 0 0 1 0 0 0
## t90 0 0 0 0 0 1 0
## t94 0 0 0 0 0 1 0
## t95 0 0 0 0 0 1 0
## t54 0 0 0 0 1 0 0
## t55 0 0 0 0 1 0 0
## t91 0 0 0 0 0 0 1
## t171 0 0 0 0 1 0 0
## t172 0 0 0 0 1 0 0
## t162 0 0 0 0 0 1 0
## t163 0 0 0 0 0 1 0
## t51 0 0 0 0 0 1 0
## t32 0 0 0 0 1 0 0
## t177 0 0 0 0 1 0 0
## t178 0 0 0 0 1 0 0
## t63 0 0 0 1 0 0 0
## t84 0 0 0 0 1 0 0
## t85 0 0 0 0 1 0 0
## t44 0 0 0 0 1 0 0
## t56 0 0 0 0 1 0 0
## t199 0 0 0 0 1 0 0
## t200 0 0 0 0 1 0 0
## t107 0 0 0 0 1 0 0
## t108 0 0 0 0 1 0 0
## t130 0 0 0 0 1 0 0
## t131 0 0 0 0 1 0 0
## t31 0 0 0 0 1 0 0
## t137 0 0 0 0 0 1 0
## t138 0 0 0 0 1 0 0
## t22 0 0 0 0 1 0 0
## t139 0 0 0 0 0 1 0
## t140 0 0 0 0 0 1 0
## t118 0 0 0 0 0 1 0
## t72 0 0 0 0 0 1 0
## t73 0 0 0 0 1 0 0
## t76 0 0 0 0 0 1 0
## t159 0 0 0 0 0 1 0
## t160 0 0 0 0 0 1 0
## t14 0 0 0 0 0 0 1
## t75 0 0 0 0 0 1 0
## t134 0 0 0 0 0 1 0
## t195 0 0 0 0 0 1 0
## t196 0 0 0 0 0 1 0
## t164 0 0 0 0 0 1 0
## t34 0 0 0 0 0 0 1
## t100 0 0 0 0 0 0 1
## t101 0 0 0 0 0 0 1
## t116 0 0 0 0 0 1 0
## t117 0 0 0 0 1 0 0
## t23 0 0 0 0 0 1 0
## t175 0 0 0 0 0 1 0
## t176 0 0 0 0 0 1 0
## t57 0 0 0 1 0 0 0
## t58 0 0 0 1 0 0 0
## t1 0 0 0 0 0 1 0
```

```
simmap.trees<-make.simmap(poly.tree,fit$data,model=fit$index.matrix,nsim=100)
```

```
## make.simmap is sampling character histories conditioned on the transition matrix
##
## Q =
## A A+B B B+C C C+D D
## A 0 0.000000 0.000000 0.000000 0.000000 0.000000 0.00000
## A+B 0 -1.271756 1.271756 0.000000 0.000000 0.000000 0.00000
## B 0 1.271756 -2.333135 1.061379 0.000000 0.000000 0.00000
## B+C 0 0.000000 1.061379 -1.768122 0.706743 0.000000 0.00000
## C 0 0.000000 0.000000 0.706743 -1.913028 1.206285 0.00000
## C+D 0 0.000000 0.000000 0.000000 1.206285 -2.252835 1.04655
## D 0 0.000000 0.000000 0.000000 0.000000 1.046550 -1.04655
## (estimated using likelihood);
## and (mean) root node prior probabilities
## pi =
## A A+B B B+C C C+D D
## 0.1428571 0.1428571 0.1428571 0.1428571 0.1428571 0.1428571 0.1428571
```

```
## Done.
```

```
simmap.trees
```

```
## 100 phylogenetic trees with mapped discrete characters
```

We can plot a summary of this analysis as follows:

```
cols<-setNames(colorRampPalette(c("black","yellow","red","blue"))(7),
colnames(fit$data))
plot(summary(simmap.trees),colors=cols,type="fan",ftype="off")
legend(x="topleft",legend=colnames(fit$data),pt.cex=2.4,pch=21,
pt.bg=cols)
```

However, it might make more sense to do things like combine states before computing posterior probabilities at nodes. The following is just an example that combines `A+B`, `B`, and `B+C` into a single state `B`, and then plots it against a new state `not-B` comprising everything else:

```
b.trees<-mergeMappedStates(simmap.trees,c("A+B","B","B+C"),"B")
b.trees<-mergeMappedStates(b.trees,c("A","C","C+D","D"),"not-B")
cols<-setNames(c("black","yellow"),c("not-B","B"))
plot(summary(b.trees),colors=cols,type="fan",ftype="off")
legend(x="topleft",legend=names(cols),pt.cex=2.4,pch=21,
pt.bg=cols)
```

Or, similarly, using `densityMap`:

```
object<-densityMap(b.trees,states=names(cols),plot=FALSE)
```

```
## sorry - this might take a while; please be patient
```

```
object<-setMap(object,c("black","yellow"))
plot(object,lwd=c(2,6),ftype="off")
```

That's kind of the idea. We can similarly analyze our posterior sample of trees in all of the usual ways, e.g.:

```
object<-density(simmap.trees)
object
```

```
##
## Distribution of changes from stochastic mapping:
##
## A+B->B A+B->B+C
## Min. :0 Min. :0
## Median :0 Median :0
## Mean :0.34 Mean :0
## Max. :2 Max. :0
##
## A+B->C A+B->C+D
## Min. :0 Min. :0
## Median :0 Median :0
## Mean :0 Mean :0
## Max. :0 Max. :0
##
## A+B->D B->A+B
## Min. :0 Min. :2
## Median :0 Median :2
## Mean :0 Mean :2.29
## Max. :0 Max. :4
##
## B->B+C B->C
## Min. :0 Min. :0
## Median :1.5 Median :0
## Mean :1.85 Mean :0
## Max. :6 Max. :0
##
## B->C+D B->D
## Min. :0 Min. :0
## Median :0 Median :0
## Mean :0 Mean :0
## Max. :0 Max. :0
##
## B+C->A+B B+C->B
## Min. :0 Min. :4
## Median :0 Median :7
## Mean :0 Mean :6.67
## Max. :0 Max. :10
##
## B+C->C B+C->C+D
## Min. :1 Min. :0
## Median :3 Median :0
## Mean :3.54 Mean :0
## Max. :9 Max. :0
##
## B+C->D C->A+B
## Min. :0 Min. :0
## Median :0 Median :0
## Mean :0 Mean :0
## Max. :0 Max. :0
##
## C->B C->B+C
## Min. :0 Min. :13
## Median :0 Median :15
## Mean :0 Mean :15.49
## Max. :0 Max. :21
##
## C->C+D C->D
## Min. :18 Min. :0
## Median :24 Median :0
## Mean :23.9 Mean :0
## Max. :30 Max. :0
##
## C+D->A+B C+D->B
## Min. :0 Min. :0
## Median :0 Median :0
## Mean :0 Mean :0
## Max. :0 Max. :0
##
## C+D->B+C C+D->C
## Min. :0 Min. :8
## Median :0 Median :14
## Mean :0 Mean :13.68
## Max. :0 Max. :26
##
## C+D->D D->A+B
## Min. :7 Min. :0
## Median :11 Median :0
## Mean :10.64 Mean :0
## Max. :15 Max. :0
##
## D->B D->B+C
## Min. :0 Min. :0
## Median :0 Median :0
## Mean :0 Mean :0
## Max. :0 Max. :0
##
## D->C D->C+D
## Min. :0 Min. :0
## Median :0 Median :3
## Mean :0 Mean :3.51
## Max. :0 Max. :9
##
## 95% HPD interval(A+B->B): [0, 2]
## 95% HPD interval(A+B->B+C): [0, 0]
## 95% HPD interval(A+B->C): [0, 0]
## 95% HPD interval(A+B->C+D): [0, 0]
## 95% HPD interval(A+B->D): [0, 0]
## 95% HPD interval(B->A+B): [2, 4]
## 95% HPD interval(B->B+C): [0, 5]
## 95% HPD interval(B->C): [0, 0]
## 95% HPD interval(B->C+D): [0, 0]
## 95% HPD interval(B->D): [0, 0]
## 95% HPD interval(B+C->A+B): [0, 0]
## 95% HPD interval(B+C->B): [5, 9]
## 95% HPD interval(B+C->C): [1, 7]
## 95% HPD interval(B+C->C+D): [0, 0]
## 95% HPD interval(B+C->D): [0, 0]
## 95% HPD interval(C->A+B): [0, 0]
## 95% HPD interval(C->B): [0, 0]
## 95% HPD interval(C->B+C): [13, 18]
## 95% HPD interval(C->C+D): [20, 29]
## 95% HPD interval(C->D): [0, 0]
## 95% HPD interval(C+D->A+B): [0, 0]
## 95% HPD interval(C+D->B): [0, 0]
## 95% HPD interval(C+D->B+C): [0, 0]
## 95% HPD interval(C+D->C): [8, 19]
## 95% HPD interval(C+D->D): [8, 14]
## 95% HPD interval(D->A+B): [0, 0]
## 95% HPD interval(D->B): [0, 0]
## 95% HPD interval(D->B+C): [0, 0]
## 95% HPD interval(D->C): [0, 0]
## 95% HPD interval(D->C+D): [0, 7]
```

```
plot(object)
```

Or from our `B` vs. `not-B` analysis:

```
object<-density(b.trees)
object
```

```
##
## Distribution of changes from stochastic mapping:
## B->not-B not-B->B
## Min. :1 Min. :13
## Median :3 Median :15
## Mean :3.54 Mean :15.49
## Max. :9 Max. :21
##
## 95% HPD interval(B->not-B): [1, 7]
## 95% HPD interval(not-B->B): [13, 18]
```

```
par(family="mono")
plot(object)
```

You get the idea.

The very heterogeneous sampling (*“between 1 and 42 (mean 3) datapoints per species”*) raises another interesting point. What if we have only one sample for a species? How can we genuinely say that it is monomorphic? The truth is, we can't; however, if we have a model for our uncertainty, we can easily incorporate it in the same way that we specify uncertainty in the values of monomorphic tip states (as previously discussed).
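As a toy illustration of why a single sample is uninformative (this is just an assumed model, not something phytools computes for you): if a truly polymorphic species carries its minor state at frequency *q*, then the probability that *n* random samples all show the major state is (1 - *q*)^*n*, which is close to 1 for small *n*.

```r
# Probability that n samples from a polymorphic species (assumed
# minor-state frequency q = 0.2) all happen to show the major state:
q <- 0.2
n <- 1:10
p.all.major <- (1 - q)^n
round(p.all.major, 3)
## with n = 1 this is 0.8, so one sample can easily miss the polymorphism
```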

That's all I have to say about this - for now.

Hi Dr. Revell,

Love all your blogs - very clear and informative! I appreciate it a lot.

However, when I was working through this tutorial, I came across a problem with applying the fitted model to the data.

After assigning `fit`, you used `fit$data` to extract the data matrix of the input data. But when I run `fit$data` (using your provided inputs) in my RStudio, it returns `NULL` rather than the data frame you show in the tutorial.

I was wondering if this issue is caused by my phytools version (0.6.99). If so, how can I update my package to 0.7.2, since my phytools appears to be up-to-date?

Thank you so much for your great help!!

Gratefully,

Laura

Dear Liam J. Revell

I follow your blog (http://blog.phytools.org) and have learned a lot about comparative biology from the phytools R package.

You provide a very detailed script. However, when I call `fit$data` after fitting the model, I receive `NULL` at the R prompt instead of the data frame that you show in the tutorial. From that point I could not perform the analysis.

I was wondering if this issue is caused by my phytools version (0.6.99). If so, how can I update my package to 0.7.2, since my phytools appears to be up-to-date?

Thank you so much for your great help!!

Gratefully,

Washington Vieira

After installing the CRAN package 'devtools', this can be done in a fresh R session as follows:

library(devtools)

install_github("liamrevell/phytools")

All the best, Liam

Dear Liam,

Thank you for this very clear and useful post.

I have a question regarding the incorporation of uncertainty for some tips with the fitpolyMk function. Although the conversion of the character states to a matrix with the prior probabilities of each tip works well with the fitMk function, I cannot run it with fitpolyMk (I get the error: 'Error in strsplit(x, "+", fixed = TRUE) : non-character argument'). I was therefore wondering whether it is possible to combine intraspecific polymorphism with character-state uncertainty.

Best regards,

Alex

I also had this error and resolved it by transforming my data with these lines from Liam's code:

y2<-setNames(data[,1],rownames(data))

y2<-as.factor(y2)

Hi,

The code is very useful; there are not many tools for polymorphic data. I have, however, a problem with my data: at the end I get the error '1:nrow(L)': argument of length 0.

I have tried different models (ER, ARD, SYM) and it is still the same. I think this may be related to the rate values, since in each model my rates are NaNs.

Thank you so much for your great help.

Update

Problem solved after changing "ordered=TRUE" to "ordered=FALSE".