I recently received an email with the following inquiry:
“I wondered if you had any insight or advice on how the make.simmap
function in phytools works when the input data
have multiple rows per species? It certainly does run but I am unclear on how example it's dealing with this. Specifically, I'm working
with a dataset of a 3-state categorical trait but for which I have between 1 and 42 (mean 3) datapoints per species …. I initially fed
in the trait data directly to make.simmap
, and it did run fine and give reasonable looking estimates for internal nodes.
However, it also allocated a single state (without exception) to the tips, so I'm not sure if this is choosing the modal value, the
first value, or something else here?
As an alternative, I converted the raw data to a matrix of proportions of each state per species (in a similar manner as incorporating uncertainty in tip state as you described here) based on the observed proportions for each species. This gives very similar estimates for ancestral states, perhaps within the bounds of the inherent stochasticity of the method, though different enough in places that I was suspicious that it's doing something different from my first approach. It also (of course) gives tip states that are as variable as the data it derived from, as you'd expect. However I wonder whether this approach, which essentially equates uncertainty in tip state with intraspecific variability, is a valid way of treating this data.
In summary, do you have anything you can add concerning the appropriateness of either of these approaches (or any other suggestions) for dealing with N-state categorical traits which are variable within species?”
First - in the former case make.simmap
is most definitely not taking intraspecific polymorphism into account. Rather,
as the user suspected, it is most-likely just using the first observation for each species.
The second alternative is more interesting. Although this may seem reasonable it is also not correct. Saying that a particular tip taxa has a prior probability of being in state A that is P(A)=0.50 and P(B)=0.50 implies that the taxon could be either state A or B with equal probability - not that (say) 50% of the population is in state A & the other 50% in state B.
Though this may be somewhat unsatisfying, the simplest way to deal with intraspecific polymorphism is to re-code our character as an additional state. That is, instead of having character states A and B, we instead have A, A+B, and B.
Unfortunately, phytools::make.simmap
presently doesn't automatically know how to handle data in this form. The function
fitpolyMk
does, however, and it is relatively straightforward to integrate it into make.simmap
. I'll next
demonstrate how.
Note that I just pushed a very
small update to fitpolyMk
to make this easier - thus to follow this demo exactly it is necessary to update phytools
from GitHub.
Let's get some data & a tree for our demo. Here are mine:
library(phytools)
packageVersion("phytools")
## [1] '0.7.2'
poly.data<-read.csv(file="http://www.phytools.org/TS2019/ex/4c/polymorphic-data.csv",
row.names=1)
y<-setNames(poly.data[,1],rownames(poly.data))
y
## t70 t71 t47 t169 t170 t67 t68 t62 t59 t60 t173 t174 t19 t20 t21
## C+D C+D C+D C+D C+D C+D B+C C+D C C B B B B+C B
## t15 t50 t109 t123 t124 t104 t105 t52 t53 t167 t168 t141 t74 t66 t48
## B B+C B+C B+C B+C B+C B+C B+C B+C C C C C C+D C
## t181 t182 t2 t39 t40 t4 t27 t191 t192 t28 t145 t146 t77 t8 t193
## C+D C+D C C C D C B+C C D C+D C+D C+D D D
## t194 t96 t45 t86 t87 t13 t6 t125 t183 t184 t16 t17 t135 t136 t147
## D D D D D C+D C+D C+D C C C C C C C
## t148 t83 t88 t89 t78 t79 t24 t187 t188 t36 t29 t99 t189 t190 t115
## C C B+C C C C C C C C D C+D C C C
## t119 t120 t121 t122 t46 t49 t165 t166 t112 t5 t106 t126 t142 t143 t98
## C C C C C C C+D C+D C+D B+C B+C B+C B+C B+C C
## t179 t180 t144 t113 t114 t18 t30 t155 t156 t11 t128 t129 t127 t64 t65
## B+C B+C B+C B+C B+C B+C C C C B+C C C+D C C+D C+D
## t153 t154 t185 t186 t80 t81 t33 t43 t151 t152 t82 t97 t132 t133 t61
## C C C C B B C D C+D C+D C+D D D D D
## t3 t102 t103 t35 t69 t149 t150 t7 t92 t93 t110 t111 t197 t198 t161
## A+B B+C B+C B+C A+B B B C+D C+D C+D C+D C+D C C C
## t157 t158 t12 t9 t41 t42 t25 t37 t38 t26 t10 t90 t94 t95 t54
## C C B+C C C C C C C C B+C C+D C+D C+D C
## t55 t91 t171 t172 t162 t163 t51 t32 t177 t178 t63 t84 t85 t44 t56
## C D C C C+D C+D C+D C C C B+C C C C C
## t199 t200 t107 t108 t130 t131 t31 t137 t138 t22 t139 t140 t118 t72 t73
## C C C C C C C C+D C C C+D C+D C+D C+D C
## t76 t159 t160 t14 t75 t134 t195 t196 t164 t34 t100 t101 t116 t117 t23
## C+D C+D C+D D C+D C+D C+D C+D C+D D D D C+D C C+D
## t175 t176 t57 t58 t1
## C+D C+D B+C B+C C+D
## Levels: A+B B B+C C C+D D
poly.tree<-read.tree(file="http://www.phytools.org/TS2019/ex/4c/polymorphic-tree.phy")
Just for fun - and to show ourselves that fitpolyMk
can handle it - let's flip the order of some of our polymorphic
conditions. In other words, we will make state A + B a mixture of taxa coded as A+B
and B+A
and
so on.
y<-setNames(as.character(y),names(y))
ii<-which(y=="A+B")
y[sample(ii,round(0.5*length(ii)))]<-"B+A"
ii<-which(y=="C+D")
y[sample(ii,round(0.5*length(ii)))]<-"D+C"
y<-as.factor(y)
y
## t70 t71 t47 t169 t170 t67 t68 t62 t59 t60 t173 t174 t19 t20 t21
## D+C C+D C+D C+D D+C C+D B+C C+D C C B B B B+C B
## t15 t50 t109 t123 t124 t104 t105 t52 t53 t167 t168 t141 t74 t66 t48
## B B+C B+C B+C B+C B+C B+C B+C B+C C C C C C+D C
## t181 t182 t2 t39 t40 t4 t27 t191 t192 t28 t145 t146 t77 t8 t193
## C+D D+C C C C D C B+C C D D+C C+D D+C D D
## t194 t96 t45 t86 t87 t13 t6 t125 t183 t184 t16 t17 t135 t136 t147
## D D D D D C+D C+D C+D C C C C C C C
## t148 t83 t88 t89 t78 t79 t24 t187 t188 t36 t29 t99 t189 t190 t115
## C C B+C C C C C C C C D C+D C C C
## t119 t120 t121 t122 t46 t49 t165 t166 t112 t5 t106 t126 t142 t143 t98
## C C C C C C D+C C+D C+D B+C B+C B+C B+C B+C C
## t179 t180 t144 t113 t114 t18 t30 t155 t156 t11 t128 t129 t127 t64 t65
## B+C B+C B+C B+C B+C B+C C C C B+C C D+C C C+D C+D
## t153 t154 t185 t186 t80 t81 t33 t43 t151 t152 t82 t97 t132 t133 t61
## C C C C B B C D C+D D+C D+C D D D D
## t3 t102 t103 t35 t69 t149 t150 t7 t92 t93 t110 t111 t197 t198 t161
## A+B B+C B+C B+C B+A B B D+C D+C D+C D+C D+C C C C
## t157 t158 t12 t9 t41 t42 t25 t37 t38 t26 t10 t90 t94 t95 t54
## C C B+C C C C C C C C B+C D+C C+D C+D C
## t55 t91 t171 t172 t162 t163 t51 t32 t177 t178 t63 t84 t85 t44 t56
## C D C C D+C C+D D+C C C C B+C C C C C
## t199 t200 t107 t108 t130 t131 t31 t137 t138 t22 t139 t140 t118 t72 t73
## C C C C C C C D+C C C D+C C+D C+D D+C C
## t76 t159 t160 t14 t75 t134 t195 t196 t164 t34 t100 t101 t116 t117 t23
## D+C C+D D+C D C+D C+D C+D C+D D+C D D D D+C C D+C
## t175 t176 t57 t58 t1
## D+C D+C B+C B+C D+C
## Levels: A+B B B+A B+C C C+D D D+C
Plot our data:
plotTree(poly.tree,ftype="off",lwd=1,type="fan")
X<-strsplit(setNames(as.character(y),names(y)),"+",fixed=TRUE)
pies<-matrix(0,Ntip(poly.tree),4,dimnames=list(poly.tree$tip.label,
c("A","B","C","D")))
for(i in 1:Ntip(poly.tree))
pies[poly.tree$tip.label[i],X[[poly.tree$tip.label[i]]]]<-
rep(1/length(X[[poly.tree$tip.label[i]]]),
length(X[[poly.tree$tip.label[i]]]))
tiplabels(pie=pies,piecol=c("black","yellow","red","blue"),
cex=0.35)
legend(x="topleft",legend=c("A","B","C","D"),pt.cex=2,pch=21,
pt.bg=c("black","yellow","red","blue"))
Now we'll go ahead & fit our polymorphic evolution model. Here, I'm going to use an symmetric-rates ordered model - just because I happen to know that this model has good fit to the data:
fit<-fitpolyMk(poly.tree,y,model="SYM",ordered=TRUE)
##
## This is the design matrix of the fitted model. Does it make sense?
##
## A A+B B B+C C C+D D
## A 0 1 0 0 0 0 0
## A+B 1 0 2 0 0 0 0
## B 0 2 0 3 0 0 0
## B+C 0 0 3 0 4 0 0
## C 0 0 0 4 0 5 0
## C+D 0 0 0 0 5 0 6
## D 0 0 0 0 0 6 0
plot(fit,lwd=2)
Now, how can we apply this to stochastic character mapping? Well, this is easier than you'd think simply because the function
fitpolyMk
(now) returns both the design matrix of the model, and a data matrix of the input data to the user
as components of the fitted model object: $index.matrix
and $data
, respectively.
Let's examine these & then use these as follows:
fit$index.matrix
## A A+B B B+C C C+D D
## A 0 1 0 0 0 0 0
## A+B 1 0 2 0 0 0 0
## B 0 2 0 3 0 0 0
## B+C 0 0 3 0 4 0 0
## C 0 0 0 4 0 5 0
## C+D 0 0 0 0 5 0 6
## D 0 0 0 0 0 6 0
fit$data
## A A+B B B+C C C+D D
## t70 0 0 0 0 0 1 0
## t71 0 0 0 0 0 1 0
## t47 0 0 0 0 0 1 0
## t169 0 0 0 0 0 1 0
## t170 0 0 0 0 0 1 0
## t67 0 0 0 0 0 1 0
## t68 0 0 0 1 0 0 0
## t62 0 0 0 0 0 1 0
## t59 0 0 0 0 1 0 0
## t60 0 0 0 0 1 0 0
## t173 0 0 1 0 0 0 0
## t174 0 0 1 0 0 0 0
## t19 0 0 1 0 0 0 0
## t20 0 0 0 1 0 0 0
## t21 0 0 1 0 0 0 0
## t15 0 0 1 0 0 0 0
## t50 0 0 0 1 0 0 0
## t109 0 0 0 1 0 0 0
## t123 0 0 0 1 0 0 0
## t124 0 0 0 1 0 0 0
## t104 0 0 0 1 0 0 0
## t105 0 0 0 1 0 0 0
## t52 0 0 0 1 0 0 0
## t53 0 0 0 1 0 0 0
## t167 0 0 0 0 1 0 0
## t168 0 0 0 0 1 0 0
## t141 0 0 0 0 1 0 0
## t74 0 0 0 0 1 0 0
## t66 0 0 0 0 0 1 0
## t48 0 0 0 0 1 0 0
## t181 0 0 0 0 0 1 0
## t182 0 0 0 0 0 1 0
## t2 0 0 0 0 1 0 0
## t39 0 0 0 0 1 0 0
## t40 0 0 0 0 1 0 0
## t4 0 0 0 0 0 0 1
## t27 0 0 0 0 1 0 0
## t191 0 0 0 1 0 0 0
## t192 0 0 0 0 1 0 0
## t28 0 0 0 0 0 0 1
## t145 0 0 0 0 0 1 0
## t146 0 0 0 0 0 1 0
## t77 0 0 0 0 0 1 0
## t8 0 0 0 0 0 0 1
## t193 0 0 0 0 0 0 1
## t194 0 0 0 0 0 0 1
## t96 0 0 0 0 0 0 1
## t45 0 0 0 0 0 0 1
## t86 0 0 0 0 0 0 1
## t87 0 0 0 0 0 0 1
## t13 0 0 0 0 0 1 0
## t6 0 0 0 0 0 1 0
## t125 0 0 0 0 0 1 0
## t183 0 0 0 0 1 0 0
## t184 0 0 0 0 1 0 0
## t16 0 0 0 0 1 0 0
## t17 0 0 0 0 1 0 0
## t135 0 0 0 0 1 0 0
## t136 0 0 0 0 1 0 0
## t147 0 0 0 0 1 0 0
## t148 0 0 0 0 1 0 0
## t83 0 0 0 0 1 0 0
## t88 0 0 0 1 0 0 0
## t89 0 0 0 0 1 0 0
## t78 0 0 0 0 1 0 0
## t79 0 0 0 0 1 0 0
## t24 0 0 0 0 1 0 0
## t187 0 0 0 0 1 0 0
## t188 0 0 0 0 1 0 0
## t36 0 0 0 0 1 0 0
## t29 0 0 0 0 0 0 1
## t99 0 0 0 0 0 1 0
## t189 0 0 0 0 1 0 0
## t190 0 0 0 0 1 0 0
## t115 0 0 0 0 1 0 0
## t119 0 0 0 0 1 0 0
## t120 0 0 0 0 1 0 0
## t121 0 0 0 0 1 0 0
## t122 0 0 0 0 1 0 0
## t46 0 0 0 0 1 0 0
## t49 0 0 0 0 1 0 0
## t165 0 0 0 0 0 1 0
## t166 0 0 0 0 0 1 0
## t112 0 0 0 0 0 1 0
## t5 0 0 0 1 0 0 0
## t106 0 0 0 1 0 0 0
## t126 0 0 0 1 0 0 0
## t142 0 0 0 1 0 0 0
## t143 0 0 0 1 0 0 0
## t98 0 0 0 0 1 0 0
## t179 0 0 0 1 0 0 0
## t180 0 0 0 1 0 0 0
## t144 0 0 0 1 0 0 0
## t113 0 0 0 1 0 0 0
## t114 0 0 0 1 0 0 0
## t18 0 0 0 1 0 0 0
## t30 0 0 0 0 1 0 0
## t155 0 0 0 0 1 0 0
## t156 0 0 0 0 1 0 0
## t11 0 0 0 1 0 0 0
## t128 0 0 0 0 1 0 0
## t129 0 0 0 0 0 1 0
## t127 0 0 0 0 1 0 0
## t64 0 0 0 0 0 1 0
## t65 0 0 0 0 0 1 0
## t153 0 0 0 0 1 0 0
## t154 0 0 0 0 1 0 0
## t185 0 0 0 0 1 0 0
## t186 0 0 0 0 1 0 0
## t80 0 0 1 0 0 0 0
## t81 0 0 1 0 0 0 0
## t33 0 0 0 0 1 0 0
## t43 0 0 0 0 0 0 1
## t151 0 0 0 0 0 1 0
## t152 0 0 0 0 0 1 0
## t82 0 0 0 0 0 1 0
## t97 0 0 0 0 0 0 1
## t132 0 0 0 0 0 0 1
## t133 0 0 0 0 0 0 1
## t61 0 0 0 0 0 0 1
## t3 0 1 0 0 0 0 0
## t102 0 0 0 1 0 0 0
## t103 0 0 0 1 0 0 0
## t35 0 0 0 1 0 0 0
## t69 0 1 0 0 0 0 0
## t149 0 0 1 0 0 0 0
## t150 0 0 1 0 0 0 0
## t7 0 0 0 0 0 1 0
## t92 0 0 0 0 0 1 0
## t93 0 0 0 0 0 1 0
## t110 0 0 0 0 0 1 0
## t111 0 0 0 0 0 1 0
## t197 0 0 0 0 1 0 0
## t198 0 0 0 0 1 0 0
## t161 0 0 0 0 1 0 0
## t157 0 0 0 0 1 0 0
## t158 0 0 0 0 1 0 0
## t12 0 0 0 1 0 0 0
## t9 0 0 0 0 1 0 0
## t41 0 0 0 0 1 0 0
## t42 0 0 0 0 1 0 0
## t25 0 0 0 0 1 0 0
## t37 0 0 0 0 1 0 0
## t38 0 0 0 0 1 0 0
## t26 0 0 0 0 1 0 0
## t10 0 0 0 1 0 0 0
## t90 0 0 0 0 0 1 0
## t94 0 0 0 0 0 1 0
## t95 0 0 0 0 0 1 0
## t54 0 0 0 0 1 0 0
## t55 0 0 0 0 1 0 0
## t91 0 0 0 0 0 0 1
## t171 0 0 0 0 1 0 0
## t172 0 0 0 0 1 0 0
## t162 0 0 0 0 0 1 0
## t163 0 0 0 0 0 1 0
## t51 0 0 0 0 0 1 0
## t32 0 0 0 0 1 0 0
## t177 0 0 0 0 1 0 0
## t178 0 0 0 0 1 0 0
## t63 0 0 0 1 0 0 0
## t84 0 0 0 0 1 0 0
## t85 0 0 0 0 1 0 0
## t44 0 0 0 0 1 0 0
## t56 0 0 0 0 1 0 0
## t199 0 0 0 0 1 0 0
## t200 0 0 0 0 1 0 0
## t107 0 0 0 0 1 0 0
## t108 0 0 0 0 1 0 0
## t130 0 0 0 0 1 0 0
## t131 0 0 0 0 1 0 0
## t31 0 0 0 0 1 0 0
## t137 0 0 0 0 0 1 0
## t138 0 0 0 0 1 0 0
## t22 0 0 0 0 1 0 0
## t139 0 0 0 0 0 1 0
## t140 0 0 0 0 0 1 0
## t118 0 0 0 0 0 1 0
## t72 0 0 0 0 0 1 0
## t73 0 0 0 0 1 0 0
## t76 0 0 0 0 0 1 0
## t159 0 0 0 0 0 1 0
## t160 0 0 0 0 0 1 0
## t14 0 0 0 0 0 0 1
## t75 0 0 0 0 0 1 0
## t134 0 0 0 0 0 1 0
## t195 0 0 0 0 0 1 0
## t196 0 0 0 0 0 1 0
## t164 0 0 0 0 0 1 0
## t34 0 0 0 0 0 0 1
## t100 0 0 0 0 0 0 1
## t101 0 0 0 0 0 0 1
## t116 0 0 0 0 0 1 0
## t117 0 0 0 0 1 0 0
## t23 0 0 0 0 0 1 0
## t175 0 0 0 0 0 1 0
## t176 0 0 0 0 0 1 0
## t57 0 0 0 1 0 0 0
## t58 0 0 0 1 0 0 0
## t1 0 0 0 0 0 1 0
simmap.trees<-make.simmap(poly.tree,fit$data,model=fit$index.matrix,nsim=100)
## make.simmap is sampling character histories conditioned on the transition matrix
##
## Q =
## A A+B B B+C C C+D D
## A 0 0.000000 0.000000 0.000000 0.000000 0.000000 0.00000
## A+B 0 -1.271756 1.271756 0.000000 0.000000 0.000000 0.00000
## B 0 1.271756 -2.333135 1.061379 0.000000 0.000000 0.00000
## B+C 0 0.000000 1.061379 -1.768122 0.706743 0.000000 0.00000
## C 0 0.000000 0.000000 0.706743 -1.913028 1.206285 0.00000
## C+D 0 0.000000 0.000000 0.000000 1.206285 -2.252835 1.04655
## D 0 0.000000 0.000000 0.000000 0.000000 1.046550 -1.04655
## (estimated using likelihood);
## and (mean) root node prior probabilities
## pi =
## A A+B B B+C C C+D D
## 0.1428571 0.1428571 0.1428571 0.1428571 0.1428571 0.1428571 0.1428571
## Done.
simmap.trees
## 100 phylogenetic trees with mapped discrete characters
We can plot a summary of this analysis as follows:
cols<-setNames(colorRampPalette(c("black","yellow","red","blue"))(7),
colnames(fit$data))
plot(summary(simmap.trees),colors=cols,type="fan",ftype="off")
legend(x="topleft",legend=colnames(fit$data),pt.cex=2.4,pch=21,
pt.bg=cols)
However, it might make more sense to do things like combine states before computing posterior probabilities at node. The following
is just an example that combines A+B
, B
, and B+C
and then plots them compared to a new
state not-B
:
b.trees<-mergeMappedStates(simmap.trees,c("A+B","B","B+C"),"B")
b.trees<-mergeMappedStates(b.trees,c("A","C","C+D","D"),"not-B")
cols<-setNames(c("black","yellow"),c("not-B","B"))
plot(summary(b.trees),colors=cols,type="fan",ftype="off")
legend(x="topleft",legend=names(cols),pt.cex=2.4,pch=21,
pt.bg=cols)
Or, similarly using densityMap
:
object<-densityMap(b.trees,states=names(cols),plot=FALSE)
## sorry - this might take a while; please be patient
object<-setMap(object,c("black","yellow"))
plot(object,lwd=c(2,6),ftype="off")
That's kind of the idea. We can similarly analyze our posterior sample of trees in all of the usual ways, e.g.:
object<-density(simmap.trees)
object
##
## Distribution of changes from stochastic mapping:
##
## A+B->B A+B->B+C
## Min. :0 Min. :0
## Median :0 Median :0
## Mean :0.34 Mean :0
## Max. :2 Max. :0
##
## A+B->C A+B->C+D
## Min. :0 Min. :0
## Median :0 Median :0
## Mean :0 Mean :0
## Max. :0 Max. :0
##
## A+B->D B->A+B
## Min. :0 Min. :2
## Median :0 Median :2
## Mean :0 Mean :2.29
## Max. :0 Max. :4
##
## B->B+C B->C
## Min. :0 Min. :0
## Median :1.5 Median :0
## Mean :1.85 Mean :0
## Max. :6 Max. :0
##
## B->C+D B->D
## Min. :0 Min. :0
## Median :0 Median :0
## Mean :0 Mean :0
## Max. :0 Max. :0
##
## B+C->A+B B+C->B
## Min. :0 Min. :4
## Median :0 Median :7
## Mean :0 Mean :6.67
## Max. :0 Max. :10
##
## B+C->C B+C->C+D
## Min. :1 Min. :0
## Median :3 Median :0
## Mean :3.54 Mean :0
## Max. :9 Max. :0
##
## B+C->D C->A+B
## Min. :0 Min. :0
## Median :0 Median :0
## Mean :0 Mean :0
## Max. :0 Max. :0
##
## C->B C->B+C
## Min. :0 Min. :13
## Median :0 Median :15
## Mean :0 Mean :15.49
## Max. :0 Max. :21
##
## C->C+D C->D
## Min. :18 Min. :0
## Median :24 Median :0
## Mean :23.9 Mean :0
## Max. :30 Max. :0
##
## C+D->A+B C+D->B
## Min. :0 Min. :0
## Median :0 Median :0
## Mean :0 Mean :0
## Max. :0 Max. :0
##
## C+D->B+C C+D->C
## Min. :0 Min. :8
## Median :0 Median :14
## Mean :0 Mean :13.68
## Max. :0 Max. :26
##
## C+D->D D->A+B
## Min. :7 Min. :0
## Median :11 Median :0
## Mean :10.64 Mean :0
## Max. :15 Max. :0
##
## D->B D->B+C
## Min. :0 Min. :0
## Median :0 Median :0
## Mean :0 Mean :0
## Max. :0 Max. :0
##
## D->C D->C+D
## Min. :0 Min. :0
## Median :0 Median :3
## Mean :0 Mean :3.51
## Max. :0 Max. :9
##
## 95% HPD interval(A+B->B): [0, 2]
## 95% HPD interval(A+B->B+C): [0, 0]
## 95% HPD interval(A+B->C): [0, 0]
## 95% HPD interval(A+B->C+D): [0, 0]
## 95% HPD interval(A+B->D): [0, 0]
## 95% HPD interval(B->A+B): [2, 4]
## 95% HPD interval(B->B+C): [0, 5]
## 95% HPD interval(B->C): [0, 0]
## 95% HPD interval(B->C+D): [0, 0]
## 95% HPD interval(B->D): [0, 0]
## 95% HPD interval(B+C->A+B): [0, 0]
## 95% HPD interval(B+C->B): [5, 9]
## 95% HPD interval(B+C->C): [1, 7]
## 95% HPD interval(B+C->C+D): [0, 0]
## 95% HPD interval(B+C->D): [0, 0]
## 95% HPD interval(C->A+B): [0, 0]
## 95% HPD interval(C->B): [0, 0]
## 95% HPD interval(C->B+C): [13, 18]
## 95% HPD interval(C->C+D): [20, 29]
## 95% HPD interval(C->D): [0, 0]
## 95% HPD interval(C+D->A+B): [0, 0]
## 95% HPD interval(C+D->B): [0, 0]
## 95% HPD interval(C+D->B+C): [0, 0]
## 95% HPD interval(C+D->C): [8, 19]
## 95% HPD interval(C+D->D): [8, 14]
## 95% HPD interval(D->A+B): [0, 0]
## 95% HPD interval(D->B): [0, 0]
## 95% HPD interval(D->B+C): [0, 0]
## 95% HPD interval(D->C): [0, 0]
## 95% HPD interval(D->C+D): [0, 7]
plot(object)
Or from our B
or not-B
analysis:
object<-density(b.trees)
object
##
## Distribution of changes from stochastic mapping:
## B->not-B not-B->B
## Min. :1 Min. :13
## Median :3 Median :15
## Mean :3.54 Mean :15.49
## Max. :9 Max. :21
##
## 95% HPD interval(B->not-B): [1, 7]
## 95% HPD interval(not-B->B): [13, 18]
par(family="mono")
plot(object)
You get the idea.
The very heterogeneous sampling (“between 1 and 42 (mean 3) datapoints per species”) raises another interesting point. What if we only have one sample for a species? How can we genuinely say that it is monomorphic? The truth is - we can't; however, if we have a model for our uncertainty we can easily incorporate it in the same way that we specify uncertainty in the values of monomorphic tip states (as previously discussed).
That's all I have to say about this - for now.
Hi Dr. Revell,
ReplyDeleteLove all your blogs - very clear and informative! I appreciate it a lot.
However, when I was learning this tutorial, I came across some problem with applying the fitted model to the data.
After assigning "fit", you used "fit$data" to call out the data matrix of the input data. But when I run "fit$data" (I was using your provided inputs) in my RStudio, it returned "NULL" rather than a data frame which you show in the tutorial.
I was wondering if this issue is caused by my phytools version (which is 0.6.99). If so, how can I update my package to 0.7.2 as my phytools shown to be up-to-date.
Thank you so much for your great help!!
Gratefully,
Laura
Dear Liam J. Revell
ReplyDeleteI follow your blog (http://blog.phytools.org) and learned a lot about Comparative Biology in the phytools R package.
You provide a very detailed script. However, when I applied the fit using fit$data I received the message "Null" at the R prompt instead of a data frame that you show in the tutorial. From that point I could not perform the analysis.
I was wondering if this issue is caused by my phytools version (which is 0.6.99). If so, how can I update my package to 0.7.2 as my phytools shown to be up-to-date.
Thank you so much for your great help!!
Gratefully,
Washington Vieira
Dear Liam J. Revell
ReplyDeleteI follow your blog (http://blog.phytools.org) and learned a lot about Comparative Biology in the phytools R package.
You provide a very detailed script. However, when I applied the fit using fit$data I received the message "Null" at the R prompt instead of a data frame that you show in the tutorial. From that point I could not perform the analysis.
I was wondering if this issue is caused by my phytools version (which is 0.6.99). If so, how can I update my package to 0.7.2 as my phytools shown to be up-to-date.
Thank you so much for your great help!!
Gratefully,
Washington Vieira
After installing the CRAN package 'devtools' this can be done in a fresh R session as follows:
Deletelibrary(devtools)
install_github("liamrevell/phytools")
All the best, Liam
Dear Liam,
ReplyDeleteThank you for this very clear and useful post.
I have a question regarding the incorporation of uncertainty for some tips with the fitpolyMk function. Although the conversion of the character states to a matrix with the prior probabilities of each tip works well with the fitMk function, I cannot run it with the fitpolyMk (I got an error message: 'Error in strsplit(x, "+", fixed = TRUE) : non-character argument'). I was therefore wondering whether it is possible to combine the intraspecific polymorphism with character state uncertainty.
Best regards,
Alex
I also had this error and I removed it by transforming my data with the lines that are in the Liams code:
Deletey2<-setNames(data[,1],rownames(data))
y2<-as.factor(y2)
Hi,
ReplyDeletethe code is very useful, there are not many tools for polymorphic data. I have, however, a problem with my data: at the end I get an error '1:nrow(L)':argument of length 0.
I have tried different models (ER, ARD, SYM) and it is still the same. I think this my be related to rates values, since in each model my rates are NaNs.
Thank you so much for your great help.
Update
DeleteProblem solved after chagning "ordered=TRUE" to "ordered=FALSE".
Dear Sir,
ReplyDeleteI'm a great admirer of your tutorial. However, I would like to know that, how can I do character mapping of outcome from BiSSE or MuSSE of real data in diveritree package. Is it possible to do in simmap or any other method kindly let me know.
Sincerely,
Debajyoti
Dear Sir,
DeleteI got some code edited version of make.simmap by Martha Liliana Serrano-Serrano https://github.com/phylolab/MarthaSerrano_scripts/blob/master/script_Serranoetal_2017.R
Though I have not used yet, trying to understand.
Thanks,
Debajyoti
Dear Liam,
ReplyDeletethanks for the very cool tool phytools & all its functions! I recently used the function densityMap to visualize character changes for a binary trait, which worked fine. However, I was wondering if there is a way to also plot a timescale underneath the tree so that one can easily see when character changes happen? Unfortunately the usual commands to add a timescale (like axisPhylo) do not work since the outcome is not of class "Phylo". I would greatly appreciate your feedback! Thanks a lot!
Best,
Belinda