Phylogenetic Tools for Comparative Biology: July 2019

I recently received an email with the following inquiry:

“I wondered if you had any insight or advice on how the make.simmap function in phytools works when the input data have multiple rows per species? It certainly does run but I am unclear on how example it's dealing with this. Specifically, I'm working with a dataset of a 3-state categorical trait but for which I have between 1 and 42 (mean 3) datapoints per species …. I initially fed in the trait data directly to make.simmap, and it did run fine and give reasonable looking estimates for internal nodes. However, it also allocated a single state (without exception) to the tips, so I'm not sure if this is choosing the modal value, the first value, or something else here?

As an alternative, I converted the raw data to a matrix of proportions of each state per species (in a similar manner as incorporating uncertainty in tip state as you described here) based on the observed proportions for each species. This gives very similar estimates for ancestral states, perhaps within the bounds of the inherent stochasticity of the method, though different enough in places that I was suspicious that it's doing something different from my first approach. It also (of course) gives tip states that are as variable as the data it derived from, as you'd expect. However I wonder whether this approach, which essentially equates uncertainty in tip state with intraspecific variability, is a valid way of treating this data.

In summary, do you have anything you can add concerning the appropriateness of either of these approaches (or any other suggestions) for dealing with N-state categorical traits which are variable within species?”

First - in the former case make.simmap is most definitely not taking intraspecific polymorphism into account. Rather, as the user suspected, it is most-likely just using the first observation for each species.

The second alternative is more interesting. Although this may seem reasonable it is also not correct. Saying that a particular tip taxa has a prior probability of being in state A that is P(A)=0.50 and P(B)=0.50 implies that the taxon could be either state A or B with equal probability - not that (say) 50% of the population is in state A & the other 50% in state B.

Though this may be somewhat unsatisfying, the simplest way to deal with intraspecific polymorphism is to re-code our character as an additional state. That is, instead of having character states A and B, we instead have A, A+B, and B.

Unfortunately, phytools::make.simmap presently doesn't automatically know how to handle data in this form. The function fitpolyMk does, however, and it is relatively straightforward to integrate it into make.simmap. I'll next demonstrate how.

Note that I just pushed a very small update to fitpolyMk to make this easier - thus to follow this demo exactly it is necessary to update phytools from GitHub.

Let's get some data & a tree for our demo. Here are mine:

library(phytools)
packageVersion("phytools")

## [1] '0.7.2'

poly.data<-read.csv(file="http://www.phytools.org/TS2019/ex/4c/polymorphic-data.csv",
    row.names=1)
y<-setNames(poly.data[,1],rownames(poly.data))
y

##  t70  t71  t47 t169 t170  t67  t68  t62  t59  t60 t173 t174  t19  t20  t21 
##  C+D  C+D  C+D  C+D  C+D  C+D  B+C  C+D    C    C    B    B    B  B+C    B 
##  t15  t50 t109 t123 t124 t104 t105  t52  t53 t167 t168 t141  t74  t66  t48 
##    B  B+C  B+C  B+C  B+C  B+C  B+C  B+C  B+C    C    C    C    C  C+D    C 
## t181 t182   t2  t39  t40   t4  t27 t191 t192  t28 t145 t146  t77   t8 t193 
##  C+D  C+D    C    C    C    D    C  B+C    C    D  C+D  C+D  C+D    D    D 
## t194  t96  t45  t86  t87  t13   t6 t125 t183 t184  t16  t17 t135 t136 t147 
##    D    D    D    D    D  C+D  C+D  C+D    C    C    C    C    C    C    C 
## t148  t83  t88  t89  t78  t79  t24 t187 t188  t36  t29  t99 t189 t190 t115 
##    C    C  B+C    C    C    C    C    C    C    C    D  C+D    C    C    C 
## t119 t120 t121 t122  t46  t49 t165 t166 t112   t5 t106 t126 t142 t143  t98 
##    C    C    C    C    C    C  C+D  C+D  C+D  B+C  B+C  B+C  B+C  B+C    C 
## t179 t180 t144 t113 t114  t18  t30 t155 t156  t11 t128 t129 t127  t64  t65 
##  B+C  B+C  B+C  B+C  B+C  B+C    C    C    C  B+C    C  C+D    C  C+D  C+D 
## t153 t154 t185 t186  t80  t81  t33  t43 t151 t152  t82  t97 t132 t133  t61 
##    C    C    C    C    B    B    C    D  C+D  C+D  C+D    D    D    D    D 
##   t3 t102 t103  t35  t69 t149 t150   t7  t92  t93 t110 t111 t197 t198 t161 
##  A+B  B+C  B+C  B+C  A+B    B    B  C+D  C+D  C+D  C+D  C+D    C    C    C 
## t157 t158  t12   t9  t41  t42  t25  t37  t38  t26  t10  t90  t94  t95  t54 
##    C    C  B+C    C    C    C    C    C    C    C  B+C  C+D  C+D  C+D    C 
##  t55  t91 t171 t172 t162 t163  t51  t32 t177 t178  t63  t84  t85  t44  t56 
##    C    D    C    C  C+D  C+D  C+D    C    C    C  B+C    C    C    C    C 
## t199 t200 t107 t108 t130 t131  t31 t137 t138  t22 t139 t140 t118  t72  t73 
##    C    C    C    C    C    C    C  C+D    C    C  C+D  C+D  C+D  C+D    C 
##  t76 t159 t160  t14  t75 t134 t195 t196 t164  t34 t100 t101 t116 t117  t23 
##  C+D  C+D  C+D    D  C+D  C+D  C+D  C+D  C+D    D    D    D  C+D    C  C+D 
## t175 t176  t57  t58   t1 
##  C+D  C+D  B+C  B+C  C+D 
## Levels: A+B B B+C C C+D D

poly.tree<-read.tree(file="http://www.phytools.org/TS2019/ex/4c/polymorphic-tree.phy")

Just for fun - and to show ourselves that fitpolyMk can handle it - let's flip the order of some of our polymorphic conditions. In other words, we will make state A + B a mixture of taxa coded as A+B and B+A and so on.

y<-setNames(as.character(y),names(y))
ii<-which(y=="A+B")
y[sample(ii,round(0.5*length(ii)))]<-"B+A"
ii<-which(y=="C+D")
y[sample(ii,round(0.5*length(ii)))]<-"D+C"
y<-as.factor(y)
y

##  t70  t71  t47 t169 t170  t67  t68  t62  t59  t60 t173 t174  t19  t20  t21 
##  D+C  C+D  C+D  C+D  D+C  C+D  B+C  C+D    C    C    B    B    B  B+C    B 
##  t15  t50 t109 t123 t124 t104 t105  t52  t53 t167 t168 t141  t74  t66  t48 
##    B  B+C  B+C  B+C  B+C  B+C  B+C  B+C  B+C    C    C    C    C  C+D    C 
## t181 t182   t2  t39  t40   t4  t27 t191 t192  t28 t145 t146  t77   t8 t193 
##  C+D  D+C    C    C    C    D    C  B+C    C    D  D+C  C+D  D+C    D    D 
## t194  t96  t45  t86  t87  t13   t6 t125 t183 t184  t16  t17 t135 t136 t147 
##    D    D    D    D    D  C+D  C+D  C+D    C    C    C    C    C    C    C 
## t148  t83  t88  t89  t78  t79  t24 t187 t188  t36  t29  t99 t189 t190 t115 
##    C    C  B+C    C    C    C    C    C    C    C    D  C+D    C    C    C 
## t119 t120 t121 t122  t46  t49 t165 t166 t112   t5 t106 t126 t142 t143  t98 
##    C    C    C    C    C    C  D+C  C+D  C+D  B+C  B+C  B+C  B+C  B+C    C 
## t179 t180 t144 t113 t114  t18  t30 t155 t156  t11 t128 t129 t127  t64  t65 
##  B+C  B+C  B+C  B+C  B+C  B+C    C    C    C  B+C    C  D+C    C  C+D  C+D 
## t153 t154 t185 t186  t80  t81  t33  t43 t151 t152  t82  t97 t132 t133  t61 
##    C    C    C    C    B    B    C    D  C+D  D+C  D+C    D    D    D    D 
##   t3 t102 t103  t35  t69 t149 t150   t7  t92  t93 t110 t111 t197 t198 t161 
##  A+B  B+C  B+C  B+C  B+A    B    B  D+C  D+C  D+C  D+C  D+C    C    C    C 
## t157 t158  t12   t9  t41  t42  t25  t37  t38  t26  t10  t90  t94  t95  t54 
##    C    C  B+C    C    C    C    C    C    C    C  B+C  D+C  C+D  C+D    C 
##  t55  t91 t171 t172 t162 t163  t51  t32 t177 t178  t63  t84  t85  t44  t56 
##    C    D    C    C  D+C  C+D  D+C    C    C    C  B+C    C    C    C    C 
## t199 t200 t107 t108 t130 t131  t31 t137 t138  t22 t139 t140 t118  t72  t73 
##    C    C    C    C    C    C    C  D+C    C    C  D+C  C+D  C+D  D+C    C 
##  t76 t159 t160  t14  t75 t134 t195 t196 t164  t34 t100 t101 t116 t117  t23 
##  D+C  C+D  D+C    D  C+D  C+D  C+D  C+D  D+C    D    D    D  D+C    C  D+C 
## t175 t176  t57  t58   t1 
##  D+C  D+C  B+C  B+C  D+C 
## Levels: A+B B B+A B+C C C+D D D+C

Plot our data:

plotTree(poly.tree,ftype="off",lwd=1,type="fan")
X<-strsplit(setNames(as.character(y),names(y)),"+",fixed=TRUE)
pies<-matrix(0,Ntip(poly.tree),4,dimnames=list(poly.tree$tip.label,
    c("A","B","C","D")))
for(i in 1:Ntip(poly.tree)) 
    pies[poly.tree$tip.label[i],X[[poly.tree$tip.label[i]]]]<-
        rep(1/length(X[[poly.tree$tip.label[i]]]),
        length(X[[poly.tree$tip.label[i]]]))
tiplabels(pie=pies,piecol=c("black","yellow","red","blue"),
    cex=0.35)
legend(x="topleft",legend=c("A","B","C","D"),pt.cex=2,pch=21,
    pt.bg=c("black","yellow","red","blue"))

plot of chunk unnamed-chunk-3

Now we'll go ahead & fit our polymorphic evolution model. Here, I'm going to use an symmetric-rates ordered model - just because I happen to know that this model has good fit to the data:

fit<-fitpolyMk(poly.tree,y,model="SYM",ordered=TRUE)

## 
## This is the design matrix of the fitted model. Does it make sense?
## 
##     A A+B B B+C C C+D D
## A   0   1 0   0 0   0 0
## A+B 1   0 2   0 0   0 0
## B   0   2 0   3 0   0 0
## B+C 0   0 3   0 4   0 0
## C   0   0 0   4 0   5 0
## C+D 0   0 0   0 5   0 6
## D   0   0 0   0 0   6 0

plot(fit,lwd=2)

plot of chunk unnamed-chunk-4

Now, how can we apply this to stochastic character mapping? Well, this is easier than you'd think simply because the function fitpolyMk (now) returns both the design matrix of the model, and a data matrix of the input data to the user as components of the fitted model object: $index.matrix and $data, respectively. Let's examine these & then use these as follows:

fit$index.matrix

##     A A+B B B+C C C+D D
## A   0   1 0   0 0   0 0
## A+B 1   0 2   0 0   0 0
## B   0   2 0   3 0   0 0
## B+C 0   0 3   0 4   0 0
## C   0   0 0   4 0   5 0
## C+D 0   0 0   0 5   0 6
## D   0   0 0   0 0   6 0

fit$data

##      A A+B B B+C C C+D D
## t70  0   0 0   0 0   1 0
## t71  0   0 0   0 0   1 0
## t47  0   0 0   0 0   1 0
## t169 0   0 0   0 0   1 0
## t170 0   0 0   0 0   1 0
## t67  0   0 0   0 0   1 0
## t68  0   0 0   1 0   0 0
## t62  0   0 0   0 0   1 0
## t59  0   0 0   0 1   0 0
## t60  0   0 0   0 1   0 0
## t173 0   0 1   0 0   0 0
## t174 0   0 1   0 0   0 0
## t19  0   0 1   0 0   0 0
## t20  0   0 0   1 0   0 0
## t21  0   0 1   0 0   0 0
## t15  0   0 1   0 0   0 0
## t50  0   0 0   1 0   0 0
## t109 0   0 0   1 0   0 0
## t123 0   0 0   1 0   0 0
## t124 0   0 0   1 0   0 0
## t104 0   0 0   1 0   0 0
## t105 0   0 0   1 0   0 0
## t52  0   0 0   1 0   0 0
## t53  0   0 0   1 0   0 0
## t167 0   0 0   0 1   0 0
## t168 0   0 0   0 1   0 0
## t141 0   0 0   0 1   0 0
## t74  0   0 0   0 1   0 0
## t66  0   0 0   0 0   1 0
## t48  0   0 0   0 1   0 0
## t181 0   0 0   0 0   1 0
## t182 0   0 0   0 0   1 0
## t2   0   0 0   0 1   0 0
## t39  0   0 0   0 1   0 0
## t40  0   0 0   0 1   0 0
## t4   0   0 0   0 0   0 1
## t27  0   0 0   0 1   0 0
## t191 0   0 0   1 0   0 0
## t192 0   0 0   0 1   0 0
## t28  0   0 0   0 0   0 1
## t145 0   0 0   0 0   1 0
## t146 0   0 0   0 0   1 0
## t77  0   0 0   0 0   1 0
## t8   0   0 0   0 0   0 1
## t193 0   0 0   0 0   0 1
## t194 0   0 0   0 0   0 1
## t96  0   0 0   0 0   0 1
## t45  0   0 0   0 0   0 1
## t86  0   0 0   0 0   0 1
## t87  0   0 0   0 0   0 1
## t13  0   0 0   0 0   1 0
## t6   0   0 0   0 0   1 0
## t125 0   0 0   0 0   1 0
## t183 0   0 0   0 1   0 0
## t184 0   0 0   0 1   0 0
## t16  0   0 0   0 1   0 0
## t17  0   0 0   0 1   0 0
## t135 0   0 0   0 1   0 0
## t136 0   0 0   0 1   0 0
## t147 0   0 0   0 1   0 0
## t148 0   0 0   0 1   0 0
## t83  0   0 0   0 1   0 0
## t88  0   0 0   1 0   0 0
## t89  0   0 0   0 1   0 0
## t78  0   0 0   0 1   0 0
## t79  0   0 0   0 1   0 0
## t24  0   0 0   0 1   0 0
## t187 0   0 0   0 1   0 0
## t188 0   0 0   0 1   0 0
## t36  0   0 0   0 1   0 0
## t29  0   0 0   0 0   0 1
## t99  0   0 0   0 0   1 0
## t189 0   0 0   0 1   0 0
## t190 0   0 0   0 1   0 0
## t115 0   0 0   0 1   0 0
## t119 0   0 0   0 1   0 0
## t120 0   0 0   0 1   0 0
## t121 0   0 0   0 1   0 0
## t122 0   0 0   0 1   0 0
## t46  0   0 0   0 1   0 0
## t49  0   0 0   0 1   0 0
## t165 0   0 0   0 0   1 0
## t166 0   0 0   0 0   1 0
## t112 0   0 0   0 0   1 0
## t5   0   0 0   1 0   0 0
## t106 0   0 0   1 0   0 0
## t126 0   0 0   1 0   0 0
## t142 0   0 0   1 0   0 0
## t143 0   0 0   1 0   0 0
## t98  0   0 0   0 1   0 0
## t179 0   0 0   1 0   0 0
## t180 0   0 0   1 0   0 0
## t144 0   0 0   1 0   0 0
## t113 0   0 0   1 0   0 0
## t114 0   0 0   1 0   0 0
## t18  0   0 0   1 0   0 0
## t30  0   0 0   0 1   0 0
## t155 0   0 0   0 1   0 0
## t156 0   0 0   0 1   0 0
## t11  0   0 0   1 0   0 0
## t128 0   0 0   0 1   0 0
## t129 0   0 0   0 0   1 0
## t127 0   0 0   0 1   0 0
## t64  0   0 0   0 0   1 0
## t65  0   0 0   0 0   1 0
## t153 0   0 0   0 1   0 0
## t154 0   0 0   0 1   0 0
## t185 0   0 0   0 1   0 0
## t186 0   0 0   0 1   0 0
## t80  0   0 1   0 0   0 0
## t81  0   0 1   0 0   0 0
## t33  0   0 0   0 1   0 0
## t43  0   0 0   0 0   0 1
## t151 0   0 0   0 0   1 0
## t152 0   0 0   0 0   1 0
## t82  0   0 0   0 0   1 0
## t97  0   0 0   0 0   0 1
## t132 0   0 0   0 0   0 1
## t133 0   0 0   0 0   0 1
## t61  0   0 0   0 0   0 1
## t3   0   1 0   0 0   0 0
## t102 0   0 0   1 0   0 0
## t103 0   0 0   1 0   0 0
## t35  0   0 0   1 0   0 0
## t69  0   1 0   0 0   0 0
## t149 0   0 1   0 0   0 0
## t150 0   0 1   0 0   0 0
## t7   0   0 0   0 0   1 0
## t92  0   0 0   0 0   1 0
## t93  0   0 0   0 0   1 0
## t110 0   0 0   0 0   1 0
## t111 0   0 0   0 0   1 0
## t197 0   0 0   0 1   0 0
## t198 0   0 0   0 1   0 0
## t161 0   0 0   0 1   0 0
## t157 0   0 0   0 1   0 0
## t158 0   0 0   0 1   0 0
## t12  0   0 0   1 0   0 0
## t9   0   0 0   0 1   0 0
## t41  0   0 0   0 1   0 0
## t42  0   0 0   0 1   0 0
## t25  0   0 0   0 1   0 0
## t37  0   0 0   0 1   0 0
## t38  0   0 0   0 1   0 0
## t26  0   0 0   0 1   0 0
## t10  0   0 0   1 0   0 0
## t90  0   0 0   0 0   1 0
## t94  0   0 0   0 0   1 0
## t95  0   0 0   0 0   1 0
## t54  0   0 0   0 1   0 0
## t55  0   0 0   0 1   0 0
## t91  0   0 0   0 0   0 1
## t171 0   0 0   0 1   0 0
## t172 0   0 0   0 1   0 0
## t162 0   0 0   0 0   1 0
## t163 0   0 0   0 0   1 0
## t51  0   0 0   0 0   1 0
## t32  0   0 0   0 1   0 0
## t177 0   0 0   0 1   0 0
## t178 0   0 0   0 1   0 0
## t63  0   0 0   1 0   0 0
## t84  0   0 0   0 1   0 0
## t85  0   0 0   0 1   0 0
## t44  0   0 0   0 1   0 0
## t56  0   0 0   0 1   0 0
## t199 0   0 0   0 1   0 0
## t200 0   0 0   0 1   0 0
## t107 0   0 0   0 1   0 0
## t108 0   0 0   0 1   0 0
## t130 0   0 0   0 1   0 0
## t131 0   0 0   0 1   0 0
## t31  0   0 0   0 1   0 0
## t137 0   0 0   0 0   1 0
## t138 0   0 0   0 1   0 0
## t22  0   0 0   0 1   0 0
## t139 0   0 0   0 0   1 0
## t140 0   0 0   0 0   1 0
## t118 0   0 0   0 0   1 0
## t72  0   0 0   0 0   1 0
## t73  0   0 0   0 1   0 0
## t76  0   0 0   0 0   1 0
## t159 0   0 0   0 0   1 0
## t160 0   0 0   0 0   1 0
## t14  0   0 0   0 0   0 1
## t75  0   0 0   0 0   1 0
## t134 0   0 0   0 0   1 0
## t195 0   0 0   0 0   1 0
## t196 0   0 0   0 0   1 0
## t164 0   0 0   0 0   1 0
## t34  0   0 0   0 0   0 1
## t100 0   0 0   0 0   0 1
## t101 0   0 0   0 0   0 1
## t116 0   0 0   0 0   1 0
## t117 0   0 0   0 1   0 0
## t23  0   0 0   0 0   1 0
## t175 0   0 0   0 0   1 0
## t176 0   0 0   0 0   1 0
## t57  0   0 0   1 0   0 0
## t58  0   0 0   1 0   0 0
## t1   0   0 0   0 0   1 0

simmap.trees<-make.simmap(poly.tree,fit$data,model=fit$index.matrix,nsim=100)

## make.simmap is sampling character histories conditioned on the transition matrix
## 
## Q =
##     A       A+B         B       B+C         C       C+D        D
## A   0  0.000000  0.000000  0.000000  0.000000  0.000000  0.00000
## A+B 0 -1.271756  1.271756  0.000000  0.000000  0.000000  0.00000
## B   0  1.271756 -2.333135  1.061379  0.000000  0.000000  0.00000
## B+C 0  0.000000  1.061379 -1.768122  0.706743  0.000000  0.00000
## C   0  0.000000  0.000000  0.706743 -1.913028  1.206285  0.00000
## C+D 0  0.000000  0.000000  0.000000  1.206285 -2.252835  1.04655
## D   0  0.000000  0.000000  0.000000  0.000000  1.046550 -1.04655
## (estimated using likelihood);
## and (mean) root node prior probabilities
## pi =
##         A       A+B         B       B+C         C       C+D         D 
## 0.1428571 0.1428571 0.1428571 0.1428571 0.1428571 0.1428571 0.1428571

## Done.

simmap.trees

## 100 phylogenetic trees with mapped discrete characters

We can plot a summary of this analysis as follows:

cols<-setNames(colorRampPalette(c("black","yellow","red","blue"))(7),
    colnames(fit$data))
plot(summary(simmap.trees),colors=cols,type="fan",ftype="off")
legend(x="topleft",legend=colnames(fit$data),pt.cex=2.4,pch=21,
    pt.bg=cols)

plot of chunk unnamed-chunk-6

However, it might make more sense to do things like combine states before computing posterior probabilities at node. The following is just an example that combines A+B, B, and B+C and then plots them compared to a new state not-B:

b.trees<-mergeMappedStates(simmap.trees,c("A+B","B","B+C"),"B")
b.trees<-mergeMappedStates(b.trees,c("A","C","C+D","D"),"not-B")
cols<-setNames(c("black","yellow"),c("not-B","B"))
plot(summary(b.trees),colors=cols,type="fan",ftype="off")
legend(x="topleft",legend=names(cols),pt.cex=2.4,pch=21,
    pt.bg=cols)

plot of chunk unnamed-chunk-7

Or, similarly using densityMap:

object<-densityMap(b.trees,states=names(cols),plot=FALSE)

## sorry - this might take a while; please be patient

object<-setMap(object,c("black","yellow"))
plot(object,lwd=c(2,6),ftype="off")

plot of chunk unnamed-chunk-8

That's kind of the idea. We can similarly analyze our posterior sample of trees in all of the usual ways, e.g.:

object<-density(simmap.trees)
object

## 
## Distribution of changes from stochastic mapping:
## 
##  A+B->B      A+B->B+C
##  Min.   :0   Min.   :0
##  Median :0   Median :0
##  Mean   :0.34    Mean   :0
##  Max.   :2   Max.   :0
## 
##  A+B->C      A+B->C+D
##  Min.   :0   Min.   :0
##  Median :0   Median :0
##  Mean   :0   Mean   :0
##  Max.   :0   Max.   :0
## 
##  A+B->D      B->A+B
##  Min.   :0   Min.   :2
##  Median :0   Median :2
##  Mean   :0   Mean   :2.29
##  Max.   :0   Max.   :4
## 
##  B->B+C      B->C
##  Min.   :0   Min.   :0
##  Median :1.5 Median :0
##  Mean   :1.85    Mean   :0
##  Max.   :6   Max.   :0
## 
##  B->C+D      B->D
##  Min.   :0   Min.   :0
##  Median :0   Median :0
##  Mean   :0   Mean   :0
##  Max.   :0   Max.   :0
## 
##  B+C->A+B        B+C->B
##  Min.   :0   Min.   :4
##  Median :0   Median :7
##  Mean   :0   Mean   :6.67
##  Max.   :0   Max.   :10
## 
##  B+C->C      B+C->C+D
##  Min.   :1   Min.   :0
##  Median :3   Median :0
##  Mean   :3.54    Mean   :0
##  Max.   :9   Max.   :0
## 
##  B+C->D      C->A+B
##  Min.   :0   Min.   :0
##  Median :0   Median :0
##  Mean   :0   Mean   :0
##  Max.   :0   Max.   :0
## 
##  C->B        C->B+C
##  Min.   :0   Min.   :13
##  Median :0   Median :15
##  Mean   :0   Mean   :15.49
##  Max.   :0   Max.   :21
## 
##  C->C+D      C->D
##  Min.   :18  Min.   :0
##  Median :24  Median :0
##  Mean   :23.9    Mean   :0
##  Max.   :30  Max.   :0
## 
##  C+D->A+B        C+D->B
##  Min.   :0   Min.   :0
##  Median :0   Median :0
##  Mean   :0   Mean   :0
##  Max.   :0   Max.   :0
## 
##  C+D->B+C        C+D->C
##  Min.   :0   Min.   :8
##  Median :0   Median :14
##  Mean   :0   Mean   :13.68
##  Max.   :0   Max.   :26
## 
##  C+D->D      D->A+B
##  Min.   :7   Min.   :0
##  Median :11  Median :0
##  Mean   :10.64   Mean   :0
##  Max.   :15  Max.   :0
## 
##  D->B        D->B+C
##  Min.   :0   Min.   :0
##  Median :0   Median :0
##  Mean   :0   Mean   :0
##  Max.   :0   Max.   :0
## 
##  D->C        D->C+D
##  Min.   :0   Min.   :0
##  Median :0   Median :3
##  Mean   :0   Mean   :3.51
##  Max.   :0   Max.   :9
## 
## 95% HPD interval(A+B->B): [0, 2]
## 95% HPD interval(A+B->B+C): [0, 0]
## 95% HPD interval(A+B->C): [0, 0]
## 95% HPD interval(A+B->C+D): [0, 0]
## 95% HPD interval(A+B->D): [0, 0]
## 95% HPD interval(B->A+B): [2, 4]
## 95% HPD interval(B->B+C): [0, 5]
## 95% HPD interval(B->C): [0, 0]
## 95% HPD interval(B->C+D): [0, 0]
## 95% HPD interval(B->D): [0, 0]
## 95% HPD interval(B+C->A+B): [0, 0]
## 95% HPD interval(B+C->B): [5, 9]
## 95% HPD interval(B+C->C): [1, 7]
## 95% HPD interval(B+C->C+D): [0, 0]
## 95% HPD interval(B+C->D): [0, 0]
## 95% HPD interval(C->A+B): [0, 0]
## 95% HPD interval(C->B): [0, 0]
## 95% HPD interval(C->B+C): [13, 18]
## 95% HPD interval(C->C+D): [20, 29]
## 95% HPD interval(C->D): [0, 0]
## 95% HPD interval(C+D->A+B): [0, 0]
## 95% HPD interval(C+D->B): [0, 0]
## 95% HPD interval(C+D->B+C): [0, 0]
## 95% HPD interval(C+D->C): [8, 19]
## 95% HPD interval(C+D->D): [8, 14]
## 95% HPD interval(D->A+B): [0, 0]
## 95% HPD interval(D->B): [0, 0]
## 95% HPD interval(D->B+C): [0, 0]
## 95% HPD interval(D->C): [0, 0]
## 95% HPD interval(D->C+D): [0, 7]

plot(object)

plot of chunk unnamed-chunk-9

Or from our B or not-B analysis:

object<-density(b.trees)
object

## 
## Distribution of changes from stochastic mapping:
##  B->not-B        not-B->B
##  Min.   :1   Min.   :13
##  Median :3   Median :15
##  Mean   :3.54    Mean   :15.49
##  Max.   :9   Max.   :21
## 
## 95% HPD interval(B->not-B): [1, 7]
## 95% HPD interval(not-B->B): [13, 18]

par(family="mono")
plot(object)

plot of chunk unnamed-chunk-10

You get the idea.

The very heterogeneous sampling (“between 1 and 42 (mean 3) datapoints per species”) raises another interesting point. What if we only have one sample for a species? How can we genuinely say that it is monomorphic? The truth is - we can't; however, if we have a model for our uncertainty we can easily incorporate it in the same way that we specify uncertainty in the values of monomorphic tip states (as previously discussed).

That's all I have to say about this - for now.

Phylogenetic Tools for Comparative Biology

Pages

Wednesday, July 31, 2019

Stochastic character mapping with polymorphic characters using `make.simmap` and `fitpolyMk` in phytools

Pages

Wednesday, July 31, 2019

Stochastic character mapping with polymorphic characters using make.simmap and fitpolyMk in phytools

Stochastic character mapping with polymorphic characters using `make.simmap` and `fitpolyMk` in phytools