Comments on Phylogenetic Tools for Comparative Biology: Faster version of getStates, but why is it faster...?
Blog author: Liam Revell. 4 comments; feed updated 2024-03-27T07:13:39.236-04:00.

MC (2016-10-28T20:04:40.418-04:00):
Dear Liam,

I was glad to get my hands on your tutorials, but I got stuck midway.

I wanted to pull out the tip states with the function "getStates", and I consistently get this error:

"Error in setNames(sapply(tree$maps, function(x) names(x)[length(x)]), : 'names' attribute [166] must be the same length as the vector [0]"

Is there an earlier command that you ran to construct "anoletree" in the example?

Thanks

Liam Revell (2013-05-08T11:32:54.567-04:00):
OK - I understand why split helps: run-time increases more than linearly with the number of trees, so by splitting our object into an unclassed list of lists we cut off a large portion of that more-than-linear increase.

However, I don't entirely understand why assigning a class attribute to what is really just a simple list affects lapply and sapply so much.

Thanks again for the trick. I have now used it in the phytools functions describe.simmap and countSimmap, as well as getStates, to great effect.

- Liam

Liam Revell (2013-05-07T20:41:27.750-04:00):
Hi Klaus. This is terrific.

I'm blown away that removing the class attribute would have this effect! The lapply documentation says classed objects will be coerced using as.list....

I also don't know why there is any performance improvement from using split, since the split objects still have class "multiPhylo".

I will definitely use this, Klaus. Thanks!

- Liam

Klaus (2013-05-07T20:12:17.182-04:00):
Hi Liam,

that is not really unexpected. R copies a lot when subsetting objects. lapply, and therefore also sapply, works much better on pure lists.
Here is a trick sometimes used in ape and phangorn, applied to your original getStates function:

> trees2 = unclass(trees)
> system.time(X1 <- sapply(trees, getStates))
   user  system elapsed
 10.572   0.000  10.591
> system.time(X2 <- sapply(trees2, getStates))
   user  system elapsed
  0.356   0.000   0.354
> all.equal(X1, X2)
[1] TRUE

You can compare the memory usage of the two calls by running this code:

# profile the call on the classed object
Rprof(tmp <- tempfile(), memory.profiling=TRUE)
X1 <- sapply(trees, getStates)
Rprof(NULL)  # NULL stops profiling; a bare Rprof() would start writing to "Rprof.out"
summaryRprof(tmp, memory="both")
unlink(tmp)

# profile the same call on the unclassed list
Rprof(tmp <- tempfile(), memory.profiling=TRUE)
X2 <- sapply(trees2, getStates)
Rprof(NULL)
summaryRprof(tmp, memory="both")
unlink(tmp)

Cheers,
Klaus
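
The effect Klaus describes can be explored without ape or phytools. The sketch below is a toy illustration under stated assumptions: the class "toyList" and its `[[` method are hypothetical stand-ins for a classed list such as "multiPhylo", not ape's actual code. It counts how often the class's `[[` method is dispatched when sapply runs over the classed list and over its unclassed copy.

```r
# Toy S3 class standing in for a classed list such as "multiPhylo".
# Both the class name and this method are hypothetical, for illustration only.
n_calls <- 0L
`[[.toyList` <- function(x, i) {
  n_calls <<- n_calls + 1L   # record each dispatch to this method
  unclass(x)[[i]]            # strip the class, then index the plain list
}

items   <- lapply(1:2000, function(k) seq_len(k %% 10 + 1))
classed <- structure(items, class = "toyList")
plain   <- unclass(classed)  # same elements, no class attribute

n_calls <- 0L
r_classed <- sapply(classed, length)
calls_classed <- n_calls     # dispatches seen while iterating the classed list

n_calls <- 0L
r_plain <- sapply(plain, length)
calls_plain <- n_calls       # dispatches seen while iterating the plain list

# Both runs should produce the same result; only the route to each element differs.
stopifnot(identical(r_classed, r_plain))
cat("dispatches on classed list:", calls_classed, "\n")
cat("dispatches on plain list:  ", calls_plain, "\n")
```

In this toy, every element access on the classed list goes through a method that unclasses the whole list first, so the work per element can grow with the length of the list. Whether the same mechanism fully accounts for the multiPhylo timings above is an assumption; the thread itself does not confirm it.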