Searches


Exhaustive and branch-and-bound searches (<13 taxa)

These strategies are guaranteed to find the optimal tree but take too long for more than a few taxa.
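To see why these strategies stop being practical so quickly, it helps to count the candidate trees. The sketch below relies on the standard result that n taxa admit (2n-5)!! distinct unrooted, fully bifurcating trees. The function name is ours, chosen only for illustration.

```python
def num_unrooted_trees(n_taxa: int) -> int:
    """Number of distinct unrooted, fully bifurcating trees for n_taxa >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # (2n-5)!! = 1 * 3 * 5 * ... * (2n-5)
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


# Print the size of the search space for a few taxon counts.
for n in (5, 10, 13, 20):
    print(n, num_unrooted_trees(n))
```

Thirteen taxa already give more than 13 billion possible trees, and each additional taxon multiplies the count again.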

Heuristic searches

Heuristic searches are not guaranteed to find the globally optimal tree, but they can handle many more taxa than exhaustive or branch-and-bound searches. A good starting batch file is:

#nexus
begin PAUP;
log file=hsearch1.log;
set autoclose=yes;
hsearch start=stepwise addseq=random nreps=100 savereps=yes randomize=addseq rstatus=yes hold=1 swap=tbr multrees=yes;
savetrees file=hsearch1.all.tre;
filter best=yes permdel=yes;
savetrees file=hsearch1.best.tre;
log stop;
end;

This batch file will do a heuristic search, save all the trees found in each random addition sequence replicate in the hsearch1.all.tre file, then filter for the best trees overall and save them in the hsearch1.best.tre file. PAUP will also output a tree-island profile. A "tree island" is a group of trees that cannot be reached by branch swapping from a different group of trees. Ideally, all the trees will be on one island that was hit in every replicate, regardless of the starting tree or taxon addition order. If this is the case, the treespace is fairly simple and we would probably move on to bootstrapping. If the island(s) with the best trees were hit nearly every time, we would probably be satisfied as well. However, if the best trees are not recovered in many of the replicates, we would need to search further.
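One way to read a tree-island profile is to ask what fraction of replicates ended on a best-scoring island. The sketch below assumes you have copied the score and the number of hits for each island out of the log by hand, and that the criterion is parsimony (lower score is better). The function name and the numbers are hypothetical.

```python
def best_island_hit_rate(islands, nreps):
    """islands: list of (score, times_hit) pairs, one per island.

    Returns the fraction of replicates that landed on a best-scoring island.
    """
    best_score = min(score for score, _ in islands)  # parsimony: lower is better
    hits = sum(times for score, times in islands if score == best_score)
    return hits / nreps


# Hypothetical profile from a 100-replicate search (numbers invented for illustration).
profile = [(1520, 62), (1520, 9), (1523, 21), (1527, 8)]
rate = best_island_hit_rate(profile, nreps=100)
print(f"best islands hit in {rate:.0%} of replicates")
```

How high this fraction needs to be before moving on is a judgment call; a result like the one in this made-up profile would suggest searching further.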

We use a variety of strategies for these next searches. The first strategy is to simply increase the number of addition sequence replicates. To do this, change the nreps=100 in the batch file above to a higher number, perhaps nreps=1000. To avoid confusion, we may also want to replace "hsearch1" with "hsearch2" wherever it appears before running the search again.

Another strategy is to start from random trees rather than random taxon addition. In the batch file above, change randomize=addseq to randomize=trees, change the "hsearch#" names as before, and search again, with nreps to taste.
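If you end up making these edits repeatedly, they can be scripted. The following is a minimal sketch that applies the substitutions described above to the text of the starting batch file. The helper name make_variant and the "hsearch2"/"hsearch3" names are our own choices, not anything PAUP requires.

```python
import re

BASE = """#nexus
begin PAUP;
log file=hsearch1.log;
set autoclose=yes;
hsearch start=stepwise addseq=random nreps=100 savereps=yes randomize=addseq rstatus=yes hold=1 swap=tbr multrees=yes;
savetrees file=hsearch1.all.tre;
filter best=yes permdel=yes;
savetrees file=hsearch1.best.tre;
log stop;
end;
"""


def make_variant(batch, new_prefix, nreps=None, randomize=None, old_prefix="hsearch1"):
    """Return a copy of the batch text with renamed output files and new settings."""
    out = batch.replace(old_prefix, new_prefix)  # rename log and tree files
    if nreps is not None:
        out = re.sub(r"nreps=\d+", f"nreps={nreps}", out)
    if randomize is not None:
        out = re.sub(r"randomize=\w+", f"randomize={randomize}", out)
    return out


# More addition sequence replicates, same starting strategy.
variant2 = make_variant(BASE, "hsearch2", nreps=1000)
# Random starting trees instead of random taxon addition.
variant3 = make_variant(BASE, "hsearch3", nreps=1000, randomize="trees")
print(variant3)
```

Each variant can then be saved to its own file and executed in PAUP like the original.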

These searches will run longer than the initial search. A way to speed up the search while covering more of tree space is to increase the number of replicates while reducing the thoroughness of the search within each replicate. We might do this if we believe there may be islands of good trees that PAUP is missing. A sample batch file follows. You may want to change the nreps and timelimit values.

#nexus
begin paup;
log file=hsearch.tlimit.log;
set maxtrees=10000 increase=auto;
hsearch rstatus=no limitperrep=yes nreps=5000 randomize=trees timelimit=5 savereps=yes;
savetrees file=hsearch.tlimit.all.tre;
filter best=yes permdel=yes;
savetrees file=hsearch.tlimit.best.tre;
log stop;
end;

The parsimony ratchet (below) can be useful in searching treespace broadly, as well.

Parsimony ratchet

The parsimony ratchet is a way to search treespace by reweighting characters for some iterations of a search. It is especially good for searches with large numbers of taxa. It is described by Kevin Nixon (Nixon, K. C. 1999. "The Parsimony Ratchet, a new method for rapid parsimony analysis." Cladistics 15: 407-414). Derek Sikes and Paul Lewis have written a program, called PAUPRat, which generates batch files to implement this search strategy in PAUP. As advised by the authors: 1) It's better to do several searches with a moderate number of nreps each (create a new ratchet file for each search) than one search with many nreps; 2) It's useful to insert the commands:

stopcmd "filter best=yes permdel=yes";
stopcmd "savetrees file=mydata.best.tre";

into the setup.nex file before the stopcmd "[quit]"; to automatically filter for the best trees.
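For readers unfamiliar with the ratchet, the reweighting step at its core can be sketched in a few lines. This illustrates only the idea in Nixon (1999), not what PAUPRat actually emits: the 15% fraction, the weight of 2, and the function name are assumptions chosen for the example.

```python
import random


def perturb_weights(n_chars, fraction=0.15, upweight=2, rng=None):
    """Return per-character weights with a random subset upweighted."""
    rng = rng or random.Random()
    weights = [1] * n_chars
    n_pick = max(1, round(fraction * n_chars))
    # Pick characters without replacement and raise their weight.
    for idx in rng.sample(range(n_chars), n_pick):
        weights[idx] = upweight
    return weights


rng = random.Random(42)  # fixed seed so the illustration is repeatable
for iteration in range(3):
    w = perturb_weights(20, rng=rng)
    # A real ratchet iteration would now: branch swap under these weights,
    # restore equal weights, then swap again starting from the resulting tree.
    print(iteration, w)
```

Because a different subset of characters is upweighted each iteration, the search is repeatedly nudged off its current tree, which is what lets it move between islands.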

Bootstrap searches

Bootstrap searches can take quite some time. One useful feature of PAUP is the ability to search on multiple computers or on several different occasions and combine the results. The key to this is saving the bootstrap trees from each search and then loading them all together, then computing the consensus tree using the tree weights. You must be sure to keep the bootstrap search settings (except for the number of bootstrap replicates) the same between searches for this to be valid. This can be done through the menus (make sure to hit the "save trees to file" checkbox in the bootstrap menu), or through the batch files (below).

For the search itself, use the batch file below. The search can be stopped before all the bootstrap replicates have completed. If you are doing multiple searches, change the treefile name for each search. You may also want to change the number of bootstrap replicates (currently set at 500) and the number of random taxon additions per bootstrap replicate (currently set at a low value of 10).

#nexus
begin paup;
set storetreewts=yes;
bootstrap nreps=500 treefile=bootstrap1.tre replace=no / start=stepwise addseq=random nreps=10 savereps=no randomize=addseq hold=1 swap=tbr multrees=yes;
end;

Load all the bootstrap trees, making sure to store tree weights (an option which should have been made the default by the above batch file), to load all blocks, and NOT to eliminate duplicate trees. Executing the following batch file should set all these options and load the trees saved as bootstrap1.tre in the active folder.

#nexus
begin paup;
gettrees allblocks=yes duptrees=keep storetreewts=yes mode=7 file=bootstrap1.tre;
end;

Finally, compute a majority-rule consensus tree. The batch file for this is:

#nexus
begin paup;
log file=bootstrapconsensus.log;
contree /majrule=yes strict=no le50=yes usetreewts=yes showtree=yes treefile=finalbootstrap.tre grpfreq=yes;
log stop;
end;

Note that this will create a bootstrap tree which will include nodes with support less than 50% which are consistent with the majority rule tree. Most authors choose to omit bootstrap numbers less than 50% while continuing to show the node; a better approach, if space allows, would be to show all bootstrap values.
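To make the role of tree weights concrete, here is a small worked example of how weighted group frequencies come out when trees from several bootstrap replicates are pooled. It assumes each replicate's trees carry weights that sum to 1 (so a replicate that found two equally good trees gives each a weight of 1/2). The taxa, trees, and function name are invented for illustration.

```python
from collections import defaultdict


def clade_support(weighted_trees):
    """weighted_trees: iterable of (weight, clades) pairs, where clades is a
    set of frozensets of taxon names. Returns {clade: weighted frequency}."""
    total = 0.0
    support = defaultdict(float)
    for weight, clades in weighted_trees:
        total += weight
        for clade in clades:
            support[clade] += weight
    return {clade: w / total for clade, w in support.items()}


AB, CD, AC = frozenset("AB"), frozenset("CD"), frozenset("AC")

# Hypothetical pooled bootstrap trees, each reduced to the groups it contains.
trees = [
    (1.0, {AB, CD}),  # replicate 1: a single best tree
    (0.5, {AB, CD}),  # replicate 2: two equally good trees, weight 1/2 each
    (0.5, {AC}),
    (1.0, {AB}),      # replicate 3
    (1.0, {AC}),      # replicate 4
]

for clade, freq in sorted(clade_support(trees).items(), key=lambda kv: -kv[1]):
    print("".join(sorted(clade)), f"{freq:.1%}")
```

In this toy data the group A+B is supported at 62.5%. C+D falls below 50% but does not conflict with A+B, so it is the kind of group that le50=yes keeps in the tree, whereas A+C conflicts with A+B and would not appear. Without the weights, replicate 2 would count twice as much as the others.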