Saturation plots & weighting
Saturated data is data in which the phylogenetic signal is overwhelmed by multiple changes at each site. Note that data that is saturated at deep levels may still be useful for more recent divergences. Saturated data can be relatively harmless, merely making tree searches take longer and potentially reducing the decisiveness of the dataset. However, it can also pose problems. It could increase the variance in branch lengths, potentially raising the problem of long-branch attraction. It could also make clocklike data reject a molecular clock. Saturated data could require a more complex likelihood model than the dataset would need without it. Finally, saturated data could adversely affect the likelihood parameter estimates applied to all sites, which may in turn affect the tree chosen.
One common approach for detecting saturated data is to plot the transition/transversion ratio for each pair of sequences against the number of transversions between that pair. At saturation, this ratio declines to an equilibrium value governed by the base frequencies. Plots for character sets that show a sharp decline followed by a long horizontal trend suggest saturation much more strongly than plots that show a gradual decline in the ratio across all transversion counts. Character sets with a lower effective number of states (for example, third codon positions of mitochondrial DNA in insects, which often show strong AT bias) are more likely to show saturation than character sets with more balanced base composition.
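As a rough illustration of how the points for such a plot might be computed, here is a minimal Python sketch. The function names (`count_ts_tv`, `saturation_points`, `equilibrium_ratio`) and the toy alignment are hypothetical and not part of any lab software. The sketch uses raw, uncorrected counts, skips gaps and ambiguity codes, and drops pairs with no transversions because the ratio is undefined for them. The equilibrium value assumes that fully saturated sequences behave like independent draws from the base frequencies.

```python
from itertools import combinations

PURINES = {"A", "G"}


def count_ts_tv(seq_a, seq_b):
    """Count raw transitions and transversions between two aligned sequences."""
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # skip identical sites, gaps, and ambiguity codes
        if (a in PURINES) == (b in PURINES):
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return transitions, transversions


def saturation_points(alignment):
    """Return (transversions, ts/tv ratio) for each pair of sequences.

    `alignment` is a dict of name -> aligned sequence (assumed format).
    Pairs with zero transversions are omitted (ratio undefined).
    """
    points = []
    for seq_a, seq_b in combinations(alignment.values(), 2):
        ts, tv = count_ts_tv(seq_a, seq_b)
        if tv > 0:
            points.append((tv, ts / tv))
    return points


def equilibrium_ratio(freqs):
    """Expected ts/tv ratio between two random sequences with these base
    frequencies (assumption: saturated sites are independent draws)."""
    a, c, g, t = (freqs[base] for base in "ACGT")
    return (a * g + c * t) / ((a + g) * (c + t))


# Toy alignment, invented purely for illustration.
toy = {
    "taxon1": "ACGTACGTAC",
    "taxon2": "ACGTACGTAT",
    "taxon3": "ATGTACGAAT",
}
points = saturation_points(toy)
plateau = equilibrium_ratio({"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25})
```

The resulting `points` can be passed to any scatter-plot tool, with transversions on the x-axis and the ratio on the y-axis. `plateau` gives the horizontal level that a saturated character set would be expected to approach under the stated assumption. Computing the points separately for each character set (for example, each codon position) allows the shapes of the declines to be compared.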