Calibrating the tree: 2 methods
Calibrating the tree I: Using a previously-published calibration
There are generally two ways to put ages on the nodes of a clock tree: 1) use a previously published rate of percent divergence versus time for a particular gene, or 2) use fossil information (see next entry). Reported calibration rates are often given as uncorrected percent sequence divergence per million years ("uncorrected" means uncorrected for multiple hits). To use such a value for your tree, compute all uncorrected pairwise distances for your taxa in PAUP. With menus, choose Options -> Distance options and select uncorrected "p", then use File -> Save Distances to File. The equivalent batch file is shown here (replace [filename], brackets included, with the name you want for the distance file):
#nexus
begin paup;
dset distance=p;
savedist format=onecolumn file=[filename] undefined=asterisk;
end;
Examine the resulting list (in PAUP or Microsoft Excel) to find a pair of taxa whose percent divergence is similar to that of the taxa used in the original calibration. Using this pair, rather than the oldest or youngest pair, makes it more likely that the proportion of uncorrected multiple hits is the same for the calibration and for the pair being used to calibrate the entire tree (the rate of uncorrected divergence versus time goes down with time as multiple hits accumulate). Dividing the pair's percent divergence by the calibration rate gives the age of the node separating the chosen pair of taxa. The age of that node is then used to calibrate the rest of the tree.
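The pair selection and age calculation described above can be sketched in Python. This is only an illustration under stated assumptions: it assumes the one-column distance file holds one "taxon1 taxon2 distance" entry per line with no spaces in taxon names, that distances are proportions (so 0.046 means 4.6%), and that the published rate is a pairwise divergence rate in percent per million years. The function names, taxon names, and all numbers are hypothetical.

```python
def parse_onecolumn(text):
    """Parse 'taxonA taxonB distance' lines; skip undefined (*) entries."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or fields[2] == "*":
            continue
        try:
            dist = float(fields[2])
        except ValueError:
            continue  # header or other non-numeric line
        pairs.append((fields[0], fields[1], dist))
    return pairs


def closest_pair(pairs, target_percent):
    """Return the pair whose percent divergence is nearest the target."""
    return min(pairs, key=lambda p: abs(p[2] * 100.0 - target_percent))


def node_age_myr(p_distance, rate_percent_per_myr):
    """Age (Myr) of the node separating a pair, given a pairwise rate."""
    return p_distance * 100.0 / rate_percent_per_myr


# Made-up distances standing in for the saved distance file.
example = """\
taxonA taxonB 0.021
taxonA taxonC 0.046
taxonB taxonC 0.088
"""

pairs = parse_onecolumn(example)
# Assume the original calibration was based on taxa about 5% divergent.
a, b, d = closest_pair(pairs, target_percent=5.0)
# Assume a published rate of 2.0% pairwise divergence per million years.
age = node_age_myr(d, rate_percent_per_myr=2.0)
print(a, b, d, age)
```

With these made-up values the chosen pair is taxonA and taxonC, and the node separating them comes out at roughly 2.3 million years. If the published rate is per lineage rather than pairwise, the divergence would need to be halved first.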
Calibrating the tree II: Using fossil data
The rate of molecular evolution may be affected by many things: mutation rate, generation time, and perhaps even body size. A calibration based on fossil data from the group being studied may therefore be more accurate than a calibration borrowed from other organisms. Multiple calibration points will give the most precise estimate. Record information about all the fossils in the group, not just the oldest. Fossils tend to give only minimum ages for the group, so think about ways to determine maximum ages as well. For example, the group may not be older than the evolution of life on land, or may not be older than the oldest possible age (which is not the age of the oldest known fossil) of an obligate host. Biogeographic information may also be useful.
Fossil calibrations, and minimum ages in general, should be mapped to the stem of the clade (= the most recent common ancestor of the clade and its sister group). Maximum ages should be mapped to the most recent common ancestor (MRCA) of the clade constrained by that calibration (the "crown"). For example, if we have a calibration for birds using the oldest known bird fossil (a minimum age constraint), we know that the split between birds and their sister group had to happen earlier than that fossil, but not that the MRCA of the bird taxa included in our study lived before that fossil. In contrast, if we say that Lepidoptera must have evolved after the first terrestrial plants (a maximum age constraint), we know that the MRCA of Lepidoptera cannot be older than that, but not that the split separating Lepidoptera from their sister group occurred after it.
The total branch length from the present to the node of each constraint on the clock tree is recorded (the units do not matter, as long as the branch lengths are proportional to time). This information, plus the age and type of each calibration constraint, is entered into an Excel sheet which calculates the maximum and minimum possible rates given all the calibration points. This can easily be done manually as well. Each branch length is then multiplied by these rates to get the maximum and minimum length of the branch in time. Note that each rate is applied to the whole tree at once: the tree is stretched or compressed by the maximum or minimum rate, so the ratio of the ages of any two nodes does not change with the different rates.
Other methods must be used to calculate error bars on the ages of nodes due to finite sequence length rather than uncertain calibration. These methods are explained in "Errors on node times."
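The manual version of the rate calculation might look like the following Python sketch. It is not the lab's Excel sheet. It assumes that "rate" means time per unit of branch length (so that multiplying a branch length by it gives time, as in the text), that each constraint is recorded as a node depth, an age, and a type, and that a single rate applies to the whole tree. The function names and all numbers are hypothetical.

```python
import math


def scale_bounds(constraints):
    """Return (s_min, s_max): allowed time per unit of clock-tree depth.

    constraints: iterable of (depth, age, kind), where depth is the branch
    length from the present to the constrained node, age is in Myr, and
    kind is "min" (e.g. a fossil) or "max".
    """
    s_min, s_max = 0.0, math.inf
    for depth, age, kind in constraints:
        s = age / depth
        if kind == "min":
            s_min = max(s_min, s)  # node must be at least this old
        elif kind == "max":
            s_max = min(s_max, s)  # node can be at most this old
        else:
            raise ValueError(f"unknown constraint kind: {kind!r}")
    if s_min > s_max:
        raise ValueError("constraints are incompatible under a single rate")
    return s_min, s_max


def age_range(depth, s_min, s_max):
    """Youngest and oldest possible age for a node at the given depth."""
    return depth * s_min, depth * s_max


# Made-up calibration points: (depth on clock tree, age in Myr, type).
constraints = [
    (0.125, 50.0, "min"),   # fossil mapped to a stem node
    (0.25, 90.0, "min"),    # second fossil, a weaker bound here
    (0.5, 250.0, "max"),    # maximum age mapped to a crown node
]

s_min, s_max = scale_bounds(constraints)
youngest, oldest = age_range(0.25, s_min, s_max)
print(s_min, s_max, youngest, oldest)
```

In this sketch every minimum age raises the lower bound on the scale factor and every maximum age lowers the upper bound, so only the tightest constraint of each type matters. Without any maximum age the upper bound stays infinite, which mirrors the point that fossils alone give only minimum ages.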