The Tamura 1992 (T92) distance extends the K80 distance by taking GC content into account. It is calculated as \(-h \ln \left(1 - \frac{p}{h} - q\right) - \frac{1}{2} \times (1 - h) \ln\left(1 - 2 q\right)\), where \(p\) is the probability of transition, \(q\) the probability of transversion, \(h = 2\theta (1 - \theta)\) and \(\theta\) is the GC content. See Wikipedia for more details.
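As a rough illustration of the formula above, here is a minimal sketch in base R that evaluates it directly from \(p\), \(q\) and \(\theta\). The helper name `t92_sketch` is hypothetical and is not the package's implementation. The input values are made up for the example.

```r
# Hypothetical helper, not part of the package: evaluates the T92 formula
# directly from p (transitions), q (transversions) and theta (GC content).
t92_sketch <- function(p, q, theta) {
  h <- 2 * theta * (1 - theta)
  a <- 1 - p / h - q   # argument of the first logarithm
  b <- 1 - 2 * q       # argument of the second logarithm
  # The distance is undefined when either logarithm argument is not positive.
  if (a <= 0 || b <= 0) return(NA_real_)
  -h * log(a) - 0.5 * (1 - h) * log(b)
}

# Made-up values for illustration only.
t92_sketch(p = 0.10, q = 0.05, theta = 0.40)

# With theta = 0.5, h equals 0.5 and the formula reduces to the K80 distance.
t92_sketch(p = 0.10, q = 0.05, theta = 0.50)
```

Returning `NA` when a logarithm argument is not positive is a design choice of this sketch. It signals that the sequences are too divergent for the formula to apply.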
Usage
T92_distance(m, gc = c("average", "target", "query"))
Arguments
- m
A matrix of counts or probabilities for bases of the target genome to be aligned to bases on the query genome. As a convenience it can also receive a list produced by the readTrainFile() function, containing this matrix.
- gc
Whether to calculate the GC content from the target, from the query, or as the average of both.
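To make the roles of the two arguments more concrete, the sketch below shows one way \(p\), \(q\) and \(\theta\) could be derived from such a matrix. It rests on several assumptions that the text above does not confirm: a 4 × 4 matrix whose rows (target) and columns (query) are both named A, C, G and T, a hypothetical helper name `t92_inputs_sketch`, and made-up counts. It is not the package's implementation.

```r
# Hypothetical helper: assumes a 4x4 matrix with rows (target) and columns
# (query) both named A, C, G, T. The real matrix layout may differ.
t92_inputs_sketch <- function(m, gc = c("average", "target", "query")) {
  gc <- match.arg(gc)
  bases <- c("A", "C", "G", "T")
  m <- m[bases, bases]
  m <- m / sum(m)                      # counts -> probabilities
  # Transitions: A <-> G (purines) and C <-> T (pyrimidines).
  p <- m["A", "G"] + m["G", "A"] + m["C", "T"] + m["T", "C"]
  # Transversions: every remaining mismatch.
  q <- 1 - sum(diag(m)) - p
  gc_target <- sum(m[c("G", "C"), ])   # G and C rows (target bases)
  gc_query  <- sum(m[, c("G", "C")])   # G and C columns (query bases)
  theta <- switch(gc,
                  average = (gc_target + gc_query) / 2,
                  target  = gc_target,
                  query   = gc_query)
  list(p = p, q = q, theta = theta)
}

# Made-up counts for illustration only.
bases <- c("A", "C", "G", "T")
m <- matrix(c(90,  2,  6,  2,
               2, 90,  2,  6,
               6,  2, 90,  2,
               2,  6,  2, 90),
            nrow = 4, byrow = TRUE, dimnames = list(bases, bases))

t92_inputs_sketch(m, gc = "average")
```

Because the example matrix is symmetric, the three `gc` options give the same \(\theta\) here. They differ only when the base composition of the target and query differs.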
Value
Returns a numeric value showing the evolutionary distance between two genomes. The larger the value, the more different the two genomes are.
References
Tamura, K. (1992). "Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G+C-content biases." Molecular Biology and Evolution, 9(4), 678–687. DOI: 10.1093/oxfordjournals.molbev.a040752
See also
Other Alignment statistics:
F81_distance(),
GCequilibrium(),
GCpressure(),
GCproportion(),
HKY85_distance(),
JC69_distance(),
K80_distance(),
P_distance(),
TN93_distance(),
exampleSubstitutionMatrix,
gapProportion(),
logDet_distance()
Other Similarity indexes:
F81_distance(),
GOC(),
HKY85_distance(),
JC69_distance(),
K80_distance(),
P_distance(),
TN93_distance(),
correlation_index(),
karyotype_index(),
logDet_distance(),
slidingWindow(),
strand_randomisation_index(),
synteny_index(),
tau_index()