The Felsenstein 1981 (F81) distance corrects for multiple substitutions like
the JC69_distance() function, but also accounts for unequal nucleotide
frequencies, and therefore for GC content. It is
calculated as \(-E \ln\left(\frac{E - p}{E}\right)\), where \(p\) is the
proportion of mismatched positions (see P_distance()) and \(E\) equals
\(1\) minus the sum of the squares of the average nucleotide frequencies
(McGuire et al., 1999).
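The formula above can be sketched in a few lines of R. This is a minimal illustration, not the package's implementation: the helper name `f81_sketch` is hypothetical, the input is assumed to be a plain 4 × 4 matrix (rows for target bases, columns for query bases, in the same order), and the "average nucleotide frequencies" are assumed to be the mean of the row and column marginal frequencies. The counts in the toy matrix are invented.

```r
# Hypothetical helper illustrating the F81 formula; not the package's code.
f81_sketch <- function(m) {
  m <- m / sum(m)                          # counts -> joint probabilities
  p <- 1 - sum(diag(m))                    # proportion of mismatched positions
  freqs <- (rowSums(m) + colSums(m)) / 2   # assumed: average of both genomes' base frequencies
  E <- 1 - sum(freqs^2)                    # 1 minus the sum of squared frequencies
  -E * log((E - p) / E)                    # F81 distance
}

# Toy count matrix with invented values (rows: target bases, columns: query bases).
bases <- c("A", "C", "G", "T")
m <- matrix(c(90,  3,  4,  3,
               2, 60,  2,  1,
               4,  2, 60,  2,
               3,  1,  3, 90),
            nrow = 4, byrow = TRUE, dimnames = list(bases, bases))

f81_sketch(m)
```

When all four bases are equally frequent, \(E = 3/4\) and the expression reduces to the Jukes-Cantor correction \(-\frac{3}{4} \ln\left(1 - \frac{4}{3} p\right)\), which is why the two distances agree on balanced matrices and diverge as base composition becomes skewed.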
Arguments
- m
A matrix of counts or probabilities for bases of the target genome to be aligned to bases on the query genome. As a convenience, it can also receive a list produced by the readTrainFile() function that contains this matrix.
Value
A numeric value representing the evolutionary distance between two genomes. The larger the value, the more divergent the genomes.
References
Felsenstein, J. (1981). Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution, 17, 368–376. doi:10.1007/BF01734359
McGuire, G., Prentice, M. J., & Wright, F. (1999). Improved error bounds for genetic distances from DNA sequences. Biometrics, 55(4), 1064–1070. doi:10.1111/j.0006-341x.1999.01064.x
See also
Other Alignment statistics:
GCequilibrium(),
GCpressure(),
GCproportion(),
HKY85_distance(),
JC69_distance(),
K80_distance(),
P_distance(),
T92_distance(),
TN93_distance(),
exampleSubstitutionMatrix,
gapProportion(),
logDet_distance()
Other Similarity indexes:
GOC(),
HKY85_distance(),
JC69_distance(),
K80_distance(),
P_distance(),
T92_distance(),
TN93_distance(),
correlation_index(),
karyotype_index(),
logDet_distance(),
slidingWindow(),
strand_randomisation_index(),
synteny_index(),
tau_index()