Like the JC69_distance() function, the Felsenstein 1981 (F81) distance corrects for multiple substitutions, but it also accounts for unequal nucleotide composition such as GC content. It is calculated as \(-E \ln\left(\frac{E - p}{E}\right)\), where \(p\) is the proportion of differing sites and \(E\) equals \(1\) minus the sum of the squares of the average nucleotide frequencies (McGuire et al., 1999).
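The formula above can be sketched in a few lines of base R. This is an illustration only, not the package's implementation: `f81_sketch()`, `p` (taken here to be the proportion of differing sites) and `freqs` (the averaged nucleotide frequencies) are hypothetical names, and the real F81_distance() derives its inputs from the list returned by readTrainFile().

```r
# Hypothetical helper, not part of GenomicBreaks.
# p:     proportion of differing sites (assumed input)
# freqs: averaged frequencies of A, C, G and T, summing to 1 (assumed input)
f81_sketch <- function(p, freqs) {
  stopifnot(abs(sum(freqs) - 1) < 1e-8)
  E <- 1 - sum(freqs^2)
  # The logarithm is only defined while p is smaller than E.
  stopifnot(p >= 0, p < E)
  -E * log((E - p) / E)
}

# With equal base frequencies, E = 0.75 and the formula reduces to JC69.
equal   <- c(A = 0.25, C = 0.25, G = 0.25, T = 0.25)
d_equal <- f81_sketch(0.1, equal)
d_jc69  <- -3/4 * log(1 - 4/3 * 0.1)
all.equal(d_equal, d_jc69)

# With a skewed (AT-rich) composition, E is smaller and the same p
# yields a slightly larger corrected distance.
skewed   <- c(A = 0.35, C = 0.15, G = 0.15, T = 0.35)
d_skewed <- f81_sketch(0.1, skewed)
d_skewed > d_equal
```

The comparison with the JC69 expression is a quick sanity check: when all four bases are equally frequent, the two corrections should agree.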
Arguments
- train_parameters
A list containing the probabilities of the alignment, produced by the readTrainFile() function.
Value
A numeric value representing the evolutionary distance between two genomes. The larger the value, the more divergent the genomes.
References
Felsenstein, J. (1981). Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution, 17, 368–376. doi:10.1007/BF01734359
McGuire, G., Prentice, M. J., & Wright, F. (1999). Improved error bounds for genetic distances from DNA sequences. Biometrics, 55(4), 1064–1070. doi:10.1111/j.0006-341x.1999.01064.x
See also
Other Similarity indexes: GOC(), JC69_distance(), K80_distance(), P_distance(), T92_distance(), correlation_index(), karyotype_index(), slidingWindow(), strand_randomisation_index(), synteny_index()
Examples
parameters <- readTrainFile(system.file("extdata/example.train", package = "GenomicBreaks"))
F81_distance(parameters)
#> [1] 0.01615041