The Felsenstein 1981 (F81) distance corrects for multiple substitutions, as the JC69_distance() function does, and additionally accounts for unequal nucleotide frequencies (and therefore GC content). It is calculated as \(-E \ln\left(\frac{E - p}{E}\right)\), where \(p\) is the proportion of mismatching aligned positions and \(E\) equals \(1\) minus the sum of the squares of the average nucleotide frequencies (McGuire et al., 1999).
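As a rough illustration of the formula above, the sketch below computes the distance directly from a mismatch proportion and a vector of nucleotide frequencies. The helper `f81_from_freqs()` and its arguments `p` and `freqs` are hypothetical names introduced here for illustration. They are not part of GenomicBreaks, and this is not necessarily how F81_distance() derives these quantities from the train parameters.

```r
# Hypothetical helper (not part of GenomicBreaks): F81 distance from the
# proportion of mismatches `p` and the average nucleotide frequencies `freqs`.
f81_from_freqs <- function(p, freqs) {
  # Frequencies are assumed to sum to 1.
  stopifnot(abs(sum(freqs) - 1) < 1e-8)
  # E = 1 minus the sum of the squared nucleotide frequencies.
  E <- 1 - sum(freqs^2)
  # The logarithm is undefined once p reaches E (saturation);
  # returning NA here is a choice made for this sketch.
  if (p >= E) return(NA_real_)
  -E * log((E - p) / E)
}

# Equal frequencies give E = 3/4, so the formula reduces to the JC69 distance.
d_equal <- f81_from_freqs(0.1, c(A = 0.25, C = 0.25, G = 0.25, T = 0.25))

# AT-rich frequencies lower E, which increases the correction for the same p.
d_skewed <- f81_from_freqs(0.1, c(A = 0.35, C = 0.15, G = 0.15, T = 0.35))
```

With the same proportion of mismatches, the skewed composition yields a larger distance than the uniform one, because a smaller \(E\) means that saturation is reached sooner.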

Usage

F81_distance(train_parameters)

Arguments

train_parameters

A list containing the probabilities of the alignment, produced by the readTrainFile() function.

Value

A numeric value representing the evolutionary distance between two genomes. The larger the value, the more divergent the genomes.

References

Felsenstein, J. (1981). Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution, 17, 368–376. doi:10.1007/BF01734359

McGuire, G., Prentice, M. J., & Wright, F. (1999). Improved error bounds for genetic distances from DNA sequences. Biometrics, 55(4), 1064–1070. doi:10.1111/j.0006-341x.1999.01064.x

Author

Charles Plessy

Examples

parameters <- readTrainFile(system.file("extdata/example.train", package = "GenomicBreaks"))
F81_distance(parameters)
#> [1] 0.01615041