Nishimaki and Sato's K2P + Gap distance extends Kimura's 2-parameter distance by treating gaps as insertion/deletion states instead of removing gap-containing sites.
Arguments
- m
A matrix of counts or probabilities for bases of the target genome to be aligned to bases of the query genome. As a convenience, it can also receive a list produced by the readTrainFile() function, containing this matrix.
Value
Returns a single numeric value estimating the evolutionary distance between the two genomes. The larger the value, the more the two genomes differ.
Details
The distance is calculated as $$ K = \frac{3}{4} w \log(w) - \frac{w}{2} \log\left((S - P) \sqrt{S + P - Q}\right) $$ where \(S\) is the probability of identical nucleotide pairs, \(P\) is the probability of transition-type nucleotide pairs, \(Q\) is the probability of transversion-type nucleotide pairs, and \(w\) is the nucleotide occupancy probability.
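When there are no gaps, \(w = 1\) and \(S = 1 - P - Q\), so the first term vanishes and the expression reduces to the usual K80 distance, \(K = -\frac{1}{2} \log\left((1 - 2P - Q) \sqrt{1 - 2Q}\right)\).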
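As a rough illustration of the formula in this section, the sketch below transcribes it directly from pair-type probabilities and checks the gap-free case numerically. The helper names `k2p_gap` and `k80` are hypothetical and are not part of the package. Taking \(w = S + P + Q\) by default is an assumption, and the probabilities are made-up values chosen only so that they sum to 1.

```r
# Hypothetical helper (not the package's implementation): the K2P + Gap
# formula from the Details section, written in terms of
#   S = probability of identical pairs,
#   P = probability of transition-type pairs,
#   Q = probability of transversion-type pairs,
#   w = nucleotide occupancy (assumed here to default to S + P + Q).
k2p_gap <- function(S, P, Q, w = S + P + Q) {
  (3 / 4) * w * log(w) - (w / 2) * log((S - P) * sqrt(S + P - Q))
}

# Classical K80 distance (Kimura 1980), for comparison.
k80 <- function(P, Q) {
  -0.5 * log((1 - 2 * P - Q) * sqrt(1 - 2 * Q))
}

# Made-up gap-free probabilities: S + P + Q = 1, hence w = 1.
no_gap  <- k2p_gap(S = 0.80, P = 0.12, Q = 0.08)
classic <- k80(P = 0.12, Q = 0.08)

# The two values agree up to floating-point tolerance.
all.equal(no_gap, classic)
```

The sketch takes probabilities, not a substitution matrix. How the package derives \(S\), \(P\), \(Q\) and \(w\) from `m` is not shown here.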
References
Kimura, M. (1980). "A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences." Journal of Molecular Evolution, 16, 111–120. doi:10.1007/BF01731581
Nishimaki, T. and Sato, K. (2019). "An Extension of the Kimura Two-Parameter Model to the Natural Evolutionary Process." Journal of Molecular Evolution, 87, 60–67. doi:10.1007/s00239-018-9885-1
See also
Other Alignment statistics:
F81_distance(),
GCequilibrium(),
GCpressure(),
GCproportion(),
HKY85_distance(),
JC69_distance(),
K80_distance(),
P_distance(),
T92_distance(),
TN93_distance(),
exampleSubstitutionMatrix,
gapProportion(),
logDet_distance()
Other Similarity indexes:
F81_distance(),
GOC(),
HKY85_distance(),
JC69_distance(),
K80_distance(),
P_distance(),
T92_distance(),
TN93_distance(),
correlation_index(),
karyotype_index(),
logDet_distance(),
slidingWindow(),
strand_randomisation_index(),
synteny_index(),
tau_index()
Examples
K80_gap_distance(exampleSubstitutionMatrix)
#> [1] 0.313952
# When there are no gaps, it returns the same as the K80 distance
nogaps <- exampleSubstitutionMatrix
nogaps["-",] <- 0
nogaps[,"-"] <- 0
K80_gap_distance(nogaps)
#> [1] 0.2789688
K80_distance(nogaps)
#> [1] 0.2789688