Flags alignments that are colinear with the next one in sequence order. The flag is added to the first alignment. Colinearity is defined by the fact that the next alignment on the target genome is the same as the next alignment on the query genome, with strand information being taken into account.
Arguments
- gb
GBreaks
object of the pairwise alignment.- tol
Unaligned regions larger than this tolerance threshold will interrupt colinearity.
- minwidth
Remove the intervals whose width smaller than this value.
- details
Returns more metadata columns if
TRUE
.
Value
Returns a modified GBreaks
object with a new colinear
metadata
column indicating if an alignment is colinear with the next one. If the
details
option is set to TRUE
, it will also output the relative position
of the following and previous alignments on the target and query genomes
(tfoll
, tprev
, qfoll
, qprev
), a partial colinearity flag for each
genome (t_col
and q_col
), and the distance to the next alignment on
each genome (tdist
and qdist
).
Details
Internally, flagColinearAlignments()
uses the GenomicRanges::precede
and GenomicRanges::follow
functions functions to determine what is the
next. For a given range, these functions return the index position of the
range it precedes or follows, or NA
as the first range follows nothing and
the last range precedes nothing. See the examples for details.
Note
The flags are only valid as long as the Genomic Breaks object is not modified by removing alignments or sorting them in a different order.
Pay attention that if the mindwidth
option is passed, some intervals
are discarded from the returned object. This parameter might be removed in
the future if too confusing or useless.
See also
Other Flagging functions:
flagAll()
,
flagDoubleInversions()
,
flagInversions()
,
flagLongShort()
,
flagPairs()
,
flagTranslocations()
Other Colinearity functions:
GOC()
,
bridgeRegions()
,
chain_contigs()
,
coalesce_contigs()
,
dist2next()
,
filterColinearRegions()
,
flagPairs()
Examples
flagColinearAlignments(exampleColinear3)
#> GBreaks object with 3 ranges and 2 metadata columns:
#> seqnames ranges strand | query colinear
#> <Rle> <IRanges> <Rle> | <GRanges> <logical>
#> [1] chrA 100-200 + | chrB:100-200 TRUE
#> [2] chrA 201-300 + | chrB:201-300 TRUE
#> [3] chrA 301-400 + | chrB:301-400 FALSE
#> -------
#> seqinfo: 1 sequence from an unspecified genome
flagColinearAlignments(exampleColinear3, details = TRUE)
#> GBreaks object with 3 ranges and 10 metadata columns:
#> seqnames ranges strand | query tfoll qfoll tprev
#> <Rle> <IRanges> <Rle> | <GRanges> <integer> <integer> <integer>
#> [1] chrA 100-200 + | chrB:100-200 1 1 <NA>
#> [2] chrA 201-300 + | chrB:201-300 1 1 -1
#> [3] chrA 301-400 + | chrB:301-400 <NA> <NA> -1
#> qprev t_col q_col tdist qdist colinear
#> <integer> <logical> <logical> <numeric> <numeric> <logical>
#> [1] <NA> TRUE TRUE 1 1 TRUE
#> [2] -1 TRUE TRUE 1 1 TRUE
#> [3] -1 <NA> <NA> Inf Inf FALSE
#> -------
#> seqinfo: 1 sequence from an unspecified genome
# Target range [1] precedes target range [2]
precede(exampleColinear3)
#> [1] 2 3 NA
# Query range [1] precedes query range [2]
precede(exampleColinear3$query)
#> [1] 2 3 NA
# Ranges on the minus strand
gb2 <- exampleColinear3 |> reverse() |> sort(ignore.strand = TRUE)
flagColinearAlignments(gb2)
#> GBreaks object with 3 ranges and 2 metadata columns:
#> seqnames ranges strand | query colinear
#> <Rle> <IRanges> <Rle> | <GRanges> <logical>
#> [1] chrA 201-300 - | chrB:301-400 TRUE
#> [2] chrA 301-400 - | chrB:201-300 TRUE
#> [3] chrA 401-501 - | chrB:100-200 FALSE
#> -------
#> seqinfo: 1 sequence from an unspecified genome
# Target range [1] follows target range [2]
follow(gb2)
#> [1] 2 3 NA
# Or, ignoring strand, target range [1] precedes target range [2]
precede(gb2, ignore.strand = TRUE)
#> [1] 2 3 NA
# Query range [1] follows query range [2]
follow(gb2$query)
#> [1] 2 3 NA
# Colinearity check on strandless objects
gb3 <- exampleColinear3
gb4 <- gb2
strand(gb4) <- strand(gb3) <- "*"
flagColinearAlignments(gb3)
#> GBreaks object with 3 ranges and 2 metadata columns:
#> seqnames ranges strand | query colinear
#> <Rle> <IRanges> <Rle> | <GRanges> <logical>
#> [1] chrA 100-200 * | chrB:100-200 TRUE
#> [2] chrA 201-300 * | chrB:201-300 TRUE
#> [3] chrA 301-400 * | chrB:301-400 FALSE
#> -------
#> seqinfo: 1 sequence from an unspecified genome
flagColinearAlignments(gb4)
#> GBreaks object with 3 ranges and 2 metadata columns:
#> seqnames ranges strand | query colinear
#> <Rle> <IRanges> <Rle> | <GRanges> <logical>
#> [1] chrA 201-300 * | chrB:301-400 TRUE
#> [2] chrA 301-400 * | chrB:201-300 TRUE
#> [3] chrA 401-501 * | chrB:100-200 FALSE
#> -------
#> seqinfo: 1 sequence from an unspecified genome
# Ranges that should not coalesce because they are not
# ordered properly
flagColinearAlignments(exampleNotColinear)
#> GBreaks object with 2 ranges and 2 metadata columns:
#> seqnames ranges strand | query colinear
#> <Rle> <IRanges> <Rle> | <GRanges> <logical>
#> [1] chrA 100-150 + | chrB:201-251 FALSE
#> [2] chrA 251-300 + | chrB:50-100 FALSE
#> -------
#> seqinfo: 1 sequence from an unspecified genome