Given a GBreaks object representing the alignment of a query genome on a target genome, finds the longest match on the target for each sequence level of the query (contigs, scaffolds, etc.).

Usage

longestMatchesInTarget(gb, min.width = 10000, min.matches = 2)

Arguments

gb

A GBreaks object

min.width

Minimum width (on the query genome) that a match must have to be considered.

min.matches

Discard query sequences that have fewer than min.matches longest matches on the target. The default is 2, so that only results relevant to chaining genomes are kept.

Value

Returns a GRangesList object containing one GBreaks object per sequence on the query genome.
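
As a minimal sketch of how the returned value might be inspected, assuming the GenomicBreaks package is attached and reusing the exampleColinear3 object and the relaxed thresholds from the Examples section, the usual list accessors apply:

```r
library(GenomicBreaks)

# Relaxed thresholds (as in the Examples section) so that the small
# example object is not filtered out by the defaults.
res <- exampleColinear3 |>
  longestMatchesInTarget(min.width = 0, min.matches = 1)

length(res)   # number of list elements that were kept
names(res)    # names of the list elements
res[[1]]      # first element, itself a GBreaks object
```

With the default min.width and min.matches, a small object such as this one may yield an empty list, which is why the thresholds are lowered here.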

Details

Each sequence of the query is represented only once in the output, but a sequence of the target genome can be represented multiple times if it is the longest match of several query sequences. When the target genome is more contiguous than the query genome and there are no major structural variations between them, this reveals arrangements of colinear sequences in the query genome.

For a more compact version of the results, the output of this function can be piped to strandNames(query = TRUE).
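
The compact form mentioned above could be obtained as follows. This is only a sketch: it assumes the GenomicBreaks package is attached and reuses exampleColinear3 with the relaxed thresholds from the Examples section. The variable name compact is illustrative.

```r
library(GenomicBreaks)

# Pipe the list of longest matches to strandNames() for a compact summary.
compact <- exampleColinear3 |>
  longestMatchesInTarget(min.width = 0, min.matches = 1) |>
  strandNames(query = TRUE)

compact
```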

Author

Charles Plessy

Examples

exampleColinear3
#> GBreaks object with 3 ranges and 1 metadata column:
#>       seqnames    ranges strand |        query
#>          <Rle> <IRanges>  <Rle> |    <GRanges>
#>   [1]     chrA   100-200      + | chrB:100-200
#>   [2]     chrA   201-300      + | chrB:201-300
#>   [3]     chrA   301-400      + | chrB:301-400
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome
exampleColinear3 |> longestMatchesInTarget(min.width = 0, min.matches = 1)
#> GRangesList object of length 1:
#> $chrB
#> GBreaks object with 1 range and 1 metadata column:
#>       seqnames    ranges strand |        query
#>          <Rle> <IRanges>  <Rle> |    <GRanges>
#>   [1]     chrB   100-200      + | chrA:100-200
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome
#>