Takes a gene annotation and computes the coordinates of operons based on colinearity (all genes on the same strand, with no interruption by opposite-strand genes) and distance (all genes within a maximal distance of each other). Nested genes on the opposite strand do not interrupt colinearity, as they do not interrupt transcription.

calcOperons(genes, window = 100)

Arguments

genes

A GRanges object representing gene annotations.

window

The maximum distance separating two genes of the same operon. It has to be an even number.

Value

A GRanges object representing operons, with a metadata column n reporting the number of genes in each operon. If the input object had a gene_id column, then the gene identifiers will be reported as an IRanges::CharacterList() in a gene_id column of the operons object.

Note

The window parameter must be an even number because the algorithm expands the gene ranges by half the window size on each side. If an odd number is provided, no error message is produced, but the results will be the same as if the window were 1 base greater.

Examples

genes <- GRanges(c("chr1:100-199:+", "chr1:300-400:+",
                   "chr1:500-600:+", "chr1:700-800:-"))
genes$gene_id <- LETTERS[seq_along(genes)]
OikScrambling:::calcOperons(genes, window = 100)
#> GRanges object with 1 range and 2 metadata columns:
#>       seqnames    ranges strand |         n         gene_id
#>          <Rle> <IRanges>  <Rle> | <integer> <CharacterList>
#>   [1]     chr1   300-600      + |         2             B,C
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
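
The distance criterion described in the Note (expanding each gene by half the window, then merging ranges that overlap) can be sketched as follows. This is only an illustration, not the package's implementation: mergeByWindow is a hypothetical helper, the choice of reduce() with min.gapwidth = 0L is an assumption, and the sketch handles neither interruption by opposite-strand genes nor the gene_id column.

```r
library(GenomicRanges)

# Hypothetical helper (not part of OikScrambling): illustrates only the
# distance criterion, assuming an even `window`.
mergeByWindow <- function(genes, window = 100) {
  half <- window / 2
  # Grow every gene by half the window on each side.
  expanded <- genes + half
  # Merge overlapping ranges strand-wise; min.gapwidth = 0L keeps ranges
  # that are merely adjacent separate. revmap records the merged genes.
  merged <- reduce(expanded, min.gapwidth = 0L, with.revmap = TRUE)
  # Shrink back to the original gene boundaries.
  merged <- merged - half
  # Count the genes in each merged range and keep groups of two or more.
  merged$n <- elementNROWS(merged$revmap)
  merged[merged$n > 1]
}

genes <- GRanges(c("chr1:100-199:+", "chr1:300-400:+",
                   "chr1:500-600:+", "chr1:700-800:-"))
mergeByWindow(genes, window = 100)
```

On the example genes above, this sketch is expected to return the same single range as calcOperons(): the first two genes are separated by exactly 100 bases, so their expanded ranges touch without overlapping and are not merged.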