Takes a gene annotation and computes the coordinates of operons based on colinearity (all genes on the same strand, with no interruption by opposite-strand genes) and distance (all genes within a maximal distance of each other). Nested genes on the opposite strand do not interrupt colinearity, as they do not interrupt transcription.
calcOperons(genes, window = 100)
genes: A GRanges object representing gene annotations.
window: The maximum distance, in bases, separating two genes of the same operon. It must be an even number.
A GRanges object representing operons, with a metadata column n reporting the number of genes in each operon. If the input object had a gene_id column, then the gene identifiers will be reported as an IRanges::CharacterList() in a gene_id column in the operons object.
The window parameter must be an even number because the algorithm expands the gene ranges by half the window size on each side. If an odd number is provided, no error message is produced, but the results will be the same as if the window were 1 base greater.
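The half-window expansion can be illustrated with plain integer coordinates. This is a minimal base R sketch, not the package's implementation: the variable names are hypothetical, strand is ignored (only same-strand genes are included), and it assumes that two neighbouring genes are joined when their expanded ranges share at least one base.

```r
# Illustrative only: plain vectors stand in for a GRanges object.
ids    <- c("A", "B", "C")
starts <- c(100L, 300L, 500L)
ends   <- c(199L, 400L, 600L)
window <- 100L

# Expand each gene by half the window on each side.
half       <- window / 2
exp_starts <- starts - half
exp_ends   <- ends + half

# Assumption: neighbours are joined when the expanded end of one gene
# reaches the expanded start of the next (genes are sorted by position).
overlaps_next <- exp_ends[-length(exp_ends)] >= exp_starts[-1]
overlaps_next  # A-B: FALSE, B-C: TRUE

# Start a new group whenever a gene does not overlap the previous one.
group   <- cumsum(c(TRUE, !overlaps_next))
operons <- split(ids, group)
```

Under these assumptions, gene A stays alone while B and C fall into the same group, which agrees with the 300-600 operon reported in the example.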
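library(GenomicRanges)  # provides the GRanges() constructor
# Three genes on the plus strand followed by one on the minus strand
genes <- GRanges(c("chr1:100-199:+", "chr1:300-400:+",
                   "chr1:500-600:+", "chr1:700-800:-"))
genes$gene_id <- LETTERS[seq_along(genes)]
OikScrambling:::calcOperons(genes, window = 100)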
#> GRanges object with 1 range and 2 metadata columns:
#>       seqnames    ranges strand |         n         gene_id
#>          <Rle> <IRanges>  <Rle> | <integer> <CharacterList>
#>   [1]     chr1   300-600      + |         2             B,C
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths