Takes a gene annotation and computes the coordinates of operons based on colinearity (all genes on the same strand, with no interruption by opposite-strand genes) and distance (all genes within a maximal distance of each other). Nested genes on the opposite strand do not interrupt colinearity, as they do not interrupt transcription.
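As a rough illustration of the colinearity criterion only, the base R sketch below sorts genes by start, sets aside genes nested inside an earlier gene, and labels runs of consecutive same-strand genes. The helper `colinear_runs()` is hypothetical and is not how `calcOperons()` is implemented; it assumes a single chromosome and simplifies by dropping every nested gene regardless of strand.

```r
# Hypothetical helper (not part of OikScrambling): label runs of
# consecutive same-strand genes on one chromosome.
colinear_runs <- function(start, end, strand) {
  ord    <- order(start)
  start  <- start[ord]
  end    <- end[ord]
  strand <- as.character(strand[ord])

  # A gene counts as nested when it ends no later than some earlier gene.
  # Simplification: nested genes are dropped whatever their strand.
  prev_max_end <- c(-Inf, head(cummax(end), -1))
  keep <- !(end <= prev_max_end)

  # Each run of identical strand values is one colinear block.
  runs <- rle(strand[keep])
  data.frame(
    start  = start[keep],
    end    = end[keep],
    strand = strand[keep],
    run    = rep(seq_along(runs$lengths), runs$lengths)
  )
}

# Three plus-strand genes, one minus-strand gene nested in the second
# gene (320-350), and one minus-strand gene downstream.
blocks <- colinear_runs(
  start  = c(100, 300, 320, 500, 700),
  end    = c(199, 400, 350, 600, 800),
  strand = c("+", "+", "-", "+", "-")
)
blocks
```

In this toy input the nested minus-strand gene is set aside, so the three plus-strand genes stay in one run and the downstream minus-strand gene starts a second run. The distance criterion is not applied here.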
calcOperons(genes, window = 100)
genes: A GRanges object representing gene annotations.
window: The maximum distance, in bases, separating two genes of the same operon. It has to be an even number.
A GRanges object representing operons, with a metadata column n
reporting the number of genes in each operon. If the input object had a
gene_id column, then the gene identifiers will be reported as an
IRanges::CharacterList() in a gene_id column in the operons object.
The window parameter must be an even number because the algorithm
expands the gene ranges by half the window size on each side. If an
odd number is provided, no error message is produced, but the results will be
the same as if the window were 1 base greater.
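The half-window expansion can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: `merge_by_half_window()` is a hypothetical helper, it handles a single strand of a single chromosome, and it assumes that only overlapping (not merely adjacent) extended ranges are merged, an assumption chosen so that the result agrees with the example output on this page.

```r
# Hypothetical helper (not part of OikScrambling): merge same-strand
# intervals whose half-window extensions overlap.
merge_by_half_window <- function(start, end, window = 100) {
  half  <- window / 2
  ord   <- order(start)
  start <- start[ord]
  end   <- end[ord]

  # Extend every interval by half the window on each side.
  s <- start - half
  e <- end + half

  # A new cluster begins when the extended start lies strictly beyond
  # all previous extended ends (touching ranges are not merged).
  prev_max_end <- c(-Inf, head(cummax(e), -1))
  cluster <- cumsum(s > prev_max_end)

  # Report the original (unextended) span and gene count per cluster.
  data.frame(
    start = as.vector(tapply(start, cluster, min)),
    end   = as.vector(tapply(end, cluster, max)),
    n     = as.vector(table(cluster))
  )
}

# Plus-strand genes from the example on this page.
plus_start <- c(100, 300, 500)
plus_end   <- c(199, 400, 600)

merged  <- merge_by_half_window(plus_start, plus_end, window = 100)
operons <- merged[merged$n >= 2, ]  # keep clusters of two or more genes
operons                             # one operon: 300-600, n = 2

# A slightly wider window also bridges the 100-base gap after the first gene.
wider <- merge_by_half_window(plus_start, plus_end, window = 102)
wider                               # one cluster: 100-600, n = 3
```

With `window = 100`, the first two genes are separated by 100 bases, so their extended ranges (50-249 and 250-450) only touch and stay separate, while the 99-base gap between the second and third genes is bridged.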
library(GenomicRanges)  # provides GRanges()
genes <- GRanges(c("chr1:100-199:+", "chr1:300-400:+",
                   "chr1:500-600:+", "chr1:700-800:-"))
genes$gene_id <- LETTERS[seq_along(genes)]
OikScrambling:::calcOperons(genes, window = 100)
#> GRanges object with 1 range and 2 metadata columns:
#>       seqnames    ranges strand |         n         gene_id
#>          <Rle> <IRanges>  <Rle> | <integer> <CharacterList>
#>   [1]     chr1   300-600      + |         2             B,C
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths