vignettes/Translocations.Rmd
knitr::opts_chunk$set(cache = TRUE, cache.lazy = FALSE)
knitr::opts_knit$set(verbose = TRUE)
After coalescing colinear alignments, removing translocations of repeat-containing sequences and re-coalescing, colinearity is still broken hundreds of times.
Here we explore the role of translocations in scrambling Oikopleura genomes.
library('OikScrambling') |> suppressPackageStartupMessages()
load("BreakPoints.Rdata")
See vignette("LoadGenomicBreaks", package = "OikScrambling") for how the different GBreaks objects are prepared.
Details can be found in vignette("GenomicBreaks", package = "GenomicBreaks"), in vignette("StructuralVariants", package = "GenomicBreaks"), and in ?GenomicBreaks::flagTranslocations.
The blocks in sky blue map to the same sequence as their neighbors, but not colinearly. The span of the unaligned regions between them varies greatly.
showTranslocations(coa$Oki_Kum |> flagTranslocations()) -> x
plotApairOfChrs(x[1:3] |> swap() |> sort(ignore.strand = TRUE))
plotApairOfChrs(x[4:6])
plotApairOfChrs(x[7:9])
Notes:
Translocations in a genomic context resembling insertions/deletions are removed in the coa2 objects.
At the moment, the flagTranslocations function only searches the target genome.
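As a rough illustration of the pattern being flagged, here is a base-R sketch on toy data. It is not the GenomicBreaks implementation: the function name `flag_toy_translocations` and the `query_rank` vector are hypothetical, and the sketch ignores strand and sequence names. Blocks are assumed sorted by target position, and the flag is placed on the first block of each triplet, as in the flagTranslocations output shown in this vignette.

```r
# Rank of each block on the query genome, with blocks listed in target order.
# Block 3 (query rank 5) sits between two blocks that are neighbours on the query.
query_rank <- c(1, 2, 5, 3, 4, 6)

flag_toy_translocations <- function(query_rank) {
  n <- length(query_rank)
  flag <- rep(FALSE, n)
  if (n < 3) return(flag)
  i <- seq_len(n - 2)
  # Blocks i and i+2 are adjacent on the query, so block i+1 interrupts them.
  flag[i] <- (query_rank[i + 2] - query_rank[i]) == 1
  flag
}

tra <- flag_toy_translocations(query_rank)
which(tra)  # index of the first block of each flagged triplet
```

Because only the target order is scanned, the same event seen from the query genome would need the pair to be swapped first, which is consistent with the note above.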
trans_summary <- data.frame(
align = sapply(gbs, function(gb) sum(flagTranslocations(gb)$tra)),
mapped = sapply(coa, function(gb) sum(flagTranslocations(gb)$tra)),
tra_rm = sapply(coa2, function(gb) sum(flagTranslocations(gb)$tra))
)
trans_summary$pairname <- OikScrambling:::compDistance(rownames(trans_summary))
trans_summary$class <- OikScrambling:::compDistClass(rownames(trans_summary))
trans_summary$genus <- OikScrambling:::compGenus(rownames(trans_summary))
trans_summary$target <- sub("_.*", "", rownames(trans_summary))
ggplot(trans_summary |> tidyr::pivot_longer(c("align", "mapped", "tra_rm"))) +
aes(value, name) + geom_point(aes(col = class, pch = target)) +
ggtitle ("Number of translocations found after different post-processings",
subtitle = "align: raw, mapped: collapsed, tra_rm: translocated repeats removed.") +
facet_wrap(~genus)
## Warning: The shape palette can deal with a maximum of 6 discrete values because
## more than 6 becomes difficult to discriminate; you have 8. Consider
## specifying shapes manually if you must have them.
## Warning: Removed 24 rows containing missing values (`geom_point()`).
trans_summary
## align mapped tra_rm pairname class genus target
## Oki_Osa 428 513 276 Oki – North distant Oikopleura Oki
## Oki_Bar 389 461 252 Oki – North distant Oikopleura Oki
## Oki_Kum 1199 1288 252 In same pop same_or_sister Oikopleura Oki
## Oki_Aom 437 524 294 Oki – North distant Oikopleura Oki
## Oki_Nor 422 532 342 Oki – North distant Oikopleura Oki
## Osa_Oki 385 453 236 Oki – North distant Oikopleura Osa
## Osa_Bar 296 376 164 North – North intermediate Oikopleura Osa
## Osa_Kum 395 463 245 Oki – North distant Oikopleura Osa
## Osa_Aom 1010 1104 355 In same pop same_or_sister Oikopleura Osa
## Osa_Nor 410 576 383 North – North intermediate Oikopleura Osa
## Bar_Oki 355 422 229 Oki – North distant Oikopleura Bar
## Bar_Osa 296 400 201 North – North intermediate Oikopleura Bar
## Bar_Kum 368 432 225 Oki – North distant Oikopleura Bar
## Bar_Aom 297 390 191 North – North intermediate Oikopleura Bar
## Bar_Nor 1011 1156 694 In same pop same_or_sister Oikopleura Bar
## Ply_Ros 1581 1734 626 Int – Int same_or_sister Ciona Ply
## Ply_Rob 876 1011 393 Int – Rob close Ciona Ply
## Ply_Sav 133 227 225 Int/Rob – Sav distant Ciona Ply
## Ply_Oki 3 4 4 Int/Rob – Oki different_genus <NA> Ply
## Rob_Ros 952 1127 617 Int – Rob close Ciona Rob
## Rob_Ply 1106 1392 885 Int – Rob close Ciona Rob
## Rob_Sav 128 238 235 Int/Rob – Sav distant Ciona Rob
## Rob_Oki 9 10 10 Int/Rob – Oki different_genus <NA> Rob
## Dme_Dbu 231 316 280 Dme_Dbu distant Drosophila Dme
## Dme_Dsu 98 131 96 Dme_Dsu intermediate Drosophila Dme
## Dme_Dya 173 194 106 Dme_Dya close Drosophila Dme
## Dme_Dma 266 285 134 Dme_Dma same_or_sister Drosophila Dme
## Cni_Cbr 1820 2071 894 Cni_Cbr close Caenorhabditis Cni
## Cni_Cre 260 362 303 Cni_Cre intermediate Caenorhabditis Cni
## Cni_Cin 186 295 282 Cni_Cin distant Caenorhabditis Cni
## Cbr_Cni 2428 2749 1491 Cbr_Cni close Caenorhabditis Cbr
## Cbr_Cre 281 395 344 Cbr_Cre intermediate Caenorhabditis Cbr
## Cbr_Cel 246 390 340 Cbr_Cel distant Caenorhabditis Cbr
The number of detected translocations increases after coalescing, because the translocated region itself may be interrupted by a double gap in the raw alignment. As expected, it then decreases when the repeat-associated translocations are removed.
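To put numbers on the decrease, the sketch below computes the fraction of translocations lost between the `mapped` and `tra_rm` steps for a few pairs. The counts are copied by hand from the table above; the `counts` data frame and the `frac_removed` column are names introduced here for illustration only.

```r
# Counts copied from the `mapped` and `tra_rm` columns of trans_summary.
counts <- data.frame(
  row.names = c("Oki_Osa", "Oki_Kum", "Osa_Bar", "Osa_Aom", "Bar_Nor"),
  class     = c("distant", "same_or_sister", "intermediate",
                "same_or_sister", "same_or_sister"),
  mapped    = c(513, 1288, 376, 1104, 1156),
  tra_rm    = c(276, 252, 164, 355, 694)
)

# Fraction of flagged translocations that disappear after repeat removal.
counts$frac_removed <- round(1 - counts$tra_rm / counts$mapped, 2)
counts[order(-counts$frac_removed), ]
```

With the full objects loaded, the same ratio could be computed directly on `trans_summary` instead of on hand-copied values.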
We call a translocation “small” when the translocated block is shorter than the mean width of its two flanking colinear regions.
Small translocations are the majority. Only in same-population comparisons do they seem to associate strongly with repeats.
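The width criterion used by flagSmallTranslocations can be checked on toy numbers. The widths and flags below are hypothetical; the arithmetic mirrors the function, where `w0`, `w1` and `w2` are the widths of the flagged block, the next block (the translocated one) and the block after it.

```r
# Hypothetical block widths in target order; `tra` marks the first block
# of each flagged triplet.
w0  <- c(5000, 300, 4000, 800, 9000, 700)
tra <- c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE)

w1 <- c(tail(w0, -1), NA)      # width of the following block
w2 <- c(tail(w0, -2), NA, NA)  # width of the block two positions ahead

# TRUE when the two flanks together are wider than twice the middle block,
# i.e. the middle block is narrower than the mean of its flanks.
mid_smaller <- (w0 + w2 > 2 * w1) & tra
sum(mid_smaller)
```

In this toy set the first triplet (5000, 300, 4000) counts as small, while the second (800, 9000, 700) does not. The trailing `NA` values are harmless here because `NA & FALSE` is `FALSE` in R and the last two blocks cannot start a triplet.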
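flagSmallTranslocations <- function (a) {
  a <- flagTranslocations(a)
  # Widths of the flagged block (w0), the next block (w1) and the one after (w2).
  a$w0 <- width(a)
  a$w1 <- c(tail(a$w0, -1), NA)
  a$w2 <- c(tail(a$w0, -2), NA, NA)
  # The middle block is "small" when it is narrower than the mean of its flanks.
  a$midSmaller <- (a$w0 + a$w2 > 2 * a$w1) & a$tra
  a
}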
smallTransObjs <- SimpleList(
align_s = sapply(gbs, flagSmallTranslocations),
mapped_s = sapply(coa, flagSmallTranslocations),
tra_rm_s = sapply(coa2, flagSmallTranslocations)
)
cbind( trans_summary
, smallTransObjs |> sapply(sapply, \(gb) sum(gb$midSmaller)))
## align mapped tra_rm pairname class genus target
## Oki_Osa 428 513 276 Oki – North distant Oikopleura Oki
## Oki_Bar 389 461 252 Oki – North distant Oikopleura Oki
## Oki_Kum 1199 1288 252 In same pop same_or_sister Oikopleura Oki
## Oki_Aom 437 524 294 Oki – North distant Oikopleura Oki
## Oki_Nor 422 532 342 Oki – North distant Oikopleura Oki
## Osa_Oki 385 453 236 Oki – North distant Oikopleura Osa
## Osa_Bar 296 376 164 North – North intermediate Oikopleura Osa
## Osa_Kum 395 463 245 Oki – North distant Oikopleura Osa
## Osa_Aom 1010 1104 355 In same pop same_or_sister Oikopleura Osa
## Osa_Nor 410 576 383 North – North intermediate Oikopleura Osa
## Bar_Oki 355 422 229 Oki – North distant Oikopleura Bar
## Bar_Osa 296 400 201 North – North intermediate Oikopleura Bar
## Bar_Kum 368 432 225 Oki – North distant Oikopleura Bar
## Bar_Aom 297 390 191 North – North intermediate Oikopleura Bar
## Bar_Nor 1011 1156 694 In same pop same_or_sister Oikopleura Bar
## Ply_Ros 1581 1734 626 Int – Int same_or_sister Ciona Ply
## Ply_Rob 876 1011 393 Int – Rob close Ciona Ply
## Ply_Sav 133 227 225 Int/Rob – Sav distant Ciona Ply
## Ply_Oki 3 4 4 Int/Rob – Oki different_genus <NA> Ply
## Rob_Ros 952 1127 617 Int – Rob close Ciona Rob
## Rob_Ply 1106 1392 885 Int – Rob close Ciona Rob
## Rob_Sav 128 238 235 Int/Rob – Sav distant Ciona Rob
## Rob_Oki 9 10 10 Int/Rob – Oki different_genus <NA> Rob
## Dme_Dbu 231 316 280 Dme_Dbu distant Drosophila Dme
## Dme_Dsu 98 131 96 Dme_Dsu intermediate Drosophila Dme
## Dme_Dya 173 194 106 Dme_Dya close Drosophila Dme
## Dme_Dma 266 285 134 Dme_Dma same_or_sister Drosophila Dme
## Cni_Cbr 1820 2071 894 Cni_Cbr close Caenorhabditis Cni
## Cni_Cre 260 362 303 Cni_Cre intermediate Caenorhabditis Cni
## Cni_Cin 186 295 282 Cni_Cin distant Caenorhabditis Cni
## Cbr_Cni 2428 2749 1491 Cbr_Cni close Caenorhabditis Cbr
## Cbr_Cre 281 395 344 Cbr_Cre intermediate Caenorhabditis Cbr
## Cbr_Cel 246 390 340 Cbr_Cel distant Caenorhabditis Cbr
## align_s mapped_s tra_rm_s
## Oki_Osa 267 417 235
## Oki_Bar 241 383 215
## Oki_Kum 1060 1233 227
## Oki_Aom 275 429 250
## Oki_Nor 252 427 280
## Osa_Oki 253 375 204
## Osa_Bar 211 329 134
## Osa_Kum 262 390 211
## Osa_Aom 862 1036 293
## Osa_Nor 292 496 324
## Bar_Oki 224 349 191
## Bar_Osa 223 350 160
## Bar_Kum 243 361 190
## Bar_Aom 224 353 154
## Bar_Nor 777 986 570
## Ply_Ros 1227 1561 511
## Ply_Rob 658 941 352
## Ply_Sav 51 191 190
## Ply_Oki 2 3 3
## Rob_Ros 705 987 511
## Rob_Ply 776 1143 662
## Rob_Sav 62 208 205
## Rob_Oki 4 6 6
## Dme_Dbu 129 291 259
## Dme_Dsu 72 130 95
## Dme_Dya 154 186 100
## Dme_Dma 215 249 114
## Cni_Cbr 1464 1892 821
## Cni_Cre 142 339 285
## Cni_Cin 109 272 258
## Cbr_Cni 1933 2473 1328
## Cbr_Cre 145 369 322
## Cbr_Cel 125 366 318
coa$Oki_Osa |> flagTranslocations(both = FALSE) |> showTranslocations()
## GBreaks object with 1393 ranges and 9 metadata columns:
## seqnames ranges strand | query score
## <Rle> <IRanges> <Rle> | <GRanges> <integer>
## 13 chr1 589935-590156 + | Chr1:1311169-1311381 222
## 14 chr1 590893-591123 - | Chr1:1312994-1313215 231
## 15 chr1 591276-591419 + | Chr1:1312192-1312317 144
## 35 chr1 736824-738412 + | Chr1:2033491-2036651 1589
## 36 chr1 738882-741671 + | Chr1:2155668-2159253 2790
## ... ... ... ... . ... ...
## 8757 XSR 12687471-12687866 - | Chr2:3863401-3863796 396
## 8758 XSR 12688329-12690311 + | XSR:6432840-6434525 1983
## 8785 YSR 235526-282265 - | Chr2:1577776-1589273 46740
## 8786 YSR 283713-284365 + | Chr2:1601786-1609586 653
## 8787 YSR 287448-292843 - | Chr2:1568654-1571126 5396
## Arm rep repOvlp transcripts flag
## <factor> <CharacterList> <integer> <Rle> <character>
## 13 short <NA> 0 <NA> Tra
## 14 short <NA> 0 <NA> <NA>
## 15 short <NA> 0 <NA> <NA>
## 35 short rnd 98 <NA> Tra
## 36 short <NA> 0 g106.t1 <NA>
## ... ... ... ... ... ...
## 8757 XSR <NA> 0 <NA> <NA>
## 8758 XSR tandem,LowComplexity 74 g17003.t1;g17003.t2 <NA>
## 8785 YSR unknown,ltr-1,rnd,... 8342 <NA> Tra
## 8786 YSR ltr-1,tandem 49 <NA> <NA>
## 8787 YSR unknown,rnd,ltr-1,... 1742 <NA> <NA>
## nonCoa tra
## <logical> <Rle>
## 13 TRUE TRUE
## 14 TRUE FALSE
## 15 TRUE FALSE
## 35 FALSE TRUE
## 36 FALSE FALSE
## ... ... ...
## 8757 TRUE FALSE
## 8758 TRUE FALSE
## 8785 FALSE TRUE
## 8786 FALSE FALSE
## 8787 FALSE FALSE
## -------
## seqinfo: 19 sequences from OKI2018.I69 genome
addNormalisedArmNames <- function(gb) {
  # Better would be to flag query arms in the main objects, but I do not want
  # to push a rebuild now...
  genomeQ <- unique(genome(gb$query))
  gb$armNames <- paste(seqnames(gb), gb$Arm) |> tolower()
  gb$query$armNames <- paste(seqnames(gb$query), flagLongShort(gb$query, longShort[[genomeQ]])$Arm) |> tolower()
  gb
}
interArmTranslocations <- function(gb) {
translocations <- gb |> addNormalisedArmNames() |> flagTranslocations(both = FALSE) |> showTranslocations()
translocations[translocations$armNames != translocations$query$armNames]
}
interChrTranslocations <- function(gb) {
translocations <- gb |> flagTranslocations(both = FALSE) |> showTranslocations()
translocations[as.character(tolower(seqnames(translocations))) != as.character(tolower(seqnames(translocations$query)))]
}
interChrTra <- sapply(coa[1:15], interChrTranslocations) |> SimpleList()
# Need to improve subsetByOverlaps_GBreaks to allow ignoring queries when needed
(interChrTra$Oki_Osa)|> as("GRanges") |> subsetByOverlaps(coa$Oki_Bar, type = "within", ignore.strand = TRUE) |> as("GBreaks")
## GBreaks object with 99 ranges and 9 metadata columns:
## seqnames ranges strand | query score
## <Rle> <IRanges> <Rle> | <GRanges> <integer>
## 392 chr1 2218139-2218793 - | Chr2:10616806-10617403 655
## 633 chr1 3020827-3021941 - | PAR:1287529-1288984 1115
## 634 chr1 3021988-3026296 + | PAR:15424385-15427922 4309
## 743 chr1 3438590-3440682 + | PAR:2539758-2541854 2093
## 862 chr1 3931646-3932405 - | Chr2:2383635-2384403 760
## ... ... ... ... . ... ...
## 8626 XSR 11725261-11725823 + | PAR:2530103-2530772 563
## 8757 XSR 12687471-12687866 - | Chr2:3863401-3863796 396
## 8785 YSR 235526-282265 - | Chr2:1577776-1589273 46740
## 8786 YSR 283713-284365 + | Chr2:1601786-1609586 653
## 8787 YSR 287448-292843 - | Chr2:1568654-1571126 5396
## Arm rep repOvlp transcripts flag
## <factor> <CharacterList> <integer> <Rle> <character>
## 392 short <NA> 0 g539.t1 <NA>
## 633 short <NA> 0 g772.t1 <NA>
## 634 short rnd,tandem 163 <NA> <NA>
## 743 short <NA> 0 <NA> <NA>
## 862 short <NA> 0 g1034.t1 <NA>
## ... ... ... ... ... ...
## 8626 XSR <NA> 0 g16782.t1 <NA>
## 8757 XSR <NA> 0 <NA> <NA>
## 8785 YSR unknown,ltr-1,rnd,... 8342 <NA> Tra
## 8786 YSR ltr-1,tandem 49 <NA> <NA>
## 8787 YSR unknown,rnd,ltr-1,... 1742 <NA> <NA>
## nonCoa tra
## <logical> <Rle>
## 392 TRUE FALSE
## 633 FALSE FALSE
## 634 FALSE FALSE
## 743 TRUE FALSE
## 862 TRUE FALSE
## ... ... ...
## 8626 TRUE FALSE
## 8757 TRUE FALSE
## 8785 FALSE TRUE
## 8786 FALSE FALSE
## 8787 FALSE FALSE
## -------
## seqinfo: 19 sequences from OKI2018.I69 genome
(interChrTra$Oki_Bar)|> as("GRanges") |> subsetByOverlaps(coa$Oki_Osa, type = "within") |> as("GBreaks")
## GBreaks object with 48 ranges and 9 metadata columns:
## seqnames ranges strand | query score
## <Rle> <IRanges> <Rle> | <GRanges> <integer>
## 618 chr1 3006415-3008370 + | Chr2:747804-749281 1956
## 625 chr1 3032334-3036522 - | PAR:6665593-6669046 4189
## 627 chr1 3037442-3038334 - | PAR:6664130-6665096 893
## 755 chr1 3551783-3551945 + | PAR:12990472-12990634 163
## 855 chr1 3904642-3905709 + | PAR:3806639-3807330 1068
## ... ... ... ... . ... ...
## 6976 PAR 14357639-14359817 - | Chr2:10976243-10978419 2179
## 7133 PAR 15755978-15756608 + | XSR:8253159-8253790 631
## 7166 PAR 16388704-16389029 - | XSR:12028506-12028831 326
## 7644 XSR 4161326-4167846 - | Chr2:10163843-10170307 6521
## 7975 XSR 7350475-7351370 - | Chr2:655237-656179 896
## Arm rep repOvlp transcripts flag
## <factor> <CharacterList> <integer> <Rle> <character>
## 618 short <NA> 0 g769.t1 Tra
## 625 short tandem,rnd 0 g777.t4;g777.t3;g777.. Tra
## 627 short <NA> 0 g777.t3;g777.t2;g777.. <NA>
## 755 short <NA> 0 <NA> <NA>
## 855 short unknown 246 g1027.t1 Tra
## ... ... ... ... ... ...
## 6976 long rnd 73 <NA> <NA>
## 7133 long <NA> 0 <NA> <NA>
## 7166 long <NA> 0 g13106.t1 <NA>
## 7644 XSR tandem 0 g14457.t1 <NA>
## 7975 XSR <NA> 0 <NA> <NA>
## nonCoa tra
## <logical> <Rle>
## 618 FALSE TRUE
## 625 FALSE TRUE
## 627 TRUE FALSE
## 755 TRUE FALSE
## 855 FALSE TRUE
## ... ... ...
## 6976 TRUE FALSE
## 7133 TRUE FALSE
## 7166 TRUE FALSE
## 7644 TRUE FALSE
## 7975 TRUE FALSE
## -------
## seqinfo: 19 sequences from OKI2018.I69 genome
(interChrTra$Oki_Bar)|> as("GRanges") |> subsetByOverlaps(coa$Oki_Nor, type = "within") |> as("GBreaks")
## GBreaks object with 66 ranges and 9 metadata columns:
## seqnames ranges strand | query score
## <Rle> <IRanges> <Rle> | <GRanges> <integer>
## 560 chr1 2786840-2787602 + | Chr2:11272657-11273221 763
## 618 chr1 3006415-3008370 + | Chr2:747804-749281 1956
## 620 chr1 3010918-3020672 + | Chr2:749349-752515 9755
## 820 chr1 3790582-3790767 - | Chr2:3874473-3874658 186
## 853 chr1 3889113-3890233 - | XSR:5219411-5220619 1121
## ... ... ... ... . ... ...
## 7924 XSR 6158215-6160212 - | Chr2:1638589-1640592 1998
## 8006 XSR 8015454-8024003 - | Chr2:10589955-10597296 8550
## 8026 XSR 8326231-8326659 + | PAR:12179993-12180400 429
## 8186 XSR 10606421-10607002 + | Chr1:7772519-7773124 582
## 8288 XSR 11491875-11493771 - | Chr2:3515449-3517293 1897
## Arm rep repOvlp transcripts flag
## <factor> <CharacterList> <integer> <Rle> <character>
## 560 short <NA> 0 g685.t1 <NA>
## 618 short <NA> 0 g769.t1 Tra
## 620 short ltr-1 510 <NA> <NA>
## 820 short <NA> 0 g988.t1 <NA>
## 853 short <NA> 0 g1021.t1 <NA>
## ... ... ... ... ... ...
## 7924 XSR <NA> 0 <NA> <NA>
## 8006 XSR rnd,ltr-1 452 g15646.t1;g15647.t1 <NA>
## 8026 XSR <NA> 0 <NA> <NA>
## 8186 XSR <NA> 0 g16465.t1 <NA>
## 8288 XSR <NA> 0 g16719.t1 <NA>
## nonCoa tra
## <logical> <Rle>
## 560 FALSE FALSE
## 618 FALSE TRUE
## 620 FALSE FALSE
## 820 TRUE FALSE
## 853 TRUE FALSE
## ... ... ...
## 7924 TRUE FALSE
## 8006 FALSE FALSE
## 8026 TRUE FALSE
## 8186 TRUE FALSE
## 8288 TRUE FALSE
## -------
## seqinfo: 19 sequences from OKI2018.I69 genome
(interChrTra$Osa_Bar)|> as("GRanges") |> subsetByOverlaps(coa$Osa_Oki, type = "within") |> as("GBreaks")
## GBreaks object with 42 ranges and 9 metadata columns:
## seqnames ranges strand | query score
## <Rle> <IRanges> <Rle> | <GRanges> <integer>
## 19 Chr1 102613-105089 - | Chr2:11272161-11273579 2477
## 76 Chr1 446061-446186 - | Chr2:1762439-1762563 126
## 378 Chr1 2360831-2370002 + | XSR:11990727-12000901 9172
## 523 Chr1 3093634-3093871 + | YSR:1095082-1095309 238
## 652 Chr1 3907400-3907935 + | Chr2:12914421-12914880 536
## ... ... ... ... . ... ...
## 3617 XSR 9134655-9134746 + | Chr2:9342645-9342736 92
## 3652 XSR 9668252-9668390 - | Chr2:1638459-1638569 139
## 3714 XSR 10394962-10395341 - | Chr2:455161-455540 380
## 3778 XSR 11180724-11180835 + | YSR:1969047-1969146 112
## 3902 YSR 805902-806197 - | Chr1:2376560-2376859 296
## Arm rep repOvlp transcripts flag
## <factor> <CharacterList> <integer> <Rle> <character>
## 19 short rnd 68 g26.t1;g26.t2 <NA>
## 76 short rnd 126 g107.t1 <NA>
## 378 short rnd,unknown,LowComplexity 3021 g528.t1 <NA>
## 523 short <NA> 0 <NA> <NA>
## 652 short rnd 0 <NA> <NA>
## ... ... ... ... ... ...
## 3617 XSR <NA> 0 <NA> <NA>
## 3652 XSR <NA> 0 g14406.t1 <NA>
## 3714 XSR rnd 0 <NA> <NA>
## 3778 XSR <NA> 0 g14817.t1 <NA>
## 3902 YSR rnd 293 <NA> <NA>
## nonCoa tra
## <logical> <Rle>
## 19 FALSE FALSE
## 76 TRUE FALSE
## 378 FALSE FALSE
## 523 TRUE FALSE
## 652 TRUE FALSE
## ... ... ...
## 3617 TRUE FALSE
## 3652 TRUE FALSE
## 3714 TRUE FALSE
## 3778 TRUE FALSE
## 3902 TRUE FALSE
## -------
## seqinfo: 483 sequences from OSKA2016v1.9 genome
(interChrTra$Oki_Osa)|> as("GRanges") |> subsetByOverlaps(cleanGaps(interChrTra$Oki_Bar), type = "within", ignore.strand = TRUE) |> as("GBreaks")
## GBreaks object with 77 ranges and 9 metadata columns:
## seqnames ranges strand | query score
## <Rle> <IRanges> <Rle> | <GRanges> <integer>
## 392 chr1 2218139-2218793 - | Chr2:10616806-10617403 655
## 572 chr1 2804891-2806071 + | YSR:1253745-1254981 1181
## 605 chr1 2918232-2919567 - | Chr2:10559416-10560341 1336
## 634 chr1 3021988-3026296 + | PAR:15424385-15427922 4309
## 743 chr1 3438590-3440682 + | PAR:2539758-2541854 2093
## ... ... ... ... . ... ...
## 8232 XSR 7236144-7236655 + | PAR:4358158-4358668 512
## 8236 XSR 7355890-7357438 - | Chr2:3684734-3686177 1549
## 8238 XSR 7439674-7441392 - | PAR:5237440-5239089 1719
## 8368 XSR 9229608-9229906 + | YSR:358047-358349 299
## 8546 XSR 11090022-11090574 - | Chr1:7962513-7963091 553
## Arm rep repOvlp transcripts flag nonCoa
## <factor> <CharacterList> <integer> <Rle> <character> <logical>
## 392 short <NA> 0 g539.t1 <NA> TRUE
## 572 short rnd 1 <NA> <NA> TRUE
## 605 short tandem 0 <NA> <NA> FALSE
## 634 short rnd,tandem 163 <NA> <NA> FALSE
## 743 short <NA> 0 <NA> <NA> TRUE
## ... ... ... ... ... ... ...
## 8232 XSR <NA> 0 <NA> <NA> TRUE
## 8236 XSR <NA> 0 <NA> <NA> TRUE
## 8238 XSR <NA> 0 g15473.t1 <NA> FALSE
## 8368 XSR rnd 0 <NA> <NA> TRUE
## 8546 XSR <NA> 0 <NA> <NA> TRUE
## tra
## <Rle>
## 392 FALSE
## 572 FALSE
## 605 FALSE
## 634 FALSE
## 743 FALSE
## ... ...
## 8232 FALSE
## 8236 FALSE
## 8238 FALSE
## 8368 FALSE
## 8546 FALSE
## -------
## seqinfo: 19 sequences from OKI2018.I69 genome
interChrTra$Oki_Osa |> as("GRanges") |> subsetByOverlaps(cleanGaps(interChrTra$Oki_Bar), type = "within", ignore.strand = TRUE) |> subsetByOverlaps(gbs$Oki_Bar, ignore.strand = TRUE) |> as("GBreaks")
## GBreaks object with 34 ranges and 9 metadata columns:
## seqnames ranges strand | query score
## <Rle> <IRanges> <Rle> | <GRanges> <integer>
## 605 chr1 2918232-2919567 - | Chr2:10559416-10560341 1336
## 634 chr1 3021988-3026296 + | PAR:15424385-15427922 4309
## 862 chr1 3931646-3932405 - | Chr2:2383635-2384403 760
## 954 chr1 4306180-4309907 - | Chr2:11892246-11895803 3728
## 1425 chr1 7584062-7667431 - | Chr2:13302398-13374732 83370
## ... ... ... ... . ... ...
## 6965 PAR 11801780-11802877 - | XSR:1771745-1772855 1098
## 7429 PAR 15998120-16007542 - | Chr1:2494108-2501994 9423
## 7973 XSR 4742322-4743280 - | PAR:12886116-12887043 959
## 8109 XSR 5505287-5506181 + | Chr2:137635-138385 895
## 8238 XSR 7439674-7441392 - | PAR:5237440-5239089 1719
## Arm rep repOvlp transcripts
## <factor> <CharacterList> <integer> <Rle>
## 605 short tandem 0 <NA>
## 634 short rnd,tandem 163 <NA>
## 862 short <NA> 0 g1034.t1
## 954 short <NA> 0 g1141.t1;g1141.t2
## 1425 long rnd,tandem,unknown,... 3183 g2015.t1;g2017.t1;g2..
## ... ... ... ... ...
## 6965 long <NA> 0 g11839.t1
## 7429 long ltr-1 9265 g13014.t1
## 7973 XSR <NA> 0 <NA>
## 8109 XSR <NA> 0 g14850.t1
## 8238 XSR <NA> 0 g15473.t1
## flag nonCoa tra
## <character> <logical> <Rle>
## 605 <NA> FALSE FALSE
## 634 <NA> FALSE FALSE
## 862 <NA> TRUE FALSE
## 954 <NA> TRUE FALSE
## 1425 Tra FALSE TRUE
## ... ... ... ...
## 6965 <NA> TRUE FALSE
## 7429 <NA> FALSE FALSE
## 7973 <NA> TRUE FALSE
## 8109 <NA> TRUE FALSE
## 8238 <NA> FALSE FALSE
## -------
## seqinfo: 19 sequences from OKI2018.I69 genome
findSpecificTra <- function(pair1, pair2) {
  Tra1 <- interChrTra[[pair1]]
  Tra2 <- interChrTra[[pair2]]
  # First, Tra1 must not overlap with Tra2.
  # Strand is ignored because it can vary locally due to inversions.
  Tra1_ov_Tra2 <- findOverlaps(Tra1, Tra2, ignore.strand = TRUE)
  Tra1_not_in_Tra2 <- Tra1[! seq_along(Tra1) %in% unique(queryHits(Tra1_ov_Tra2))]
  # Selected ranges must not overlap with unmapped ranges in pair2.
  Tra1_ov_unmap_pair2 <- findOverlaps(Tra1_not_in_Tra2, unmap[[pair2]], ignore.strand = TRUE)
  Tra1_not_in_Tra2[! seq_along(Tra1_not_in_Tra2) %in% queryHits(Tra1_ov_unmap_pair2)]
  # Tra1 is not in Tra2 if it is covered by a gap region of Tra2.
  # Strand is ignored because it can vary locally due to inversions.
  # Tra1_not_in_Tra2 <- findOverlaps(cleanGaps(Tra2), Tra1, type = "within", ignore.strand = TRUE)
  # Tra1[unique(subjectHits(Tra1_not_in_Tra2))]
}
# findSpecificTra <- function(pair1, pair2) {
# Tra1 <- interChrTra[[pair1]]
# Tra2 <- interChrTra[[pair2]]
# Tra1_not_in_pair2 <- findOverlaps(coa[[pair2]], Tra1 + 1000, type = "within", ignore.strand = TRUE)
# Tra1[unique(subjectHits(Tra1_not_in_pair2))]
# }
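The two filters described in the comments of findSpecificTra (no overlap with the translocations of the other pair, then no overlap with its unmapped regions) can be sketched with plain data frames. This is only an illustration of the set logic under simplifying assumptions: `overlaps_any` is a hypothetical helper standing in for findOverlaps, and the intervals are invented.

```r
# Hypothetical helper: TRUE for each interval of `a` that overlaps any interval
# of `b` on the same sequence (closed coordinates, strand ignored).
overlaps_any <- function(a, b) {
  vapply(seq_len(nrow(a)), function(i) {
    any(b$seq == a$seq[i] & b$start <= a$end[i] & b$end >= a$start[i])
  }, logical(1))
}

# Toy translocations in pair 1 and pair 2, and toy unmapped regions of pair 2.
tra1   <- data.frame(seq = c("chr1", "chr1", "chr2"),
                     start = c(100, 500, 100), end = c(200, 600, 200))
tra2   <- data.frame(seq = "chr1", start = 150, end = 250)
unmap2 <- data.frame(seq = "chr2", start = 50, end = 120)

# Step 1: drop translocations also seen in pair 2.
step1 <- tra1[!overlaps_any(tra1, tra2), ]
# Step 2: drop those falling in regions that pair 2 could not map at all.
specific <- step1[!overlaps_any(step1, unmap2), ]
specific
```

Only the chr1:500-600 interval survives both steps. The second filter matters because a region that is unmapped in the other pair gives no evidence either way about the presence of the translocation.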