Fig. 1 | BMC Medical Genomics

Fig. 1

From: De novo assembly and characterization of breast cancer transcriptomes identifies large numbers of novel fusion-gene transcripts of potential functional significance

Computational workflow for chimeric transcript discovery. The central blue blocks show the workflow, orange boxes represent the tools and programs integrated into the workflow, purple boxes represent RNA-Seq reads, and green boxes represent datasets from the UCSC genome database. RNA-Seq reads (in FASTQ format) were trimmed, and only paired-end reads were used for the assembly. Assembled contigs (in FASTA format) were then aligned to the reference genome, and the resulting alignment files (in pslx format) were analyzed by R-SAP to detect potential fusion transcripts. Fusion transcripts were further characterized by comparing their alignment coordinates with known reference transcripts (BED format) using R-SAP. Part of the filtering was done internally by R-SAP, while additional filtering was done using in-house Perl scripts. A re-confirmation step aligns RNA-Seq reads to the chimeric transcript sequences and to the reference genome using Bowtie 1 and Bowtie 2, respectively. Alignment files (in BAM format) resulting from aligning RNA-Seq reads to fusion-transcript sequences were used to estimate raw read counts by expectation-maximization using RSEM.
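
The coordinate-comparison step in the caption can be illustrated with a minimal Python sketch. It is not R-SAP's algorithm, which the caption does not describe. It only assumes that a contig is a fusion candidate when two of its aligned fragments overlap different reference transcripts. The names `Hit`, `Gene`, `genes_overlapping`, and `candidate_fusions` are hypothetical, and the records are simplified stand-ins for pslx alignments and BED annotations.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Hit(NamedTuple):
    """One aligned fragment of an assembled contig (simplified, PSL-like)."""
    contig: str
    chrom: str
    start: int  # 0-based, half-open genomic interval
    end: int


class Gene(NamedTuple):
    """One reference transcript, mirroring the first four BED columns."""
    chrom: str
    start: int
    end: int
    name: str


def genes_overlapping(hit, genes):
    """Return the names of genes whose interval overlaps the hit."""
    return {
        g.name
        for g in genes
        if g.chrom == hit.chrom and hit.start < g.end and g.start < hit.end
    }


def candidate_fusions(hits, genes):
    """Return {contig: sorted gene names} for contigs that have at least
    two fragments overlapping disjoint sets of genes (an assumed criterion)."""
    per_contig = defaultdict(list)
    for hit in hits:
        names = genes_overlapping(hit, genes)
        if names:  # fragments in unannotated regions are ignored here
            per_contig[hit.contig].append(names)

    result = {}
    for contig, gene_sets in per_contig.items():
        if any(a.isdisjoint(b) for a, b in combinations(gene_sets, 2)):
            result[contig] = sorted(set().union(*gene_sets))
    return result


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    genes = [Gene("chr1", 100, 500, "GENE_A"), Gene("chr2", 1000, 2000, "GENE_B")]
    hits = [
        Hit("contig1", "chr1", 150, 300),
        Hit("contig1", "chr2", 1200, 1500),
        Hit("contig2", "chr1", 200, 400),
    ]
    print(candidate_fusions(hits, genes))
```

In this toy input, only `contig1` has fragments in two different genes, so it is the only contig reported. A real pipeline would also need the filtering and read-level re-confirmation steps that the caption mentions.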
