Publication:
IVT-seq reveals extreme bias in RNA-sequencing

dc.contributor.coauthor: Lahens, Nicholas F.
dc.contributor.coauthor: Zhang, Ray
dc.contributor.coauthor: Hayer, Katharina
dc.contributor.coauthor: Black, Michael B.
dc.contributor.coauthor: Dueck, Hannah
dc.contributor.coauthor: Pizarro, Angel
dc.contributor.coauthor: Kim, Junhyong
dc.contributor.coauthor: Irizarry, Rafael
dc.contributor.coauthor: Thomas, Russell S.
dc.contributor.coauthor: Grant, Gregory R.
dc.contributor.coauthor: Hogenesch, John B.
dc.contributor.department: Department of Chemical and Biological Engineering
dc.contributor.kuauthor: Kavaklı, İbrahim Halil
dc.contributor.kuprofile: Faculty Member
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.yokid: 40319
dc.date.accessioned: 2024-11-09T12:25:42Z
dc.date.issued: 2014
dc.description.abstract: Background: RNA-seq is a powerful technique for identifying and quantifying transcription and splicing events, both known and novel. However, given its recent development and the proliferation of library construction methods, understanding the bias it introduces is incomplete but critical to realizing its value. Results: We present a method, in vitro transcription sequencing (IVT-seq), for identifying and assessing the technical biases in RNA-seq library generation and sequencing at scale. We created a pool of over 1,000 in vitro transcribed RNAs from a full-length human cDNA library and sequenced them with polyA and total RNA-seq, the most common protocols. Because each cDNA is full length, and we show in vitro transcription is incredibly processive, each base in each transcript should be equivalently represented. However, with common RNA-seq applications and platforms, we find 50% of transcripts have more than two-fold and 10% have more than 10-fold differences in within-transcript sequence coverage. We also find greater than 6% of transcripts have regions of dramatically unpredictable sequencing coverage between samples, confounding accurate determination of their expression. We use a combination of experimental and computational approaches to show rRNA depletion is responsible for the most significant variability in coverage, and several sequence determinants also strongly influence representation. Conclusions: These results show the utility of IVT-seq for promoting better understanding of bias introduced by RNA-seq. We find rRNA depletion is responsible for substantial, unappreciated biases in coverage introduced during library preparation. These biases suggest exon-level expression analysis may be inadvisable, and we recommend caution when interpreting RNA-seq results.
dc.description.fulltext: YES
dc.description.indexedby: WoS
dc.description.indexedby: Scopus
dc.description.issue: 6
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: N/A
dc.description.sponsorship: National Institutes of Health (NIH)
dc.description.sponsorship: DARPA
dc.description.sponsorship: National Center for Research Resources
dc.description.sponsorship: National Center for Advancing Translational Sciences, NIH
dc.description.sponsorship: Penn Genome Frontiers Institute under an HRFF grant
dc.description.sponsorship: Pennsylvania Department of Health
dc.description.sponsorship: Institute for Translational Medicine and Therapeutics of the Perelman School of Medicine at the University of Pennsylvania
dc.description.sponsorship: DRC grant
dc.description.version: Publisher version
dc.description.volume: 15
dc.format: pdf
dc.identifier.doi: 10.1186/gb-2014-15-6-r86
dc.identifier.embargo: NO
dc.identifier.filenameinventoryno: IR00189
dc.identifier.issn: 1474-7596
dc.identifier.link: https://doi.org/10.1186/gb-2014-15-6-r86
dc.identifier.quartile: Q1
dc.identifier.scopus: 2-s2.0-84911861819
dc.identifier.uri: https://hdl.handle.net/20.500.14288/1614
dc.identifier.wos: 341269300011
dc.keywords: Genetics
dc.keywords: Heredity
dc.keywords: Complementary DNA
dc.keywords: Polyadenylic acid
dc.keywords: Ribosome RNA
dc.keywords: RNA
dc.language: English
dc.publisher: BioMed Central
dc.relation.grantno: 2-R01-NS054794-06
dc.relation.grantno: 5-R01-HL097800-04
dc.relation.grantno: 12-DARPA-1068
dc.relation.grantno: UL1TR000003
dc.relation.grantno: P30DK19525
dc.relation.uri: http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/1218
dc.source: Genome Biology
dc.subject: Biotechnology
dc.subject: Applied microbiology
dc.title: IVT-seq reveals extreme bias in RNA-sequencing
dc.type: Journal Article
dspace.entity.type: Publication
local.contributor.authorid: 0000-0001-6624-3505
local.contributor.kuauthor: Kavaklı, İbrahim Halil
relation.isOrgUnitOfPublication: c747a256-6e0c-4969-b1bf-3b9f2f674289
relation.isOrgUnitOfPublication.latestForDiscovery: c747a256-6e0c-4969-b1bf-3b9f2f674289

Files

Original bundle

Name: 1218.pdf
Size: 610.76 KB
Format: Adobe Portable Document Format