Event Title
The De Novo Assembly and Analysis of the Soapberry Bug Transcriptome
Location
Davis 117
Start Date
30-4-2015 2:45 PM
End Date
30-4-2015 3:55 PM
Project Type
Presentation
Description
Technology is developing rapidly across all fields, and genetics is no exception, particularly as next-generation sequencing technologies get faster and more accurate. Data-processing methods have been developed to keep pace with these improvements in data collection. To identify genes that may play a role in the development of the alternate wing lengths exhibited by soapberry bugs, we sequenced the soapberry bug transcriptome. The transcriptome, the collection of all expressed genes in an organism, is a powerful reference for research on that organism's genetics, including comparative studies. We used data from an Illumina 1.5 next-generation sequencer to assemble a de novo transcriptome for the soapberry bug. The raw data consisted of 142.6 million single-end reads of 100 base pairs each, drawn from all expressed genes, which leaves the computational question of how to fit these overlapping sequences into individual transcripts (messenger RNAs). I will briefly explain the basic quality-check tools for the raw data, then explain how Trinity, the assembler I used, works. Ultimately, I produced a set of about 7398 contigs assembled by Trinity. Using Biopython, I conducted a BLAST search for each contig. This annotation step compares contigs to sequences in GenBank, the national repository of sequence data. Once annotated, these contigs can be used to make comparisons with the genome contents of other species or to select gene sequences for further studies in the soapberry bug.
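As a toy illustration of the assembly question raised in the description (fitting overlapping reads back into a transcript), the sketch below counts k-mers in a handful of short reads and greedily extends the most abundant k-mer in both directions. It is loosely in the spirit of greedy k-mer extension, not a reproduction of Trinity or of the presenter's pipeline. The function names, the example sequence, and the choice of k = 5 are all assumptions made for illustration, and the sketch ignores sequencing errors, reverse complements, and alternative isoforms.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def best_extension(overlap, counts, used, side):
    """Return the most abundant unused k-mer sharing `overlap` (k-1 bases)."""
    best = None
    for base in "ACGT":
        kmer = overlap + base if side == "right" else base + overlap
        if kmer in used or counts.get(kmer, 0) == 0:
            continue
        if best is None or counts[kmer] > counts[best]:
            best = kmer
    return best


def greedy_contig(reads, k):
    """Build one contig by greedy extension from the most abundant k-mer.

    Hypothetical helper for illustration only; real assemblers handle
    errors, branching, and multiple contigs.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counts = count_kmers(reads, k)
    if not counts:
        return ""
    # Seed with the most abundant k-mer (ties broken alphabetically).
    seed = max(counts, key=lambda kmer: (counts[kmer], kmer))
    used = {seed}
    contig = seed
    # Extend to the right one base at a time.
    while True:
        nxt = best_extension(contig[-(k - 1):], counts, used, "right")
        if nxt is None:
            break
        used.add(nxt)
        contig += nxt[-1]
    # Then extend to the left.
    while True:
        nxt = best_extension(contig[:k - 1], counts, used, "left")
        if nxt is None:
            break
        used.add(nxt)
        contig = nxt[0] + contig
    return contig


# Made-up 15-base "transcript" cut into overlapping 8-base reads.
transcript = "ATGGCGTACGTTAGC"
reads = [transcript[i:i + 8] for i in range(0, 8)]
print(greedy_contig(reads, 5))
```

On real data the k-mer graph branches wherever sequences repeat or reads contain errors, which is why a dedicated assembler is used instead of a simple greedy walk like this one.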
Faculty Sponsor
Bruce Maxwell
Sponsoring Department
Colby College. Computer Science Dept.
CLAS Field of Study
Natural Sciences
Event Website
http://www.colby.edu/clas
ID
1305
https://digitalcommons.colby.edu/clas/2015/program/237