David M. Francis, The Ohio State University; Heather L. Merk, The Ohio State University
A pdf of Dr. Francis’ presentation can be found at the bottom of the page (Bioinformatics_101).
In this video, Dr. David Francis, The Ohio State University, introduces the concept of setting up a data management system on one’s own computer so that specific questions can be asked without the limitations of distributed resources. Dr. Francis presents ideas for different pipelines one could set up that include BLAST and BioPerl. In addition, he stresses the importance of formulating questions before creating a data pipeline.
In this video, Dr. Francis introduces Perl and BioPerl, and he teaches users how to download and install BioPerl.
In this video, Dr. Francis introduces an example of using Perl, BioPerl, and BLAST tools on a Linux platform for a marker discovery project. The objective is to use prior knowledge of the chromosome position of a disease resistance locus in tomato, together with loosely linked markers, to find additional markers that are more closely linked to the resistance locus and can therefore be used for marker-assisted selection.
Finding markers more closely linked to the disease resistance locus involves downloading the tomato genome sequence and querying it with the flanking marker sequences to pull out a sequence scaffold that contains the flanking markers. The extracted sequence is then queried against other sequences, either EST sequences downloaded from NCBI or next-generation transcriptome sequences developed by SolCAP, to identify polymorphisms in the region of interest.
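The scaffold-selection step described above can be sketched in a few lines. The example below is written in Python rather than Perl and is not one of the scripts from the video; it is a minimal sketch that assumes the BLAST hits have already been reduced to (marker, scaffold, start, end) records. The marker names, scaffold names, and coordinates are invented for illustration.

```python
from collections import defaultdict

# Hypothetical BLAST hits: (marker, scaffold, start, end) on the scaffold.
# All names and coordinates are invented for illustration.
hits = [
    ("markerA", "scaffold_12", 15200, 15750),
    ("markerB", "scaffold_12", 98400, 97900),  # minus-strand hit: start > end
    ("markerA", "scaffold_40", 300, 820),
    ("markerB", "scaffold_07", 5100, 5600),
]


def scaffolds_with_both(hits, left, right):
    """Return {scaffold: (region_start, region_end)} for scaffolds hit by both markers."""
    spans = defaultdict(dict)
    for marker, scaffold, start, end in hits:
        lo, hi = min(start, end), max(start, end)
        prev = spans[scaffold].get(marker)
        # If a marker hits the same scaffold more than once, keep the widest span.
        if prev:
            lo, hi = min(lo, prev[0]), max(hi, prev[1])
        spans[scaffold][marker] = (lo, hi)

    regions = {}
    for scaffold, by_marker in spans.items():
        if left in by_marker and right in by_marker:
            coords = by_marker[left] + by_marker[right]
            # The region of interest runs from the outermost edge of one
            # flanking marker to the outermost edge of the other.
            regions[scaffold] = (min(coords), max(coords))
    return regions


regions = scaffolds_with_both(hits, "markerA", "markerB")
print(regions)
```

Only a scaffold hit by both flanking markers is reported, along with the interval the markers delimit. That interval is the sequence one would extract and use as the query in the next search.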
- The Linux cheat sheet referred to in this video can be found as a pdf attachment at the bottom of this page.
This video continues the example introduced in the previous video. First, Dr. Francis walks through the Linux commands required to format a searchable database for BLAST, run a standalone BLAST search, and view the BLAST output file. Second, he describes how to use Perl and BioPerl to parse a BLAST output file, explains the structure of the output file, and provides guidelines for judging how useful the results are for marker development. Third, he describes how to use Perl and BioPerl to parse BLAST output and retrieve useful sequences from GenBank.
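To give a feel for what parsing a BLAST output file involves, here is a minimal sketch in Python; it is a stand-in for, not a copy of, the Perl and BioPerl scripts used in the video. It assumes the search was saved in BLAST's 12-column tabular format (BLAST+ `-outfmt 6`), which may differ from the output format shown in the video. The sample lines, the sequence names, and the identity and length thresholds are all assumptions chosen for illustration.

```python
import csv
import io

# Column names of BLAST's standard 12-column tabular output (BLAST+ -outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

# Invented example lines; a real run would open the BLAST output file instead.
SAMPLE = (
    "region1\tEST_0001\t98.50\t400\t6\t0\t1200\t1599\t1\t400\t1e-180\t700\n"
    "region1\tEST_0002\t100.00\t350\t0\t0\t5000\t5349\t10\t359\t1e-170\t647\n"
    "region1\tEST_0003\t85.00\t120\t18\t0\t9000\t9119\t1\t120\t2e-20\t95.0\n"
)


def parse_hits(handle):
    """Yield one dict per hit, with numeric columns converted from text."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        for key in ("pident", "evalue", "bitscore"):
            hit[key] = float(hit[key])
        for key in ("length", "mismatch", "gapopen",
                    "qstart", "qend", "sstart", "send"):
            hit[key] = int(hit[key])
        yield hit


def candidate_polymorphic_hits(hits, min_identity=95.0, min_length=200):
    """Keep long, near-identical hits that still contain mismatches or gaps.

    The thresholds are illustrative assumptions, not values from the lesson.
    """
    return [h for h in hits
            if h["pident"] >= min_identity
            and h["length"] >= min_length
            and (h["mismatch"] > 0 or h["gapopen"] > 0)]


hits = list(parse_hits(io.StringIO(SAMPLE)))
candidates = candidate_polymorphic_hits(hits)
for h in candidates:
    print(h["sseqid"], h["pident"], h["mismatch"])
```

The filter reflects one plausible reading of "useful for marker development": a long, nearly identical hit probably comes from the same locus, and its few mismatches are candidate polymorphisms. A perfect match offers nothing to score, and a short or divergent hit may come from a different locus.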
- The Perl scripts referred to in this video are available in a zip folder at the bottom of the page.
- BioPerl [Online]. Available at: http://www.bioperl.org/wiki/Main_Page (verified 10 Dec 2010).
- National Center for Biotechnology Information [Online]. U.S. National Library of Medicine, National Institutes of Health. Available at: http://www.ncbi.nlm.nih.gov/ (verified 4 Jan 2011).
The following are relatively non-technical resources for learning Perl:
- Medinets, D. 1996. Perl 5 by example. Que, Indianapolis, IN.
- Siever, E., and S. Spainhour. 2002. Perl in a nutshell. O’Reilly Media, Sebastopol, CA.
Development of this lesson was supported in part by the National Institute of Food and Agriculture (NIFA) Solanaceae Coordinated Agricultural Project, agreement 2009-85606-05673, administered by Michigan State University. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the United States Department of Agriculture.
Bioinformatics_101.pdf (2.11 MB)
Bioinformatics_txtfiles.zip (7.69 KB)
LinuxCheatSheet.pdf (50.58 KB)