Bioinformatics 101

Authors:

David M. Francis, The Ohio State University; Heather L. Merk, The Ohio State University

These webinars and files presented at the 2010 Tomato Disease Workshop introduce methods for managing and querying sequence data on one’s computer using BLAST, Perl, and BioPerl tools. Overviews of BLAST and BioPerl are provided. In addition, a marker discovery project in tomato that uses Perl, BioPerl, and BLAST tools on a Linux platform is presented as an example.

A PDF of Dr. Francis’ presentation (Bioinformatics_101) can be found at the bottom of the page.

In this video, Dr. David Francis, The Ohio State University, introduces the concept of setting up a data management system on one’s own computer in order to ask specific questions without the limitations of distributed resources. Dr. Francis presents ideas for different pipelines that include BLAST and BioPerl. He also stresses the importance of formulating questions before creating a data pipeline.

If you experience problems viewing this video, connect directly to our YouTube channel or see the YouTube troubleshooting guide.

In this video, Dr. Francis introduces the Basic Local Alignment Search Tool (BLAST) and provides instructions for downloading and installing BLAST from NCBI.

In this video, Dr. Francis introduces Perl and BioPerl, and he teaches users how to download and install BioPerl.

In this video, Dr. Francis introduces an example of using Perl, BioPerl, and BLAST tools on a Linux platform for a marker discovery project. The objective is to start from the known chromosome position of a disease resistance locus in tomato, together with loosely linked markers, and find additional markers that are more closely linked to the resistance locus so that they can be used for marker-assisted selection.

Finding markers more closely linked to the disease resistance locus involves downloading the tomato genome sequence and querying it with the flanking marker sequences to pull out a sequence scaffold that contains the flanking markers. That scaffold sequence is then queried against other sequences, either EST sequences downloaded from NCBI or next-generation transcriptome sequences developed by SolCAP, to identify polymorphisms in the region of interest.

  • The Linux cheat sheet referred to in this video can be found as a PDF attachment at the bottom of this page.
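The step of locating a scaffold that contains both flanking markers can be illustrated with a short sketch. It is written in Python rather than Perl and is not one of the workshop scripts. It assumes the marker sequences were searched against the genome with BLAST and the results were saved in the standard 12-column tab-delimited format (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score). The marker names, scaffold names, coordinates, and the e-value cutoff are all hypothetical.

```python
from collections import defaultdict


def scaffolds_with_both_markers(tabular_text, marker_a, marker_b, max_evalue=1e-10):
    """Return {scaffold: (start, end)} for scaffolds hit by both markers.

    The interval spans from the lowest to the highest subject coordinate
    covered by the two marker alignments on that scaffold.
    """
    hits = defaultdict(dict)  # scaffold -> {marker: (low, high)}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        sstart, send = int(fields[8]), int(fields[9])
        evalue = float(fields[10])
        if evalue > max_evalue or query not in (marker_a, marker_b):
            continue
        # Subject coordinates are reversed for minus-strand hits, so sort them.
        # Only the first qualifying hit per marker per scaffold is kept.
        hits[subject].setdefault(query, (min(sstart, send), max(sstart, send)))

    regions = {}
    for scaffold, by_marker in hits.items():
        if marker_a in by_marker and marker_b in by_marker:
            coords = by_marker[marker_a] + by_marker[marker_b]
            regions[scaffold] = (min(coords), max(coords))
    return regions


# Hypothetical hits: two strong hits on scaffold_7, one weak hit on scaffold_2.
example_rows = [
    ["markerA", "scaffold_7", "98.50", "200", "3", "0", "1", "200", "15000", "15199", "1e-95", "350"],
    ["markerB", "scaffold_7", "99.00", "180", "2", "0", "1", "180", "48200", "48021", "2e-88", "320"],
    ["markerA", "scaffold_2", "85.00", "60", "9", "0", "10", "69", "500", "559", "0.002", "40"],
]
example_hits = "\n".join("\t".join(row) for row in example_rows)

regions = scaffolds_with_both_markers(example_hits, "markerA", "markerB")
print(regions)
```

In this made-up example only scaffold_7 is reported, because the scaffold_2 hit fails the e-value cutoff and markerB never hits that scaffold. The reported interval is the region one would then extract and compare against EST or transcriptome sequences.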

This video is a continuation of the previously introduced example. First, Dr. Francis walks through the Linux commands required to format a searchable database for BLAST, run a standalone BLAST search, and view a BLAST output file. Second, he describes how to use Perl and BioPerl to parse a BLAST output file, explains the structure of the output file, and provides guidelines for interpreting the results in terms of their utility for marker development. Third, he describes how to use Perl and BioPerl to parse BLAST output and retrieve useful sequences from GenBank.

  • The Perl scripts referred to in this video are available in a zip folder at the bottom of the page.
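The workshop scripts are written in Perl and BioPerl. As a language-neutral illustration of what parsing a BLAST output file involves, the following is a minimal Python sketch and not a translation of those scripts. It assumes the search was saved in the standard 12-column tab-delimited format (BLAST+ `-outfmt 6`, or `-m 8` in the older `blastall` program) rather than the default pairwise report. The `Hit` record, the function names, the example rows, and the e-value cutoff are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Selected columns from one line of tabular BLAST output."""
    query: str
    subject: str
    pident: float
    length: int
    mismatches: int
    gap_opens: int
    evalue: float
    bitscore: float


def parse_tabular(text):
    """Parse 12-column tab-delimited BLAST output into a list of Hit records."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        hits.append(Hit(f[0], f[1], float(f[2]), int(f[3]),
                        int(f[4]), int(f[5]), float(f[10]), float(f[11])))
    return hits


def best_hits(hits, max_evalue=1e-5):
    """Keep, for each query, the hit with the highest bit score under the cutoff."""
    best = {}
    for h in hits:
        if h.evalue > max_evalue:
            continue
        if h.query not in best or h.bitscore > best[h.query].bitscore:
            best[h.query] = h
    return best


# Hypothetical output lines for two EST queries against genome scaffolds.
example_rows = [
    ["EST_001", "scaffold_7", "97.5", "400", "10", "0", "1", "400", "20000", "20399", "1e-150", "600.0"],
    ["EST_001", "scaffold_3", "88.0", "120", "14", "1", "5", "124", "900", "1019", "1e-20", "95.0"],
    ["EST_002", "scaffold_7", "100.0", "50", "0", "0", "1", "50", "30000", "30049", "0.5", "30.0"],
]
example_output = "\n".join("\t".join(row) for row in example_rows)

hits = parse_tabular(example_output)
best = best_hits(hits)
for query, h in best.items():
    print(query, h.subject, h.pident, h.mismatches, h.evalue)
```

Keeping the mismatch and gap counts alongside the score makes it easy to pick out strong alignments that are not perfectly identical, which are the ones worth inspecting by hand for possible polymorphisms.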

Find all the presentations from the 2010 Tomato Disease Workshop

Additional Resources

The following are relatively non-technical resources for learning Perl:

  • Medinets, D. 1996. Perl 5 by example. Que, Indianapolis, IN.
  • Siever, E., and S. Spainhour. 2002. Perl in a nutshell. O’Reilly Media, Sebastopol, CA.

Funding Statement

Development of this lesson was supported in part by the National Institute of Food and Agriculture (NIFA) Solanaceae Coordinated Agricultural Project, agreement 2009-85606-05673, administered by Michigan State University. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the United States Department of Agriculture.

Attachments:

Bioinformatics_101.pdf (2.11 MB)

Bioinformatics_txtfiles.zip (7.69 KB)

LinuxCheatSheet.pdf (50.58 KB)

PBGworks 1001