How to Align Sequences

Authors:

Candice Hansey, Michigan State University; Heather L. Merk, The Ohio State University

Dr. Candice Hansey, Michigan State University, provides an overview of NCBI, including how to obtain sequences and use BLAST. In addition, Dr. Hansey provides demonstrations using Exonerate, MUMmer, FASTX-Toolkit, and the Tuxedo suite (including Bowtie and TopHat).

This sequence alignment webinar, presented by Dr. Candice Hansey in October 2011, provides an overview of current tools used for sequence alignment. The webinar includes demonstrations of low-throughput alignment using the Basic Local Alignment Search Tool (BLAST) and also provides sample code and graphical output for whole genome alignment and next-generation sequence alignment. It also points to books and websites for learning more about sequence alignment.

This webinar, which runs just under an hour, has been divided into seven videos, which are listed in order on this webpage. The eighth video is the full webinar. The PowerPoint slides (in PDF format) are provided at the bottom of the page.

Learning Objectives

At the end of this webinar, you should be able to do the following:

  • Obtain sequence from the National Center for Biotechnology Information (NCBI) website
  • Format sequence for BLAST alignment
  • Perform low-throughput BLAST searches
  • Describe the steps required to perform whole genome sequence alignments using MUMmer
  • Describe next-generation sequence alignment resources, including the FASTX-Toolkit and the Tuxedo suite
  • Locate resources to learn more about sequence alignment

Videos

  • Part 1 – Overview of the National Center for Biotechnology Information (NCBI) website
  • Part 2 – Demonstrations using BLAST
  • Part 3 – Using Exonerate for pairwise sequence alignment of ESTs and cDNA
  • Part 4 – Performing whole genome alignments using MUMmer and short sequence alignments using Vmatch
  • Part 5 – Next-generation sequencing platforms, the FASTQ sequence file format, and sequence quality control using the FASTX-Toolkit
  • Part 6 – High-throughput sequence alignment using the Tuxedo suite, including Bowtie and TopHat
  • Part 7 – Additional alignment programs and resources, as well as the question and answer session
  • Full Video

Part 1

Overview of the National Center for Biotechnology Information (NCBI) website.

 

If you experience problems viewing this video connect to our YouTube channel or see the YouTube troubleshooting guide.

Part 2

Demonstrations using BLAST.
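
BLAST accepts query sequences in FASTA format: a header line that begins with ">" followed by one or more lines of sequence. The sketch below is not taken from the webinar; it is a minimal illustration of formatting a raw sequence as a FASTA record before a search. The function name `to_fasta`, the example identifier, and the 60-character line width are assumptions chosen for illustration.

```python
def to_fasta(identifier: str, sequence: str, width: int = 60) -> str:
    """Return one FASTA record: a '>' header line plus wrapped sequence lines.

    `identifier` and `width` are illustrative choices, not BLAST requirements.
    """
    # Copied sequence often carries spaces and line breaks; remove them.
    cleaned = "".join(sequence.split()).upper()
    if not cleaned:
        raise ValueError("sequence is empty")
    # Break the sequence into fixed-width lines for readability.
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return ">" + identifier + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sequence pasted with stray whitespace.
    print(to_fasta("example_seq", "atgc atgc\natgc", width=8), end="")
```

The resulting text can be pasted into the BLAST query box or saved to a file for later use.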

 

Part 3

Using Exonerate for pairwise sequence alignment of ESTs and cDNA.

Part 4

Performing whole genome alignments using MUMmer and short sequence alignments using Vmatch.

 

Part 5

Next-generation sequencing platforms, the FASTQ sequence file format, and sequence quality control using the FASTX-Toolkit.
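
As a rough illustration of the FASTQ format (not code from the webinar), each record normally spans four lines: an "@" header, the sequence, a "+" separator, and a quality string with one character per base. The sketch below parses such records and computes a mean quality score. It assumes unwrapped four-line records and Phred+33 quality encoding; some older Illumina pipelines used an offset of 64, so `offset` is left as a parameter. The names `parse_fastq` and `mean_quality` are hypothetical, and this is not how the FASTX-Toolkit itself is implemented.

```python
from typing import Iterable, Iterator, Tuple


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (identifier, sequence, quality) from four-line FASTQ records."""
    # Drop line endings and skip blank lines.
    stripped = [line.rstrip("\r\n") for line in lines if line.strip()]
    if len(stripped) % 4 != 0:
        raise ValueError("expected a multiple of four non-empty lines")
    for i in range(0, len(stripped), 4):
        header, seq, plus, qual = stripped[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed record starting at line %d" % (i + 1))
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:], seq, qual


def mean_quality(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (assumes Phred+33 by default)."""
    return sum(ord(ch) - offset for ch in qual) / len(qual)


if __name__ == "__main__":
    # Hypothetical single-read example.
    example = ["@read1\n", "ACGT\n", "+\n", "II5!\n"]
    for name, seq, qual in parse_fastq(example):
        print(name, seq, mean_quality(qual))
```

Under the Phred+33 assumption, the character "I" encodes a score of 40 and "!" encodes 0, so a per-read mean like this could serve as a simple filter before alignment.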

Part 6

High-throughput sequence alignment using the Tuxedo suite, including Bowtie and TopHat.

Part 7

Additional alignment programs and resources, as well as the question and answer session.

Full Video

Full 54-minute webinar recording.

About the Presenter

Dr. Candice Hansey received her Bachelor of Science and Ph.D. degrees from the University of Wisconsin-Madison in the Genetics program and the Plant Breeding and Plant Genetics program, respectively. Dr. Hansey is currently a postdoctoral researcher at Michigan State University, where she is combining her background in plant breeding with bioinformatics to understand the genetic diversity in maize and potato and how that diversity can be used to improve commercial production.

Register for or watch other plant breeding and genomics “How To” webinars.

References Cited

  • Trapnell, C., and S. L. Salzberg. 2009. How to map billions of short reads onto genomes. Nature Biotechnology 27: 455-457. (Available online at: dx.doi.org/10.1038/nbt0509-455) (verified 30 Sept 2011).

External Links

  • Bowtie [Online]. Johns Hopkins Bloomberg School of Public Health. Available at: bowtie-bio.sourceforge.net/index.shtml (verified 30 Sept 2011).
  • FASTX-Toolkit [Online]. Hannon Laboratory, Cold Spring Harbor Laboratory. Available at: hannonlab.cshl.edu/fastx_toolkit/ (verified 30 Sept 2011).
  • Integrated Genome Browser [Online]. UNC Charlotte. Available at: bioviz.org/igb/ (verified 30 Sept 2011).
  • Slater, G.S.C. Exonerate [Online]. European Bioinformatics Institute, European Molecular Biology Laboratory. Available at: www.ebi.ac.uk/%7Eguy/exonerate/ (verified 30 Sept 2011).
  • National Center for Biotechnology Information [Online]. U.S. National Library of Medicine, National Institutes of Health. Available at: http://www.ncbi.nlm.nih.gov/ (verified 4 Oct 2011).
  • MUMmer [Online]. SourceForge. Available at: mummer.sourceforge.net/ (verified 30 Sept 2011).
  • The Vmatch large scale sequence analysis software [Online]. LScSA-Software GmbH. Available at: www.vmatch.de/ (verified 30 Sept 2011).
  • TopHat [Online]. Johns Hopkins Bloomberg School of Public Health. Available at: tophat.cbcb.umd.edu/ (verified 30 Sept 2011).
  • VirtualBox [Online]. Oracle. Available at: www.virtualbox.org/ (verified 30 Sept 2011).

Additional Resources

  • Korf, I., M. Yandell, and J. Bedell. 2003. BLAST. O’Reilly Media, Sebastopol, CA. 
  • Lee, J., S. Cozens, and P. Wainwright. 2004. Beginning Perl, second edition. Apress, Springer-Verlag, NY.
  • Newham, C. 2005. Learning the bash shell. O’Reilly Media, Sebastopol, CA.
  • Tisdall, J. 2001. Beginning Perl for bioinformatics. O’Reilly Media, Sebastopol, CA.

Funding Statement

Development of this page was supported in part by the National Institute of Food and Agriculture (NIFA) Solanaceae Coordinated Agricultural Project, agreement 2009-85606-05673, administered by Michigan State University. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the United States Department of Agriculture.

PBGworks 1207