W.M. Keck Facility
 Yale University
 300 George Street

Yale University School of Medicine


Interpretation of the Four Color Plot
of the Fluorescent Electrophoretic Data

On average, with good template and primer, Taq/dye-terminator cycle sequencing provides 500-600 bases of sequence at 98-99% accuracy (exceptional template-primer combinations yield 650-750 bases at 98% accuracy). After 600-650 bases, the resolution between peaks decreases and the software has difficulty determining the exact number of bases in runs of the same base. Consequently, the error rate usually increases dramatically and may be as much as 10% at 550-650 bases. Furthermore, because the software uses uniform spacing to call bases, it is slightly biased towards inserting extra bases. One should therefore be somewhat conservative in data interpretation, particularly when designing primers for primer walking. For the best chance of synthesizing a primer with the correct sequence, and to provide sufficient overlap between the two sequencing steps, design the primer in the region between bases 450 and 550 (see the guidelines on the back of the sample sheet).
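As a minimal sketch of the guideline above, the following Python helpers pull the 450-550 region out of a base-called read and estimate how many miscalls a given accuracy implies. The function names `primer_window` and `expected_miscalls` are hypothetical and are not part of the facility's software; positions are 1-based, as on the electropherogram.

```python
def primer_window(read, start=450, end=550):
    """Return the bases from 1-based position `start` through `end`, inclusive.

    The defaults follow the 450-550 region recommended in the text.
    Returns an empty string if the read does not reach `start`.
    """
    if start < 1 or end < start:
        raise ValueError("need 1 <= start <= end")
    # Convert the 1-based inclusive range to a Python slice.
    return read[start - 1:end]


def expected_miscalls(n_bases, accuracy):
    """Rough number of wrong base calls in `n_bases` at the given accuracy."""
    return n_bases * (1.0 - accuracy)


if __name__ == "__main__":
    read = "ACGT" * 150                 # hypothetical 600-base read
    window = primer_window(read)
    print(len(window))                  # 101 bases: positions 450 through 550
    print(round(expected_miscalls(600, 0.98)))  # about 12 miscalls at 98%
```

Even at 98% accuracy a 600-base read can carry roughly a dozen miscalls, which is why the candidate primer region should still be checked against the electropherogram before synthesis.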

Unlike AmpliTaq polymerase, Taq FS polymerase shows greatly reduced discrimination among the four fluorescent dideoxynucleotide terminators during incorporation, leading to relatively uniform peak sizes. Despite this increased uniformity, Taq FS data do exhibit some recognizable patterns that are useful for sequence interpretation:

  • G's following A's give weak peaks, sometimes so weak that the G peak drops out.

Editing the Sequence File

Sequence data are provided in a computer-readable format via FTP server. The sequence is usually in GCG format, ready for use by the University of Wisconsin Genetics Computer Group (UWGCG) programs, which are available on the VAX computer in the Yale Biomedical Computing Unit. Before analyzing the raw sequence data or aligning it with previously determined sequence, use a sequence editing program, with the electropherogram as a guide, to truncate the sequence by removing:

  • unreliable data at the beginning of the sequence (usually the first 10-20 bases) which is due to the analysis software starting base calling before a uniform stream of fluorescent peaks is present in the electrophoretic data.
  • any relevant vector sequences.
  • unreliable data at the 3' end of the sequence (beginning in the region of 550-650 bases for ds plasmid DNA and large PCR fragments) which is due to the decreasing resolution of large DNA fragments (broadening and overlap of fluorescent peaks).
  • for PCR products, data past the physical end of the PCR fragment.
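The trimming steps listed above can be sketched in Python as follows. This is an illustration only, not the facility's editing program: the function `trim_read` and its parameters are hypothetical, the vector is matched by exact string search (real vector screening tolerates mismatches), and the cutoffs should always be confirmed against the electropherogram.

```python
def trim_read(read, lead=20, cutoff=550, vector_5prime="", fragment_end=None):
    """Trim a base-called read following the four removal steps in the text.

    read          -- raw base calls as a string
    lead          -- number of unreliable 5' bases to drop (text: usually 10-20)
    cutoff        -- 1-based position after which 3' data is discarded
                     (text: unreliable data begins around 550-650)
    vector_5prime -- assumption: vector sequence directly upstream of the
                     insert, matched exactly
    fragment_end  -- assumption: 1-based position of the last base of a
                     PCR product, if known
    """
    read = read.upper()
    keep_from = lead
    keep_to = min(len(read), cutoff)
    if fragment_end is not None:
        # Nothing past the physical end of a PCR fragment is real sequence.
        keep_to = min(keep_to, fragment_end)
    if vector_5prime:
        hit = read.find(vector_5prime.upper())
        if hit != -1:
            # Drop everything up to and including the vector match.
            keep_from = max(keep_from, hit + len(vector_5prime))
    # An empty string results if the trims leave nothing usable.
    return read[keep_from:keep_to]


if __name__ == "__main__":
    # Hypothetical read: 15 junk calls, a 6-base vector stretch, then insert.
    raw = "N" * 15 + "GGATCC" + "ACGT" * 200
    trimmed = trim_read(raw, lead=20, cutoff=600, vector_5prime="GGATCC")
    print(len(trimmed), trimmed[:8])
```

Taking the later of the two 5' limits and the earlier of the two 3' limits keeps the steps independent, so any of them can be omitted without changing the others.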

If you need assistance in interpreting your sequence data, please call us at 737-2566 or email dnasequencing@yale.edu.

 


Copyright © 2002, Yale University, New Haven, Connecticut, USA. All rights reserved.

Last modified: 30-Aug-2005 (EH)