More Information on
Protein Analysis Procedures
Amino Acid Analysis
Amino acid analysis is carried out on a Beckman Model
7300 ion-exchange instrument following a 16 hr hydrolysis at
115 degrees C in 100 µl of 6 N HCl, 0.2% phenol that
also contains 2 nmol norleucine. The latter serves as an
internal standard to correct for losses that may occur
during sample transfers, drying, etc. After hydrolysis, the
HCl is dried in a Speedvac and the resulting amino acids
dissolved in 100 µl Beckman sample buffer that contains
2 nmol homoserine with the latter acting as a second
internal standard to independently monitor transfer of the
sample onto the analyzer. The instrument is calibrated with
a 2 nmol mixture of amino acids and it is operated via the
manufacturer's programs and with the use of their buffers.
Data analysis is carried out on an external computer using
Perkin Elmer/Nelson data acquisition software.
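The norleucine correction described above can be illustrated with a short sketch. It assumes, which the text does not state explicitly, that the correction is a simple scaling by the fraction of the 2 nmol norleucine that is recovered. The function name and the example amounts are hypothetical.

```python
def correct_for_recovery(measured_nmol, norleucine_found_nmol,
                         norleucine_added_nmol=2.0):
    """Scale measured amino acid amounts by the norleucine recovery.

    Assumption: losses affect every amino acid to the same extent as
    the norleucine internal standard added before hydrolysis.
    """
    if norleucine_found_nmol <= 0:
        raise ValueError("norleucine was not detected; cannot correct")
    recovery = norleucine_found_nmol / norleucine_added_nmol
    return {aa: amount / recovery for aa, amount in measured_nmol.items()}


# Hypothetical analyzer output (nmol) for a run that recovered 1.8 of
# the 2 nmol norleucine, i.e. a 90% recovery.
measured = {"Ala": 0.90, "Gly": 1.35, "Leu": 0.45}
corrected = correct_for_recovery(measured, norleucine_found_nmol=1.8)
```

The homoserine standard could be handled the same way to check transfer onto the analyzer, since it is added only after hydrolysis and drying.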
During acid hydrolysis asparagine will be converted to
aspartic acid and glutamine to glutamic acid. During the
HPLC analysis that follows, cysteine co-elutes with proline;
and methionine sulfoxide, which is a common oxidation
product found in peptides/proteins, co-elutes with aspartic
acid. Hence, following normal acid hydrolysis, glutamine and
asparagine are not individually quantified and it is
possible that the methionine value will be low and
(generally to a lesser extent) that the aspartic acid and
proline values will be somewhat high. Improved quantitation
of cysteine and methionine can be obtained by requesting
prior oxidation with performic acid, which converts both
methionine and methionine sulfoxide to methionine sulfone
and cysteine and cystine to cysteic acid. Generally,
however, performic acid oxidation destroys tyrosine. Best
quantitation of tryptophan is generally obtained by
requesting hydrolysis with methanesulfonic acid (MSA)
instead of hydrochloric acid. The procedure used in this
instance is to carry out the hydrolysis with 20 µl MSA
for 16 hrs at 115 degrees C. After hydrolysis, the sample is
neutralized with approximately 200 µl 0.35 M NaOH and
100 µl (50% of the sample) is then analyzed on the
Beckman 7300. Please keep in mind that since we believe the
overall extent of hydrolysis with MSA is less than with HCl,
we do not recommend MSA hydrolysis for use in quantifying
the concentration of protein stock solutions.
Internal Protein Sequencing
General Information
Protein digests that cannot be identified by mass
spectrometric approaches (i.e., peptide mass searching or
database searching of MS/MS fragmentation data) are subjected to
preparative reverse phase HPLC, and individual peak-detected
fractions are then "screened" by MALDI-MS followed by
N-terminal (Edman degradation) sequencing of one or more
peptide peaks that appear to be nearly homogeneous based on
absorbance profile and MALDI-MS spectrum. The Keck
Laboratory has made a major and continuing effort to
implement and improve more sensitive procedures for
isolating and sequencing tryptic peptides from SDS
PAGE-separated proteins that generally are submitted as
Coomassie Blue stained gel bands. Some of these studies are
described in Keck Laboratory publications (17, 18, 20-22, 23,
26). Our overall success rate, as
measured by the fraction of proteins submitted for which
internal peptide sequences were obtained that either result
in identifying the protein (via the database searches that
are included with this service) or that are suitable for
generating cDNA probes or primers, is nearly 97%.
Nearly 75% of the proteins submitted for internal
sequencing are identified via database searching of the
first peptide sequence obtained by the Keck Laboratory. This percentage will
inevitably increase as the various genome projects are
completed. For this reason the Keck Laboratory strongly
encourages that aliquots of all enzymatic digests destined
for HPLC and peptide sequencing be subjected first to MS or
MS/MS protein identification to allow many known proteins to
be identified prior to embarking on the more time consuming
and expensive HPLC/peptide sequencing approach.
Quantitation of SDS PAGE Samples for Internal
Sequencing
The most critical determinant of success of internal
sequencing is that a sufficient amount of protein be
submitted. Currently, the minimum recommended amount is 5
pmol while the optimal amount is about 25 pmol. If the
sample will be submitted in the form of a Coomassie Blue
stained gel band, it should be in a single band as the
second most critical determinant of success is that the
protein be contained within the minimum possible volume of
polyacrylamide gel. Two approaches may be taken to quantify
the amount of protein prior to in gel enzymatic cleavage.
The first is simply that several concentrations of a mixture
of known proteins be run on the same gel as the sample and
then the amount of protein in the sample estimated by
comparison to these standards. Since proteins vary by at
least twofold in their relative Coomassie Blue staining
intensity it is important that more than one standard
protein be run and that an "average" Coomassie Blue staining
intensity for a given amount of standard protein be used to
estimate the amount of protein in the unknown sample. The
second approach that may be taken is to estimate the amount
of protein based on the average MALDI-MS response, relative
to internal standards, obtained on an aliquot of the
resulting in gel trypsin or lysyl endopeptidase digest. The
Keck Laboratory routinely estimates the amount of protein
digest remaining in samples submitted for MALDI-MS based
protein identification.
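The first approach (comparison with several standard proteins on the same gel) can be sketched as follows. This is only an illustration: it assumes band intensities from some densitometry step in arbitrary units, a linear relation between intensity and protein mass, and a molecular weight estimated from gel migration. All names and numbers are hypothetical.

```python
def mean_response(standards):
    """Average staining intensity per microgram over the standards.

    standards: list of (amount_ug, band_intensity) pairs, ideally taken
    from more than one standard protein because staining varies by
    protein.
    """
    responses = [intensity / amount_ug for amount_ug, intensity in standards]
    if not responses:
        raise ValueError("at least one standard is required")
    return sum(responses) / len(responses)


def estimate_sample_pmol(band_intensity, standards, mol_weight_da):
    """Estimate the pmol of protein in a band from the averaged standards."""
    amount_ug = band_intensity / mean_response(standards)
    # 1 ug of a protein of mass M (g/mol) corresponds to 1e6 / M pmol.
    return amount_ug * 1e6 / mol_weight_da


# Hypothetical standards: (ug loaded, measured intensity).
standards = [(1.0, 100.0), (1.0, 200.0), (2.0, 300.0)]
estimate = estimate_sample_pmol(150.0, standards, mol_weight_da=50_000)
```

The estimate can then be compared with the 5 pmol minimum and the roughly 25 pmol optimum given above.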
Preparation of SDS PAGE Samples for Internal
Sequencing
The procedure we recommend for obtaining internal amino
acid sequences from SDS PAGE-separated proteins is in
situ tryptic or lysyl endopeptidase digestion in the
gel matrix - followed by elution of the resulting peptides
and HPLC separation. Although we also carry out in
situ tryptic or lysyl endopeptidase digests on the
PVDF membrane (followed again by elution and HPLC
separation), we strongly recommend the in gel approach as it
avoids the large losses that are sometimes associated with
blotting onto PVDF membranes and it also is more compatible
with subsequent mass spectrometry. It is important that in
gel samples are stained and shipped according to these
instructions and that every effort
is made to maximize the ratio of protein to total gel
volume. In general, we recommend the gel band contain at
least 0.05 µg of the desired protein per cubic mm gel
volume. Although we recommend an absolute minimum of 5 pmol
protein, the quality of the resulting peptide sequence data
is improved by going to larger amounts, with the optimum
level of protein being about 25 pmol. If there are
technical problems that prevent you from reaching the
recommended protein/gel volume ratio, you should email a
brief description of the problem (e.g., how to
concentrate a ml of sample so it can be loaded into a single
lane on an SDS polyacrylamide gel) to the Protein
Chemistry Section who may be able to help devise an
alternative protocol. If your protein contains significantly
more than 10% (w/w) carbohydrate, we recommend the
carbohydrate be removed prior to SDS PAGE, otherwise it is
likely to hinder enzymatic cleavage. In addition to your
sample, you should also submit an approximately equal size
piece of gel from a region of the gel that does not contain
protein. The latter will be "digested" and subjected to
analytical HPLC at no charge and will serve as an important
control which will help to quickly identify artifact and
trypsin autolysis peaks in the final HPLC chromatogram. In
addition to this "negative" control, the Keck Laboratory
will also digest (again at no additional charge) a similar
amount of transferrin in parallel with your sample to serve
as a positive control.
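As a rough check of the recommended protein/gel volume ratio, the sketch below converts a pmol amount to micrograms and compares it with the volume of the excised band. The 0.05 µg per cubic mm threshold comes from the text; the function names, the example molecular weight and the band dimensions are hypothetical.

```python
MIN_UG_PER_MM3 = 0.05  # recommended minimum protein density in the gel band


def pmol_to_ug(amount_pmol, mol_weight_da):
    """Convert pmol of a protein of the given mass (Da) to micrograms."""
    return amount_pmol * mol_weight_da / 1e6


def max_gel_volume_mm3(amount_pmol, mol_weight_da):
    """Largest band volume that still meets the recommended density."""
    return pmol_to_ug(amount_pmol, mol_weight_da) / MIN_UG_PER_MM3


def band_density_ok(amount_pmol, mol_weight_da, band_volume_mm3):
    """True if the band holds at least 0.05 ug protein per cubic mm."""
    density = pmol_to_ug(amount_pmol, mol_weight_da) / band_volume_mm3
    return density >= MIN_UG_PER_MM3


# Hypothetical 50 kDa protein in a 5 x 2 x 1 mm band (10 cubic mm).
optimal_ok = band_density_ok(25, 50_000, band_volume_mm3=10.0)
minimal_ok = band_density_ok(5, 50_000, band_volume_mm3=10.0)
```

For this hypothetical protein, 25 pmol is 1.25 µg, so the band should be no larger than about 25 cubic mm, whereas 5 pmol (0.25 µg) in the same 10 cubic mm band falls below the recommended ratio.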
Estimated Turn-around Time and Cost of Internal Edman
Sequencing
Typically, approximately five weeks are required to carry
out an in gel tryptic digest, fractionate the resulting
peptides by preparative reverse phase HPLC, "screen" about
six of the peptide peaks that have the most symmetrical
absorbance profile by MALDI-MS and then subject the first
peptide to Edman sequencing.
The following table provides an estimate of the minimum
charges for an "average" internal Edman sequencing project.
To generate these estimates we have relied on the data in
Table I, which summarizes results
obtained from more than 200 in gel digests/internal Edman
sequencing projects, to estimate variables such as the
average number of peptides sequenced/protein and the length
of each sequence. As noted in this table, our overall
success rate at completing these projects is above 96%. One
factor that might result in the project exceeding the
charges for conventional internal sequencing would be an
unusually complex HPLC profile caused by digesting an
extremely large protein (i.e., >100 kD). In this
instance, additional MALDI-MS analyses and/or HPLC
repurification of individual peaks might well be required to
identify and isolate peptides suitable for sequencing.
Estimated Cost for an "Average" Internal Protein
Sequencing Project
| Description | Service Charge (Yale) | Service Charge (Non-Yale/Non-Profit) |
|---|---|---|
| In Gel Digest (1) | $250 | $287 |
| Preparative HPLC | $375 | $430 |
| MALDI-MS on 6 Peptides | $420 | $486 |
| Edman Sequencing of 2 Peptides - Assuming 25 (Total) Residues Identified | $1285 | $1470 |

(1) The in gel digest charge would
not apply to those samples that have already been subjected
to protein identification as these samples would have been
digested already as part of this latter service.
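The line items above can be totaled with a few lines of code. This is only a convenience sketch built from the listed charges; the dictionary layout and function name are hypothetical, and actual invoices may differ from these minimum estimates.

```python
# Charges in US dollars, copied from the table above.
CHARGES = {
    "In Gel Digest": {"Yale": 250, "Non-Yale/Non-Profit": 287},
    "Preparative HPLC": {"Yale": 375, "Non-Yale/Non-Profit": 430},
    "MALDI-MS on 6 Peptides": {"Yale": 420, "Non-Yale/Non-Profit": 486},
    "Edman Sequencing of 2 Peptides": {"Yale": 1285, "Non-Yale/Non-Profit": 1470},
}


def project_total(rate, already_digested=False):
    """Sum the charges for one rate column.

    already_digested=True drops the in gel digest charge, as in the
    footnote for samples previously submitted for protein identification.
    """
    return sum(
        prices[rate]
        for item, prices in CHARGES.items()
        if not (already_digested and item == "In Gel Digest")
    )


yale_total = project_total("Yale")
non_yale_total = project_total("Non-Yale/Non-Profit")
```

With the figures listed, the estimated minimum for an "average" project is $2330 at the Yale rate and $2673 at the Non-Yale/Non-Profit rate.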
In Gel Enzymatic Digestion in the Keck Facility
In gel enzymatic digestion is carried out as described
generally in Williams and Stone (1997) and Williams et al.
(1997). Basically, this procedure involves
diffusing in modified trypsin (from Promega) or lysyl
endopeptidase (Wako), digesting for 24 hrs at 37 degrees C
and then extracting the resulting peptides. (See digest
procedure for more information.)
Preparative Reverse Phase HPLC
Fractionation of Enzymatic Digests
All enzymatic digests that are destined for internal
Edman sequencing are fractionated on a Hewlett Packard 1090
HPLC system equipped with an Isco Model 2150 Peak Separator
and a 1 mm x 25 cm Vydac C-18 (5 micron particle size, 300
Å pore size) reverse phase column equilibrated with 98% buffer
A (0.06% TFA) and 2% buffer B (0.052% TFA, 80% acetonitrile)
as described in Williams and Stone (1997) and Williams et al.
(1997). Peptides are eluted at 50 µl/min
with the following gradient program: 0-60 min (2-37% B),
60-90 min (37-75% B) and 90-105 min (75-98% B) and are
detected by their absorbance at 210 nm. Fractions are
collected in capless Eppendorf tubes that are positioned on
the tops of 13 x 100 mm test tubes and that are capped
within approximately 1 hour of their collection (to prevent
evaporation of the acetonitrile). Under these conditions,
several fractions have been successfully sequenced even
after being stored at 4 degrees C for as long as two years.
After loading selected peptides onto our Applied Biosystems
sequencer, the original sample tube is rinsed with neat
trifluoroacetic acid (to recover peptides that may have
adsorbed onto the Eppendorf tube) which is then overlaid
onto the sequencing filter.
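The gradient program above can be written as a small lookup that returns the buffer composition at any time point. The sketch assumes each segment is linear between its stated endpoints and ignores the dead volume and gradient delay of the HPLC system; the function names are hypothetical.

```python
# (start_min, end_min, %B at start, %B at end), taken from the text.
GRADIENT = [
    (0.0, 60.0, 2.0, 37.0),
    (60.0, 90.0, 37.0, 75.0),
    (90.0, 105.0, 75.0, 98.0),
]
ACETONITRILE_IN_B = 0.80   # buffer B is 80% acetonitrile; buffer A has none
FLOW_UL_PER_MIN = 50.0


def percent_b(t_min):
    """Programmed %B at time t_min, assuming linear segments."""
    for start, end, b_start, b_end in GRADIENT:
        if start <= t_min <= end:
            fraction = (t_min - start) / (end - start)
            return b_start + fraction * (b_end - b_start)
    raise ValueError("time is outside the 0-105 min gradient program")


def percent_acetonitrile(t_min):
    """Approximate % acetonitrile in the mobile phase at time t_min."""
    return ACETONITRILE_IN_B * percent_b(t_min)


def delivered_volume_ul(t_min):
    """Total mobile phase volume pumped by time t_min."""
    return FLOW_UL_PER_MIN * t_min
```

For example, a peak at 30 min corresponds to a programmed 19.5% B, or about 15.6% acetonitrile, after 1500 µl of mobile phase has been delivered.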
N-Terminal Protein/Peptide
Sequencing
N-terminal protein/peptide sequencing is carried out on
two Applied Biosystems Procise 494 cLC instruments that are
equipped with on-line HPLC for the identification of the
resulting phenylthiohydantoin (Pth) amino acid derivatives.
Since greater than 80% of higher eukaryotic proteins have
been reported to have blocked amino-termini that preclude
direct amino acid sequencing, the Keck Facility does not
recommend this approach for intact eukaryotic proteins
unless sufficient protein is available to first try direct
N-terminal sequencing and then, if that fails, to submit an
absolute minimum of 5 pmol protein for internal sequencing
(preceded by mass spectrometric screening of the digest for
"known" proteins). It is important to note that if final
purification involves SDS-PAGE, two separate samples should
be prepared if both direct N-terminal sequencing of the
intact protein and "internal sequencing" of tryptic peptides
derived from that protein may be requested. That is,
SDS-PAGE samples destined for direct N-terminal sequencing
must be electroblotted onto PVDF-type membranes, while the
Keck Laboratory recommends that SDS-PAGE purified samples
destined for enzymatic cleavage and internal sequencing be
submitted in the form of Coomassie Blue stained gel
bands.
Before applying the sample, 2.5 pmol of a 17 residue
internal sequencing standard peptide which has the
formula:
[norleucine-(succinyl-lysine)4]3-norleucine-succinyl-lysine
is first spotted onto the sequencing filter. Since this
internal sequencing standard is composed of non-naturally
occurring amino acids, it does not interfere with sequencing
the unknown peptide/protein. On the contrary, this internal
standard provides the on-line monitoring of sequencer
function that is so critical to being able to keep these
instruments operating at the peak performance that is
necessary to be able to routinely sequence at the sub-pmol
level. In addition, when a blocked eukaryotic
protein/peptide is encountered, the presence of the sequence
of the internal standard assures that the instrument was
operating well and that the failure to obtain a sequence was
not the result of an instrument malfunction. The use of a
41-mer version of this standard is described in Elliott et al.
(1993). In general, the instrument is
operated based on the manufacturer's recommendations and 1.0
pmol Pth-standards are routinely used for calibration. In
addition, the S4 solvent that transfers the Pth-derivative
to the HPLC contains 1.2 pmol Pth-norvaline which acts as an
internal calibrant to independently monitor transfer to the
HPLC.
All sequences obtained in the Keck Facility are searched via
the BLAST Network Service operated by the National Center
for Biotechnology Information. This server accesses the
Brookhaven, Swiss, PIR and GenBank databases and is updated
daily and may be accessed via the Web.
The E Value that is in the last column on the right of the
BLAST Search Results pages provides a useful criterion to
judge the significance of the search. The Expect (E) value
is the number of matches one can "expect" to see simply by
chance in a database of the current size. The "E" value
decreases exponentially with the Score that is assigned to a
match and that is reported in the next to last column on the
search page. The lower the "E" value the more significant is
the match. Another indication of a significant homology
would (in the case of proteolytic digests) be the presence
of a preceding cleavage site. Additional information on
interpreting BLAST Search results may be found in Altschul
et al. (1990). Although sequences obtained by
the Keck Facility will be accompanied by a Pth Tabulation
Table summarizing the approximate yields of Pth-amino acids
detected at each cycle, the Keck Facility does not use this
data for sequence calling. Hence, unless you specifically
request that the data contained in these tables be verified
for accuracy, these tables may contain one or more errors.
Protein/peptide sequences are determined by overlaying
successive Pth chromatograms on a light box and we strongly recommend
that users go through this exercise to better understand the data.
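The exponential dependence of the BLAST E value on the Score, discussed above, can be illustrated numerically. The sketch uses the standard relation for normalized (bit) scores, E = m x n x 2^(-S'), where m is the query length and n is the total database length. It assumes the reported Score is a bit score; for raw scores the corresponding form is E = K x m x n x e^(-lambda x S), with K and lambda depending on the scoring system. The lengths and scores below are hypothetical.

```python
def expect_value(query_len, db_len, bit_score):
    """Expected number of chance matches scoring at least bit_score.

    Assumption: bit_score is a normalized (bit) score, so that
    E = m * n * 2 ** (-bit_score).
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical 10-residue peptide searched against 1,000,000 residues.
e_at_30 = expect_value(10, 1_000_000, 30.0)
e_at_31 = expect_value(10, 1_000_000, 31.0)
```

Each additional bit of score halves the E value, and the E value for a given score grows in proportion to the database size, which is why the same peptide match becomes less significant as the databases grow.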