More Information on Identifying Sites of Phosphorylation in Proteins
Overview
In most instances our general approach to identifying
sites of protein phosphorylation begins with an in gel
trypsin digest of the Coomassie Blue stained,
[32P]-phosphorylated protein. During
microbore HPLC, the entire gradient is collected (with peak
detection) in "capless" Eppendorf tubes which are then
subjected to Cerenkov counting to quickly locate those
fractions which contain phosphorylated peptides. Since
Cerenkov counting does not require the addition of
scintillation fluid, each entire fraction may be counted
without loss of any sample. Cerenkov radiation results from
energetic β-particles (electrons) passing through a
transparent medium of high refractive index (e.g.,
water). The resulting bluish-white light ("Cerenkov"
light) may be detected using the [3H]
channel on most scintillation counters. Assuming that the
stoichiometry of phosphorylation is high, the radioactively
labeled peptide may be identified by matrix assisted laser
desorption ionization mass spectrometry (MALDI-MS). In
"linear" MALDI-MS, the phosphorylated peptide will have a
mass that is 80 Da greater than that of the corresponding predicted
tryptic peptide. In "reflectron" MALDI-MS the phosphorylated
peptide will be unique in that it often will show a
characteristic fragmentation product resulting from loss of
phosphate during post-source decay. Edman sequencing can
then be used to confirm the identification of the
phosphorylated peptide. To identify the site of
phosphorylation we generally sequence the peptide both
before and after coupling aliquots of it to a solid
(Sequelon) support so that the sample can be subjected to
both "normal" and radiochemical sequencing. While Edman
degradation provides a powerful means to confirm the
identification of the peptide(s) present in the
[32P]-phosphate labeled fraction, it is
important to keep in mind that the phosphorylated
phenylthiohydantoin (PTH) derivative produced is too
hydrophilic to be extracted from the sequencing support with
the non-polar solvents that must be used to prevent loss of
the peptide. Hence, all of the
[32P]-phosphate will remain on the
instrument. It is thus not possible under these conditions
to collect the individual PTH derivatives and determine at
which cycle/residue the [32P]-phosphate
is released. However, if the peptide is first covalently
coupled to a solid support, sufficiently polar solvents may
be used to extract the PTH (or ATZ) derivatives at each
cycle so that the released cpm can be detected by
scintillation counting. In the future we may be able to
introduce a stream splitter so that only a single (solid
phase) Edman sequencing run is required; a fraction of each
PTH derivative would then be subjected to reverse phase HPLC,
with the remainder diverted to a fraction collector for
scintillation counting. Currently, we generally carry out
two separate Edman sequencing runs.
In those instances where the stoichiometry of
phosphorylation is low, either the analysis can be carried
out on phosphorylated protein that has been separated
from its non-phosphorylated counterpart via 2D gel
electrophoresis, or the site of phosphorylation can be
inferred indirectly. Since the phosphorylated peptide
generally elutes from RP-HPLC slightly in front of its
non-phosphorylated counterpart, the identification of the
former usually can be inferred via analysis of the latter.
In this instance the identity of the phosphorylated peptide
and the site of modification can be confirmed via a number
of approaches including radiochemical Edman sequencing of an
aliquot of the [32P]-phosphate labeled
fraction which has been coupled to Sequelon membrane (which
thus identifies the Edman cycle at which cpm are released),
in vitro mutagenesis of the putative site of
phosphorylation, co-chromatography of the
[32P]-phosphorylated peptide with a
synthetic version of the putative phosphorylated peptide, or
by carrying out a digest with another enzyme such as
chymotrypsin and verifying that the
[32P]-labeled tryptic and chymotryptic
peptides (both of which were identified indirectly via
analysis of their non-phosphorylated counterparts) indeed
overlap and include a possible site of phosphorylation.
We recommend submitting 10-50 pmol protein containing
>100,000 cpm/expected site and, if reasonably possible,
estimating the stoichiometry of phosphorylation before
sample submission. In the case of in vivo labelled
samples, the stoichiometry of phosphorylation may be
estimated by two dimensional gel electrophoresis which
(usually) resolves the phosphorylated from the
non-phosphorylated protein. If the resulting autoradiogram
indicates that the radio-labelled protein is not
associated with the majority (or often, any) of the
Coomassie Blue staining, it is best to try to substantially
increase the level of phosphorylation prior to proceeding.
If you do not have extensive experience carrying out two
dimensional gel electrophoresis for this purpose, we
recommend you send your sample to a commercial laboratory.
Upon request we gladly will provide the name of a laboratory
whose Coomassie Blue stained, 2D gel samples have been
demonstrated to be amenable to the in gel digestion
procedure in use in our unit.
Conventional HPLC/peptide isolation approach with
[32P]-labeled protein.
Ideally, the protein should be stoichiometrically
phosphorylated at each site that is to be identified and it
should contain between 1 x 10^5 and 1 x 10^6 dpm at each
site. Assuming this is the case,
the method of choice for final purification is 1D or 2D gel
electrophoresis (see above), in which case we recommend that
10-50 pmol protein be isolated in one or a few (if multiple
gels are required) gel bands or spots stained with Coomassie
Blue as described. The 10-50 pmol
amount is higher than the 5-25 pmol minimum recommended for
protein identification via preparative HPLC followed by
Edman sequencing of one or more of the resulting peptide
peaks. The reason more protein is recommended for
identification of sites of phosphorylation is to try to
ensure that a reasonable fraction of the expected tryptic
peptides is in fact isolated. That is, as the amount of
protein digested is increased, so too does the number of
tryptic peptides that are isolated and thus, the probability
that the phosphorylated peptide(s) will be isolated.
Generally, the sample will be digested with trypsin and
subjected to reverse phase HPLC with collection being via
peak detection. Under the conditions used for reverse phase
HPLC (0.05% TFA, pH 2.2), a phosphorylated peptide generally
elutes slightly earlier than the corresponding
non-phosphorylated peptide and may or may not be separated
from it. The resulting >100 HPLC fractions will be
contained within individually capped, 1.5 ml Eppendorf tubes
which may then be subjected to Cerenkov counting. To
approximately quantify the overall recovery of
[32P]-labeled peptides, we will subject
both the initial gel band and the digested/extracted gel
band to Cerenkov counting. Typically, overall recoveries
above 25% would be considered to be good. Reasons for
significantly lower recoveries might be failure of the
protein to digest in the gel or failure of the
phosphorylated peptide(s) to elute from the reverse phase
HPLC column.
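
The recovery estimate described above is simple arithmetic on Cerenkov counts. The sketch below is illustrative only: the function names, the background subtraction, and the choice to report two separate percentages (label extracted from the gel, and label found in the HPLC fractions) are assumptions, not the laboratory's documented procedure. It also assumes all counts were taken close enough in time that decay of [32P] can be ignored.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("initial counts must be positive")
    return 100.0 * part / whole


def recovery_summary(initial_band_cpm, extracted_band_cpm,
                     fraction_cpm, background_cpm=0.0):
    """Summarize recovery of label from Cerenkov counts (hypothetical helper).

    initial_band_cpm   -- counts in the gel band before digestion
    extracted_band_cpm -- counts left in the gel band after extraction
    fraction_cpm       -- counts in each collected HPLC fraction
    """
    initial = initial_band_cpm - background_cpm
    residual = max(extracted_band_cpm - background_cpm, 0.0)
    in_fractions = sum(max(c - background_cpm, 0.0) for c in fraction_cpm)
    return {
        "extracted_from_gel_pct": percent(initial - residual, initial),
        "recovered_in_fractions_pct": percent(in_fractions, initial),
    }


if __name__ == "__main__":
    summary = recovery_summary(200_000, 60_000, [1_000, 45_000, 30_000])
    print(summary)
```

A large gap between the two percentages would point toward loss on the HPLC column, while a low extraction percentage would point toward poor digestion in the gel.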
Mass spectrometric identification of
[32P]-phosphorylated peptides
Once HPLC fractions containing the phosphorylated
peptide(s) are located by Cerenkov counting, a small aliquot
of each will be analyzed by MALDI-MS. Under the normal
conditions that are used (linear MALDI-MS) peptides present
in the fraction will be observed as the singly charged (M+H)+
ion. Since the phosphate will add 80 amu, the
phosphorylated peptide sometimes may be identified
tentatively at this point by comparing the observed mass to
a listing of the masses of the expected tryptic peptides.
This tentative identification may be confirmed by repeating
the MALDI-MS in the reflectron mode. Under these conditions,
peptides that contain phosphoserine or phosphothreonine
often will undergo detectable loss of phosphate resulting in
a characteristic loss of about 98 amu [MH-H3PO4]. Similarly,
peptides containing phosphotyrosine (as well as
phosphoserine and phosphothreonine) will show a
characteristic loss of about 80 amu [MH-HPO3] (see
the article by R. Annan in the December, 1995 ABRF
Newsletter for a review). By comparing the linear to the
reflectron MALDI-MS spectra it is sometimes possible
to determine tentatively which peptide mass corresponds to
that for the phosphorylated peptide. This identification is then
confirmed, and the site of phosphorylation located, by
Edman sequencing (see below). An alternative
approach is to compare the MALDI-MS spectra before and after
treatment with phosphatase and then look for the
characteristic -80 amu loss due to removal of phosphate. In
principle, MALDI-MS (coupled with either linear/reflectron
analysis or with/without phosphatase treatment) might
succeed in tentatively identifying peptides that are
phosphorylated at a high level of stoichiometry in the
unfractionated tryptic digest.
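
To make the mass comparison concrete, the following sketch digests a sequence in silico and looks for observed (M+H)+ masses that match a predicted tryptic peptide carrying one or more phosphates. It is a minimal illustration under stated assumptions: monoisotopic masses are used (linear MALDI-MS at low resolution may be closer to average masses), cysteine is treated as unmodified, cleavage is assumed complete after Lys/Arg except before Pro, and the 1 Da tolerance, the function names and the example sequence are all hypothetical.

```python
# Standard monoisotopic residue masses (Da); not taken from this page.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
PHOSPHO = 79.96633  # HPO3 added per phosphorylated Ser/Thr/Tyr


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_mass(peptide):
    """Monoisotopic mass of the singly protonated, unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


def match_phosphopeptides(observed_masses, sequence, tol_da=1.0):
    """Return (observed mass, peptide, number of phosphates) for each match."""
    hits = []
    for peptide in tryptic_peptides(sequence):
        max_sites = sum(peptide.count(r) for r in "STY")
        for n in range(1, max_sites + 1):
            predicted = mh_mass(peptide) + n * PHOSPHO
            for observed in observed_masses:
                if abs(observed - predicted) <= tol_da:
                    hits.append((observed, peptide, n))
    return hits


if __name__ == "__main__":
    protein = "AKSPRPTYKGG"  # hypothetical sequence for illustration
    observed = [mh_mass("SPRPTYK") + PHOSPHO, 500.0]
    for mass, peptide, n in match_phosphopeptides(observed, protein):
        print(f"{mass:.2f} matches {peptide} + {n} phosphate(s)")
```

A match of this kind is only tentative; as the text notes, it still needs confirmation by reflectron analysis, phosphatase treatment, or Edman sequencing.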
Edman degradation of
[32P]-phosphorylated peptides
While Edman degradation provides a powerful means to
confirm the identification of the peptide(s) present in the
[32P]-phosphate labeled fraction, the
phosphorylated phenylthiohydantoin (PTH) derivative produced
is too hydrophilic to be extracted from the sequencing
support under the conditions that are normally used. Hence,
all of the [32P]-phosphate will be lost
as it will remain on the instrument. As described above,
this challenge is circumvented by sequencing the peptide
both before and after coupling aliquots of the
peptide to a solid (Sequelon) support.
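
Once the peptide sequence is known from the "normal" run and the counts released at each cycle are known from the radiochemical run, assigning the site amounts to lining the two up. The sketch below is an assumption-laden illustration: it simply takes the cycle with the highest background-corrected counts and checks that the residue at that position is Ser, Thr or Tyr. It ignores carryover (lag) into later cycles and peptides with more than one labeled site, and the function name and example data are hypothetical.

```python
def assign_phosphosite(peptide, cycle_cpm, background_cpm=0.0):
    """Map the Edman cycle releasing the most counts onto the peptide sequence.

    peptide   -- sequence from the conventional Edman run, N-terminus first
    cycle_cpm -- counts collected at each cycle of the solid phase run
    """
    if not cycle_cpm:
        raise ValueError("no cycle counts supplied")
    if len(cycle_cpm) > len(peptide):
        raise ValueError("more cycles than residues in the peptide")
    net = [max(c - background_cpm, 0.0) for c in cycle_cpm]
    peak_index = max(range(len(net)), key=net.__getitem__)
    residue = peptide[peak_index]
    return {
        "cycle": peak_index + 1,        # Edman cycles are numbered from 1
        "residue": residue,
        "net_cpm": net[peak_index],
        "plausible": residue in "STY",  # only these can carry the phosphate here
    }


if __name__ == "__main__":
    # Hypothetical peptide and counts for illustration only.
    result = assign_phosphosite("GASPVTYEK",
                                [40, 55, 60, 50, 65, 5200, 900, 300, 110],
                                background_cpm=40)
    print(result)
```

If the peak falls on a residue other than Ser, Thr or Tyr, the more likely explanation is that the sequenced peptide is not the labeled one, which is the misassignment risk discussed for low-stoichiometry samples.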
Identification of
[32P]-phosphorylated peptides when the
stoichiometry of phosphorylation is low
Unfortunately, many (perhaps most)
[32P]-phosphorylated proteins submitted
to the Keck Laboratory that are labeled in vivo
fall into this category either due to the comparatively
large, endogenous phosphate pool size or to
dephosphorylation during protein purification. In those
instances where the [32P]-labeled protein
has not been subjected to 2D gel electrophoresis and
autoradiography/Coomassie Blue staining, the first
indication that the stoichiometry of phosphorylation is very
low often comes from comparing the Cerenkov counting versus
absorbance profile from the HPLC run. If there is no
correlation, the [32P]-labeled peptides
almost surely represent a chemically insignificant
fraction of the sample. In this regard there are at least
two possible reasons for the lack of correlation between the
Cerenkov counting and absorbance profiles. The first is that
phosphorylated peptides generally elute slightly earlier (by
perhaps a minute or so) than the corresponding
non-phosphorylated peptide. The second explanation is that
phosphorylation near a site of tryptic cleavage may
significantly decrease the extent of tryptic cleavage at
that site so that the phosphorylated peptide may well be
present as an overlapping, partially cleaved tryptic peptide that
does not even have a counterpart in the
non-phosphorylated protein digest. The danger in this instance,
particularly for large proteins (>50 kD), is that this
incompletely cleaved, [32P]-labeled peptide (which is
present in a chemically insignificant amount) will quite likely
co-elute with one or more completely unrelated peptides
that are not phosphorylated. Hence, if only
conventional Edman degradation is carried out on the
[32P]-labeled fractions, it is quite
possible to incorrectly assign the phosphorylated
peptide. Assuming the stoichiometry of phosphorylation is
reasonably high, MALDI-MS of the same fraction guards against this
misassignment. If the stoichiometry of phosphorylation is low
and the Cerenkov counts appear to elute just in front of an
absorbance peak, a tentative identification of the
phosphorylated peptide may be made by identifying the
slightly later eluting, non-phosphorylated peptide. In this
instance it is particularly important that the
identification be confirmed by solid phase Edman degradation
(in which case the [32P]-phosphate
should, of course, be released at a cycle corresponding to a
serine, threonine or tyrosine) and/or by carrying out a
different kind of digest (e.g., with chymotrypsin) and again showing
that the [32P]-phosphate elutes just
prior to a peptide that spans the same putative site of
phosphorylation.
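
The indirect approach, in which the radioactive fraction is paired with an absorbance peak that elutes slightly later, can be sketched as a simple lookup. This is an illustration only: the 2 minute window is an assumption loosely based on the "minute or so" offset mentioned above, and the function name, peak labels and retention times are hypothetical.

```python
def later_eluting_candidates(hot_time_min, absorbance_peaks, max_offset_min=2.0):
    """List absorbance peaks eluting at or shortly after the radioactive fraction.

    hot_time_min     -- retention time (min) of the peak in Cerenkov counts
    absorbance_peaks -- iterable of (retention time in min, label)
    Returns (offset, retention time, label) tuples, nearest peak first.
    """
    candidates = [
        (time - hot_time_min, time, label)
        for time, label in absorbance_peaks
        if 0.0 <= time - hot_time_min <= max_offset_min
    ]
    return sorted(candidates)


if __name__ == "__main__":
    peaks = [(38.0, "pk12"), (41.9, "pk15"), (42.8, "pk16"), (47.5, "pk19")]
    for offset, time, label in later_eluting_candidates(41.2, peaks):
        print(f"{label}: elutes {offset:.1f} min after the radioactive fraction")
```

Each candidate is only a starting point for sequencing; the assignment still has to be confirmed by solid phase Edman degradation or a second digest as described above.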
Mass spectrometric identification of phosphorylated
peptides that have not been labeled with
[32P]
Phosphorylated peptides can, in some instances, be
identified without [32P]-labeling. In
this case, the sample would be digested with trypsin as
described, followed by MALDI-MS analysis of ~5% of the digest. All
observed peptide masses would then be searched versus the
predicted tryptic fragments from the protein of interest
using the GPMAW program (Lighthouse Data). During these
searches, predicted peptides are searched both with and
without phosphorylation. The probability of success with
this approach might be increased by carrying out a
comparative analysis with a non-phosphorylated control. This
latter sample might be generated by expressing the
recombinant protein in an organism (e.g., E. coli)
or under a condition where it is not phosphorylated or
by subjecting an aliquot of the tryptic digest to
phosphatase treatment. Either of these approaches would
afford the opportunity to try to directly locate the
peptide ion(s) of interest by looking for differences
between the sample and its non-phosphorylated control.
Candidate peptide ions might then be subjected to nanospray
MS/MS analysis on an additional aliquot of the sample on our
Q-Tof mass spectrometer. An alternative approach would be to
try the procedure described by Goshe et al. (2001) to
specifically isolate phosphorylated tryptic peptides from the
digest. The
resulting peptides could then be subjected to MALDI-MS,
MS/MS and/or Edman sequencing.
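
The comparison against a non-phosphorylated control can be illustrated as a search for peaks that are present in the sample, absent from the control, and heavier than some control peak by a whole number of phosphate groups. The sketch below is a hedged example rather than a description of the GPMAW workflow: the tolerance, the limit of three sites, the function name and the peak lists are all assumptions.

```python
PHOSPHO = 79.96633  # monoisotopic mass of HPO3 (Da)


def phospho_shift_pairs(sample_masses, control_masses, max_sites=3, tol_da=0.5):
    """Find sample peaks that differ from a control peak by n phosphate groups.

    Returns (sample mass, control mass, n) for each sample peak that is missing
    from the control but sits n x 80 Da above one of the control peaks.
    """
    def present(mass, masses):
        return any(abs(mass - m) <= tol_da for m in masses)

    pairs = []
    for sample in sample_masses:
        if present(sample, control_masses):
            continue  # peak is unchanged in the control, so not of interest
        for n in range(1, max_sites + 1):
            for control in control_masses:
                if abs(sample - n * PHOSPHO - control) <= tol_da:
                    pairs.append((sample, control, n))
    return pairs


if __name__ == "__main__":
    # Hypothetical peak lists (M+H)+ for illustration only.
    sample = [1045.5, 1500.7, 1580.7, 2210.1]
    control = [1045.5, 1500.7, 2210.1]
    print(phospho_shift_pairs(sample, control))
```

Peaks flagged this way would be the natural candidates for the MS/MS analysis mentioned above.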
These approaches would be likely to succeed only
if the sample is phosphorylated to a relatively high
stoichiometry. In view of this requirement and since
[32P]-labeling provides an excellent
means to not only identify and track the peptide(s) of
interest but to also determine their relative recovery, we
strongly recommend using [32P]-labeling
as the method of choice.
Estimated service charges for identifying sites of
[32P]-phosphorylation
Although it is impossible to accurately estimate
beforehand the total cost of identifying one or more sites
of phosphorylation in a protein, it is possible to estimate
the minimum likely charge for a "typical" project where the
stoichiometry of phosphorylation is high and where the
sample has been submitted as a single, Coomassie Blue
stained gel band containing 10-50 pmol protein with 0.1-1.0
x 10^6 cpm/site of phosphorylation. Under these
conditions the following would be the minimum charge
for identifying the first site of phosphorylation assuming
it lies within a 15 residue tryptic peptide.
| Description of Service | Service Charge (Yale) | Service Charge (Non-Yale/non-profit) | Service Charge (Non-Yale/for profit) |
| In gel trypsin digestion | $212 | $244 | $287 |
| Preparative reverse phase HPLC of a radioactive sample | $368 | $405 | $497 |
| Cerenkov counting of all HPLC fractions | $110 | $127 | $149 |
| MALDI mass spectrometry (linear + reflectron) on one fraction | $78 | $90 | $105 |
| Edman sequencing of one 15 residue peptide (to confirm chemical identification of the peptide) | $682 | $788 | $917 |
| Radiochemical Edman sequencing (with scintillation counting of each cycle to locate radioactive cycle) | $682 | $788 | $917 |
| Minimum Total Service Charge Assuming One Site of Phosphorylation on a 15 Residue Tryptic Peptide | $2,132 | $2,442 | $2,872 |
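
The minimum totals listed above are the sums of the six individual service charges. The short sketch below reproduces that arithmetic; the dictionary layout and function name are illustrative, and the figures are copied from the table.

```python
# Charges in dollars: (Yale, Non-Yale/non-profit, Non-Yale/for profit)
CHARGES = {
    "In gel trypsin digestion": (212, 244, 287),
    "Preparative reverse phase HPLC of a radioactive sample": (368, 405, 497),
    "Cerenkov counting of all HPLC fractions": (110, 127, 149),
    "MALDI mass spectrometry (linear + reflectron) on one fraction": (78, 90, 105),
    "Edman sequencing of one 15 residue peptide": (682, 788, 917),
    "Radiochemical Edman sequencing": (682, 788, 917),
}
CATEGORIES = ("Yale", "Non-Yale/non-profit", "Non-Yale/for profit")


def minimum_totals(charges=CHARGES):
    """Sum each price column across all listed services."""
    sums = [sum(column) for column in zip(*charges.values())]
    return dict(zip(CATEGORIES, sums))


if __name__ == "__main__":
    for category, total in minimum_totals().items():
        print(f"{category}: ${total:,}")
```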