W.M. Keck Facility
 Yale University
 300 George Street

Yale University School of Medicine

More Information on Identifying Sites of Phosphorylation in Proteins

Overview

In most instances, our general approach to identifying sites of protein phosphorylation begins with an in gel trypsin digest of the Coomassie Blue stained, [32P]-phosphorylated protein. During microbore HPLC, the entire gradient is collected (with peak detection) in "capless" Eppendorf tubes, which are then subjected to Cerenkov counting to quickly locate the fractions that contain phosphorylated peptides. Since Cerenkov counting does not require the addition of scintillation fluid, each entire fraction may be counted without loss of any sample. Cerenkov radiation results from energetic β-particles (electrons) passing through a transparent medium of high refractive index (e.g., water). The resulting bluish-white light ("Cerenkov" light) may be detected using the [3H] channel on most scintillation counters.

Assuming that the stoichiometry of phosphorylation is high, the radioactively labeled peptide may be identified by matrix assisted laser desorption ionization mass spectrometry (MALDI-MS). In "linear" MALDI-MS, the phosphorylated peptide will have a mass that is 80 Da greater than that of one of the predicted tryptic peptides. In "reflectron" MALDI-MS, the phosphorylated peptide will be unique in that it often will show a characteristic fragmentation product resulting from loss of phosphate during post-source decay. Edman sequencing can then be used to confirm the identification of the phosphorylated peptide. To identify the site of phosphorylation, we generally sequence the peptide both before and after coupling aliquots of it to a solid (Sequelon) support so that the sample can be subjected to both "normal" and radiochemical sequencing.

While Edman degradation provides a powerful means to confirm the identification of the peptide(s) present in the [32P]-phosphate labeled fraction, it is important to keep in mind that the phosphorylated phenylthiohydantoin (PTH) derivative produced is too hydrophilic to be extracted from the sequencing support with the non-polar solvents that must be used to prevent loss of the peptide. Hence, all of the [32P]-phosphate will remain on the instrument. It is thus not possible under these conditions to collect the individual PTH derivatives and determine at which cycle/residue the [32P]-phosphate is released. However, if the peptide is first covalently coupled to a solid support, sufficiently polar solvents may be used to extract the PTH (or ATZ) derivatives at each cycle so that the released cpm can be detected by scintillation counting. In the future we may be able to introduce a stream splitter so that only a single (solid phase) Edman sequencing run is required (e.g., a fraction of each PTH derivative might be subjected to reverse phase HPLC, with the remainder being diverted to a fraction collector for scintillation counting). Currently, however, we generally resort to carrying out two separate Edman sequencing runs.

In those instances where the stoichiometry of phosphorylation is low, either the analysis can be carried out on phosphorylated protein that has been separated from its non-phosphorylated counterpart via 2D gel electrophoresis, or the site of phosphorylation can be inferred indirectly. Since the phosphorylated peptide generally elutes from RP-HPLC slightly in front of its non-phosphorylated counterpart, the identity of the former usually can be inferred via analysis of the latter. In this instance, the identity of the phosphorylated peptide and the site of modification can be confirmed via a number of approaches. These include radiochemical Edman sequencing of an aliquot of the [32P]-phosphate labeled fraction that has been coupled to Sequelon membrane (which identifies the Edman cycle at which cpm are released); in vitro mutagenesis of the putative site of phosphorylation; co-chromatography of the [32P]-phosphorylated peptide with a synthetic version of the putative phosphorylated peptide; or carrying out a digest with another enzyme, such as chymotrypsin, and verifying that the [32P]-labeled tryptic and chymotryptic peptides (both of which were identified indirectly via analysis of their non-phosphorylated counterparts) indeed overlap and include a possible site of phosphorylation.

We recommend submitting 10-50 pmol protein containing >100,000 cpm/expected site and, if reasonably possible, that the stoichiometry of phosphorylation be estimated before sample submission. In the case of in vivo labeled samples, the stoichiometry of phosphorylation may be estimated by two dimensional gel electrophoresis, which (usually) resolves the phosphorylated from the non-phosphorylated protein. If the resulting autoradiogram indicates that the radiolabeled protein is not associated with the majority (or often, any) of the Coomassie Blue staining, it is best to try to substantially increase the level of phosphorylation prior to proceeding. If you do not have extensive experience carrying out two dimensional gel electrophoresis for this purpose, we recommend you send your sample to a commercial laboratory. Upon request we gladly will provide the name of a laboratory whose Coomassie Blue stained, 2D gel samples have been demonstrated to be amenable to the in gel digestion procedure in use in our unit.

 

Conventional HPLC/peptide isolation approach with [32P]-labeled protein

Ideally, the protein should be stoichiometrically phosphorylated at each site that is to be identified, and it should contain between 1 x 10^5 and 1 x 10^6 dpm at each site. Assuming this is the case, the method of choice for final purification is 1D or 2D gel electrophoresis (see above), in which case we recommend that 10-50 pmol protein be isolated in one or a few (if multiple gels are required) gel bands or spots stained with Coomassie Blue as described. The 10-50 pmol amount is higher than the 5-25 pmol minimum recommended for protein identification via preparative HPLC followed by Edman sequencing of one or more of the resulting peptide peaks. More protein is recommended for identification of sites of phosphorylation to try to ensure that a reasonable fraction of the expected tryptic peptides is in fact isolated. That is, as the amount of protein digested is increased, so too is the number of tryptic peptides that are isolated and thus the probability that the phosphorylated peptide(s) will be isolated. Generally, the sample will be digested with trypsin and subjected to reverse phase HPLC with collection being via peak detection. Under the conditions used for reverse phase HPLC (0.05% TFA, pH 2.2), a phosphorylated peptide generally elutes slightly earlier than the corresponding non-phosphorylated peptide and may or may not be separated from it. The resulting >100 HPLC fractions will be contained within individually capped, 1.5 ml Eppendorf tubes, which may then be subjected to Cerenkov counting. To approximately quantify the overall recovery of [32P]-labeled peptides, we will subject both the initial gel band and the digested/extracted gel band to Cerenkov counting. Typically, overall recoveries above 25% would be considered to be good. Reasons for significantly lower recoveries might be failure of the protein to digest in the gel or failure of the phosphorylated peptide(s) to elute from the reverse phase HPLC column.
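
The bookkeeping described above (locating the radioactive fractions and estimating how much label left the gel band) can be sketched in a few lines of Python. This is only an illustration: the function names, the three-fold-over-background cutoff, and the example counts are assumptions rather than part of the procedure used in our unit, and no correction is made for [32P] decay between counts.

```python
def hot_fractions(fraction_cpm, background_cpm, fold=3.0):
    """Return 1-based numbers of fractions counting above fold x background.

    The fold-over-background cutoff is an illustrative assumption.
    """
    return [n for n, cpm in enumerate(fraction_cpm, start=1)
            if cpm > fold * background_cpm]


def extraction_recovery(initial_band_cpm, extracted_band_cpm):
    """Fraction of the label that left the gel band during digestion/extraction."""
    if initial_band_cpm <= 0:
        raise ValueError("initial band counts must be positive")
    return (initial_band_cpm - extracted_band_cpm) / initial_band_cpm


# Hypothetical Cerenkov counts for five HPLC fractions and one gel band.
fractions = [48, 61, 5200, 950, 55]
print(hot_fractions(fractions, background_cpm=50))   # fractions 3 and 4
print(extraction_recovery(200_000, 120_000))         # 0.4, i.e. 40% extracted
```

Note that this estimates only the extraction step; losses on the reverse phase column would lower the overall recovery further.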

Mass spectrometric identification of [32P]-phosphorylated peptides

Once HPLC fractions containing the phosphorylated peptide(s) are located by Cerenkov counting, a small aliquot of each will be analyzed by MALDI-MS. Under the normal conditions that are used (linear MALDI-MS), peptides present in the fraction will be observed as the singly charged (M+H) ion. Since the phosphate will add +80 amu, the phosphorylated peptide sometimes may be identified tentatively at this point by comparing the observed mass to a listing of the masses of the expected tryptic peptides. This tentative identification may be confirmed by repeating the MALDI-MS in the reflectron mode. Under these conditions, peptides that contain phosphoserine or phosphothreonine often will undergo detectable loss of phosphate, resulting in a characteristic loss of about 98 amu [MH-H3PO4]. Similarly, peptides containing phosphotyrosine (as well as phosphoserine and phosphothreonine) will show a characteristic loss of about 80 amu [MH-HPO3] (see the article by R. Annan in the December 1995 ABRF Newsletter for a review). By comparing the linear to the reflectron MALDI-MS spectra, it is sometimes possible to determine tentatively which peptide mass corresponds to the phosphorylated peptide. This identification is then confirmed, and the site of phosphorylation located, by Edman sequencing (see below). An alternative approach is to compare the MALDI-MS spectra before and after treatment with phosphatase and then look for the characteristic -80 amu loss due to removal of phosphate. In principle, MALDI-MS (coupled with either linear/reflectron analysis or with/without phosphatase treatment) might succeed in tentatively identifying peptides that are phosphorylated at a high level of stoichiometry in the unfractionated tryptic digest.
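
As a rough illustration of the mass comparison described above, the following Python sketch digests a sequence in silico and checks observed (M+H) masses against each predicted tryptic peptide with and without added phosphate. The function names, the 0.5 Da tolerance, the limit of two phosphates per peptide, and the example sequence are all assumptions made for the example. Monoisotopic masses are used; average masses would be more appropriate when the isotopes are not resolved, and missed cleavages are ignored.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
HPO3 = 79.96633  # mass added per phosphate group (the "+80")


def tryptic_peptides(sequence):
    """Cleave after K or R, except before P; no missed cleavages."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def mh_plus(peptide, n_phospho=0):
    """Singly charged (M+H) mass carrying n_phospho phosphate groups."""
    return (sum(RESIDUE_MASS[a] for a in peptide) + WATER + PROTON
            + n_phospho * HPO3)


def match_masses(observed, sequence, tol=0.5, max_phospho=2):
    """List (observed mass, peptide, phosphate count) hits within tol Da."""
    hits = []
    for pep in tryptic_peptides(sequence):
        sites = sum(pep.count(r) for r in "STY")  # possible acceptor residues
        for n in range(min(sites, max_phospho) + 1):
            for m in observed:
                if abs(m - mh_plus(pep, n)) <= tol:
                    hits.append((m, pep, n))
    return hits


# Hypothetical sequence and a single observed mass.
print(match_masses([666.33], "AAKRPSVKLLTYGR"))
```

A hit with a non-zero phosphate count is only a tentative assignment; as noted above, it still needs to be confirmed in the reflectron mode and by Edman sequencing.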

Edman degradation of [32P]-phosphorylated peptides

While Edman degradation provides a powerful means to confirm the identification of the peptide(s) present in the [32P]-phosphate labeled fraction, the phosphorylated phenylthiohydantoin (PTH) derivative produced is too hydrophilic to be extracted from the sequencing support under the conditions that are normally used. Hence, all of the [32P]-phosphate will be lost, as it will remain on the instrument. As described above, this challenge is circumvented by sequencing the peptide both before and after coupling aliquots of the peptide to a solid (Sequelon) support.

Identification of [32P]-phosphorylated peptides when the stoichiometry of phosphorylation is low

Unfortunately, many (perhaps most) [32P]-phosphorylated proteins submitted to the Keck Laboratory that are labeled in vivo fall into this category, due either to the comparatively large endogenous phosphate pool size or to dephosphorylation during protein purification. In those instances where the [32P]-labeled protein has not been subjected to 2D gel electrophoresis and autoradiography/Coomassie Blue staining, the first indication that the stoichiometry of phosphorylation is very low often comes from comparing the Cerenkov counting profile with the absorbance profile from the HPLC run. If there is no correlation, the [32P]-labeled peptides almost surely represent a chemically insignificant fraction of the sample. In this regard there are at least two possible reasons for the lack of correlation between the Cerenkov counting and absorbance profiles. The first is that phosphorylated peptides generally elute slightly earlier (by perhaps a minute or so) than the corresponding non-phosphorylated peptide. The second is that phosphorylation near a site of tryptic cleavage may significantly decrease the extent of tryptic cleavage at that site, so that the phosphorylated peptide may well be present as an overlapping/partially cleaved tryptic peptide that does not even have a counterpart in the non-phosphorylated protein digest. The danger in this instance, particularly in the case of large proteins (>50 kDa), is that this incompletely cleaved, [32P]-labeled peptide (which is present in a chemically insignificant amount) is quite likely to co-elute/overlap with one or more completely unrelated peptides that are not phosphorylated. Hence, if only conventional Edman degradation is carried out on the [32P]-labeled fractions, it is quite possible to incorrectly assign the phosphorylated peptide. Assuming the stoichiometry of phosphorylation is reasonably high, MALDI-MS of the same fraction guards against this possibility.
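
One crude way to put a number on the comparison of the two profiles is a Pearson correlation across fractions, sketched below in Python. This is an assumption made for illustration, not the way the comparison is carried out in our unit (where the profiles are compared by inspection), and the example values are hypothetical. Because phosphorylated peptides tend to elute slightly early, a low coefficient by itself does not distinguish between the two explanations given above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a flat profile has no defined correlation")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-fraction Cerenkov counts and integrated absorbance.
cerenkov_cpm = [50, 60, 5200, 900, 55]
absorbance = [0.02, 0.03, 0.85, 0.20, 0.02]
print(round(pearson(cerenkov_cpm, absorbance), 3))
```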
If the stoichiometry of phosphorylation is low and the Cerenkov counts appear to elute just in front of an absorbance peak, a tentative identification of the phosphorylated peptide may be made by identifying the slightly later eluting, non-phosphorylated peptide. In this instance it is particularly important that the identification be confirmed by solid phase Edman degradation (in which case the [32P]-phosphate should, of course, be released at a cycle corresponding to a serine, threonine or tyrosine) and/or by carrying out a different kind of digest (e.g., with chymotrypsin) and again showing that the [32P]-phosphate elutes just prior to a peptide that spans the same putative site of phosphorylation.
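
The check that the label is released at a cycle corresponding to serine, threonine or tyrosine can be sketched as follows. The function name, the peptide, and the counts are hypothetical, and the sketch simply takes the cycle with the highest counts; real data show carryover into later cycles, which is not modeled here.

```python
PHOSPHO_ACCEPTORS = "STY"  # serine, threonine, tyrosine


def assign_site(peptide, cpm_per_cycle):
    """Return (cycle, residue, plausible) for the cycle releasing the most cpm.

    cycle is 1-based; plausible is True when the residue is S, T or Y.
    """
    if not cpm_per_cycle or len(cpm_per_cycle) > len(peptide):
        raise ValueError("need 1..len(peptide) cycles of counts")
    best = max(range(len(cpm_per_cycle)), key=cpm_per_cycle.__getitem__)
    residue = peptide[best]
    return best + 1, residue, residue in PHOSPHO_ACCEPTORS


# Hypothetical radiochemical sequencing run on the peptide RPSVK.
print(assign_site("RPSVK", [40, 55, 1200, 300, 90]))  # (3, 'S', True)
```

A result whose third element is False would suggest that the counts belong to a different, co-eluting peptide than the one assumed.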

Mass spectrometric identification of phosphorylated peptides that have not been labeled with [32P]

Phosphorylated peptides can, in some instances, be identified without [32P]-labeling. In this case, the sample would be digested with trypsin as described, followed by MALDI-MS analysis of ~5% of the digest. All observed peptide masses would then be searched against the predicted tryptic fragments from the protein of interest using the GPMAW program (Lighthouse Data). During these searches, predicted peptides are searched both with and without phosphorylation. The probability of success with this approach might be increased by carrying out a comparative analysis with a non-phosphorylated control. This control might be generated by expressing the recombinant protein in an organism (e.g., E. coli) or under a condition where it is not phosphorylated, or by subjecting an aliquot of the tryptic digest to phosphatase treatment. Either of these approaches would afford the opportunity to try to directly locate the peptide ion(s) of interest by looking for differences between the sample and its non-phosphorylated control. Candidate peptide ions might then be subjected to nanospray MS/MS analysis of an additional aliquot of the sample on our Q-Tof mass spectrometer. An alternative approach would be to try the procedure described by Goshe et al. (2001) to specifically isolate phosphorylated tryptic peptides from the digest. The resulting peptides could then be subjected to MALDI-MS, MS/MS and/or Edman sequencing.
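
The comparison between a sample and its phosphatase-treated control can be illustrated with a short Python sketch that pairs each peak in the untreated spectrum with any peak in the treated spectrum that is lighter by a whole number of phosphate groups. The function name, the 0.5 Da tolerance, the limit of three phosphates, and the peak lists are assumptions made for the example; chance pairings between unrelated peptides are possible, so any pair is only a candidate for MS/MS follow-up.

```python
HPO3 = 79.96633  # mass removed per phosphate group on dephosphorylation (Da)


def phosphatase_shift_pairs(untreated, treated, tol=0.5, max_phospho=3):
    """Pair untreated peaks with treated peaks lighter by n x HPO3.

    Returns a list of (untreated mass, treated mass, n) tuples.
    """
    pairs = []
    for m in untreated:
        for n in range(1, max_phospho + 1):
            expected = m - n * HPO3
            for t in treated:
                if abs(expected - t) <= tol:
                    pairs.append((m, t, n))
    return pairs


# Hypothetical peak lists before and after phosphatase treatment.
before = [586.37, 666.33, 722.42]
after = [586.37, 722.42]
print(phosphatase_shift_pairs(before, after))  # [(666.33, 586.37, 1)]
```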

These approaches would be likely to succeed only if the sample is phosphorylated to a relatively high stoichiometry. In view of this requirement and since [32P]-labeling provides an excellent means to not only identify and track the peptide(s) of interest but to also determine their relative recovery, we strongly recommend using [32P]-labeling as the method of choice.

Estimated service charges for identifying sites of [32P]-phosphorylation

Although it is impossible to accurately estimate beforehand the total cost of identifying one or more sites of phosphorylation in a protein, it is possible to estimate the minimum likely charge for a "typical" project where the stoichiometry of phosphorylation is high and where the sample has been submitted as a single, Coomassie Blue stained gel band containing 10-50 pmol protein with 0.1-1.0 x 10^6 cpm/site of phosphorylation. Under these conditions, the following would be the minimum charge for identifying the first site of phosphorylation, assuming it lies within a 15 residue tryptic peptide.

| Description of Service | Service Charge: Yale | Service Charge: Non-Yale/non-profit | Service Charge: Non-Yale/for profit |
|---|---|---|---|
| In gel trypsin digestion | $212 | $244 | $287 |
| Preparative reverse phase HPLC of a radioactive sample | $368 | $405 | $497 |
| Cerenkov counting of all HPLC fractions | $110 | $127 | $149 |
| MALDI mass spectrometry (linear + reflectron) on one fraction | $78 | $90 | $105 |
| Edman sequencing of one 15 residue peptide (to confirm chemical identification of the peptide) | $682 | $788 | $917 |
| Radiochemical Edman sequencing (with scintillation counting of each cycle to locate radioactive cycle) | $682 | $788 | $917 |
| Minimum Total Service Charge Assuming One Site of Phosphorylation on a 15 Residue Tryptic Peptide | $2,132 | $2,442 | $2,872 |
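
For anyone adapting the estimate (for example, to a project with more than one radioactive fraction), the arithmetic behind the minimum totals is simply the sum of the six line items at each rate. The short Python sketch below reproduces it; the variable and function names are made up for the example, and the dollar figures are those listed above.

```python
RATES = ("Yale", "Non-Yale/non-profit", "Non-Yale/for profit")

# Charges in dollars, one tuple per service, in the order given by RATES.
CHARGES = {
    "In gel trypsin digestion": (212, 244, 287),
    "Preparative reverse phase HPLC of a radioactive sample": (368, 405, 497),
    "Cerenkov counting of all HPLC fractions": (110, 127, 149),
    "MALDI mass spectrometry (linear + reflectron) on one fraction": (78, 90, 105),
    "Edman sequencing of one 15 residue peptide": (682, 788, 917),
    "Radiochemical Edman sequencing": (682, 788, 917),
}


def minimum_total(charges=CHARGES):
    """Sum every line item separately for each rate category."""
    return {rate: sum(row[i] for row in charges.values())
            for i, rate in enumerate(RATES)}


print(minimum_total())
# {'Yale': 2132, 'Non-Yale/non-profit': 2442, 'Non-Yale/for profit': 2872}
```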

 


Copyright © 2003, Yale University, New Haven, Connecticut, USA. All rights reserved.

Last modified: 23-Oct-2006 (GB)