The whey acidic protein family: a new signature motif and three-dimensional structure by comparative modeling1

https://doi.org/10.1016/S1093-3263(99)00023-6

Abstract

Whey acidic proteins (WAP) from the mouse, rat, rabbit, camel, and pig comprise two “four-disulfide core” domains. From a detailed analysis of all sequences containing this domain, we propose a new PROSITE motif ([KRHGVLN]-X-{PF}-X-[CF]-[PQSVLI]-X(9,19)-C-{P}-X-[DN]-X-{N}-[CE]-X(5)-C-C) to accurately identify new four-disulfide core proteins. A consensus model for the WAP proteins is proposed, based on the human mucous proteinase inhibitor crystal structure. This article presents a detailed atomic model for the two-domain porcine WAP sequence by comparative modeling. Surface electrostatic potential calculations indicate that the second domain of the pig WAP model is similar to the functional human mucous proteinase inhibitor domains, whereas the first domain may be nonfunctional.

Introduction

The whey acidic protein (WAP) is the major whey protein in the milk of the mouse (mWAP),1 rat (ratWAP),2 rabbit (rabWAP),3 and camel (cWAP),4 and was recently identified as a significant component of porcine milk (pWAP).5 The WAP proteins contain two four-disulfide core (4-DSC) domains, each comprising approximately 50 amino acids and including eight cysteine residues in a conserved arrangement.1 The two WAP domains show limited sequence identity both within and between species.5 The 4-DSC domains are not exclusive to the WAP proteins: numerous other proteins contain one or two of these domains. These are typically small, secretory proteins that exhibit a variety of growth- and differentiation-regulatory functions and have been shown to affect extracellular matrix remodeling and carcinoma.6 The proteins that contain one or two 4-DSC domains are biologically diverse, and many have been identified as proteinase inhibitors. They are grouped into families based on their functionality and tissue-specific origins and include the antileukoproteinase (ALKI) family,7 epididymal8 and ovulatory9 specific proteins, and elastase inhibitor (elafin) proteins.10 On the basis of this limited sequence similarity with known proteinase inhibitors, it has been postulated that WAP may be a proteinase inhibitor.11 The antibiotic properties of equine neutrophil antibiotic peptide (eNAP-2 fragment)12 and the growth-inhibitory nature of rat prostate stromal protein (ps20)13 suggest that the 4-DSC domain is a preferential conformation for the stable folding and action of a class of protein inhibitors of varied function. Currently, no biological activity has been ascribed to the WAP proteins.

In this study, extensive sequence analysis on the 4-DSC family of proteins has been performed with a view to identifying the specific sequence motifs that uniquely constitute the 4-DSC domains and the effect of any missing conserved cysteine residues on the tertiary fold adopted by each domain. On the basis of this analysis, we report a new 4-DSC signature for PROSITE14 analysis that detects all known domains and eliminates false positives detected by the current PROSITE motif.

The 4-DSC domains are characterized by the conservation of sequence motifs and the interaction of specific charged residues involved in stabilizing the elafin fold, based on the experimental structural information available for the two-domain human mucous proteinase inhibitor hSLPI (SWISS-PROT entry ALK1_HUMAN; also known as MPI15) and for the one-domain R-elafin (hElaf; SWISS-PROT: ELAF_HUMAN).16, 17 Despite the limited sequence identity, the conserved factors are strongly indicative that the extended planar spiral of the elafin structure, pinned together by four disulfide bridges, is the preferred conformation of the 4-DSC domain.15, 16, 17

In this study, a consensus three-dimensional structural model for the two-domain WAP proteins has been developed, and a detailed atomic structure for the mature porcine WAP protein (pWAP) has been modeled on the crystal structure of the two-domain hSLPI.15 The ability of the WAP sequences to adopt this structural model has been examined, even in the absence of one or two of the conserved cysteine residues in some WAP proteins. Domain II of the WAP proteins appears more conserved than domain I,5 and the surface electrostatic potential of pWAP domain II is similar to that of the corresponding hSLPI domain, suggesting an as yet undetermined inhibitory activity. Domain I of pWAP is more substituted and less rigid and may carry posttranslational modifications that render it nonfunctional.

Sequence retrieval and analysis

The 4-DSC domain sequences were retrieved from protein sequence databases at the Australian National Genomic Information Service (using WebANGIS18). PROSITE14 analysis and pattern scans were carried out at the ExPASy web site (http://www.expasy.ch) against the SWISS-PROT (Release 37 of the database and updates up to 01-April-1999) and TrEMBL databases. Putative serine- and threonine-linked O-glycosylation sites were located using the NetOGlyc19 server.
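As an illustration of what a PROSITE-style pattern scan involves, the following minimal Python sketch translates the signature proposed in the abstract into a regular expression and searches a sequence with it. It is not the ExPASy tool used in this work: the function names (prosite_to_regex, scan_sequence) are hypothetical, only the pattern elements that occur in the proposed signature are handled, and the test sequence is synthetic, built by hand to satisfy the pattern rather than taken from any real protein.

```python
import re

# Signature proposed in the abstract, copied verbatim.
WAP_4DSC_PATTERN = (
    "[KRHGVLN]-X-{PF}-X-[CF]-[PQSVLI]-X(9,19)-C-{P}-X-[DN]-X-{N}-[CE]-X(5)-C-C"
)

# One pattern element: [ABC] (allowed), {ABC} (forbidden) or a single letter,
# optionally followed by a repeat count (n) or range (n,m).
_ELEMENT = re.compile(
    r"^(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?$"
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into Python regex syntax.

    Hypothetical helper: handles only the elements seen in the
    signature above (no '<' or '>' anchors).
    """
    parts = []
    for element in pattern.strip().split("-"):
        m = _ELEMENT.match(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = m.groups()
        if core in ("X", "x"):
            token = "."                      # any residue
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"  # any residue except these
        else:
            token = core                     # single residue or [..] set
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return "".join(parts)


def scan_sequence(sequence: str, pattern: str = WAP_4DSC_PATTERN):
    """Return (start, end, matched_text) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.end(), m.group()) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # Synthetic sequence constructed to satisfy the pattern; NOT a real WAP.
    core = "KAGACP" + "A" * 9 + "CAADAAC" + "A" * 5 + "CC"
    print(prosite_to_regex(WAP_4DSC_PATTERN))
    print(scan_sequence("MM" + core + "PP"))
```

Note that re.finditer reports non-overlapping matches only, so a dedicated scanner may list additional overlapping hits that this sketch does not.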

Sequence alignment

Alignment of all 4-DSC domains was

WAP signature

An alignment of eighty-four 4-DSC domain sequences, derived from WAP and numerous proteins with either confirmed or putative proteinase inhibitory activity, shows the characteristic conservation of the cysteine residues that constitute each domain (Figure 1/Color Plate 1).25

Within each domain, the eight conserved cysteine residues (numbered sequentially from C1 to C8) are arranged in the following pattern:

C1-(Xn)-C2-(Xn)-C3-(X5)-C4-(X5)-C5-C6-(X3, X5)-C7-(X3, X4)-C8

where X is any residue, Xn

Conclusions

A new four-disulfide core domain signature motif has been proposed to correctly identify new members of this “WAP-type”13 family. We have also suggested, on the basis of consensus sequence patterns, that this domain be defined from the KXGXCP motif (including the first conserved cysteine residue) to the CXXP motif (involving the last or eighth conserved cysteine residue), to avoid ambiguous domain extents defined in the literature. The two-domain WAP proteins can adopt the structure of hSLPI,

Acknowledgements

We thank Prof. Wolfram Bode (Max-Planck-Institut für Biochemie, Martinsried, Germany) for kindly providing us with the coordinates of hSLPI. This work was supported by the award of a SPIRT grant (C09804978) from the Australian Research Council to AGIC, with industry partners SGI Australia Pty. Ltd. and MSI Australia Pty. Ltd.

References (34)

  • S.M. Campbell et al.

    Comparison of the whey acidic protein genes of the rat and mouse

    Nucleic Acids Res

    (1984)
  • E. Devinoy et al.

    Recent data on the structure of rabbit milk protein genes and on the mechanism of the hormonal control of their expression

    Reprod. Nutrition Dev.

    (1988)
  • O.U. Beg et al.

    A camel milk whey protein rich in half-cysteine. Primary structure, assessment of variations, internal repeat patterns, and relationships with neurophysin and other active polypeptides

    Eur. J. Biochem.

    (1986)
  • K.J. Simpson et al.

    Molecular characterisation and hormone-dependent expression of the porcine whey acidic protein gene

    J. Mol. Endocrinol.

    (1998)
  • R. Heinzel et al.

    Molecular cloning and expression of cDNA for human antileukoproteinase from cervix uterus

    Eur. J. Biochem.

    (1986)
  • C. Kirchhoff et al.

    A major human epididymis-specific cDNA encodes a protein with sequence homology to extracellular proteinase inhibitors

    Biol. Reprod.

    (1991)
  • M.A. Garczynski et al.

    Molecular characterisation of a ribonucleic acid transcript that is highly up-regulated at the time of ovulation in the brook trout (Salvelinus fontinalis) ovary

    Biol. Reprod.

    (1997)
    1

    Color Plates for this article are on pages 134–136.

    1

    Present address: Department of Biochemistry and Molecular Biology, University of Melbourne, Parkville, Victoria 3052, Australia
