PROTEINS: Structure, Function, and Genetics 36:77–86 (1999) Molecular Dynamics Simulations of Human ␣-Lactalbumin: Changes to the Structural and Dynamical Properties of the Protein at Low pH Lorna J. Smith,1* Christopher M. Dobson,1 and Wilfred F. van Gunsteren1,2 Centre for Molecular Sciences, New Chemistry Laboratory, University of Oxford, Oxford, England 2Department of Physical Chemistry, Swiss Federal Institute of Technology Zürich, Zürich, Switzerland 1Oxford ABSTRACT Two 700-ps molecular dynamics simulations of human ␣-lactalbumin have been compared. Both were initiated from an X-ray structure determined at pH 6.5. One simulation was designed to represent native conditions and the other the protein in solution at pH 2.0 without a bound calcium ion. The low pH conditions were modelled by protonating the aspartate, glutamate, and histidine side chains and the protein C-terminus. Significant changes were observed for the C-terminal region of the sequence in the simulation at low pH. Most notably an ␣-helix, helix D, and the C-terminal 310 helix were substantially disrupted relative to the simulation at high pH. These perturbations to the native fold are similar to those observed in an X-ray structure of ␣-lactalbumin at pH 4.2. In addition, larger fluctuations about side chain torsion angles were observed in the low pH simulation than in that corresponding to the higher pH. These structural and dynamical changes might be representative of the early stages of the transition to the moltenglobule state of the protein known to be formed under low pH conditions in solution. Proteins 1999;36:77–86. r 1999 Wiley-Liss, Inc. Key words: computer simulation; protein dynamics; protein denaturation; side chain disorder; molten globule; protein folding INTRODUCTION Partial or complete denaturation of proteins can be brought about in vitro by a range of solution conditions.1 These include extremes of temperature or pH, high salt concentrations, the removal of prosthetic groups or bound metal ions and the presence of organic solvents or chemical denaturant such as urea or guanidine hydrochloride. There is much interest in characterizing the non-native states formed under such conditions and comparing their properties with those of globular folded proteins. These studies can give insight into issues such as protein stability and folding and the relationship between the sequences of proteins and their three-dimensional structures.2–6 They also have significance with regard to understanding diseases associated with protein misfolding and the aggregation of non-native protein species. Such diseases include r 1999 WILEY-LISS, INC. cystic fibrosis, Alzheimer’s, and the spongiform encephalophies.7 In this work we concentrate on the partial unfolding of human ␣-lactalbumin under conditions of low pH and in the absence of a bound calcium ion. Experimental studies have shown that the behavior of this protein as a function of pH is particularly interesting. The native fold of human ␣-lactalbumin, defined in X-ray structures determined at pH 6.5,8,9 contains a helical ␣ domain (four ␣-helices, A–D, and a short C-terminal 310 helix) and a ␤ domain with a triple-stranded antiparallel ␤-sheet, a long loop and a short 310 helix (Fig. 1a). One calcium ion is bound in a site involving the carbonyl groups of Lys 79 and Asp 84, the carboxylate groups of Asp 82, Asp 87, and Asp 88, and two water molecules (Fig. 1b). Comparisons of this structure with an X-ray structure determined at pH 4.29 (with the calcium ion still bound) demonstrate that on lowering the pH there are changes in the C-terminal region of the sequence (residues 96–123). The most significant of these are a local unfolding of helix D (involving residues 104– 111) and a movement of the side chain of His 107 from a buried position in the native structure to a conformation where it is fully exposed to solvent. At even lower pH (pH 2.0) and in the absence of calcium human ␣-lactalbumin in solution forms a molten-globule state (the A-state)10 consisting of an ensemble of interconverting conformers. The ␣-lactalbumin A-state is compact11 and much of the native secondary structure persists, particularly the A, B, and C helices in the ␣ domain.12 The ␣ domain also retains an overall native fold.13,14 However, the packing of amino acid side chains is disordered across the conformational ensemble. This is particularly clearly evident in a loss of the near-uv CD signal15 and a significant reduction in the chemical shift dispersion of the NMR resonances for the molten-globule A-state compared to the native protein.16,17 Abbreviations: CD, circular dichroism; MD, molecular dynamics; NMR, nuclear magnetic resonance; RMSD, root-mean-square difference. Grant sponsors: U.K. Biotechnology and Biological Sciences Research Council; Engineering and Physical Sciences Research Council; Medical Research Council; Wellcome Trust; Schweizerischer National Fonds; and the Underwood Fund. *Correspondence to: L.J. Smith, New Chemistry Laboratory, University of Oxford, South Parks Road, Oxford OX1 3QT UK. E-mail: firstname.lastname@example.org Received 5 November 1998; Accepted 12 March 1999 78 L.J. SMITH ET AL. Fig. 1. The X-ray structure from Acharya et al.8 of human ␣-lactalbumin. In a) the full structure is shown. The positions of the calcium ion and aspartate ligands are indicated by black spheres and the regions of secondary structure are labelled (Helix A 5–15, Helix B 24–33, ␤-sheet 41–56, 310(1) 77–80, Helix C 86–98, Helix D 106–110, 310(2) 116–118). In b) the details of the calcium-binding site are shown, a circle representing the calcium ion and crosses indicating bound water molecules. The program MOLSCRIPT42 was used to generate part a. We have been investigating strategies through which theoretical studies can be used in conjunction with experimental data to extend our understanding of the conformational properties of human ␣-lactalbumin at low pH. In particular we have previously reported a model for the A-state of the protein generated using a molecular dynamics (MD) protocol based on insight from experimental data.18 The molten globule must be described in terms of an ensemble of slowly interconverting conformers and the protocol adopted provides models for individual structures within this ensemble. These differ in the packing of the amino acid side chains but have essentially the same native-like secondary structure and fold in accord with experimental observations. In this paper we report a complementary study in which we use MD techniques to probe possible initial stages in the denaturation of human ␣-lactalbumin under conditions of low pH. In particular we have run and analyzed two MD simulations of ␣-lactalbumin, one of the native protein at pH 8.0 (referred to as the high pH simulation) and the other starting from the native structure but modelling conditions of pH 2.0 and the absence of calcium (low pH simulation). Both simulations have been run for 700 ps in the presence of explicit solvent water molecules. Experimental interconversion between the native state and the A-state is slow (the average rate of this interconversion for guinea pig ␣-lactalbumin at pH 7.4 and 65°C has been estimated to be less than 10 s⫺1).19 Therefore within the low pH MD simulation we do not expect to equilibrate to the molten globule. Instead the work is intended to provide insight into the structural and dynamical properties of the native state and how these initially change when the protein is introduced into an environment at low pH. METHODS Human ␣-lactalbumin simulations and analysis were carried out using the GROMOS96 package of programs.20 The simulations were performed in the presence of explicit solvent (simple point charge water molecules21), using the GROMOS96 force field20 for solvent simulations, parameter set 43A1. Two simulations were run, modeling conditions of high (pH 8.0) and low pH (pH 2.0) respectively by changing the protonation of aspartate, glutamate, and histidine side chain groups in the protein and the Cterminus. Full details are given in Table I. In each case the TABLE I. Differences Between the Systems Used for the Simulations of Human ␣-Lactalbumin at High and Low pH Aspartate side chainsa Glutamate side chainsb Histidine side chainsc C-terminus Overall protein charge Calcium ion Crystallographic waters Box dimensions (initial) Total numbers of: Protein atoms Water molecules High pH (pH 8) Low pH (pH 2) Not protonated Not protonated Not protonated Not protonated ⫺8 Present Used 7.1832 nm Protonated Protonated Protonated Protonated ⫹16 Absent Not used 7.1832 nm 1,243 ⫹ Ca2⫹ 5,574 1,267 5,582 aAspartate residues are 14, 16, 37, 45, 74, 78, 82, 83, 84, 87, 88, 97, 102. residues are 7, 25, 43, 46, 49, 113, 116, 121. cHistidine residues are 32, 107. bGlutamate starting protein coordinates were taken from the crystal structure of human ␣-lactalbumin determined at pH 6.5 by Acharya et al.8 There are discrepancies in the literature regarding the pKa values of the two histidine side chains; the pKa values are 5.8 and 7.2 but the assignment of these to His 32 or His 107 is not clear.9,22 The simulation of the native protein was therefore run under conditions where both the histidine side chains were deprotonated, rather than at the pH of the X-ray structure. The calcium ion was included in the high pH simulation but was removed for the simulation at low pH. Truncated octahedron periodic boundary conditions were used in both simulations, the initial dimensions of the periodic box being chosen so that all non-hydrogen solute atoms lie further than 1.0 nm from the square box walls. Details of the box lengths used and sizes of the simulation systems are given in Table I. Each simulation was run for 700 ps. In order to allow relaxation of the water around the protein, a temperature of 100 K was used for the first 2 ps and 200 K for the next 2 ps; the positions of all the protein atoms were restrained to the X-ray ones during this initial 4 ps. The simulations were then continued at 293 K without any positional restraints and with a constant pressure of 1atm throughout. The temperature and pressure were maintained by weak coupling to an external bath23 (temperature coupling MD SIMULATIONS OF ␣-LACTALBUMIN 79 Fig. 2. Variations in the total potential energy of the system (a) and the protein covalent, electrostatic, and van der Waals energy (b, c and d respectively) as a function of time for the ␣-lactalbumin simulations at high (filled circles) and low pH (open circles). All energies are given in kJmol⫺1 and are averages over 10 ps periods. The average temperature of the protein in simulations over the time period 300–700 ps was 289.3 K (high pH) and 293.1 K (low pH). The overall differences between the protein energies observed in the simulations at low and high pH reflect the different total charges of the protein molecule in the two simulations (⫺8 at high pH and ⫹16 at low pH) and the absence of the calcium ion at low pH. relaxation time 0.1 ps; pressure coupling relaxation time 0.5 ps; isothermal compressibility T 4.575 10⫺4(kJ mol⫺1 nm⫺3) ⫺1). Throughout the simulations bond lengths were constrained to equilibrium values using the SHAKE procedure with a geometric accuracy of 10⫺4.24 Nonbonded interactions were treated using a twin range method.25 Within a short-range cutoff of 0.8 nm all interactions were determined at every step. Longer range (electrostatic and van der Waals) interactions within a cutoff range of 1.4 nm were updated at the same time as the pair list was generated (every 10 fs). A reaction field was applied (⑀ 54.0)26 beyond a cutoff of 1.4 nm. A time step of 2 fs was used. The analysis was performed using trajectory coordinates and energies that were written to disk every 0.1 ps. The solvent-accessible surface areas of individual structures taken from the simulation trajectories were calculated using the program NACCESS.27 Regions of secondary structure were identified using the program DSSP.28 The hydrophobic contacts present in structures taken from the simulations were analyzed using the program NAOMI.29 RESULTS AND DISCUSSION Characteristics of the Native Protein Figures 2 and 3 show the changes in the energy of the system, the protein radius of gyration, solvent-accessible surface area and positional root mean square differences (RMSD) with respect to the X-ray structure through the simulations. Looking first at the simulation of the protein at high pH, after initial rapid changes in most of these parameters the values plateau within 200–300 ps reflecting the equilibration of the system. The C␣ and all atom positional RMSD values with respect to the X-ray structure stabilize at values of approximately 0.18 and 0.26 nm respectively. Overall the values reflect changes to the crystal structure as it adapts to the simulation solution 80 L.J. SMITH ET AL. Fig. 3. Variations in the protein radius of gyration (a, in nm), the total protein solvent-accessible surface area (b, in nm2) and the C␣ atom (c, in nm) and all-atom (d, in nm) positional root-mean-square differences from the X-ray structure of native ␣-lactalbumin. The values are for instantaneous structures taken at 10 ps intervals through the simulation; high pH (filled circles), low pH (open circles). The protons only present at low pH and the calcium ion have been excluded in the RMSD calculations. environment. The secondary and tertiary structure of the protein identified in the X-ray structure at pH 6.5 is retained throughout the high pH simulation. The populations of the hydrogen bonds defining the secondary structure (over the simulation time of 300–700 ps) are summarized in Table II. Lack of regularity in secondary structure elements has been recognized previously in the X-ray structure at the C-terminus of helix A (Leu 11-Ile 15 form a 310 rather than an ␣-helix) and in the D helix region (Ala 106-Leu 110).8,9 These irregularities are reflected in the simulation as populations of a variety of medium range main chain hydrogen bonds by these residues. For example, for residues at the C-terminus of helix A there is a 31% and a 13% population of 13NH-11CO and 16NH-12CO hydrogen bonds respectively in addition to the i,i⫹3 hydrogen bonds which define the 310 helix in the X-ray structure. For helix D there are significant populations of 108NH104CO (92%) and 111NH-106CO (51%) hydrogen bonds in the simulation that are not observed in the X-ray structure (Figure 4a and Table II). During the simulation the calcium ion was not restrained to its X-ray structure position. All the features of the binding site persist, however, throughout the simulation (Fig. 5a), though the calcium ion does move slightly away from the protein (distances from the ion to the ligand binding sites on the protein increase on average by 0.02 nm compared to those present in the X-ray structure; Table III). The calcium ion continues to be essentially buried in the protein (solvent-accessible surface area of 0.04 nm2 at the end of the simulation, compared to 0.002 nm2 in the X-ray structure). This is of interest as recent experimental results suggest that the interconversion of MD SIMULATIONS OF ␣-LACTALBUMIN TABLE II. Populations of Main-chain Hydrogen Bonds (Indicated by Residue Numbers) in Regions of Secondary Structure in the Human ␣-Lactalbumin Simulations† A) Hydrogen bonds present in the X-ray structure NH-CO Helix A 8-4 9-5 10-6 11-7 12-8 13-10 15-12 16-13 Helix B 27-23 28-24 29-25 30-26 31-27 32-28 33-29 34-30 ␤-sheet 42-49 44-47 49-42 51-40 50-55 54-51 55-50 High pH Low pH 83 93 63 95 91 2 7 44 99 97 57 92 90 5 63 62 99 82 99 78 98 90 92 72 99 96 88 86 98 75 35 0 72 53 98 59 98 54 98 56 86 97 54 98 77 99 NH-CO 57-48 310 helix 1 79-76 80-77 81-78 Helix C 89-85 90-86 91-87 92-88 93-89 94-90 95-91 96-92 97-93 98-94 99-95 Helix D 107-104 109-105 110-106 111-107 310 helix 2 118-115 119-116 High pH Low pH 56 86 38 97 68 61 13 14 98 95 93 99 98 86 92 68 94 94 60 49 91 98 99 96 57 98 98 91 74 87 13 95 43 40 40 0 1 7 72 65 0 0 B) Additional hydrogen bonds with significant populations (ⱖ10%) in one or both of the simulations NH-CO Helix A 13-11 16-12 Helix B 26-22 32-29 ␤-sheet 47-44 310 helix 1 81-77 High pH Low pH NH-CO 31 13 23 13 96 7 60 14 36 52 0 45 82-78 82-79 84-79 Helix C 98-95 Helix D 108-104 108-105 110-105 111-106 High pH Low pH 0 0 0 49 31 21 0 12 92 0 18 51 23 46 0 65 †The populations listed are percentages over the simulation time 300–700 ps. holo bovine ␣-lactalbumin into the native apo form does not occur directly but via a molten-globule state (V. Forge et al., unpublished results). Presumably partial unfolding of the protein is required to liberate the calcium. Previous studies have shown that isotropic B factors calculated using protein simulation trajectories may be unreliable even if simulation lengths of approximately 500 ps (after equilibration) are used.30 This reflects the fact that atomic fluctuations may not converge fully within this time scale. In addition, experimental crystallographically refined B factors reflect atomic positional disorder and are sensitive to the details of the refinement protocol. A detailed comparison of the B factors calculated from the 81 ␣-lactalbumin simulation with experimental values is therefore not warranted. Indeed the calculated B factors for ␣-lactalbumin (mean for C␣ 0.12 nm2 using 300–700 ps simulation time) are considerably lower on average that those in the X-ray structure of Acharya et al.8 (mean 0.22 nm2). This suggests that a longer simulation time would be required to sample fully the internal motions of the protein. However, various trends can be identified regarding the mobility of human ␣-lactalbumin from the magnitude of the fluctuations in the simulation over the 700 ps time scale covered. Looking first at the main chain, from 300–700 ps the largest fluctuations in the simulation are for Asp 45, Glu 46, Thr 112, Glu 113, Lys 114 (C␣ RMS position fluctuations 0.13–0.14 nm; calculated B factors 0.44–0.52 nm2) and Leu 123 (C␣ RMS position fluctuation 0.22 nm2; B factor 1.27 nm2). All these residues are in exposed positions where atomic motions will be more readily allowed because of the absence of close packing. Asp 45 and Glu 46 form a turn between the first two strands of the ␤-sheet, Thr 112-Lys 114 are located in a surface loop linking helix D and the C-terminal 310 helix, and Leu 123 is the C-terminal residue of the polypeptide chain. It is of interest that these residues correlate very well with those that have the highest B factors experimentally although the experimental values of the latter are larger, in general, than those calculated from the trajectory; in the X-ray structure the highest C␣ atom B factors (0.55–0.86nm2) are observed for Asp 45, Glu 46, Glu 113, Lys 114, Glu 121, Lys 122, and Leu 123. In the X-ray structure determined at pH 6.5 by Acharya et al.8 which was used as the starting structure for the simulations, no residues are defined as having multiple conformations. However, there are seven residues with multiple side chain conformations in a human ␣-lactalbumin structure determined by Harata and Muraki9 at pH 6.5. (Harata and Muraki have determined structures of human ␣-lactalbumin at both pH 6.5 and pH 4.2. The structure at pH 6.5 is closely similar to that of Acharya et al. with the exception of the C-terminus; the C␣ atom positional RMSD between the two structures for residues 1–120 is 0.02 nm.) All but one of the residues with multiple side chain conformations occupy exposed positions on the protein surface. Therefore it might be expected that these residues have mobile side chains in solution. Indeed NMR studies have suggested that in general more than 60% of residues on the surfaces of proteins populate multiple 1 conformations in solution.31 Met 30, however, has two different side chain conformations in the ␣-lactalbumin structure of Harata and Muraki (differing in the 2 and 3 torsion angles; Fig. 6a) but is completely buried (zero solvent accessibility). This residue is situated in one of the hydrophobic clusters in the core of the protein. This side chain also has elevated B factors in the X-ray structure of Acharya et al. (particularly for the S␦ atom 0.31nm2 compared with a mean of 0.14nm2 for other buried residues) and the 2 torsion angle observed in this structure (21°) differs from either of those in the structure of Harata and Muraki (2 angles 176° and ⫺42°; Fig. 6a). 82 L.J. SMITH ET AL. Fig. 4. The main chain conformation of residues Trp 104 to Leu 119 in human ␣-lactalbumin. Examples of instantaneous structures are shown taken from the simulations of the protein (after 600 ps) at high pH (a) and low pH (b). Main-chain hydrogen bonds are indicated by dashed lines, the residues involved being labelled with the residue number. To provide the same orientation for panels a) and b) the two structures shown were superimposed using the backbone atoms of residues 104–119. Fig. 5. Residues involved in the calcium-binding site of human ␣-lactalbumin. Examples of instantaneous structures are shown taken from the simulations of the protein (after 600 ps) at high pH (a) and low pH (b). In each case the main chain is drawn together with the aspartate side chains that form the calcium-binding ligands. Hydrogen bonds are shown by dashed lines, and in the high pH structure a circle and crosses indicate the position of the calcium ion and bound water molecules respectively. To provide the same orientation for panels a) and b) the two structures shown were superimposed using the backbone atoms of residues 76–88. TABLE III. Distances (in nm) Between the Calcium Ion and the Protein Atom and Water Ligands in the X-Ray Structure8 and the High pH Human ␣-Lactalbumin Simulation (Average for 300–700 ps) Distance X-Ray Simulation Ca2⫹-79O Ca2⫹-82OD1 Ca2⫹-84O Ca2⫹-87OD1 Ca2⫹-88OD1 Ca2⫹-OW1 Ca2⫹-OW2 0.234 0.235 0.224 0.239 0.242 0.230 0.250 0.245 ⫾ 0.008 0.266 ⫾ 0.010 0.246 ⫾ 0.009 0.262 ⫾ 0.008 0.260 ⫾ 0.008 0.247 ⫾ 0.008 0.246 ⫾ 0.008 Analysis of the conformation of Met 30 in the simulation shows that there are considerable fluctuations about the side chain torsion angles for this residue. In particular, two 2 and seven 3 transitions of 120° or greater are observed during the high pH simulation (Fig. 6b). This contrasts with the behavior in the simulation of the majority of other buried side chains in ␣-lactalbumin. For 20 of the 24 residues in the protein which have side chain accessibilities of less than 1%, no dihedral angle transitions are Fig. 6. The conformation of Met 30. a) The X-ray structures of Acharya et al.8 and Harata and Muraki9 at pH 6.5. b) Instantaneous structures taken at 100 ps intervals through the ␣-lactalbumin simulation at high pH. The structures are superimposed using the main chain atoms of Met 30 (positioned at the top of the figure), the same orientation being used in a) and b). MD SIMULATIONS OF ␣-LACTALBUMIN observed during the simulation. The buried residues in addition to Met 30 for which dihedral transitions of 120° or greater are seen are Leu 12 (2 transitions), Leu 52 (9 transitions) and Ile 55 (14 transitions). In the ␣-lactalbumin structure Met 30 is adjacent to the aromatic rings of Phe 53 and Tyr 104. Interestingly, Leu 52 and Ile 55 also cluster together beside these two aromatic rings. Longer simulations times would be required to characterize the side chain motions in detail. However the current results show that within this hydrophobic core there is some flexibility in the native protein. This is clearly in agreement with the experimentally observed multiple conformations for Met 30. Multiple side chain conformations are not observed for Leu 52 or Ile 55 in either of the X-ray structures determined at pH 6.5 but two different conformations are defined for the side chain of Ile 55 in the pH 4.2 X-ray structure of human ␣-lactalbumin.9 The C␦1 atom B factor of Leu 52 is also elevated in the pH 6.5 X-ray structure of Harata and Muraki (0.28 nm2 compared with a mean for the buried residues of 0.15 nm2). The conformational heterogeneity of side chains in a hydrophobic core of human ␣-lactalbumin identified here is of interest with respect to the substantially poorer quality of the NMR spectra of human ␣-lactalbumin compared with those of the homologous hen lysozyme.22 Although many factors contribute to this difference in spectral quality, it in a large part reflects the broader line widths of resonances in the ␣-lactalbumin spectra. Line broadening can result from conformational exchange due to mobility within the protein structure.32 Analysis of 15N relaxation data for the ␣-lactalbumin main chain provides evidence for conformational exchange for a significant number of residues in the ␣ domain of the protein (C. Redfield and C.M. Dobson, unpublished data). The effects are most extreme for Gly 19 and Phe 31; the ratios of T1 to T2 relaxation times for the 15N nuclei of these residues are more than a factor of two greater than those for residues in the ␤ domain of the protein (data recorded at a 1H frequency of 750 MHz). In addition, the NMR resonances of Leu 105 and Lys 108 cannot be identified, this being thought to be due to extreme broadening of these resonances (C. Redfield and C.M. Dobson, unpublished data). Two of the residues with the largest conformational exchange effects (Phe 31 and Leu 105) are adjacent in the polypeptide chain to Met 30 or Tyr 104 within the region of the protein core where there are significant side chain fluctuations within the simulation. The time scale of motions required for conformational averaging to be observed experimentally is much longer than the 700 ps probed within the MD simulation. However, the conformational flexibility which allows fluctuations within the interior of the structure in the ␣-lactalbumin simulation is likely to permit slower motions to occur which are revealed in the experimental NMR relaxation behavior. Changes to the Structure and Dynamics at Low pH As with the high pH simulation, analysis of the variations in the energies and structural parameters shown in Figures 2 and 3 for the low pH simulation suggests that 83 the system has substantially equilibrated within 200–300 ps. After equilibration there are essentially no differences between the radius of gyration or the solvent-accessible surface area of the high and low pH structures in the simulations. The C␣ and all atom-positional RMSD’s with respect to the X-ray structure for the low pH simulation stabilize at values slightly higher than those for the high pH simulation (0.22nm for C␣ atoms, 0.30nm for all atoms). These values are similar to the positional RMSD’s between the structures in the low and high pH simulations after 700 ps (0.20nm for C␣ atoms, 0.29 nm for all atoms). In the simulation of the protein at low pH there are some changes to the secondary structure of the protein (Table II). There are minor differences in the hydrogen bonds populated at the termini of the A and B helices but the most significant changes are to the D helix and the two 310 helices in the protein. Concentrating first on the D helix (Ala 106-Leu 110), there are no main chain hydrogen bonds involving these residues that are populated for more than 70% of the simulation time (from 300–700 ps). For this region at low pH the hydrogen bonds with the greatest population are between residues with an i,i⫹3 or i,i⫹5 separation rather than the ␣-helical i,i⫹4 separation. In particular the hydrogen bonds 107NH-104CO, 108NH105CO and 111NH-106CO are observed (40, 46, and 65% populations over 300–700 ps). In the region corresponding to the native C-terminal 310 helix (116–118) no main chain hydrogen bonds with a population greater than 5% (over 300–700 ps) are observed in the low pH simulation (Fig. 4b). The conditions of low pH have disrupted the conformation of the C-terminal region of the polypeptide chain. The simulation has been able to reproduce the local unfolding of helix D, the major change observed experimentally between the X-ray structures of ␣-lactalbumin determined at pH 6.5 and pH 4.2 (see above).9 In the X-ray structure, however, the loss of helix D is associated with a substantial movement of the side chain of His 107. No significant changes to this side chain are seen in the simulation. A movement to an exposed position such as that observed in the X-ray structure at pH 4.2 would, however, be likely to occur over a much longer time scale than the 700 ps monitored here. There are some changes observed in the side chain contacts for the C-terminal region in the low pH simulation. These are related to the loss of the main chain helical hydrogen bonds and the protonation of glutamate, aspartate, and histidine side chains. For example, hydrogen bonds between the side chain oxygen atoms of Glu 116 and the main chain amide groups of Leu 115, Glu 116 and Gln 117 (30%, 56%, and 86% populations at high pH respectively) are completely absent in the low pH simulation. These atoms are brought into close proximity at high pH by the main chain 310 helical hydrogen bonds. Although the secondary structure in the C-terminal region of the protein sequence is retained in the simulation at high pH it is interesting that even in this simulation helix D has more variation in its main chain hydrogen bonding than the other ␣-helices. At high pH four non ␣-helical main chain hydrogen bonds (107NH-104CO, 84 L.J. SMITH ET AL. 108NH-105CO, 110NH-105CO, 111NH-106CO) have significant populations in this region of the sequence and three of these have increased populations in the low pH simulation (Table II). This could reflect sampling a low population of conformers characteristic of the low pH ensemble even at the higher pH. This hypothesis is supported by previous observations of elevated main chain B factors for the C-terminal residues in the X-ray structures of ␣-lactalbumin at pH 6.5 compared to those at pH 4.2.9 It is also consistent with the exchange broadening, particular for Leu 105 and Lys 108, identified by the 15N relaxation studies discussed above. This proposed sampling of a low population of conformers with a disordered C-terminal sequence at high pH is of significance with respect to understanding data from experimental studies of the exchange rates of main chain amide hydrogens in native ␣-lactalbumin (at pH 6.3).12 No protection from exchange is observed for residues in either the D helix or the C-terminal 310 helix in the native protein. Interestingly, despite the fast amide proton exchange rates in the C-terminal sequence, this region of the structure is most resistant to unfolding on the addition of urea to the molten-globule A-state of the protein.33 The results therefore suggest that the resistance to denaturation reflects stabilization from hydrophobic contacts rather than from secondary structure. Evidence in support of this comes from an observed close similarity the hydrophobic contacts present in the C-terminal sequence in the X-ray structure and at the end of the low pH simulation (average structure from the final 100 ps). Considering contacts between the side chains of aromatic residues in this region (Tyr 103, Trp 104, His 107, and Trp 118) and other hydrophobic side chains within the protein, contacts between 17 residue pairs are observed in the X-ray structure and 15 of these pairs retain hydrophobic contacts in the low pH conformer. The other 310 helix in human ␣-lactalbumin is adjacent to the calcium-binding site. In the low pH simulation in the absence of calcium this helix is elongated compared with that present in the X-ray structure but has an irregular pattern of hydrogen bonds (Fig. 5b and Table II). In this region several main chain i,i⫹3, i,i⫹4 and i,i⫹5 hydrogen bonds are populated. Two of these hydrogen bonds involve the carbonyl oxygen atom of Lys 79 (82NH-79CO and 84NH-79CO) which is one of the ligands in the calciumbinding site. In addition the amide of Asp 83, which because of the irregular hydrogen-bonding pattern is not involved in a main chain hydrogen bond, makes a hydrogen bond with the side chain of Asp 82. This side chain is also a ligand in the native calcium-binding site. The features of the calcium-binding site are therefore substantially disrupted in the low pH simulation. The RMS fluctuations about dihedral angles and the total number of dihedral angle transitions have also been compared in the two simulations (Table IV). As with the structural parameters, the fluctuations are initially larger in each simulation as the system equilibrates, but the extent of the motions stabilizes within the first 300 ps of the simulations. The fluctuations about the main chain and torsion angles are very similar in the two simulations. For example, the RMS fluctuations about from 300–700 ps are 15.6° and 15.7° in the high and low pH simulations respectively. Differences are seen however in the motions of the side chains. In particular the fluctuations about the 1, 2, and 3 torsion angles are greater in the low pH simulation. The RMS fluctuations from 300– 700 ps are 21.3° (1), 52.8° (2), and 67.3° (3) at high pH compared to 30.0° (;1), 59.0° (2), and 122.9° (3) at low pH. In addition, there are significantly more dihedral angle transitions for the side chains at low pH (563 transitions of 120° or greater at low pH compared to 380 at high pH). These differences may reflect in part the smaller number of hydrogen bonds involving side chains present during the low pH simulation. Counting only hydrogen bonds with a population greater than 10% in the simulations between 300–700 ps, there are 81 hydrogen bonds in the high pH simulation compared with 56 at low pH. The observed increase in fluctuations at low pH is of considerable interest as it indicates an increased flexibility and disorder in the protein interior under these conditions. If the simulation were to be run for much longer times, one would be able to sample more fully this disorder in the side chain packing within the protein core in equilibrium at low pH. Such results would correlate with the experimentally observed formation of the molten-globule state under conditions of low pH in which the side chain packing is disordered across the conformational ensemble.16,17,19 They are also in agreement with MD simulations that have shown that the packing of amino acid side chains in the native human ␣-lactalbumin structure can be substantially disrupted whilst retaining a low-energy native-like fold.18 CONCLUSIONS Despite increasing interest in non-native protein conformations2–7 our understanding of their structural and dynamical properties at an atomic level is still limited. This is primarily because of the considerable conformational disorder associated with such states and the resulting difficulties in their study and description. In general non-native states must be described explicitly in terms of ensembles of interconverting conformers.6,34,35 This poses problems for both experimental and theoretical studies. Experimental data represent an average over the conformational ensemble and are therefore difficult to interpret in structural terms.6,36 For theoretical MD studies, currently accessible simulation times are not nearly long enough when using conventional procedures to sample the full conformational ensemble, let alone characterize the process of interconversion between native and non-native species. A range of MD approaches has however been developed that attempt to overcome these problems. These provide an important starting point for interpreting experimental data. The MD approaches include using simplified force fields or elevated temperatures in the simulations, running series of simulations from different starting structures, and simulating peptide fragments rather than the full length polypeptide chain.37,38 MD SIMULATIONS OF ␣-LACTALBUMIN 85 TABLE IV. Root-Mean-Square Fluctuations About Torsion Angles and the Total Number of Torsion Angle Transitions in the Human ␣-Lactalbumin Simulations† A. RMS fluctuations (in degrees)a 100–200 200–300 300–400 Time period (ps) 400–500 500–600 600–700 300–700 High pH Low pH 14.5 15.3 13.8 15.3 14.3 14.0 13.2 14.8 13.8 13.7 13.2 14.6 15.6 15.7 High pH Low pH 13.6 15.0 13.4 15.3 14.0 13.7 12.6 14.7 13.3 13.5 12.6 14.4 15.7 15.9 High pH Low pH 20.1 19.9 17.0 21.8 16.7 18.3 13.7 20.5 15.7 21.3 15.6 20.3 21.3 30.0 High pH Low pH 38.1 44.3 29.4 35.3 31.4 32.8 32.2 39.0 25.9 38.3 27.4 35.0 52.8 59.0 High pH Low pH 57.1 83.7 47.1 57.9 44.1 61.7 40.9 48.9 38.5 66.6 46.2 56.9 67.3 122.9 Time period (ps) 400–500 500–600 1 2 3 B. Number of torsion angle transitionsb All main chain High pH Low pH All side chain High pH Low pH ⱖ120° side chainc High pH Low pH 100–200 200–300 300–400 600–700 300–700 616 808 670 839 589 749 639 727 627 730 587 631 2416 2806 843 1330 848 1095 677 1063 663 970 684 965 680 1013 2710 4011 149 193 129 160 100 161 70 137 109 153 98 103 380 563 †The side chains of proline and cysteine residues are excluded from this analysis. fluctuations for 4 are not listed due to the small number of residues with this torsion angle. bThe total number of torsion angle transitions of 60° or greater (All) and of 120° or greater (ⱖ120°) are listed. cThere are no main chain torsion angle transitions of 120° or greater in either of the simulations. aThe Here we have used a different strategy which does not attempt to characterize the denaturation of ␣-lactalbumin directly. Instead we have concentrated on events that could be related to the very initial steps of the denaturation at acidic pH values. Similar approaches have been used, for example, in studies of the pH-dependent conformations of a peptide fragment39 and of the denaturation of barnase under acidic conditions and in 8 M urea.40,41 The results for ␣-lactalbumin are promising in that a number of features correlate at least in general terms with experimental data. Most notably, structural changes are seen in the C-terminal region of the protein at low pH resulting in a disruption of the D helix and the C-terminal 310 helix. In addition there are increases in the side chain fluctuations at low pH which might be related to the experimental equilibration of the protein, over longer time scales, to the molten-globule state. The work has also increased our understanding of the properties of native ␣-lactalbumin. In particular significant side chain fluctuations involving Met 30, Leu 52, and Ile 55 in a buried hydrophobic core were observed in the simulation at high pH. This flexibility in the protein core correlates with multiple side chain conformations and elevated B factors observed in X-ray structures of the protein. It may also be related to the exchange broadening evident in data from 15N relaxation studies of the native protein. This work therefore provides further evidence that, as computing power increases and longer simulation times become accessible, MD techniques will provide an increasingly important method for characterizing not only the dynamics of native protein conformations but also of much longer-scale conformational transitions associated with protein denaturation. ACKNOWLEDGMENTS This is a contribution from the Oxford Centre for Molecular Sciences which is supported by the U.K. Biotechnology and Biological Sciences Research Council, the Engineering and Physical Sciences Research Council, and the Medical Research Council. The research of C.M.D. is supported in part by the Wellcome Trust and by an International Research Scholars award from the Howard Hughes Medical Research Institute. L.J.S. is a Royal Society Research Fellow. W.F.v.G. gratefully acknowledges financial support given by the Underwood Fund for his stay as a visiting 86 L.J. SMITH ET AL. Professor in Oxford and financial support from the Schweizerischer National Fonds (project 21-50929.97). We thank Christina Redfield for valuable discussions. 22. REFERENCES 1. Tanford C. Protein denaturation. Adv Protein Chem 1968;23:121– 282. 2. Dill KA, Shortle D. Denatured states of proteins. Ann Rev Biochem 1991;60:795–825. 3. Dobson CM. Unfolded proteins, compact states and molten globules. Curr Opin Struct Biol 1992;2:6–12. 4. Ptitsyn OB. Molten globule and protein folding. Adv Protein Chem 1995;47:83–229. 5. Shortle D. The denatured state (the other half of the folding equation) and its role in protein stability. FASEB J 1996;10:27–34. 6. Smith LJ, Fiebig KM, Schwalbe H, Dobson CM. The concept of a random coil. Residual structure in peptides and denatured proteins. Folding Design 1996;1:R95–R106. 7. Thomas PJ, Qu BH, Pedersen PL. Defective protein folding as a basis of human disease. Trends Biochem Sci 1995;20:456–459. 8. Acharya KR, Ren J, Stuart DI, Phillips DC, Fenna RE. Crystal structure of human ␣-lactalbumin at 1.7Å resolution. J Mol Biol 1991;221:571–581. 9. Harata K, Muraki M. X-ray structural evidence for a local helix-loop transition in ␣-lactalbumin. J Biol Chem 1992;267:1419– 1421. 10. Kuwajima K. The molten globule state of ␣-lactalbumin. FASEB J 1996;10:102–109. 11. Gast K, Zirwer D, Welfle H, Bychkova VE, Ptitsyn OB. Quasielastic light scattering from human ␣-lactalbumin: comparison of molecular dimensions in native and molten globule states. Int J Biol Macromol 1986;8:231–236. 12. Schulman BA, Redfield C, Peng ZY, Dobson CM, Kim PS. Different subdomains are most protected from hydrogen exchange in the molten globule and native states of human ␣-lactalbumin. J Mol Biol 1995;253:651–657. 13. Peng ZY, Wu LC, Kim PS. Local structural preferences in the ␣-lactalbumin molten globule. Biochemistry 1995;34:3248–3252. 14. Wu LC, Peng ZY, Kim PS. Bipartite structure of the ␣-lactalbumin molten globule. Nat Struct Biol 1995;2:281–286. 15. Dolgikh DA, Gilmanshin RI, Brazhnikov EV, Bychkova VE, Semisotnov GV, Venyaminov SY, Ptitsyn OB. ␣-lactalbumin compact state with fluctuating tertiary structure. FEBS Lett 1981;136:311–315. 16. Dolgikh DA, Abaturov LV, Bolotina IA, Brazhnikov EV, Bychkova VE, Gilmanshin RI, Lebedev YO, Semisotnov GV, Tiktopulo EI, Ptitsyn OB. Compact state of a protein molecule with pronounced small-scale mobility: bovine ␣-lactalbumin. Eur Biophys J 1985;13: 109–121. 17. Dobson CM, Hanley C, Radford SE, Baum J, Evans PA. Characterization of unfolded and partially folded states of proteins by NMR spectroscopy. In: Nall BT, Dill KA, editor. Conformations and forces in protein folding. Washington, DC: AAAS Publications; 1991. p 175–181. 18. Smith LJ, Dobson CM, van Gunsteren WF. Side-chain conformational disorder in a molten globule: molecular dynamics simulations of the A-state of human ␣-lactalbumin. J Mol Biol 1999; 286:1567–1580. 19. Baum J, Dobson CM, Evans PA, Hanley C. Characterization of a partly folded protein by NMR methods: studies on the molten globule state of guinea pig ␣-lactalbumin. Biochemistry 1989;28:7– 13. 20. van Gunsteren WF, Billeter SR, Eising AA, Hünenberger PH, Krüger P, Mark AE, Scott WRP, Tironi IG. Biomolecular simulation: the GROMOS96 manual and user guide. Zürich:Vdf Hochschulverlag AG an der ETH Zürich; 1996. 21. Berendsen HJC, Postma JPM, van Gunsteren WF, Hermans J. Interaction models for water in relation to protein hydration. In: 23. 24. 25. 26. 27. 28. 29. 30. 31. 32. 33. 34. 35. 36. 37. 38. 39. 40. 41. 42. Pullman B, editor. Intermolecular forces. Dordrecht, The Netherlands: Reidel; 1981. p 331–342. Alexandrescu AT, Broadhurst RW, Wormald C, Chyan CL, Baum J, Dobson CM. 1H-NMR assignments and local environments of aromatic residues in bovine, human and guinea pig variants of ␣-lactalbumin. Eur J Biochem 1992;210:699–709. Berendsen HJC, Postma JPM, van Gunsteren WF, Dinola A, Haak JR. Molecular dynamics with coupling to an external bath. J Chem Phys 1984;81:3684–3690. Ryckaert JP, Ciccotti G, Berendsen HJC. Numerical intergration of the Cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J Comput Phys 1977;23:327– 341. van Gunsteren WF, Berendsen HJC. Computer simulation of molecular dynamics — methodology, applications, and perspectives in chemistry. Angew Chem Int Ed Eng 1990;29:992–1023. Smith PE, van Gunsteren WF. Consistent dielectric properties of the simple point charge and extended point charge water models at 277 and 300K. J Chem Phys 1994;100:3169–3174. Hubbard SJ, Thornton JM. NACCESS computer program. London: Department of Biochemistry and Molecular Biology, University College London; 1993. Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen bonded and geometrical features. Biopolymers 1983;22:2577–2637. Brocklehurst SM, Perham RN. Prediction of the three dimensional structure of the biotinylated domain from yeast pyruvate carboxylase and of the lipoylated H-protein from the pea leaf glycine-cleavage system: a new automated method for the prediction of protein tertiary structure. Protein Sci 1993;2:626–639. Hünenberger PH, Mark AE, van Gunsteren WF. Fluctuation and cross-correlation analysis of protein motions observed in nanosecond molecular dynamics simulations. J Mol Biol 1995;252:492– 503. West NJ, Smith LJ. Side-chains in native and random coil protein conformations. Analysis of NMR coupling constants and 1 torsion angle preferences. J Mol Biol 1998;280:867–877. Smith LJ, Dobson CM. NMR and protein dynamics. Int J Quant Chem 1996;59:315–332. Schulman B, Kim PS, Dobson CM, Redfield C. A residue-specific NMR view of the non-cooperative unfolding of a molten globule. Nature Struct Biol 1997;4:630–634. Gillespie JR, Shortle D. Characterization of long-range structure in the denatured state of staphylococcal nuclease. 2. Distance restraints from paramagnetic relaxation and calculation of an ensemble of structures. J Mol Biol 1997;268:170–184. Dobson CM, Sali A, Karplus M. Protein folding: a perspective from theory and experiment. Angew Chem Int Ed Eng 1998;37:868– 893. Shortle DR. Structural analysis of non-native states of proteins by NMR methods. Curr Opin Struct Biol 1996;6:24–30. van Gunsteren WF, Hünenberger PH, Kovacs H, Mark AE, Schiffer CA. Investigation of protein unfolding and stability by computer simulation. Philos Trans R Soc. Lond B Biol Sci 1995;348: 49–59. Brooks CL. Simulations of protein folding and unfolding. Curr Opin Struct Biol 1998;8:222–226. Kirshenbaum K, Daggett V. pH dependent conformations of the amyloid ␤ (1–28) peptide fragment explored using molecular dynamics. Biochemistry 1995;34:7629–7639. Caflisch A, Karplus M. Acid and thermal-denaturation of barnase investigated by molecular dynamics simulations. J Mol Biol 1995;252:672–708. Tirado-Rives J, Orozco M, Jorgensen WL. Molecular dynamics simulations of the unfolding of barnase in water and 8 M aqueous urea. Biochemistry 1997;36:7313–7329. Kraulis PJ. Molscript: a program to produce both detailed and schematic plots of protein structures. J Appl Crystallogr 1991;24: 946–950.