PROTEINS: Structure, Function, and Genetics 34:453–463 (1999)

Molecular Dynamics and Accuracy of NMR Structures: Effects of Error Bounds and Data Removal

François-Regis Chalaoux, Seán I. O’Donoghue, and Michael Nilges*
Structural Biology Programme, European Molecular Biology Laboratory, Heidelberg, Federal Republic of Germany

ABSTRACT The effect of internal dynamics on the accuracy of nuclear magnetic resonance (NMR) structures was studied in detail using model distance restraint sets (DRS) generated from a 6.6 nanosecond molecular dynamics trajectory of bovine pancreatic trypsin inhibitor. The model data included the effects of internal dynamics in a very realistic way. Structure calculations using different error estimates were performed with iterative removal of systematically violated restraints. The accuracy of each calculated structure was measured as the atomic root mean square (RMS) difference to the optimized average structure derived from the trajectory by structure factor refinement. Many of the distance restraints were derived from NOEs that were significantly affected by internal dynamics. Depending on the error bounds used, these distance restraints seriously distorted the structure, leading to deviations from the coordinate average of the dynamics trajectory even in rigid regions. Increasing error bounds uniformly for all distance restraints relieved the strain on the structures. However, the accuracy did not improve. Significant improvement of accuracy was obtained by identifying inconsistent restraints with violation analysis, and excluding them from the calculation. The highest accuracy was obtained by setting bounds rather tightly, and removing about a third of the restraints. The limiting accuracy for all backbone atoms was between 0.6 and 0.7 Å.
Also, the precision of the structures increased with removal of inconsistent restraints, indicating that a high precision is not simply the consequence of tight error bounds but of the consistency of the DRS. The precision consistently overestimated the accuracy. Proteins 1999;34:453–463. © 1999 Wiley-Liss, Inc.

Key words: NOE; precision; protein structure; simulated annealing

Abbreviations: BPTI, bovine pancreatic trypsin inhibitor; DRS, distance restraint set; MD, molecular dynamics; NMR, nuclear magnetic resonance; NOE, nuclear Overhauser effect; PDB, protein data bank; RMS, root mean square.

Grant sponsor: Deutsche Forschungsgemeinschaft; Grant number: Ni499/1-1; Grant sponsor: The Supercomputing Resource for Molecular Biology at the European Molecular Biology Laboratory, funded by a European Union Human Capital and Mobility Access to Large Scale Facilities; Grant number: ERBCHGECT940062.
François-Regis Chalaoux’s present address is Synthelabo Biomoleculaire, Strasbourg 67080 CEDEX, France.
*Correspondence to: Michael Nilges, Structural Biology Programme, European Molecular Biology Laboratory, Meyerhofstr. 1, D-69117 Heidelberg, Federal Republic of Germany.
Received 10 August 1998; Accepted 11 November 1998

INTRODUCTION

Internal dynamics has long been acknowledged as a fundamental problem in the derivation of the three-dimensional structures of macromolecules by nuclear magnetic resonance (NMR).1,2 All parameters measured by NMR are time and ensemble averages. In contrast to X-ray crystallography, where the structure factors are linear superpositions over different conformations, these averages may be strongly nonlinear, and depend on the time-scale of the motion. For the NOE, the analysis is complicated because different types of dynamics may either increase the NOE (fluctuations in the inter-proton distance) or decrease it (fluctuations in the angle between the inter-proton vector and the internal coordinate system), leading to either under- or over-estimation of the average distance between the protons. These effects may partially cancel each other.3

In the standard structure calculation protocol for macromolecular NMR structures, one estimates the errors qualitatively, and fits a rigid structure to appropriately loosely set error bounds. Different suggestions have been made for the choice of these error bounds; e.g., restraints are classified into three classes (strong, medium, weak) with upper limits of 2.7, 3.6, and 5.0 Å,4,5 all upper limits are set to 6 Å,6 or the lower and upper limits are set as functions of the volume or the derived distance.7–10 Currently, there is no consensus as to the best criterion for setting these bounds. How the selection of error bounds affects the quality and accuracy of the structures has been the focus of several studies.5,6,11,12 While the practice of setting loose error bounds has been generally successful (the majority of NMR structures have been determined this way), it is clear that information content of the data is lost if the error bounds are set too loosely. This is unsatisfactory in particular for structure validation. On the other hand, error bounds that are too narrow will lead to distortions in the structure.

How structure accuracy is affected by incorrect distance estimates and the choice of error bounds needs to be studied in a carefully designed model system that allows a meaningful assessment of the accuracy. It is difficult to define accuracy for a structure undergoing significant internal dynamics, since we do not know which rigid reference structure to refer to. In one study,5 model data was generated from an ensemble of experimental NMR structures of protein G,13 and the reference structure was chosen as the minimized coordinate average over the ensemble.
The study used a realistic amount of experimental data, since experimentally observed NOEs were used as a basis for generating the model data, and it included dynamic effects to some degree, since model distances were calculated from the ensemble as $\langle r^{-6} \rangle^{-1/6}$ averages, where r is the distance between two protons. The conclusion of the study was that the limiting accuracy for an NMR structure is around 0.4 Å, which is also considered the limiting accuracy for X-ray crystal structures.14 A similar number was derived from the comparison of several NMR and crystal structures.15 However, Zhao and Jardetzky6 have criticized the model study, pointing out that it contains circular reasoning; in particular, in the model study, the same energy parameters were used as for the calculation of the original ensemble of experimental structures. They argue that the reference structure has to be determined with an independent technique. Therefore, they used the protonated X-ray crystal structure of bovine pancreatic trypsin inhibitor (BPTI) as reference structure and for the derivation of a model data-set. Random noise was added to the inter-proton distances in the X-ray structure to mimic the effects of experimental errors and internal dynamics. The accuracy in their study was around 1 Å. The relevance of their study is limited, since the "noise" originating from internal dynamics is not random but contains correlations. It is also difficult to evaluate the results, since the noise was added in a more or less arbitrary manner.
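The $\langle r^{-6} \rangle^{-1/6}$ average mentioned above weights short distances much more strongly than an arithmetic mean does. The following minimal sketch illustrates this for a hypothetical proton pair that spends equal time at 2.5 Å and 5.0 Å; the function names and the two-state example are illustrative assumptions, not part of the original study.

```python
def r6_average(distances):
    """Return the <r^-6>^(-1/6) average of a sequence of distances (in Å)."""
    mean_inv6 = sum(r ** -6 for r in distances) / len(distances)
    return mean_inv6 ** (-1.0 / 6.0)


def arithmetic_mean(distances):
    """Return the plain arithmetic mean of a sequence of distances (in Å)."""
    return sum(distances) / len(distances)


# Hypothetical two-state proton pair: half the frames at 2.5 Å, half at 5.0 Å.
samples = [2.5, 5.0] * 50
r_eff = r6_average(samples)
r_mean = arithmetic_mean(samples)

# The r^-6 average lies close to the short distance, far below the mean.
print(f"<r^-6>^(-1/6) = {r_eff:.2f} Å, arithmetic mean = {r_mean:.2f} Å")
```

In this example the averaged distance is dominated by the closer state, which is one way a distance can be underestimated relative to the arithmetic mean.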
The best technique to generate model NOE data is to use molecular dynamics (MD) calculations, since spectral densities and thus cross-relaxation rates and NOEs can be calculated with few approximations, and no a priori assumptions need to be made about the nature of the averaging.3,16,17–20 MD-generated model data was used to test time-averaged restraint techniques.21–23 However, these studies again suffered from circular reasoning, since distances were extracted from the trajectories as simple distance averages (i.e., angular averaging was neglected and thus perfect knowledge of the type of averaging was assumed), and since the force fields used for the data-generation trajectory and the refinement trajectory were identical.

The principal problem with internal dynamics is that it can lead to mutually inconsistent NOEs. Recently, methods have been introduced to automatically identify and remove incorrect distance restraints in model building,24,25 by an iterative statistical analysis of the violations. Similar ideas were introduced to identify relatively large errors in distance restraints due to incorrect assignments26 and noise peaks.10,26,27 The effect of removing or re-setting slightly inconsistent NOEs, such as those originating from internal dynamics, has not been tested yet. In a similar spirit, error bounds are often increased manually only for certain restraints to avoid violations (e.g., ref. 15).

We have used an MD trajectory of BPTI to generate a model system that avoids circular reasoning and introduces errors in inter-proton distances due to internal dynamics in a realistic and meaningful way. Starting from the X-ray crystal structure, the MD trajectory was calculated for several nanoseconds in order to allow significant internal dynamics to occur.
NOEs were extracted from the trajectory by calculating spectral densities from vector autocorrelation functions.20 Roughly half of the correlation functions had not converged in 6.6 ns simulation time. For the present study, in order to obtain a complete set of spectral densities, we assumed slow dynamics for the non-converged correlation functions, and estimated spectral densities from average order parameters and $\langle r^{-6} \rangle$ distance averages over the trajectory. NOEs to protons not assigned experimentally28 were removed. In this way, we arrived at a realistic number of distance restraints, when compared with an experimental study of the same protein.29 "Noise" in the distance restraints in our study arises from under- or over-estimation of the distances calculated from the spectral densities, when compared to the arithmetic mean of the distances in the trajectory, or the distances in the reference structure. Hence, compared with the previous studies by Clore et al.5 and Zhao and Jardetzky,6 our system has a more realistic model of the noise.

A single reference structure was calculated from the MD trajectory using the probability map method.30 This method produces an average structure with good covalent geometry and packing. Since the method uses structure factor refinement, the structure is equivalent to an X-ray structure refined against the average structure factor of the trajectory.

Using this model system, we addressed several points. Firstly, we systematically investigated the effect of different error bounds on structure quality and accuracy. In our model system, low-energy structures could only be obtained by using very wide bounds. However, these structures had poor accuracy. Secondly, we studied the effect of removing systematically violated restraints from the data. We found that those distances most affected by internal dynamics were identified preferentially in the violation analysis, and that the removal of these restraints improved the accuracy.
The most accurate structures were obtained with rather tight error bounds and roughly one third of the data removed. In all our calculations, precision overestimated accuracy. Finally, we tried to realistically estimate the accuracy that can be achieved for a protein with significant internal dynamics.

METHODS

Molecular Dynamics Trajectory

A molecular dynamics simulation of 6.6 ns was performed using the program X-PLOR31 employing the CHARMM extended atom energy function PARAM1932 (see ref. 20 for details). The initial set of atomic coordinates was obtained from the crystal form II structure of BPTI.33 Polar hydrogens were added34 and the resulting structure minimized with 100 steps of conjugate gradient minimization, followed by 50 ps of Langevin dynamics with an integration step of 2 fs to release initial stress and to remove bad contacts. Only polar hydrogens were treated explicitly, resulting in a total system size of 568 atoms. An implicit solvent model was employed, using a distance-dependent dielectric constant $\epsilon = R$, scaling of charges on Lys, Arg, Glu, and Asp residues by a factor of 0.3, and solving the Langevin equation for the solvent-accessible side-chains with a friction coefficient of 20 ps$^{-1}$ and including random forces. The cutoff for the non-bonded list generation was set to 9.5 Å, and a switching function32 was applied to non-bonded interactions between 5.0 and 9.0 Å.35 Initial velocities were assigned from a Maxwell distribution at 300 K. The Newton/Langevin equations of motion were integrated with a time step of 2 fs for an overall time of 6.6 ns. Bond lengths were kept rigid during the simulation by use of the SHAKE method.36 Complete coordinate sets were written every 0.1 ps.
For each coordinate set, nonpolar hydrogens were then added with X-PLOR.34

Generation of the Model Data-Set

Details of the NMR analysis of the trajectory are presented elsewhere.20 We first calculated the rotational correlation function averaged over three spatial dimensions. We then selected all proton-proton pairs for which the $\langle r^{-6} \rangle^{-1/6}$ distance averaged over the full length of the trajectory was less than 4.5 Å. Vector autocorrelation functions were then calculated for each selected proton pair. Since aliphatic protons were added only after the calculation of the trajectory, no information about the rotation of methyl groups could be extracted. We therefore used the average position of the methyl protons as a reference position for each methyl group and treated the methyl group as a single hydrogen for the purpose of NOE calculations. For each correlation function, a convergence length was estimated by comparing two correlation functions calculated from trajectories whose lengths differed by 10%. The convergence length was defined as the time at which the two correlation functions differed by more than 2.5%. Correlation functions with a convergence length of less than 10 ps were excluded from the analysis. The rotational correlation function was factored out, and the remaining internal correlation function was analyzed for plateaus within the convergence lengths to define the order parameter and effective correlation time.17,37 The order parameters were used to extend the correlation functions to infinity, and spectral densities were then calculated by numerical integration. Cross-relaxation rates could then be extracted as described.38 Effective distances reff were calculated from these rates assuming a simple $r^{-6}$ dependence. For all proton pairs excluded from the analysis by the convergence criterion, cross-relaxation rates were estimated by using the $\langle r^{-6} \rangle$ average over the trajectory, multiplied by the overall average over the order parameters.
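Two steps of this procedure can be sketched in a few lines: the convergence-length test and the fallback distance estimate for non-converged pairs, which the Results give as $(S^2 \langle r^{-6} \rangle)^{-1/6}$. This is only an illustration under stated assumptions: the correlation functions are taken to be normalized to 1 at time zero, "differ by more than 2.5%" is read as an absolute difference of 0.025, and all function names are hypothetical.

```python
def convergence_length(corr_a, corr_b, dt_ps, tol=0.025):
    """First time (ps) at which two correlation functions differ by more than tol.

    Assumption: both functions are normalized to 1 at t = 0 and sampled on the
    same time grid with spacing dt_ps; the difference is taken as absolute.
    """
    for i, (a, b) in enumerate(zip(corr_a, corr_b)):
        if abs(a - b) > tol:
            return i * dt_ps
    # No disagreement found within the common length of the two functions.
    return min(len(corr_a), len(corr_b)) * dt_ps


def r_eff_slow_dynamics(distances, s2_mean):
    """Fallback effective distance (S^2 <r^-6>)^(-1/6) for non-converged pairs.

    distances: inter-proton distances (Å) sampled along the trajectory.
    s2_mean:   average order parameter of the converged correlation functions.
    """
    mean_inv6 = sum(r ** -6 for r in distances) / len(distances)
    return (s2_mean * mean_inv6) ** (-1.0 / 6.0)


# Hypothetical correlation functions sampled every 1 ps.
c_long = [1.0, 0.9, 0.8, 0.7]
c_short = [1.0, 0.9, 0.79, 0.6]
print(convergence_length(c_long, c_short, dt_ps=1.0))

# An order parameter below one lengthens the effective distance.
print(r_eff_slow_dynamics([3.0, 3.0], s2_mean=0.8))
```

Because the order parameter is at most one, the fallback estimate is never shorter than the plain $\langle r^{-6} \rangle^{-1/6}$ average of the same samples.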
This seemed justified since dynamic processes not converged in the 6.6 ns trajectory of a small protein could reasonably be assumed to be on the same timescale as the overall tumbling. By this procedure, the number of distances reff in the data set could be nearly doubled. All NOEs to protons where no chemical shift assignment was available28 were removed.

Reference Structure

The average structure of the trajectory was generated by superposition of all frames onto the automatically determined rigid part of the protein,39,40 using X-PLOR. This structure was then optimized by generating a probability density from the fitted frames, and refining into this density using X-ray refinement.30

Error Bounds

Error bounds were derived by qualitative classification, proportional to reff or $r_{eff}^2$, or from the known difference between reff and the reference structure (see Results). To obtain the DRSs RSX (see Table I), the distances reff were binned into 0.5 Å bins. In each bin, we calculated the mean and standard deviation of rreference. The points corresponding to the mean ± X times the standard deviation were fitted with straight lines. The error bounds were then defined by these lines.

Iterative Structure Calculation

Structures were calculated with X-PLOR with a standard simulated annealing protocol starting from random torsion angles.41 In some of the calculations we omitted the high temperature stage, since the two cooling stages alone converged well enough. The NMR refinement parameter files (TOPALLHDG and PARALLHDG, version 4) were used. This version is consistent with the ideal values in the X-ray structure refinement parameter file PARHCSDX,42 and uses atom radii rather similar to those used in the distance geometry program DISGEO.11 We note that the parameters, in particular the non-bonded parameters, are quite different from the PARAM19 parameters used to generate the MD trajectory.
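The binning procedure described under Error Bounds can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes bin centres as the x-coordinates of the fitted points, the sample standard deviation, bins with fewer than two members being skipped, and a multiplier `k` on the standard deviation; every name is hypothetical.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def bound_lines(r_eff, r_ref, k=1.0, bin_width=0.5):
    """Fit lower and upper bound lines from binned reference distances.

    r_eff, r_ref: paired effective and reference distances (Å).
    Returns ((slope_lo, icpt_lo), (slope_up, icpt_up)).
    """
    bins = {}
    for e, r in zip(r_eff, r_ref):
        bins.setdefault(int(e // bin_width), []).append(r)

    centres, lower_pts, upper_pts = [], [], []
    for idx in sorted(bins):
        values = bins[idx]
        if len(values) < 2:  # a standard deviation needs at least two points
            continue
        m, s = mean(values), stdev(values)
        centres.append((idx + 0.5) * bin_width)
        lower_pts.append(m - k * s)
        upper_pts.append(m + k * s)
    return fit_line(centres, lower_pts), fit_line(centres, upper_pts)


# Synthetic data: reference distances scatter by +/-0.2 Å around 1.1 * r_eff.
r_eff = [c for c in (2.25, 2.75, 3.25, 3.75) for _ in range(2)]
r_ref = [1.1 * c + d for c in (2.25, 2.75, 3.25, 3.75) for d in (-0.2, 0.2)]
lower, upper = bound_lines(r_eff, r_ref, k=1.0)
print(lower, upper)
```

A restraint's bounds would then be read off the two lines at its own reff value.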
The ARIA modules, interfaced to X-PLOR (versions 3.1 and 3.851), were used to analyze structures and restraint violations, essentially as described.10,41 ARIA performs two essential tasks: it assigns ambiguous NOEs by iteratively removing assignment possibilities, and it identifies consistently violated restraints by a statistical analysis of restraint violations, similar to self-correcting distance geometry.24 Only the second task was used for the present application. In iteration zero, all restraints were used. In each following iteration, the eight structures with the lowest energy from the previous iteration were selected. We performed violation analysis as described24 and calculated the fraction Rvio of structures in which a particular restraint is violated by more than a threshold vtol:

$$R_{vio} = \frac{1}{S_{conv}} \sum_{s=1}^{S_{conv}} \left[ \Theta(D_s - U - v_{tol}) + \Theta(L + v_{tol} - D_s) \right] \qquad (1)$$

where $\Theta(x)$ is the Heaviside step function, $D_s$ is the restrained distance in structure s, L and U are the lower and upper bounds, and Sconv is the number of lowest energy structures (i.e., eight). The parameter vtol was set to 0.1 Å in iterations one and two, and to 0.0 Å in iterations three and four. If Rvio exceeded a threshold for a particular restraint, this restraint was removed from the list. In all calculations, the threshold for Rvio was set to 0.75. Since there were no true noise peaks in the data and convergence in each iteration to the correct fold was essentially 100%, we employed a higher value for this threshold than usual (0.5; see ref. 41). The complete list of restraints was analyzed in this way in each iteration, so that restraints removed in one iteration could reenter the calculation in a following iteration. All restraints, parameter files and protocols used in this study are available on request.

RESULTS

MD Trajectory and Reference Structure

The trajectory showed significant dynamics especially in the loops.
The locations of the most mobile regions correlate well with the experimental NMR structure ensemble,29 normal mode calculations,43 and differences between the X-ray crystal structures in different crystal forms.44 However, the fluctuations are significantly larger than those found in the experimental NMR ensemble (ref. 29; see ref. 20 for a more detailed discussion). The rigid part of the molecule, determined with an automated fitting procedure,39,40 comprised residues 1–8, 11–12, 16–38, and 41–58. The backbone RMS fluctuation around the average structure was 0.68 Å for this region and 0.78 Å for all backbone atoms. The reference structure was determined by refining the average structure against the averaged structure factors calculated from the trajectory.30 Validation of the reference structure with PROCHECK showed 37 of the 46 non-proline and non-glycine residues in the most favored regions and 9 residues in the additional allowed regions.

Fig. 1. Scatter plot of the distances reff determined from spectral densities against the trajectory averages $\langle r_{traj} \rangle$ (cf. Figure 11 in ref. 20). The error bounds for the DRSs RWMS and RL25 are indicated.

The Model Data

The basis of the model data-set was the set of cross-relaxation rates, $\sigma_{ij}$, calculated from the trajectory via proton-proton vector autocorrelation functions and model-free analysis.37 Whenever the correlation functions had converged (see Methods and ref. 20), effective distances reff were determined from the cross-relaxation rates using a standard $r^{-6}$ dependency (reff,2 in ref. 20). For all nonconverged correlation functions, slow dynamics was assumed, and reff was estimated as $(S^2 \langle r^{-6} \rangle)^{-1/6}$, where $S^2$ is the average over all order parameters of the converged correlation functions, and r is the distance in the MD trajectory. After removal of all distances involving protons for which no chemical shift was reported,28 and with the maximum observable distance set to 4.2 Å, a total of 1,543 distances were obtained.
Of these, 828 distances were derived from converged correlation functions, the remainder from $(S^2 \langle r^{-6} \rangle)^{-1/6}$ averages. In total, 718 were intraresidue, 233 sequential, 188 medium range, and 404 long range. We have deliberately not removed any intra-residue restraints since they are an integral part of the data. The number of restraints compares to 642 upper limit restraints obtained experimentally.29 If one takes into account that the model data set contains restraints for all inter-proton distances, including trivial ones that are fixed by the covalent geometry, the number of restraints in the model data set is realistic. None of the calculations in this paper included distance restraints for hydrogen bonds or torsion angle restraints. The three disulphide bonds were introduced as distance restraints (2.02 Å) between the sulphur atoms.

Fig. 2. Ratios $\langle r_{traj} \rangle / r_{eff}$, depending on residue number (cf. Figure 13 in ref. 20). The bigger black dots indicate the average for each residue; the black lines in the bottom of the figure indicate the secondary structure elements.

In an optimal experiment, one would be able to directly measure $\langle r_{traj} \rangle$, the arithmetic average of the distance in the structure over time. From the cross-relaxation rates, without applying any corrections, we obtain an effective distance reff. Figures 1 and 2 compare the distances reff determined from spectral densities against the distances $\langle r_{traj} \rangle$. For most residues the average overall ratio $\langle r_{traj} \rangle / r_{eff}$ is close to one. However, for most residues there are values significantly larger than one, which indicates serious underestimation of the distance due to internal dynamics. Because of the large number of reff distances calculated as $(S^2 \langle r^{-6} \rangle)^{-1/6}$, the deviations are more severe than in the previous analysis.20 Expectedly, the most severe deviations are found for the most mobile residues (around residues Tyr10 and Arg40).

TABLE I. Distance Restraint Sets (DRS) Used in This Study†

DRS   Type of bounds            Nex_refer   Nex_calc   Nre_calc
AL6   upper = 6 Å               140         92         -
WMS   weak, medium, strong      278         213        -
L25   Δ+/- = 0.25 reff          192         164        -
L12   Δ+/- = 0.125 reff         433         448        -
Q12   Δ+/- = 0.125 reff^2       116         63         -
Q06   Δ+/- = 0.0625 reff^2      270         265        -
Q03   Δ+/- = 0.03125 reff^2     599         594        -
M06   Δ+/- = 0.0625 reff^2      270         95         246
M03   Δ+/- = 0.03125 reff^2     599         336        301
S68   Δ+/- = σ                  394         248        -
S38   Δ+/- = σ/2                741         566        -
S20   Δ+/- = σ/4                987         807        -

†Nex_refer indicates the number of restraints for which the distance in the reference structure lay outside the lower and upper bounds, Nex_calc the number of restraints that were excluded during the calculation, and Nre_calc the number of restraints that were reset during the calculation. The total number of restraints was 1,543 in all calculations.

Several distance restraint sets (DRS) were derived from this data-set and the reference structure. In one set (RAL6), all upper and lower bounds were set to 6 Å and 0 Å, respectively, as suggested by Zhao and Jardetzky.6 For the second set (RWMS) we used the classification of 2.7 Å, 3.6 Å, and 5.0 Å for strong, medium, and weak NOEs (e.g., ref. 15), where a strong NOE was defined by reff < 2.5 Å, a medium NOE by reff < 3.3 Å, and a weak NOE by reff < 4.2 Å; all lower bounds were set to 1.8 Å. Several DRSs were generated by setting the estimated error to a polynomial function of reff: for RLX, the error is $\Delta_{+/-} = L\, r_{eff}$; for RQX, $\Delta_{+/-} = Q\, r_{eff}^2$. The DRSs RMX are identical to RQX, but in the calculations violated distance bounds are not removed but loosened specifically. In our model system, the error of each determined distance reff is known from a comparison with the distance rreference in the reference structure.
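The bound definitions for the AL6, WMS, LX, and QX sets can be written compactly. The sketch below assumes that the linear and quadratic errors are applied symmetrically, i.e. the bounds are reff − Δ and reff + Δ; the function names are hypothetical and the coefficients are those of Table I.

```python
def bounds_al6(r_eff):
    """AL6: every restraint gets the same bounds, 0 Å to 6 Å."""
    return 0.0, 6.0


def bounds_wms(r_eff):
    """WMS: strong/medium/weak classes with upper limits 2.7, 3.6, 5.0 Å."""
    if r_eff < 2.5:      # strong NOE
        upper = 2.7
    elif r_eff < 3.3:    # medium NOE
        upper = 3.6
    else:                # weak NOE (r_eff < 4.2 Å in the model data)
        upper = 5.0
    return 1.8, upper    # all lower bounds were set to 1.8 Å


def bounds_linear(r_eff, coeff):
    """LX: error proportional to r_eff (assumed symmetric about r_eff)."""
    delta = coeff * r_eff
    return r_eff - delta, r_eff + delta


def bounds_quadratic(r_eff, coeff):
    """QX: error proportional to r_eff squared (assumed symmetric about r_eff)."""
    delta = coeff * r_eff ** 2
    return r_eff - delta, r_eff + delta


# Compare the schemes for a single effective distance of 4.0 Å.
for name, bounds in [
    ("AL6", bounds_al6(4.0)),
    ("WMS", bounds_wms(4.0)),
    ("L25", bounds_linear(4.0, 0.25)),
    ("Q06", bounds_quadratic(4.0, 0.0625)),
]:
    print(name, bounds)
```

At 4.0 Å the L25 and Q06 schemes happen to give the same error of 1.0 Å; for shorter distances the quadratic form is tighter than the linear one with these coefficients.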
We generated three DRSs with error bounds directly derived from this difference: for the DRSs RSX, the error bounds were set such that approximately X% of the distances in the reference structure, rreference, lay between lower and upper bound, where X was set to 68, 38, and 20. This corresponds to 1σ, 0.5σ, and 0.25σ from the mean. This results in somewhat tighter lower bounds since under-estimation of distances is more important than over-estimation (see Figs. 1 and 2). The numbers of restraints for which the distance in the reference structure lay outside the lower and upper bounds are listed in Table I.

Structure Quality, Precision, and Accuracy

We used a standard simulated annealing protocol employing Cartesian MD, starting from random torsion angle structures (for exact parameters, see ref. 10). In each iteration, 20–25 structures were calculated. The convergence of the protocol was very good (Fig. 3).

Fig. 3. Energy-sorted RMSave plots for all backbone atoms. In contrast to the original suggestion,45 the plot shows the maximum RMS difference from the average structure (see ref. 46), not the pairwise RMS difference. Diamond: iteration 0; asterisks: iteration 1; square: iteration 2; triangle: iteration 3; dot: iteration 4. (a) DRS RAL6. (b) DRS RS38. (c) DRS RS20.

For each DRS, the initial calculation ("zeroth" iteration) was performed with all restraints. At the beginning of iterations one to four, each restraint was checked for systematic violations in the structures of the previous iteration, and a restraint was removed if it was violated in more than six of the eight structures with lowest total energy, essentially as suggested26 and described before10 (see also Methods). In total, four refinement iterations were performed. The number of active restraints decreased with iteration for all DRSs, and the tighter the bounds, the more data were excluded (Table I). Some of the calculated ensembles are shown together with snapshots of the trajectory in Figure 4.
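The removal rule (Eq. 1 in Methods, with a threshold of 0.75 on Rvio) can be sketched as below. This is an illustrative reading rather than the ARIA code: a lower-bound violation is taken as a distance more than vtol below the lower bound, following the verbal definition of a violation "by more than a threshold", and all function and variable names are hypothetical.

```python
def violation_fraction(distances, lower, upper, v_tol):
    """Fraction of structures in which a restraint is violated by more than v_tol.

    distances: the restrained distance (Å) in each selected structure.
    Assumption: a lower-bound violation means distance < lower - v_tol.
    """
    violated = sum(
        1 for d in distances if d > upper + v_tol or d < lower - v_tol
    )
    return violated / len(distances)


def kept_restraints(ensemble_distances, bounds, v_tol=0.1, threshold=0.75):
    """Indices of restraints retained for the next iteration.

    ensemble_distances: one list of per-structure distances per restraint.
    bounds:             one (lower, upper) pair per restraint.
    A restraint is dropped only if its violation fraction exceeds threshold.
    """
    kept = []
    for idx, (dists, (lo, up)) in enumerate(zip(ensemble_distances, bounds)):
        if violation_fraction(dists, lo, up, v_tol) <= threshold:
            kept.append(idx)
    return kept


# Three hypothetical restraints evaluated over eight structures.
ensemble = [
    [3.0] * 8,               # never violated
    [4.0] * 6 + [3.0] * 2,   # violated in six of eight structures
    [4.0] * 7 + [3.0],       # violated in seven of eight structures
]
bounds = [(1.8, 3.6)] * 3
print(kept_restraints(ensemble, bounds))
```

Because every restraint in the complete list is re-evaluated in each iteration, a restraint dropped once can be kept again later, as described in Methods.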
The buried side-chain of Phe22 has two distinct conformations. For all DRSs apart from RAL6, RMS deviations from ideal covalent geometry and experimental restraints are high in the first iteration (see Fig. 5), indicating that the restraints cannot be satisfied simultaneously in a single rigid structure. These RMS deviations decrease rapidly with iteration as the systematically violated restraints are removed, and they plateau around the third iteration. In contrast, RAL6 has only slightly elevated RMS values for covalent energy terms and distance restraints in the zeroth iteration, showing that there is little strain in the structures. Accordingly, fewer restraints are excluded (Table I).

Fig. 4. Cα traces of the MD trajectory and some of the calculated ensembles. The reference structure is shown in thick lines. The side-chain of the mobile buried residue Phe22 is shown. (a) Frames from the trajectory every 500 ps. (b) DRS RAL6, iteration 0. (c) DRS RS38, iteration 0. (d) DRS RS38, iteration 4.

TABLE II. Accuracy, Precision, and Structure Quality in Iterations 0 and 4†

DRS   RMSave,ref   RMSref   RMSave   WhatIf   PROSA   φ-ψ
Iteration 0
AL6   1.28         1.50     0.74     -2.52    -0.49   48.6
WMS   1.28         1.40     0.56     -1.71    -0.83   56.5
L25   1.44         1.50     0.44     -1.62    -0.89   53.3
L12   1.39         1.46     0.43     -1.52    -0.91   52.2
Q12   1.22         1.39     0.68     -1.97    -0.86   54.3
Q06   1.31         1.42     0.56     -1.63    -0.88   56.8
Q03   1.28         1.37     0.49     -1.40    -0.87   58.4
M06   1.31         1.42     0.56     -1.63    -0.88   56.8
M03   1.28         1.39     0.56     -1.40    -0.87   57.3
S68   1.16         1.34     0.58     -1.60    -0.93   61.4
S38   1.03         1.23     0.52     -1.54    -0.74   67.9
S20   1.06         1.29     0.76     -1.44    -0.78   68.8
Iteration 4
AL6   0.97         1.22     0.74     -2.71    -0.45   50.8
WMS   0.80         0.89     0.38     -1.69    -1.3    66.0
L25   0.89         0.93     0.28     -1.80    -1.1    60.9
L12   0.72         0.73     0.15     -2.13    -1.1    62.5
Q12   0.86         0.96     0.41     -2.04    -1.0    66.3
Q06   0.77         0.78     0.22     -1.84    -1.0    66.6
Q03   0.65         0.68     0.16     -1.75    -1.3    70.4
M06   1.00         1.00     0.10     -1.84    -1.0    59.5
M03   0.77         0.77     0.08     -1.75    -1.3    72.3
S68   0.83         0.85     0.21     -1.75    -1.2    61.1
S38   0.63         0.66     0.18     -1.94    -1.1    74.7
S20   0.70         0.77     0.33     -1.82    -1.0    72.6

†Accuracy (RMSave,ref, RMSref), precision (RMSave), and structure quality for different DRSs in iterations 0 and 4. For each DRS, the structure quality was assessed by the WhatIf quality index,47 the average PROSA energy per residue,48 and the number of residues in core regions of the Ramachandran plot (φ-ψ).49

Other quality indices such as the WhatIf quality index47 and average PROSA energy48 vary less with iteration (cf. Table II). Since the fold of the protein varies only little between iterations and different bound sets, this is not surprising. The PROSA energy improves, while the WhatIf quality index deteriorates slightly. With distance data derived from experiments, we observed a good correlation between these quality indices and refinement iteration.10 The present data are derived from a simulation.

Precision has been defined as the RMS difference from the average structure6 (RMSave). For accuracy, two definitions are possible: the average of the RMS differences of each single structure from the reference structure (RMSref), and the RMS difference of the average structure from the reference structure (RMSave,ref). Both measures are reported in Table II, only the first measure in Figure 6. RMSave,ref is usually somewhat smaller than RMSref, i.e., the average structure is more accurate than the individual structures on average, by maximally 0.25 Å. Precision and accuracy increase with exclusion of violated restraints, evidenced by a decrease of RMSref and RMSave with iteration number (see Figure 6 and Table II). To a small extent this is observed also for DRS RAL6, although there is little strain in the structures even in iteration zero. In most calculations, RMSref and RMSave show little change after iteration three (Fig. 6), similar to the conformational energy terms. The lowest value of RMSref is reached for DRS RS38 in iteration 3 (0.63 Å).
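The three measures defined above (RMSave, RMSref, and RMSave,ref) can be illustrated with a short sketch. It assumes that all structures have already been superimposed on a common frame and that RMSave and RMSref are means over the ensemble; the superposition step itself is omitted, and the function names and toy coordinates are hypothetical.

```python
from math import sqrt


def rms_difference(coords_a, coords_b):
    """Atomic RMS difference between two pre-superimposed coordinate sets."""
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return sqrt(squared / len(coords_a))


def average_structure(ensemble):
    """Coordinate average over an ensemble of pre-superimposed structures."""
    n = len(ensemble)
    n_atoms = len(ensemble[0])
    return [
        tuple(sum(s[i][k] for s in ensemble) / n for k in range(3))
        for i in range(n_atoms)
    ]


def precision_and_accuracy(ensemble, reference):
    """Return (RMSave, RMSref, RMSave,ref) as ensemble means, in input units."""
    avg = average_structure(ensemble)
    n = len(ensemble)
    rms_ave = sum(rms_difference(s, avg) for s in ensemble) / n
    rms_ref = sum(rms_difference(s, reference) for s in ensemble) / n
    rms_ave_ref = rms_difference(avg, reference)
    return rms_ave, rms_ref, rms_ave_ref


# Toy two-atom "structures" (Å): the ensemble straddles y = 1, the reference
# is displaced from the ensemble average along z.
ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)],
]
reference = [(0.0, 1.0, 1.0), (1.0, 1.0, 1.0)]
rms_ave, rms_ref, rms_ave_ref = precision_and_accuracy(ensemble, reference)
print(rms_ave, rms_ref, rms_ave_ref)
```

With coordinates held in one fixed frame, the distance of the average structure from the reference cannot exceed the mean distance of the individual structures, which is in line with RMSave,ref being reported as somewhat smaller than RMSref.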
Table II compares the accuracy and precision of all zeroth and fourth iterations. Only iteration zero (i.e., no data removal or bounds modifications) can be directly compared to previous studies.5,6,12 For this iteration, there is no obvious correlation between tightness of bounds, accuracy, and precision. For example, DRS RS20 has much tighter bounds than DRS RS68, but their precision is very similar. The accuracy, however, is higher for DRS RS68. Clearly, RAL6 yields the worst results in terms of accuracy.

After four iterations of data removal, there is a clearer trend toward more accurate structures with tighter bounds. Only for the tightest bounds tested (DRS RS20) is there an increase in both RMSave and RMSref in iterations three and four. For this DRS, more than half of the restraints are removed during the calculation; this loss in experimental information is not compensated by the increased information content in tighter bounds. Surprisingly, the results for our scheme of bounds resetting (calculations M03 and M06) are somewhat worse than for the simple data removal scheme.

RMSref is systematically higher than RMSave in all iterations and for all DRSs (Fig. 6). That precision overestimates accuracy has been noted previously.5,6,12,15,50 A linear relation between accuracy and precision in NMR structures has been seen in other model calculations5,50 and in a comparison of NMR structures of different generations.15 While it is obvious from Figure 6 that in our model study the more precise structures are also more accurate, there is no clear relationship.

Fig. 5. Structure quality for DRSs RAL6 (diamond), RWMS (asterisk), RS38 (square), RS20 (triangle). (a) Non-bonded repel energy. (b) RMS deviations from ideal angles. (c) RMS deviations from included distance restraints. (d) Number of included restraints.

Fig. 6. Precision against accuracy for all DRSs and for all iterations. The iterations for each DRS are connected by lines. Iteration zero is marked by 0; iteration 4 is marked by the name of the DRS. Each group of calculations is marked by a different colour: magenta, qualitative error bounds (AL6, WMS); green, linear error estimate (L12, L25); blue, quadratic error estimate (Q03, Q06, Q12); red, linear bounds from known error (S20, S38, S68); and black, quadratic error estimate with automatic bound resetting (M03, M06).

Exclusion of Distances

The overall number of excluded restraints is comparable to, but always smaller than, the number of violations in the reference structure (Table I). Since internal dynamics leads to inconsistent distance restraints, the violation analysis would optimally identify those restraints most affected by internal dynamics. In Figure 7, we compare the average distances in one of the calculated ensembles, ⟨rensemble⟩, to the corresponding distances in the reference structure, rreference. Even for the largest values of rreference, the correlation is surprisingly good. There is a slight tendency for ⟨rensemble⟩ to be smaller than rreference. This is a consequence of the fact that not all distance restraints with severe underestimation of the upper limit were removed.

To get a more detailed picture, we compared the fraction of excluded restraints per residue to the fraction of restraints that are violated in the reference structure, for DRS RS38 (Fig. 8). There is a good correlation between the percentage of excluded restraints and the percentage of violated restraints in the reference structure (correlation coefficient 0.64; Fig. 8a). However, there is little correlation between the percentage of excluded restraints and the fluctuation around the average in the trajectory (correlation coefficient 0.23; cf. Fig. 8a,b), or the RMS difference between the structure and the reference structure (correlation coefficient 0.22; cf. Fig. 8a,c). The correlation between the error in the structure and the RMS fluctuation in the trajectory is better (correlation coefficient 0.49). As already apparent in Figure 6, the RMS fluctuation of the structure around its average is very much underestimated, and the correlation of its residue dependence with that in the original trajectory is small (correlation coefficient 0.24). In contrast, the overall RMS fluctuation of the ensemble for DRS RAL6, iteration 0, agrees rather well with the fluctuation in the trajectory (correlation coefficient 0.48). However, this comes at the expense of a much increased RMS difference between the ensemble and the reference structure (Fig. 8c), and the correlation between the fluctuation in the trajectory and the error is small (correlation coefficient 0.25).

Figure 9 shows the number of excluded restraints, compared with the difference between reff and rreference, for DRS RS38. The probability of correctly identifying distances reff affected by internal dynamics grows with the error in reff (the two lines in the figure coincide for large differences rreference − reff). However, there are excluded restraints close to a difference of zero, and some restraints are not excluded even at differences rreference − reff of several Å.

Fig. 7. Scatterplot of the distances in the reference structure, rreference, against distance averages ⟨rensemble⟩ over the ensemble for DRS RS38, fourth iteration. The excluded restraints are marked with crosses, the included restraints with open circles.

Fig. 8. (a) Fraction of excluded restraints per residue (RS38, iteration 4), compared to the fraction of distances in the reference structure lying outside the bounds in the DRS RS38. (b) Cα-fluctuations in the trajectory (solid lines); for calculation AL6, iteration 0 (dot-dashed line); for calculation S38, iteration 0 (dotted line); and for calculation S38, iteration 4 (dashed line).
(c) Cα-RMS differences from the reference structure, for calculation AL6, iteration 0 (dashed line), and for calculation S38, iteration 4 (dot-dashed line).

Fig. 9. Total number of restraints (total height), and number of excluded restraints (dark gray bar), against rreference − reff, for DRS RS38. Bin size is 0.2 Å.

Re-Classification Versus Exclusion of Data

Restraint exclusion is obviously not the optimal solution, since experimental data are lost. One simple alternative scheme is to increase the distance bounds for the violated restraints, rather than remove them altogether. This is common practice in many laboratories (e.g., see ref. 15). When this is done manually, additional data (e.g., peak shapes) are used with care to identify the restraints whose bounds are loosened. We have tested one automatic scheme, in which the bounds for each violated restraint are increased in steps of 10% until the restraint is satisfied, up to a maximum of five times. If the restraint still cannot be satisfied, it is excluded. As expected, fewer restraints were excluded from the calculation (see calculations M03 and M06 in Table I). The precision of the structures is very high. However, the surprising result of this calculation was that in general the accuracy of the structures is not improved. For data set RQ06, there is even a significant decrease.

DISCUSSION

The Model System

From an MD trajectory one can calculate realistic spectral densities, and it is therefore the method of choice for generating model systems to test the influence of internal dynamics on the accuracy of NMR structures. The analysis through correlation functions rather than simple distance averages is more realistic, since for many NOEs the effects of distance and angular fluctuations cancel each other, and the assumption of $\langle r^{-6} \rangle^{-1/6}$ or $\langle r^{-3} \rangle^{-1/3}$ averages would be incorrect.
No knowledge of the exact nature of the average was assumed in the refinement, similar to the real, experimental case. The “noise” in the data due to internal dynamics is not randomly and symmetrically distributed around a known rigid structure; the derived data seemed to be more problematic than those considered in the more recent model studies.5,6 This is evidenced by the fact that the RMS fluctuation of the trajectory around its average is higher than what is typically observed in high resolution NMR structures,13,29 that we could not obtain low-energy structures with DRS RWMS, in contrast to Clore et al.,5 and that the RMS difference from the ideal structure for DRS RAL6 is larger than observed by Zhao and Jardetzky.6

The reference structure, calculated from the MD trajectory by probability map refinement, is the equivalent of an independently refined X-ray crystal structure. Circular reasoning is avoided, since the force field used to calculate the MD trajectory (PARAM19) was very different from the parameters used in the calculation of structural ensembles from the restraints (PARALLHDG). PARAM19 (ref. 32) is an extended atom force field, treating only polar hydrogen atoms explicitly, and the version of PARALLHDG (ref. 51) used in the calculation has covalent parameters from the CSDX X-ray refinement parameters,42 and vdW radii from the distance geometry program DISGEO.11 The comparison is in a way more relevant than the comparison between experimental X-ray crystal and solution NMR structures, since crystal packing effects do not play any role.

Our goal was to examine the effects of dynamic averaging alone. For this purpose, the MD trajectory offers the ultimate comparison with respect to structure accuracy and the dynamics. Hence, we did not include the effects of spin diffusion, which can be dealt with during a structure calculation in a straightforward way (for reviews, see refs. 52, 53, 54).
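The remark in this section about simple distance averages can be made concrete with a small numerical sketch. It only illustrates how strongly an $r^{-6}$ or $r^{-3}$ average is weighted toward short distances compared with the arithmetic mean; it does not model the angular fluctuations or correlation functions used in the paper. The function name `power_average` and the two-state distance sample are hypothetical.

```python
def power_average(distances, power):
    """Return <r^-power>^(-1/power) for a list of distances (in Å)."""
    mean_inverse = sum(r ** -power for r in distances) / len(distances)
    return mean_inverse ** (-1.0 / power)

# Hypothetical proton pair that spends half of the time at 2.5 Å
# and half of the time at 5.0 Å.
samples = [2.5, 5.0]

arithmetic = sum(samples) / len(samples)  # plain distance average
r6_average = power_average(samples, 6)    # <r^-6>^(-1/6)
r3_average = power_average(samples, 3)    # <r^-3>^(-1/3)

print(f"arithmetic mean : {arithmetic:.2f} Å")
print(f"r^-6 average    : {r6_average:.2f} Å")
print(f"r^-3 average    : {r3_average:.2f} Å")
```

For this sample the three averages differ by roughly 1 Å, so the distance one would assign to the same fluctuating proton pair depends noticeably on which averaging model is assumed.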
Precision and Accuracy of the Structures

To achieve the highest accuracy, we had to define rather narrow error bounds, and subsequently remove systematically violated distance restraints (about one third of the total number of restraints for DRSs RQ03 and RS38). It therefore appears that a smaller number of accurate distance restraints led to more accurate structures than a larger number of loose bounds. While the differences between the results obtained in the fourth iteration for different DRSs may be small (a few tenths of an Å), they are statistically significant (between calculations S38 and WMS, for example, the difference is more than two standard deviations; data not shown), and there is a clear trend towards higher accuracy with tighter bounds. Differences between structures obtained with all data (iteration 0) and with data removal are much larger, and we stress that the only DRS that produced low-energy structures in iteration 0 was RAL6.

Simply removing the distance restraints that cause systematic violations is only a first and maybe not very satisfactory solution. Other procedures can be considered. We tested one simple scheme that widens bounds specifically rather than removing the restraints, similar to procedures used with experimental data.15 To date, however, this method does not perform as well as data exclusion (see calculations M03 and M06 in Figure 6), and leads to an increase in precision but a decrease in accuracy.

Due to the restraint removal, the structures obviously do not satisfy all restraints any more. Structures refined with wide error bounds, on the other hand, may satisfy all restraints, but may not satisfy all data. In the DRS RAL6, for example, a strong NOE is converted into an upper limit of 6.0 Å. While a structure with a corresponding interproton distance of around 6 Å certainly satisfies the restraint, it would violate the data (assuming that a strong NOE corresponds to a distance of around 2.5 Å) by 3.5 Å.
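The distinction between satisfying a restraint and satisfying the data can be spelled out as a short worked example using the numbers quoted above. The helper names `upper_bound_violation` and `data_deviation` are hypothetical and only serve the illustration.

```python
def upper_bound_violation(distance, upper_limit):
    """Amount by which a model distance exceeds the upper bound (0 if satisfied)."""
    return max(0.0, distance - upper_limit)

def data_deviation(distance, measured_distance):
    """Absolute difference between a model distance and the 'measured' distance."""
    return abs(distance - measured_distance)

model_distance = 6.0   # interproton distance in the calculated structure (Å)
upper_limit = 6.0      # uniform upper bound used in DRS RAL6 (Å)
measured = 2.5         # distance assumed for a strong NOE (Å)

violation = upper_bound_violation(model_distance, upper_limit)
deviation = data_deviation(model_distance, measured)
print(f"restraint violation: {violation:.1f} Å, deviation from data: {deviation:.1f} Å")
```

The restraint violation is zero although the model distance is 3.5 Å away from the distance implied by the NOE, which is why evaluating structures against the bounds alone can hide large disagreements with the data.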
Consequently, if one evaluates RMS differences not to the bounds but directly to the complete set of “measured” distances, reff, structures in iteration 0 with DRS RAL6 show significantly larger values (around 1 Å) than structures of any other DRS in any iteration (e.g., below 0.6 Å in iteration 0 and around 0.8 Å in iteration 4 for DRS RS38). This is the case even though one uses the complete data set for the evaluation, and many of the data points contributing to the RMS had been excluded from the structure calculation.

A definite advantage of DRS RAL6 seemed to be that RMSDave showed a correlation with the RMS fluctuation in the original trajectory. However, the RMS fluctuation for the tightest bounds (RS20) in iteration 0 is of similar size, and the correlation is even better for RS38 (Fig. 8). Hence, inconsistencies in the data can produce a similar RMSDave as wide error bounds. The exact value of RMSDave depends on the bounds, without any clear trend. None of the calculations reproduced the disorder of the side-chain of Phe22 (see Fig. 4).

Exclusion of Violated Restraints and Internal Dynamics

The criterion for data exclusion in Eq. (1) was developed to identify large violations24 and may be rather crude for the present purpose, since it uses only structural consistency to identify NOEs affected by internal dynamics. Since this is the present implementation in iterative schemes like ARIA (ref. 10) and NOAH (ref. 26), we felt it important to study the effect on dynamically averaged distances. While the correlation of excluded distance restraints with those violated in the reference structure is satisfactory, the correlation with the mobility of the peptide chain is small (see Fig. 8). Still, a logical solution for inconsistencies in the restraints due to internal dynamics is to use ensemble averaging for the identified NOEs, and assume a rigid model for all others. This would solve the problem of underdetermination in some ensemble averaging methods.
In this context it should be noted that in the study by Bonvin and Brünger,55 a substantial fraction of the data was not averaged (hydrogen bond and coupling constant restraints). Our own attempts to use ensemble averaging without such a class of static restraints failed. An analysis as described in this paper could be used to identify a class of restraints that are not subject to dynamic averaging. We expect that in this iterative way, ensemble averaging methods could be used from the start in an NMR structure calculation, as an integral part of iterative methods like ARIA.

In this paper, we have deliberately restricted ourselves to the “standard” structure determination approach. We feel that the determination of an accurate “average” structure from NMR data is an important goal in itself, even when some of the data are better represented by an ensemble. One important use of an accurate average structure is solving X-ray crystal structures with molecular replacement. An average structure is also a necessary “zero-order” approximation for further NMR refinement. In general, the data may not always contain enough information to obtain the dynamic behavior of the protein directly.55,56

CONCLUSIONS

In this paper, we showed that precision is not only a consequence of tight bounds and number of restraints but also of the consistency of restraints. The accuracy is significantly higher with narrow bounds and restraint exclusion than with bounds wide enough to obtain low-energy structures without data removal. The most accurate structures were obtained by removing about a third of the distance restraints. The restraint exclusion scheme worked qualitatively correctly and identified those restraints that are incompatible with the rigid reference structure. It is worth noting that in our model study, all noise was due to internal dynamics, and no additional sources of noise such as artifacts or incorrect assignments were present.
If a data removal strategy is employed with experimental data, the minimum requirement is to document the excluded restraints. This obviously also applies to modifications of individual restraints to obtain violation-free structures. Although this has apparently been used for many structure determinations by NMR (e.g., ref. 15), details of the procedures involved are usually not given, and the consequences or the validity of the approach have not been assessed systematically. We suggest submitting the data with the structures in a form close to the raw data, so that modifications (reclassification/exclusion) are visible. To date, data sets submitted to the PDB (ref. 57) often do not allow the ready identification of modified restraints, in particular when only the final upper bounds are reported. In addition to the deviation from the original data (an R-value), the number of NOEs that needed corrections could serve as a figure of merit.

REFERENCES

1. Kim Y, Prestegard JH. A dynamic model for the structure of acyl carrier protein in solution. Biochemistry 1989;28:8792–8797.
2. van Gunsteren WF, Brunne RM, Gros P, van Schaik RC, Schiffer CA, Torda AE. Accounting for molecular mobility in structure determination based on nuclear magnetic resonance spectroscopic and X-ray diffraction data. Meth Enzymol 1994;261:619–654.
3. LeMaster DM, Kay LE, Brünger AT, Prestegard JH. Protein dynamics and distance determinations by NOE measurement. FEBS Lett 1988;236:71–76.
4. Wüthrich K. NMR of proteins and nucleic acids. New York: John Wiley & Sons; 1986. p 1–292.
5. Clore GM, Robien MA, Gronenborn AM. Exploring the limits of precision and accuracy of protein structures determined by nuclear magnetic resonance spectroscopy. J Mol Biol 1993;231:82–102.
6. Zhao D, Jardetzky O. An assessment of the precision and accuracy of protein structures determined by NMR: dependence on distance errors. J Mol Biol 1994;239:601–607.
7. Güntert P, Braun W, Wüthrich K. Efficient computation of three-dimensional protein structures in solution from nuclear magnetic resonance data using the program DIANA and the supporting programs CALIBA, HABAS, and GLOMSA. J Mol Biol 1991;217:517–530.
8. Hyberts SG, Goldberg MS, Havel TF, Wagner G. The solution structure of eglin c based on measurements of many NOEs and coupling constants and its comparison with X-ray structures. Protein Sci 1992;1:736–751.
9. Folmer RHA, Nilges M, Konings RNH, Hilbers CW. Solution structure of the single-stranded DNA binding protein of bacteriophage Pf3. EMBO J 1995;14:4132–4142.
10. Nilges M, Macias MJ, O’Donoghue SI, Oschkinat H. Automated NOESY interpretation with ambiguous distance restraints: the refined NMR solution structure of the pleckstrin homology domain from β-spectrin. J Mol Biol 1997;269:408–422.
11. Havel T, Wüthrich K. A distance geometry program for determining the structures of small proteins and other macromolecules from nuclear magnetic resonance measurements of intramolecular 1H-1H proximities in solution. Bull Math Biol 1984;46:673–698.
12. Havel TF, Wüthrich K. An evaluation of the combined use of nuclear magnetic resonance and distance geometry for the determination of protein conformations in solution. J Mol Biol 1985;182:281–294.
13. Gronenborn AM, Filpula DR, Essig NZ, et al. A novel, highly stable fold of the immunoglobulin binding domain of streptococcal protein. Science 1991;253:657–661.
14. Chothia C, Lesk AM. The relation between the divergence of sequence and structure in proteins. EMBO J 1986;5:823–826.
15. Gronenborn AM, Clore GM. Structures of protein complexes by multidimensional heteronuclear magnetic resonance spectroscopy. Crit Rev Biochem Mol Biol 1995;30:351–385.
16. Post CB. Internal motional averaging and three-dimensional structure determination by nuclear magnetic resonance. J Mol Biol 1992;224:1087–1101.
17. Brüschweiler R, Roux B, Blackledge M, Griesinger C, Karplus M, Ernst R. Influence of rapid intramolecular motion on NMR cross-relaxation rates. A molecular dynamics study of antamanide in solution. J Am Chem Soc 1992;114:2289–2302.
18. Abseher R, Lüdemann S, Schreiber H, Steinhauser O. NMR cross relaxation investigated by molecular dynamics simulation: a case study of ubiquitin in solution. J Mol Biol 1995;249:604–624.
19. Fushman D, Ohlenschläger O, Rüterjans H. Determination of the backbone mobility of ribonuclease T1 and its 2’GMP complex using molecular dynamics simulations and NMR relaxation data. J Biomol Struct Dyn 1994;11:1377–1402.
20. Schneider T, Brünger AT, Nilges M. Influence of internal dynamics on accuracy of protein NMR structures: derivation of realistic model distance data from a long molecular dynamics trajectory. J Mol Biol 1999;285:727–740.
21. Pearlman DA, Kollman PA. Are time-averaged restraints necessary for nuclear magnetic resonance refinement? A model study for DNA. J Mol Biol 1991;220:457–479.
22. Pearlman DA. How is an NMR structure best defined? An analysis of molecular dynamics distance based approaches. J Biomol NMR 1994;4:1–16.
23. Pearlman DA. How well do time-averaged J-coupling restraints work? J Biomol NMR 1994;4:279–299.
24. Haenggi G, Braun W. Pattern recognition and self-correcting distance geometry calculations applied to myohemerythrin. FEBS Lett 1994;344:147–153.
25. Mumenthaler C, Braun W. Predicting the helix packing of globular proteins by self-correcting distance geometry. Protein Sci 1995;4:863–871.
26. Mumenthaler C, Braun W. Automated assignment of simulated and experimental NOESY spectra of proteins by feedback filtering and self-correcting distance geometry. J Mol Biol 1995;254:465–480.
27. Macias MJ, Musacchio A, Ponstingl H, Nilges M, Saraste M, Oschkinat H. Structure of the pleckstrin homology domain from β-spectrin. Nature 1994;369:675–677.
28. Wagner G, Braun W, Havel T, Schaumann T, Gō N, Wüthrich K. Protein structures in solution by nuclear magnetic resonance and distance geometry: the polypeptide fold of the basic pancreatic trypsin inhibitor determined using two different algorithms, DISGEO and DISMAN. J Mol Biol 1987;196:611–639.
29. Berndt K, Güntert P, Orbons L, Wüthrich K. Determination of a high-quality nuclear magnetic resonance solution structure of the bovine pancreatic trypsin inhibitor and comparison with three crystal structures. J Mol Biol 1993;227:757–775.
30. DeLano WL, Brünger AT. Helix packing in proteins: prediction and energetic analysis of dimeric, trimeric, and tetrameric GCN4 coiled coil structures. Proteins 1994;20:105–123.
31. Brünger AT. X-PLOR. A system for X-ray crystallography and NMR. New Haven: Yale University Press; 1992. p 1–382.
32. Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S, Karplus M. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J Comp Chem 1983;4:187–217.
33. Deisenhofer J, Steigemann W. Crystallographic refinement of the structure of bovine pancreatic trypsin inhibitor at 1.5 Å resolution. Acta Crystallogr 1975;B31:238–250.
34. Brünger AT, Karplus M. Polar hydrogen positions in proteins: empirical energy placement and neutron diffraction comparison. Proteins 1988;4:148–156.
35. Loncharich RJ, Brooks BR. The effects of truncating long-range forces on protein dynamics. Proteins 1989;6:32–45.
36. Ryckaert J, Ciccotti G, Berendsen H. Numerical-integration of cartesian equations of motion of a system with constraints: molecular dynamics of N-alkanes. J Comput Phys 1977;23:327–341.
37. Lipari G, Szabo A. Model-free approach to the interpretation of nuclear magnetic resonance relaxation in macromolecules. 1. Theory and range of validity. J Am Chem Soc 1982;104:4546–4558.
38. Solomon I. Relaxation processes in a system of two spins. Phys Rev 1955;99:559–565.
39. Nilges M, Clore GM, Gronenborn AM. A simple method for delineating well-defined and variable regions in protein structures determined from interproton distance data. FEBS Lett 1987;219:11–16.
40. Abseher R, Nilges M. Are there non-trivial dynamic cross-correlations in proteins? J Mol Biol 1998;279:911–920.
41. Nilges M, O’Donoghue SI. Ambiguous NOEs and automated NOESY assignment. Progr NMR Spectr 1998;32:107–139.
42. Engh RA, Huber R. Accurate bond and angle parameters for X-ray structure refinement. Acta Crystallogr 1991;A47:392–400.
43. Brüschweiler R. Normal modes and NMR order parameters in proteins. J Am Chem Soc 1992;114:5341–5344.
44. Wlodawer A, Nachman J, Gilliland G, Gallagher W, Woodward C. Structure of form III crystals of bovine pancreatic trypsin inhibitor. J Mol Biol 1987;198:469–480.
45. Widmer H, Widmer A, Braun W. Extensive distance geometry calculations with different NOE calibrations: new criteria for structure selection applied to sandostatin and BPTI. J Biomol NMR 1993;3:307–324.
46. Abseher R, Horstink L, Hilbers CW, Nilges M. Essential spaces defined by NMR structure ensembles and molecular dynamics simulation show significant overlap. Proteins 1998;31:370–382.
47. Vriend G, Sander C. Quality control of protein models: directional atomic contact analysis. J Appl Crystallogr 1993;26:47–60.
48. Sippl MJ. Recognition of errors in three-dimensional structures of proteins. Proteins 1993;17:355–362.
49. Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Crystallogr 1993;26:283–291.
50. Brünger AT, Clore GM, Gronenborn AM, Saffrich R, Nilges M. Assessing the quality of solution nuclear magnetic resonance structures by complete cross-validation. Science 1993;261:328–331.
51. Nilges M, Clore GM, Gronenborn AM. Determination of three-dimensional structures of proteins by hybrid distance geometry-dynamical simulated annealing calculations. FEBS Lett 1988;229:317–324.
52. James TL. Relaxation matrix analysis of two-dimensional nuclear Overhauser effect spectra. Curr Opin Struct Biol 1991;1:1042–1053.
53. Case D. New directions in NMR spectral simulation and structure refinement. In: van Gunsteren WF, Weiner PK, Wilkinson AJ, editors. Computer simulation of biomolecular systems: theoretical and experimental applications. Vol 2. Leiden: Escom; 1993. p 382–406.
54. Bonvin AMJJ, Boelens R, Kaptein R. Determination of biomolecular structures by NMR: use of relaxation matrix calculations. In: van Gunsteren WF, Weiner PK, Wilkinson AJ, editors. Computer simulation of biomolecular systems: theoretical and experimental applications. Vol 2. Leiden: Escom; 1993. p 407–440.
55. Bonvin AMJJ, Brünger AT. Conformational variability of solution nuclear magnetic resonance structures. J Mol Biol 1995;250:80–93.
56. Bonvin AMJJ, Brünger AT. Do NOE distances contain enough distance information to access the relative populations of multiconformer structures? J Biomol NMR 1995;5:72–76.
57. Bernstein FC, Koetzle TF, Williams GJB, et al. The protein data bank: a computer-based archival file for macromolecular structures. J Mol Biol 1977;112:535–542.