Forcefield Based Simulations



3       Preparing the Energy Expression and the Model

Many forcefields allow you a great deal of flexibility with respect to how atom types are assigned to atoms, which terms of the energy function are used, and how the simulation engine applies the forcefield. You can also perform computational experiments by using alternative functional terms and applying constraints and restraints to your model.

Who should read this chapter

Although largely automated model preparation and energy expression setup is possible for simple systems, you should read this chapter if you want to perform the best simulations possible. You need to read this chapter if you want or need to know about:

This chapter explains

Using forcefields

Selecting forcefields

Assigning forcefield atom types and charges

Parameter assignment

Using alternative forms of energy terms

Applying constraints and restraints

Modeling periodic systems

Handling nonbond interactions

Related information in this book

Forcefields presents the functional forms of energy expressions and describes the forcefields that are available in Molecular Simulations products.

The atom types defined for each forcefield are listed under Forcefield Terms and Atom Types.

The files that specify the forcefields are described in detail in separate documentation.

Table 5 . Finding information in Preparing the Energy Expression and the Model section

If you want to know about: Read:
Charge assignment.  
Assigning charges.  
Missing parameters.  
Automatic assignment of values for missing parameters.  
Custom parameters.  
Manual parameter assignment.  
Editing forcefields.  
Editing a forcefield.  
Hydrogen bond terms.  
Hydrogen bonds and hydrogen-bond terms.  
Criteria for defining hydrogen bonds.  
Hydrogen bonds and hydrogen-bond terms.  
Optimizing or simulating only part of a model.  
Applying constraints and restraints.  
Reducing computational costs.  
Applying constraints and restraints.  
Forcing a model towards a desired conformation.  
Applying constraints and restraints.  
Images of models in periodic systems.  
Modeling periodic systems.  
Relation between Cartesian and crystal axes.  
Figure 7. Relationship between Cartesian coordinate system (xyz) and periodic system (abc) in Discover and CHARMm.  
Solvent molecules.  
Minimum-image model; Explicit-image model.  
Combination rules for van der Waals terms.  
Combination rules for van der Waals terms.  
Dielectric constants.  
The dielectric constant and the Coulombic term.  
Neighbor lists.  
Neighbor lists and buffer widths.  
Charge groups, functional groups.  
Charge groups and group-based cutoffs.  
Cell multipoles, nonperiodic systems.  
Cell multipole method.  
Ewald sums, periodic systems.  
Ewald sums for periodic systems.  

Specific information

For specific information on procedures, please see the manual for the molecular modeling program and/or simulation engine that you are using (see Available documentation):


Using forcefields

Graphic interface mode

All MSI's simulation engines and forcefields can be used through at least one graphical molecular modeling interface (Cerius2, Insight II, QUANTA, see Table 1).

Standalone mode

The Discover and CHARMm programs can also be run in a command-based, standalone mode with input from a text interface and/or a script and other input files.

Mixed-mode use

For Discover and CHARMm, you can optionally perform some tasks through the appropriate graphical interface and others in standalone mode. For example, you might want to prepare model structure and command input files with the graphical interface, save both types of files, edit the command input file with a text editor so as to perform some complex simulation, start the run in standalone mode, and then analyze the results with the aid of one of the graphical interfaces. In addition, facilities exist for directly entering specific BTCL or CHARMm commands from the Insight or QUANTA interface and for reading a BTCL or CHARMm command input file into the Insight, Cerius2, or QUANTA interface.

Additional information

How to run Discover and CHARMm in standalone mode is documented separately (see Available documentation).

General procedure

Regardless of whether simulation engines are run through the graphical interface, in standalone mode, or in some combination of both modes, the general sequence of activities for doing forcefield-based calculations is as follows.

Important

For specific instructions, see the documentation for the appropriate simulation engine and/or molecular modeling program (see Available documentation).

1. Read in the forcefield--Based on the type of model and the scientific problem that you want to simulate, decide which forcefield to use (see Forcefields). If it is not the default forcefield for the molecular modeling program you are using, you need to specify the desired forcefield.

The forcefield parameter files (which contain parameters that specify force constants, equilibrium geometries, van der Waals radii, and other data needed for calculating energies) are read in as part of this process.

2. Prepare the model:

a. If necessary, read in monomer/residue definitions--In most cases, the molecular modeling program automatically does this for you, or (depending on the type of model) it is not necessary.

Information about residues or monomers, the basic chemical units that comprise many models, is stored in topology (or library, dictionary, or template) files. The atoms, atomic properties, bonds, bond angles, torsion angles, improper torsion angles, hydrogen bond donors, acceptors, and antecedents, nonbond exclusions, and charge increments are all specified on a per residue basis.

If this information is not included in a structure or model file you intend to read in (for example, a Brookhaven Protein Database file), then you may have to specify the appropriate topology file.

b. Read in or construct a model--Read in a model from an appropriate file or construct it using the builder functionality in the molecular modeling program. Make sure your final model is correct with respect to atom connectivity, hybridization, bond orders, valences, etc.

c. Assign forcefield atom types and charges to each atom in your model (see also Assigning forcefield atom types and charges). If you are using UFF, calculate fractional bond orders. These are largely automated processes.

d. If necessary, define charge groups--The need for charge groups depends on the type of model and the scientific problem that you want to simulate. Some calculations are impossible without this kind of information, others can be significantly speeded up by supplying it (see also Charge groups and group-based cutoffs).

e. Read in or generate Cartesian coordinates--In most cases, the molecular modeling program automatically does this for you when you save a model that you have built or read in from a non-MSI type of file.

3. Set up the energy expression--You may want to use alternative terms in the energy expression, use or avoid using certain default terms, specify how nonbond interactions are handled, apply constraints or restraints (biases) to your model, etc. (see also this chapter).

4. Set up the calculation--Unless you want your calculation to run under the default conditions, you need to specify items such as which minimization algorithm(s) and what termination criteria to use (see also Minimization, Molecular Dynamics, and Free Energy).

Most programs also allow you to control your calculation by specifying various nondefault conditions. You may want to use a more robust minimizer for a highly strained model, then switch to a more accurate one for the final stages of computation. Some forcefield engines allow sophisticated if-tests and conditional branching. You can determine what will happen if certain parameters are not found (see also Parameter assignment). You can also send jobs that are time consuming to some other (faster) computer or run them in the background.

5. Specify output--If desired, specify nondefault kinds or amounts of output.

6. Run the calculation.

7. Analyze the results--The molecular modeling programs provide analysis functionality that allows you to view your results in the form of graphs and tables, as well as by graphicallly displaying the final (and/or intermediate) conformations of the model.


Selecting forcefields

For details on how to select a forcefield in the molecular modeling program that you are using, please see the appropriate specific documentation (see Available documentation). A brief summary:


Assigning forcefield atom types and charges

Who should read this section

You should read this section if you want to understand what atom types are, how types and charges are assigned automatically, and how you can make your own atom type or charge assignments.

In addition, if you want to understand parameter assignment (Determination of which parameters are used with which atom types) and/or edit the forcefield parameters (Manual parameter assignment), you need to understand something about atom type assignment first.

Availability

All molecular modeling programs supplied by MSI perform automatic and/or semi-automatic atom-type and charge assignment (which needs to be re-done if you switch to a different forcefield). Please see the guidebook for the appropriate molecular modeling program for details on how to assign atom types and charges for the simulation engine you are using (see Available documentation).

What are atom types in forcefields?

The simulation engine needs the forcefield atom type of each atom in the model in order to determine which forcefield parameters to use. Forcefield parameters apply to particular combinations of atom types as specified by the forcefield.

Relation between forcefield atom types and chemical atoms

The forcefield atom types are related to the microchemical environment of the atoms in a way defined by the particular forcefield. For example, a methane model has only two atom types, one for the carbon and one for the hydrogens, even though each of the atoms may have a distinct atom name for labeling purposes. The hydrogen atoms are equivalent by symmetry; therefore, they would all have the same atom type in any forcefield.

As a more complicated example, consider propane, which has four distinct types of atoms: methyl carbon atoms, methyl hydrogen atoms, a methylene carbon atom, and the methylene hydrogens. In principle, a forcefield could consider these to be four distinct atom types, but in practice, the chemical difference between the carbon atoms or between the hydrogen atoms is very small, so in most forcefields the carbon atoms are all assigned the same atom type, and all the hydrogens are assigned a second atom type.

Assigning atom types to a model

Atom types and charges supplied by the structure file

Atom types are assigned by the simulation engine or the molecular modeling program. Atom types are automatically assigned by using a set of rules that link the type of an atom to its element type and its chemical microenvironment (for example, the number and nature of connected atoms). Different forcefields use different atom types and atom-typing rules, which are contained in a residue library or the forcefield file.

The atom type information can also be supplied by a molecular data file such as an .msi file (OFF), an .mdf file (Discover and OFF), or an RTF (or PSF) file (CHARMm). These structure files are typically created in the Cerius2, Insight, or QUANTA molecular modeling program.

To make sure that atom types are assigned:

Charge information also is saved in the structure file.

To assure that you use the most appropriate atom types in your studies, you should always check the assigned atom types against the appropriate table under Forcefield Terms and Atom Types. In most cases, Cerius2 and Insight automatically assign the atom types. However, these assignment engines of course require the models to be correctly built. One of the most critical types of information is the bond order, which should be set before the forcefield is assigned.

Atom typing in different modeling programs

In Cerius2 and Insight, atoms with unassigned atom types are labelled with question marks when you label the model according to the atom type (FFTYPE or potential type).

Important

A newly assigned atom type (including associated parameters such as mass and charge) replaces any previously assigned or calculated value.

Important

Currently, the automatic atom-typing engines cannot distinguish a metal atom from a metallic ion. Hence, by default, "metal" atom types are assigned by the automatic typing engines. So for metal ions, you need to assign the formal charges and atom types by hand.

Assigning charges

Charges (when available) are generally assigned at the same time as the forcefield atom type (see Assigning atom types to a model).

An atom-type charge is simply a fixed value associated with the atom types. Overall neutrality of a model is not necessarily achieved by assigning forcefield atom types. You may prefer to assign charges specifically. (The exact method depends on the molecular modeling program being used.)

Importance of correct charge assignment

Electrostatic interactions play a critical role in determining the structures of inorganic systems and in defining the packing of organic molecules.

Many forcefields already include charges

Forcefields that include Coulombic terms generally already include standard charges (or "bond increments") associated with the atom types. These forcefields have been parameterized with nonzero atom type charges or charge increments (Table 6) and therefore you usually just assign charges automatically when you do atom typing, instead of having to assign specific charges:

Table 6 . Forcefields parameterized with nonzero atom charges or bond increments

Forcefield Engine
CFF91-95, CFF, PCFF, COMPASS  
OFF, Discover  
CVFF  
OFF, Discover  
bks1.01  
OFF  
burchart1.01  
OFF  
burchart1.01-DREIDING2.21 (not all atom types)  
OFF  
burchart1.01-UNIVERSAL1.02 (Burchart atom types only)  
OFF  
glassff_1.01  
OFF  
MMFF93  
CHARMm1  
msxx_1.01  
OFF  
CHARMm  
CHARMm  
1 CHARMm as run through the Cerius2·MMFF module, not in QUANTA or standard CHARMm.

Important

If you want to assign charges different from those in the forcefield, you need to assign the charges after atoms typing (and automatic charge assignment) is done.

Finding charges, if needed

If you need to specifically assign charges, most relevant modules allow you to set atomic charges directly or specify an overall net charge for the whole structure using charge editing functions.

For small models, you can obtain values for charges by using an ab initio or semiempirical quantum chemistry module (for example, MOPAC).

For larger, distorted, models and when charge assignment is done by the charge equilibration method (in Cerius2), you usually need to perform a short minimization before assigning charges. This is because charge equilibration calculations on distorted models can lead to assignment of unrealistic charges.

Charge assignment in different modeling programs


Parameter assignment

Who should read this section

If you are a novice user or routinely run relatively simple calculations on relatively simple or standard models, you do not need to read this section. However, if, for example, an error message informs you of missing parameters or you want to customize your energy expression for an atypical model, then you do need to understand how the simulation engines determine what parameters are used for which atoms, bonds, angles, etc.

You should also understand something about atom type and charge assignment (Assigning forcefield atom types and charges) to make effective use of this section.

Determination of which parameters are used with which atom types

Before calculating the energy of a model, the simulation engine must construct the complete energy expression for the model by associating the correct forcefield parameters with the appropriate atoms and other coordinates. For example, methane has one type of bond (C-H) and one type of bond angle (H-C-H). The program must create a list of the four actual bonds and then associate the C-H bond parameters with each. Similarly, there are six H-C-H angles, but they are characterized by the same set of parameters.

It is important to understand how the parameters from the forcefield are associated with individual internal coordinates, because the energy, derivatives, structures, and almost all other properties calculated by the program depend on these forcefield parameters and the way in which they are associated with the internal coordinates. The following sections describe two facets of this process: atom type equivalences and wildcards in parameter definitions.

Atom type equivalences

Chemically distinct atoms often differ in some, but not all, of their forcefield parameters. For example, the bond parameters for the C-C bonds in ethene and in benzene are quite different, but the nonbond parameters for the carbon atoms are essentially the same.

In Discover, rather than duplicating the nonbond parameters in the forcefield parameter file, atom type equivalences are used to simplify the problem. In the example, the phenyl carbon atom type is equivalent to the pure sp2 carbons of ethene insofar as the nonbond parameters are concerned.

The Discover program recognizes five types of equivalences for each atom type: nonbond, bond, angle, torsion, and out-of-plane. Crossterms such as bond-bond terms have the same equivalences (insofar as atom types are concerned) as the diagonal term of the topology of all the atoms defining the internal coordinates. For the bond-bond term, this means that the atom type equivalences for angles would be used.

The actual format of the equivalence data in the forcefield parameter file is detailed in the File Formats documentation. For the equivalences used in any particular forcefield, you should examine the actual forcefield parameter file for current information.

CHARMm PRM files handle equivalences for nonbond parameters by using partial wildcards, for example, N* means that the associated nonbond parameters apply to any nitrogen type that is not specifically parameterized.

For forcefields in Cerius2·OFF, wildcards are usually used.

Wildcard atom types in the parameter file

For some internal coordinates, the parameters do not depend strongly on the specific atom types of one or more atoms. For example, the parameters of torsion terms may not be strongly affected by the end atoms. This means that the torsion parameters are essentially defined by the central bond rather than its substituents.

The forcefield engines allow wildcard atom types to conveniently handle this type of situation. This special atom type, indicated by an X in CHARMm .PRM files and in relevant Cerius2 forcefield files and by an asterisk (*) in Discover forcefield files, matches any atom type when the forcefield engine is searching for the parameters to associate with a particular internal coordinate. (In CHARMm, this applies only to bond, angle, torsion, and improper-torsion parameters.)

Automatic assignment of values for missing parameters

Availability

Table 7 . Automatic parameter assignment in MSI's molecular modeling programs

Modeling program Automatic parameters Comments
Cerius2  
yes  
not for all forcefields  
Insight II (CDiscover)  
yes  
but not for AMBER forcefield  
insight II (FDiscover)  
yes  
controllable in standalone mode only, not for AMBER  
QUANTA  
yes  
but only if the PSF Generator is used  

What happens if parameters are not found

Some classic and second-generation forcefields are not completely parameterized for all their atom types. (For rules-based forcefields (Rule-based forcefields broadly applicable to the periodic table), all parameters are generated according to rules rather than read from the forcefield file.) When parameters for classic and second-generation forcefields are not available, one of several things can happen, with varying consequences:

Temporary patches for missing parameters; precautions

A forcefield may include automatic parameters for use when better-quality explicit parameters are not defined for a particular bond, angle, torsion, or out-of-plane interaction. These parameters are intended as temporary patches, to allow you to begin calculations immediately. While MSI has made every effort to ensure that the automatic parameters used in CVFF, the CFF family of forcefields, and CHARMm produce reasonable geometries for a wide variety of models, we cannot guarantee that the automatic parameters are appropriate in every instance. You therefore should always carefully evaluate any results that you obtain using automatic parameters.

How missing parameters are supplied

Discover automatically assigns values for parameters missing from the CFF and CVFF forcefields by switching to an automatic forcefield. This switching is accomplished with an equivalence table that converts the original set of atom types to a smaller set of generic atom types.

Cerius2·OFF behaves similarly.

QUANTA's parameter chooser looks through the existing CHARMm parameters for similar cases and averages them all to come up with suggested values.

Discover's automatic forcefield

In the automatic forcefield in the Discover program, the atom types for bonds, angles, torsions, and out-of-plane deformations have different levels of specificity. For example, while bond-stretching parameters are determined by the atom types of both atoms; angle-bending and torsion parameters may be determined by the atom type of only the central atom(s). A wildcard (*), representing any type of atom, is used for the end atoms of torsions and angles.

In some cases, angle-bending parameters are specified by two atoms (rather than only the central atom). This can lead to ambiguity--for example, C-C-N (if not explicitly defined in the forcefield) can be associated with c_-c_-* or with n_-c_-*. The underscore in this example is used to denote the generic (or automatic) atom types. Here, a one-sided wildcard (*#, where # is an integer indicating the precedence), is used for one of the end atoms in an angle.

Cerius2·OFF handles precedence with an additional field (P0, P1, ... P9) rather than wildcards.

In interpreting the wildcard, the Discover program and the Cerius2·OFF module use the parameter for which the integer is lower. The parameters for a C-C-N angle would, for example (if not explicitly defined in the forcefield), be taken from those for atom types n_-c_-*6 rather than c_-c_-*7, because 6 is smaller than 7.

An example

As an example, the parameters for the angle oh-c"-c" in oxalic acid (Figure 3) are not present in CFF91.

Figure 3 . Oxalic acid structure and CFF91 atom types

When the automatic parameter assignment process is used in Discover, it looks at the auto-equivalence table in the cff91.frc file to find the generic atom types for this angle (indicated in bold type):


#auto_equivalence     cff91_auto

! Equivalences
! -----------------------------------------------------------------------------
!Ver Ref Type NonB Bond Bond Angle Angle Torsion Torsion OOP OOP
Inct End Atom Apex Atom End Atoms Center Atoms End Atom Center Atom
!--- --- ---- ---- ---- --- -------- ----------- --------- ------------ -------- -----------
2.0 2 c" c" c" c'_ c_ c'_ c_ c'_ c_ c'_
2.0 2 oh o o_ o_ o_ o_ o_ o_ o_ o_
...

Thus, for parameter assignment purposes only, atom type oh is reassigned to o_, c" is reassigned to c'_ for the apex atom, and c" is reassigned to c_ for the end atom. The parameters for the oh-c"-c" angle could be taken from either the o_c'_*7 or the c_c'_*9 lines in the quadratic_angle section of the cff91.frc file--o_c'_*7 is chosen because 7 is lower than 9:



#quadratic_angle cff91_auto

> E = K2 * (Theta - Theta0)^2

!Ver Ref I J K Theta0 K2
!---- --- ---- ---- ---- --------- ---------
2.0 2 c_ c'_ *9 120.0000 40.0000
2.0 2 n_ c'_ *8 120.0000 53.5000
2.0 2 o_ c'_ *7 110.0000 122.0000
2.0 2 o'_ c'_ *6 120.0000 68.0000
2.0 2 h_ c'_ *2 110.0000 55.0000
...

Manual parameter assignment

Notification of missing parameters

If parameter(s) for a potential type are not present in the forcefield file and are not generated when the energy expression is set up, an appropriate error message is written to the textport or text window of your molecular modeling program. (In QUANTA, this occurs only if you use QUANTA's Applications/Builders tools to construct your model; there is no warning for models that are read in from some other application or database.)

Missing and/or automatic parameters are also listed in an output file after the completion of a simultion run. You can find out if parameters are missing before starting your run:

If terms are missing, in Cerius2·Discover the energy expression is set up unless one or more of the Stop check boxes in the Discover Parameters control panel (accessed by selecting Forcefield/Parameters in the DISCOVER card) are checked.

If terms are missing, in Cerius2·OFF the energy expression is set up by only if Ignore undefined terms is checked in the Energy Expression control panel (accessed from the Energy Expression/Setup card menu item on the OPEN FORCE FIELD card).

Obtaining new parameters

If the forcefield you are using is similar in functional form and atom typing to another forcefield which does contain the desired parameters, you may be able to use those parameters in your forcefield, at least on a trial basis. You may also be able to obtain new parameters (or help in deriving them yourself) from the scientific literature (see References) or from the developers of the forcefield you are using.

Editing a forcefield

Expert users can edit MSI forcefields in different ways to customize them to their needs or to create new forcefields.

Editing through a graphical interface

MSI's principal molecular modeling programs include forcefield editors (see Available documentation):

You can change the functional form of terms in rules-based forcefields (see Rule-based forcefields broadly applicable to the periodic table) by adding an explicit term. Any explicit term is always used in preference to a generated term.

Important

The Cerius2·FFE module cannot be used to modify the CFF or CVFF forcefield files that are included with Cerius2--if you want to use customized versions of these forcefields, you need to modify them with Insight II 4.0.0 or as described in the Discover documentation.

Manual editing

Expert users can edit the files that define many forcefields with a text editor:

The potential template rule file used by Insight can also be edited. You may add new atom types by making additions to this file (refer to the Insight II documentation for a complete description).

Important

For forcefields accessed through Cerius2, we strongly recommend that you never edit these files by hand. Please use the Force Field Editor module. This is important because some forcefield values are linked to others and only the Force Field Editor reliably assures that related values are modified in a coordinated way.


Using alternative forms of energy terms

The energy expression is the heart of the forcefield. Potential energy is described in the energy expression as the sum of various terms that indicate the energy costs of bond stretching, angle bending, etc. Not all terms are present in all forcefields, and the functional forms of the terms vary among forcefields (see Forcefields).

This section and the following main section (Applying constraints and restraints) describe energy term preferences that you can set and restraint terms that can be optionally included in the energy expression.

Who should read this section

If you are a novice user, you should alter the default energy terms and parameters as little as possible. One exception to this recommendation is nonbond methods (see Handling nonbond interactions), where you should choose the method according to the model type rather than necessarily accept the default settings.

Availability

Table 8 . Modifications of energy expression in MSI's simulation engines

term modification engine (restrictions) details
any  
removal  
FFE1, CHARMm  
here  
all of a type  
scaling  
FDiscover, CDiscover  
here  
all of a type  
editing  
FFE, Insight 4.0.02  
here  
bond stretching  
Morse vs. harmonic  
FDiscover, CDiscover (CVFF)  
here  
bond stretching  
scaling  
OFF3 (C-H bonds)  
here  
torsion twisting  
scaled, averaged, or first found  
OFF  
here  
out-of-plane movement  
averaged or first found  
OFF  
here  
van der Waals interactions  
Lennard-Jones vs. quartic  
FDiscover (standalone)  
here  
hydrogen bond interactions  
if used, how set up  
OFF, FDiscover (standalone, AMBER), CDiscover (standalone, AMBER), CHARMm  
here  
crossterms  
removal  
FDiscover, CDiscover (CVFF)  
here  
1-3 or bond stretching-angle bending interactions  
Urey-Bradley term vs. crossterm  
OFF, CHARMm (standalone only)  
here  
1 The Force Field Editor module in Cerius2.

2 The Forcefield/Edit_FF parameter block in the Builder and other modules can be used to edit parameters in all terms except crossterms for AMBER, CVFF, and CFF. (Parameter block not present in Insight 97.0.)

3 The Open Force Field module in Cerius2.

Most methods for changing the functional form of the energy expression are available via the graphical UIFs:

Discover and CHARMm also offer additional functionality when run in standalone mode.

Removing terms from the energy expression

Why remove terms

You may, for example, want to save computation time during the early stages of minimization of a model that is far from its equilibrium conformation by not calculating any cross terms. Or you may have found that certain terms are insignificant with respect to the purposes of your study.

How to remove terms

You can effectively remove terms from the energy expression in several ways:

Scaling or editing any selected type of term

Uses

The contributions of various terms in the potential energy expression to the total energy can be scaled up or down and/or otherwise edited. This can be useful, for example, in the early stages of minimizing very "bad" structures, where large contributions by certain terms might interfere with convergence.

How it works

The Cerius2·Force Field Editor module allows you to directly change the parameters in any term (e.g., for all C_3-H_ bond terms, but not for all bond terms) in the forcefield. The Energy Terms control panels include entry boxes for all relevant parameters.

In the Discover program, scaling applies to an entire class of energy terms (e.g., all bond terms) in the energy expression. The force constants (or some other parameters) for the chosen class of terms are multiplied by some constant factor. For example, all bond interactions can be scaled by one factor and all van der Waals radii by another.

In the Insight 4.0.0 molecular modeling program, you can use the Forcefield/FF_Edit parameter block in the Builder and other modules to directly change the parameters in any term except crossterms (e.g., for all C-H bond terms, but not for all bond terms) in the forcefield. The *_Par parameter blocks accessed through the editor include entry boxes for all relevant parameters.

Alternative bond terms

With the CVFF forcefield in Discover, you can choose to use quadratic bond terms rather than Morse bond terms. The Morse term can allow bonds to stretch to unrealistic lengths (Figure 2), so you may get quicker convergence from a hightly distorted configuration if you replace the Morse term by a harmonic term. You do this by specifying the "no Morse" version of CVFF.

Scaled torsion terms

If all torsions about a common bond were simply summed, the torsion energy term could be too large. Cerius2·OFF therefore allows several methods for scaling torsion terms (Discover and CHARMm automatically handle torsions optimally, because of how their forcefields are parameterized):

Behavior in Cerius2·OFF

Inversion terms

The inversion, improper, or out-of-plane torsion term represents the energy involved in inverting a chiral center or otherwise changing this out-of-plane angle.

In Cerius2·OFF, you may use the first inversion term found or the average of all inversion energies, but the first approach is not recommended.

Nonbond functional form

In the Cerius2·Force Field Editor module, you can use the van der Waals Energy Terms control panels to choose among several nonbond functional forms. (These are listed in the online help, which is accessed by right-mouse clicking over the Function popup.) However, you would have to change the relevant parameters as well, if you wanted good results.

You can use the DSL (the FDiscover command language) set command to choose between the usual Lennard-Jones 6-12 or 6-9 potential (e.g., term 10 in Eq. 23) and a quartic form:

Eq. 26        

The quartic form is useful when you need to eliminate bad van der Waals contacts, but the second derivatives are not calculated.

Hydrogen bonds and hydrogen-bond terms

Who should read this section

Many forcefields, especially the newer ones, fully account for hydrogen bonds by other terms in the forcefield and so do not have or require specific terms for handling hydrogen bond interactions.

However, some older forcefields include specific terms for hydrogen bonding (e.g., older versions of AMBER). Others (e.g., CHARMm) allow you to use a hydrogen bond term if you want (but MSI does not recommend this). If you are using a forcefield with explicit hydrogen bond terms, you should read this section.

Lack of hydrogen bond terms is an asset

If specific hydrogen bonds are required, generation of a list of hydrogen bonds is a major step in evaluating the energy of a system. This process involves looking at all possible pairs of hydrogen bond donors and acceptors and selecting those that meet certain criteria (Figure 5):

Since hydrogen bond interactions depend on both angle and distance, both angle cutoffs and distance cutoffs must be specified for a switching function (see Nonbond cutoffs). A switching, or spline, function (Figure 15) is needed to conserve energy by smoothing transitions over the cutoffs.

Specifying the criteria

In Cerius2·OFF, the hydrogen bond criteria can be changed by using the Hydrogen Bond Preferences control panel (accessed by selecting the Energy Terms/Hydrogen Bond menu item from the OPEN FORCE FIELD card). This control panel also allows specification of switching function ("spline") parameters.

In Discover, default hydrogen bond criteria are contained in the forcefield file (amber.frc). CDiscover allows you to use the BTCL forcefield scale command to scale hydrogen bond terms (if they exist). In FDiscover, they can be changed by editing the command input file to change the variables HBDIST and HBANGL.

In CHARMm, you can change the hydrogen bond criteria with the CHARMm Update Parameters dialog box, which is accessed from the CHARMm/Update Parameters menu item. This dialog box also allows specification of switching function parameters for hydrogen bonds. Setting the Update Frequency to 0 (the default) effectively omits the hydrogen bond term from the potential energy expression. You can also omit explicit hydrogen bond terms by using the CHARMm/Energy Terms menu item.

Bond-angle cross terms vs. Urey-Bradley terms

An alternative or supplement to bond-angle interactions is the Urey-Bradley term, which accounts for 1-3 interactions between two atoms that are bonded to a common atom.

In Cerius2·OFF, use the Energy Terms Selection control panel to specify whether to use the Urey-Bradley term (assuming it is available in the current forcefield).

In CHARMm, the Urey-Bradley term, if present, can be omitted from the energy expression (standalone only) or can be specified in the parameter file (ANGLE statement).


Applying constraints and restraints

Why read this section

Constraints and restraints allow you to focus the calculation on a region or conformation of interest and also to set up computational experiments. Such experiments are one of the primary uses of molecular modeling, allowing you control over a model at the atomic level. Several examples are described under When to use constraints/restraints.

Restraints vs. constraints

The seminal difference between a constraint and a restraint is that a constraint is an absolute restriction imposed on the calculation, while a restraint is an energetic bias that tends to force the calculation toward a certain restriction (even though many people use these terms as if they were interchangeable).

Availability

Table 9 . Constraints and restraints in MSI's simulation engines

constraint/restraint type engine1 details
atom  
fixed (constraints)  
OFF2, FDiscover, CDiscover, CHARMm  
here  
template forcing  
harmonic (Eq. 28) restraint  
FDiscover  
here  
tethering and template forcing  
quadratic (Eq. 29) restraint  
CDiscover3  
here  
tethering  
harmonic (Eq. 28) restraint  
FDiscover  
here  
tethering  
mass-weighted harmonic (Eq. 30) restraint  
CHARMm  
here  
quartic droplet  
harmonic (Eq. 31) restraint  
CHARMm  
here  
distance  
harmonic (Eq. 32) restraint  
OFF  
here  
distance  
quadratic (Eq. 29), flat-bottomed (Eq. 34), or cosine (Eq. 36) restraint  
CDiscover3  
here  
distance  
harmonic (Eq. 32) or flat-bottomed (Eq. 33) restraint  
FDiscover  
here  
distance  
flat-bottomed (Eq. 35) restraint  
CHARMm  
here  
dynamics  
RATTLE algorithm (constraints)  
CDiscover  
here  
dynamics  
SHAKE algorithm (constraints)  
CHARMm  
here  
dynamics  
consensus dynamics (Eq. 28) (standalone only)  
FDiscover, CDiscover  
here  
angle  
harmonic (Eq. 37) restraint  
OFF  
here  
angle  
quadratic (Eq. 29), flat-bottomed (Eq. 34), or cosine (Eq. 36) restraint  
CDiscoverc  
here  
torsion  
harmonic (Eq. 38) restraint  
OFF  
here  
torsion  
quadratic (Eq. 29), flat-bottomed or J3 dihedral (Eq. 34), cosine (Eq. 36), cis (Eq. 39), trans (Eq. 40), or cis/trans (Eq. 41) restraint  
CDiscoverc  
here  
torsion  
flat-bottomed (Eq. 33) restraint (standalone only)  
FDiscover  
here  
torsion  
cosine (Eq. 42) or harmonic (Eq. 38) torque (one of these is standalone only)  
FDiscover  
here  
torsion  
harmonic (Eq. 38) restraint  
CHARMm  
here  
inversion  
harmonic (Eq. 43) restraint  
OFF  
here  
chiral  
flat-bottomed (Eq. 34) restraint  
CDiscover3  
here  
out-of-plane  
quadratic (Eq. 29), flat-bottomed or J3 dihedral (Eq. 34), or cosine (Eq. 36) restraint  
CDiscover3  
here  
out-of-plane  
harmonic (Eq. 43) restraint (standalone only)  
FDiscover  
here  
1 The standalone modes of running simulation engines may give access to additional constraints and restraints--please see the appropriate documentation.

2 The Open Force Field module in Cerius2.

3 Not available yet in the Cerius2·Discover module; restraints applied with CDiscover (in Insight and standalone modes) can also be scaled.

Most restraints and constraints are available via the graphical UIFs:

Discover and CHARMm offer additional restraint and constraint functionality when run in standalone mode.

When to use constraints/restraints

Constraints and restraints are often used to control and direct the minimization.

Fixed-atoms example

For example, you can fix some atoms in space, not allowing then to move. For example, part of the structure of a molecule may have been well solved experimentally, but the structures of other areas are less clear. Or you might want to keep parts of your model (e.g., solvent molecules) rigid to decrease computational costs.

Torsion-rotation example

You can add extra terms to the energy expression to restrain or bias the system in certain ways. For example, if you are investigating the adiabatic energy barrier to rotation about a bond, you would restrain the value of that torsion and minimize the structure. Repeating this procedure for a set of torsion values in the range 0°-360° yields a complete energy profile for rotation about the bond. A similar process is used to generate phi/psi maps and other multidimensional energy surfaces in studies of model conformation.

Docking example

If a substrate is being docked onto an enzyme and a specific hydrogen bond between the enzyme and the ligand is thought to be involved in binding, the donor and acceptor atoms can be pulled together to provide a docking coordinate. In this way, the results are not so dependent on the initial starting configuration, which may have been only a crude graphic alignment. In cases like this, the restraint is turned off at some point to make sure that the biased minimum is close to a true minimum.

Modeling incomplete models

Another example of the use of restraints is in modeling incomplete systems. Often, it is difficult or impossible to construct a realistic environment around parts of a model system. For example, only a partial structure of a large protein complex may be available, and some atoms must be restrained to stay near their initial crystal positions because they do not "feel" interactions with neighboring (missing) amino acids, membrane, or solvent. If the site of interest (for instance, a binding site for a competitive inhibitor) is well characterized but other parts of the enzyme are unknown or would require too much computation time if they were included, a limited study can still be carried out with the ends tethered to their crystal coordinates. Usually, these restraints are permanent parts of the model. The results of such calculations must be critically evaluated but can be valid if the ligand binding does not depend on interactions with missing pieces of the model or on conformational flexibility in the tethered regions.

Relaxing crystal structures

As a final example, tethering can be used to gently relax a crystal structure. Often, crystal coordinates, even if highly refined, have several strained interactions due to intrinsically disordered or poorly defined atomic positions, which, upon minimization, give rise to large initial forces. If these forces are not restrained, they can result in artifactual movement away from the original structure. The general approach is to progressively relax parts of the model in stages, starting with the least well determined atoms, until the entire system can minimize freely. The restraints are ultimately removed so that the final minimum represents an unperturbed conformation. It is usually not necessary to minimize to convergence at each stage--the object is to relax the most-strained parts of the system as quickly as possible without introducing artifacts.

Fixed atom constraints

Cost-saving

Fixed atoms are constrained to a given location in space; they cannot move at all. Fixed atoms reduce the expense of a calculation in two ways:

Important

The energy calculated by simulation engines is correct only to an arbitrary constant, depending on the model as well as the fixed atoms. Thus, only differences in energy between conformations of the same model having the same fixed atoms are meaningful.

Uses

Use atom constraints when you want to apply minimization or dynamics to part of a model, while keeping the remainder of the model fixed and rigid. For example, use atom constraints to quickly minimize a sorbate in a zeolite by fixing the atom positions of the zeolite frame and allowing only the sorbate atoms to move. Or fix all residues in a protein except for those in the active site.

Template forcing, tethering, quartic droplet restraints, and consensus conformations

Uses

Typical uses of these related types of restraints are to bias the conformation of one model towards that of another, to bias selected atoms towards their experimentally known positions, to restrain the core of a model while allowing its solvent-exposed constituents more freedom of movement, or to find an identical or close set of conformations that a group of related models can achieve.

Template forcing

To force the conformation of one model to be similar to that of a template model, a one-to-one correspondence between atoms in the template and in the moving structure is set up, and (for example) one of the following restraint terms is added to the energy expression:

Eq. 27        

or:

Eq. 28        

The term in Eq. 27 is proportional to the root-mean-square (rms) deviation of the analog atoms from the template atoms. (This form cannot be used with the Newton-Raphson minimizer in FDiscover.) The values obtained for the energy and the rms function depend on the value of the forcing parameter K. Typical values for this constant are in the neighborhood of 5 kcal Å-1. It is often instructive to look at the dependence of the energy and rms functions on the forcing parameter by making several determinations with different forcing parameters. If several runs (minimization or dynamics) are made, it may also be helpful to plot the energy as a function of the rms value. For tethered minimizations, a very large forcing constant (e.g., 2000.0 kcal Å-1) is often used to prevent significant movement of any of the tethered atoms.

Eq. 28 represents a conceptually more straightforward restraint, with each atom restrained by an isotropic spring to the position of its template atom. In either form, the summation is over a list of pairs of atoms to restrain: one from the moving model, and one from the template model. FDiscover uses this quadratic form by default.

The Ki in Eq. 28 are determined by the distance of atom i from the atom defining the origin as:

where rmin is the distance at which the tethering turns on, kmin is the initial force constant at that distance, rmax is the distance where the force constant reaches its maximum allowed value, and kmax is the maximum allowed force constant. If rmin and kmin are not given, the default values are zero. If rmax is zero, tethering uses a constant force constant of kmax.

In CDiscover, a simpler quadratic is used:

Eq. 29        

where V is any appropriate internal (bond length, angle, etc.--the same functional form is used for several types of restraints).

Advantages of each type of template-forcing restraint

The first form (Eq. 27) gives the best rms fit for the least energetic cost, but individual atoms may remain quite far from their template position. The second form (Eq. 28) restrains each atom individually, so each atom is forced toward its template partner. The resulting rms fit is not as good as that from Eq. 27, but no one atom is allowed to deviate as much as is possible with Eq. 27. The form in Eq. 28 also allows for a different force constant for each pair, which means that different atoms or classes of atoms can be treated differently.

Tethering

Tethering is the same as template forcing, except that the atoms are restrained to their original positions rather than to positions in a template structure. Both Eq. 27 and quadratic forms are applicable for tethering; however, Eq. 28 is used by FDiscover and Eq. 29 by CDiscover, because tethering is usually used to keep atoms from moving too far from their original positions.

CHARMm allows mass-weighted tethering by calculating an additional energy term for all atoms that are to be restrained. This term has the form:

Eq. 30        

Where Econs is the constraint energy, ki is the force constant, mi is the mass of atom i (if mass weighing is used) or 1, ri is the position of atom i, r0 is the reference position about which the atom is to be centered, and n is an exponent.

Quartic droplet restraint

The quartic droplet restraint term in CHARMm is designed to put the entire model into a "cage" by constructing a restraining sphere around a model. The potential is scaled so that atom positions furthest from the center of mass or the geometric center of the model have the greatest restraining force applied.

The quartic droplet restraint term is based on the center of mass (COM) or the center of geometry of the model. No net force or torque is introduced by the center of mass term. The potential function is:

Eq. 31        

FDiscover (standalone only) allows similar restraints within spherical shells.

Consensus dynamics

Consensus dynamics is used to find the consensus configuration of a set of analogs. In essence, all models in the set are treated as both moving molecules and templates.

Standalone FDiscover uses the harmonic template-forcing restraint (Eq. 28).

The database capability of CDiscover is used in settng up consensus dynamics calculations, using the restraint in Eq. 29.

General internal-coordinate restraints

In CHARMm, you can apply general internal coordinate restraints by applying restraints to all bonds, angles, and/or dihedral angles that have entries in an internal coordinates table. This facility is global, that is, not applicable to specific internal coordinates.

Distance and NOE restraints

Uses

Distance restraints are used to bias the distance between two atoms, bonded or not, toward a given value. Some uses are to cyclize linear models by bringing the ends closer together, dock different models, and fit distance data derived from NOE and other experiments.

Several functional forms for distance restraints: harmonic...

Several commonly used functional forms are supported.

One is a simple harmonic function, which in FDiscover has the form:

Eq. 32        

where K is a force constant, Rij is the current distance between the atoms, and Rtarget is the target distance. A large force constant tends to force the distance to be close to the target distance; a smaller force constant results in a correspondingly smaller bias.

In CDiscover, the quadratic form is the same, except that scaling is enabled (Eq. 29).

In Cerius2, the form is the same, except that K is multiplied by 0.5. Rtarget can be defined explicitly or automatically extracted from the model as the current distance between atoms.

...and flat-bottomed...

The second form is also harmonic, but it is separated into several piecewise continuous regions, resulting in a flat-bottomed potential (Figure 6). For FDiscover, the form is:

Eq. 33        

Figure 6 . Distance restraint function

E as a function of R, the distance between two atoms or dihedral angles, defined as in Eq. 33.  

For CDiscover, the flat-bottomed form is:

Eq. 34        

where V is any appropriate internal (bond length, angle, etc.--the same functional form is used for several types of restraints).

The restraining potential used in CHARMm is:

Eq. 35        

Where Rlim is the value of R where the force equals fmax.

...and cosine

CDiscover also allows a cosine form of restraint:

Eq. 36        

where V is any appropriate internal (bond length, angle, etc.) and n is the periodicity.

Advantages of the flat-bottomed functional form

It is not necessary for the flat-bottomed potential (Figure 6) to be symmetric. By appropriate definition of the points R1, R2, etc., any of the regions may be eliminated. For Eq. 33, the important regions are those from R1 to R2 and from R3 to R4, where a harmonic potential is applied, and the flat bottom from R2 to R3.

This form of the restraint allows a range of acceptable distances and is particularly useful for incorporating experimental distance information, such as those from NOE experiments, into a calculation. The flat bottom allows for experimental error in the determined distance. The two outer regions (Figure 6) have a constant gradient, which is useful for avoiding unreasonably large forces if the initial structure is far from the target value.

Distance and angle constraints in dynamics simulations

The RATTLE and SHAKE algorithms effectively remove very-high-frequency vibrations from consideration during dynamics simulations. Use of these algorithms can allow for a larger time step during simulation.

In CDiscover, the BTCL rattle command is used before the dynamics command to set up constraints in bonds, angles, or water molecules in a molecular dynamics simulation. It can be used to constrain bonds or any atom pairs to user-defined distances. It can be used to constrain angles spanned by two constrained bonds. In addition, it can be used to fix the geometry of water molecules so that the fixed-geometry water models SPC and TIP3P can be used in a simulation. This functionality is available in a limited way via the Calculate/Dynamics parameter block in the Discover_3 module of Insight (click More to display the Rattle toggle).

In CHARMm, SHAKE is used to constrain bond lengths and angles spanned by two constrained bonds during dynamics runs. (However, its use is recommended only for constraining all bonds in which one of the bonded atoms is a hydrogen.) The SHAKE algorithm cannot be used with the Newton-Raphson or ABNR minimizers (see Minimization).

Angle restraints

In Cerius2, an angle restraint can be applied to a group of any three atoms. The restraint is implemented such that:

Eq. 37        

Where: Ka is the angle force constant; is the angle between the selected atoms; and 0 is the desired restrained angle of the selected atoms. 0 can be defined explicitly or can be automatically extracted from the model as the current angle connecting selected atoms.

In CDiscover the default form of angle restraints is cosine (Eq. 36). Quadratic (Eq. 29) and flat-bottomed (Eq. 34) angle restraints can also be used.

Torsion restraints

Uses

Some uses of torsion restraints are to enforce chiral and prochiral centers, prevent cis-trans conversions, and fit NOE J-coupling constants from NMR experiments. Conversely, other uses are to force torsion rotation in order to perform phi/psi mapping, perform conformational searching, and induce conformational changes.

Functional forms

Several forms of torsion restraints are used in the literature and implemented in MSI's simulation engines.

Harmonic restraints, or periodic restraints (Eq. 42 with n = 1), are appropriate for forcing a torsion angle to a particular value. The periodic form with a periodicity greater than one is useful for restraining a torsion to one of several related angles. For instance, a threefold potential could keep a torsion either trans or at one of the two gauche conformations, depending on the starting conformation and the strength of the potential applied.

Implementation

In Cerius2·OFF, a torsion (dihedral) restraint can be defined among any group of four atoms. The restraint is implemented such that:

Eq. 38        

Where Kt is the torsion force constant; is the angle between the i-j-k and j-k-l planes; and 0 is the desired restrained angle of the selected atoms, which can be defined explicitly or automatically extracted from the model as the current angle connecting selected atoms.

In CDiscover, you can specifically restrain dihedrals to be cis:

Eq. 39        

or trans:

Eq. 40        

or either cis or trans:

Eq. 41        

You can also use the flat-bottomed function (Eq. 34) to apply J3 dihedral restraints to fit the results of NOE experiments. A plain cosine form (Eq. 36) and a quadratic form (Eq. 29) are also available. The torson involving any four atoms can be restrained.

In FDiscover, the functional forms include a simple harmonic form analogous to Eq. 32 and a piecewise continuous form like Eq. 33 with R interpreted as the angle, rather than the distance. Another form is the periodic function of Eq. 42:

Eq. 42        

where V gives the strength of the restraint, n is an integer periodicity, and 0 is the phase angle.

CHARMm uses a harmonic potential to restrict the motion of a dihedral angle to a value close to a reference position or to examine a series of different conformations when making potential energy maps.

Inversion, out-of-plane, and chiral restraints

Uses

Typical uses include prevention of changes in chirality or prochirality. (A molecule is chiral if no stable conformation of it can be superimposed on its mirror image--most chiral organic molecules can be described in terms of chiral centers, i.e., an atom that has four distinct substituents. Two chemically identical substituents on an otherwise chiral tetrahedral center are prochiral; in addition, sp2 hybridised planar systems with three different substituents are considered prochiral.)

Implementation

In Cerius2·OFF, an inversion (improper torsion or out-of-plane angle) restraint can be defined among any four atoms i, j, k, l, where i defines the inversion center. The restraint is implemented such that:

Eq. 43        

Where Ki is the force constant for the out-of-plane; is the angle between the i-j-l and i-k-l planes; and 0 is the desired restrained out-of-plane angle of the selected atoms, which can be defined explicitly or automatically extracted from the model as the current angle connecting selected atoms. There must be a real atomic center for the inversion.

The CDiscover program can impose a flat-bottomed chiral restraint (Eq. 34) to invert the chirality or force it to be R or S.

CDiscover can also impose a cosine (Eq. 36), quadratic (Eq. 29), or flat-bottomed (Eq. 34) out-of-plane restraint.

The FDiscover DSL language can be used to impose chirality and prochirality restraints having the same functional form as Eq. 43, where 0 is the out-of-plane angle corresponding to R or S.

Plane and other geometrical constraints and restraints

The BTCL language of CDiscover allows sophisticated geometric manipulation of molecular and other objects, including constraints and restraints, by means of the geometry, molGeom, restraint and other commands. A subset of this functionality is accessible in the Calculate/Geometric parameter block in the Insight·Discover_3 module.


Modeling periodic systems

Why read this section

Periodic boundary conditions refers to the simulation of models consisting of a periodic lattice of identical subunits. By applying periodic boundaries to simulations, the influence, for example, of bulk solvent or crystalline environments can be included, thereby improving the rigor and realism of a model.

Availability

Table 10 . Periodic boundary methods in MSI's simulation engines

periodicity engine details
minimum image  
OFF1, FDiscover, CHARMm  
here  
explicit image  
FDiscover, CDiscover, CHARMm  
here  
crystal simulations  
OFF, CDiscover, CHARMm  
here  
bonds across boundaries  
OFF, CDiscover, CHARMm  
here  
1 The Open Force Field module in Cerius2.

Most methods for controlling the treatment of periodic systems are available via the graphical UIFs:

Discover and CHARMm offer additional functionality when run in standalone mode.

Models are specified in Cartesian space

Some simulation engines accept only Cartesian coordinates, not crystal coordinates (others are able to convert between the two systems). This is important when using asymmetric space groups, since the symmetry operators assume that the input coordinates correspond to the standard asymmetric unit as defined in the International Tables for Crystallography (Reidl 1983).

For Discover, it is assumed that the x Cartesian axis corresponds to the a crystal axis and that the b axis lies in the x,y plane (see Figure 7)

Figure 7 . Relationship between Cartesian coordinate system (xyz) and periodic system (abc) in Discover and CHARMm

.

For Cerius2·OFF, by default the c lattice vector is parallel to the z Cartesian axis and the b lattice vector lies in the y,z plane (Figure 8).

Figure 8 . Relationship between Cartesian coordinate system (xyz) and periodic system (abc) in Cerius2·OFF

CHARMm can handle models that are defined in either crystal or Cartesian space. In converting from crystal to Cartesian axes, the a, b, c crystal axes are aligned with the x, y, z Cartesian axes (Figure 7).

Minimum-image model

Tip

For periodic systems in which nonbond interactions dominate, the Ewald sum method (Ewald sums for periodic systems) is preferred over the the minimum-image convention.

Simulation in bulk solvent

The left side of Figure 9 shows a solute molecule surrounded by enough solvent to occupy the volume (and shape) of a cube. A simulation carried out on this isolated cubic system is a poor approximation of what would happen in a true bulk solvent environment. For example, the solute can diffuse toward a surface or solvent molecules can evaporate. To remedy this, on the right of Figure 9 the cube is replicated in three dimensions to form a 3 X 3 X 3 lattice of identical cubes. This is a much better representation of bulk solvent for the interior cube, because molecules near the surfaces now interact with solvent in adjacent cubes. The imaged atoms are used to calculate energies and forces on the real atoms in the interior cube. The energies and forces on the imaged atoms themselves are not calculated because their motions are computed as symmetry operations on the real atoms, for example, by translations along the cubic axes.

Figure 9 . Solute surrounded by solvent

A solute surrounded by an isolated cube of solvent is replicated periodically in three dimensions in order to better represent a bulk or crystalline environment.  

Implications of minimum-image model for calculating nonbond interactions

Consider the implications of this model for a specific case. In Figure 10, molecule A1 is located near an edge of the square. (For simplicity, this discussion focuses on a two-dimensional lattice.) In addition, eight images of A1 (A2-A9) are present in the adjacent symmetrically related squares. Consider the interactions of molecules A with molecules B. The closest image of B to A1 is actually not B1, but rather B5. If molecules in the interior cell are allowed to interact only with the molecule or molecular image closest to it, this is called a minimum-image model. Each molecule interacts only with those molecules and images within a distance of half the cell size. The advantage of this approach is its simplicity. It is straightforward to compute energy between a given pair of molecules without explicitly keeping track of the images in neighboring cells. All periodic boundary algorithms imply a cutoff criterion, but the minimum-image convention implies a maximum distance for this cutoff of no more than half the cell dimensions.

Figure 10 . Minimum-image model

Minimum-image model showing that each real molecule interacts with at most only one image of each real molecule.  

For a description of the minimum-image convention, see also Allen and Tildesley (1987).

Explicit-image model

A more general approach--ghost molecules

Simulation engines (Discover and CHARMm) can also use a more general approach by generating explicit images of the interior molecules, also called ghost molecules, which interact with the interior molecules. These ghost molecules are replicated to as great a distance as necessary (but no farther than necessary) to satisfy the desired potential energy cutoff criteria.

The left side of Figure 11 shows molecule A1 interacting with several images of B (B1, B2, B3, B5) within the specified cutoff radius (shown as a shaded circle centered on A1). A1 interacts with several of its own images as well (A3, A5, A6, A8).

Figure 11 . Explicit-image model

Explicit-image model showing how a cutoff distance defines which molecules in adjacent unit cells are selected as ghost images. (Different cutoff distances are used in the left and right figures.) Left: explicit-image model--a larger cutoff including interactions with more images is possible than with the minimum-image convention; right: the shaded region identifies which molecules are selected as ghost images within the cutoff distance of any molecules in the unit cell.  

Cutoff distances and nonbond interactions

The right side of Figure 11 shows which molecules in the adjacent unit cells become explicit ghost molecules for a given cutoff distance. Not every molecule in an adjacent cell becomes a ghost. However, if a cutoff distance that is longer than the cell length is used, ghosts from unit cells beyond the nearest neighbor cells may be included. As molecules (effectively, see below) move in and out of the boundaries, the molecules that are ghosts can change. Therefore, the ghost list is regenerated periodically.

Nonbond interactions do not have to be calculated between ghost atoms. This helps to significantly reduce computation time.

When group-based cutoffs (Charge groups and group-based cutoffs) are used, the nonbond potential is cut off on the basis of charge groups (i.e., only if two groups are within the cutoff is the interaction calculated), and only those groups in molecular ghosts that are within the cutoff distance of a real group are included in the ghost atom list.

How images and "real" molecules move

Ghost molecules follow their symmetrically related counterparts. However, when it comes time to move the molecules (in a dynamics step or minimization iteration), only the real molecules (A1 and B1) are actually moved according to the accumulated forces each molecule has felt. The ghost molecule positions are simply regenerated by applying the defined symmetry relations to the new positions of the molecules.

Perfect symmetry is maintained between the primary structure and all its image objects. For many applications, this condition is satisfactory. However, it is not possible to study, for example, cooperative changes between image objects.

To maintain all molecules in the central cell, image centering is used. Molecules that happen to migrate to an edge of the primary structure and would appear in one of its image objects instead reappear in the primary structure from the opposite direction. Thus a constant number of atoms is maintained and no molecules are lost, no matter how far they may diffuse during the calculation.

Crystal simulations

Energies of crystals can be calculated and the lattice parameters a, b, c, , , and can be optimized with Cerius2, CDiscover, and CHARMm:

Crystal simulations are also available in several Cerius2·OFF Instruments modules. For example, you can use the Crystal Packer module to optimize crystals or calculate their energy and can include minimization of periodic structures in a Mechanical Properties run.

Because crystal patching is not available in CHARMm, bonds between crystal images are not handled well. Similarly, hydrogen bond interactions described by an explicit hydrogen bond function cannot be used. The only forces that can be calculated between primary and image atoms in crystals are nonbond forces.

Bonds across boundaries

Allowing bonds (with additional energy terms including angles, dihedrals, and improper dihedrals) between the primary atoms and image atoms enables you to study polymers such as DNA or industrial polymers.

Cerius2·OFF, CDiscover, and CHARMm can handle bonds across cell boundaries. (However, CHARMm is best used only for linear polymers, since it does not handle 3D lattices or networks well.)


Handling nonbond interactions

Electrostatic (Coulombic) and van der Waals interactions together are referred to as nonbond interactions.

Why read this section

Nonbond terms can involve extensive calculation. To avoid a heavy calculation burden, some approximation scheme is often employed. Choosing the best method for your particular model can save computational expense without sacrificing accuracy.

In addition, you have some direct control over the functional terms for nonbond interactions:

Availability

Table 11 . Nonbond methods in MSI's simulation engines

method type of system engine details
atom-based (single) cutoffs  
periodic, nonperiodic  
OFF1, FDiscover, CDiscover, CHARMm  
here  
group-based cutoffs  
periodic, nonperiodic  
FDiscover, FDiscover, CHARMm  
here  
double cuttoffs  
periodic, nonperiodic  
FDiscover  
here  
tail corrections  
disordered periodic  
CDiscover  
here  
cell-based cutoffs  
periodic  
CDiscover  
here  
cell multipole method  
nonperiodic, periodic2  
CDiscover  
here  
Ewald sums  
periodic  
OFF, CDiscover  
here  
1 The Open Force Field module in Cerius2.

2 Standalone only for periodic systems, cannot be used for constant-pressure or constant-stress dynamics.

Most methods for specifying how to treat nonbond interactions are available via the graphical UIFs:

Discover and CHARMm also offer additional functionality when run in standalone mode.

You may use different methods for van der Waals and electrostatic interactions

Typically, both van der Waals and Coulombic interactions are calculated by the same method and (if by the nonbond cutoff method) with the same nonbond list. However, different methods and parameters may be used for van der Waals and Coulombic terms in CDiscover and (except for some operations) in Cerius2·OFF and CHARMm. This allows you, for instance, to use a large cutoff for electrostatic interactions and a smaller cutoff for van der Waals interactions.

The van der Waals interaction potential is relatively short range and dies out as 1 \xda r6. By 8-10 Å, the energy and forces are quite small. Thus, using cutoffs to bring the van der Waals potential to zero at about 10 Å can be a reasonable approximation. The Coulombic interactions, on the other hand, die off as 1 \xda r, so even at considerable distances the energy of interaction is not negligible. But this depends on the model: except for a few formally charged groups, most molecules are composed of neutral fragments with dipoles and quadrupoles. Thus, in most models the major component of the electrostatic interaction between molecules or parts of molecules is a dipole-dipole interaction, which falls off as 1 \xda r3.

Note

For models having 2D periodicity (e.g., built using the Cerius2·Surface Builder) the Ewald method is available for the Coulombic terms but not for the van der Waals terms.

Automatic exclusions

Van der Waals and Coulombic interactions are ordinarily calculated between all atom pairs that are not specifically excluded. Most forcefields exclude nonbond terms for atoms connected by bonds (1-2 interactions) and valence angles (1-3). Some forcefields also exclude nonbond terms between end atoms in torsion (1-4) interactions. These interactions are illustrated Figure 12.

Figure 12 , Types of interactions usually excluded from nonbond calculations

1-4 interactions and AMBER

If 1-4 nonbond interactions in torsions are included in the nonbond list, they may be scaled. For example, with the AMBER forcefield (as implemented in both Cerius2·OFF and Discover) these nonbond interactions must be scaled by 0.50.

Scaling by 0.5 occurs by default with Cerius2 when AMBER is loaded.

In the Insight·Discover module, you need to toggle the p1_4 parameter in the Parameters/Scale_Terms parameter block on and enter 0.5 in the p14 parameter box,

The standalone version of the FDiscover program handles this scaling with the following DSL command:


>	scale 1-4 by 0.5

Important

This term is not set by default in Discover, even for the AMBER forcefield, so you must remember to set the p1_4 parameter either the first time that you run the Discover program from Insight when using AMBER or in your command input file for each standalone job that uses AMBER.

The equivalent BTCL command for the CDiscover program is forcefield scale with the vdw_1_4 keyword.

Combination rules for van der Waals terms

van der Waals radius combination rules

Any van der Waals interaction parameters that are actually defined for heterogenous atom pairs are called off-diagonal parameters. Off-diagonal parameters that are not available for such atom pairs are calculated by averaging those for each of the two atom types, using a geometric, arithmetic, or (in CDiscover and Cerius·OFF) 6th-power combination rule:

Eq. 44               

Eq. 45               

Eq. 46               

Availability

In Discover and Cerius2·OFF, a choice of combination rule is available and is specified in the forcefield file (see the File Formats documentation).

Quality of results

The arithmetic mean gives marginally better equilibrium distances for van der Waals interactions than the geometric combination rule (Halgren 1992). The 6th-power rule (not available with all forcefields) yields even better results (Waldman and Hagler 1993).

van der Waals combination rules and Ewald sums

With the Ewald method (Karasawa & Goddard 1989) (Ewald sums for periodic systems), the geometric mean leads to faster convergence than the arithmetic mean.

In addition, because the Ewald sum calculation proceeds much faster when only diagonal parameters are used, the Cerius2·OFF Van Der Waals Preferences control panel includes an option to ignore off-diagonals even when they are present (they are not present in any of the Discover forcefields).

The dielectric constant and the Coulombic term

Role of the dielectric constant in modeling

The electrostatic potential is computed from the partial atomic charges associated with the model (Assigning charges). Approximate solvent-screening effects can be included by specifying a nondefault value for the dielectric constant if it is explicitly included in the forcefield. (The "dielectric constant" used in modeling is not the dielectric constant that most experimental chemists would think of--it is instead an empirical, dimensionless scaling factor.)

The dielectric constant reflects the polarizability of the solvent molecules. A polarizable solvent such as water has a greater dielectric constant than less polar liquids. Electrostatic interactions in polarizable solvents with high dielectric constants are greatly attenuated. In closely packed molecules, however, there are fewer solvent molecules to screen the charge interactions.

A relatively large dielectric constant can be used for simulating the aqueous environment of small systems. However, many calculations on models use a smaller dielectric constant. For example, a dielectric constant between 2.0 and 10.0 has been used for simulations in the interior of a protein. A typical value for water would be around 4.

Additional information

For a helpful review, please see Harvey (1989). A tutorial on dielectric constants in forcefields can be found at MSI's website:

http://www.msi.com/support/insight/insight/dielectric.html

A distance-dependent dielectric "constant"

The dielectric constant can be kept constant, or the Coulombic term can be made a shielded function, where the dielectric "constant" is a function of distance (r ·). This is useful for electrostatic interactions in closely packed molecules, where the number of solvent molecules between two interacting charges is usually fewer than in bulk solvent. A distance-dependent dielectric constant is also useful for models in which explicit solvent molecules are not included.

The distance-dependent dielectric function (also called a shielded dielectric) is generally used with the Dreiding and AMBER forcefields and may be used with others.

A shielded Coulombic term is faster to calculate than a non-shielded term because no square root has to be evaluated.

Important

A distance-dependent dielectric constant cannot be used on a periodic model with the Ewald sum method (Ewald sums for periodic systems).

Availability      

The dielectric constant can be changed and/or made distance dependent in any of MSI's simulation engines.

In Cerius2·FFE, the Coulombic control panel allows you to choose the form of the Coulombic term--the term can be distance-dependent, not distance-dependent, or corrected by an erfc term (see Glass forcefield).

Special considerations for the AMBER forcefield

With the AMBER forcefield, in most applications a distance-dependent dielectric ( = f (r)) should be used.

In Cerius2·OFF, the Epsilon value is 1.0 unless you change it in the Coulomb Preferences control panel (which is accessed by selecting the Energy Terms/Coulomb card menu item).

In the Insight·Discover_3 module, in the Specify/Nonbonds parameter block you would click More, then set Dielectric to Dist_Dependent and enter 4.0 for the Dielectric Value.

The equivalent BTCL command for the CDiscover program is forcefield with the distance_dep keyword set to true and dielect set to 4.0.

In the Insight·Discover module, in the Parameters/Set parameter block you would enter 4.0 in the Dielectric parameter box and make sure the Dist_Dependent parameter is toggled on.

The standalone version of the FDiscover program handles this with the following DSL command:


>	set dielectric = 4.0*r

Nonbond cutoffs

Why read this section

An energy expression such as Eq. 8, which is representative of current forcefields, is computationally tractable only for systems with relatively small numbers of atoms. The number of internal coordinates grows linearly with the size of a model, so the computational work involved in the first nine terms in Eq. 8 also grows linearly.

However, inspection of the final summation, which represents the nonbond interactions, reveals a quadratic dependence on the number of atoms in the system: If the system of interest has 1000 atoms, the nonbond summation has about 500,000 terms. If it has 10,000 atoms, the summation has 50,000,000!

Therefore, it is common to neglect or approximate the nonbond interactions for widely separated pairs of atoms.

Choosing how to treat long-range nonbond interactions is an important factor in determining the accuracy and the calculation time of an energy evaluation.

Several cutoff methods are discussed below, and a review was published by Brooks et al. (1985b). More recently, two other methods--cell multipoles (Cell multipole method) and Ewald sums (Ewald sums for periodic systems)--have also become available. You should read all these sections to decide which method is best for your model and computational problem.

Note

The same nonbond method(s)1 and specifications should generally be used for all energy calculations within a given project.

1. However, the method and/or specifications used for van der Waals interactions may differ from those used for Coulombic interactions (see You may use different methods for van der Waals and electrostatic interactions).

Effect of nonbond cutoff distance on calculation of nonbond interactions

To appreciate the impact of cutoffs on computational efficiency, consider a receptor-ligand-solvent system with 5000 total atoms. An example would be a small protein (100-150 residues) surrounded by 1-2 layers of water.

Figure 13 shows how the number of nonbond interactions increases with the cutoff distance. This calculation would run at least 10 times faster with an 8.0 Å cutoff than with no cutoff (assuming that the nonbond term is rate limiting, which it usually is). The trade-off is, of course, that interactions beyond the cutoff distance are not accounted for.

The significance of nonbond interactions beyond the cutoff distance depends on the system being simulated. In modeling an isolated molecule or cluster, the use of cutoffs for the van der Waals interactions is quite reasonable. The potential is relatively short range and dies out as 1/r6. Consequently, by 8-10 Å, both the energy and forces are quite small.

In contrast, the situation is slightly different in modeling disordered or ordered crystalline systems. For a typical disordered system, which might consist of a cube of organic material with ~25 Å edges, contributions of the van der Waals interactions at distances greater than 8-10 Å to the energy and pressure can amount to ~50-200 kcal mol-1 and ~500-1000 bar (0.05-0.10 GPa), respectively, while contributions of electrostatic interactions are much smaller. Also, contributions of remote nonbond interactions of all types to the resultant force on atoms is small. Fortunately, it is possible in such systems to apply tail corrections, which permit the use of 8-10 Å cutoffs while simultaneously yielding accurate values of energy and pressure.

Finally, in periodic crystalline systems, both van der Waals interactions and electrostatic interactions can be significant up to 15 Å or more. For example, in a calculation of the energy as a function of cutoff distance in the [Ala-Pro-D-Phe]2 crystal, Kitson and Hagler showed that the nonbond energy accounted for changes from 63% to 97% of the asymptotic value as the cutoff distance was increased from 8 to 15 Å (Kitson and Hagler 1988).

Figure 13 . Number of nonbond interactions as a function of cutoff distance

The number of nonbond pairwise interactions (in millions) expected for a 5000-atom system as a function of cutoff distance. The time required to evaluate the total energy of this system is approximately proportional to the number of nonbond interactions.  

Figure 14 shows how the van der Waals component of the nonbond energy varies as a function of cutoff distance for an [Ala-Pro-D-Phe]2 crystal. The van der Waals energy changes by 40% as the cutoff distance is increased from 8 to 15 Å. The exact dependence of the energy on the cutoff distance depends on the system itself and should be calibrated for each new system.

Figure 14 . Van der Waals energy as a function of cutoff distance

The van der Waals energy for the hexapeptide crystal, [Ala-Pro-D-Phe]2 as a function of cutoff distance. Note that the van der Waals energy does not converge until approximately 20 Å. Simulation done with FDiscover.  

Atom-based cutoffs and nonbond cutoff terms

Direct cutoff not recommended

MSI applications make several methods available for calculation of long-range nonbond interactions. Cerius2·OFF offers (among others) the direct method, which is straightforward and can be applied to nonperiodic and periodic models. Nonbond interactions are simply calculated to a cutoff distance and interactions beyond this distance are ignored.

However, the direct method can lead to discontinuities in the energy and its derivatives. As an atom pair distance moves in and out of the cutoff range between calculation steps, the energy jumps, since the nonbond energy for that atom pair is included in one step and excluded from the next. (For small models you may, of course, calculate all nonbond interactions by setting a large enough cutoff distance and using the direct method.)

Minimizing discontinuities in the potential energy surface

To avoid the discontinuities caused by direct cutoffs, most simulation engines use some kind of switching function (Figure 15) to smoothly turn off nonbond interactions over a range of distances. (The variable names and definitions differ among MSI simulation engines, as illustrated in the figure).

Figure 15 . Application of a switching function

Application of a switching function; energy = E(r) · S(r). Variable names in MSI simulation engines that relate to cutoffs are also illustrated. Thick dark curve = the unmodified van der Waals potential; dashed curve = the switching function S(r); grey curve = the resulting, switched potential.  

Implementation in Cerius2·OFF

Cerius2·OFF allows you to use a cubic spline switching method, by which the energy is multiplied by a spline function. The interaction cutoff is defined by two parameters: the spline-on and the spline-off distances (Figure 15). Within the spline-on/spline-off range, the nonbond interaction energy is attenuated by the spline function. Beyond the spline-off distance, nonbond interactions are ignored. The current defaults in the Open Force Field module set a narrow on/off range that results in fast calculations. Using a broader spline range gives more accurate results, but slower calculation.

Eq. 47        

Implementation in Discover

In the Discover program the switching function is defined by two variables: the point where the function reaches zero and the range over which the function decreases from one to zero (Figure 15).

The Discover program uses a fifth-order polynomial for the switching function. It is formulated so that the first and second derivatives are zero at both the inner and outer ends of the switching region. Thus, the interaction energy and its first and second derivatives are continuous, although higher derivatives are not.

Implementation in CHARMm

CHARMm and Cerius2·MMFF offer two types of switching functions, which its documentation refers to as a switching function (not the same as the Discover switching function) and a shifted potential.

Two variables corresponding to the ends of the switching regions (ron and roff, Figure 15) are required to define the switching function. Depending on the value of rij, the following values are used to multiply individual electrostatic or van der Waals energy terms:

Eq. 48        

However, energy can be significant at the cutoff distance, which can result in artificially large forces at long ranges. This is especially true for relatively short (that is, less than 12 Å) cutoff distances and small ranges (that is, when roff - ron < 3 Å).

The shifted potential modifies the radial function so that energy and forces go to zero at some cutoff distance (Eq. 49). The individual electrostatic and van der Waals terms in the energy function are simply multiplied by this term:

Eq. 49        

One disadvantage to the CHARMm functions is a discontinuity in the second derivatives at the cutoff distance.

Neighbor lists and buffer widths

To maximize the efficiency of nonbond calculations, Cerius2·OFF, Discover, and CHARMm create a neighbor list that contains all pair interactions to be considered during calculation of the nonbond interactions. Atom pairs are not included in the list if they are too far apart or if they are excluded (Automatic exclusions).

Advantages of a neighbor list

Neighbor list generation was chosen over other approaches using cutoffs for computational efficiency:

The buffer region

Although a neighbor list requires time to set up, the net result is time saving for models containing more than about 50 atoms, because the list is not recalculated each time the energy expression is evaluated. Since the list is not updated at every step, it includes atoms in a buffer region (the distance between the two right-most lines in Figure 15) that might move close enough together to contribute to the energy calculation before the next update of the neighbor list.

Updating the neighbor list

To ensure that no atoms outside the buffer region can move close enough to interact during an energy minimization or molecular dynamics simulation, the nonbond list is automatically updated whenever any atom moves more than one-half the buffer width. Thus, the width of the buffer region, coupled with the velocity with which atoms move, determines the maximum amount of time before the neighbor list is updated.

Charge groups and group-based cutoffs

Dipoles must not be split by cutoff distances

To understand the implications of the generalization that the strongest electrostatic interactions in many molecules are due to dipoles rather than fully charged groups (see You may use different methods for van der Waals and electrostatic interactions), note that the interaction energy for two monopoles, each of one e.u. of charge, is about 33 kcal mol-1 at 10 Å, while that for two dipoles formed from unit monopoles is no more than about 0.3 kcal mol-1. It is clear that ignoring monopole-monopole interactions would give grossly misleading results, whereas ignoring dipole-dipole interactions would be only a modest approximation.

If nonbond cutoffs were applied to such a model on an atom-by-atom basis, this could generate spurious monopoles by artificially splitting dipoles (by having one of a dipole's atoms inside the cutoff and one outside). Instead of ignoring a relatively small dipole-dipole interaction, this would artificially introduce a large monopole-monopole interaction. To avoid these artifacts, the Discover and CHARMm simulation engines can apply cutoffs over charge groups. (In CHARMm in PSF mode, every residue is a charge group.)

Functional groups and charge groups

A charge group is a small group of atoms close to one another which has a net charge of zero or almost zero. Often, charge groups are identical to common chemical functional groups. Thus, a carbonyl group, methyl group, or carboxylic group would be an approximately neutral charge group.

Implementation in Discover--switching atoms

The Discover program designates one atom from each charge group as the switching atom and generates the neighbor list by considering the distance between the switching atoms of two charge groups. If the distance is less than the cutoff distance, then the pairwise interactions between all atoms in the two groups are included. If the distance is greater than the cutoff, they are all excluded. Similarly, when calculating the actual interaction energy, the Discover program switches off the interactions between all atom-atom pairs in the two charge groups based only on the distance between the two switching atoms. This procedure prevents artifactual splitting of dipoles.

Implementation in CHARMm

If group-based cutoffs are used in CHARMm, the neighbor list is stored in terms of group pairs.

Charge group size and the cutoff distance

The size of a charge group, as defined by the greatest distance from the switching atom to another atom in the same group, must be significantly smaller than the cutoff distances. Otherwise, an interaction between two atoms close to each other might be ignored because the switching atoms of the two groups are farther apart than the cutoffs. Typical groups are no more than 1-3 Å large, so cutoffs larger than 7-8 Å are reasonable. However, some models contain considerably larger groups.

The Discover program checks the size of the groups against the cutoff distances, then outputs an error message and terminates if the cutoffs are too short relative to the group size. If this happens, you must either increase the cutoffs or define smaller groups.

Charge group neutrality...

The Discover program also warns you about significantly non-neutral groups. Some can be expected if the model contains formally charged functional groups, such as protonated amines and carboxylates. However, other non-neutral groups usually indicate an error in group definitions.

... and defining or checking charge groups

In Cerius2·Discover, you can specify the tolerance with which neutrality is defined when you ask Discover to perform charge grouping.

In Insight, charge groups and switching atoms are defined, edited, and checked with the Forcefield/Groups parameter block, which is found in the Builder, Biopolymer, and other modules. Potentials and charges for the atoms must be fixed or accepted before defining charge groups.

In CHARMm, you can edit charge groups in the RTF files with any text editor. In PSF mode, every residue is a charge group.

Double cutoffs

The FDiscover program also incorporates an improvement over a single cutoff distance called double cutoffs, or, as it is sometimes called in dynamics, multiple timesteps. The nonbond interaction potential at a distance is a smooth function that does not vary rapidly.

With double cutoffs, two cutoff distances--an inner and outer one--are assigned. The two distances define an inner spherical region and an outer shell around a given atom.

Whenever the neighbor list is updated, interactions are calculated in exactly the same way as in the single cutoff scheme, but considering the outer cutoff as the cutoff. Then the resulting interaction energy is partitioned into contributions from atoms within the inner cutoff and atoms within a spherical shell from the inner to the outer cutoff. The contribution from the shell is treated as a constant in subsequent molecular dynamics steps, until the next neighbor list update occurs. The basic idea behind double cutoffs is to calculate the shell contribution only when the neighbor list needs to be updated, while calculating inner cutoff contributions at every step.

These double cutoffs make the calculation less expensive by allowing smaller values for the inner cutoff than could normally be used with a single cutoff. Accuracy is regained at minimal cost by using a large distance for the outer cutoff.

Discontinuities in the potential energy surface

It is important to realize that the effective potential energy surface is not quite continuous when double cutoffs are used. The magnitude of the discontinuities depends on the cutoff distances and the system that is being studied.

These discontinuities are only a minor problem for dynamics, where they are manifested as small fluctuations in the total energy.

Their effect during minimization depends on the minimizer that is used, because some minimization algorithms, such as conjugate gradient, are quite sensitive to discontinuities in the surface. Other algorithms, such as steepest descents, are relatively robust.

Tail corrections

Long-range van der Waals interactions in disordered periodic systems

For disordered periodic systems, contributions to the potential energy and pressure from van der Waals interactions outside the cutoff can be written as:

Eq. 50        

Eq. 51        

where Ni and i denote the number and number density of atoms of type i, U (r) denotes the van der Waals nonbond potential describing interactions between atoms of type and , and g (r) denotes the pair correlation function describing the probability of finding and at separation r relative to the probability of finding the pair at an infinite distance (McQuarrie, 1976, Chapter 13).

Except in rare cases, the function g (r) is short range, reaching its limiting value of unity at distances of ~10 Å. Moreover, g (r) - 1.0 is small even at shorter distances. In consequence, accurate estimates of the tail corrections for all normal nonbond cutoff values may be safely made by setting all g (r) = 1.0 in Eq. 50 and Eq. 51.

Computational costs

Note also that applying Eqs. 50 and 51 at each step in a simulation contributes negligibly to the overall simulation cost, since for constant-volume simulations the full correction may be precomputed, and in simulations where the volume fluctuates it is necessary only to recompute the volume at each step.

Cell-based cutoffs

CDiscover allows cell-based cutoffs for periodic systems. This is another image-based method, in which the neighbor list is based on a specified number of cell layers surrounding the central cell.

Cell multipole method

Mor rigorous, controllable, and efficient than cutoffs

The cell multipole method (CMM, available only in CDiscover) provides a treatment of the nonbond interactions for both nonperiodic and periodic systems that is more rigorous and efficient than cutoffs. This method (Greengard and Rokhlin 1987, Schmidt and Lee 1991, Ding et al. 1992) is a hierarchical approach that allows the accuracy of the nonbond calculation to be controlled. Short-range interactions are treated in the usual way, but long-range group-group interactions are treated in terms of multipoles. Computational time scales as N (the number of atoms).

The cell multipole method applies to the general energy expression of the following form:

Eq. 52        

where i is the potential at atom i, Rij is the distance between atom i and atom j, p is a number (p = 1 for Coulombic and 6 for London dispersion interactions, for example), and the 's are general charges. For Coulombic interactions, the 's are real charges.

Near- and far-field potentials

The general potential i may be divided into a near-field potential due to the surrounding atoms (those within a few angstroms) and a far-field potential due to the rest of the atoms that interact with the ith atom.

The number of interactions in the near field is limited, so it is relatively easy to calculate the near-field potential exactly. The number of interactions in the far field is of order N 2, making an exact calculation of this potential intractable for large models. The cell multipole method calculates the far-field potential accurately and efficiently in the following manner.

Derivation of cell multipole method

We begin by placing an arbitrarily shaped molecule in a rectangular box. The box is then cubed into a number of basic cells of length 4-6 Å and containing 2-4 atoms on average. The basic cell level is denoted level A in Figure 16. Starting from a corner of the box, every eight basic cells may be considered to constitute a larger, parent cell, termed level B. Every eight parent cells may constitute a grandparent cell, termed level C. This procedure is repeated until only a few large cells fill the box. For example, considering any atom in cell A0 of the three-level cell system (Figure 16) the other atoms in A0 and all atoms in An contribute to the near-field potential, and the atoms in Af, B, and C contribute to the far-field potential.

Figure 16 . Three-level hierarchical cell system

Definition of hierarchical cells and division of near field and far field for a basic cell A0. Larger cells are formed as cells are farther from cell A0 (this constitutes the hierarchy). Note that the near field is one layer thick.  
C  
C  
C  
C  
C  
C  
B  
B  
B  
B  
B  
B  
C  
B  
Af  
Af  
Af  
Af  
Af  
Af  
B  
B  
Af  
An  
An  
An  
Af  
Af  
C  
B  
Af  
An  
A0  
An  
Af  
Af  
B  
B  
C  
Af  
An  
An  
An  
Af  
Af  
B  
Af  
Af  
Af  
Af  
Af  
Af  
B  
B  
Af  
Af  
Af  
Af  
Af  
Af  
C  
B  
B  
B  
B  
B  
B  
C  
B  
B  
B  
B  
B  
B  
C  
C  
C  
C  
C  

Key steps used in cell multipole method

The cell multipole method involves the following two key steps:

1. Multipole expansion and calculation of general multipole moments.

The potential associated with each basic cell can be represented as a general potential originating at the center of the cell. This potential may be expanded into an infinite series of multipole moments. For example, the potential associated with cell Af in Figure 16 centered at rAf, is expressed as:

Eq. 53        

where R = rAf - r; r is any point outside cell Af; , = x, y, z; and Z, D, and Q are monopoles, dipoles, and quadrupoles, respectively.

The potentials associated with the higher-level cells can be expanded in an analogous manner, with moments derived from lower-level cell moments.

2. Generation of Taylor coefficients.

Using this expansion to represent the potentials associated with Af-, B-, and C-level cells, the far-field potential of cell A0 may be obtained by summing all the far-cell contributions. The resulting potential may now be expanded as a Taylor series about the center of cell A0:

Eq. 54        

where rA0 is the position vector of the center of cell A0 and r = r - rA0. The Taylor coefficients in Eq. 54 are due to all the far-cell contributions.

A key point of the cell multipole method is that, once the set of Taylor coefficients is calculated at rA0, the far-field potential of any atom in cell A0 is obtained easily through Eq. 54.

Since the Taylor coefficients must be generated for every basic cell, another key point of the cell multipole method is efficient generation of these coefficients. A hierarchical procedure is used, in which coefficients determined for higher-level cells are propagated to the coefficients for lower-level cells. Thus, coefficients for a child B cell are obtained by adding contributions directly translated from the C-level coefficients at the center of the parent C cell to the coefficients at the center of B, generated by considering only the B-cell contributions.

Improved computational performance and accuracy

The cell multipole method is an order-N method (Greengard and Rokhlin 1987, Schmidt and Lee 1991, Ding et al. 1992). The time savings with respect to an exact N 2 algorithm, as well as the improved accuracy relative to using cutoffs, can be dramatic. Table 12 shows results from several calculations on hemoglobin. When the conventional method with 9.5-Å cutoffs is used, the computational and setup times are greatly reduced, but at the cost of a disturbingly large error (over 1% of the correct energies). The last 6 lines of the table show results for second-, third-, and fourth-order multipole expansions at two levels of computational accuracy. The short-range treatment becomes progressively better towards the bottom of the table. However, the overall CPU time increases. It is practical to achieve essentially exact results (within a fraction of a kcal mol-1) in reasonable times.

For systems larger than hemoglobin, the improvement in performance can be even more dramatic. For a system ten times larger, the cell multipole method would take 3-10 minutes for the energy evaluation, depending on the accuracy desired. The exact N 2 calculation, in contrast, would take about three days!

Table 12 . Computational efficiency of cell multipole method

The results shown are for a calculation with CDiscover 3.1 on hemoglobin (32250 atoms) on an SGI Crimson (50 MHz MIPS R4000),

level of calculation time (s) error (kcal mol-1)1
setup2 energy evaluation3 van der Waals Coulomb
exact pairwise calculation  
468  
2809  
 
0.00  
0.00  
9.5-Å cutoff  
12.6  
30.2  
 
1485  
1359  
coarse4, 2nd-order multipole  
23.9  
15.1  
 
275  
-26.0  
coarse, 3rd-order multipole  
58.5  
15.1  
 
243  
-25.0  
coarse, 4th-order multipole  
199  
16.1  
 
243  
-3.50  
fine5, 2nd-order multipole  
96.7  
57.2  
 
8.95  
-10.6  
fine, 3rd-order multipole  
219  
56.6  
 
6.95  
-2.50  
fine, 4th-order multipole  
718  
57.7  
 
0.24  
-0.18  
1 Relative to the exact pairwise calculation of the energy.

2 Time required to set up atom lists and multipole expansions (overhead needed at the beginning of a calculation and periodically during dynamics or minimization).

3 Time required for recurring evaluation of energy and gradients during a calculation.

4 That is, a reasonably accurate, fast calculation with low overhead.

5 That is, calculation with highest accuracy and greatest overhead.

Nonbond interaction energies

Due to the nature of the cell multipole method, specific nonbond interaction energies cannot be calculated unless you use the ESFF forcefield. When this method is used with other forcefields, the per-atom energy is calculated by using the cell multipole method, and the nonbond interaction energy is calculated using the group-based method. You can specify cutoffs for the group-based method of nonbond analysis. A large cutoff in the group-based method may give reasonably accurate energies compared with the cell multipole method.

Ewald sums for periodic systems

Nonbond energies of periodic systems

Mainly for crystals

The Ewald technique (Tosi 1964, Ewald 1921) is a method for computation of nonbond energies of periodic systems. Crystalline solids are the most appropriate candidates for Ewald summation, partly because the error associated with using cutoffs (Nonbond cutoffs) is much greater in an infinite system. The technique can also be applied to amorphous solids and solutions.

Ewald method compared with cutoff-based methods--Coulombic energy...

Figure 17 shows the electrostatic energy for quartz as computed by various techniques. One would feel that all the techniques should converge to the same value at high cutoff distances. However, the direct atom-based cutoffs approach yields results that fluctuate wildly as the cutoff increases, even for rather large cutoffs. The problem is that the sum is only conditionally convergent. As the cutoff increases, charges of opposite sign are taken into account and the partial sum is modified significantly. Worse, reordering the terms of a conditionally convergent series can yield arbitrary results. The problem then is to find physically and chemically meaningful orderings of the series.

Figure 17 . Electrostatic energy vs. cutoff distance for quartz

The electrostatic energy of quartz was calculated with CDiscover 3.1 by several methods. Medium line with points: using atom-based cutoffs; thin dark line: using cell-based cutoffs; thick line: using group-based cutoffs; same thick line: by the Ewald method with dipole correction; and medium dashed line: by the Ewald method without dipole correction.  

The cell-based (Cell-based cutoffs) and group-based (Charge groups and group-based cutoffs) cutoff techniques are natural candidates. However, they yield somewhat different values (Figure 17), due to the different cutoff conventions employed. The group-based technique computes the result for a sphere, but the cell-based technique computes the result for a parallelepiped that preserves the shape of the unit cell.

A standard Ewald calculation that does not take the dipole moment of the unit cell into account yields yet another value. An Ewald calculation that includes the effect of the dipole moment agrees with the group-based calculation (Figure 17).

...and van der Waals energy

For van der Waals energy, the energy sum is absolutely convergent, and no chaotic behavior arises from the direct approach. Even so, as Figure 18 indicates, the convergence of the dispersive energy is slower than might be expected. Even with a cutoff distance of 30 Å, the error is a significant fraction of 0.1 kcal mol-1. (The Ewald calculation is less costly for comparable accuracy.) The repulsive energy, on the other hand, converges at a cutoff distance of only 15 Å and needs no special treatment. (Atom-based calculations for much larger systems, however, show that sometimes even the repulsive energy can exhibit a surprisingly high error at a cutoff of 12 Å.)

Figure 18 . van der Waals energy vs. cutoff distance for NaCl

The graph shows the (solid lines) dispersive and (dashed line) repulsive portions of the van der Waals energy as a function of the cutoff distance, as calculated by the (thin lines) atom-based and (thick line) Ewald methods. The Ewald calculation was performed with CDiscover 3.1 to an accuracy of 1 e-6, which requires a cutoff distance of 9.5 Å.  

Theory of Ewald technique

For full details on the Ewald summation method and parameter optimization procedure used in MSI's simulation engines, please refer to Karasawa and Goddard (1989).

The Ewald approach to improving convergence is to multiply a general lattice sum:

Eq. 55        

by a convergence function (r), which decreases rapidly with r. Of course, to preserve equality, one must then add a term equal to the product of 1 - (r) with the lattice sum:

Eq. 56        

Here, the first term converges quickly, because m(r) decreases rapidly. Ewald's insight was that the second term can be Fourier transformed to provide a rapidly converging sum over the reciprocal lattice. The sum over L in Eq. 56 runs over all lattice vectors, but the i = j terms must be omitted when L = 0.

The convergence functions

The convergence functions are, for the electrostatic energy:

Eq. 57        

and for the dispersive energy:

Eq. 58        

Optimizing computational effort

The electrostatic convergence function 1 was also used by Catlow and Norgett (1976) and Karasawa and Goddard (1989). The dispersive convergence function 6 was recommended and used by Karasawa and Goddard. The convergence parameter plays a similar role in both cases. As increases, the real-space sum converges more rapidly and the reciprocal space sum converges more slowly. (That is, a large implies a heavy computational load for reciprocal space, and a small implies a heavy computational load for real space.) Cutoffs must be adjusted accordingly, and processing time is affected by the cutoffs. A value of that balances processing in the real and reciprocal spaces proves to be optimal. The same value of can be used for both the dispersive and electrostatic energy, and thus they can be combined for greater efficiency.

Implementation in Discover

The Discover program automatically chooses so as to balance the computational loads for real and reciprocal space.

Implementation in Cerius2· OFF

Cerius2·OFF instead uses the inverse of as input, that is, the ratio of the time required for a real-space calculation to the time for a reciprocal-space calculation. The value chosen for the time ratio does not affect the accuracy of the calculation, only the time taken to perform it. Real-space calculations typically take longer than reciprocal ones, so the value of the time ratio is usually greater than 1.

Electrostatic energy

The Ewald expression for the electrostatic energy is (dropping a factor of 1/40):

Eq. 59        

where a = |ri - rj - RL|; ri = Hsi; h = 2(HT)-1n (reciprocal lattice vectors); = det(H) = cell volume; and b = h \xda 2.

In Eq. 59, the first term corresponds to the real-space sum, the contribution (L = 0, i = j) being omitted. The second term in Eq. 59 corresponds to the reciprocal-space sum. For electrostatic interactions, the double sum over i and j is reduced to a single sum. This provides a substantial performance improvement (N1.5 instead of N2.5, where N is the number of atoms per unit cell). A similar reduction of the reciprocal-space sum occurs for the dispersive energy only if the geometric combination rule () holds. The third term in Eq. 59 arises from the self-energy of the charge distributions produced by the introduction of the convergence function. The final term of Eq. 59 is zero if the unit cell is charge neutral, which is normally the case. Indeed, the electrostatic energy in an infinite system of non-neutral cells is formally infinite. However, some applications can involve the use of non-neutral cells. For example, Cation Locator adds cations to the system one by one, and neutrality is not achieved until the final cation is added. The effect of the final term in the equation is to give convergence to a finite value of the energy, which corresponds to a system where the excess charge is neutralized by a compensating uniform background charge density.

The Ewald sum as it appears in Eq. 59, with no h = 0 term, strictly represents an infinite crystal. A real macroscopic but finite crystal also includes surface contributions, which can be substantial (Deem et al. 1990) and which depend on the dipole moment of the unit cell and the shape of the crystal. However, in a real environment, physical effects such as surface reconstruction and dielectric effects in the surrounding medium serve, in general, to diminish the surface charge. The Cerius2·OFF and CDiscover programs therefore omit such terms, corresponding to the so-called "tin-foil" boundary conditions.

Accuracy of Ewald calculations

You choose the accuracy before beginning calculation

The Ewald method allows you to select, before running the calculation, a level of accuracy for the calculation. (Estimation of the cutoff and convergence constants is difficult, so a facility to automatically calculate these parameters to a certain accuracy (Karasawa & Goddard 1989) is provided instead.)

Depending on the system, an Ewald calculation with accuracy = 1 e-4 can be comparable in performance to an atom-based calculation with a large cutoff (19 Å) over the range that has been tested. In addition, the Ewald results are significantly more accurate.

Computation time vs. accuracy and model size

Ewald processing time grows as N1.5, where N is the number of atoms in the unit cell. Increases in accuracy do not require unreasonable increases in the Ewald lattice cutoff. For acetic acid, for example, increasing the accuracy by 2 orders of magnitude (from 1 e-2 to 1 e-4), with constant repulsive cutoff, increased processing time only about 1.5 fold (from 22.08 to 35.34 seconds) and increased the Ewald lattice cutoff less than 20% (from 11.7 to 13.4 Å).

Troubleshooting

Although the default Ewald accuracy is acceptable for most single-point energy calculations, tighter accuracy may be required for some minimization and dynamics runs, to assure acceptable gradient accuracy. A value as low as 0.00025 may be preferable if your minimization run fails to converge or your dynamics run misbehaves.

Nonbond interaction energies

Due to the nature of the Ewald sum method, specific nonbond interaction energies cannot be calculated. When this method is used, the per-atom energy is calculated by using the Ewald sum method, and the nonbond interaction energy is calculated using the group-based method. You can specify cutoffs for the group-based method of nonbond analysis. A large cutoff in the group-based method may give reasonably accurate energies compared with the Ewald sum method.

Ewald sum for models with 2D periodicity

Only for Coulombic terms

If the model has 2D periodicity, an Ewald sum may be applied to the Coulombic terms, using the method of F. Harris (communication). A non-Ewald method must be used for the van der Waals terms in this case, and a large cutoff may be required to obtain good accuracy.

Slow but accurate

The method is similar in principle to the 3D Ewald sum, but more complex in that one direction must be treated nonperiodically. To avoid loss of accuracy, no cutoff is applied in the nonperiodic direction. However, the method can still be optimized, since fewer terms are required in the periodic sums for those atom pairs having a large separation in the nonperiodic direction. Although the method is slower than a non-Ewald method (or a 3D Ewald sum), it provides very good accuracy.

Activated automatically

The method is available in Cerius2·OFF and is activated automatically if you select the Ewald method for Coulomb terms when the current model has 2D periodicity.




Last updated September 09, 1998 at 09:35AM Pacific Daylight Time.
Copyright © 1997, 1998 Molecular Simulations, Inc. All rights reserved.