Forcefield Based Simulations 
3
Preparing the Energy Expression and the Model
Many forcefields allow you a great deal of flexibility with respect to how atom types are assigned to atoms, which terms of the energy function are used, and how the simulation engine applies the forcefield. You can also perform computational experiments by using alternative functional terms and applying constraints and restraints to your model.
Who should read this chapter
Although largely automated model preparation and energy expression setup is possible for simple systems, you should read this chapter if you want to perform the best simulations possible. You need to read this chapter if you want or need to know about:
This chapter explains
 Using forcefields
 Selecting forcefields
 Assigning forcefield atom types and charges
 Parameter assignment
 Using alternative forms of energy terms
 Applying constraints and restraints
 Modeling periodic systems
 Handling nonbond interactions
Related information in this book
Forcefields presents the functional forms of energy expressions and describes the forcefields that are available in Molecular Simulations products.
The atom types defined for each forcefield are listed under Forcefield Terms and Atom Types.
The files that specify the forcefields are described in detail in separate documentation.
Specific information
For specific information on procedures, please see the manual for the molecular modeling program and/or simulation engine that you are using (see Available documentation):
Graphic interface mode
All MSI's simulation engines and forcefields can be used through at least one graphical molecular modeling interface (Cerius^{2}, Insight II, QUANTA, see Table 1).
Standalone mode
The Discover and CHARMm programs can also be run in a commandbased, standalone mode with input from a text interface and/or a script and other input files.
Mixedmode use
For Discover and CHARMm, you can optionally perform some tasks through the appropriate graphical interface and others in standalone mode. For example, you might want to prepare model structure and command input files with the graphical interface, save both types of files, edit the command input file with a text editor so as to perform some complex simulation, start the run in standalone mode, and then analyze the results with the aid of one of the graphical interfaces. In addition, facilities exist for directly entering specific BTCL or CHARMm commands from the Insight or QUANTA interface and for reading a BTCL or CHARMm command input file into the Insight, Cerius^{2}, or QUANTA interface.
Additional information
How to run Discover and CHARMm in standalone mode is documented separately (see Available documentation).
General procedure
Regardless of whether simulation engines are run through the graphical interface, in standalone mode, or in some combination of both modes, the general sequence of activities for doing forcefieldbased calculations is as follows.
 1. Read in the forcefieldBased on the type of model and the scientific problem that you want to simulate, decide which forcefield to use (see Forcefields). If it is not the default forcefield for the molecular modeling program you are using, you need to specify the desired forcefield.
 The forcefield parameter files (which contain parameters that specify force constants, equilibrium geometries, van der Waals radii, and other data needed for calculating energies) are read in as part of this process.
 2. Prepare the model:
 a. If necessary, read in monomer/residue definitionsIn most cases, the molecular modeling program automatically does this for you, or (depending on the type of model) it is not necessary.
 Information about residues or monomers, the basic chemical units that comprise many models, is stored in topology (or library, dictionary, or template) files. The atoms, atomic properties, bonds, bond angles, torsion angles, improper torsion angles, hydrogen bond donors, acceptors, and antecedents, nonbond exclusions, and charge increments are all specified on a per residue basis.
 If this information is not included in a structure or model file you intend to read in (for example, a Brookhaven Protein Database file), then you may have to specify the appropriate topology file.
 b. Read in or construct a modelRead in a model from an appropriate file or construct it using the builder functionality in the molecular modeling program. Make sure your final model is correct with respect to atom connectivity, hybridization, bond orders, valences, etc.
 c. Assign forcefield atom types and charges to each atom in your model (see also Assigning forcefield atom types and charges). If you are using UFF, calculate fractional bond orders. These are largely automated processes.
 d. If necessary, define charge groupsThe need for charge groups depends on the type of model and the scientific problem that you want to simulate. Some calculations are impossible without this kind of information, others can be significantly speeded up by supplying it (see also Charge groups and groupbased cutoffs).
 e. Read in or generate Cartesian coordinatesIn most cases, the molecular modeling program automatically does this for you when you save a model that you have built or read in from a nonMSI type of file.
 3. Set up the energy expressionYou may want to use alternative terms in the energy expression, use or avoid using certain default terms, specify how nonbond interactions are handled, apply constraints or restraints (biases) to your model, etc. (see also this chapter).
 4. Set up the calculationUnless you want your calculation to run under the default conditions, you need to specify items such as which minimization algorithm(s) and what termination criteria to use (see also Minimization, Molecular Dynamics, and Free Energy).
 Most programs also allow you to control your calculation by specifying various nondefault conditions. You may want to use a more robust minimizer for a highly strained model, then switch to a more accurate one for the final stages of computation. Some forcefield engines allow sophisticated iftests and conditional branching. You can determine what will happen if certain parameters are not found (see also Parameter assignment). You can also send jobs that are time consuming to some other (faster) computer or run them in the background.
 5. Specify outputIf desired, specify nondefault kinds or amounts of output.
 6. Run the calculation.
 7. Analyze the resultsThe molecular modeling programs provide analysis functionality that allows you to view your results in the form of graphs and tables, as well as by graphicallly displaying the final (and/or intermediate) conformations of the model.
Selecting forcefields
For details on how to select a forcefield in the molecular modeling program that you are using, please see the appropriate specific documentation (see Available documentation). A brief summary:
Assigning forcefield atom types and charges
Who should read this section
You should read this section if you want to understand what atom types are, how types and charges are assigned automatically, and how you can make your own atom type or charge assignments.
In addition, if you want to understand parameter assignment (Determination of which parameters are used with which atom types) and/or edit the forcefield parameters (Manual parameter assignment), you need to understand something about atom type assignment first.
Availability
All molecular modeling programs supplied by MSI perform automatic and/or semiautomatic atomtype and charge assignment (which needs to be redone if you switch to a different forcefield). Please see the guidebook for the appropriate molecular modeling program for details on how to assign atom types and charges for the simulation engine you are using (see Available documentation).
The simulation engine needs the forcefield atom type of each atom in the model in order to determine which forcefield parameters to use. Forcefield parameters apply to particular combinations of atom types as specified by the forcefield.
Relation between forcefield atom types and chemical atoms
The forcefield atom types are related to the microchemical environment of the atoms in a way defined by the particular forcefield. For example, a methane model has only two atom types, one for the carbon and one for the hydrogens, even though each of the atoms may have a distinct atom name for labeling purposes. The hydrogen atoms are equivalent by symmetry; therefore, they would all have the same atom type in any forcefield.
As a more complicated example, consider propane, which has four distinct types of atoms: methyl carbon atoms, methyl hydrogen atoms, a methylene carbon atom, and the methylene hydrogens. In principle, a forcefield could consider these to be four distinct atom types, but in practice, the chemical difference between the carbon atoms or between the hydrogen atoms is very small, so in most forcefields the carbon atoms are all assigned the same atom type, and all the hydrogens are assigned a second atom type.
Atom types and charges supplied by the structure file
Atom types are assigned by the simulation engine or the molecular modeling program. Atom types are automatically assigned by using a set of rules that link the type of an atom to its element type and its chemical microenvironment (for example, the number and nature of connected atoms). Different forcefields use different atom types and atomtyping rules, which are contained in a residue library or the forcefield file.
The atom type information can also be supplied by a molecular data file such as an .msi file (OFF), an .mdf file (Discover and OFF), or an RTF (or PSF) file (CHARMm). These structure files are typically created in the Cerius^{2}, Insight, or QUANTA molecular modeling program.
To make sure that atom types are assigned:
Charge information also is saved in the structure file.
To assure that you use the most appropriate atom types in your studies, you should always check the assigned atom types against the appropriate table under Forcefield Terms and Atom Types. In most cases, Cerius^{2} and Insight automatically assign the atom types. However, these assignment engines of course require the models to be correctly built. One of the most critical types of information is the bond order, which should be set before the forcefield is assigned.
Atom typing in different modeling programs
In Cerius^{2} and Insight, atoms with unassigned atom types are labelled with question marks when you label the model according to the atom type (FFTYPE or potential type).
Important
Charges (when available) are generally assigned at the same time as the forcefield atom type (see Assigning atom types to a model).
An atomtype charge is simply a fixed value associated with the atom types. Overall neutrality of a model is not necessarily achieved by assigning forcefield atom types. You may prefer to assign charges specifically. (The exact method depends on the molecular modeling program being used.)
Importance of correct charge assignment
Electrostatic interactions play a critical role in determining the structures of inorganic systems and in defining the packing of organic molecules.
Many forcefields already include charges
Forcefields that include Coulombic terms generally already include standard charges (or "bond increments") associated with the atom types. These forcefields have been parameterized with nonzero atom type charges or charge increments (Table 6) and therefore you usually just assign charges automatically when you do atom typing, instead of having to assign specific charges:
Finding charges, if needed
If you need to specifically assign charges, most relevant modules allow you to set atomic charges directly or specify an overall net charge for the whole structure using charge editing functions.
For small models, you can obtain values for charges by using an ab initio or semiempirical quantum chemistry module (for example, MOPAC).
For larger, distorted, models and when charge assignment is done by the charge equilibration method (in Cerius^{2}), you usually need to perform a short minimization before assigning charges. This is because charge equilibration calculations on distorted models can lead to assignment of unrealistic charges.
Charge assignment in different modeling programs
 In Cerius^{2}·Discover, charges are automatically assigned when atoms are typed.
 In Cerius^{2}·OFF, if you are using UFF or the Dreiding forcefield, charges should be assigned to the model by using the Charges module, accessed from the OFF SETUP deck of cards (see Cerius^{2} Forcefield Engines: OFF). The Charges module uses the charge equilibration approach developed by Rappé & Goddard (1991) to predict the charges from the model geometry and the atomic electronegativities. If the model geometry changes much during minimization, you should iterate the procedure of reassigning charges and reminimizing until the energy reaches a constant. The Charges control panel also allows you to edit or assign charges manually.
 In Insight II for all forcefields except CFF, assignment of charges (and atom types) to each atom is done with the Forcefield/Potentials parameter block, which appears automatically when appropriate or can be accessed from the Biopolymer, Builder, and other modules. Potential function atom types must be (and are) assigned before charges or partial charges are assigned. The Insight program assigns atom types and partial charges to each atom in the structure based on information in a residue library file or (if not found in a residue library file) on the bond increments found in the forcefield file. You can edit the charges on individual atoms with the Atom/Charge parameter block in the Biopolymer or Builder module.
 QUANTA automatically assigns charges when you construct or modify a model. Library (or "dictionary") files for commonly used chemical units (amino acids, nucleic acids, etc.) are supplied with CHARMm. You can also manually assign charges different from what were assigned automatically, by using the Molecular Editor (accessed from Applications/Builders/3D Builder on the main QUANTA menu bar).
Parameter assignment
Who should read this section
If you are a novice user or routinely run relatively simple calculations on relatively simple or standard models, you do not need to read this section. However, if, for example, an error message informs you of missing parameters or you want to customize your energy expression for an atypical model, then you do need to understand how the simulation engines determine what parameters are used for which atoms, bonds, angles, etc.
You should also understand something about atom type and charge assignment (Assigning forcefield atom types and charges) to make effective use of this section.
Before calculating the energy of a model, the simulation engine must construct the complete energy expression for the model by associating the correct forcefield parameters with the appropriate atoms and other coordinates. For example, methane has one type of bond (CH) and one type of bond angle (HCH). The program must create a list of the four actual bonds and then associate the CH bond parameters with each. Similarly, there are six HCH angles, but they are characterized by the same set of parameters.
It is important to understand how the parameters from the forcefield are associated with individual internal coordinates, because the energy, derivatives, structures, and almost all other properties calculated by the program depend on these forcefield parameters and the way in which they are associated with the internal coordinates. The following sections describe two facets of this process: atom type equivalences and wildcards in parameter definitions.
Chemically distinct atoms often differ in some, but not all, of their forcefield parameters. For example, the bond parameters for the CC bonds in ethene and in benzene are quite different, but the nonbond parameters for the carbon atoms are essentially the same.
In Discover, rather than duplicating the nonbond parameters in the forcefield parameter file, atom type equivalences are used to simplify the problem. In the example, the phenyl carbon atom type is equivalent to the pure sp^{2} carbons of ethene insofar as the nonbond parameters are concerned.
The Discover program recognizes five types of equivalences for each atom type: nonbond, bond, angle, torsion, and outofplane. Crossterms such as bondbond terms have the same equivalences (insofar as atom types are concerned) as the diagonal term of the topology of all the atoms defining the internal coordinates. For the bondbond term, this means that the atom type equivalences for angles would be used.
The actual format of the equivalence data in the forcefield parameter file is detailed in the File Formats documentation. For the equivalences used in any particular forcefield, you should examine the actual forcefield parameter file for current information.
CHARMm PRM files handle equivalences for nonbond parameters by using partial wildcards, for example, N* means that the associated nonbond parameters apply to any nitrogen type that is not specifically parameterized.
For forcefields in Cerius^{2}·OFF, wildcards are usually used.
For some internal coordinates, the parameters do not depend strongly on the specific atom types of one or more atoms. For example, the parameters of torsion terms may not be strongly affected by the end atoms. This means that the torsion parameters are essentially defined by the central bond rather than its substituents.
The forcefield engines allow wildcard atom types to conveniently handle this type of situation. This special atom type, indicated by an X in CHARMm .PRM files and in relevant Cerius^{2} forcefield files and by an asterisk (*) in Discover forcefield files, matches any atom type when the forcefield engine is searching for the parameters to associate with a particular internal coordinate. (In CHARMm, this applies only to bond, angle, torsion, and impropertorsion parameters.)
Availability
What happens if parameters are not found
Some classic and secondgeneration forcefields are not completely parameterized for all their atom types. (For rulesbased forcefields (Rulebased forcefields broadly applicable to the periodic table), all parameters are generated according to rules rather than read from the forcefield file.) When parameters for classic and secondgeneration forcefields are not available, one of several things can happen, with varying consequences:
Temporary patches for missing parameters; precautions
A forcefield may include automatic parameters for use when betterquality explicit parameters are not defined for a particular bond, angle, torsion, or outofplane interaction. These parameters are intended as temporary patches, to allow you to begin calculations immediately. While MSI has made every effort to ensure that the automatic parameters used in CVFF, the CFF family of forcefields, and CHARMm produce reasonable geometries for a wide variety of models, we cannot guarantee that the automatic parameters are appropriate in every instance. You therefore should always carefully evaluate any results that you obtain using automatic parameters.
How missing parameters are supplied
Discover automatically assigns values for parameters missing from the CFF and CVFF forcefields by switching to an automatic forcefield. This switching is accomplished with an equivalence table that converts the original set of atom types to a smaller set of generic atom types.
Cerius^{2}·OFF behaves similarly.
QUANTA's parameter chooser looks through the existing CHARMm parameters for similar cases and averages them all to come up with suggested values.
Discover's automatic forcefield
In the automatic forcefield in the Discover program, the atom types for bonds, angles, torsions, and outofplane deformations have different levels of specificity. For example, while bondstretching parameters are determined by the atom types of both atoms; anglebending and torsion parameters may be determined by the atom type of only the central atom(s). A wildcard (*), representing any type of atom, is used for the end atoms of torsions and angles.
In some cases, anglebending parameters are specified by two atoms (rather than only the central atom). This can lead to ambiguityfor example, CCN (if not explicitly defined in the forcefield) can be associated with c_c_* or with n_c_*. The underscore in this example is used to denote the generic (or automatic) atom types. Here, a onesided wildcard (*#, where # is an integer indicating the precedence), is used for one of the end atoms in an angle.
Cerius^{2}·OFF handles precedence with an additional field (P0, P1, ... P9) rather than wildcards.
In interpreting the wildcard, the Discover program and the Cerius^{2}·OFF module use the parameter for which the integer is lower. The parameters for a CCN angle would, for example (if not explicitly defined in the forcefield), be taken from those for atom types n_c_*6 rather than c_c_*7, because 6 is smaller than 7.
An example
As an example, the parameters for the angle ohc"c" in oxalic acid (Figure 3) are not present in CFF91.
Figure 3
. Oxalic acid structure and CFF91 atom types

When the automatic parameter assignment process is used in Discover, it looks at the autoequivalence table in the cff91.frc file to find the generic atom types for this angle (indicated in bold type):
#auto_equivalence cff91_auto
! Equivalences
! 
!Ver Ref Type NonB Bond Bond Angle Angle Torsion Torsion OOP OOP
Inct End Atom Apex Atom End Atoms Center Atoms End Atom Center Atom
!           
2.0 2 c" c" c" c'_ c_ c'_ c_ c'_ c_ c'_
2.0 2 oh o o_ o_ o_ o_ o_ o_ o_ o_
...
Thus, for parameter assignment purposes only, atom type oh is reassigned to o_, c" is reassigned to c'_ for the apex atom, and c" is reassigned to c_ for the end atom. The parameters for the ohc"c" angle could be taken from either the o_c'_*7 or the c_c'_*9 lines in the quadratic_angle section of the cff91.frc fileo_c'_*7 is chosen because 7 is lower than 9:
#quadratic_angle cff91_auto
> E = K2 * (Theta  Theta0)^2
!Ver Ref I J K Theta0 K2
!      
2.0 2 c_ c'_ *9 120.0000 40.0000
2.0 2 n_ c'_ *8 120.0000 53.5000
2.0 2 o_ c'_ *7 110.0000 122.0000
2.0 2 o'_ c'_ *6 120.0000 68.0000
2.0 2 h_ c'_ *2 110.0000 55.0000
...
Manual parameter assignment
Notification of missing parameters
If parameter(s) for a potential type are not present in the forcefield file and are not generated when the energy expression is set up, an appropriate error message is written to the textport or text window of your molecular modeling program. (In QUANTA, this occurs only if you use QUANTA's Applications/Builders tools to construct your model; there is no warning for models that are read in from some other application or database.)
Missing and/or automatic parameters are also listed in an output file after the completion of a simultion run. You can find out if parameters are missing before starting your run:
 If terms are missing, in Cerius^{2}·Discover the energy expression is set up unless one or more of the Stop check boxes in the Discover Parameters control panel (accessed by selecting Forcefield/Parameters in the DISCOVER card) are checked.
 If terms are missing, in Cerius^{2}·OFF the energy expression is set up by only if Ignore undefined terms is checked in the Energy Expression control panel (accessed from the Energy Expression/Setup card menu item on the OPEN FORCE FIELD card).
Obtaining new parameters
If the forcefield you are using is similar in functional form and atom typing to another forcefield which does contain the desired parameters, you may be able to use those parameters in your forcefield, at least on a trial basis. You may also be able to obtain new parameters (or help in deriving them yourself) from the scientific literature (see References) or from the developers of the forcefield you are using.
Expert users can edit MSI forcefields in different ways to customize them to their needs or to create new forcefields.
Editing through a graphical interface
MSI's principal molecular modeling programs include forcefield editors (see Available documentation):
 You can change the functional form of terms in rulesbased forcefields (see Rulebased forcefields broadly applicable to the periodic table) by adding an explicit term. Any explicit term is always used in preference to a generated term.
Important
Manual editing
Expert users can edit the files that define many forcefields with a text editor:
 The potential template rule file used by Insight can also be edited. You may add new atom types by making additions to this file (refer to the Insight II documentation for a complete description).
Important
The energy expression is the heart of the forcefield. Potential energy is described in the energy expression as the sum of various terms that indicate the energy costs of bond stretching, angle bending, etc. Not all terms are present in all forcefields, and the functional forms of the terms vary among forcefields (see Forcefields).
This section and the following main section (Applying constraints and restraints) describe energy term preferences that you can set and restraint terms that can be optionally included in the energy expression.
Who should read this section
If you are a novice user, you should alter the default energy terms and parameters as little as possible. One exception to this recommendation is nonbond methods (see Handling nonbond interactions), where you should choose the method according to the model type rather than necessarily accept the default settings.
Availability
Table 8
. Modifications of energy expression in MSI's simulation engines
term

modification

engine (restrictions)

details


any

removal

FFE^{1}, CHARMm

here

all of a type

scaling

FDiscover, CDiscover

here

all of a type

editing

FFE, Insight 4.0.0^{2}

here

bond stretching

Morse vs. harmonic

FDiscover, CDiscover (CVFF)

here

bond stretching

scaling

OFF^{3} (CH bonds)

here

torsion twisting

scaled, averaged, or first found

OFF

here

outofplane movement

averaged or first found

OFF

here

van der Waals interactions

LennardJones vs. quartic

FDiscover (standalone)

here

hydrogen bond interactions

if used, how set up

OFF, FDiscover (standalone, AMBER), CDiscover (standalone, AMBER), CHARMm

here

crossterms

removal

FDiscover, CDiscover (CVFF)

here

13 or bond stretchingangle bending interactions

UreyBradley term vs. crossterm

OFF, CHARMm (standalone only)

here

Most methods for changing the functional form of the energy expression are available via the graphical UIFs:
Discover and CHARMm also offer additional functionality when run in standalone mode.
Why remove terms
You may, for example, want to save computation time during the early stages of minimization of a model that is far from its equilibrium conformation by not calculating any cross terms. Or you may have found that certain terms are insignificant with respect to the purposes of your study.
How to remove terms
You can effectively remove terms from the energy expression in several ways:
Scaling or editing any selected type of term
Uses
The contributions of various terms in the potential energy expression to the total energy can be scaled up or down and/or otherwise edited. This can be useful, for example, in the early stages of minimizing very "bad" structures, where large contributions by certain terms might interfere with convergence.
How it works
The Cerius^{2}·Force Field Editor module allows you to directly change the parameters in any term (e.g., for all C_3H_ bond terms, but not for all bond terms) in the forcefield. The Energy Terms control panels include entry boxes for all relevant parameters.
In the Discover program, scaling applies to an entire class of energy terms (e.g., all bond terms) in the energy expression. The force constants (or some other parameters) for the chosen class of terms are multiplied by some constant factor. For example, all bond interactions can be scaled by one factor and all van der Waals radii by another.
In the Insight 4.0.0 molecular modeling program, you can use the Forcefield/FF_Edit parameter block in the Builder and other modules to directly change the parameters in any term except crossterms (e.g., for all CH bond terms, but not for all bond terms) in the forcefield. The *_Par parameter blocks accessed through the editor include entry boxes for all relevant parameters.
With the CVFF forcefield in Discover, you can choose to use quadratic bond terms rather than Morse bond terms. The Morse term can allow bonds to stretch to unrealistic lengths (Figure 2), so you may get quicker convergence from a hightly distorted configuration if you replace the Morse term by a harmonic term. You do this by specifying the "no Morse" version of CVFF.
If all torsions about a common bond were simply summed, the torsion energy term could be too large. Cerius^{2}·OFF therefore allows several methods for scaling torsion terms (Discover and CHARMm automatically handle torsions optimally, because of how their forcefields are parameterized):
Behavior in Cerius^{2}·OFF
Inversion terms
The inversion, improper, or outofplane torsion term represents the energy involved in inverting a chiral center or otherwise changing this outofplane angle.
In Cerius^{2}·OFF, you may use the first inversion term found or the average of all inversion energies, but the first approach is not recommended.
In the Cerius^{2}·Force Field Editor module, you can use the van der Waals Energy Terms control panels to choose among several nonbond functional forms. (These are listed in the online help, which is accessed by rightmouse clicking over the Function popup.) However, you would have to change the relevant parameters as well, if you wanted good results.
You can use the DSL (the FDiscover command language) set command to choose between the usual LennardJones 612 or 69 potential (e.g., term 10 in Eq. 23) and a quartic form:
The quartic form is useful when you need to eliminate bad van der Waals contacts, but the second derivatives are not calculated.
Who should read this section
Many forcefields, especially the newer ones, fully account for hydrogen bonds by other terms in the forcefield and so do not have or require specific terms for handling hydrogen bond interactions.
However, some older forcefields include specific terms for hydrogen bonding (e.g., older versions of AMBER). Others (e.g., CHARMm) allow you to use a hydrogen bond term if you want (but MSI does not recommend this). If you are using a forcefield with explicit hydrogen bond terms, you should read this section.
Lack of hydrogen bond terms is an asset
If specific hydrogen bonds are required, generation of a list of hydrogen bonds is a major step in evaluating the energy of a system. This process involves looking at all possible pairs of hydrogen bond donors and acceptors and selecting those that meet certain criteria (Figure 5):
Since hydrogen bond interactions depend on both angle and distance, both angle cutoffs and distance cutoffs must be specified for a switching function (see Nonbond cutoffs). A switching, or spline, function (Figure 15) is needed to conserve energy by smoothing transitions over the cutoffs.
Specifying the criteria
In Cerius^{2}·OFF, the hydrogen bond criteria can be changed by using the Hydrogen Bond Preferences control panel (accessed by selecting the Energy Terms/Hydrogen Bond menu item from the OPEN FORCE FIELD card). This control panel also allows specification of switching function ("spline") parameters.
In Discover, default hydrogen bond criteria are contained in the forcefield file (amber.frc). CDiscover allows you to use the BTCL forcefield scale command to scale hydrogen bond terms (if they exist). In FDiscover, they can be changed by editing the command input file to change the variables HBDIST and HBANGL.
In CHARMm, you can change the hydrogen bond criteria with the CHARMm Update Parameters dialog box, which is accessed from the CHARMm/Update Parameters menu item. This dialog box also allows specification of switching function parameters for hydrogen bonds. Setting the Update Frequency to 0 (the default) effectively omits the hydrogen bond term from the potential energy expression. You can also omit explicit hydrogen bond terms by using the CHARMm/Energy Terms menu item.
An alternative or supplement to bondangle interactions is the UreyBradley term, which accounts for 13 interactions between two atoms that are bonded to a common atom.
In Cerius^{2}·OFF, use the Energy Terms Selection control panel to specify whether to use the UreyBradley term (assuming it is available in the current forcefield).
In CHARMm, the UreyBradley term, if present, can be omitted from the energy expression (standalone only) or can be specified in the parameter file (ANGLE statement).
Why read this section
Constraints and restraints allow you to focus the calculation on a region or conformation of interest and also to set up computational experiments. Such experiments are one of the primary uses of molecular modeling, allowing you control over a model at the atomic level. Several examples are described under When to use constraints/restraints.
Restraints vs. constraints
The seminal difference between a constraint and a restraint is that a constraint is an absolute restriction imposed on the calculation, while a restraint is an energetic bias that tends to force the calculation toward a certain restriction (even though many people use these terms as if they were interchangeable).
Availability
Table 9
. Constraints and restraints in MSI's simulation engines
constraint/restraint

type

engine^{1}

details


atom

fixed (constraints)

OFF^{2}, FDiscover, CDiscover, CHARMm

here

template forcing

harmonic (Eq. 28) restraint

FDiscover

here

tethering and template forcing

quadratic (Eq. 29) restraint

CDiscover^{3}

here

tethering

harmonic (Eq. 28) restraint

FDiscover

here

tethering

massweighted harmonic (Eq. 30) restraint

CHARMm

here

quartic droplet

harmonic (Eq. 31) restraint

CHARMm

here

distance

harmonic (Eq. 32) restraint

OFF

here

distance

quadratic (Eq. 29), flatbottomed (Eq. 34), or cosine (Eq. 36) restraint

CDiscover^{3}

here

distance

harmonic (Eq. 32) or flatbottomed (Eq. 33) restraint

FDiscover

here

distance

flatbottomed (Eq. 35) restraint

CHARMm

here

dynamics

RATTLE algorithm (constraints)

CDiscover

here

dynamics

SHAKE algorithm (constraints)

CHARMm

here

dynamics

consensus dynamics (Eq. 28) (standalone only)

FDiscover, CDiscover

here

angle

harmonic (Eq. 37) restraint

OFF

here

angle

quadratic (Eq. 29), flatbottomed (Eq. 34), or cosine (Eq. 36) restraint

CDiscover^{c}

here

torsion

harmonic (Eq. 38) restraint

OFF

here

torsion

quadratic (Eq. 29), flatbottomed or J^{3} dihedral (Eq. 34), cosine (Eq. 36), cis (Eq. 39), trans (Eq. 40), or cis/trans (Eq. 41) restraint

CDiscover^{c}

here

torsion

flatbottomed (Eq. 33) restraint (standalone only)

FDiscover

here

torsion

cosine (Eq. 42) or harmonic (Eq. 38) torque (one of these is standalone only)

FDiscover

here

torsion

harmonic (Eq. 38) restraint

CHARMm

here

inversion

harmonic (Eq. 43) restraint

OFF

here

chiral

flatbottomed (Eq. 34) restraint

CDiscover^{3}

here

outofplane

quadratic (Eq. 29), flatbottomed or J^{3} dihedral (Eq. 34), or cosine (Eq. 36) restraint

CDiscover^{3}

here

outofplane

harmonic (Eq. 43) restraint (standalone only)

FDiscover

here

Most restraints and constraints are available via the graphical UIFs:
Discover and CHARMm offer additional restraint and constraint functionality when run in standalone mode.
Constraints and restraints are often used to control and direct the minimization.
Fixedatoms example
For example, you can fix some atoms in space, not allowing then to move. For example, part of the structure of a molecule may have been well solved experimentally, but the structures of other areas are less clear. Or you might want to keep parts of your model (e.g., solvent molecules) rigid to decrease computational costs.
Torsionrotation example
You can add extra terms to the energy expression to restrain or bias the system in certain ways. For example, if you are investigating the adiabatic energy barrier to rotation about a bond, you would restrain the value of that torsion and minimize the structure. Repeating this procedure for a set of torsion values in the range 0°360° yields a complete energy profile for rotation about the bond. A similar process is used to generate phi/psi maps and other multidimensional energy surfaces in studies of model conformation.
Docking example
If a substrate is being docked onto an enzyme and a specific hydrogen bond between the enzyme and the ligand is thought to be involved in binding, the donor and acceptor atoms can be pulled together to provide a docking coordinate. In this way, the results are not so dependent on the initial starting configuration, which may have been only a crude graphic alignment. In cases like this, the restraint is turned off at some point to make sure that the biased minimum is close to a true minimum.
Modeling incomplete models
Another example of the use of restraints is in modeling incomplete systems. Often, it is difficult or impossible to construct a realistic environment around parts of a model system. For example, only a partial structure of a large protein complex may be available, and some atoms must be restrained to stay near their initial crystal positions because they do not "feel" interactions with neighboring (missing) amino acids, membrane, or solvent. If the site of interest (for instance, a binding site for a competitive inhibitor) is well characterized but other parts of the enzyme are unknown or would require too much computation time if they were included, a limited study can still be carried out with the ends tethered to their crystal coordinates. Usually, these restraints are permanent parts of the model. The results of such calculations must be critically evaluated but can be valid if the ligand binding does not depend on interactions with missing pieces of the model or on conformational flexibility in the tethered regions.
Relaxing crystal structures
As a final example, tethering can be used to gently relax a crystal structure. Often, crystal coordinates, even if highly refined, have several strained interactions due to intrinsically disordered or poorly defined atomic positions, which, upon minimization, give rise to large initial forces. If these forces are not restrained, they can result in artifactual movement away from the original structure. The general approach is to progressively relax parts of the model in stages, starting with the least well determined atoms, until the entire system can minimize freely. The restraints are ultimately removed so that the final minimum represents an unperturbed conformation. It is usually not necessary to minimize to convergence at each stagethe object is to relax the moststrained parts of the system as quickly as possible without introducing artifacts.
Costsaving
Fixed atoms are constrained to a given location in space; they cannot move at all. Fixed atoms reduce the expense of a calculation in two ways:
Important
Uses
Use atom constraints when you want to apply minimization or dynamics to part of a model, while keeping the remainder of the model fixed and rigid. For example, use atom constraints to quickly minimize a sorbate in a zeolite by fixing the atom positions of the zeolite frame and allowing only the sorbate atoms to move. Or fix all residues in a protein except for those in the active site.
Uses
Typical uses of these related types of restraints are to bias the conformation of one model towards that of another, to bias selected atoms towards their experimentally known positions, to restrain the core of a model while allowing its solventexposed constituents more freedom of movement, or to find an identical or close set of conformations that a group of related models can achieve.
Template forcing
To force the conformation of one model to be similar to that of a template model, a onetoone correspondence between atoms in the template and in the moving structure is set up, and (for example) one of the following restraint terms is added to the energy expression:
or:
The term in Eq. 27 is proportional to the rootmeansquare (rms) deviation of the analog atoms from the template atoms. (This form cannot be used with the NewtonRaphson minimizer in FDiscover.) The values obtained for the energy and the rms function depend on the value of the forcing parameter K. Typical values for this constant are in the neighborhood of 5 kcal Å^{1}. It is often instructive to look at the dependence of the energy and rms functions on the forcing parameter by making several determinations with different forcing parameters. If several runs (minimization or dynamics) are made, it may also be helpful to plot the energy as a function of the rms value. For tethered minimizations, a very large forcing constant (e.g., 2000.0 kcal Å^{1}) is often used to prevent significant movement of any of the tethered atoms.
Eq. 28 represents a conceptually more straightforward restraint, with each atom restrained by an isotropic spring to the position of its template atom. In either form, the summation is over a list of pairs of atoms to restrain: one from the moving model, and one from the template model. FDiscover uses this quadratic form by default.
The K_{i} in Eq. 28 are determined by the distance of atom i from the atom defining the origin as:
where r_{min} is the distance at which the tethering turns on, k_{min} is the initial force constant at that distance, r_{max} is the distance where the force constant reaches its maximum allowed value, and k_{max }is the maximum allowed force constant. If r_{min} and k_{min} are not given, the default values are zero. If r_{max} is zero, tethering uses a constant force constant of k_{max}.
In CDiscover, a simpler quadratic is used:
where V is any appropriate internal (bond length, angle, etc.the same functional form is used for several types of restraints).
Advantages of each type of templateforcing restraint
The first form (Eq. 27) gives the best rms fit for the least energetic cost, but individual atoms may remain quite far from their template position. The second form (Eq. 28) restrains each atom individually, so each atom is forced toward its template partner. The resulting rms fit is not as good as that from Eq. 27, but no one atom is allowed to deviate as much as is possible with Eq. 27. The form in Eq. 28 also allows for a different force constant for each pair, which means that different atoms or classes of atoms can be treated differently.
Tethering
Tethering is the same as template forcing, except that the atoms are restrained to their original positions rather than to positions in a template structure. Both Eq. 27 and quadratic forms are applicable for tethering; however, Eq. 28 is used by FDiscover and Eq. 29 by CDiscover, because tethering is usually used to keep atoms from moving too far from their original positions.
CHARMm allows massweighted tethering by calculating an additional energy term for all atoms that are to be restrained. This term has the form:
Where E_{cons} is the constraint energy, k_{i} is the force constant, m_{i} is the mass of atom i (if mass weighing is used) or 1, r_{i} is the position of atom i, r_{0} is the reference position about which the atom is to be centered, and n is an exponent.
Quartic droplet restraint
The quartic droplet restraint term in CHARMm is designed to put the entire model into a "cage" by constructing a restraining sphere around a model. The potential is scaled so that atom positions furthest from the center of mass or the geometric center of the model have the greatest restraining force applied.
The quartic droplet restraint term is based on the center of mass (COM) or the center of geometry of the model. No net force or torque is introduced by the center of mass term. The potential function is:
FDiscover (standalone only) allows similar restraints within spherical shells.
Consensus dynamics
Consensus dynamics is used to find the consensus configuration of a set of analogs. In essence, all models in the set are treated as both moving molecules and templates.
Standalone FDiscover uses the harmonic templateforcing restraint (Eq. 28).
The database capability of CDiscover is used in settng up consensus dynamics calculations, using the restraint in Eq. 29.
In CHARMm, you can apply general internal coordinate restraints by applying restraints to all bonds, angles, and/or dihedral angles that have entries in an internal coordinates table. This facility is global, that is, not applicable to specific internal coordinates.
Uses
Distance restraints are used to bias the distance between two atoms, bonded or not, toward a given value. Some uses are to cyclize linear models by bringing the ends closer together, dock different models, and fit distance data derived from NOE and other experiments.
Several functional forms for distance restraints: harmonic...
Several commonly used functional forms are supported.
One is a simple harmonic function, which in FDiscover has the form:
where K is a force constant, R_{ij} is the current distance between the atoms, and R_{target }is the target distance. A large force constant tends to force the distance to be close to the target distance; a smaller force constant results in a correspondingly smaller bias.
In CDiscover, the quadratic form is the same, except that scaling is enabled (Eq. 29).
In Cerius^{2}, the form is the same, except that K is multiplied by 0.5. R_{target} can be defined explicitly or automatically extracted from the model as the current distance between atoms.
...and flatbottomed...
The second form is also harmonic, but it is separated into several piecewise continuous regions, resulting in a flatbottomed potential (Figure 6). For FDiscover, the form is:
For CDiscover, the flatbottomed form is:
where V is any appropriate internal (bond length, angle, etc.the same functional form is used for several types of restraints).
The restraining potential used in CHARMm is:
Where R_{lim} is the value of R where the force equals f_{max}.
...and cosine
CDiscover also allows a cosine form of restraint:
where V is any appropriate internal (bond length, angle, etc.) and n is the periodicity.
Advantages of the flatbottomed functional form
It is not necessary for the flatbottomed potential (Figure 6) to be symmetric. By appropriate definition of the points R_{1}, R_{2}, etc., any of the regions may be eliminated. For Eq. 33, the important regions are those from R_{1} to R_{2} and from R_{3} to R_{4}, where a harmonic potential is applied, and the flat bottom from R_{2} to R_{3}.
This form of the restraint allows a range of acceptable distances and is particularly useful for incorporating experimental distance information, such as those from NOE experiments, into a calculation. The flat bottom allows for experimental error in the determined distance. The two outer regions (Figure 6) have a constant gradient, which is useful for avoiding unreasonably large forces if the initial structure is far from the target value.
The RATTLE and SHAKE algorithms effectively remove veryhighfrequency vibrations from consideration during dynamics simulations. Use of these algorithms can allow for a larger time step during simulation.
In CDiscover, the BTCL rattle command is used before the dynamics command to set up constraints in bonds, angles, or water molecules in a molecular dynamics simulation. It can be used to constrain bonds or any atom pairs to userdefined distances. It can be used to constrain angles spanned by two constrained bonds. In addition, it can be used to fix the geometry of water molecules so that the fixedgeometry water models SPC and TIP3P can be used in a simulation. This functionality is available in a limited way via the Calculate/Dynamics parameter block in the Discover_3 module of Insight (click More to display the Rattle toggle).
In CHARMm, SHAKE is used to constrain bond lengths and angles spanned by two constrained bonds during dynamics runs. (However, its use is recommended only for constraining all bonds in which one of the bonded atoms is a hydrogen.) The SHAKE algorithm cannot be used with the NewtonRaphson or ABNR minimizers (see Minimization).
In Cerius^{2}, an angle restraint can be applied to a group of any three atoms. The restraint is implemented such that:
Where: K_{a} is the angle force constant; is the angle between the selected atoms; and _{0} is the desired restrained angle of the selected atoms. _{0} can be defined explicitly or can be automatically extracted from the model as the current angle connecting selected atoms.
In CDiscover the default form of angle restraints is cosine (Eq. 36). Quadratic (Eq. 29) and flatbottomed (Eq. 34) angle restraints can also be used.
Uses
Some uses of torsion restraints are to enforce chiral and prochiral centers, prevent cistrans conversions, and fit NOE Jcoupling constants from NMR experiments. Conversely, other uses are to force torsion rotation in order to perform phi/psi mapping, perform conformational searching, and induce conformational changes.
Functional forms
Several forms of torsion restraints are used in the literature and implemented in MSI's simulation engines.
Harmonic restraints, or periodic restraints (Eq. 42 with n = 1), are appropriate for forcing a torsion angle to a particular value. The periodic form with a periodicity greater than one is useful for restraining a torsion to one of several related angles. For instance, a threefold potential could keep a torsion either trans or at one of the two gauche conformations, depending on the starting conformation and the strength of the potential applied.
Implementation
In Cerius^{2}·OFF, a torsion (dihedral) restraint can be defined among any group of four atoms. The restraint is implemented such that:
Where K_{t} is the torsion force constant; is the angle between the ijk and jkl planes; and _{0} is the desired restrained angle of the selected atoms, which can be defined explicitly or automatically extracted from the model as the current angle connecting selected atoms.
In CDiscover, you can specifically restrain dihedrals to be cis:
or trans:
or either cis or trans:
You can also use the flatbottomed function (Eq. 34) to apply J^{3} dihedral restraints to fit the results of NOE experiments. A plain cosine form (Eq. 36) and a quadratic form (Eq. 29) are also available. The torson involving any four atoms can be restrained.
In FDiscover, the functional forms include a simple harmonic form analogous to Eq. 32 and a piecewise continuous form like Eq. 33 with R interpreted as the angle, rather than the distance. Another form is the periodic function of Eq. 42:
where V gives the strength of the restraint, n is an integer periodicity, and _{0} is the phase angle.
CHARMm uses a harmonic potential to restrict the motion of a dihedral angle to a value close to a reference position or to examine a series of different conformations when making potential energy maps.
Uses
Typical uses include prevention of changes in chirality or prochirality. (A molecule is chiral if no stable conformation of it can be superimposed on its mirror imagemost chiral organic molecules can be described in terms of chiral centers, i.e., an atom that has four distinct substituents. Two chemically identical substituents on an otherwise chiral tetrahedral center are prochiral; in addition, sp^{2} hybridised planar systems with three different substituents are considered prochiral.)
Implementation
In Cerius^{2}·OFF, an inversion (improper torsion or outofplane angle) restraint can be defined among any four atoms i, j, k, l, where i defines the inversion center. The restraint is implemented such that:
Where K_{i} is the force constant for the outofplane; is the angle between the ijl and ikl planes; and _{0} is the desired restrained outofplane angle of the selected atoms, which can be defined explicitly or automatically extracted from the model as the current angle connecting selected atoms. There must be a real atomic center for the inversion.
The CDiscover program can impose a flatbottomed chiral restraint (Eq. 34) to invert the chirality or force it to be R or S.
CDiscover can also impose a cosine (Eq. 36), quadratic (Eq. 29), or flatbottomed (Eq. 34) outofplane restraint.
The FDiscover DSL language can be used to impose chirality and prochirality restraints having the same functional form as Eq. 43, where _{0} is the outofplane angle corresponding to R or S.
The BTCL language of CDiscover allows sophisticated geometric manipulation of molecular and other objects, including constraints and restraints, by means of the geometry, molGeom, restraint and other commands. A subset of this functionality is accessible in the Calculate/Geometric parameter block in the Insight·Discover_3 module.
Why read this section
Periodic boundary conditions refers to the simulation of models consisting of a periodic lattice of identical subunits. By applying periodic boundaries to simulations, the influence, for example, of bulk solvent or crystalline environments can be included, thereby improving the rigor and realism of a model.
Availability
Most methods for controlling the treatment of periodic systems are available via the graphical UIFs:
Discover and CHARMm offer additional functionality when run in standalone mode.
Models are specified in Cartesian space
Some simulation engines accept only Cartesian coordinates, not crystal coordinates (others are able to convert between the two systems). This is important when using asymmetric space groups, since the symmetry operators assume that the input coordinates correspond to the standard asymmetric unit as defined in the International Tables for Crystallography (Reidl 1983).
For Discover, it is assumed that the x Cartesian axis corresponds to the a crystal axis and that the b axis lies in the x,y plane (see Figure 7)
Figure 7
. Relationship between Cartesian coordinate system (xyz) and
periodic system (abc) in Discover and CHARMm

.
For Cerius^{2}·OFF, by default the c lattice vector is parallel to the z Cartesian axis and the b lattice vector lies in the y,z plane (Figure 8).
Figure 8
. Relationship between Cartesian coordinate system (xyz) and
periodic system (abc) in Cerius^{2}·OFF

CHARMm can handle models that are defined in either crystal or Cartesian space. In converting from crystal to Cartesian axes, the a, b, c crystal axes are aligned with the x, y, z Cartesian axes (Figure 7).
Tip
Simulation in bulk solvent
The left side of Figure 9 shows a solute molecule surrounded by enough solvent to occupy the volume (and shape) of a cube. A simulation carried out on this isolated cubic system is a poor approximation of what would happen in a true bulk solvent environment. For example, the solute can diffuse toward a surface or solvent molecules can evaporate. To remedy this, on the right of Figure 9 the cube is replicated in three dimensions to form a 3 X 3 X 3 lattice of identical cubes. This is a much better representation of bulk solvent for the interior cube, because molecules near the surfaces now interact with solvent in adjacent cubes. The imaged atoms are used to calculate energies and forces on the real atoms in the interior cube. The energies and forces on the imaged atoms themselves are not calculated because their motions are computed as symmetry operations on the real atoms, for example, by translations along the cubic axes.
Implications of minimumimage model for calculating nonbond interactions
Consider the implications of this model for a specific case. In Figure 10, molecule A1 is located near an edge of the square. (For simplicity, this discussion focuses on a twodimensional lattice.) In addition, eight images of A1 (A2A9) are present in the adjacent symmetrically related squares. Consider the interactions of molecules A with molecules B. The closest image of B to A1 is actually not B1, but rather B5. If molecules in the interior cell are allowed to interact only with the molecule or molecular image closest to it, this is called a minimumimage model. Each molecule interacts only with those molecules and images within a distance of half the cell size. The advantage of this approach is its simplicity. It is straightforward to compute energy between a given pair of molecules without explicitly keeping track of the images in neighboring cells. All periodic boundary algorithms imply a cutoff criterion, but the minimumimage convention implies a maximum distance for this cutoff of no more than half the cell dimensions.
For a description of the minimumimage convention, see also Allen and Tildesley (1987).
A more general approachghost molecules
Simulation engines (Discover and CHARMm) can also use a more general approach by generating explicit images of the interior molecules, also called ghost molecules, which interact with the interior molecules. These ghost molecules are replicated to as great a distance as necessary (but no farther than necessary) to satisfy the desired potential energy cutoff criteria.
The left side of Figure 11 shows molecule A1 interacting with several images of B (B1, B2, B3, B5) within the specified cutoff radius (shown as a shaded circle centered on A1). A1 interacts with several of its own images as well (A3, A5, A6, A8).
Cutoff distances and nonbond interactions
The right side of Figure 11 shows which molecules in the adjacent unit cells become explicit ghost molecules for a given cutoff distance. Not every molecule in an adjacent cell becomes a ghost. However, if a cutoff distance that is longer than the cell length is used, ghosts from unit cells beyond the nearest neighbor cells may be included. As molecules (effectively, see below) move in and out of the boundaries, the molecules that are ghosts can change. Therefore, the ghost list is regenerated periodically.
Nonbond interactions do not have to be calculated between ghost atoms. This helps to significantly reduce computation time.
When groupbased cutoffs (Charge groups and groupbased cutoffs) are used, the nonbond potential is cut off on the basis of charge groups (i.e., only if two groups are within the cutoff is the interaction calculated), and only those groups in molecular ghosts that are within the cutoff distance of a real group are included in the ghost atom list.
How images and "real" molecules move
Ghost molecules follow their symmetrically related counterparts. However, when it comes time to move the molecules (in a dynamics step or minimization iteration), only the real molecules (A1 and B1) are actually moved according to the accumulated forces each molecule has felt. The ghost molecule positions are simply regenerated by applying the defined symmetry relations to the new positions of the molecules.
Perfect symmetry is maintained between the primary structure and all its image objects. For many applications, this condition is satisfactory. However, it is not possible to study, for example, cooperative changes between image objects.
To maintain all molecules in the central cell, image centering is used. Molecules that happen to migrate to an edge of the primary structure and would appear in one of its image objects instead reappear in the primary structure from the opposite direction. Thus a constant number of atoms is maintained and no molecules are lost, no matter how far they may diffuse during the calculation.
Energies of crystals can be calculated and the lattice parameters a, b, c, , , and can be optimized with Cerius^{2}, CDiscover, and CHARMm:
 Crystal simulations are also available in several Cerius^{2}·OFF Instruments modules. For example, you can use the Crystal Packer module to optimize crystals or calculate their energy and can include minimization of periodic structures in a Mechanical Properties run.
 Because crystal patching is not available in CHARMm, bonds between crystal images are not handled well. Similarly, hydrogen bond interactions described by an explicit hydrogen bond function cannot be used. The only forces that can be calculated between primary and image atoms in crystals are nonbond forces.
Bonds across boundaries
Allowing bonds (with additional energy terms including angles, dihedrals, and improper dihedrals) between the primary atoms and image atoms enables you to study polymers such as DNA or industrial polymers.
Cerius^{2}·OFF, CDiscover, and CHARMm can handle bonds across cell boundaries. (However, CHARMm is best used only for linear polymers, since it does not handle 3D lattices or networks well.)
Electrostatic (Coulombic) and van der Waals interactions together are referred to as nonbond interactions.
Why read this section
Nonbond terms can involve extensive calculation. To avoid a heavy calculation burden, some approximation scheme is often employed. Choosing the best method for your particular model can save computational expense without sacrificing accuracy.
In addition, you have some direct control over the functional terms for nonbond interactions:
Availability
Table 11
. Nonbond methods in MSI's simulation engines
method

type of system

engine

details


atombased (single) cutoffs

periodic, nonperiodic

OFF^{1}, FDiscover, CDiscover, CHARMm

here

groupbased cutoffs

periodic, nonperiodic

FDiscover, FDiscover, CHARMm

here

double cuttoffs

periodic, nonperiodic

FDiscover

here

tail corrections

disordered periodic

CDiscover

here

cellbased cutoffs

periodic

CDiscover

here

cell multipole method

nonperiodic, periodic^{2}

CDiscover

here

Ewald sums

periodic

OFF, CDiscover

here

Most methods for specifying how to treat nonbond interactions are available via the graphical UIFs:
Discover and CHARMm also offer additional functionality when run in standalone mode.
You may use different methods for van der Waals and electrostatic interactions
Typically, both van der Waals and Coulombic interactions are calculated by the same method and (if by the nonbond cutoff method) with the same nonbond list. However, different methods and parameters may be used for van der Waals and Coulombic terms in CDiscover and (except for some operations) in Cerius^{2}·OFF and CHARMm. This allows you, for instance, to use a large cutoff for electrostatic interactions and a smaller cutoff for van der Waals interactions.
The van der Waals interaction potential is relatively short range and dies out as 1 \xda r^{6}. By 810 Å, the energy and forces are quite small. Thus, using cutoffs to bring the van der Waals potential to zero at about 10 Å can be a reasonable approximation. The Coulombic interactions, on the other hand, die off as 1 \xda r, so even at considerable distances the energy of interaction is not negligible. But this depends on the model: except for a few formally charged groups, most molecules are composed of neutral fragments with dipoles and quadrupoles. Thus, in most models the major component of the electrostatic interaction between molecules or parts of molecules is a dipoledipole interaction, which falls off as 1 \xda r^{3}.
Note
Automatic exclusions
Van der Waals and Coulombic interactions are ordinarily calculated between all atom pairs that are not specifically excluded. Most forcefields exclude nonbond terms for atoms connected by bonds (12 interactions) and valence angles (13). Some forcefields also exclude nonbond terms between end atoms in torsion (14) interactions. These interactions are illustrated Figure 12.
Figure 12
, Types of interactions usually excluded from nonbond
calculations

14 interactions and AMBER
If 14 nonbond interactions in torsions are included in the nonbond list, they may be scaled. For example, with the AMBER forcefield (as implemented in both Cerius^{2}·OFF and Discover) these nonbond interactions must be scaled by 0.50.
Scaling by 0.5 occurs by default with Cerius^{2} when AMBER is loaded.
In the Insight·Discover module, you need to toggle the p1_4 parameter in the Parameters/Scale_Terms parameter block on and enter 0.5 in the p14 parameter box,
The standalone version of the FDiscover program handles this scaling with the following DSL command:
> scale 14 by 0.5
Important
The equivalent BTCL command for the CDiscover program is forcefield scale with the vdw_1_4 keyword.
van der Waals radius combination rules
Any van der Waals interaction parameters that are actually defined for heterogenous atom pairs are called offdiagonal parameters. Offdiagonal parameters that are not available for such atom pairs are calculated by averaging those for each of the two atom types, using a geometric, arithmetic, or (in CDiscover and Cerius·OFF) 6thpower combination rule:
Availability
In Discover and Cerius^{2}·OFF, a choice of combination rule is available and is specified in the forcefield file (see the File Formats documentation).
Quality of results
The arithmetic mean gives marginally better equilibrium distances for van der Waals interactions than the geometric combination rule (Halgren 1992). The 6thpower rule (not available with all forcefields) yields even better results (Waldman and Hagler 1993).
van der Waals combination rules and Ewald sums
With the Ewald method (Karasawa & Goddard 1989) (Ewald sums for periodic systems), the geometric mean leads to faster convergence than the arithmetic mean.
In addition, because the Ewald sum calculation proceeds much faster when only diagonal parameters are used, the Cerius^{2}·OFF Van Der Waals Preferences control panel includes an option to ignore offdiagonals even when they are present (they are not present in any of the Discover forcefields).
Role of the dielectric constant in modeling
The electrostatic potential is computed from the partial atomic charges associated with the model (Assigning charges). Approximate solventscreening effects can be included by specifying a nondefault value for the dielectric constant if it is explicitly included in the forcefield. (The "dielectric constant" used in modeling is not the dielectric constant that most experimental chemists would think ofit is instead an empirical, dimensionless scaling factor.)
The dielectric constant reflects the polarizability of the solvent molecules. A polarizable solvent such as water has a greater dielectric constant than less polar liquids. Electrostatic interactions in polarizable solvents with high dielectric constants are greatly attenuated. In closely packed molecules, however, there are fewer solvent molecules to screen the charge interactions.
A relatively large dielectric constant can be used for simulating the aqueous environment of small systems. However, many calculations on models use a smaller dielectric constant. For example, a dielectric constant between 2.0 and 10.0 has been used for simulations in the interior of a protein. A typical value for water would be around 4.
Additional information
For a helpful review, please see Harvey (1989). A tutorial on dielectric constants in forcefields can be found at MSI's website:
http://www.msi.com/support/insight/insight/dielectric.html
A distancedependent dielectric "constant"
The dielectric constant can be kept constant, or the Coulombic term can be made a shielded function, where the dielectric "constant" is a function of distance (r ·). This is useful for electrostatic interactions in closely packed molecules, where the number of solvent molecules between two interacting charges is usually fewer than in bulk solvent. A distancedependent dielectric constant is also useful for models in which explicit solvent molecules are not included.
The distancedependent dielectric function (also called a shielded dielectric) is generally used with the Dreiding and AMBER forcefields and may be used with others.
A shielded Coulombic term is faster to calculate than a nonshielded term because no square root has to be evaluated.
Availability
The dielectric constant can be changed and/or made distance dependent in any of MSI's simulation engines.
In Cerius^{2}·FFE, the Coulombic control panel allows you to choose the form of the Coulombic termthe term can be distancedependent, not distancedependent, or corrected by an erfc term (see Glass forcefield).
Special considerations for the AMBER forcefield
With the AMBER forcefield, in most applications a distancedependent dielectric ( = f (r)) should be used.
In Cerius^{2}·OFF, the Epsilon value is 1.0 unless you change it in the Coulomb Preferences control panel (which is accessed by selecting the Energy Terms/Coulomb card menu item).
In the Insight·Discover_3 module, in the Specify/Nonbonds parameter block you would click More, then set Dielectric to Dist_Dependent and enter 4.0 for the Dielectric Value.
The equivalent BTCL command for the CDiscover program is forcefield with the distance_dep keyword set to true and dielect set to 4.0.
In the Insight·Discover module, in the Parameters/Set parameter block you would enter 4.0 in the Dielectric parameter box and make sure the Dist_Dependent parameter is toggled on.
The standalone version of the FDiscover program handles this with the following DSL command:
> set dielectric = 4.0*r
Nonbond cutoffs
Why read this section
An energy expression such as Eq. 8, which is representative of current forcefields, is computationally tractable only for systems with relatively small numbers of atoms. The number of internal coordinates grows linearly with the size of a model, so the computational work involved in the first nine terms in Eq. 8 also grows linearly.
However, inspection of the final summation, which represents the nonbond interactions, reveals a quadratic dependence on the number of atoms in the system: If the system of interest has 1000 atoms, the nonbond summation has about 500,000 terms. If it has 10,000 atoms, the summation has 50,000,000!
Therefore, it is common to neglect or approximate the nonbond interactions for widely separated pairs of atoms.
Choosing how to treat longrange nonbond interactions is an important factor in determining the accuracy and the calculation time of an energy evaluation.
Several cutoff methods are discussed below, and a review was published by Brooks et al. (1985b). More recently, two other methodscell multipoles (Cell multipole method) and Ewald sums (Ewald sums for periodic systems)have also become available. You should read all these sections to decide which method is best for your model and computational problem.
1. However, the method and/or specifications used for van der Waals interactions may differ from those used for Coulombic interactions (see You may use different methods for van der Waals and electrostatic interactions).
Effect of nonbond cutoff distance on calculation of nonbond interactions
To appreciate the impact of cutoffs on computational efficiency, consider a receptorligandsolvent system with 5000 total atoms. An example would be a small protein (100150 residues) surrounded by 12 layers of water.
Figure 13 shows how the number of nonbond interactions increases with the cutoff distance. This calculation would run at least 10 times faster with an 8.0 Å cutoff than with no cutoff (assuming that the nonbond term is rate limiting, which it usually is). The tradeoff is, of course, that interactions beyond the cutoff distance are not accounted for.
The significance of nonbond interactions beyond the cutoff distance depends on the system being simulated. In modeling an isolated molecule or cluster, the use of cutoffs for the van der Waals interactions is quite reasonable. The potential is relatively short range and dies out as 1/r^{6}. Consequently, by 810 Å, both the energy and forces are quite small.
In contrast, the situation is slightly different in modeling disordered or ordered crystalline systems. For a typical disordered system, which might consist of a cube of organic material with ~25 Å edges, contributions of the van der Waals interactions at distances greater than 810 Å to the energy and pressure can amount to ~50200 kcal mol^{1} and ~5001000 bar (0.050.10 GPa), respectively, while contributions of electrostatic interactions are much smaller. Also, contributions of remote nonbond interactions of all types to the resultant force on atoms is small. Fortunately, it is possible in such systems to apply tail corrections, which permit the use of 810 Å cutoffs while simultaneously yielding accurate values of energy and pressure.
Finally, in periodic crystalline systems, both van der Waals interactions and electrostatic interactions can be significant up to 15 Å or more. For example, in a calculation of the energy as a function of cutoff distance in the [AlaProDPhe]_{2} crystal, Kitson and Hagler showed that the nonbond energy accounted for changes from 63% to 97% of the asymptotic value as the cutoff distance was increased from 8 to 15 Å (Kitson and Hagler 1988).
Figure 14 shows how the van der Waals component of the nonbond energy varies as a function of cutoff distance for an [AlaProDPhe]_{2} crystal. The van der Waals energy changes by 40% as the cutoff distance is increased from 8 to 15 Å. The exact dependence of the energy on the cutoff distance depends on the system itself and should be calibrated for each new system.
Direct cutoff not recommended
MSI applications make several methods available for calculation of longrange nonbond interactions. Cerius^{2}·OFF offers (among others) the direct method, which is straightforward and can be applied to nonperiodic and periodic models. Nonbond interactions are simply calculated to a cutoff distance and interactions beyond this distance are ignored.
However, the direct method can lead to discontinuities in the energy and its derivatives. As an atom pair distance moves in and out of the cutoff range between calculation steps, the energy jumps, since the nonbond energy for that atom pair is included in one step and excluded from the next. (For small models you may, of course, calculate all nonbond interactions by setting a large enough cutoff distance and using the direct method.)
Minimizing discontinuities in the potential energy surface
To avoid the discontinuities caused by direct cutoffs, most simulation engines use some kind of switching function (Figure 15) to smoothly turn off nonbond interactions over a range of distances. (The variable names and definitions differ among MSI simulation engines, as illustrated in the figure).
Implementation in Cerius^{2}·OFF
Cerius^{2}·OFF allows you to use a cubic spline switching method, by which the energy is multiplied by a spline function. The interaction cutoff is defined by two parameters: the splineon and the splineoff distances (Figure 15). Within the splineon/splineoff range, the nonbond interaction energy is attenuated by the spline function. Beyond the splineoff distance, nonbond interactions are ignored. The current defaults in the Open Force Field module set a narrow on/off range that results in fast calculations. Using a broader spline range gives more accurate results, but slower calculation.
Implementation in Discover
In the Discover program the switching function is defined by two variables: the point where the function reaches zero and the range over which the function decreases from one to zero (Figure 15).
The Discover program uses a fifthorder polynomial for the switching function. It is formulated so that the first and second derivatives are zero at both the inner and outer ends of the switching region. Thus, the interaction energy and its first and second derivatives are continuous, although higher derivatives are not.
Implementation in CHARMm
CHARMm and Cerius^{2}·MMFF offer two types of switching functions, which its documentation refers to as a switching function (not the same as the Discover switching function) and a shifted potential.
Two variables corresponding to the ends of the switching regions (r_{on} and r_{off}, Figure 15) are required to define the switching function. Depending on the value of r_{ij}, the following values are used to multiply individual electrostatic or van der Waals energy terms:
However, energy can be significant at the cutoff distance, which can result in artificially large forces at long ranges. This is especially true for relatively short (that is, less than 12 Å) cutoff distances and small ranges (that is, when r_{off}  r_{on} < 3 Å).
The shifted potential modifies the radial function so that energy and forces go to zero at some cutoff distance (Eq. 49). The individual electrostatic and van der Waals terms in the energy function are simply multiplied by this term:
One disadvantage to the CHARMm functions is a discontinuity in the second derivatives at the cutoff distance.
To maximize the efficiency of nonbond calculations, Cerius^{2}·OFF, Discover, and CHARMm create a neighbor list that contains all pair interactions to be considered during calculation of the nonbond interactions. Atom pairs are not included in the list if they are too far apart or if they are excluded (Automatic exclusions).
Advantages of a neighbor list
Neighbor list generation was chosen over other approaches using cutoffs for computational efficiency:
The buffer region
Although a neighbor list requires time to set up, the net result is time saving for models containing more than about 50 atoms, because the list is not recalculated each time the energy expression is evaluated. Since the list is not updated at every step, it includes atoms in a buffer region (the distance between the two rightmost lines in Figure 15) that might move close enough together to contribute to the energy calculation before the next update of the neighbor list.
Updating the neighbor list
To ensure that no atoms outside the buffer region can move close enough to interact during an energy minimization or molecular dynamics simulation, the nonbond list is automatically updated whenever any atom moves more than onehalf the buffer width. Thus, the width of the buffer region, coupled with the velocity with which atoms move, determines the maximum amount of time before the neighbor list is updated.
Dipoles must not be split by cutoff distances
To understand the implications of the generalization that the strongest electrostatic interactions in many molecules are due to dipoles rather than fully charged groups (see You may use different methods for van der Waals and electrostatic interactions), note that the interaction energy for two monopoles, each of one e.u. of charge, is about 33 kcal mol^{1} at 10 Å, while that for two dipoles formed from unit monopoles is no more than about 0.3 kcal mol^{1}. It is clear that ignoring monopolemonopole interactions would give grossly misleading results, whereas ignoring dipoledipole interactions would be only a modest approximation.
If nonbond cutoffs were applied to such a model on an atombyatom basis, this could generate spurious monopoles by artificially splitting dipoles (by having one of a dipole's atoms inside the cutoff and one outside). Instead of ignoring a relatively small dipoledipole interaction, this would artificially introduce a large monopolemonopole interaction. To avoid these artifacts, the Discover and CHARMm simulation engines can apply cutoffs over charge groups. (In CHARMm in PSF mode, every residue is a charge group.)
Functional groups and charge groups
A charge group is a small group of atoms close to one another which has a net charge of zero or almost zero. Often, charge groups are identical to common chemical functional groups. Thus, a carbonyl group, methyl group, or carboxylic group would be an approximately neutral charge group.
Implementation in Discoverswitching atoms
The Discover program designates one atom from each charge group as the switching atom and generates the neighbor list by considering the distance between the switching atoms of two charge groups. If the distance is less than the cutoff distance, then the pairwise interactions between all atoms in the two groups are included. If the distance is greater than the cutoff, they are all excluded. Similarly, when calculating the actual interaction energy, the Discover program switches off the interactions between all atomatom pairs in the two charge groups based only on the distance between the two switching atoms. This procedure prevents artifactual splitting of dipoles.
Implementation in CHARMm
If groupbased cutoffs are used in CHARMm, the neighbor list is stored in terms of group pairs.
Charge group size and the cutoff distance
The size of a charge group, as defined by the greatest distance from the switching atom to another atom in the same group, must be significantly smaller than the cutoff distances. Otherwise, an interaction between two atoms close to each other might be ignored because the switching atoms of the two groups are farther apart than the cutoffs. Typical groups are no more than 13 Å large, so cutoffs larger than 78 Å are reasonable. However, some models contain considerably larger groups.
The Discover program checks the size of the groups against the cutoff distances, then outputs an error message and terminates if the cutoffs are too short relative to the group size. If this happens, you must either increase the cutoffs or define smaller groups.
Charge group neutrality...
The Discover program also warns you about significantly nonneutral groups. Some can be expected if the model contains formally charged functional groups, such as protonated amines and carboxylates. However, other nonneutral groups usually indicate an error in group definitions.
... and defining or checking charge groups
In Cerius^{2}·Discover, you can specify the tolerance with which neutrality is defined when you ask Discover to perform charge grouping.
In Insight, charge groups and switching atoms are defined, edited, and checked with the Forcefield/Groups parameter block, which is found in the Builder, Biopolymer, and other modules. Potentials and charges for the atoms must be fixed or accepted before defining charge groups.
In CHARMm, you can edit charge groups in the RTF files with any text editor. In PSF mode, every residue is a charge group.
The FDiscover program also incorporates an improvement over a single cutoff distance called double cutoffs, or, as it is sometimes called in dynamics, multiple timesteps. The nonbond interaction potential at a distance is a smooth function that does not vary rapidly.
With double cutoffs, two cutoff distancesan inner and outer oneare assigned. The two distances define an inner spherical region and an outer shell around a given atom.
Whenever the neighbor list is updated, interactions are calculated in exactly the same way as in the single cutoff scheme, but considering the outer cutoff as the cutoff. Then the resulting interaction energy is partitioned into contributions from atoms within the inner cutoff and atoms within a spherical shell from the inner to the outer cutoff. The contribution from the shell is treated as a constant in subsequent molecular dynamics steps, until the next neighbor list update occurs. The basic idea behind double cutoffs is to calculate the shell contribution only when the neighbor list needs to be updated, while calculating inner cutoff contributions at every step.
These double cutoffs make the calculation less expensive by allowing smaller values for the inner cutoff than could normally be used with a single cutoff. Accuracy is regained at minimal cost by using a large distance for the outer cutoff.
Discontinuities in the potential energy surface
It is important to realize that the effective potential energy surface is not quite continuous when double cutoffs are used. The magnitude of the discontinuities depends on the cutoff distances and the system that is being studied.
These discontinuities are only a minor problem for dynamics, where they are manifested as small fluctuations in the total energy.
Their effect during minimization depends on the minimizer that is used, because some minimization algorithms, such as conjugate gradient, are quite sensitive to discontinuities in the surface. Other algorithms, such as steepest descents, are relatively robust.
Longrange van der Waals interactions in disordered periodic systems
For disordered periodic systems, contributions to the potential energy and pressure from van der Waals interactions outside the cutoff can be written as:
where N_{i} and _{i} denote the number and number density of atoms of type i, U_{} (r) denotes the van der Waals nonbond potential describing interactions between atoms of type and , and g_{} (r) denotes the pair correlation function describing the probability of finding and at separation r relative to the probability of finding the pair at an infinite distance (McQuarrie, 1976, Chapter 13).
Except in rare cases, the function g_{} (r) is short range, reaching its limiting value of unity at distances of ~10 Å. Moreover, g_{} (r)  1.0 is small even at shorter distances. In consequence, accurate estimates of the tail corrections for all normal nonbond cutoff values may be safely made by setting all g_{} (r) = 1.0 in Eq. 50 and Eq. 51.
Computational costs
Note also that applying Eqs. 50 and 51 at each step in a simulation contributes negligibly to the overall simulation cost, since for constantvolume simulations the full correction may be precomputed, and in simulations where the volume fluctuates it is necessary only to recompute the volume at each step.
CDiscover allows cellbased cutoffs for periodic systems. This is another imagebased method, in which the neighbor list is based on a specified number of cell layers surrounding the central cell.
Mor rigorous, controllable, and efficient than cutoffs
The cell multipole method (CMM, available only in CDiscover) provides a treatment of the nonbond interactions for both nonperiodic and periodic systems that is more rigorous and efficient than cutoffs. This method (Greengard and Rokhlin 1987, Schmidt and Lee 1991, Ding et al. 1992) is a hierarchical approach that allows the accuracy of the nonbond calculation to be controlled. Shortrange interactions are treated in the usual way, but longrange groupgroup interactions are treated in terms of multipoles. Computational time scales as N (the number of atoms).
The cell multipole method applies to the general energy expression of the following form:
where _{i} is the potential at atom i, R_{ij} is the distance between atom i and atom j, p is a number (p = 1 for Coulombic and 6 for London dispersion interactions, for example), and the 's are general charges. For Coulombic interactions, the 's are real charges.
Near and farfield potentials
The general potential _{i} may be divided into a nearfield potential due to the surrounding atoms (those within a few angstroms) and a farfield potential due to the rest of the atoms that interact with the ith atom.
The number of interactions in the near field is limited, so it is relatively easy to calculate the nearfield potential exactly. The number of interactions in the far field is of order N^{ 2}, making an exact calculation of this potential intractable for large models. The cell multipole method calculates the farfield potential accurately and efficiently in the following manner.
Derivation of cell multipole method
We begin by placing an arbitrarily shaped molecule in a rectangular box. The box is then cubed into a number of basic cells of length 46 Å and containing 24 atoms on average. The basic cell level is denoted level A in Figure 16. Starting from a corner of the box, every eight basic cells may be considered to constitute a larger, parent cell, termed level B. Every eight parent cells may constitute a grandparent cell, termed level C. This procedure is repeated until only a few large cells fill the box. For example, considering any atom in cell A_{0} of the threelevel cell system (Figure 16) the other atoms in A_{0} and all atoms in A_{n} contribute to the nearfield potential, and the atoms in A_{f}, B, and C contribute to the farfield potential.
Key steps used in cell multipole method
The cell multipole method involves the following two key steps:
 1. Multipole expansion and calculation of general multipole moments.
 The potential associated with each basic cell can be represented as a general potential originating at the center of the cell. This potential may be expanded into an infinite series of multipole moments. For example, the potential associated with cell A_{f} in Figure 16 centered at r_{A}_{f}, is expressed as:
 where R = r_{A}_{f}  r; r is any point outside cell A_{f}; , = x, y, z; and Z, D, and Q are monopoles, dipoles, and quadrupoles, respectively.
 The potentials associated with the higherlevel cells can be expanded in an analogous manner, with moments derived from lowerlevel cell moments.
 2. Generation of Taylor coefficients.
 Using this expansion to represent the potentials associated with A_{f}, B, and Clevel cells, the farfield potential of cell A_{0} may be obtained by summing all the farcell contributions. The resulting potential may now be expanded as a Taylor series about the center of cell A_{0}:
 where r_{A}_{0} is the position vector of the center of cell A_{0} and r_{} = r_{}  r_{A}_{0}. The Taylor coefficients in Eq. 54 are due to all the farcell contributions.
A key point of the cell multipole method is that, once the set of Taylor coefficients is calculated at r_{A}_{0}, the farfield potential of any atom in cell A_{0} is obtained easily through Eq. 54.
Since the Taylor coefficients must be generated for every basic cell, another key point of the cell multipole method is efficient generation of these coefficients. A hierarchical procedure is used, in which coefficients determined for higherlevel cells are propagated to the coefficients for lowerlevel cells. Thus, coefficients for a child B cell are obtained by adding contributions directly translated from the Clevel coefficients at the center of the parent C cell to the coefficients at the center of B, generated by considering only the Bcell contributions.
Improved computational performance and accuracy
The cell multipole method is an orderN method (Greengard and Rokhlin 1987, Schmidt and Lee 1991, Ding et al. 1992). The time savings with respect to an exact N^{ 2} algorithm, as well as the improved accuracy relative to using cutoffs, can be dramatic. Table 12 shows results from several calculations on hemoglobin. When the conventional method with 9.5Å cutoffs is used, the computational and setup times are greatly reduced, but at the cost of a disturbingly large error (over 1% of the correct energies). The last 6 lines of the table show results for second, third, and fourthorder multipole expansions at two levels of computational accuracy. The shortrange treatment becomes progressively better towards the bottom of the table. However, the overall CPU time increases. It is practical to achieve essentially exact results (within a fraction of a kcal mol^{1}) in reasonable times.
For systems larger than hemoglobin, the improvement in performance can be even more dramatic. For a system ten times larger, the cell multipole method would take 310 minutes for the energy evaluation, depending on the accuracy desired. The exact N^{ 2} calculation, in contrast, would take about three days!
Nonbond interaction energies
Due to the nature of the cell multipole method, specific nonbond interaction energies cannot be calculated unless you use the ESFF forcefield. When this method is used with other forcefields, the peratom energy is calculated by using the cell multipole method, and the nonbond interaction energy is calculated using the groupbased method. You can specify cutoffs for the groupbased method of nonbond analysis. A large cutoff in the groupbased method may give reasonably accurate energies compared with the cell multipole method.
Nonbond energies of periodic systems
Mainly for crystals
The Ewald technique (Tosi 1964, Ewald 1921) is a method for computation of nonbond energies of periodic systems. Crystalline solids are the most appropriate candidates for Ewald summation, partly because the error associated with using cutoffs (Nonbond cutoffs) is much greater in an infinite system. The technique can also be applied to amorphous solids and solutions.
Ewald method compared with cutoffbased methodsCoulombic energy...
Figure 17 shows the electrostatic energy for quartz as computed by various techniques. One would feel that all the techniques should converge to the same value at high cutoff distances. However, the direct atombased cutoffs approach yields results that fluctuate wildly as the cutoff increases, even for rather large cutoffs. The problem is that the sum is only conditionally convergent. As the cutoff increases, charges of opposite sign are taken into account and the partial sum is modified significantly. Worse, reordering the terms of a conditionally convergent series can yield arbitrary results. The problem then is to find physically and chemically meaningful orderings of the series.
The cellbased (Cellbased cutoffs) and groupbased (Charge groups and groupbased cutoffs) cutoff techniques are natural candidates. However, they yield somewhat different values (Figure 17), due to the different cutoff conventions employed. The groupbased technique computes the result for a sphere, but the cellbased technique computes the result for a parallelepiped that preserves the shape of the unit cell.
A standard Ewald calculation that does not take the dipole moment of the unit cell into account yields yet another value. An Ewald calculation that includes the effect of the dipole moment agrees with the groupbased calculation (Figure 17).
...and van der Waals energy
For van der Waals energy, the energy sum is absolutely convergent, and no chaotic behavior arises from the direct approach. Even so, as Figure 18 indicates, the convergence of the dispersive energy is slower than might be expected. Even with a cutoff distance of 30 Å, the error is a significant fraction of 0.1 kcal mol^{1}. (The Ewald calculation is less costly for comparable accuracy.) The repulsive energy, on the other hand, converges at a cutoff distance of only 15 Å and needs no special treatment. (Atombased calculations for much larger systems, however, show that sometimes even the repulsive energy can exhibit a surprisingly high error at a cutoff of 12 Å.)
For full details on the Ewald summation method and parameter optimization procedure used in MSI's simulation engines, please refer to Karasawa and Goddard (1989).
The Ewald approach to improving convergence is to multiply a general lattice sum:
by a convergence function (r), which decreases rapidly with r. Of course, to preserve equality, one must then add a term equal to the product of 1  (r) with the lattice sum:
Here, the first term converges quickly, because _{m}(r) decreases rapidly. Ewald's insight was that the second term can be Fourier transformed to provide a rapidly converging sum over the reciprocal lattice. The sum over L in Eq. 56 runs over all lattice vectors, but the i = j terms must be omitted when L = 0.
The convergence functions
The convergence functions are, for the electrostatic energy:
and for the dispersive energy:
Optimizing computational effort
The electrostatic convergence function _{1} was also used by Catlow and Norgett (1976) and Karasawa and Goddard (1989). The dispersive convergence function _{6} was recommended and used by Karasawa and Goddard. The convergence parameter plays a similar role in both cases. As increases, the realspace sum converges more rapidly and the reciprocal space sum converges more slowly. (That is, a large implies a heavy computational load for reciprocal space, and a small implies a heavy computational load for real space.) Cutoffs must be adjusted accordingly, and processing time is affected by the cutoffs. A value of that balances processing in the real and reciprocal spaces proves to be optimal. The same value of can be used for both the dispersive and electrostatic energy, and thus they can be combined for greater efficiency.
Implementation in Discover
The Discover program automatically chooses so as to balance the computational loads for real and reciprocal space.
Implementation in Cerius^{2}· OFF
Cerius^{2}·OFF instead uses the inverse of as input, that is, the ratio of the time required for a realspace calculation to the time for a reciprocalspace calculation. The value chosen for the time ratio does not affect the accuracy of the calculation, only the time taken to perform it. Realspace calculations typically take longer than reciprocal ones, so the value of the time ratio is usually greater than 1.
Electrostatic energy
The Ewald expression for the electrostatic energy is (dropping a factor of 1/4_{0}):
where a = r_{i}  r_{j}  R_{L}; r_{i} = Hs_{i}; h = 2(H^{T})^{1}n (reciprocal lattice vectors); = det(H) = cell volume; and b = h \xda 2.
In Eq. 59, the first term corresponds to the realspace sum, the contribution (L = 0, i = j) being omitted. The second term in Eq. 59 corresponds to the reciprocalspace sum. For electrostatic interactions, the double sum over i and j is reduced to a single sum. This provides a substantial performance improvement (N^{1.5} instead of N^{2.5}, where N is the number of atoms per unit cell). A similar reduction of the reciprocalspace sum occurs for the dispersive energy only if the geometric combination rule () holds. The third term in Eq. 59 arises from the selfenergy of the charge distributions produced by the introduction of the convergence function. The final term of Eq. 59 is zero if the unit cell is charge neutral, which is normally the case. Indeed, the electrostatic energy in an infinite system of nonneutral cells is formally infinite. However, some applications can involve the use of nonneutral cells. For example, Cation Locator adds cations to the system one by one, and neutrality is not achieved until the final cation is added. The effect of the final term in the equation is to give convergence to a finite value of the energy, which corresponds to a system where the excess charge is neutralized by a compensating uniform background charge density.
The Ewald sum as it appears in Eq. 59, with no h = 0 term, strictly represents an infinite crystal. A real macroscopic but finite crystal also includes surface contributions, which can be substantial (Deem et al. 1990) and which depend on the dipole moment of the unit cell and the shape of the crystal. However, in a real environment, physical effects such as surface reconstruction and dielectric effects in the surrounding medium serve, in general, to diminish the surface charge. The Cerius^{2}·OFF and CDiscover programs therefore omit such terms, corresponding to the socalled "tinfoil" boundary conditions.
You choose the accuracy before beginning calculation
The Ewald method allows you to select, before running the calculation, a level of accuracy for the calculation. (Estimation of the cutoff and convergence constants is difficult, so a facility to automatically calculate these parameters to a certain accuracy (Karasawa & Goddard 1989) is provided instead.)
Depending on the system, an Ewald calculation with accuracy = 1 e^{4} can be comparable in performance to an atombased calculation with a large cutoff (19 Å) over the range that has been tested. In addition, the Ewald results are significantly more accurate.
Computation time vs. accuracy and model size
Ewald processing time grows as N^{1.5}, where N is the number of atoms in the unit cell. Increases in accuracy do not require unreasonable increases in the Ewald lattice cutoff. For acetic acid, for example, increasing the accuracy by 2 orders of magnitude (from 1 e^{2} to 1 e^{4}), with constant repulsive cutoff, increased processing time only about 1.5 fold (from 22.08 to 35.34 seconds) and increased the Ewald lattice cutoff less than 20% (from 11.7 to 13.4 Å).
Troubleshooting
Although the default Ewald accuracy is acceptable for most singlepoint energy calculations, tighter accuracy may be required for some minimization and dynamics runs, to assure acceptable gradient accuracy. A value as low as 0.00025 may be preferable if your minimization run fails to converge or your dynamics run misbehaves.
Nonbond interaction energies
Due to the nature of the Ewald sum method, specific nonbond interaction energies cannot be calculated. When this method is used, the peratom energy is calculated by using the Ewald sum method, and the nonbond interaction energy is calculated using the groupbased method. You can specify cutoffs for the groupbased method of nonbond analysis. A large cutoff in the groupbased method may give reasonably accurate energies compared with the Ewald sum method.
Only for Coulombic terms
If the model has 2D periodicity, an Ewald sum may be applied to the Coulombic terms, using the method of F. Harris (communication). A nonEwald method must be used for the van der Waals terms in this case, and a large cutoff may be required to obtain good accuracy.
Slow but accurate
The method is similar in principle to the 3D Ewald sum, but more complex in that one direction must be treated nonperiodically. To avoid loss of accuracy, no cutoff is applied in the nonperiodic direction. However, the method can still be optimized, since fewer terms are required in the periodic sums for those atom pairs having a large separation in the nonperiodic direction. Although the method is slower than a nonEwald method (or a 3D Ewald sum), it provides very good accuracy.
Activated automatically
The method is available in Cerius^{2}·OFF and is activated automatically if you select the Ewald method for Coulomb terms when the current model has 2D periodicity.
Last updated September 09, 1998 at 09:35AM Pacific Daylight Time.
Copyright © 1997, 1998 Molecular Simulations, Inc. All rights
reserved.