Semi-empirical Calculations with xTB
This HQS Molecules
module contains a Python interface to perform calculations with the xTB software package.
The xTB (extended tight binding) program package is developed by the research group of Prof. Stefan Grimme at the University of Bonn. It contains implementations of the semi-empirical quantum-mechanical methods GFNn-xTB (n = 0–2) as well as of the GFN-FF force field ("GFN" stands for Geometries, Frequencies, and Non-covalent interactions, for which it has been parametrized). HQS Molecules
interfaces the following features, permitting them to be used conveniently from within Python workflows:
- Single-point energy calculations,
- Structure optimizations,
- Gradient calculations,
- Frequency calculations,
- Stable optimizations including Hessian calculations, and the
- Biased Hessian.
Single-point energy calculations
The energy of any meaningful set of atomic coordinates with a specific charge and multiplicity
can be calculated using the single_point
function. Calculations are performed with the GFN2-xTB method by default, or alternatively with the GFN-FF method.
Input
The first mandatory argument of the function is a Molecule
object, as the charge and multiplicity need to be known to perform a calculation.
Note: the functions for reading an XYZ file return MolecularGeometry
objects, which contain neither charge nor spin information. A MolecularGeometry
objects can be easily converted to a Molecule
object by providing information on the charge and, optionally, on the multiplicity, as described in the XYZ File Handling chapter. On the other hand, the 2D-to-3D structure conversion functions return Molecule
objects with charge but without spin information.
Keyword arguments for single-point energy calculations
Calculation settings can be adjusted with the following keyword arguments:
force_field
: Use GFN-FF force-field (defaults toFalse
, in which case the GFN2-xTB method is used).solvent
: By default, calculations are performed in vacuum. If the name of a solvent is provided, the implicit solvation model ALPB (analytical linearized Poisson-Boltzmann) will be used in the calculation.create_tmpdir
: Run the calculation in a temporary folder that is deleted with completion of the calculation (the default is set toTrue
).
Output
The single_point
function returns a float with the single-point energy in Hartree.
If an implicit solvation model is used, the solvation contribution ("Gsolv" in the xTB output) is
part of the single-point energy.
Example
An example is provided here:
from hqs_molecules.xtb import single_point
energy = single_point(molecule, force_field=False)
Solvents
HQS Molecules
supports using the ALPB implicit solvation model implemented in the xTB code. By default, calculations are performed in vacuum. Parameters of specific solvents can be selected by providing the solvent name as a string to the solvent
keyword argument of the various wrapper functions, such as solvent="Water"
.
The following loop prints all solvent names that have been defined in the interface:
from hqs_molecules import xtb
for name, keyword in xtb.SOLVENT_KEYWORDS.items():
print(f"{name:20}: {keyword}")
It is possible to specify keys or values of the xtb.SOLVENT_KEYWORDS
dictionary, so that
"Octanol (wet)"
or "woctanol"
are equally valid. In addition, the input is treated
case-insensitively.
For more information on defined solvents, please check the section in the xTB manual.
Example on specifying a solvent in a single-point energy calculation:
from hqs_molecules.xtb import single_point
energy = single_point(molecule, solvent="Water", force_field=False)
Structure optimization
xTB can optimize atomic positions in a molecular system to find a nearby local minimum of the
total energy representing an equilibrium structure.
This feature is embedded into HQS Molecules
through the optimization
function.
Keyword arguments for geometry optimizations
In addition to the arguments for single-point energy calculations, the following options are available:
-
opt_level
with the possible optimization levels as defined by xTB. In this wrapper, the levels are defined in theXTBOptLevel
class. By default, the very tight criterion is employed (XTBOptLevel.VTight
), whereas the xTB software itself usesXTBOptLevel.Normal
. For the definition of the options, ordered here from low to high accuracy, please refer to the xTB documentation:- Crude
- Sloppy
- Loose
- Lax
- Normal
- Tight
- VTight
- Extreme
-
max_cycles
is the maximum number of optimization iterations. By default, it is set toNone
, which means thatmax_cycles
is automatically determined by the xTB software depending on the size of the molecule (check above link).
Output
Calls to the optimization
function return a Trajectory
object containing the optimization trajectory (Trajectory.structures
) as a list of Molecule
objects and the energies (in Hartree) for each step (Trajectory.energies
) as a list of floats.
Most of the time, the quantities of interest are the optimized geometry and its energy. To access these, a Trajectory
object contains two types of attributes: Trajectory.last
and Trajectory.last_energy
contain the last geometry and its energy, respectively; Trajectory.lowest_energy
and Trajectory.lowest
contain the lowest energy and the corresponding geometry. Trajectory.lowest_step
returns the index of the energetically lowest geometry. In many cases, the last step may also be the one with the lowest energy.
Note that the energy might increase slightly at the end of an optimization (within the given tolerance criteria) or it can converge to saddle point, in which case the lowest-energy geometry can be the anywhere along the trajectory, even at its beginning.
The trajectory can be written to an XYZ file using the write
or write_with_info
(includes energies) functions of the xyz
module.
Example
The following box illustrates some general ideas behind executing a geometry optimization, starting from a Molecule
object.
from hqs_molecules import xyz
from hqs_molecules.xtb import optimization
# perform optimization
opt_data = optimization(molecule, solvent="water")
# Write the trajectory into an XYZ file
xyz.write_with_info(file="trajectory.xyz", mol=opt_data)
# Optimized structure: here it is assumed to be in the last step.
opt_molecule = opt_data.last # Molecule object
opt_energy = opt_data.last_energy # float
Gradient calculation
The nuclear gradient (first derivative of the energy with respect to the atomic coordinates) is calculated at each step of an optimization procedure. An optimization is converged if the energy no longer changes and the (norm of the) gradient is close to zero.
A single-point gradient can be calculated using the gradient
function with the same arguments as a single_point
procedure. It returns a nested list of floating point numbers, representing an N × 3 matrix (where N is the number of atoms in the molecule). Each value in the list corresponds to the derivative of an atomic position with respect to the x, y or z coordinates, represented in units of Hartree Bohr−1.
Vibrational frequency calculations
The Hessian matrix (a matrix of second derivatives of the energy with respect to the atomic coordinates) is used to determine the nature of a stationary point, vibrational frequencies, molecular vibrational spectra, and thermochemical properties.
In xTB, the Hessian matrix is calculated semi-numerically, i.e., from the analytical gradient using finite differences.
HQS Molecules
provides the frequencies
wrapper function to calculate vibrational frequencies and thermochemical corrections.
Keyword arguments for vibrational frequency calculations
In addition to the options for single-point energy calculations, this function also accepts a temperature
argument. This allows the user to define the absolute temperature for the calculation of thermochemical corrections. By default, the calculation will be performed at 298.15 K, but any other non-negative temperature can also be specified.
Nature of the input geometry
When calculating vibrational frequencies for a stationary point (an optimized equilibrium structure or a saddle point), it is important to use the same method for the frequencies as for the preceding geometry optimization; the potential energy surfaces of different methods have stationary points at different geometries.
Frequency calculations are often used to ascertain if a stationary point is a minimum or a saddle point. A molecule of N atoms has 3N degrees of freedom: three translational degrees, three rotational degrees (two for linear molecules), and 3N − 6 vibrational degrees (or 3N − 5 for linear molecules). For a minimum, the frequencies associated to those vibrational degrees need to be positive; for a transition state (i.e., a first-order saddle point), exactly one of them needs to be imaginary.
Output
The return value is a tuple with a VibrationalAnalysis
and a Thermochemistry
object.
VibrationalAnalysis
objects contain vibrational frequencies and normal modes in their fields and properties:
modes
: a list ofVibrationalMode
objects, which contain the vibrationalfrequency
(in cm−1), the infrared (IR) intensity (ir_intensity
in km mol−1) as well as the Cartesiandisplacements
of the normal mode as attributes.frequencies
: a list with the vibrational frequencies (in cm−1) in strictly increasing order. This list contains 3N − 6 elements (or 3N − 5 for linear molecules). By convention, imaginary frequencies are represented as negative numbers.hessian
: nested lists representing a 3N × 3N matrix with the nuclear Hessian (in Hartree Bohr−2).reduced_masses
: a list with the reduced masses (in atomic mass units) associated with the normal modes.ir_intensities
: a list with the IR intensities (in km mol−1) containing 3N − 6 or 3N − 5 elements.displacements
: a list with the Cartesian displacements for each normal mode that are normalized to unity.is_linear
: a property flag indicating whether the molecule is linear or not.natoms
: the number of atoms in the system, inferred from the dimensions of the displacements.
Thermochemistry
objects contain thermodynamic data calculated with the modified rigid rotor and harmonic oscillator (mRRHO) approximation.
The following data is accessible:
temperature
: the temperature (in K) used to calculate the thermodynamic properties.energy
: the total electronic single-point energy (in Hartree)enthalpy
: the enthalpy (in Hartree), including the total electronic energy.entropy
: the entropy (in Hartree K-1).gibbs_energy
: the Gibbs energy (in Hartree) as a property. It not stored explicitly, but calculated from the enthalpy, entropy and temperature in the object. Like the enthalpy, it includes the total electronic energy.
When using an implicit solvation model, the xTB code calculates a solvation contribution,
labelled as "Gsolv" in the output of the xTB binary.
Even though "Gsolv" is technically a free energy contribution, it is reported as part of the total
electronic energy by the xTB code.
Therefore, the solvation contribution is also part of energy
in the Thermochemistry
object
returned after a Hessian calculation with the xtb
module.
Example
from hqs_molecules.xtb import frequencies
vib_analysis, thermochem = frequencies(opt_molecule, temperature=100, solvent="water")
print("Single point energy:", thermochem.energy)
print("Frequencies:", vib_analysis.frequencies)
print(
f"At {thermochem.temperature} K, "
f"the enthalpy is {thermochem.enthalpy} Eh, "
f"the entropy is {thermochem.entropy} Eh/K, and "
f"the Gibbs energy is {thermochem.gibbs_energy} Eh"
)
Stable optimization: calculating equilibrium structures reliably
With a simple geometry optimization, there is no guarantee to obtain a minimum on the potential energy surface; a converged structure can also be a saddle point. Vibrational frequencies need to be calculated for a full characterization. Only if all of them are non-negative, a minimum has been found.
If imaginary frequencies are found, a displacement along the associated normal mode is a good starting point for a new optimization. In difficult cases, optimizations followed by frequency calculations may need to be repeated multiple times in order to find a genuine local minimum.
Such a composite calculation is provided by HQS Molecules
with the opt_with_hess
function. It repeats the Hessian and optimization calculations until no imaginary frequencies are found. If the maximum number of restarts (indicated by the max_restarts
keyword, 4 by default) is reached without converging to a minimum, an exception is raised by default.
The latter behavior can be altered by setting fail_with_saddle
to False
, in which case the non-converged trajectory is returned together
with the Hessian result from the lowest-energy structure. The latter can be obtained from the lowest
property
and can be used for a successive calculation.
Note that the restart geometries used in opt_with_hess
are generated by the xTB code upon encountering a saddle point in a Hessian calculation. These are suited to displace the geometry away from a saddle point. As there is no control over the direction along a normal mode with imaginary frequency, the feature is meant to converge to a nearby local minimum, but not for a systematic exploration of reaction paths.
Input
Since the function combines optimizations with frequencies, it requires the same arguments as the optimization
function, with an option to provide a temperature
argument (for the thermodynamic properties) and max_restarts
.
Output
The return value is a tuple containing a Trajectory
object (as for an optimization), and
VibrationalAnalysis
and Thermochemistry
objects (as for a frequency calculation). The trajectory concatenates the geometries and energies of all optimizations performed within the procedure. On the other hand, the VibrationalAnalysis
and Thermochemistry
objects contains information for the converged geometry, determined by the overall lowest energy.
If fail_with_saddle
is set to False
, the appropriate results for the lowest-energy structure are returned, even if it could not be converged to a stable minimum.
Example
A simple call to this procedure can be made as exemplified below:
from hqs_molecules import xtb, xyz
molecule = xyz.read("input_geometry.xyz").to_molecule(charge=0, multiplicity=1)
trajectory, vib_analysis, thermochem = xtb.opt_with_hess(molecule)
Biased Hessian
Hessian calculations for vibrational frequencies or thermochemical contributions are normally performed at stationary points of the potential energy surface. Therefore, the geometry optimization and the Hessian calculation need to be performed with the same method.
However, Spicher and Grimme proposed a method ("single-point Hessian") to calculate the Hessian for stationary points that were optimized with a different method. Therefore, the geometry can be optimized with a more expensive method, while xTB is used to obtain a quantity resembling the Hessian. In a nutshell, it employs the following steps:
- The input structure is re-optimized under application of a biasing potential, which constrains the atoms to remain close to their original positions.
- The Hessian is calculated at the new, slightly distorted geometry under the application of the biasing potential.
- Finally, the frequencies are re-scaled to reduce the impact of the biasing potential.
HQS Molecules
provides the function biased_hessian
to interface the described method in the xTB package. Its usage is analogous to the function frequencies
described above.