Semi-empirical Calculations with xTB

This HQS Moleculesmodule contains a Python interface to perform calculations with the xTB and CREST software packages.

The xTB (extended tight binding) program package is developed by the research group of Prof. Stefan Grimme at Universität Bonn. It contains implementations of the semi-empirical quantum-mechanical methods GFNn-xTB (n = 0–2) as well as of the GFN-FF force field ("GFN" stands for Geometries, Frequencies, and Non-covalent interactions, for which it has been parametrized). HQS Molecules contains wrapper functions to perform single-point energy calculations, geometry optimizations, and frequency calculations.

In this section the user will find further details on the associated functions:

  • Single-point energy calculations,
  • Structure optimizations,
  • Gradient calculations,
  • Frequency calculations,
  • Stable optimizations including Hessian calculations,
  • Biased Hessian, and
  • Conformer/Rotamer search.

Single-point energy calculations

The energy of any meaningful set of atomic coordinates with a specific charge and multiplicity can be calculated using the single_point function. Calculations are performed with the GFN2-xTB method by default, or alternatively with the GFN-FF method.

Input

The first mandatory argument of the function is a Molecule object, as the charge and multiplicity need to be known to perform a calculation. Note: the functions for reading an XYZ file return MolecularGeometry objects, which contain neither charge nor spin information. A MolecularGeometry objects can be easily converted to a Molecule object by providing information on the charge and, optionally, on the multiplicity, as described in the XYZ File Handling chapter. On the other hand, the 2D-to-3D structure conversion functions return Molecule objects with charge but without spin information.

Keyword arguments

It is possible to further refine the calculation with the following keyword arguments:

  • force_field: Use GFN-FF force-field (defaults to False, in which case the GFN2-xTB method is used).
  • solvent: By default, calculations are performed in vacuum. If the name of a solvent is provided, the implicit solvation model ALPB (analytical linearized Poisson-Boltzmann) will be used in the calculation. The available solvents are listed here.
  • create_tmpdir: Run the calculation in a temporary folder that is deleted with completion of the calculation (the default is set to True).

Output

The single_point function returns a float with the single-point energy in Hartree.

Example

An example is provided here:

from hqs_molecules import single_point

energy = single_point(molecule, solvent="Water", force_field=False)

Structure optimization

xTB can vary the position of atoms in the system to find the equilibrium structure with the lowest energy. This functionality is embedded into HQS Molecules in the optimization function.

Extra arguments

The function works similarly to the single_point function but it has two extra arguments:

  • opt_level with the possible optimization levels as defined by xTB. In this wrapper, the levels are defined in the XTBOptLevel class. By default, the very tight criterion is employed (XTBOptLevel.VTight), whereas the xTB software itself usesXTBOptLevel.Normal. The options are:

    • Crude
    • Sloppy
    • Loose
    • Lax
    • Normal
    • Tight
    • VTight
    • Extreme
  • max_cycles is the maximum number of optimization iterations. By default, it is set to None, which means that max_cycles is automatically determined by the xTB software depending on the size of the molecule (check above link).

Output: Trajectory

Calls to optimization function return a Trajectory object containing the optimization trajectory (structures) as a list of Molecule objects and the energies (in Hartree) for each step (energies) as a list of floats.

Most of the time, the quantities of interest are the optimized geometry and its energy. To access these, a Trajectory object contains two types of attributes: Trajectory.last and Trajectory.last_energy contain the last geometry and its energy, respectively; Trajectory.lowest_energy and Trajectory.lowest contain the lowest energy and the corresponding geometry. Trajectory.lowest_step returns the index of the energetically lowest geometry. In many cases, the last step may also be the one with the lowest energy.

The trajectory can be written to an XYZ file using the write (or write_with_info) functions of the xyz module.

Example

The following box illustrates some general ideas behind executing a geometry optimization, starting from a Molecule object.

from hqs_molecules import optimization, xyz

# perform optimization
opt_data = optimization(molecule, solvent="water")

# Write the trajectory into an XYZ file
xyz.write(file="trajectory.xyz", mol=opt_data.structures)

# Optimized structure: here it is assumed to be in the last step.
opt_molecule = opt_data.last  # Molecule object
opt_energy = opt_data.last_energy  # float

Gradient calculation

The nuclear gradient (first derivative of the energy with respect to the atomic coordinates) is calculated at each step of an optimization procedure. A necessary condition for an optimization to converge is typically to reduce the (norm of the) gradient to almost zero.

A single-point gradient can be calculated using the gradient function with the same arguments as a single_point procedure. It returns a nested list of floating point numbers, representing an N × 3 matrix (where N is the number of atoms in the molecule). Each value in the list corresponds to the derivative of an atomic position with respect to the x, y or z coordinates, represented in units of Hartree/Bohr.

Frequency calculations

The Hessian matrix (a matrix of second derivatives of the energy with respect to the atomic coordinates) is used to determine the nature of a stationary point, vibrational frequencies, molecular vibrational spectra, and thermochemical properties. In xTB, the Hessian matrix is calculated numerically.

Extra argument

HQS molecules provides the frequencies wrapper function to calculate vibrational frequencies and thermochemical corrections; note that it does not return the Hessian matrix itself. Besides the arguments already mentioned in the single_point wrapper, this function also accepts a temperature argument, which allows the user to define one or multiple temperatures for the calculation of thermochemical corrections. By default, the calculation will be performed at 298.15 K, but other temperature or a list of temperatures can also be specified.

Stationary point

If frequency calculations are meant to be performed for a stationary point (an optimized equilibrium structure or a saddle point), it is important to perform the calculations with the same method as the optimization itself; the potential energy surfaces of different methods have differing stationary points.

Frequency calculations are often used to ascertain if a stationary point is a minimum or a saddle point. A molecule of N atoms has 3N degrees of freedom: three translational degrees, three rotational degrees (two for linear molecules), and 3N − 6 vibrational degrees (or 3N − 5 for linear molecules). For a minimum, those frequencies need to be positive; for a transition state, exactly one of them needs to be imaginary.

Output

The return value is a MolecularFrequencies object, which contains three important fields:

  • total_energy: the total electronic energy of the system (in Hartree).
  • frequencies: a list with the harmonic frequencies (in cm−1). This list contains 3N elements, where 6 values should be zero (or 5 for linear molecules). By convention, imaginary frequencies are represented as negative numbers.
  • thermochem: a list of objects with temperature-dependent thermodynamic properties (BasicThermochemistry). Each of these BasicThermochemistry objects contains the temperature (in K), the total enthalpy and the total gibbs_energy (both in Hartree), and the entropy (in Hartree / K). Note that the electronic energy is included in the enthalpy and in the Gibbs energy; however, the solvation contribution currently is not.

Example

from hqs_molecules import frequencies

thermo_data = frequencies(opt_geom, temperature=[100, 125], solvent="water")

print("Single point energy:", thermo_data.total_energy)
print("Frequencies:", thermo_data.frequencies)
for res in thermo_data.thermochem:
    print(
        f"At {res.temperature} K: "
        f"the enthalpy is {res.enthalpy}, "
        f"the entropy is {res.entropy}, and "
        f"the Gibbs energy is {res.gibbs_energy} (in Hartree)"
    )

Stable optimization with Hessian calculations

With a simple geometry optimization, there is no guarantee of obtaining a minimum on the potential energy surface; a converged structure can also be a saddle point. Vibrational frequencies need to be calculated for a full characterization. Only if all of them are non-negative, a minimum has been found.

If imaginary frequencies are found, a displacement along the associated normal mode is a good starting point for a new optimization. In difficult cases, optimizations followed by frequency calculations may need to be repeated multiple times in order to find a genuine local minimum.

Such a composite calculation is provided by HQS Molecules with the opt_with_hess function. It repeats the Hessian and optimization calculations until no imaginary frequencies are found. If the maximum number of restarts (indicated by the max_restarts keyword, 4 by default) is reached without converging to a minimum, an exception is raised by default. The latter behavior can be altered by setting fail_with_saddle to False, in which case the non-converged trajectory is returned together with the Hessian result from the lowest-energy structure. The latter can be obtained from the lowest property and can be used for a successive calculation.

Input

Since the function combines optimizations with frequencies, it requires the same arguments as the optimization function, with an option to provide a temperature argument (for the thermodynamic properties) and max_restarts.

Output

The output of optimization with Hessian calculation is a tuple with a Trajectory object (as for an optimization) and a MolecularFrequencies object (as for a frequency calculation). The trajectory concatenates the geometries and energies of all optimizations performed within the procedure. On the other hand, the MolecularFrequencies object contains information for the converged geometry, determined by the overall lowest energy. If fail_with_saddle is set to False, the analogous results from the possibly non-converged geometry may be returned.

Example

A simple call to this procedure can be made as exemplified below:

from hqs_molecules import opt_with_hess, xyz
molecule = xyz.read("input_geometry.xyz").to_molecule(charge=0, multiplicity=1)
trajectory, frequency_data = opt_with_hess(molecule)

Biased Hessian

Hessian calculations for vibrational frequencies or thermochemical contributions are normally performed at stationary points of the potential energy surface. Therefore, the geometry optimization and the Hessian calculation need to be performed with the same method.

However, Spicher and Grimme proposed a method ("single-point Hessian") to calculate the Hessian for stationary points that were optimized with a different method. Therefore, the geometry can be optimized with a more expensive method, while xTB is used to obtain a quantity resembling the Hessian. In a nutshell, it employs the following steps:

  1. The input structure is re-optimized under application of a biasing potential, which constrains the atoms to remain close to their original positions.
  2. The Hessian is calculated at the new, slightly distorted geometry under the application of the biasing potential.
  3. Finally, the frequencies are re-scaled to reduce the impact of the biasing potential.

HQS Moleculesprovides the function biased_hessian to interface the described method in the xTB package. Its usage is analogous to the function frequencies described above.