Orbital localization

The OrbitalLocalizer algorithm in QDK/Chemistry performs various orbital transformations to create localized or otherwise transformed molecular orbitals. Following QDK/Chemistry’s algorithm design principles, it takes a Wavefunction instance with reference orbitals as input and produces a new Wavefunction instance with localized orbitals as output. For more information about this pattern, see the Factory Pattern documentation.

Overview

Canonical molecular orbitals from SCF calculations are typically delocalized over the entire molecule, which can complicate chemical interpretation and slow the convergence of post-SCF correlation methods. The OrbitalLocalizer algorithm applies unitary transformations to these orbitals to obtain alternative representations that may be more physically intuitive or computationally advantageous.

Orbital transformation techniques broadly fall into two categories:

Optimization-Based Methods

The vast majority of localization methods define a cost function and iteratively minimize it to yield localized orbitals [LJ13]. Popular choices include Pipek-Mezey [PM89], Foster-Boys [FB60], and Edmiston-Ruedenberg [ER63] localization. These methods produce orbitals that are maximally localized on specific atoms or bonds.

Analytical Methods

These methods transform orbitals in a single step through analytical techniques rather than iterative optimization. Natural orbitals [LowdinS56], which diagonalize the one-particle reduced density matrix, are a prominent example and can be particularly useful for active space selection. Cholesky localization [ABPSdMK06] provides efficient approximate localization via Cholesky decomposition of the density matrix. The VVHV separation also falls into this category, using projection and orthogonalization to partition virtual orbitals into valence and hard-virtual subspaces.

The specific methods available depend on the backend implementation selected. QDK/Chemistry provides native implementations of key algorithms and extends these capabilities through plugins that integrate external quantum chemistry packages. See Available implementations below for details on each backend and its supported methods.

Running orbital localization

This section demonstrates how to create, configure, and run orbital localization. The run method takes a Wavefunction instance and returns a new Wavefunction object with transformed orbitals.

Input requirements

The OrbitalLocalizer requires the following inputs:

Wavefunction

A Wavefunction instance containing the molecular orbitals to be localized.

Alpha orbital indices (loc_indices_a)

A list/vector of indices specifying which alpha orbitals to include in the localization. Indices must be sorted in ascending order.

Beta orbital indices (loc_indices_b)

A list/vector of indices specifying which beta orbitals to include in the localization. Indices must be sorted in ascending order.

Creating a localizer

#include <iostream>
#include <qdk/chemistry.hpp>
#include <string>
using namespace qdk::chemistry::algorithms;
using namespace qdk::chemistry::data;

// Create a Pipek-Mezey localizer using the factory
auto localizer = LocalizerFactory::create("qdk_pipek_mezey");
from pathlib import Path
from qdk_chemistry.algorithms import create
from qdk_chemistry.data import Structure

# Create a Pipek-Mezey localizer
localizer = create("orbital_localizer", "qdk_pipek_mezey")

Configuring settings

Settings can be modified using the settings() object. See Available implementations below for implementation-specific options.

  // Configure settings for localizer
  localizer->settings().set("tolerance", 1.0e-6);
  localizer->settings().set("max_iterations", 100);
  std::cout << localizer->settings().keys() << std::endl;
# Configure localizer settings
localizer.settings().set("tolerance", 1.0e-6)
localizer.settings().set("max_iterations", 100)

# View available settings
print(f"Localizer settings: {localizer.settings().keys()}")

Running localization

Note

For restricted calculations, loc_indices_a and loc_indices_b must be identical.

  // Load H2O molecule from XYZ file
  auto structure = Structure::from_xyz_file("../data/water.structure.xyz");

  // Obtain orbitals from SCF calculation
  auto scf_solver = ScfSolverFactory::create();
  auto [E_scf, wavefunction] = scf_solver->run(structure, 0, 1, "sto-3g");

  // Specify which orbitals to localize
  // For restricted calculations, alpha and beta orbitals are identical
  std::vector<size_t> loc_indices = {0, 1, 2, 3};

  // Localize the specified orbitals
  auto localized_wfn = localizer->run(wavefunction, loc_indices, loc_indices);
  auto localized_orbitals = localized_wfn->get_orbitals();

  // Print summary
  std::cout << localizer->get_summary() << std::endl;
# Load H2O molecule from XYZ file
structure = Structure.from_xyz_file(
    Path(__file__).parent / "../data/water.structure.xyz"
)

# Obtain orbitals from SCF
scf_solver = create("scf_solver")
E_scf, wfn = scf_solver.run(
    structure, charge=0, spin_multiplicity=1, basis_or_guess="sto-3g"
)

# Create indices for orbitals to localize
loc_indices = [0, 1, 2, 3]

# Localize the specified orbitals
localized_wfn = localizer.run(wfn, loc_indices, loc_indices)

localized_orbitals = localized_wfn.get_orbitals()
print(localized_orbitals.get_summary())

Available implementations

QDK/Chemistry’s OrbitalLocalizer provides a unified interface to orbital localization methods. You can discover available implementations programmatically:

  auto names = LocalizerFactory::available();
  for (const auto& name : names) {
    std::cout << name << std::endl;
  }
from qdk_chemistry.algorithms import registry  # noqa: E402

print(registry.available("orbital_localizer"))
# ['pyscf_multi', 'qdk_vvhv', 'qdk_mp2_natural_orbitals', 'qdk_pipek_mezey']

QDK Pipek-Mezey

Factory name: "qdk_pipek_mezey" (default)

Native implementation of Pipek-Mezey localization [PM89], which maximizes the sum of squared Mulliken charges on each atom for each orbital. This produces orbitals that are maximally localized on specific atoms or bonds, making them well-suited for chemical interpretation and local correlation methods.

Settings

Setting

Type

Default

Description

tolerance

float

1e-6

Convergence criterion for localization iterations

max_iterations

int

10000

Maximum number of localization iterations

small_rotation_tolerance

float

1e-12

Threshold for small rotation detection

QDK MP2 Natural Orbitals

Factory name: "qdk_mp2_natural_orbitals"

Computes natural orbitals [LowdinS56] from the MP2 one-particle density matrix. These orbitals diagonalize the correlation effects and provide occupation numbers that can guide active space selection.

Settings

This implementation has no configurable settings.

QDK VVHV

Factory name: "qdk_vvhv"

The VVHV (Valence Virtual–Hard Virtual) localizer addresses the numerical challenges of orbital localization in near-complete basis sets by partitioning the virtual space into chemically meaningful subspaces. See VVHV Algorithm below for a detailed description.

Settings

Setting

Type

Default

Description

tolerance

float

1e-6

Convergence criterion for iterative localization optimization

max_iterations

int

10000

Maximum number of localization iterations

small_rotation_tolerance

float

1e-12

Threshold for small rotation detection

minimal_basis

string

"sto-3g"

Minimal basis set for valence virtual projection

weighted_orthogonalization

bool

True

Use weighted orthogonalization in hard virtual construction

VVHV Algorithm

Localization of molecular orbitals expressed in near-complete basis sets is numerically ill-posed and challenging for most localizers. This can lead to orbitals that do not vary smoothly with molecular geometry, numerically unstable results, and reproducibility difficulties across architectures and compute environments. The Valence Virtual–Hard Virtual (VVHV) separation [SDHG05] addresses these problems by partitioning the virtual orbital space into chemically meaningful subspaces before localization.

The Problem with Standard Localization

Standard orbital localization methods optimize a cost function (e.g., Pipek-Mezey, Foster-Boys) over blocks of orbitals simultaneously (e.g. occupied, virtual, active). In large basis sets, the virtual space contains orbitals of vastly different character:

Valence-virtual orbitals

Low-lying virtual orbitals that are chemically relevant for describing bond breaking/formation and correlation effects

Hard-virtual orbitals

High-energy orbitals that primarily describe core-valence polarization and basis set completeness

When localization is applied to the full virtual space, the optimizer may mix these distinct orbital types, leading to non-physical results that are sensitive to numerical precision and can vary discontinuously along reaction coordinates.

The VVHV Separation Procedure

The VVHV algorithm proceeds in three stages:

  1. Minimal Basis Projection: Project the canonical virtual orbitals onto a minimal basis set (e.g., STO-3G) to identify the valence-virtual subspace. Given the overlap matrix \(\mathbf{S}_{\text{min}}\) between the computational basis and minimal basis, the valence-virtual orbitals span the range of:

    \[\mathbf{P}_{\text{VV}} = \mathbf{S}_{\text{min}} (\mathbf{S}_{\text{min}}^T \mathbf{S}_{\text{min}})^{-1} \mathbf{S}_{\text{min}}^T\]
  2. Orthogonalization: Construct orthonormal valence-virtual and hard-virtual orbital sets. The hard-virtual orbitals are obtained as the orthogonal complement to the valence-virtual space. QDK/Chemistry implements both standard and weighted orthogonalization [WIS+25] procedures; weighted orthogonalization improves numerical stability for near-linear dependencies.

  3. Subspace Localization: Apply the chosen localization method (e.g., Pipek-Mezey) separately within each subspace. This ensures that the optimization landscape is well-behaved and that orbitals vary smoothly with geometry.

Benefits for Active Space Selection

The VVHV separation is particularly valuable for multi-configuration calculations where a consistent active space must be maintained along a reaction pathway. By localizing only the valence-virtual orbitals, which are the chemically relevant virtual orbitals for active space construction, the VVHV procedure ensures:

  • Orbitals that track smoothly as bonds stretch and form

  • Numerically stable and reproducible results

  • Well-defined orbital character that aids chemical interpretation

For details on using localized orbitals in active space selection, see ActiveSpaceSelector.

PySCF Multi

Factory name: "pyscf_multi"

The PySCF plugin provides access to additional localization algorithms through PySCF:

Pipek-Mezey [PM89]

Maximizes atomic charge localization

Foster-Boys [FB60]

Minimizes the spatial extent of orbitals

Edmiston-Ruedenberg [ER63]

Maximizes self-repulsion energy

Cholesky [ABPSdMK06]

Efficient analytical localization via Cholesky decomposition

Settings

Setting

Type

Default

Description

method

string

"pipek-mezey"

Localization algorithm: "pipek-mezey", "foster-boys", "edmiston-ruedenberg", or "cholesky"

population_method

string

"mulliken"

Population analysis method for Pipek-Mezey localization

occupation_threshold

float

1e-10

Threshold for classifying orbitals as occupied vs virtual

Further reading