Active space selection

The ActiveSpaceSelector algorithm in QDK/Chemistry performs active space selection to identify the most chemically relevant orbitals for multi-configurational calculations. Following QDK/Chemistry’s algorithm design principles, it takes a Wavefunction instance as input and produces a Wavefunction instance with active space information as output. Its primary purpose is to reduce the cost of quantum chemistry calculations by focusing on a specific set of relevant (active) orbitals while treating others as either fully occupied (core) or empty (virtual).

Overview

Active space methods classify molecular orbitals into three categories:

  • Inactive (core) orbitals: always doubly occupied and not explicitly correlated

  • Active orbitals: allow variable occupation and are explicitly correlated

  • Virtual orbitals: always empty and not explicitly correlated

The key challenge is selecting which orbitals to include in the active space. An ideal active space should:

  • Include all orbitals with significant entanglement character

  • Be as compact as possible to keep computational cost manageable

  • Capture the essential chemistry of the system

The selected active space then serves as input for post-SCF methods like multi-configuration calculations that explicitly treat electron correlation within the active space.
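To make the compactness requirement concrete, the following sketch counts the Slater determinants in a complete active space with fixed numbers of alpha and beta electrons, which is a product of two binomial coefficients. The helper name cas_determinant_count is hypothetical and not part of QDK/Chemistry; it only illustrates how quickly the configuration space grows with the number of active orbitals.

```python
from math import comb


def cas_determinant_count(num_active_orbitals: int, n_alpha: int, n_beta: int) -> int:
    """Count determinants for n_alpha + n_beta electrons in the active orbitals.

    Hypothetical helper for illustration only (not a QDK/Chemistry function).
    """
    if not (0 <= n_alpha <= num_active_orbitals and 0 <= n_beta <= num_active_orbitals):
        raise ValueError("electron counts must fit into the active orbitals")
    # Alpha and beta occupation strings are chosen independently of each other.
    return comb(num_active_orbitals, n_alpha) * comb(num_active_orbitals, n_beta)


if __name__ == "__main__":
    for n in (4, 10, 18):
        # Half-filled, closed-shell active spaces CAS(n, n)
        count = cas_determinant_count(n, n // 2, n // 2)
        print(f"CAS({n},{n}): {count} determinants")
```

A (4, 4) space like the one used in the examples on this page has 36 determinants, while a half-filled (18, 18) space already has more than two billion.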

Running an active space selection

This section demonstrates how to create, configure, and run an active space selection. The run method takes a Wavefunction from a prior SCF calculation and returns a new Wavefunction with active space information populated.

Input requirements

The ActiveSpaceSelector requires the following input:

Wavefunction

A Wavefunction instance containing the data needed for active space selection, such as orbital information and electron counts. Some selection methods (e.g., the entropy-based ones) require additional information, such as orbital occupation numbers or orbital entropies from a prior multi-configuration calculation.

Note

The specific requirements depend on the chosen implementation. Manual selection methods (like qdk_valence) require user-specified active space sizes, while automatic methods (like qdk_occupation or qdk_autocas) analyze orbital properties to determine the active space.

Creating an active space selector

C++:

#include <iostream>
#include <qdk/chemistry.hpp>
using namespace qdk::chemistry::algorithms;
using namespace qdk::chemistry::data;

int main() {
  // Create the default ActiveSpaceSelector instance
  auto active_space_selector = ActiveSpaceSelectorFactory::create();

Python:

from qdk_chemistry.algorithms import create

# Create the default ActiveSpaceSelector instance
active_space_selector = create("active_space_selector", "qdk_valence")

Configuring settings

Settings can be modified using the settings() object. See Available implementations below for implementation-specific options.

C++:

  // Configure the selector using the settings interface
  // Set the number of electrons and orbitals for the active space
  active_space_selector->settings().set("num_active_electrons", 4);
  active_space_selector->settings().set("num_active_orbitals", 4);

Python:

# Configure the selector using the settings interface
# Set the number of electrons and orbitals for the active space
active_space_selector.settings().set("num_active_electrons", 4)
active_space_selector.settings().set("num_active_orbitals", 4)

Running the selection

C++:

  // Load a molecular structure (water molecule) from XYZ file
  auto structure = Structure::from_xyz_file("../data/water.structure.xyz");
  int charge = 0;

  // First, run SCF to get molecular orbitals
  auto scf_solver = ScfSolverFactory::create();
  auto [scf_energy, scf_wavefunction] =
      scf_solver->run(structure, charge, 1, "6-31g");

  // Run active space selection
  auto active_wavefunction = active_space_selector->run(scf_wavefunction);
  auto active_orbitals = active_wavefunction->get_orbitals();

  std::cout << "SCF Energy: " << scf_energy << " Hartree" << std::endl;
  std::cout << "Active orbitals summary:\n"
            << active_orbitals->get_summary() << std::endl;

Python:

from pathlib import Path
from qdk_chemistry.data import Structure

# Load a molecular structure (water molecule) from XYZ file
structure = Structure.from_xyz_file(
    Path(__file__).parent / "../data/water.structure.xyz"
)
charge = 0

# First, run SCF to get molecular orbitals
scf_solver = create("scf_solver")
scf_energy, scf_wavefunction = scf_solver.run(
    structure, charge=charge, spin_multiplicity=1, basis_or_guess="6-31g"
)

# Run active space selection
active_wavefunction = active_space_selector.run(scf_wavefunction)
active_orbitals = active_wavefunction.get_orbitals()

print(f"SCF Energy: {scf_energy:.10f} Hartree")
print(f"Active orbitals summary:\n{active_orbitals.get_summary()}")

Available implementations

QDK/Chemistry’s ActiveSpaceSelector provides implementations for various selection strategies. You can discover available implementations programmatically:

C++:

  auto names = ActiveSpaceSelectorFactory::available();
  for (const auto& name : names) {
    std::cout << name << std::endl;
  }

Python:

from qdk_chemistry.algorithms import registry

print(registry.available("active_space_selector"))
# ['pyscf_avas', 'qdk_occupation', 'qdk_autocas_eos', 'qdk_autocas', 'qdk_valence']

QDK Valence

Factory name: "qdk_valence" (default)

Manual valence-based selection where users specify the number of active electrons and orbitals. Selects orbitals near the HOMO-LUMO gap.

Settings

| Setting | Type | Default | Description |
|---|---|---|---|
| num_active_electrons | int | -1 | Number of electrons in the active space (required) |
| num_active_orbitals | int | -1 | Number of orbitals in the active space (required) |
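As an illustration of what selecting orbitals near the HOMO-LUMO gap means, the sketch below partitions energy-ordered orbitals of a closed-shell reference into inactive, active, and virtual index ranges. The function partition_orbitals is hypothetical and only shows one plausible convention; it is not the qdk_valence implementation, and it assumes an even number of electrons in both the full system and the active space.

```python
def partition_orbitals(num_orbitals, num_electrons, num_active_electrons, num_active_orbitals):
    """Split energy-ordered orbital indices into (inactive, active, virtual).

    Hypothetical helper for illustration; assumes a closed-shell reference.
    """
    if num_electrons % 2 or num_active_electrons % 2:
        raise ValueError("this sketch only handles even electron counts")
    if num_active_electrons > 2 * num_active_orbitals:
        raise ValueError("too many active electrons for the active orbitals")

    # Every doubly occupied orbital that is not active becomes inactive (core).
    num_inactive = (num_electrons - num_active_electrons) // 2
    if num_inactive < 0 or num_inactive + num_active_orbitals > num_orbitals:
        raise ValueError("active space does not fit into the orbital space")

    first_virtual = num_inactive + num_active_orbitals
    inactive = list(range(num_inactive))
    active = list(range(num_inactive, first_virtual))
    virtual = list(range(first_virtual, num_orbitals))
    return inactive, active, virtual


if __name__ == "__main__":
    # Example numbers: 13 orbitals and 10 electrons with a (4, 4) active space
    inactive, active, virtual = partition_orbitals(13, 10, 4, 4)
    print("inactive:", inactive)
    print("active:  ", active)
    print("virtual: ", virtual)
```

With these example numbers the active window contains the two highest occupied and the two lowest unoccupied orbitals, so it straddles the HOMO-LUMO gap.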

QDK Occupation

Factory name: "qdk_occupation"

Automatic selection based on orbital occupation numbers, identifying orbitals with fractional occupation.

Settings

| Setting | Type | Default | Description |
|---|---|---|---|
| occupation_threshold | float | 0.1 | Orbitals with occupations deviating from 0 or 2 by more than this threshold are selected |
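The thresholding rule can be sketched in a few lines. This is one reading of the setting description, under the assumption that an orbital is selected when its occupation is farther than the threshold from both 0 and 2; the function select_by_occupation and the sample occupations are hypothetical and do not come from QDK/Chemistry.

```python
def select_by_occupation(occupations, occupation_threshold=0.1):
    """Return indices of orbitals with fractional occupation.

    Hypothetical helper: an orbital counts as fractional when its occupation
    differs from both 0 and 2 by more than occupation_threshold.
    """
    selected = []
    for index, occupation in enumerate(occupations):
        # Distance to the nearest "integer-like" limit (empty or doubly occupied)
        distance = min(occupation, 2.0 - occupation)
        if distance > occupation_threshold:
            selected.append(index)
    return selected


if __name__ == "__main__":
    # Made-up natural occupation numbers for illustration
    occupations = [2.0, 1.98, 1.85, 1.02, 0.97, 0.14, 0.03, 0.0]
    print(select_by_occupation(occupations))
```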

QDK AutoCAS

Factory name: "qdk_autocas"

Entropy-based automatic selection using histogram-based plateau detection to identify strongly correlated orbitals. See AutoCAS Algorithm below for a detailed description.

Note

This method requires the input wavefunction to have orbital entropies populated. Orbital entropies are computed from the one- and two-electron reduced density matrices (1-RDM and 2-RDM), which are typically obtained from a multi-configuration calculation with calculate_one_rdm=True and calculate_two_rdm=True.

See MultiConfigurationCalculator for details on generating wavefunctions with RDMs.

Settings

| Setting | Type | Default | Description |
|---|---|---|---|
| entropy_threshold | float | 0.14 | Entropy threshold for selection |
| min_plateau_size | int | 10 | Minimum size of entropy plateau for selection |
| num_bins | int | 100 | Number of histogram bins for plateau detection |
| normalize_entropies | bool | True | Whether to normalize entropy values |

QDK AutoCAS EOS

Factory name: "qdk_autocas_eos"

Entropy-based selection using consecutive entropy differences to identify plateau boundaries. See AutoCAS Algorithm below for a detailed description.

Note

This method requires the input wavefunction to have orbital entropies populated. Orbital entropies are computed from the one- and two-electron reduced density matrices (1-RDM and 2-RDM), which are typically obtained from a multi-configuration calculation with calculate_one_rdm=True and calculate_two_rdm=True. See MultiConfigurationCalculator for details on generating wavefunctions with RDMs.

Settings

| Setting | Type | Default | Description |
|---|---|---|---|
| entropy_threshold | float | 0.14 | Entropy threshold for selection |
| diff_threshold | float | 0.1 | Difference threshold for EOS-based selection |
| normalize_entropies | bool | True | Whether to normalize entropy values by the maximum |

AutoCAS Algorithm

Selecting an appropriate active space is one of the most challenging aspects of multi-configuration calculations. Traditional approaches rely on chemical intuition and trial-and-error, which can be unreliable for complex systems. The AutoCAS algorithm [SR16, SR19] provides a systematic, black-box approach to active space selection.

AutoCAS leverages concepts from quantum information theory to quantify orbital correlation. The key insight is that strongly correlated orbitals are highly entangled with the rest of the electronic system. This entanglement can be measured using the single-orbital entropy \(s_i^{(1)}\), which quantifies how much information about orbital \(i\) is “shared” with all other orbitals.

Single-orbital entropies can be calculated for many-body systems given access to (approximate) one- and two-particle reduced density matrices (RDMs) [BT15], which are readily accessible in QDK/Chemistry through multi-configuration wavefunction data structures. Single-orbital entropies are therefore computed by default when RDMs are requested in multi-configuration calculations. The QDK/Chemistry implementation of AutoCAS is agnostic to the underlying wavefunction method as long as the required RDMs are available, which allows comparisons across different multi-configuration approaches.
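For a wavefunction with well-defined particle number and spin projection, the single-orbital entropy takes the form \(s_i^{(1)} = -\sum_{\alpha} w_{\alpha,i} \ln w_{\alpha,i}\), where the four weights \(w_{\alpha,i}\) are the probabilities of finding orbital \(i\) empty, singly occupied with spin up, singly occupied with spin down, or doubly occupied. The sketch below evaluates this expression from three RDM elements. The function name and argument names are hypothetical, and the code is not taken from QDK/Chemistry.

```python
import math


def single_orbital_entropy(n_up, n_dn, n_updn):
    """Single-orbital entropy from RDM elements of one spatial orbital.

    n_up, n_dn: spin-up and spin-down occupations (1-RDM diagonal elements)
    n_updn:     double-occupation probability <n_up n_dn> (2-RDM element)
    Hypothetical helper for illustration only.
    """
    weights = [
        1.0 - n_up - n_dn + n_updn,  # orbital empty
        n_up - n_updn,               # only spin up present
        n_dn - n_updn,               # only spin down present
        n_updn,                      # doubly occupied
    ]
    entropy = 0.0
    for w in weights:
        if w > 1e-14:  # treat 0 * ln(0) as 0
            entropy -= w * math.log(w)
    return entropy


if __name__ == "__main__":
    print(single_orbital_entropy(1.0, 1.0, 1.0))    # doubly occupied orbital
    print(single_orbital_entropy(0.5, 0.5, 0.25))   # all four states equally likely
```

An orbital that is always doubly occupied or always empty has zero entropy, while the upper bound \(\ln 4\) is reached when all four occupation states are equally likely.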

QDK/Chemistry AutoCAS Variants

QDK/Chemistry provides two entropy-based selection methods:

AutoCAS (Histogram-Based Plateau Detection)

As described in the original AutoCAS protocol [SR16, SR19], this method discretizes the entropy distribution into histogram bins and identifies plateaus, i.e., contiguous ranges of entropy thresholds over which the number of orbitals above the threshold remains constant. This approach is robust for systems with clear entropy gaps but requires tuning of the num_bins and min_plateau_size settings. If none of the entropies exceeds entropy_threshold, the system is considered single-configurational and all orbitals are excluded from the active space.
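The plateau search can be sketched as follows. This is a simplified illustration written from the description above, not the QDK/Chemistry implementation: it assumes that entropies are normalized by their maximum, that thresholds are scanned on a uniform grid of num_bins points, and that entropy_threshold is compared against the unnormalized maximum. The function name select_by_plateau and the sample entropies are hypothetical.

```python
def select_by_plateau(entropies, entropy_threshold=0.14, min_plateau_size=10, num_bins=100):
    """Return indices of orbitals chosen by a simplified plateau search."""
    if not entropies or max(entropies) <= entropy_threshold:
        # Assumed reading of the single-configurational case: select nothing.
        return []
    max_entropy = max(entropies)
    normalized = [s / max_entropy for s in entropies]

    # counts[k] = number of orbitals whose normalized entropy exceeds k / num_bins
    counts = [sum(1 for s in normalized if s > k / num_bins) for k in range(num_bins)]

    # Collect orbital counts that stay constant over at least min_plateau_size thresholds.
    plateau_counts = []
    start = 0
    for k in range(1, num_bins + 1):
        if k == num_bins or counts[k] != counts[start]:
            if k - start >= min_plateau_size and counts[start] > 0:
                plateau_counts.append(counts[start])
            start = k
    if not plateau_counts:
        return []

    # Take the largest plateau group and map it back to orbital indices.
    num_active = max(plateau_counts)
    by_entropy = sorted(range(len(entropies)), key=lambda i: entropies[i], reverse=True)
    return sorted(by_entropy[:num_active])


if __name__ == "__main__":
    # Made-up entropies: three strongly correlated orbitals, three weakly correlated ones
    print(select_by_plateau([0.012, 0.021, 0.93, 1.05, 0.87, 0.017]))
```

In this example the three high-entropy orbitals form a plateau that spans most of the threshold grid, so they are returned as the active space.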

AutoCAS-EOS (Entropy Difference Detection)

Uses a direct approach that examines consecutive differences in the sorted entropy values. When the difference between adjacent entropies exceeds diff_threshold and the entropy is above entropy_threshold, a plateau boundary is identified.

Both methods sort orbitals by decreasing entropy and select the largest identified group of strongly correlated orbitals for the active space.
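The entropy-difference variant can be sketched in the same spirit. The code below is an illustration under stated assumptions rather than the QDK/Chemistry implementation: entropy_threshold is applied to the raw entropies, diff_threshold is applied to the (optionally normalized) sorted values, and the last qualifying gap is used so that the largest group is selected. The function name select_by_entropy_gaps is hypothetical.

```python
def select_by_entropy_gaps(entropies, entropy_threshold=0.14, diff_threshold=0.1,
                           normalize_entropies=True):
    """Return indices of orbitals above the last large gap in the sorted entropies."""
    if not entropies:
        return []
    # Sort orbital indices by decreasing entropy.
    order = sorted(range(len(entropies)), key=lambda i: entropies[i], reverse=True)
    raw = [entropies[i] for i in order]
    scale = raw[0] if (normalize_entropies and raw[0] > 0.0) else 1.0
    scaled = [s / scale for s in raw]

    boundary = None
    for k in range(len(raw) - 1):
        large_gap = scaled[k] - scaled[k + 1] > diff_threshold
        if raw[k] > entropy_threshold and large_gap:
            boundary = k  # keep the last qualifying gap, i.e. the largest group
    if boundary is None:
        return []
    return sorted(order[: boundary + 1])


if __name__ == "__main__":
    # Made-up entropies: three strongly correlated orbitals, three weakly correlated ones
    print(select_by_entropy_gaps([0.012, 0.021, 0.93, 1.05, 0.87, 0.017]))
```

Compared with the histogram sketch, this variant needs no binning parameters, but it depends directly on how sharply the sorted entropies drop.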

Populating Orbital Entropies

The entropy-based AutoCAS methods require orbital entropies as input, which are computed from the one- and two-electron reduced density matrices (RDMs). These RDMs must be obtained from a multi-configuration calculation that captures static correlation. A key practical consideration is balancing the cost of this initial calculation against the quality of the resulting entropies.

A recommended approach is to use Selected Configuration Interaction (SCI) with a relatively small number of determinants (e.g., 10,000–50,000) within a conservatively chosen active space. SCI methods are well-suited for this purpose because they:

  • Automatically identify the most important determinants for capturing static correlation

  • Scale favorably compared to full configuration interaction

  • Provide high-quality RDMs even with truncated determinant spaces

This approach has been shown to provide a reasonable trade-off between computational cost and entropy accuracy for active space selection. The resulting entropies are typically sufficient to identify strongly correlated orbitals, even when the SCI calculation uses a fraction of the determinants that would be required for quantitative energy accuracy.

Example

Note

The number of valence orbitals and electrons can be automatically determined using the utility function compute_valence_space_parameters().

C++:

  // Create a valence space active space selector
  auto valence_selector = ActiveSpaceSelectorFactory::create("qdk_valence");
  // Automatically select valence parameters based on the input structure
  auto [num_electrons, num_orbitals] =
      qdk::chemistry::utils::compute_valence_space_parameters(scf_wavefunction,
                                                              charge);
  valence_selector->settings().set("num_active_electrons", num_electrons);
  valence_selector->settings().set("num_active_orbitals", num_orbitals);
  auto active_valence_wfn = valence_selector->run(scf_wavefunction);

  // Create active Hamiltonian
  auto active_hamiltonian_generator = HamiltonianConstructorFactory::create();
  auto active_hamiltonian =
      active_hamiltonian_generator->run(active_valence_wfn->get_orbitals());

  // Run Active Space Calculation with Selected CI
  auto mc_calculator =
      MultiConfigurationCalculatorFactory::create("macis_asci");
  mc_calculator->settings().set("ntdets_max", 50000);
  mc_calculator->settings().set("calculate_one_rdm", true);
  mc_calculator->settings().set("calculate_two_rdm", true);
  auto [mc_energy, mc_wavefunction] = mc_calculator->run(
      active_hamiltonian, num_electrons / 2, num_electrons / 2);

  // Select active space using AutoCAS
  auto autocas_selector = ActiveSpaceSelectorFactory::create("qdk_autocas_eos");
  auto active_autocas_wfn = autocas_selector->run(mc_wavefunction);
  std::cout << "AutoCAS selected active orbitals summary:\n"
            << active_autocas_wfn->get_orbitals()->get_summary() << std::endl;

Python:

from qdk_chemistry.utils import compute_valence_space_parameters

# Create a valence space active space selector
valence_selector = create("active_space_selector", "qdk_valence")
# Automatically select valence parameters based on the input structure
num_electrons, num_orbitals = compute_valence_space_parameters(scf_wavefunction, charge)
valence_selector.settings().set("num_active_electrons", num_electrons)
valence_selector.settings().set("num_active_orbitals", num_orbitals)
active_valence_wfn = valence_selector.run(scf_wavefunction)

# Create active Hamiltonian
active_hamiltonian_generator = create("hamiltonian_constructor")
active_hamiltonian = active_hamiltonian_generator.run(active_valence_wfn.get_orbitals())

# Run Active Space Calculation with Selected CI
mc_calculator = create("multi_configuration_calculator", "macis_asci")
mc_calculator.settings().set("ntdets_max", 50000)
mc_calculator.settings().set("calculate_one_rdm", True)
mc_calculator.settings().set("calculate_two_rdm", True)
mc_energy, mc_wavefunction = mc_calculator.run(
    active_hamiltonian, num_electrons // 2, num_electrons // 2
)

# Select active space using AutoCAS
autocas_selector = create("active_space_selector", "qdk_autocas_eos")
active_autocas_wfn = autocas_selector.run(mc_wavefunction)
print("AutoCAS selected active orbitals summary:")
print(active_autocas_wfn.get_orbitals().get_summary())

PySCF AVAS

Factory name: "pyscf_avas"

The PySCF plugin provides access to the Automated Valence Active Space (AVAS) method from PySCF. AVAS selects active orbitals by projecting molecular orbitals onto a target atomic orbital basis. See the original AVAS publication [SSCK17] for details.

Settings

| Setting | Type | Default | Description |
|---|---|---|---|
| ao_labels | list[str] | [] | Atomic orbital labels to include (e.g., ["Fe 3d", "Fe 4d"]); required |
| canonicalize | bool | False | Whether to canonicalize active orbitals after selection |
| openshell_option | int | 2 | Handling of singly-occupied orbitals: 2 = project as alpha, 3 = keep in active space |

Example

avas = create("active_space_selector", "pyscf_avas")
avas.settings().set("ao_labels", ["O 2p", "O 2s"])
avas.settings().set("canonicalize", True)

active_wavefunction = avas.run(scf_wavefunction)

Further reading