Active space selection
The ActiveSpaceSelector algorithm in QDK/Chemistry performs active space selection to identify the most chemically relevant orbitals for multi-configurational calculations.
Following QDK/Chemistry’s algorithm design principles, it takes a Wavefunction instance as input and produces a Wavefunction instance with active space information as output.
Its primary purpose is to reduce the cost of quantum chemistry calculations by focusing on a specific set of relevant (active) orbitals while treating others as either fully occupied (core) or empty (virtual).
Overview
Active space methods classify molecular orbitals into three categories:
- Inactive (core) orbitals
Always doubly occupied and not explicitly correlated
- Active orbitals
Allow variable occupation and are explicitly correlated
- Virtual orbitals
Always empty and not explicitly correlated
The key challenge is selecting which orbitals to include in the active space. An ideal active space should:
- Include all orbitals with significant entanglement character
- Be as compact as possible to keep computational cost manageable
- Capture the essential chemistry of the system
The selected active space then serves as input for post-SCF methods like multi-configuration calculations that explicitly treat electron correlation within the active space.
Running an active space selection
This section demonstrates how to create, configure, and run an active space selection.
The run method takes a Wavefunction from a prior SCF calculation and returns a new Wavefunction with active space information populated.
Input requirements
The ActiveSpaceSelector requires the following input:
- Wavefunction
A Wavefunction instance containing the data necessary for active space selection, including orbital information and electron counts. Some selection methods (e.g., entropy-based) may require additional information such as orbital occupation numbers or entropies from a prior multi-configuration calculation.
Note
The specific requirements depend on the chosen implementation. Manual selection methods (like qdk_valence) require user-specified active space sizes, while automatic methods (like qdk_occupation or qdk_autocas) analyze orbital properties to determine the active space.
Creating an active space selector
#include <iostream>
#include <qdk/chemistry.hpp>
using namespace qdk::chemistry::algorithms;
using namespace qdk::chemistry::data;
int main() {
// Create the default ActiveSpaceSelector instance
auto active_space_selector = ActiveSpaceSelectorFactory::create();
from qdk_chemistry.algorithms import create
# Create the default ActiveSpaceSelector instance
active_space_selector = create("active_space_selector", "qdk_valence")
Configuring settings
Settings can be modified using the settings() object.
See Available implementations below for implementation-specific options.
// Configure the selector using the settings interface
// Set the number of electrons and orbitals for the active space
active_space_selector->settings().set("num_active_electrons", 4);
active_space_selector->settings().set("num_active_orbitals", 4);
# Configure the selector using the settings interface
# Set the number of electrons and orbitals for the active space
active_space_selector.settings().set("num_active_electrons", 4)
active_space_selector.settings().set("num_active_orbitals", 4)
Running the selection
// Load a molecular structure (water molecule) from XYZ file
auto structure = Structure::from_xyz_file("../data/water.structure.xyz");
int charge = 0;
// First, run SCF to get molecular orbitals
auto scf_solver = ScfSolverFactory::create();
auto [scf_energy, scf_wavefunction] =
scf_solver->run(structure, charge, 1, "6-31g");
// Run active space selection
auto active_wavefunction = active_space_selector->run(scf_wavefunction);
auto active_orbitals = active_wavefunction->get_orbitals();
std::cout << "SCF Energy: " << scf_energy << " Hartree" << std::endl;
std::cout << "Active orbitals summary:\n"
<< active_orbitals->get_summary() << std::endl;
from pathlib import Path # noqa: E402
from qdk_chemistry.data import Structure # noqa: E402
# Load a molecular structure (water molecule) from XYZ file
structure = Structure.from_xyz_file(
Path(__file__).parent / "../data/water.structure.xyz"
)
charge = 0
# First, run SCF to get molecular orbitals
scf_solver = create("scf_solver")
scf_energy, scf_wavefunction = scf_solver.run(
structure, charge=charge, spin_multiplicity=1, basis_or_guess="6-31g"
)
# Run active space selection
active_wavefunction = active_space_selector.run(scf_wavefunction)
active_orbitals = active_wavefunction.get_orbitals()
print(f"SCF Energy: {scf_energy:.10f} Hartree")
print(f"Active orbitals summary:\n{active_orbitals.get_summary()}")
Available implementations
QDK/Chemistry’s ActiveSpaceSelector provides implementations for various selection strategies.
You can discover available implementations programmatically:
auto names = ActiveSpaceSelectorFactory::available();
for (const auto& name : names) {
std::cout << name << std::endl;
}
from qdk_chemistry.algorithms import registry # noqa: E402
print(registry.available("active_space_selector"))
# ['pyscf_avas', 'qdk_occupation', 'qdk_autocas_eos', 'qdk_autocas', 'qdk_valence']
QDK Valence
Factory name: "qdk_valence" (default)
Manual valence-based selection where users specify the number of active electrons and orbitals. Selects orbitals near the HOMO-LUMO gap.
Settings
| Setting | Type | Default | Description |
|---|---|---|---|
| num_active_electrons | int | | Number of electrons in the active space (required) |
| num_active_orbitals | int | | Number of orbitals in the active space (required) |
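To make the idea of a window around the HOMO-LUMO gap concrete, the following is a minimal sketch in plain Python. It is not the qdk_valence implementation: the function name partition_orbitals is hypothetical, and it assumes a closed-shell reference with orbitals ordered by increasing energy.

```python
def partition_orbitals(num_orbitals, num_electrons,
                       num_active_electrons, num_active_orbitals):
    """Split energy-ordered, closed-shell orbitals into index lists.

    Hypothetical helper for illustration only; not part of qdk_chemistry.
    """
    inactive_electrons = num_electrons - num_active_electrons
    if inactive_electrons < 0 or inactive_electrons % 2 != 0:
        raise ValueError("inactive electrons must fill whole orbitals")
    if num_active_electrons > 2 * num_active_orbitals:
        raise ValueError("too many electrons for the active orbitals")

    # Each inactive orbital holds two electrons.
    num_inactive = inactive_electrons // 2
    first_virtual = num_inactive + num_active_orbitals
    if first_virtual > num_orbitals:
        raise ValueError("active space does not fit in the orbital set")

    return {
        "inactive": list(range(num_inactive)),
        "active": list(range(num_inactive, first_virtual)),
        "virtual": list(range(first_virtual, num_orbitals)),
    }


# Hypothetical system: 13 orbitals, 10 electrons, a (4e, 4o) active space.
spaces = partition_orbitals(13, 10, 4, 4)
print(spaces["active"])
```

With these numbers the active window holds the two highest occupied and the two lowest unoccupied orbitals, which is the sense in which the selection straddles the HOMO-LUMO gap.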
QDK Occupation
Factory name: "qdk_occupation"
Automatic selection based on orbital occupation numbers, identifying orbitals with fractional occupation.
Settings
| Setting | Type | Default | Description |
|---|---|---|---|
| | float | | Orbitals with occupations deviating from 0 or 2 by more than this threshold are selected |
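The selection rule in the table can be sketched in a few lines of plain Python. This is an illustration of the criterion as described, not the qdk_occupation code; the function name select_by_occupation, the occupation values, and the threshold of 0.02 are all made up for the example.

```python
def select_by_occupation(occupations, threshold):
    """Return indices of orbitals whose occupation is fractional.

    An orbital is kept when its occupation differs from both 0 and 2
    by more than `threshold`. Hypothetical helper for illustration.
    """
    selected = []
    for index, occupation in enumerate(occupations):
        # Distance to the nearest of the two "trivial" occupations.
        deviation = min(abs(occupation), abs(2.0 - occupation))
        if deviation > threshold:
            selected.append(index)
    return selected


# Made-up natural occupation numbers for eight orbitals.
occupations = [2.0, 1.998, 1.95, 1.20, 0.80, 0.05, 0.002, 0.0]
active = select_by_occupation(occupations, threshold=0.02)
print(active)
```

Orbitals with occupations close to 2 or 0 stay inactive or virtual, while the four orbitals in the middle of the list are picked up as active.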
QDK AutoCAS
Factory name: "qdk_autocas"
Entropy-based automatic selection using histogram-based plateau detection to identify strongly correlated orbitals. See AutoCAS Algorithm below for a detailed description.
Note
This method requires the input wavefunction to have orbital entropies populated.
Orbital entropies are computed from the one- and two-electron reduced density matrices (1-RDM and 2-RDM), which are typically obtained from a multi-configuration calculation with calculate_one_rdm=True and calculate_two_rdm=True.
See MultiConfigurationCalculator for details on generating wavefunctions with RDMs.
Settings
| Setting | Type | Default | Description |
|---|---|---|---|
| entropy_threshold | float | | Entropy threshold for selection |
| min_plateau_size | int | | Minimum size of entropy plateau for selection |
| num_bins | int | | Number of histogram bins for plateau detection |
| | bool | | Whether to normalize entropy values |
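The histogram-based plateau detection used by this method can be illustrated with a small, self-contained sketch. It is a simplified reading of the idea, not the QDK/Chemistry implementation: the function name plateau_select is hypothetical, entropies are always normalized by their maximum, and the parameter values in the usage line are chosen only for the example.

```python
def plateau_select(entropies, num_bins, min_plateau_size, entropy_threshold):
    """Pick the largest orbital group that sits on a count plateau.

    Hypothetical helper for illustration only.
    """
    max_entropy = max(entropies)
    if max_entropy <= entropy_threshold:
        # No orbital is strongly correlated: treat as single-configurational.
        return []

    # counts[k] = number of orbitals whose normalized entropy exceeds k/num_bins.
    counts = [
        sum(1 for s in entropies if s / max_entropy > k / num_bins)
        for k in range(num_bins)
    ]

    # Scan runs of equal counts; a run of sufficient length is a plateau.
    best = 0
    run_start = 0
    for k in range(1, num_bins + 1):
        if k == num_bins or counts[k] != counts[run_start]:
            if k - run_start >= min_plateau_size:
                best = max(best, counts[run_start])
            run_start = k

    # Keep the `best` orbitals with the highest entropies.
    order = sorted(range(len(entropies)), key=lambda i: entropies[i], reverse=True)
    return sorted(order[:best])


# Made-up entropies: three strongly correlated orbitals, three weak ones.
entropies = [1.20, 1.10, 1.00, 0.15, 0.07, 0.03]
selected = plateau_select(entropies, num_bins=100, min_plateau_size=10,
                          entropy_threshold=0.1)
print(selected)
```

In this example the count of orbitals above the threshold stays at three over a wide range of bins, so the three high-entropy orbitals form the selected group.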
QDK AutoCAS EOS
Factory name: "qdk_autocas_eos"
Entropy-based selection using consecutive entropy differences to identify plateau boundaries. See AutoCAS Algorithm below for a detailed description.
Note
This method requires the input wavefunction to have orbital entropies populated.
Orbital entropies are computed from the one- and two-electron reduced density matrices (1-RDM and 2-RDM), which are typically obtained from a multi-configuration calculation with calculate_one_rdm=True and calculate_two_rdm=True. See MultiConfigurationCalculator for details on generating wavefunctions with RDMs.
Settings
| Setting | Type | Default | Description |
|---|---|---|---|
| entropy_threshold | float | | Entropy threshold for selection |
| diff_threshold | float | | Difference threshold for EOS-based selection |
| | bool | | Whether to normalize entropy values by the maximum |
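The consecutive-difference criterion of this method can be sketched as follows. This is an illustrative reading of the description, not the qdk_autocas_eos code: the function name eos_select is hypothetical, no normalization is applied, and the threshold values are arbitrary.

```python
def eos_select(entropies, diff_threshold, entropy_threshold):
    """Select orbitals up to the last qualifying gap in sorted entropies.

    Hypothetical helper for illustration only.
    """
    # Orbital indices sorted by decreasing entropy.
    order = sorted(range(len(entropies)), key=lambda i: entropies[i], reverse=True)
    ranked = [entropies[i] for i in order]

    group_size = 0
    for position in range(len(ranked) - 1):
        gap = ranked[position] - ranked[position + 1]
        # A boundary needs a large gap and a sufficiently high entropy above it.
        if gap > diff_threshold and ranked[position] > entropy_threshold:
            # Keep the largest group found so far.
            group_size = max(group_size, position + 1)

    return sorted(order[:group_size])


# Made-up entropies in arbitrary orbital order.
entropies = [0.90, 0.05, 0.85, 0.02, 0.80, 0.30]
selected = eos_select(entropies, diff_threshold=0.2, entropy_threshold=0.1)
print(selected)
```

Two boundaries qualify here (after the third and after the fourth ranked orbital), and the larger group of four orbitals is returned.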
AutoCAS Algorithm
Selecting an appropriate active space is one of the most challenging aspects of multi-configuration calculations. Traditional approaches rely on chemical intuition and trial-and-error, which can be unreliable for complex systems. The AutoCAS algorithm [SR16, SR19] provides a systematic, black-box approach to active space selection.
AutoCAS leverages concepts from quantum information theory to quantify orbital correlation. The key insight is that strongly correlated orbitals are highly entangled with the rest of the electronic system. This entanglement can be measured using the single-orbital entropy \(s_i^{(1)}\), which quantifies how much information about orbital \(i\) is “shared” with all other orbitals.
Single orbital entropies can be calculated for many-body systems given access to (approximate) one- and two-particle reduced density matrices (RDM) [BT15], which are easily accessible in QDK/Chemistry through multi-configuration wavefunction data structures. As such, single orbital entropies are computed by default when RDMs are requested in multi-configuration calculations. The QDK/Chemistry implementation of AutoCAS is agnostic to the underlying wavefunction method, as long as the required RDMs are available, thus allowing for comparisons across different multi-configuration approaches.
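As a sketch of how \(s_i^{(1)}\) follows from RDM data, the snippet below uses the standard literature definition: the entropy is \(-\sum_\alpha w_\alpha \ln w_\alpha\) over the four occupation states of a spatial orbital (empty, spin-up, spin-down, doubly occupied). The inputs are assumed to be the diagonal 1-RDM elements \(\langle n_{i\uparrow} \rangle\), \(\langle n_{i\downarrow} \rangle\) and the diagonal 2-RDM element \(\langle n_{i\uparrow} n_{i\downarrow} \rangle\). The function name and argument layout are hypothetical and do not reflect how QDK/Chemistry stores its RDMs.

```python
import math


def single_orbital_entropy(n_up, n_down, n_double):
    """Single-orbital entropy from occupation expectation values.

    n_up, n_down: spin-resolved occupations of the orbital (1-RDM diagonal).
    n_double: probability of double occupation (2-RDM diagonal).
    Hypothetical helper for illustration only.
    """
    probabilities = [
        1.0 - n_up - n_down + n_double,  # empty
        n_up - n_double,                 # spin-up only
        n_down - n_double,               # spin-down only
        n_double,                        # doubly occupied
    ]
    entropy = 0.0
    for w in probabilities:
        # Terms with w -> 0 contribute nothing (w ln w -> 0).
        if w > 1e-12:
            entropy -= w * math.log(w)
    return entropy


# A doubly occupied orbital carries no entropy.
print(single_orbital_entropy(1.0, 1.0, 1.0))
# Equal weight on all four states gives the maximum, ln 4.
print(single_orbital_entropy(0.5, 0.5, 0.25))
```

The upper bound of \(\ln 4\) is what makes normalizing entropies by a maximum value a natural option in the selection settings.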
QDK/Chemistry AutoCAS Variants
QDK/Chemistry provides two entropy-based selection methods:
- AutoCAS (Histogram-Based Plateau Detection)
As described in the original AutoCAS protocol [SR16, SR19], this method discretizes the entropy distribution into histogram bins and identifies plateaus: contiguous regions where the count of orbitals above each entropy threshold remains constant. This approach is robust for systems with clear entropy gaps but requires tuning of the num_bins and min_plateau_size parameters. If none of the entropies exceed the entropy_threshold, the system is considered single-configurational and all orbitals are excluded from the active space.
- AutoCAS-EOS (Entropy Difference Detection)
Uses a direct approach that examines consecutive differences in the sorted entropy values. When the difference between adjacent entropies exceeds diff_threshold and the entropy is above entropy_threshold, a plateau boundary is identified.
Both methods sort orbitals by decreasing entropy and select the largest identified group of strongly correlated orbitals for the active space.
Populating Orbital Entropies
The entropy-based AutoCAS methods require orbital entropies as input, which are computed from the one- and two-electron reduced density matrices (RDMs). These RDMs must be obtained from a multi-configuration calculation that captures static correlation. A key practical consideration is balancing the cost of this initial calculation against the quality of the resulting entropies.
A recommended approach is to use Selected Configuration Interaction (SCI) with a relatively small number of determinants (e.g., 10,000–50,000) within a conservatively chosen active space. SCI methods are well-suited for this purpose because they:
- Automatically identify the most important determinants for capturing static correlation
- Scale favorably compared to full configuration interaction
- Provide high-quality RDMs even with truncated determinant spaces
This approach has been shown to provide a reasonable trade-off between computational cost and entropy accuracy for active space selection. The resulting entropies are typically sufficient to identify strongly correlated orbitals, even when the SCI calculation uses a fraction of the determinants that would be required for quantitative energy accuracy.
Example
Note
The number of valence orbitals and electrons can be automatically determined using the utility function compute_valence_space_parameters().
// Create a valence space active space selector
auto valence_selector = ActiveSpaceSelectorFactory::create("qdk_valence");
// Automatically select valence parameters based on the input structure
auto [num_electrons, num_orbitals] =
qdk::chemistry::utils::compute_valence_space_parameters(scf_wavefunction,
charge);
valence_selector->settings().set("num_active_electrons", num_electrons);
valence_selector->settings().set("num_active_orbitals", num_orbitals);
auto active_valence_wfn = valence_selector->run(scf_wavefunction);
// Create active Hamiltonian
auto active_hamiltonian_generator = HamiltonianConstructorFactory::create();
auto active_hamiltonian =
active_hamiltonian_generator->run(active_valence_wfn->get_orbitals());
// Run Active Space Calculation with Selected CI
auto mc_calculator =
MultiConfigurationCalculatorFactory::create("macis_asci");
mc_calculator->settings().set("ntdets_max", 50000);
mc_calculator->settings().set("calculate_one_rdm", true);
mc_calculator->settings().set("calculate_two_rdm", true);
auto [mc_energy, mc_wavefunction] = mc_calculator->run(
active_hamiltonian, num_electrons / 2, num_electrons / 2);
// Select active space using AutoCAS
auto autocas_selector = ActiveSpaceSelectorFactory::create("qdk_autocas_eos");
auto active_autocas_wfn = autocas_selector->run(mc_wavefunction);
std::cout << "AutoCAS selected active orbitals summary:\n"
<< active_autocas_wfn->get_orbitals()->get_summary() << std::endl;
from qdk_chemistry.utils import compute_valence_space_parameters # noqa: E402
# Create a valence space active space selector
valence_selector = create("active_space_selector", "qdk_valence")
# Automatically select valence parameters based on the input structure
num_electrons, num_orbitals = compute_valence_space_parameters(scf_wavefunction, charge)
valence_selector.settings().set("num_active_electrons", num_electrons)
valence_selector.settings().set("num_active_orbitals", num_orbitals)
active_valence_wfn = valence_selector.run(scf_wavefunction)
# Create active Hamiltonian
active_hamiltonian_generator = create("hamiltonian_constructor")
active_hamiltonian = active_hamiltonian_generator.run(active_valence_wfn.get_orbitals())
# Run Active Space Calculation with Selected CI
mc_calculator = create("multi_configuration_calculator", "macis_asci")
mc_calculator.settings().set("ntdets_max", 50000)
mc_calculator.settings().set("calculate_one_rdm", True)
mc_calculator.settings().set("calculate_two_rdm", True)
mc_energy, mc_wavefunction = mc_calculator.run(
active_hamiltonian, num_electrons // 2, num_electrons // 2
)
# Select active space using AutoCAS
autocas_selector = create("active_space_selector", "qdk_autocas_eos")
active_autocas_wfn = autocas_selector.run(mc_wavefunction)
print("AutoCAS selected active orbitals summary:")
print(active_autocas_wfn.get_orbitals().get_summary())
PySCF AVAS
Factory name: "pyscf_avas"
The PySCF plugin provides access to the Automated Valence Active Space (AVAS) method from PySCF. AVAS selects active orbitals by projecting molecular orbitals onto a target atomic orbital basis. See the original AVAS publication [SSCK17] for details.
Settings
| Setting | Type | Default | Description |
|---|---|---|---|
| ao_labels | list[str] | | Atomic orbital labels to include (e.g., |
| canonicalize | bool | | Whether to canonicalize active orbitals after selection |
| | int | | Handling of singly-occupied orbitals: |
Example
avas = create("active_space_selector", "pyscf_avas")
avas.settings().set("ao_labels", ["O 2p", "O 2s"])
avas.settings().set("canonicalize", True)
active_wavefunction = avas.run(scf_wavefunction)
Further reading
The above examples can be downloaded as a complete Python script or C++ source file.
- MCCalculator: Uses active space for multiconfigurational calculations
- Settings: Configuration settings for algorithms
- Factory Pattern: Understanding algorithm creation
- compute_valence_space_parameters(): Utility function to determine valence orbitals and electrons