Serialization

QDK/Chemistry provides serialization for all of its data classes, so computational results can be saved and loaded in several formats. This document explains the serialization mechanisms and formats that QDK/Chemistry supports.

Overview

Serialization is the process of converting complex data structures into a format that can be stored or transmitted. In QDK/Chemistry, this is crucial for:

  • Saving intermediate results of calculations

  • Sharing data between different programs or languages

  • Preserving computational results for future analysis

  • Implementing checkpoint and restart capabilities
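
As a rough illustration of the checkpoint and restart use case, the sketch below uses only Python's standard json module, not the QDK/Chemistry API. The helper names save_checkpoint, load_checkpoint and run, the file name, and the layout of the state dictionary are all hypothetical.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state dictionary to a JSON file (hypothetical helper)."""
    # Write to a temporary file first, then rename it, so an interrupted
    # run does not leave a half-written checkpoint behind.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the saved state, or the default if no checkpoint exists."""
    if not os.path.exists(path):
        return default
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def run(path, total_steps, stop_after=None):
    """Toy iterative calculation that resumes from the last checkpoint."""
    state = load_checkpoint(path, {"step": 0, "energy": 0.0})
    while state["step"] < total_steps:
        if stop_after is not None and state["step"] >= stop_after:
            break  # simulate an interruption
        state["energy"] -= 0.5 ** (state["step"] + 1)  # placeholder update
        state["step"] += 1
        save_checkpoint(path, state)
    return state


with tempfile.TemporaryDirectory() as workdir:
    checkpoint = os.path.join(workdir, "toy.checkpoint.json")
    interrupted = run(checkpoint, total_steps=5, stop_after=2)
    resumed = run(checkpoint, total_steps=5)
```

The second call to run picks up at step 2 instead of starting over, which is the behavior a checkpoint file is meant to provide.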

Supported formats

QDK/Chemistry supports multiple serialization formats:

  • JSON: Human-readable text format, suitable for small to medium data

  • HDF5: Hierarchical binary format, suitable for large data sets

  • XYZ: Standard format for molecular geometries (for Structure only)

  • FCIDUMP: Format for Hamiltonian integrals (for Hamiltonian only)
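
To make the XYZ entry concrete, here is a minimal sketch of reading XYZ text with plain Python. It is not the QDK/Chemistry reader; the function name parse_xyz is hypothetical. It assumes the usual XYZ layout (atom count, comment line, then one "symbol x y z" line per atom in Angstrom) and converts to Bohr, mirroring the conversion mentioned in the examples in this document.

```python
# 1 Bohr = 0.529177210903 Angstrom (CODATA 2018)
ANGSTROM_TO_BOHR = 1.0 / 0.529177210903


def parse_xyz(text):
    """Parse XYZ text and return (symbols, coordinates in Bohr, comment)."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0].strip())
    comment = lines[1].strip()
    atom_lines = lines[2 : 2 + n_atoms]
    if len(atom_lines) != n_atoms:
        raise ValueError("XYZ text has fewer atom lines than declared")
    symbols, coords_bohr = [], []
    for line in atom_lines:
        symbol, x, y, z = line.split()[:4]
        symbols.append(symbol)
        # Convert each Cartesian component from Angstrom to Bohr
        coords_bohr.append(tuple(float(v) * ANGSTROM_TO_BOHR for v in (x, y, z)))
    return symbols, coords_bohr, comment


h2_xyz = """2
H2 molecule, coordinates in Angstrom
H 0.0 0.0 0.0
H 0.0 0.0 0.74
"""
symbols, coords_bohr, comment = parse_xyz(h2_xyz)
```

A bond length of 0.74 Angstrom comes out as roughly 1.4 Bohr, the same value used for the explicit coordinates in the JSON example.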

Common serialization interface

All QDK/Chemistry data classes implement a consistent serialization interface as described below.
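
The shape of such an interface can be sketched with a toy class. ToyStructure is a hypothetical stand-in, not a QDK/Chemistry class, and the "type" tag used for the mismatch check is an assumption about how a type check could work, not the library's actual JSON layout. Only the four method names are taken from the examples in this document.

```python
import json


class ToyStructure:
    """Hypothetical stand-in for a data class; not part of QDK/Chemistry."""

    TYPE_NAME = "ToyStructure"

    def __init__(self, symbols, coords):
        self.symbols = list(symbols)
        self.coords = [list(c) for c in coords]

    def to_json(self):
        """Return a JSON-compatible dictionary tagged with the type name."""
        return {
            "type": self.TYPE_NAME,
            "symbols": self.symbols,
            "coords": self.coords,
        }

    @classmethod
    def from_json(cls, data):
        """Rebuild an object, raising if the data describes another type."""
        if data.get("type") != cls.TYPE_NAME:
            raise ValueError(f"expected {cls.TYPE_NAME!r}, got {data.get('type')!r}")
        return cls(data["symbols"], data["coords"])

    def to_json_file(self, path):
        """Write the JSON representation to a file."""
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(self.to_json(), handle, indent=2)

    @classmethod
    def from_json_file(cls, path):
        """Read a file written by to_json_file."""
        with open(path, encoding="utf-8") as handle:
            return cls.from_json(json.load(handle))


original = ToyStructure(["H", "H"], [[0.0, 0.0, 0.0], [0.0, 0.0, 1.4]])
restored = ToyStructure.from_json(original.to_json())
```

Keeping the in-memory pair (to_json, from_json) separate from the file pair lets the same representation be reused for files, network transfer, or tests.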

JSON serialization

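C++:

#include <filesystem>
#include <memory>
#include <string>
#include <vector>

// Eigen types are assumed to be made available by the QDK/Chemistry header
#include <qdk/chemistry.hpp>
using namespace qdk::chemistry::data;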

int main() {
  // Load structure from XYZ file (the file uses Angstrom, converted to Bohr
  // internally)
  Structure structure_from_file =
      Structure::from_xyz_file("../data/h2.structure.xyz");

  // For demonstration: create a structure with custom masses and charges
  // (requires explicit coordinates, here in Bohr)
  std::vector<Eigen::Vector3d> coords = {{0.0, 0.0, 0.0}, {0.0, 0.0, 1.4}};
  std::vector<std::string> symbols = {"H", "H"};
  std::vector<double> custom_masses{1.001, 0.999};
  std::vector<double> custom_charges = {0.9, 1.1};
  Structure structure(coords, symbols, custom_masses, custom_charges);

  // Serialize to JSON object
  auto structure_data = structure.to_json();

  const char* filename =
      "h2_example.structure.json";  // Extension depends on object type

  // Deserialize from JSON object.
  // Structure is the type to deserialize into; from_json throws if the data
  // does not match that type.
  auto structure_from_json = Structure::from_json(structure_data);

  // Write to JSON file
  structure.to_json_file(filename);

  // Read from JSON file
  auto structure_from_json_file = Structure::from_json_file(filename);

  // Remove the example file again
  std::filesystem::remove(filename);
  // (main() continues in the HDF5 serialization example)

Python:

import os
from pathlib import Path

import numpy as np
from qdk_chemistry.data import (
    Structure,
    Hamiltonian,
    CanonicalFourCenterHamiltonianContainer,
    ModelOrbitals,
)

# Load structure from XYZ file (the file uses Angstrom, which is converted to Bohr internally)
structure = Structure.from_xyz_file(Path(__file__).parent / "../data/h2.structure.xyz")

# For demonstration: create a structure with custom masses and charges
# (requires explicit coordinates, here in Bohr)
coords = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.4]])  # Bohr
symbols = ["H", "H"]
custom_masses = [1.001, 0.999]
custom_charges = [0.9, 1.1]
structure_custom = Structure(
    coords, symbols=symbols, masses=custom_masses, nuclear_charges=custom_charges
)

# Serialize to JSON object
structure_data = structure_custom.to_json()

# Deserialize from JSON object.
# Structure is the type to deserialize into; from_json raises if the data does not match it.
structure_from_json = Structure.from_json(structure_data)

# Write to JSON file
tmpfile = "example.structure.json"
structure.to_json_file(tmpfile)

# Read from JSON file
structure_from_json_file = Structure.from_json_file(tmpfile)

os.remove(tmpfile)

HDF5 serialization

C++:

  // Hamiltonian data class example (continues inside main() from the JSON
  // serialization example)
  // Create dummy data for the Hamiltonian class
  Eigen::MatrixXd one_body = Eigen::MatrixXd::Identity(2, 2);
  Eigen::VectorXd two_body = 2 * Eigen::VectorXd::Ones(16);
  auto orbitals =
      std::make_shared<ModelOrbitals>(2, true);  // 2 orbitals, restricted
  double core_energy = 1.5;
  Eigen::MatrixXd inactive_fock = Eigen::MatrixXd::Zero(0, 0);

  Hamiltonian h2_example(one_body, two_body, orbitals, core_energy,
                         inactive_fock);

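  // Serialize to HDF5 file (the extension depends on the object type)
  h2_example.to_hdf5_file("h2_example.hamiltonian.h5");

  // Deserialize from HDF5 file
  auto h2_example_from_hdf5_file =
      Hamiltonian::from_hdf5_file("h2_example.hamiltonian.h5");

  // Remove the example file again
  std::filesystem::remove("h2_example.hamiltonian.h5");
  return 0;
}

Python:

# Hamiltonian data class example
# Create dummy data for the Hamiltonian class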
one_body = np.identity(2)
two_body = 2 * np.ones((16,))
orbitals = ModelOrbitals(2, True)  # 2 orbitals, restricted
core_energy = 1.5
inactive_fock = np.zeros((0, 0))

h2_example = Hamiltonian(
    CanonicalFourCenterHamiltonianContainer(
        one_body, two_body, orbitals, core_energy, inactive_fock
    )
)

# Serialize to HDF5 file
h2_example.to_hdf5_file("h2_example.hamiltonian.h5")

# Deserialize from HDF5 file
h2_example_from_hdf5_file = Hamiltonian.from_hdf5_file("h2_example.hamiltonian.h5")

# Remove the example file again
os.remove("h2_example.hamiltonian.h5")
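
The dummy two-body array in the example above has 16 entries for 2 orbitals, which equals 2 to the fourth power. The sketch below assumes that the two-body integrals are stored as a flattened four-index array in row-major order. That ordering is an assumption for illustration and is not confirmed by this document, and both helper names are hypothetical.

```python
def two_body_length(n_orbitals):
    """Number of entries in a full four-index array over n_orbitals."""
    return n_orbitals**4


def flat_index(p, q, r, s, n_orbitals):
    """Row-major position of element (p, q, r, s) in the flattened array.

    Assumed ordering for illustration only.
    """
    n = n_orbitals
    return ((p * n + q) * n + r) * n + s


n = 2
# Every index combination maps to a distinct position in the flat array
all_positions = sorted(
    flat_index(p, q, r, s, n)
    for p in range(n)
    for q in range(n)
    for r in range(n)
    for s in range(n)
)
```

For larger systems this fourth-power growth is the reason a binary format such as HDF5 is preferred over JSON for Hamiltonian data.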

File extensions

QDK/Chemistry enforces specific file extensions to ensure clarity about the content type:

Data class     JSON extension        HDF5 extension      Other formats

Structure      .structure.json       .structure.h5       .structure.xyz

BasisSet       .basis_set.json       .basis_set.h5       -

Orbitals       .orbitals.json        .orbitals.h5        -

Hamiltonian    .hamiltonian.json     .hamiltonian.h5     .hamiltonian.fcidump

Wavefunction   .wavefunction.json    .wavefunction.h5    -

Other data classes in QDK/Chemistry follow the same pattern.
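
A file-name check along these lines could look like the following sketch. It is not the library's implementation: the dictionary EXPECTED_SUFFIXES and the function check_extension are hypothetical, and whether QDK/Chemistry raises an error or handles a mismatch differently is not stated here. Only the JSON and HDF5 columns of the table above are included.

```python
from pathlib import Path

# JSON and HDF5 suffixes copied from the table above. The dictionary and
# function names are hypothetical and not part of QDK/Chemistry.
EXPECTED_SUFFIXES = {
    "Structure": {"json": ".structure.json", "hdf5": ".structure.h5"},
    "BasisSet": {"json": ".basis_set.json", "hdf5": ".basis_set.h5"},
    "Orbitals": {"json": ".orbitals.json", "hdf5": ".orbitals.h5"},
    "Hamiltonian": {"json": ".hamiltonian.json", "hdf5": ".hamiltonian.h5"},
    "Wavefunction": {"json": ".wavefunction.json", "hdf5": ".wavefunction.h5"},
}


def check_extension(filename, data_class, fmt):
    """Return filename unchanged if it ends with the expected suffix."""
    expected = EXPECTED_SUFFIXES[data_class][fmt]
    if not Path(filename).name.endswith(expected):
        raise ValueError(
            f"{data_class} {fmt} files should end with {expected!r}: {filename!r}"
        )
    return filename


checked = check_extension("h2_example.hamiltonian.h5", "Hamiltonian", "hdf5")
```

Passing a name such as "h2.json" for a Structure would raise ValueError in this sketch, because the suffix does not identify the data class.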

Further reading

  • The above examples can be downloaded as complete C++ and Python scripts.

  • Structure: Molecular geometry and atomic information

  • BasisSet: Basis set definitions

  • Orbitals: Molecular orbital coefficients and properties

  • Hamiltonian: Electronic Hamiltonian operator

  • Wavefunction: Wavefunction data