Utils API Reference¶
Utility functions and classes for the SciStanPy package.
This module provides various utility functions and classes that support the core functionality of SciStanPy, including:
Lazy importing mechanisms for performance optimization
Mathematical utility functions for numerical stability
Array chunking utilities for efficient memory management
Context managers for external library integration
Optimized statistical computation functions
Users will not typically need to interact with this module directly; it is designed for internal use by SciStanPy.
Lazy Import System¶
To speed up initial import times, SciStanPy provides a lazy import system that defers loading of optional dependencies until they are actually needed.
- scistanpy.utils.lazy_import(name: str)[source]¶
Import a module only when it is first needed.
This function implements lazy module importing to improve package import performance by deferring module loading until actual use.
- Parameters:
name (str) – The fully qualified module name to import
- Returns:
The imported module
- Return type:
module
- Raises:
ImportError – If the specified module cannot be found
- Example:
>>> # Module is not loaded until first use
>>> numpy_module = lazy_import('numpy')
>>> # Now numpy is actually imported
>>> array = numpy_module.array([1, 2, 3])
Note
If the module is already imported, returns the cached version from sys.modules for efficiency.
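The caching and deferred-loading behavior described above can be sketched with the standard library's importlib.util.LazyLoader. This is a hypothetical illustration of the pattern, not SciStanPy's actual implementation:

```python
import importlib.util
import sys


def lazy_import(name: str):
    """Return a module, deferring its execution until first attribute access."""
    # Reuse the cached module if it has already been imported.
    if name in sys.modules:
        return sys.modules[name]

    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError(f"Module {name!r} cannot be found")

    # LazyLoader wraps the real loader and postpones executing the module
    # until the first attribute access.
    spec.loader = importlib.util.LazyLoader(spec.loader)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module
```

With this pattern, `lazy_import('numpy')` returns immediately; the real import cost is paid on the first attribute access such as `numpy_module.array`.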
- class scistanpy.utils.LazyObjectProxy(module_name: str, obj_name: str)[source]¶
Bases:
object
A proxy that delays importing a module and accessing an object until first use.
This class provides a lazy loading mechanism for specific objects within modules, allowing fine-grained control over when imports occur. The proxy forwards all method calls and attribute access to the actual object once it’s loaded.
- Parameters:
module_name (str) – The fully qualified name of the module containing the object
obj_name (str) – The name of the object to import from the module
- Variables:
_module_name – Stored module name for lazy loading
_obj_name – Stored object name for lazy loading
_cached_obj – Cached reference to the imported object (None until first use)
- Example:
>>> # Create a proxy for numpy.array
>>> array_proxy = LazyObjectProxy('numpy', 'array')
>>> # numpy is not imported yet
>>> my_array = array_proxy([1, 2, 3])  # Now numpy is imported
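The proxy mechanism can be illustrated with a minimal sketch. This is hypothetical and shows only call and attribute forwarding; the real class forwards further special methods:

```python
import importlib


class LazyObjectProxy:
    """Delay importing module_name and fetching obj_name until first use."""

    def __init__(self, module_name: str, obj_name: str):
        self._module_name = module_name
        self._obj_name = obj_name
        self._cached_obj = None  # None until first use

    def _load(self):
        # Import the module and cache the target object on first access.
        if self._cached_obj is None:
            module = importlib.import_module(self._module_name)
            self._cached_obj = getattr(module, self._obj_name)
        return self._cached_obj

    def __call__(self, *args, **kwargs):
        # Forward calls to the (now loaded) target object.
        return self._load()(*args, **kwargs)

    def __getattr__(self, name):
        # Forward attribute access to the target object.
        return getattr(self._load(), name)
```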
- scistanpy.utils.lazy_import_from(module_name: str, obj_name: str)[source]¶
Create a lazy import proxy for a specific object from a module.
This function provides a convenient way to create lazy import proxies, equivalent to
from module_name import obj_name
but with deferred loading.
- Parameters:
module_name (str) – The fully qualified module name to import from
obj_name (str) – The name of the object to import from the module
- Returns:
A proxy that will import and return the object when first accessed
- Return type:
LazyObjectProxy
- Example:
>>> # Equivalent to 'from numpy import array' but lazy
>>> array = lazy_import_from('numpy', 'array')
>>> my_array = array([1, 2, 3])  # numpy imported here
Backend Selection¶
Many SciStanPy operations can be performed using either NumPy or PyTorch as the underlying numerical backend. The utility function below automates the selection of the appropriate backend based on the input data type.
- scistanpy.utils.choose_module(dist: torch.Tensor | 'custom_types.SampleType') ModuleType [source]¶
Choose the appropriate computational module based on input type.
This function provides automatic backend selection between NumPy and PyTorch based on the type of the input data.
- Parameters:
dist (Union[torch.Tensor, np.ndarray, custom_types.SampleType]) – Input data whose type determines the module choice
- Returns:
The appropriate module (torch for tensors, numpy for arrays)
- Return type:
Union[torch, np]
- Raises:
TypeError – If the input type is not supported
- Example:
>>> import torch
>>> tensor = torch.tensor([1.0, 2.0])
>>> module = choose_module(tensor)  # Returns torch module
>>> result = module.exp(tensor)
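The dispatch logic might look like the following sketch. It is hypothetical: the set of accepted scalar types is an assumption about what custom_types.SampleType covers, not SciStanPy's actual code:

```python
import numpy as np


def choose_module(dist):
    """Return torch for tensors, numpy for arrays and plain scalars."""
    # Import torch lazily so NumPy-only workflows avoid the import cost.
    try:
        import torch

        if isinstance(dist, torch.Tensor):
            return torch
    except ImportError:
        pass
    if isinstance(dist, (np.ndarray, np.floating, np.integer, float, int)):
        return np
    raise TypeError(f"Unsupported input type: {type(dist).__name__}")
```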
Numerical Stability¶
In probabilistic computations, numerical stability is often a concern. The following utility function provides a numerically stable implementation of the sigmoid function.
- scistanpy.utils.stable_sigmoid(exponent: ndarray[tuple[int, ...], dtype[floating]]) ndarray[tuple[int, ...], dtype[floating]][source]¶
- scistanpy.utils.stable_sigmoid(exponent: torch.Tensor) torch.Tensor
Compute sigmoid function in a numerically stable way.
This function implements a numerically stable version of the sigmoid function that avoids overflow issues by using different computational approaches for positive and negative inputs.
- Parameters:
exponent (Union[torch.Tensor, npt.NDArray[np.floating]]) – Input values for sigmoid computation
- Returns:
Sigmoid values with the same type and shape as input
- Return type:
Union[torch.Tensor, npt.NDArray[np.floating]]
The function uses the identity:
\[\begin{split}\sigma(x) = \begin{cases} \frac{1}{1 + e^{-x}} & \text{if } x \geq 0 \\ \frac{e^{x}}{1 + e^{x}} & \text{if } x < 0 \end{cases}\end{split}\]
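For the NumPy path, the piecewise identity translates directly into code; a minimal sketch:

```python
import numpy as np


def stable_sigmoid(exponent: np.ndarray) -> np.ndarray:
    """Numerically stable sigmoid using the piecewise formulation."""
    out = np.empty_like(exponent, dtype=float)
    pos = exponent >= 0
    # For x >= 0, exp(-x) <= 1, so 1 / (1 + exp(-x)) cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-exponent[pos]))
    # For x < 0, exp(x) < 1, so exp(x) / (1 + exp(x)) cannot overflow.
    exp_x = np.exp(exponent[~pos])
    out[~pos] = exp_x / (1.0 + exp_x)
    return out
```

A naive `1 / (1 + np.exp(-x))` overflows for large negative `x`; the split ensures `exp` is never evaluated on a large positive argument.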
Dask Integration¶
For particularly large models, sampling via Stan can yield more data than can fit in memory. To handle such cases, SciStanPy integrates Dask to enable out-of-core computation and parallel processing, most notably with the SampleResults
class. The following utility functions assist with Dask integration.
- scistanpy.utils.get_chunk_shape(
- array_shape: tuple[custom_types.Integer, ...],
- array_precision: Literal['double', 'single', 'half'],
- mib_per_chunk: custom_types.Integer | None = None,
- frozen_dims: Collection[custom_types.Integer] = (),
- ) tuple[custom_types.Integer, ...][source]¶
Calculate optimal chunk shape for Dask arrays based on memory constraints.
This function determines the optimal chunking strategy for large arrays processed with Dask, balancing memory usage with computational efficiency. It respects frozen dimensions that should not be chunked.
- Parameters:
array_shape (tuple[custom_types.Integer, ...]) – Shape of the array to be chunked
array_precision (Literal["double", "single", "half"]) – Numerical precision assumed in calculating memory usage.
mib_per_chunk (Union[custom_types.Integer, None]) – Target chunk size in MiB. If None, uses Dask default
frozen_dims (Collection[custom_types.Integer]) – Dimensions that should not be chunked
- Returns:
Optimal chunk shape for the array
- Return type:
tuple[custom_types.Integer, …]
- Raises:
ValueError – If mib_per_chunk is negative
IndexError – If frozen_dims contains invalid dimension indices
- The algorithm:
Calculates memory usage per array element based on precision
Sets frozen dimensions to their full size
Iteratively determines chunk sizes for remaining dimensions
Ensures total chunk memory stays within the specified limit (or as close to it as possible if frozen dimensions result in a smallest possible size above the limit)
- Example:
>>> # Chunk a (1000, 2000, 100) array, keeping last dim intact
>>> shape = get_chunk_shape(
...     (1000, 2000, 100), "double",
...     mib_per_chunk=64, frozen_dims=(2,)
... )
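The algorithm above can be sketched as follows. This is hypothetical: where the real function defers to Dask's configured default chunk size when mib_per_chunk is None, the sketch substitutes a fixed 128 MiB assumption:

```python
import math

# Bytes per element for each supported precision.
_BYTES_PER_ELEMENT = {"double": 8, "single": 4, "half": 2}


def get_chunk_shape(array_shape, array_precision, mib_per_chunk=None, frozen_dims=()):
    """Shrink non-frozen dimensions until a chunk fits the memory budget."""
    if mib_per_chunk is None:
        mib_per_chunk = 128  # stand-in for Dask's configured default
    if mib_per_chunk < 0:
        raise ValueError("mib_per_chunk must be non-negative")
    if any(not -len(array_shape) <= d < len(array_shape) for d in frozen_dims):
        raise IndexError("frozen_dims contains invalid dimension indices")

    # Memory budget expressed as a number of array elements.
    budget = mib_per_chunk * 2**20 // _BYTES_PER_ELEMENT[array_precision]
    frozen = {d % len(array_shape) for d in frozen_dims}

    chunk = list(array_shape)
    for dim in range(len(chunk)):
        if dim in frozen:
            continue  # frozen dimensions keep their full size
        # Largest size for this dimension that keeps the chunk within budget,
        # given the sizes already fixed for the other dimensions.
        others = math.prod(c for i, c in enumerate(chunk) if i != dim)
        chunk[dim] = max(1, min(chunk[dim], budget // max(1, others)))
    return tuple(chunk)
```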
- class scistanpy.utils.az_dask(dask_type: str = 'parallelized', output_dtypes: list[object] | None = None)[source]¶
Bases:
object
Context manager for enabling Dask integration with ArviZ.
This context manager provides a convenient way to enable Dask-based parallel computation within ArviZ operations, automatically handling the setup and teardown of Dask configuration.
- Parameters:
dask_type (str) – Type of Dask computation to enable
output_dtypes (Union[list[object], None]) – Expected output data types for Dask operations
- Variables:
dask_type – Stored Dask computation type
output_dtypes – Stored output data types configuration
- Example:
>>> with az_dask() as dask_ctx:
...     # ArviZ operations here will use Dask parallelization
...     result = az.summary(trace_data)
Note
The context manager automatically disables Dask when exiting, ensuring clean state management.
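A minimal sketch of such a context manager, assuming ArviZ's arviz.utils.Dask switch (enable_dask/disable_dask); treat the exact ArviZ call as an assumption rather than SciStanPy's verbatim code:

```python
class az_dask:
    """Enable ArviZ's Dask support on entry and disable it on exit."""

    def __init__(self, dask_type: str = "parallelized", output_dtypes=None):
        self.dask_type = dask_type
        self.output_dtypes = output_dtypes

    def __enter__(self):
        # ArviZ exposes a module-level switch for Dask-backed computation
        # (assumed API: arviz.utils.Dask).
        from arviz.utils import Dask

        Dask.enable_dask(
            dask_kwargs={"dask": self.dask_type, "output_dtypes": self.output_dtypes}
        )
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        from arviz.utils import Dask

        # Always restore the non-Dask state, even if the body raised.
        Dask.disable_dask()
        return False
```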