Custom SciPy Distributions API Reference
Custom SciPy distribution implementations for SciStanPy models.
This module provides extended and custom SciPy distribution classes. Among other things, these implementations provide:
Enhanced Batch Support: Extended multivariate distributions with variable batch dimensions
Custom Transformations: Log-transformed distributions with proper Jacobian corrections
Alternative Parameterizations: Logit and log-probability parameterizations for multinomial distributions
Numerical Stability: Improved implementations for edge cases and extreme values
The custom probability distributions in this module are implemented using SciPy’s distribution framework and are designed to work within it while providing enhanced functionality for advanced probabilistic modeling scenarios commonly encountered in SciStanPy applications. Users will not normally call them directly; instead, SciStanPy model components such as Parameter use them internally for sampling from prior distributions.
Custom-Built SciPy Distributions
Some distributions are extensions of existing SciPy distributions, while others are built from the ground up to provide specific features. The custom-built distribution classes come first:
- class scistanpy.model.components.custom_distributions.custom_scipy_dists.CustomDirichlet(seed=None)
Bases: dirichlet_gen
Enhanced Dirichlet distribution supporting variable batch dimensions.
This class extends SciPy’s standard Dirichlet distribution to support arbitrary batch dimensions while maintaining compatibility with the SciPy distribution interface. The standard SciPy implementation has limitations with batch operations that this class addresses.
- Key Enhancements:
Support for arbitrary batch dimensions in alpha parameters
Proper broadcasting behavior across batch dimensions
Consistent output shapes for all distribution methods
Efficient vectorized operations over batch elements
The implementation uses a decorator pattern to extend existing SciPy methods with batch dimension handling while preserving the original mathematical properties of the Dirichlet distribution.
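The decorator idea can be sketched with plain SciPy. The name `with_batch_dims` and the row-by-row loop below are hypothetical illustrations, not the SciStanPy implementation; the sketch only shows how a wrapper can add batch handling around an existing method such as `scipy.stats.dirichlet.logpdf`.

```python
import functools

import numpy as np
from scipy import stats


def with_batch_dims(func):
    """Hypothetical decorator: evaluate func(x, alpha) once per batch element."""

    @functools.wraps(func)
    def wrapper(x, alpha):
        # Broadcast x and alpha to a common shape (..., k).
        x, alpha = np.broadcast_arrays(
            np.asarray(x, dtype=float), np.asarray(alpha, dtype=float)
        )
        batch_shape, k = x.shape[:-1], x.shape[-1]
        # Flatten the batch axes, call the wrapped method row by row,
        # then restore the batch shape on the result.
        rows = zip(x.reshape(-1, k), alpha.reshape(-1, k))
        out = np.array([func(xi, ai) for xi, ai in rows])
        return out.reshape(batch_shape)

    return wrapper


batched_logpdf = with_batch_dims(stats.dirichlet.logpdf)

rng = np.random.default_rng(0)
alpha = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])  # batch of 2, k = 3
x = np.stack([rng.dirichlet(a) for a in alpha])       # one simplex point per row
log_density = batched_logpdf(x, alpha)                # shape (2,)
```

A vectorized implementation would avoid the Python loop; the loop is used here only to keep the wrapping pattern easy to follow.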
- rvs(alpha: npt.NDArray[np.floating], size: tuple['custom_types.Integer', ...] | 'custom_types.Integer' | None = 1, random_state: 'custom_types.Integer' | np.random.Generator | None = None)
Generate random samples from the Dirichlet distribution.
- Parameters:
alpha (npt.NDArray[np.floating]) – Concentration parameters with shape (…, k)
size (Union[tuple[custom_types.Integer, ...], custom_types.Integer, None]) – Output shape. Defaults to 1.
random_state (Union[custom_types.Integer, np.random.Generator, None]) – Random state for reproducible sampling. Defaults to None.
- Returns:
Random samples from Dirichlet distribution
- Return type:
npt.NDArray[np.floating]
- Raises:
ValueError – If alpha cannot be broadcast to the specified size
This method supports arbitrary batch dimensions in the alpha parameter and properly broadcasts to the requested output size while maintaining the simplex constraint for each sample.
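As a rough illustration of batched Dirichlet sampling, the sketch below normalizes independent Gamma draws with NumPy. The helper `batched_dirichlet_rvs` is hypothetical, and it assumes that `size` prepends sample axes to the shape of alpha, which may differ from how `CustomDirichlet.rvs` interprets `size`.

```python
import numpy as np


def batched_dirichlet_rvs(alpha, size=1, seed=None):
    """Hypothetical helper: Dirichlet draws with shape size + alpha.shape."""
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    size = (size,) if np.isscalar(size) else tuple(size)
    # Independent Gamma(alpha_i, 1) draws, normalized over the last axis,
    # are Dirichlet(alpha) distributed.
    gammas = rng.standard_gamma(alpha, size=size + alpha.shape)
    return gammas / gammas.sum(axis=-1, keepdims=True)


alpha = np.array([[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]])  # batch shape (2,), k = 3
samples = batched_dirichlet_rvs(alpha, size=100, seed=0)
```

Each slice `samples[i, j]` lies on the simplex: its entries are non-negative and sum to one.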
- class scistanpy.model.components.custom_distributions.custom_scipy_dists.ExpDirichlet(seed=None)
Bases: CustomDirichlet
Log-transformed Dirichlet distribution (Exponential-Dirichlet).
This class implements the distribution of the elementwise logarithm of a Dirichlet-distributed random vector: if exp(x) follows a Dirichlet distribution, then x follows this distribution. It is useful for modeling log-scale compositional data and log-probability vectors, with proper Jacobian corrections applied to the density.
- logpdf(x, alpha)
Compute log probability density with Jacobian correction.
- Parameters:
x – Log-probability values
alpha – Concentration parameters
- Returns:
Log probability density values
The implementation includes the proper Jacobian correction for the log-transformation, computed analytically for efficiency and numerical stability.
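One common way to write this correction is sketched below. It assumes the Jacobian term is the sum of all k log-probabilities, so the Dirichlet term `sum((alpha - 1) * x)` and the Jacobian term `sum(x)` combine into `sum(alpha * x)`. Whether SciStanPy uses exactly this convention (for example, summing over all k coordinates rather than k - 1 free ones) is not stated here, and `exp_dirichlet_logpdf` is a hypothetical name.

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln


def exp_dirichlet_logpdf(x, alpha):
    """Hypothetical sketch: log density of x = log(p) with p ~ Dirichlet(alpha)."""
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    # Log of the multivariate Beta function B(alpha).
    log_beta = gammaln(alpha).sum(axis=-1) - gammaln(alpha.sum(axis=-1))
    # Dirichlet term plus Jacobian term (assumed convention): sum(alpha * x).
    return (alpha * x).sum(axis=-1) - log_beta


alpha = np.array([2.0, 3.0, 4.0])
p = np.array([0.2, 0.3, 0.5])
x = np.log(p)

direct = exp_dirichlet_logpdf(x, alpha)
# Same quantity built from SciPy's Dirichlet log density plus the Jacobian term.
via_scipy = stats.dirichlet.logpdf(p, alpha) + x.sum()
```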
- pdf(x, alpha)
Compute probability density function.
- Parameters:
x – Log-probability values
alpha – Concentration parameters
- Returns:
Probability density values
Computed as the exponential of the log probability density for numerical stability and consistency.
- rvs(alpha: npt.NDArray[np.floating], size: tuple['custom_types.Integer', ...] | 'custom_types.Integer' | None = 1, random_state: 'custom_types.Integer' | np.random.Generator | None = None)
Generate random samples from the log-transformed Dirichlet distribution.
- Parameters:
alpha (npt.NDArray[np.floating]) – Concentration parameters
size (Union[tuple[custom_types.Integer, ...], custom_types.Integer, None]) – Output shape. Defaults to 1.
random_state (Union[custom_types.Integer, np.random.Generator, None]) – Random state. Defaults to None.
- Returns:
Log-probability samples
- Return type:
npt.NDArray[np.floating]
Samples are generated by first sampling from the standard Dirichlet distribution and then applying the logarithmic transformation.
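The two-step sampling scheme can be illustrated with NumPy alone. This is a sketch of the idea rather than the class's own code: draw from a Dirichlet distribution, then take the elementwise logarithm.

```python
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)
alpha = np.array([2.0, 3.0, 4.0])

p = rng.dirichlet(alpha, size=5)  # shape (5, 3), each row on the simplex
log_p = np.log(p)                 # log-probability vectors

# Each row of log_p is a normalized log-probability vector,
# so its log-sum-exp is zero up to floating-point error.
normalizer = logsumexp(log_p, axis=-1)
```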
- class scistanpy.model.components.custom_distributions.custom_scipy_dists.CustomMultinomial(seed=None)
Bases: multinomial_gen
Enhanced multinomial distribution supporting variable batch dimensions.
This class extends SciPy’s standard multinomial distribution to support arbitrary batch dimensions in both the trial count (n) and probability parameters (p), enabling flexible batch operations for discrete multivariate modeling scenarios.
- Key Enhancements:
Variable batch dimensions for n and p parameters
Proper broadcasting behavior between n and p
Support for different trial counts across batch elements
Consistent output shapes for sampling operations
- Example:
>>> # Batch multinomial with different trial counts
>>> n = np.array([[10], [20], [15]])
>>> p = np.array([[0.3, 0.4, 0.3],
...               [0.2, 0.5, 0.3],
...               [0.4, 0.3, 0.3]])
>>> multinomial = CustomMultinomial()
>>> samples = multinomial.rvs(n=n, p=p, size=100)  # shape = (100, 3, 3)
- rvs(n: 'custom_types.Integer' | npt.NDArray[np.integer], p: npt.NDArray[np.floating], size: tuple['custom_types.Integer', ...] | 'custom_types.Integer' | None = 1, random_state: 'custom_types.Integer' | np.random.Generator | None = None)
Generate random samples from the multinomial distribution.
- Parameters:
n (Union[custom_types.Integer, npt.NDArray[np.integer]]) – Number of trials (can be scalar or array)
p (npt.NDArray[np.floating]) – Event probabilities with shape (…, k)
size (Union[tuple[custom_types.Integer, ...], custom_types.Integer, None]) – Output shape. Defaults to 1.
random_state (Union[custom_types.Integer, np.random.Generator, None]) – Random state for reproducible sampling. Defaults to None.
- Returns:
Random samples from multinomial distribution
- Return type:
npt.NDArray[np.integer]
- Raises:
ValueError – If n and p cannot be broadcast to compatible shapes
This method supports different trial counts for each batch element and handles broadcasting between scalar/array n and multi-dimensional p.
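The broadcasting behavior can be approximated with NumPy as follows. The helper `batched_multinomial_rvs` is hypothetical; it assumes that n carries a trailing length-1 axis (as in the class example) and that `size` prepends sample axes. It loops over batch elements for clarity rather than speed.

```python
import numpy as np


def batched_multinomial_rvs(n, p, size=1, seed=None):
    """Hypothetical helper: multinomial draws with shape size + batch + (k,)."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p, dtype=float)
    # Assumption: n has a trailing length-1 axis, as in the class example.
    n = np.asarray(n)[..., 0]
    k = p.shape[-1]
    # Raises ValueError when n and p have incompatible batch shapes.
    batch_shape = np.broadcast_shapes(n.shape, p.shape[:-1])
    n_b = np.broadcast_to(n, batch_shape)
    p_b = np.broadcast_to(p, batch_shape + (k,))
    size = (size,) if np.isscalar(size) else tuple(size)
    out = np.empty(size + batch_shape + (k,), dtype=np.int64)
    lead = (slice(None),) * len(size)
    # Draw each batch element separately with its own n and p.
    for idx in np.ndindex(*batch_shape):
        out[lead + idx] = rng.multinomial(n_b[idx], p_b[idx], size=size)
    return out


n = np.array([[10], [20], [15]])
p = np.array([[0.3, 0.4, 0.3], [0.2, 0.5, 0.3], [0.4, 0.3, 0.3]])
samples = batched_multinomial_rvs(n, p, size=100, seed=0)  # shape (100, 3, 3)
```

The counts in every draw sum to the trial count of the corresponding batch element.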
- class scistanpy.model.components.custom_distributions.custom_scipy_dists.MultinomialLogit(seed=None)
Bases: CustomMultinomial
Multinomial distribution with logit parameterization. This is identical to CustomMultinomial except that the probabilities need not be normalized and are specified as logits.
- class scistanpy.model.components.custom_distributions.custom_scipy_dists.MultinomialLogTheta(seed=None)
Bases: CustomMultinomial
Multinomial distribution with normalized log-probability parameterization. This is identical to CustomMultinomial except that the probabilities are specified as log-probabilities that must already be normalized (i.e., their exponentials sum to 1).
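The relationship between the two alternative parameterizations and ordinary probabilities can be sketched with SciPy. This assumes the logits are mapped to probabilities with a softmax, which is the usual reading of a logit parameterization but is not spelled out in this reference.

```python
import numpy as np
from scipy.special import log_softmax, softmax

# Logits: unnormalized, any real values (assumed to be mapped through a softmax).
logits = np.array([[2.0, 0.0, -1.0], [0.5, 0.5, 0.5]])
p_from_logits = softmax(logits, axis=-1)  # rows sum to 1

# Log-theta: log-probabilities that are already normalized.
log_theta = log_softmax(logits, axis=-1)
p_from_log_theta = np.exp(log_theta)

# Adding a constant to the logits leaves the probabilities unchanged,
# which is why logits need not be normalized.
shifted = softmax(logits + 10.0, axis=-1)
```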
Transforms of Existing SciPy Distributions
Other distributions are implemented as transforms of existing SciPy distributions to provide additional flexibility. Transformations are applied to the base distribution using the following classes:
- class scistanpy.model.components.custom_distributions.custom_scipy_dists.TransformedScipyDist(base_dist: rv_continuous)
Bases: ABC
Abstract base class for transformed SciPy distributions.
This class provides a framework for creating distributions that are transformations of existing SciPy distributions. It handles the mathematical details of transformation including Jacobian corrections for probability density functions.
- Parameters:
base_dist (stats.rv_continuous) – Base SciPy distribution to transform
- Key Features:
Automatic Jacobian correction for probability densities
Proper transformation of all distribution methods
Maintains SciPy distribution interface compatibility
Support for arbitrary invertible transformations
- Subclasses must implement:
transform: Forward transformation function
inverse_transform: Inverse transformation function
log_jacobian_correction: Log determinant of Jacobian matrix
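A minimal sketch of how these three hooks fit together is shown below. The classes `SketchTransformedDist` and `SketchLogTransform` are hypothetical stand-ins written for illustration and cover only `logpdf` and `rvs`; they are not the SciStanPy classes.

```python
from abc import ABC, abstractmethod

import numpy as np
from scipy import stats


class SketchTransformedDist(ABC):
    """Hypothetical stand-in for a transformed distribution y = transform(x)."""

    def __init__(self, base_dist):
        self.base_dist = base_dist

    @abstractmethod
    def transform(self, x):
        """Map base-distribution values to the transformed space."""

    @abstractmethod
    def inverse_transform(self, y):
        """Map transformed values back to the base-distribution space."""

    @abstractmethod
    def log_jacobian_correction(self, y):
        """Log of |d inverse_transform(y) / dy|."""

    def logpdf(self, y, *args, **kwargs):
        # Change of variables: base log density at the inverse image plus log Jacobian.
        base = self.base_dist.logpdf(self.inverse_transform(y), *args, **kwargs)
        return base + self.log_jacobian_correction(y)

    def rvs(self, *args, **kwargs):
        # Sample in the base space, then push the draws through the transform.
        return self.transform(self.base_dist.rvs(*args, **kwargs))


class SketchLogTransform(SketchTransformedDist):
    def transform(self, x):
        return np.log(x)

    def inverse_transform(self, y):
        return np.exp(y)

    def log_jacobian_correction(self, y):
        # d exp(y) / dy = exp(y), so the log Jacobian is y itself.
        return y


log_gamma = SketchLogTransform(stats.gamma)
y = np.linspace(-3.0, 2.0, 7)
sketch_values = log_gamma.logpdf(y, 2.5)
# SciPy's loggamma is the distribution of the log of a gamma variate.
reference = stats.loggamma.logpdf(y, 2.5)
```

Comparing against `scipy.stats.loggamma` gives an independent check that the Jacobian term has the right sign and form.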
- cdf(x, *args, **kwargs)
Compute cumulative distribution function.
- Parameters:
x – Values at which to evaluate CDF
args – Arguments for base distribution
kwargs – Keyword arguments for base distribution
- Returns:
Cumulative probability values
Uses inverse transformation to map to base distribution space.
- abstractmethod inverse_transform(x: ndarray[tuple[int, ...], dtype[floating]])
- isf(q, *args, **kwargs)
Compute inverse survival function.
- Parameters:
q – Probability values
args – Arguments for base distribution
kwargs – Keyword arguments for base distribution
- Returns:
Inverse survival function values
- abstractmethod log_jacobian_correction(x: ndarray[tuple[int, ...], dtype[floating]])
- logpdf(x, *args, **kwargs)
Compute log probability density function with Jacobian correction.
- Parameters:
x – Values at which to evaluate log-PDF
args – Arguments for base distribution
kwargs – Keyword arguments for base distribution
- Returns:
Log probability density values
More numerically stable than computing log of PDF directly.
- logsf(x, *args, **kwargs)
Compute log survival function.
- Parameters:
x – Values at which to evaluate log survival function
args – Arguments for base distribution
kwargs – Keyword arguments for base distribution
- Returns:
Log survival probability values
More numerically stable for small survival probabilities.
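The stability point can be seen with a plain SciPy distribution. The snippet uses `scipy.stats.norm` rather than a SciStanPy class, purely as an illustration of why a dedicated log survival function is preferable to taking the log of the survival function.

```python
import numpy as np
from scipy import stats

x = 50.0

# The survival probability is far below the smallest positive double,
# so sf underflows to 0.0 and its logarithm is -inf.
with np.errstate(divide="ignore"):
    naive = np.log(stats.norm.sf(x))

# logsf works in log space throughout and stays finite.
stable = stats.norm.logsf(x)
```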
- pdf(x, *args, **kwargs)
Compute probability density function with Jacobian correction.
- Parameters:
x – Values at which to evaluate PDF
args – Arguments for base distribution
kwargs – Keyword arguments for base distribution
- Returns:
Probability density values
Applies the change of variables formula with proper Jacobian correction.
- ppf(q, *args, **kwargs)
Compute percent point function (inverse CDF).
- Parameters:
q – Probability values
args – Arguments for base distribution
kwargs – Keyword arguments for base distribution
- Returns:
Quantile values
Uses forward transformation of base distribution quantiles.
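For a strictly increasing transform such as the logarithm, quantiles map directly through the transform. The sketch below illustrates this with `scipy.stats.lomax` and NumPy only; the shape value `c = 3.0` is an arbitrary choice for the example.

```python
import numpy as np
from scipy import stats

c = 3.0  # Lomax shape parameter (arbitrary value for illustration)
q = np.array([0.05, 0.25, 0.5, 0.75, 0.95])

# ppf: push base-distribution quantiles through the forward (log) transform.
y = np.log(stats.lomax.ppf(q, c))

# cdf: map y back with the inverse transform (exp) and use the base CDF.
q_roundtrip = stats.lomax.cdf(np.exp(y), c)
```

The round trip recovers the original probabilities, which is what makes the two methods consistent with each other.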
- rvs(*args, **kwargs)
Generate random samples from transformed distribution.
- Parameters:
args – Arguments for base distribution
kwargs – Keyword arguments for base distribution
- Returns:
Random samples from transformed distribution
Generates samples from base distribution and applies transformation.
- class scistanpy.model.components.custom_distributions.custom_scipy_dists.LogUnivariateScipyTransform(base_dist: rv_continuous)
Bases: TransformedScipyDist
Log transformation for univariate SciPy distributions.
This class implements the natural logarithm transformation for any univariate SciPy distribution, creating a log-transformed variant with proper Jacobian corrections.
- This transformation is commonly used to:
Convert positive-valued distributions to real-valued distributions
Enable log-scale modeling of multiplicative processes
Improve numerical stability for heavy-tailed distributions
Create log-normal variants of arbitrary positive distributions
- Example:
>>> # Create log-transformed exponential distribution
>>> exp_exponential = LogUnivariateScipyTransform(stats.expon)
>>> # With scale=1.0 this matches the left-skewed Gumbel distribution (scipy.stats.gumbel_l)
>>> samples = exp_exponential.rvs(scale=1.0, size=1000)
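The Gumbel relationship can be checked numerically without SciStanPy. The sketch below builds the log density of log(X) for a unit-scale exponential X by hand and compares it with `scipy.stats.gumbel_l`, the left-skewed Gumbel distribution.

```python
import numpy as np
from scipy import stats

y = np.linspace(-5.0, 2.0, 9)

# Log density of log(X) for X ~ Exponential(1):
# base log density evaluated at exp(y), plus the log Jacobian term y.
log_density = stats.expon.logpdf(np.exp(y)) + y

# Left-skewed Gumbel log density, which equals y - exp(y).
reference = stats.gumbel_l.logpdf(y)
```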
- inverse_transform(x: ndarray[tuple[int, ...], dtype[floating]])
Apply inverse logarithmic transformation to input values.
- Parameters:
x (npt.NDArray[np.floating]) – Input values from transformed distribution
- Returns:
Inverse log-transformed values
- Return type:
npt.NDArray[np.floating]
- log_jacobian_correction(x: ndarray[tuple[int, ...], dtype[floating]])
Compute log Jacobian correction for logarithmic transformation.
- Parameters:
x (npt.NDArray[np.floating]) – Values in transformed (log) space
- Returns:
Log Jacobian determinant (equal to x for log transform)
- Return type:
npt.NDArray[np.floating]
- transform(x: ndarray[tuple[int, ...], dtype[floating]])
Apply logarithmic transformation to input values.
- Parameters:
x (npt.NDArray[np.floating]) – Input values from base distribution
- Returns:
Log-transformed values
- Return type:
npt.NDArray[np.floating]
This method implements the natural logarithm transformation.
Distribution Instances
The above classes are used to create the following distribution instances that can be used directly in SciStanPy models:
- scistanpy.model.components.custom_distributions.custom_scipy_dists.dirichlet
Instance of CustomDirichlet. See that class for details.
- scistanpy.model.components.custom_distributions.custom_scipy_dists.expdirichlet
Instance of ExpDirichlet. See that class for details.
- scistanpy.model.components.custom_distributions.custom_scipy_dists.expexponential
scipy.stats.expon transformed to the log scale using LogUnivariateScipyTransform.
- scistanpy.model.components.custom_distributions.custom_scipy_dists.explomax
scipy.stats.lomax transformed to the log scale using LogUnivariateScipyTransform.
- scistanpy.model.components.custom_distributions.custom_scipy_dists.multinomial
Instance of CustomMultinomial. See that class for details.
- scistanpy.model.components.custom_distributions.custom_scipy_dists.multinomial_logit
Instance of MultinomialLogit. See that class for details.
- scistanpy.model.components.custom_distributions.custom_scipy_dists.multinomial_log_theta
Instance of MultinomialLogTheta. See that class for details.