Custom SciPy Distributions API Reference¶

Custom SciPy distribution implementations for SciStanPy models.

This module provides extended and custom SciPy distribution classes. Among other things, these implementations provide:

Enhanced Batch Support: Extended multivariate distributions with variable batch dimensions

Custom Transformations: Log-transformed distributions with proper Jacobian corrections

Alternative Parameterizations: Logit and log-probability parameterizations for multinomial distributions

Numerical Stability: Improved implementations for edge cases and extreme values

The following custom probability distributions are implemented using SciPy’s distribution framework:

dirichlet
expdirichlet
expexponential
explomax
multinomial
multinomial_logit
multinomial_log_theta

The distributions in this module are designed to work within SciPy’s distribution framework while providing enhanced functionality for advanced probabilistic modeling scenarios commonly encountered in SciStanPy applications. They will not normally be used directly by users; instead, they are used internally by SciStanPy model components such as Parameter for sampling from prior distributions.

Custom-Built SciPy Distributions¶

Some distributions are extensions of existing SciPy distributions while others are built from the ground up to provide specific features. First, the distribution built from the ground up:

class scistanpy.model.components.custom_distributions.custom_scipy_dists.CustomDirichlet(seed=None)[source]¶

Bases: dirichlet_gen

Enhanced Dirichlet distribution supporting variable batch dimensions.

This class extends SciPy’s standard Dirichlet distribution to support arbitrary batch dimensions while maintaining compatibility with the SciPy distribution interface. The standard SciPy implementation has limitations with batch operations that this class addresses.

Key Enhancements:

Support for arbitrary batch dimensions in alpha parameters
Proper broadcasting behavior across batch dimensions
Consistent output shapes for all distribution methods
Efficient vectorized operations over batch elements

The implementation uses a decorator pattern to extend existing SciPy methods with batch dimension handling while preserving the original mathematical properties of the Dirichlet distribution.

rvs( alpha: npt.NDArray[np.floating], size: tuple['custom_types.Integer', ...] | 'custom_types.Integer' | None = 1, random_state: 'custom_types.Integer' | np.random.Generator | None = None, ) → npt.NDArray[np.floating][source]¶

Generate random samples from the Dirichlet distribution.

Parameters:

alpha (npt.NDArray[np.floating]) – Concentration parameters with shape (…, k)
size (Union[tuple[custom_types.Integer, ...], custom_types.Integer, None]) – Output shape. Defaults to 1.
random_state (Union[custom_types.Integer, np.random.Generator, None]) – Random state for reproducible sampling. Defaults to None.

Returns:

Random samples from Dirichlet distribution

Return type:

npt.NDArray[np.floating]

Raises:

ValueError – If alpha cannot be broadcast to the specified size

This method supports arbitrary batch dimensions in the alpha parameter and properly broadcasts to the requested output size while maintaining the simplex constraint for each sample.

class scistanpy.model.components.custom_distributions.custom_scipy_dists.ExpDirichlet(seed=None)[source]¶

Bases: CustomDirichlet

Log-transformed Dirichlet distribution (Exponential-Dirichlet).

This class implements a distribution where the logarithm of a Dirichlet-distributed random vector follows this distribution. It’s useful for modeling log-scale compositional data and log-probability vectors with proper Jacobian corrections.

cov(alpha)[source]¶

This is not implemented.

Raises:: NotImplementedError –

entropy(alpha)[source]¶

This is not implemented.

Raises:: NotImplementedError –

logpdf(x, alpha)[source]¶

Compute log probability density with Jacobian correction.

Parameters:

x – Log-probability values
alpha – Concentration parameters

Returns:

Log probability density values

The implementation includes the proper Jacobian correction for the log-transformation, computed analytically for efficiency and numerical stability.

mean(alpha)[source]¶

This is not implemented.

Raises:: NotImplementedError –

pdf(x, alpha)[source]¶

Compute probability density function.

Parameters:

x – Log-probability values
alpha – Concentration parameters

Returns:

Probability density values

Computed as the exponential of the log probability density for numerical stability and consistency.

rvs( alpha: npt.NDArray[np.floating], size: tuple['custom_types.Integer', ...] | 'custom_types.Integer' | None = 1, random_state: 'custom_types.Integer' | np.random.Generator | None = None, ) → npt.NDArray[np.floating][source]¶

Generate random samples from the log-transformed Dirichlet distribution.

Parameters:

alpha (npt.NDArray[np.floating]) – Concentration parameters
size (Union[tuple[custom_types.Integer, ...], custom_types.Integer, None]) – Output shape. Defaults to 1.
random_state (Union[custom_types.Integer, np.random.Generator, None]) – Random state. Defaults to None.

Returns:

Log-probability samples

Return type:

npt.NDArray[np.floating]

Samples are generated by first sampling from the standard Dirichlet distribution and then applying the logarithmic transformation.

var(alpha)[source]¶

This is not implemented.

Raises:: NotImplementedError –

class scistanpy.model.components.custom_distributions.custom_scipy_dists.CustomMultinomial(seed=None)[source]¶

Bases: multinomial_gen

Enhanced multinomial distribution supporting variable batch dimensions.

This class extends SciPy’s standard multinomial distribution to support arbitrary batch dimensions in both the trial count (n) and probability parameters (p), enabling flexible batch operations for discrete multivariate modeling scenarios.

Key Enhancements:

Variable batch dimensions for n and p parameters
Proper broadcasting behavior between n and p
Support for different trial counts across batch elements
Consistent output shapes for sampling operations

Example:

>>> # Batch multinomial with different trial counts
>>> n = np.array([[10], [20], [15]])
>>> p = np.array([[0.3, 0.4, 0.3],
...               [0.2, 0.5, 0.3],
...               [0.4, 0.3, 0.3]])
>>> multinomial = CustomMultinomial()
>>> samples = multinomial.rvs(n=n, p=p, size=100) # shape = (100, 3, 3)

rvs( n: 'custom_types.Integer' | npt.NDArray[np.integer], p: npt.NDArray[np.floating], size: tuple['custom_types.Integer', ...] | 'custom_types.Integer' | None = 1, random_state: 'custom_types.Integer' | np.random.Generator | None = None, ) → npt.NDArray[np.integer][source]¶

Generate random samples from the multinomial distribution.

Parameters:

n (Union[custom_types.Integer, npt.NDArray[np.integer]]) – Number of trials (can be scalar or array)
p (npt.NDArray[np.floating]) – Event probabilities with shape (…, k)
size (Union[tuple[custom_types.Integer, ...], custom_types.Integer, None]) – Output shape. Defaults to 1.
random_state (Union[custom_types.Integer, np.random.Generator, None]) – Random state for reproducible sampling. Defaults to None.

Returns:

Random samples from multinomial distribution

Return type:

npt.NDArray[np.integer]

Raises:

ValueError – If n and p cannot be broadcast to compatible shapes

This method supports different trial counts for each batch element and handles broadcasting between scalar/array n and multi-dimensional p.

class scistanpy.model.components.custom_distributions.custom_scipy_dists.MultinomialLogit(seed=None)[source]¶

Bases: CustomMultinomial

Multinomial distribution with logit parameterization. This is identical to CustomMultinomial except that the probabilities need not be normalized and are specified as logits.

class scistanpy.model.components.custom_distributions.custom_scipy_dists.MultinomialLogTheta(seed=None)[source]¶

Bases: CustomMultinomial

Multinomial distribution with normalized log-probability parameterization. This is identical to CustomMultinomial except that the probabilities are specified as log-probabilities that must already be normalized (i.e., their exponentials sum to 1).

Transforms of Existing SciPy Distributions¶

Other distributions are implemented as transforms of existing SciPy distributions to provide additional flexibility. Transformations are applied to the base distribution using the following classes:

class scistanpy.model.components.custom_distributions.custom_scipy_dists.TransformedScipyDist(base_dist: rv_continuous)[source]¶

Bases: ABC

Abstract base class for transformed SciPy distributions.

This class provides a framework for creating distributions that are transformations of existing SciPy distributions. It handles the mathematical details of transformation including Jacobian corrections for probability density functions.

Parameters:: base_dist (stats.rv_continuous) – Base SciPy distribution to transform

Key Features:

Automatic Jacobian correction for probability densities
Proper transformation of all distribution methods
Maintains SciPy distribution interface compatibility
Support for arbitrary invertible transformations

Subclasses must implement:

transform: Forward transformation function
inverse_transform: Inverse transformation function
log_jacobian_correction: Log determinant of Jacobian matrix

cdf(x, *args, **kwargs)[source]¶

Compute cumulative distribution function.

Parameters:

x – Values at which to evaluate CDF
args – Arguments for base distribution
kwargs – Keyword arguments for base distribution

Returns:

Cumulative probability values

Uses inverse transformation to map to base distribution space.

abstractmethod inverse_transform( x: ndarray[tuple[int, ...], dtype[floating]], ) → ndarray[tuple[int, ...], dtype[floating]][source]¶

isf(q, *args, **kwargs)[source]¶

Compute inverse survival function.

Parameters:

q – Probability values
args – Arguments for base distribution
kwargs – Keyword arguments for base distribution

Returns:

Inverse survival function values

abstractmethod log_jacobian_correction( x: ndarray[tuple[int, ...], dtype[floating]], ) → ndarray[tuple[int, ...], dtype[floating]][source]¶

logpdf(x, *args, **kwargs)[source]¶

Compute log probability density function with Jacobian correction.

Parameters:

x – Values at which to evaluate log-PDF
args – Arguments for base distribution
kwargs – Keyword arguments for base distribution

Returns:

Log probability density values

More numerically stable than computing log of PDF directly.

logsf(x, *args, **kwargs)[source]¶

Compute log survival function.

Parameters:

x – Values at which to evaluate log survival function
args – Arguments for base distribution
kwargs – Keyword arguments for base distribution

Returns:

Log survival probability values

More numerically stable for small survival probabilities.

pdf(x, *args, **kwargs)[source]¶

Compute probability density function with Jacobian correction.

Parameters:

x – Values at which to evaluate PDF
args – Arguments for base distribution
kwargs – Keyword arguments for base distribution

Returns:

Probability density values

Applies the change of variables formula with proper Jacobian correction.

ppf(q, *args, **kwargs)[source]¶

Compute percent point function (inverse CDF).

Parameters:

q – Probability values
args – Arguments for base distribution
kwargs – Keyword arguments for base distribution

Returns:

Quantile values

Uses forward transformation of base distribution quantiles.

rvs(*args, **kwargs)[source]¶

Generate random samples from transformed distribution.

Parameters:

args – Arguments for base distribution
kwargs – Keyword arguments for base distribution

Returns:

Random samples from transformed distribution

Generates samples from base distribution and applies transformation.

sf(x, *args, **kwargs)[source]¶

Compute survival function (1 - CDF).

Parameters:

x – Values at which to evaluate survival function
args – Arguments for base distribution
kwargs – Keyword arguments for base distribution

Returns:

Survival probability values

abstractmethod transform( x: ndarray[tuple[int, ...], dtype[floating]], ) → ndarray[tuple[int, ...], dtype[floating]][source]¶

class scistanpy.model.components.custom_distributions.custom_scipy_dists.LogUnivariateScipyTransform(base_dist: rv_continuous)[source]¶

Bases: TransformedScipyDist

Log transformation for univariate SciPy distributions.

This class implements the natural logarithm transformation for any univariate SciPy distribution, creating a log-transformed variant with proper Jacobian corrections.

This transformation is commonly used to:

Convert positive-valued distributions to real-valued distributions
Enable log-scale modeling of multiplicative processes
Improve numerical stability for heavy-tailed distributions
Create log-normal variants of arbitrary positive distributions

Example:

>>> # Create log-transformed exponential distribution
>>> exp_exponential = LogUnivariateScipyTransform(stats.expon)
>>> # This is equivalent to a Gumbel distribution
>>> samples = exp_exponential.rvs(scale=1.0, size=1000)

inverse_transform( x: ndarray[tuple[int, ...], dtype[floating]], ) → ndarray[tuple[int, ...], dtype[floating]][source]¶

Apply inverse logarithmic transformation to input values.

Parameters:: x (npt.NDArray[np.floating]) – Input values from transformed distribution
Returns:: Inverse log-transformed values
Return type:: npt.NDArray[np.floating]

log_jacobian_correction( x: ndarray[tuple[int, ...], dtype[floating]], ) → ndarray[tuple[int, ...], dtype[floating]][source]¶

Compute log Jacobian correction for logarithmic transformation.

Parameters:: x (npt.NDArray[np.floating]) – Values in transformed (log) space
Returns:: Log Jacobian determinant (equal to x for log transform)
Return type:: npt.NDArray[np.floating]

transform( x: ndarray[tuple[int, ...], dtype[floating]], ) → ndarray[tuple[int, ...], dtype[floating]][source]¶

Apply logarithmic transformation to input values.

Parameters:: x (npt.NDArray[np.floating]) – Input values from base distribution
Returns:: Log-transformed values
Return type:: npt.NDArray[np.floating]

This method implements the natural logarithm transformation.

Distribution Instances¶

The above classes are used to create the following distribution instances that can be used directly in SciStanPy models:

scistanpy.model.components.custom_distributions.custom_scipy_dists.dirichlet¶: Instance of CustomDirichlet. See that class for details.

scistanpy.model.components.custom_distributions.custom_scipy_dists.expdirichlet¶: Instance of ExpDirichlet. See that class for details.

scistanpy.model.components.custom_distributions.custom_scipy_dists.expexponential¶: scipy.stats.expon transformed to the log scale using LogUnivariateScipyTransform.

scistanpy.model.components.custom_distributions.custom_scipy_dists.explomax¶: scipy.stats.lomax transformed to the log scale using LogUnivariateScipyTransform.

scistanpy.model.components.custom_distributions.custom_scipy_dists.multinomial¶: Instance of CustomMultinomial. See that class for details.

scistanpy.model.components.custom_distributions.custom_scipy_dists.multinomial_logit¶: Instance of MultinomialLogit. See that class for details.

scistanpy.model.components.custom_distributions.custom_scipy_dists.multinomial_log_theta¶: Instance of MultinomialLogTheta. See that class for details.

Custom SciPy Distributions API Reference¶

Custom-Built SciPy Distributions¶

Transforms of Existing SciPy Distributions¶

Distribution Instances¶

SciStanPy

Navigation

Related Topics