Custom SciPy Distributions API Reference

Custom SciPy distribution implementations for SciStanPy models.

This module provides extended and custom SciPy distribution classes. Among other things, these implementations provide:

  • Enhanced Batch Support: Extended multivariate distributions with variable batch dimensions

  • Custom Transformations: Log-transformed distributions with proper Jacobian corrections

  • Alternative Parameterizations: Logit and log-probability parameterizations for multinomial distributions

  • Numerical Stability: Improved implementations for edge cases and extreme values

The following custom probability distributions are implemented using SciPy’s distribution framework:

The distributions in this module are designed to work within SciPy’s distribution framework while providing enhanced functionality for advanced probabilistic modeling scenarios commonly encountered in SciStanPy applications. They will not normally be used directly by users; instead, they are used internally by SciStanPy model components such as Parameter for sampling from prior distributions.

Custom-Built SciPy Distributions

Some distributions are extensions of existing SciPy distributions while others are built from the ground up to provide specific features. First, the distribution built from the ground up:

class scistanpy.model.components.custom_distributions.custom_scipy_dists.CustomDirichlet(seed=None)[source]

Bases: dirichlet_gen

Enhanced Dirichlet distribution supporting variable batch dimensions.

This class extends SciPy’s standard Dirichlet distribution to support arbitrary batch dimensions while maintaining compatibility with the SciPy distribution interface. The standard SciPy implementation has limitations with batch operations that this class addresses.

Key Enhancements:
  • Support for arbitrary batch dimensions in alpha parameters

  • Proper broadcasting behavior across batch dimensions

  • Consistent output shapes for all distribution methods

  • Efficient vectorized operations over batch elements

The implementation uses a decorator pattern to extend existing SciPy methods with batch dimension handling while preserving the original mathematical properties of the Dirichlet distribution.

rvs(
alpha: npt.NDArray[np.floating],
size: tuple['custom_types.Integer', ...] | 'custom_types.Integer' | None = 1,
random_state: 'custom_types.Integer' | np.random.Generator | None = None,
) npt.NDArray[np.floating][source]

Generate random samples from the Dirichlet distribution.

Parameters:
  • alpha (npt.NDArray[np.floating]) – Concentration parameters with shape (…, k)

  • size (Union[tuple[custom_types.Integer, ...], custom_types.Integer, None]) – Output shape. Defaults to 1.

  • random_state (Union[custom_types.Integer, np.random.Generator, None]) – Random state for reproducible sampling. Defaults to None.

Returns:

Random samples from Dirichlet distribution

Return type:

npt.NDArray[np.floating]

Raises:

ValueError – If alpha cannot be broadcast to the specified size

This method supports arbitrary batch dimensions in the alpha parameter and properly broadcasts to the requested output size while maintaining the simplex constraint for each sample.

class scistanpy.model.components.custom_distributions.custom_scipy_dists.ExpDirichlet(seed=None)[source]

Bases: CustomDirichlet

Log-transformed Dirichlet distribution (Exponential-Dirichlet).

This class implements a distribution where the logarithm of a Dirichlet-distributed random vector follows this distribution. It’s useful for modeling log-scale compositional data and log-probability vectors with proper Jacobian corrections.

cov(alpha)[source]

This is not implemented.

Raises:

NotImplementedError

entropy(alpha)[source]

This is not implemented.

Raises:

NotImplementedError

logpdf(x, alpha)[source]

Compute log probability density with Jacobian correction.

Parameters:
  • x – Log-probability values

  • alpha – Concentration parameters

Returns:

Log probability density values

The implementation includes the proper Jacobian correction for the log-transformation, computed analytically for efficiency and numerical stability.

mean(alpha)[source]

This is not implemented.

Raises:

NotImplementedError

pdf(x, alpha)[source]

Compute probability density function.

Parameters:
  • x – Log-probability values

  • alpha – Concentration parameters

Returns:

Probability density values

Computed as the exponential of the log probability density for numerical stability and consistency.

rvs(
alpha: npt.NDArray[np.floating],
size: tuple['custom_types.Integer', ...] | 'custom_types.Integer' | None = 1,
random_state: 'custom_types.Integer' | np.random.Generator | None = None,
) npt.NDArray[np.floating][source]

Generate random samples from the log-transformed Dirichlet distribution.

Parameters:
  • alpha (npt.NDArray[np.floating]) – Concentration parameters

  • size (Union[tuple[custom_types.Integer, ...], custom_types.Integer, None]) – Output shape. Defaults to 1.

  • random_state (Union[custom_types.Integer, np.random.Generator, None]) – Random state. Defaults to None.

Returns:

Log-probability samples

Return type:

npt.NDArray[np.floating]

Samples are generated by first sampling from the standard Dirichlet distribution and then applying the logarithmic transformation.

var(alpha)[source]

This is not implemented.

Raises:

NotImplementedError

class scistanpy.model.components.custom_distributions.custom_scipy_dists.CustomMultinomial(seed=None)[source]

Bases: multinomial_gen

Enhanced multinomial distribution supporting variable batch dimensions.

This class extends SciPy’s standard multinomial distribution to support arbitrary batch dimensions in both the trial count (n) and probability parameters (p), enabling flexible batch operations for discrete multivariate modeling scenarios.

Key Enhancements:
  • Variable batch dimensions for n and p parameters

  • Proper broadcasting behavior between n and p

  • Support for different trial counts across batch elements

  • Consistent output shapes for sampling operations

Example:
>>> # Batch multinomial with different trial counts
>>> n = np.array([[10], [20], [15]])
>>> p = np.array([[0.3, 0.4, 0.3],
...               [0.2, 0.5, 0.3],
...               [0.4, 0.3, 0.3]])
>>> multinomial = CustomMultinomial()
>>> samples = multinomial.rvs(n=n, p=p, size=100) # shape = (100, 3, 3)
rvs(
n: 'custom_types.Integer' | npt.NDArray[np.integer],
p: npt.NDArray[np.floating],
size: tuple['custom_types.Integer', ...] | 'custom_types.Integer' | None = 1,
random_state: 'custom_types.Integer' | np.random.Generator | None = None,
) npt.NDArray[np.integer][source]

Generate random samples from the multinomial distribution.

Parameters:
  • n (Union[custom_types.Integer, npt.NDArray[np.integer]]) – Number of trials (can be scalar or array)

  • p (npt.NDArray[np.floating]) – Event probabilities with shape (…, k)

  • size (Union[tuple[custom_types.Integer, ...], custom_types.Integer, None]) – Output shape. Defaults to 1.

  • random_state (Union[custom_types.Integer, np.random.Generator, None]) – Random state for reproducible sampling. Defaults to None.

Returns:

Random samples from multinomial distribution

Return type:

npt.NDArray[np.integer]

Raises:

ValueError – If n and p cannot be broadcast to compatible shapes

This method supports different trial counts for each batch element and handles broadcasting between scalar/array n and multi-dimensional p.

class scistanpy.model.components.custom_distributions.custom_scipy_dists.MultinomialLogit(seed=None)[source]

Bases: CustomMultinomial

Multinomial distribution with logit parameterization. This is identical to CustomMultinomial except that the probabilities need not be normalized and are specified as logits.

class scistanpy.model.components.custom_distributions.custom_scipy_dists.MultinomialLogTheta(seed=None)[source]

Bases: CustomMultinomial

Multinomial distribution with normalized log-probability parameterization. This is identical to CustomMultinomial except that the probabilities are specified as log-probabilities that must already be normalized (i.e., their exponentials sum to 1).

Transforms of Existing SciPy Distributions

Other distributions are implemented as transforms of existing SciPy distributions to provide additional flexibility. Transformations are applied to the base distribution using the following classes:

class scistanpy.model.components.custom_distributions.custom_scipy_dists.TransformedScipyDist(base_dist: rv_continuous)[source]

Bases: ABC

Abstract base class for transformed SciPy distributions.

This class provides a framework for creating distributions that are transformations of existing SciPy distributions. It handles the mathematical details of transformation including Jacobian corrections for probability density functions.

Parameters:

base_dist (stats.rv_continuous) – Base SciPy distribution to transform

Key Features:
  • Automatic Jacobian correction for probability densities

  • Proper transformation of all distribution methods

  • Maintains SciPy distribution interface compatibility

  • Support for arbitrary invertible transformations

Subclasses must implement:
  • transform: Forward transformation function

  • inverse_transform: Inverse transformation function

  • log_jacobian_correction: Log determinant of Jacobian matrix

cdf(x, *args, **kwargs)[source]

Compute cumulative distribution function.

Parameters:
  • x – Values at which to evaluate CDF

  • args – Arguments for base distribution

  • kwargs – Keyword arguments for base distribution

Returns:

Cumulative probability values

Uses inverse transformation to map to base distribution space.

abstractmethod inverse_transform(
x: ndarray[tuple[int, ...], dtype[floating]],
) ndarray[tuple[int, ...], dtype[floating]][source]
isf(q, *args, **kwargs)[source]

Compute inverse survival function.

Parameters:
  • q – Probability values

  • args – Arguments for base distribution

  • kwargs – Keyword arguments for base distribution

Returns:

Inverse survival function values

abstractmethod log_jacobian_correction(
x: ndarray[tuple[int, ...], dtype[floating]],
) ndarray[tuple[int, ...], dtype[floating]][source]
logpdf(x, *args, **kwargs)[source]

Compute log probability density function with Jacobian correction.

Parameters:
  • x – Values at which to evaluate log-PDF

  • args – Arguments for base distribution

  • kwargs – Keyword arguments for base distribution

Returns:

Log probability density values

More numerically stable than computing log of PDF directly.

logsf(x, *args, **kwargs)[source]

Compute log survival function.

Parameters:
  • x – Values at which to evaluate log survival function

  • args – Arguments for base distribution

  • kwargs – Keyword arguments for base distribution

Returns:

Log survival probability values

More numerically stable for small survival probabilities.

pdf(x, *args, **kwargs)[source]

Compute probability density function with Jacobian correction.

Parameters:
  • x – Values at which to evaluate PDF

  • args – Arguments for base distribution

  • kwargs – Keyword arguments for base distribution

Returns:

Probability density values

Applies the change of variables formula with proper Jacobian correction.

ppf(q, *args, **kwargs)[source]

Compute percent point function (inverse CDF).

Parameters:
  • q – Probability values

  • args – Arguments for base distribution

  • kwargs – Keyword arguments for base distribution

Returns:

Quantile values

Uses forward transformation of base distribution quantiles.

rvs(*args, **kwargs)[source]

Generate random samples from transformed distribution.

Parameters:
  • args – Arguments for base distribution

  • kwargs – Keyword arguments for base distribution

Returns:

Random samples from transformed distribution

Generates samples from base distribution and applies transformation.

sf(x, *args, **kwargs)[source]

Compute survival function (1 - CDF).

Parameters:
  • x – Values at which to evaluate survival function

  • args – Arguments for base distribution

  • kwargs – Keyword arguments for base distribution

Returns:

Survival probability values

abstractmethod transform(
x: ndarray[tuple[int, ...], dtype[floating]],
) ndarray[tuple[int, ...], dtype[floating]][source]
class scistanpy.model.components.custom_distributions.custom_scipy_dists.LogUnivariateScipyTransform(base_dist: rv_continuous)[source]

Bases: TransformedScipyDist

Log transformation for univariate SciPy distributions.

This class implements the natural logarithm transformation for any univariate SciPy distribution, creating a log-transformed variant with proper Jacobian corrections.

This transformation is commonly used to:
  • Convert positive-valued distributions to real-valued distributions

  • Enable log-scale modeling of multiplicative processes

  • Improve numerical stability for heavy-tailed distributions

  • Create log-normal variants of arbitrary positive distributions

Example:
>>> # Create log-transformed exponential distribution
>>> exp_exponential = LogUnivariateScipyTransform(stats.expon)
>>> # This is equivalent to a Gumbel distribution
>>> samples = exp_exponential.rvs(scale=1.0, size=1000)
inverse_transform(
x: ndarray[tuple[int, ...], dtype[floating]],
) ndarray[tuple[int, ...], dtype[floating]][source]

Apply inverse logarithmic transformation to input values.

Parameters:

x (npt.NDArray[np.floating]) – Input values from transformed distribution

Returns:

Inverse log-transformed values

Return type:

npt.NDArray[np.floating]

log_jacobian_correction(
x: ndarray[tuple[int, ...], dtype[floating]],
) ndarray[tuple[int, ...], dtype[floating]][source]

Compute log Jacobian correction for logarithmic transformation.

Parameters:

x (npt.NDArray[np.floating]) – Values in transformed (log) space

Returns:

Log Jacobian determinant (equal to x for log transform)

Return type:

npt.NDArray[np.floating]

transform(
x: ndarray[tuple[int, ...], dtype[floating]],
) ndarray[tuple[int, ...], dtype[floating]][source]

Apply logarithmic transformation to input values.

Parameters:

x (npt.NDArray[np.floating]) – Input values from base distribution

Returns:

Log-transformed values

Return type:

npt.NDArray[np.floating]

This method implements the natural logarithm transformation.

Distribution Instances

The above classes are used to create the following distribution instances that can be used directly in SciStanPy models:

scistanpy.model.components.custom_distributions.custom_scipy_dists.dirichlet

Instance of CustomDirichlet. See that class for details.

scistanpy.model.components.custom_distributions.custom_scipy_dists.expdirichlet

Instance of ExpDirichlet. See that class for details.

scistanpy.model.components.custom_distributions.custom_scipy_dists.expexponential

scipy.stats.expon transformed to the log scale using LogUnivariateScipyTransform.

scistanpy.model.components.custom_distributions.custom_scipy_dists.explomax

scipy.stats.lomax transformed to the log scale using LogUnivariateScipyTransform.

scistanpy.model.components.custom_distributions.custom_scipy_dists.multinomial

Instance of CustomMultinomial. See that class for details.

scistanpy.model.components.custom_distributions.custom_scipy_dists.multinomial_logit

Instance of MultinomialLogit. See that class for details.

scistanpy.model.components.custom_distributions.custom_scipy_dists.multinomial_log_theta

Instance of MultinomialLogTheta. See that class for details.