Homogeneity Tests

class Homogeneity_Binomial

Bases: ITestHomogeneity

Homogeneity test based on a binomial approach using a quantile estimator. Implements the ITestHomogeneity interface for use with mCIT.

get_actual_error_control(N: int, dim_Z: int = 0) float

Gets the actual error-control level after accounting for counting statistics. Depending on the internals of the hyper-parameter set in use, this may differ from the originally specified \(\alpha\).

Parameters:
  • N (int) – Sample size N.

  • dim_Z (int, optional) – Size of the conditioning-set Z, defaults to 0

Returns:

Effective error-control target \(\alpha\).

Return type:

float
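
One way an effective level can drift from the nominal \(\alpha\): when a decision is made by counting (for example, how many of n blocks exceed some bound), the rejection threshold has to be an integer, so the attainable level generally falls below the requested one. The following is a minimal sketch of that effect using a plain one-sided binomial tail. The helper names and the choice of decision rule are assumptions made for illustration; they are not taken from this class's internals.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def effective_alpha(n: int, p: float, alpha: float) -> tuple[int, float]:
    """Smallest integer threshold k with P(X >= k) <= alpha, plus that tail probability.

    Hypothetical helper: shows why a count-based rule rarely hits alpha exactly.
    """
    for k in range(n + 2):
        tail = binomial_upper_tail(n, k, p)
        if tail <= alpha:
            return k, tail
    # For k = n + 1 the tail is an empty sum (0.0), so the loop always returns.
    raise AssertionError("unreachable for alpha >= 0")


# With 10 counts, p = 0.5 and a nominal alpha of 0.05, the threshold is 9
# and the attained level is 11/1024, well below the nominal value.
threshold, attained = effective_alpha(10, 0.5, 0.05)
```

In this toy setting the gap between nominal and attained level shrinks as n grows, which is one plausible reason for the method to take the sample size N as input.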

__init__(hyperparams: IProvideHyperparamsForBinomial, cit: ITestCI, cit_analytic_quantile_estimate: IProvideAnalyticQuantilesForCIT | None = None, bootstrap_block_count: int = 5000, next_bootstrap_seed: Callable[[], None | int | ~numpy.random.bit_generator.SeedSequence]=<function Homogeneity_Binomial.<lambda>>)

Construct from a hyper-parameter set and either a CIT-specific quantile estimate or a bootstrap block count for generic quantile estimation.

Parameters:
  • hyperparams (IProvideHyperparamsForBinomial) – Hyper-parameter set to use.

  • cit_analytic_quantile_estimate (IProvideAnalyticQuantilesForCIT | None, optional) – CIT-specific quantile estimate (if available), defaults to None

  • bootstrap_block_count (int, optional) – Block count for the bootstrap quantile estimate (used if no CIT-specific quantile estimate was provided), defaults to 5000
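
The signature also accepts next_bootstrap_seed, a zero-argument callable returning None, an int, or a NumPy SeedSequence. As a minimal sketch of what such a callable could look like for reproducible runs, the hypothetical factory below hands out consecutive integer seeds. The name make_seed_provider and the consecutive-integer scheme are illustrative assumptions, not part of the documented API.

```python
from itertools import count
from typing import Callable, Optional


def make_seed_provider(base_seed: int) -> Callable[[], Optional[int]]:
    """Return a zero-argument callable yielding base_seed, base_seed + 1, ...

    Hypothetical helper: each call produces a fresh, deterministic seed.
    """
    counter = count(base_seed)
    return lambda: next(counter)


provider = make_seed_provider(1234)
first_seed = provider()   # 1234
second_seed = provider()  # 1235
```

A callable of this shape matches the annotated type of next_bootstrap_seed. Whether consecutive integers are a suitable seeding scheme for a given experiment is a separate judgment call.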

get_quantile(data: BlockView, cit_result: Result, beta: float) float

Obtain a quantile for the given dataset.

Parameters:
  • data (BlockView) – Data-blocks associated to current test.

  • cit_result (ITestCI.Result) – The CIT result for the data (currently, the CIT has always been run beforehand).

  • beta (float) – Target probability for which to obtain a quantile (lower bound).

Returns:

The estimated quantile lower bound.

Return type:

float
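
When no analytic estimate is available, a quantile can be approximated by resampling. The sketch below shows a generic block bootstrap under stated assumptions: blocks are plain sequences of floats, the statistic is an arbitrary user-supplied function, and the result is a plain empirical quantile rather than a guaranteed lower bound. The function and parameter names are hypothetical and do not reflect how this class estimates quantiles internally.

```python
import random
from typing import Callable, Optional, Sequence


def bootstrap_quantile(
    blocks: Sequence[Sequence[float]],
    statistic: Callable[[Sequence[float]], float],
    beta: float,
    resample_count: int = 1000,
    seed: Optional[int] = None,
) -> float:
    """Empirical beta-quantile of a statistic under block resampling."""
    rng = random.Random(seed)
    values = []
    for _ in range(resample_count):
        # Draw whole blocks with replacement so within-block structure is kept.
        drawn = rng.choices(blocks, k=len(blocks))
        pooled = [x for block in drawn for x in block]
        values.append(statistic(pooled))
    values.sort()
    # Index of the empirical quantile, clipped to the last element.
    index = min(int(beta * resample_count), resample_count - 1)
    return values[index]


def mean(xs: Sequence[float]) -> float:
    return sum(xs) / len(xs)


toy_blocks = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
median_of_means = bootstrap_quantile(toy_blocks, mean, beta=0.5, resample_count=200, seed=0)
```

Resampling whole blocks rather than individual samples is the usual choice when samples within a block may be dependent.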

is_homogeneous(data: CIT_DataPatterned) bool

Implements functionality of interface ITestHomogeneity.

Test whether the data supplied by the query is homogeneous.

Parameters:

data (CIT_DataPatterned) – The data to inspect.

Returns:

The truth value indicating whether the dataset was accepted as homogeneous.

Return type:

bool