Partial Correlation CIT

class ParCorr

Bases: ITestCI, IProvideAnalyticQuantilesForCIT, IProvideVarianceForCIT

Implementation of the partial correlation independence test, together with the interfaces required for use with mCIT. Tests on many blocks can be run at once efficiently.

__init__(alpha: float = 0.05, lower_bound_clip_value: float = 0.3, force_regression_global: bool = False, analytic_approximation_for_cutoff: Literal['by effective count', 'by large N expansion'] = 'by effective count')

Constructor for partial correlation CIT.

Parameters:
  • alpha (float, optional) – target for FPR-control, defaults to 0.05

  • lower_bound_clip_value (float, optional) – to avoid numeric instability for strong dependencies, the implementation of IProvideAnalyticQuantilesForCIT.cit_quantile_estimate clips bounds to a predefined range (this is consistent and does not cost substantial power); defaults to 0.3

  • force_regression_global (bool, optional) – if enabled, regressions are computed globally instead of per block. By default (disabled), regressions are computed locally per block; while in principle slightly less sample-efficient on IID data, this is more robust against non-stationarities. Defaults to False

  • analytic_approximation_for_cutoff (Literal["by effective count", "by large N expansion"]) – analytic approximation used to implement IProvideAnalyticQuantilesForCIT, defaults to "by effective count"

run_single(data: CIT_DataPatterned) → Result

Implements functionality of interface ITestCI.

Run a single CIT on all data.

Parameters:

data (CIT_DataPatterned) – The data-set to use.

Returns:

Structured test-output.

Return type:

ITestCI.Result

run_many(data: BlockView) → Result

Implements functionality of interface ITestCI.

Run CITs on many blocks (efficiently).

Parameters:

data (BlockView) – The data-set to use.

Returns:

Structured test-output.

Return type:

ITestCI.Result

cit_quantile_estimate(data: BlockView, cit_result: Result, beta: float, cit_obj: ITestCI) → float

Implements functionality of interface IProvideAnalyticQuantilesForCIT.

Provide an estimate of the \(\beta\)-quantile for the test implemented by cit_obj.

Parameters:
  • data (BlockView) – The data-blocks to operate on.

  • cit_result (ITestCI.Result) – The CIT result for the data (currently, the test has always been run beforehand anyway).

  • beta (float) – The quantile \(\beta\) to estimate.

  • cit_obj (ITestCI) – The underlying cit-instance for which the quantile should be computed. (The present interface IProvideAnalyticQuantilesForCIT can, but does not have to, be exposed on the CIT-type itself.)

Returns:

Estimate of the dependence-value at the quantile \(\beta\)

Return type:

float

get_variance_estimate(N: int, dim_Z: int, cit_obj: ITestCI) → float

Implements functionality of interface IProvideVarianceForCIT.

Get an estimate of the block-wise variance of the dependence score implemented by cit_obj.

Parameters:
  • N (int) – The sample count N.

  • dim_Z (int) – The dimension of (number of variables in) the conditioning set Z.

  • cit_obj (ITestCI) – The underlying cit-instance for which the variance should be computed. (The present interface IProvideVarianceForCIT can, but does not have to, be exposed on the CIT-type itself.)

Returns:

The estimated value of the variance.

Return type:

float

static effective_sample_count(n: int, z_dim: int) → int

Compute effective sample-size.

Parameters:
  • n (int) – actual sample-size

  • z_dim (int) – size of conditioning set

Returns:

effective sample-size

Return type:

int

classmethod analytic_score_var(n: int, z_dim: int) → float

Analytic approximation for score variance.

Parameters:
  • n (int) – sample size

  • z_dim (int) – size of conditioning set

Returns:

score variance

Return type:

float

classmethod analytic_score_std(n: int, z_dim: int) → float

Analytic approximation for score standard deviation.

Parameters:
  • n (int) – sample size

  • z_dim (int) – size of conditioning set

Returns:

score standard deviation

Return type:

float
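The analytic approximations above can be illustrated with the textbook large-sample result for a z-transformed partial correlation, whose variance is approximately 1 / (n - z_dim - 3). The following is a minimal sketch under that assumption; the function names are hypothetical, and whether ParCorr uses exactly this formula (in particular for the "by large N expansion" option) is not stated in this reference.

```python
import math


def fisher_z_variance(n: int, z_dim: int) -> float:
    """Textbook approximation (an assumption, not the library's confirmed
    formula): variance of the z-transformed partial correlation."""
    dof = n - z_dim - 3
    if dof <= 0:
        raise ValueError("need n > z_dim + 3 for the approximation to be defined")
    return 1.0 / dof


def fisher_z_std(n: int, z_dim: int) -> float:
    """Standard deviation corresponding to fisher_z_variance."""
    return math.sqrt(fisher_z_variance(n, z_dim))
```

Under this approximation, each additional conditioning variable costs one sample, so a larger conditioning set widens the null distribution of the score.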

score_many(data: BlockView) → ndarray

Compute score (z-transformed partial correlation) on blocks.

Parameters:

data (BlockView) – data blocks

Returns:

score per block

Return type:

np.ndarray

score_single(data: CIT_DataPatterned) → float

Compute score (z-transformed partial correlation).

Parameters:

data (CIT_DataPatterned) – data

Returns:

score

Return type:

float
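To make the notion of a z-transformed partial correlation concrete, here is a minimal NumPy sketch: regress X and Y on Z by least squares, correlate the residuals, and apply the Fisher z-transform (arctanh). The function names and the plain-array inputs are assumptions for illustration; the library's CIT_DataPatterned and BlockView types, and its actual implementation, are not reproduced here.

```python
import numpy as np


def partial_corr_z(x, y, z=None) -> float:
    """Fisher z-transform of the partial correlation of x and y given z.

    x, y: arrays of shape (n,); z: None, or an array of shape (n,) or (n, k)
    with k >= 1. Hypothetical helper, not part of the documented API.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    # Design matrix: intercept column plus the conditioning variables, if any.
    cols = [np.ones((n, 1))]
    if z is not None:
        cols.append(np.asarray(z, dtype=float).reshape(n, -1))
    design = np.hstack(cols)
    # Residuals of x and y after least-squares regression on the design matrix.
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    r = float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))
    # Keep r strictly inside (-1, 1) so that arctanh stays finite.
    r = min(max(r, -1.0 + 1e-12), 1.0 - 1e-12)
    return float(np.arctanh(r))


def partial_corr_z_blocks(x, y, z, block_size: int) -> np.ndarray:
    """Score each consecutive block separately (regressions are local per block)."""
    count = len(x) // block_size
    scores = []
    for i in range(count):
        s = slice(i * block_size, (i + 1) * block_size)
        scores.append(partial_corr_z(x[s], y[s], None if z is None else z[s]))
    return np.array(scores)
```

The per-block variant is a plain loop for readability; it mirrors the locally computed regressions described for the default force_regression_global=False, but makes no claim about how the library vectorizes the computation.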

pvalue(score: float | ndarray, N: int, dim_Z: int) → float | ndarray

Compute p-value for a given score and setup (possibly per block).

Parameters:
  • score (float | ndarray) – z-value(s)

  • N (int) – sample-size

  • dim_Z (int) – size of conditioning set

Returns:

p-value(s)

Return type:

float | ndarray
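As an illustration of how a z-value can be turned into a p-value, the sketch below assumes a two-sided test against a zero-mean normal null distribution with variance 1 / (N - dim_Z - 3). Both the two-sidedness and the variance formula are assumptions taken from the standard Fisher z-test, not confirmed details of this method, and the function name is hypothetical.

```python
import math


def two_sided_pvalue(score: float, n: int, dim_z: int) -> float:
    """Two-sided p-value of a z-transformed partial correlation (sketch)."""
    # Assumed null standard deviation of the score (textbook approximation).
    std = 1.0 / math.sqrt(n - dim_z - 3)
    # 2 * (1 - Phi(|score| / std)), written via the complementary error function.
    return math.erfc(abs(score) / (std * math.sqrt(2.0)))
```

A scalar version is shown for brevity; for array-valued scores the same expression could be evaluated elementwise.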

pvalue_of_mean(score_mean: float, block_size: int, block_count: int, dim_Z: int) → float

Compute p-value for a given score-mean over blocks and setup.

Parameters:
  • score_mean (float) – mean z-value

  • block_size (int) – block-size

  • block_count (int) – block-count

  • dim_Z (int) – size of conditioning set

Returns:

p-value

Return type:

float
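A p-value for the mean score can be sketched in the same spirit. If the block_count per-block scores are assumed independent, each with variance 1 / (block_size - dim_Z - 3), then their mean has that variance divided by block_count. Independence of blocks, the variance formula, and the two-sided test are all assumptions of this sketch, and the function name is hypothetical.

```python
import math


def pvalue_of_block_mean(score_mean: float, block_size: int,
                         block_count: int, dim_z: int) -> float:
    """Two-sided p-value for the mean of per-block z-scores (sketch)."""
    # Assumed per-block variance of the score (textbook approximation).
    block_var = 1.0 / (block_size - dim_z - 3)
    # The mean of block_count independent scores has variance block_var / block_count.
    std_of_mean = math.sqrt(block_var / block_count)
    return math.erfc(abs(score_mean) / (std_of_mean * math.sqrt(2.0)))
```

With this scaling, the same mean score becomes more significant as more blocks contribute to it.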

is_pvalue_dependent(pvalue: float) → bool

Decide whether a given p-value should be considered evidence of dependence.

Parameters:

pvalue (float) – p-value

Returns:

Whether the test is considered dependent.

Return type:

bool