Partial Correlation CIT

class ParCorr

Bases: ITestCI, IProvideAnalyticQuantilesForCIT, IProvideVarianceForCIT

Implementation of the partial-correlation independence test and of the interfaces needed for use with mCIT. Can efficiently run on many blocks at once.

__init__(alpha: float = 0.05, lower_bound_clip_value: float = 0.3, force_regression_global: bool = False, analytic_approximation_for_cutoff: Literal['by effective count', 'by large N expansion'] = 'by effective count')

Constructor for partial correlation CIT.

Parameters:

alpha (float, optional) – target level for false-positive-rate (FPR) control, defaults to 0.05
lower_bound_clip_value (float, optional) – to avoid numeric instability for strong dependencies, the implementation provided for IProvideAnalyticQuantilesForCIT.cit_quantile_estimate clips bounds to a predefined range. This clipping is consistent and does not cost substantial power. Defaults to 0.3
force_regression_global (bool, optional) – if False (the default), regressions are computed locally per block. On IID data this is in principle slightly less sample-efficient than a global regression, but it is more robust against non-stationarities. Defaults to False
analytic_approximation_for_cutoff (Literal["by effective count", "by large N expansion"]) – analytic approximation used to implement IProvideAnalyticQuantilesForCIT, defaults to "by effective count"
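
The trade-off behind force_regression_global can be illustrated with a minimal NumPy sketch. It is not the library's code: residualize and pooled_residual_corr are hypothetical helpers, and pooling residuals over blocks is a simplification made only for this illustration. In the synthetic data the effect of Z on X and Y flips sign halfway through, so X and Y are conditionally independent given Z within each block, but one global regression cannot remove the influence of Z.

```python
import numpy as np

def residualize(target, z):
    """Residuals of `target` after least-squares regression on z (with intercept)."""
    design = np.column_stack([np.ones(len(z)), z])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef

def pooled_residual_corr(x, y, z, block_size=None):
    """Correlation of residuals; regress per block, or globally if block_size is None."""
    if block_size is None:
        block_size = len(x)  # a single block spanning all data = global regression
    starts = range(0, len(x) - block_size + 1, block_size)
    rx = np.concatenate([residualize(x[s:s + block_size], z[s:s + block_size]) for s in starts])
    ry = np.concatenate([residualize(y[s:s + block_size], z[s:s + block_size]) for s in starts])
    return float(np.corrcoef(rx, ry)[0, 1])

# Hypothetical non-stationary data: the coefficient of z flips sign in the second half.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=2 * n)
sign = np.repeat([1.0, -1.0], n)
x = sign * z + rng.normal(size=2 * n)
y = sign * z + rng.normal(size=2 * n)

global_corr = pooled_residual_corr(x, y, z)               # spurious dependence remains
local_corr = pooled_residual_corr(x, y, z, block_size=n)  # close to zero
```

Under these assumptions the global residual correlation stays clearly above zero, while the per-block version is close to zero.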

run_single(data: CIT_DataPatterned) → Result

Implements functionality of interface ITestCI.

Run a single CIT on all data.

Parameters: data (CIT_DataPatterned) – The data-set to use.
Returns: Structured test-output.
Return type: ITestCI.Result

run_many(data: BlockView) → Result

Implements functionality of interface ITestCI.

Run CITs on many blocks (efficiently).

Parameters: data (BlockView) – The data-set to use.
Returns: Structured test-output.
Return type: ITestCI.Result

cit_quantile_estimate(data: BlockView, cit_result: Result, beta: float, cit_obj: ITestCI) → float

Implements functionality of interface IProvideAnalyticQuantilesForCIT.

Provide an estimate of the \(\beta\)-quantile for the test implemented by cit_obj.

Parameters:

data (BlockView) – The data-blocks to operate on.
cit_result (ITestCI.Result) – The CIT result for the data (in the current implementation, the CIT has always been run before this call anyway).
beta (float) – The quantile \(\beta\) to estimate.
cit_obj (ITestCI) – The underlying cit-instance for which the quantile should be computed. (The present interface IProvideAnalyticQuantilesForCIT can, but does not have to, be exposed on the CIT-type itself.)

Returns:

Estimate of the dependence-value at the quantile \(\beta\)

Return type:

float
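
What an analytic quantile estimate can look like is sketched below under a plain Gaussian approximation in Fisher z-space. This is an assumption for illustration only: the function name, its simplified arguments (a correlation estimate instead of data blocks and a result object), and the variance 1 / (n - z_dim - 3) are not taken from this class, and the clipping controlled by lower_bound_clip_value is not modelled.

```python
import math
from statistics import NormalDist

def gaussian_quantile_sketch(r_hat, n, z_dim, beta):
    """Approximate beta-quantile of a partial-correlation estimate (hypothetical helper)."""
    z_hat = math.atanh(r_hat)                 # Fisher z-transform of the estimate
    sigma = 1.0 / math.sqrt(n - z_dim - 3)    # textbook standard error in z-space (assumed)
    z_beta = z_hat + NormalDist().inv_cdf(beta) * sigma
    return math.tanh(z_beta)                  # map back to the correlation scale

lower = gaussian_quantile_sketch(0.3, n=100, z_dim=1, beta=0.05)
median = gaussian_quantile_sketch(0.3, n=100, z_dim=1, beta=0.5)
upper = gaussian_quantile_sketch(0.3, n=100, z_dim=1, beta=0.95)
```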

get_variance_estimate(N: int, dim_Z: int, cit_obj: ITestCI) → float

Implements functionality of interface IProvideVarianceForCIT.

Get an estimate of the block-wise variance of the dependence score implemented by cit_obj.

Parameters:

N (int) – The sample count N.
dim_Z (int) – The dimension of (number of variables in) the conditioning set Z.
cit_obj (ITestCI) – The underlying cit-instance for which the variance should be computed. (The present interface IProvideVarianceForCIT can, but does not have to, be exposed on the CIT-type itself.)

Returns:

The estimated value of the variance.

Return type:

float

static effective_sample_count(n: int, z_dim: int) → int

Compute effective sample-size

Parameters:

n (int) – actual sample-size
z_dim (int) – size of conditioning set

Returns:

effective sample-size

Return type:

int

classmethod analytic_score_var(n: int, z_dim: int) → float

Analytic approximation for score variance.

Parameters:

n (int) – sample size
z_dim (int) – size of conditioning set

Returns:

score variance

Return type:

float

classmethod analytic_score_std(n: int, z_dim: int) → float

Analytic approximation for score standard deviation.

Parameters:

n (int) – sample size
z_dim (int) – size of conditioning set

Returns:

score standard deviation

Return type:

float
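
For orientation, the textbook Fisher approximation for a z-transformed partial correlation is sketched below. Whether effective_sample_count, analytic_score_var and analytic_score_std use exactly this convention is not stated here, so the formulas and the function names with the _sketch suffix are assumptions.

```python
import math

def effective_sample_count_sketch(n, z_dim):
    """Assumed convention: each conditioning variable costs one sample."""
    return n - z_dim

def analytic_score_var_sketch(n, z_dim):
    """Textbook variance of the Fisher z-score: 1 / (n_eff - 3)."""
    return 1.0 / (effective_sample_count_sketch(n, z_dim) - 3)

def analytic_score_std_sketch(n, z_dim):
    """Standard deviation as the square root of the variance above."""
    return math.sqrt(analytic_score_var_sketch(n, z_dim))
```

With n = 105 and z_dim = 2 this gives an effective count of 103, a variance of 1/100 and a standard deviation of 0.1.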

score_many(data: BlockView) → ndarray

Compute score (z-transformed partial correlation) on blocks

Parameters: data (BlockView) – data blocks
Returns: score per block
Return type: np.ndarray

score_single(data: CIT_DataPatterned) → float

Compute score (z-transformed partial correlation)

Parameters: data (CIT_DataPatterned) – data
Returns: score
Return type: float
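
A z-transformed partial correlation can be computed as sketched below: regress X and Y on Z, correlate the residuals, and apply the Fisher transform arctanh. This is a generic NumPy illustration on plain arrays, not the implementation behind score_single or score_many. The function names and array arguments are hypothetical and do not reflect the CIT_DataPatterned or BlockView types.

```python
import numpy as np

def partial_corr_z_score(x, y, z=None):
    """Fisher z-transform of the partial correlation of x and y given z."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Design matrix: intercept only, or intercept plus conditioning variables.
    design = np.ones((n, 1)) if z is None else np.column_stack([np.ones(n), z])
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    r = np.corrcoef(rx, ry)[0, 1]
    return float(np.arctanh(r))

def scores_per_block(x, y, z, block_size):
    """Apply the score to consecutive blocks, regressing locally in each block."""
    starts = range(0, len(x) - block_size + 1, block_size)
    return np.array([
        partial_corr_z_score(x[s:s + block_size], y[s:s + block_size], z[s:s + block_size])
        for s in starts
    ])

# Hypothetical data: x and y depend on each other only through z.
rng = np.random.default_rng(1)
z = rng.normal(size=2000)
x = z + rng.normal(size=2000)
y = z + rng.normal(size=2000)

marginal_score = partial_corr_z_score(x, y)         # clearly non-zero
conditional_score = partial_corr_z_score(x, y, z)   # close to zero
block_scores = scores_per_block(x, y, z, block_size=500)
```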

pvalue(score: float | ndarray, N: int, dim_Z: int) → float | ndarray

Compute p-value for a given score and setup (possibly per block).

Parameters:

score (float | ndarray) – z-value(s)
N (int) – sample-size
dim_Z (int) – size of conditioning set

Returns:

p-value(s)

Return type:

float | ndarray
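
A two-sided p-value for a z-score can be sketched as follows, assuming the score is approximately normal under independence with the textbook variance 1 / (N - dim_Z - 3). That variance and the function name are assumptions, and the sketch handles scalar scores only.

```python
import math

def two_sided_pvalue_sketch(score, n, dim_z):
    """Two-sided normal p-value for a Fisher z-score (hypothetical helper)."""
    standardized = abs(score) * math.sqrt(n - dim_z - 3)  # score / assumed standard error
    return math.erfc(standardized / math.sqrt(2.0))       # equals 2 * (1 - Phi(standardized))

p_null = two_sided_pvalue_sketch(0.0, n=100, dim_z=1)
p_borderline = two_sided_pvalue_sketch(0.196, n=103, dim_z=0)  # standardized score 1.96
```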

pvalue_of_mean(score_mean: float, block_size: int, block_count: int, dim_Z: int) → float

Compute p-value for a given score-mean over blocks and setup.

Parameters:

score_mean (float) – mean z-value
block_size (int) – block-size
block_count (int) – block-count
dim_Z (int) – size of conditioning set

Returns:

p-value

Return type:

float
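
For a mean over blocks, a sketch under the additional assumption of independent blocks: the variance of the mean is the per-block variance divided by block_count. The per-block variance 1 / (block_size - dim_Z - 3) is the textbook Fisher approximation and, like the function name, is an assumption rather than the documented behaviour.

```python
import math

def pvalue_of_mean_sketch(score_mean, block_size, block_count, dim_z):
    """Two-sided normal p-value for the mean z-score over independent blocks."""
    var_block = 1.0 / (block_size - dim_z - 3)      # assumed per-block score variance
    std_mean = math.sqrt(var_block / block_count)   # standard error of the mean
    return math.erfc(abs(score_mean) / std_mean / math.sqrt(2.0))

p_four_blocks = pvalue_of_mean_sketch(0.098, block_size=103, block_count=4, dim_z=0)
p_sixteen_blocks = pvalue_of_mean_sketch(0.098, block_size=103, block_count=16, dim_z=0)
```

Under these assumptions the same mean score becomes more significant as the number of blocks grows.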

is_pvalue_dependent(pvalue: float) → bool

Decide whether a given p-value should be considered evidence of dependence.

Parameters: pvalue (float) – p-value
Returns: test considered dependent
Return type: bool
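
The conventional decision rule is a comparison against the level alpha passed to the constructor. The sketch below assumes a strict inequality, which is not confirmed here, and the function name is hypothetical.

```python
def is_pvalue_dependent_sketch(pvalue, alpha=0.05):
    """Assumed rule: reject independence when the p-value falls below alpha."""
    return pvalue < alpha
```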