Composition Layer

The composition layer contains code for controlling CD-algorithm execution based on extended independence-structure information, in order to solve the compound problem of hidden-context (context-specific) causal discovery (HCCD).

Independence Atom Backend

The composition layer operates exclusively on primitives exposed by the data_processing layer; data_management is logically detached and accessed only through the data_processing layer.

The composition layer specifies two interfaces to this end:

IProvideIndependenceAtoms specifies functionality a compound backend has to forward from the data_processing layer.
IHandleExplicitTransitionToMCI specifies extensions for PCMCI-family time-series algorithms.

State Space Construction

The state-space construction is the second formal ingredient of HCCD. This module specifies (for custom or future extensions) what functionality a state-space construction strategy has to expose for use with HCCD. This includes:

IConstructStateSpace specifies a state-space construction strategy
IRepresentStateSpace specifies what properties of a state-space have to be made available to HCCD
IRepresentState specifies what properties of a state have to be made available to HCCD
IPresentResult specifies what properties of a state-space-construction result have to be made available to HCCD

Further, the module state_space_construction provides the following implementations:

state_space_construction.NoUnionCycles provides a baseline implementation of state-space-construction for acyclic union-graphs.
state_space_construction.ResolveByRepresentor provides a preliminary/experimental implementation for representor-based indicator-resolution.

HCCD Controller

The HCCD controller coordinates (re)execution of the CD-algorithm and the state-space-construction, see §4.4 in [RR25]. There are currently three implementations with increasing specialization:

Controller is suitable for general IID-algorithms.
ControllerTimeseriesMCI is suitable for PCMCI and PCMCI+.
ControllerTimeseriesLPCMCI is suitable for LPCMCI.

Abstract CD Specification

Further, this module specifies how third-party or custom (IID or stationary) CD-algorithms may be exposed for direct use in the framework.

Abstract CITs

Conditional independence-tests (CITs) are abstracted as mappings from a CI-identifier (CI_Identifier) to a boolean value indicating dependence (True) or independence (False). Thus an abstract CIT in this sense presumes a compound backend (data-provider plus actual test, cf. Independence Atom Backend).

When implementing a custom CIT (or wrapping a third-party implementation), it is usually more appropriate to implement the interfaces described under Interfaces for CIT implementations instead. The abstract_cit_t type should primarily be used when specifying custom CD-algorithms (see below).

type abstract_cit_t = Callable[[CI_Identifier], bool]: Specifies the signature of abstract CITs as used by CD-algorithms (see abstract_cd_t).
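As an illustration of this signature, the following minimal sketch builds an abstract CIT from a fixed table of known independences (an "oracle"). The CI_Identifier class shown here is a hypothetical stand-in with assumed fields x, y and z; the framework's actual CI_Identifier may look different. The helper make_oracle_cit is likewise invented for this example.

```python
from typing import Callable, FrozenSet, NamedTuple


class CI_Identifier(NamedTuple):
    """Hypothetical stand-in; the framework's actual CI_Identifier may differ."""
    x: int
    y: int
    z: FrozenSet[int]  # conditioning set


# Plain alias, equivalent in spirit to the `type` statement above.
abstract_cit_t = Callable[[CI_Identifier], bool]


def make_oracle_cit(independences) -> abstract_cit_t:
    """Return an abstract CIT that answers from a fixed set of known independences.

    `independences` is an iterable of (x, y, conditioning_set) triples.
    """
    known = set()
    for x, y, z in independences:
        # Store the pair unordered, since independence is symmetric in x and y.
        known.add((frozenset((x, y)), frozenset(z)))

    def cit(ci: CI_Identifier) -> bool:
        # True means dependence, False means independence.
        return (frozenset((ci.x, ci.y)), ci.z) not in known

    return cit


# Chain 0 -> 1 -> 2: variables 0 and 2 are independent given {1}.
oracle = make_oracle_cit([(0, 2, {1})])
```

An oracle of this kind has no data-provider behind it, so it is mainly useful for testing CD-algorithms against a known independence structure.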

Graph Encoding

Graphs are output in tigramite-format:

Given \(k\) variables of IID data, the graph is a numpy array of shape \((k,k)\). The entry at [i,j] is the edge from index i to index j. Below we refer to ‘from’ as left-hand-side (lhs), and to ‘to’ as right-hand-side (rhs). The encoding is anti-symmetric, i.e. the entry at [j,i] is the mirror image of the entry at [i,j] (see edge-encodings below).
Given \(k\) variables of time-series data with maximum time-lag \(\tau_{\text{max}}\), the graph is a numpy array of shape \((k,k,\tau_{\text{max}}+1)\), with one slice per time-lag \(0,\dots,\tau_{\text{max}}\). This encodes a (time-)window-graph: The entry at [i,j,t] is the edge from index i to index j at time-lag t. The ‘slice’ at t=0 (and in general only at t=0) is anti-symmetric.

In both cases the numpy arrays contain 3-character strings (numpy dtype='<U3'), such that:

If the string is empty "", there is no edge, otherwise the middle character is a minus "*-*".
If the edge is into the rhs (lhs) node, the rhs (lhs) character is an arrowhead "*->" ("<-*").
If the edge is out of (but not into) the lhs (rhs) node, the lhs (rhs) character is a minus "--*" ("*--").
If it could not be determined if the edge is into or out of the lhs (rhs) node, the character on the lhs (rhs) side is a letter ‘o’ (as a circle) "o-*" ("*-o").
If inconsistencies are detected (contradicting edge-marks are deduced in the same place), the potentially dubious edge-mark is replaced by a letter ‘x’ (as a cross), e.g. "x-*" (or "*-x").

type graph_t = ndarray: Used to annotate graphs in tigramite-format.
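The IID encoding described above can be sketched as follows. The helpers mirror and set_edge are hypothetical names introduced only for this example (they are not part of the framework or of tigramite); they show one way to keep a graph anti-symmetric when writing edges.

```python
import numpy as np


def mirror(edge: str) -> str:
    """Return the same edge as read from the other endpoint, e.g. "-->" becomes "<--"."""
    swap = {"<": ">", ">": "<"}
    # Reverse the character order and flip the direction of arrowheads.
    return "".join(swap.get(c, c) for c in reversed(edge))


def set_edge(graph: np.ndarray, i: int, j: int, edge: str) -> None:
    """Write the edge at [i, j] and its mirror image at [j, i]."""
    graph[i, j] = edge
    graph[j, i] = mirror(edge)


# Three IID variables, initially without any edges (empty strings).
graph = np.full((3, 3), "", dtype="<U3")
set_edge(graph, 0, 1, "-->")  # out of node 0, into node 1
set_edge(graph, 1, 2, "o->")  # undetermined mark at node 1, into node 2
```

Reading graph[1, 0] then yields the mirrored string "<--", and graph[0, 2] stays empty because no edge was set between nodes 0 and 2.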

Abstract CD

Constraint-based causal discovery is abstracted as a mapping from a lazily evaluated independence structure (represented by abstract CITs, see abstract_cit_t) to a graph (in tigramite-format, see graph_t). Thus an abstract CD is simply a function (more generally: a callable object) taking an argument of type abstract_cit_t and returning a graph_t. Most third-party implementations of CD-algorithms (see bridges for examples) can readily be wrapped to satisfy this format.

type abstract_cd_t = Callable[[abstract_cit_t], graph_t]: Specifies the signature of abstract CD-algorithms.
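To make the signature concrete, here is a minimal sketch of a toy abstract CD for IID data. It only recovers the skeleton by brute-force search over all conditioning sets and marks every remaining edge as "o-o"; it is not one of the framework's algorithms. The CI_Identifier class is a hypothetical stand-in with assumed fields, and make_skeleton_cd and chain_oracle are names invented for this example.

```python
from itertools import combinations
from typing import Callable, FrozenSet, NamedTuple

import numpy as np


class CI_Identifier(NamedTuple):
    """Hypothetical stand-in; the framework's actual CI_Identifier may differ."""
    x: int
    y: int
    z: FrozenSet[int]  # conditioning set


abstract_cit_t = Callable[[CI_Identifier], bool]
graph_t = np.ndarray
abstract_cd_t = Callable[[abstract_cit_t], graph_t]


def make_skeleton_cd(k: int) -> abstract_cd_t:
    """Build a toy abstract CD for k IID variables (skeleton only)."""

    def cd(cit: abstract_cit_t) -> graph_t:
        graph = np.full((k, k), "", dtype="<U3")
        for i, j in combinations(range(k), 2):
            others = [v for v in range(k) if v not in (i, j)]
            # The CIT is queried lazily: any() stops at the first separating set.
            separated = any(
                not cit(CI_Identifier(i, j, frozenset(z)))
                for size in range(len(others) + 1)
                for z in combinations(others, size)
            )
            if not separated:
                # "o-o" is its own mirror image, so both entries are equal.
                graph[i, j] = graph[j, i] = "o-o"
        return graph

    return cd


def chain_oracle(ci: CI_Identifier) -> bool:
    """Abstract CIT for the chain 0 -> 1 -> 2 (True means dependence)."""
    return not ({ci.x, ci.y} == {0, 2} and ci.z == frozenset({1}))


graph = make_skeleton_cd(3)(chain_oracle)
```

Because the CD-algorithm only sees the callable, the same cd function could be run against any other abstract CIT without modification.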