Patterns

Patterns describe prior knowledge about likely regime structures, for example temporal persistence. Custom patterns can be defined by deriving from CIT_DataPatterned, which serves as the extension point. Simple baseline implementations are provided for temporally or spatially persistent regime structure.

Custom Patterns

Custom patterns can be provided by deriving from CIT_DataPatterned and implementing at least

  • CIT_DataPatterned.view_blocks() (describing the actual prior assumption). This is the main input to provide. Given an integer block_size, the data has to be subdivided into “blocks” of approximately size block_size (to avoid degeneracies, it is recommended to round up rather than down where necessary). A block is considered “valid” if all its data-points were generated in the same (true) regime, and “invalid” otherwise. In this terminology, the subdivision should be chosen such that blocks are valid with high probability (while satisfying the block-size requirement). This encodes prior knowledge about the system: for example, it is often plausible to assume that regimes in time-series are persistent in time (switch rarely). In this case choosing blocks as (contiguous) intervals in time makes sense, because each of the (presumably few) switches then invalidates at most one block. Blocks are currently considered disjoint. For providing a cache-id, see Cache IDs.

  • CIT_DataPatterned.reproject_blocks() (reprojecting a function evaluated on blocks back to the original index-set/-layout for plotting) and

  • CIT_DataPatterned.get_actual_block_format() (also used for plotting, typically trivial).
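To illustrate the notion of valid and invalid blocks for time-persistent regimes, the following minimal sketch splits a time index into blocks and counts how many blocks are invalidated by regime switches. It uses plain NumPy and does not call the library; contiguous_blocks and count_invalid_blocks are hypothetical helpers, and dropping the trailing remainder is an assumption made for brevity.

```python
import numpy as np

def contiguous_blocks(n_samples: int, block_size: int) -> np.ndarray:
    """Return an index array of shape (n_blocks, block_size) of contiguous blocks.

    Assumption: trailing samples that do not fill a whole block are dropped.
    """
    n_blocks = n_samples // block_size
    return np.arange(n_blocks * block_size).reshape(n_blocks, block_size)

def count_invalid_blocks(regimes: np.ndarray, blocks: np.ndarray) -> int:
    """Count blocks whose samples do not all share one regime label."""
    per_block = regimes[blocks]  # shape (n_blocks, block_size)
    is_valid = (per_block == per_block[:, :1]).all(axis=1)
    return int((~is_valid).sum())

# Toy regime labels with two switches (at t=37 and t=85) in 100 samples.
regimes = np.zeros(100, dtype=int)
regimes[37:85] = 1

blocks = contiguous_blocks(len(regimes), block_size=10)
n_invalid_contiguous = count_invalid_blocks(regimes, blocks)

# For contrast: strided blocks (every 10th sample) ignore temporal persistence.
n_invalid_strided = count_invalid_blocks(regimes, blocks.T)
```

In this toy setup each switch invalidates exactly one contiguous block, whereas every strided block mixes both regimes. A real pattern provider would return a BlockView rather than a raw index array.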

class CIT_DataPatterned

Bases: CIT_Data

Patterned data for mCIT.

See also

Pattern-related aspects are to be overwritten by custom pattern providers, for example CIT_DataPatterned_PersistentInTime or CIT_DataPatterned_PesistentInSpace.

__init__(x_data: ndarray, y_data: ndarray, z_data: ndarray, cache_id: tuple | None) None

Construct from raw data. Typically, the user should use clone_from_data() instead, which preserves the derived type and thus the pattern specification.

view_blocks(block_size: int) BlockView

View as blocks of given size. The layout of blocks encodes the (prior) knowledge about patterns.

Parameters:

block_size (int) – requested block-size (the block-size of the result may not exactly match this number, if the underlying pattern provider cannot construct arbitrary block-sizes).

Returns:

view as pattern-aligned blocks

Return type:

BlockView

static get_actual_block_format(requested_size: int) int | tuple[int, ...]

Get the actual (possibly multi-dimensional) format of blocks produced. Used for plotting.

Parameters:

requested_size (int) – The size of blocks requested.

Returns:

Format of blocks produced.

Return type:

int|tuple[int,…]

static reproject_blocks(value_per_block: ndarray, block_configuration: BlockView, data_configuration: tuple[int, ...]) ndarray

Reproject a function \(f\) on blocks to the original index-set layout (for example time, space etc). Used for plotting.

Parameters:
  • value_per_block (np.ndarray) – values of \(f\) for each block

  • block_configuration (BlockView) – the block-configuration (eg block-size) used

  • data_configuration (tuple[int,...]) – the data-shape (per-variable) in the original data

Returns:

plottable layout of \(f\) as function of the original index-space

Return type:

np.ndarray
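As a rough illustration of what reprojection means for contiguous blocks in time, the sketch below spreads one value per block back over the time axis. It is not the library's implementation: reproject_to_time is a hypothetical helper that takes a plain block size and sample count instead of a BlockView and a data_configuration, and filling uncovered samples with NaN is an assumption.

```python
import numpy as np

def reproject_to_time(value_per_block: np.ndarray, block_size: int,
                      n_samples: int) -> np.ndarray:
    """Spread one value per contiguous block over the time axis.

    Assumptions: blocks are contiguous, equally sized and start at index 0;
    samples not covered by any block are filled with NaN.
    """
    out = np.full(n_samples, np.nan)
    covered = min(len(value_per_block) * block_size, n_samples)
    out[:covered] = np.repeat(value_per_block, block_size)[:covered]
    return out

values = np.array([0.1, 0.9, 0.4])  # one value of f per block
series = reproject_to_time(values, block_size=4, n_samples=14)
```

The result has one entry per original time step and can be passed directly to a line plot. The NaN entries at the end leave a gap where no block was evaluated.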

sample_count() int

Get sample size.

Returns:

sample-size N

Return type:

int

z_dim() int

Get dimension (number of variables) of conditioning set Z.

Returns:

dim(Z)

Return type:

int

copy_and_center() CIT_DataPatterned

Copy and subtract mean.

Returns:

centered copy

Return type:

CIT_DataPatterned
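A minimal sketch of the centering step for time-series shaped data, using plain NumPy arrays rather than the library's classes. That the mean is taken over the sample axis, separately for each variable in Z, is an assumption here.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 3
x = rng.normal(loc=2.0, size=n)        # shape (N,)
z = rng.normal(loc=-1.0, size=(n, k))  # shape (N, k)

# Assumption: subtract the mean over the sample axis, per variable.
# The subtraction creates new arrays, so x and z stay unchanged.
x_centered = x - x.mean(axis=0)
z_centered = z - z.mean(axis=0, keepdims=True)
```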

view_blocks_trivial() BlockView

View as trivial blocks (a single block of size N).

Returns:

view by trivial blocks

Return type:

BlockView

view_blocks_match(other: BlockView) BlockView

View as blocks matching configuration (block-sizes etc) of another block-view.

Parameters:

other (BlockView) – The other block-view, whose configuration should be copied.

Returns:

A block-view of the data with the same configuration as other.

Return type:

BlockView

bootstrap_unaligned_blocks(rng: Generator, bootstrap_block_count: int, block_size: int) BlockView

Bootstrap random (unaligned) blocks.

Parameters:
  • rng (np.random.Generator, or any object providing a compatible .integers method) – A random number generator.

  • bootstrap_block_count (int) – Number of blocks to bootstrap

  • block_size (int) – Size per block

Returns:

The bootstrapped blocks (actually a copy, not a view)

Return type:

BlockView
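The idea of unaligned block bootstrapping can be sketched as follows for one-dimensional data. This is an illustration under assumptions, not the library's code: bootstrap_unaligned is a hypothetical helper, it returns a plain array instead of a BlockView, and drawing start offsets uniformly with replacement is assumed.

```python
import numpy as np

def bootstrap_unaligned(data: np.ndarray, rng: np.random.Generator,
                        bootstrap_block_count: int, block_size: int) -> np.ndarray:
    """Draw contiguous blocks at random start offsets (with replacement).

    Returns a copy of shape (bootstrap_block_count, block_size).
    """
    n = data.shape[0]
    # Valid starts are 0 .. n - block_size; the upper bound of integers() is exclusive.
    starts = rng.integers(0, n - block_size + 1, size=bootstrap_block_count)
    idx = starts[:, None] + np.arange(block_size)[None, :]
    return data[idx]  # fancy indexing copies the selected samples

rng = np.random.default_rng(0)
x = np.arange(50, dtype=float)
boot = bootstrap_unaligned(x, rng, bootstrap_block_count=8, block_size=5)
```

Because the start offsets are arbitrary, the sampled blocks are not aligned with the pattern-aligned blocks returned by view_blocks() and may overlap each other.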

classmethod clone_from_data(X: ndarray, Y: ndarray, Z: ndarray) CIT_DataPatterned

Attach the currently used pattern-provider to given data.

Parameters:
  • X (np.ndarray) – X-data

  • Y (np.ndarray) – Y-data

  • Z (np.ndarray) – Z-data

Returns:

Patterned data.

Return type:

decltype(self), a type derived from CIT_DataPatterned
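How a classmethod can preserve the derived type is sketched below with toy classes. ToyData and ToyPersistentInTime are hypothetical stand-ins, not the library's classes, and passing cache_id=None to the clone is an assumption.

```python
import numpy as np

class ToyData:
    """Hypothetical stand-in for a patterned data class."""

    def __init__(self, x_data, y_data, z_data, cache_id=None):
        self.x_data, self.y_data, self.z_data = x_data, y_data, z_data
        self.cache_id = cache_id

    @classmethod
    def clone_from_data(cls, X, Y, Z):
        # `cls` is the class the method was called on (also when called on
        # an instance), so subclasses get an object of their own type back.
        return cls(X, Y, Z, cache_id=None)  # assumption: clones are uncached

class ToyPersistentInTime(ToyData):
    """Hypothetical derived pattern provider."""

x, y, z = np.zeros(10), np.ones(10), np.zeros((10, 2))
original = ToyPersistentInTime(x, y, z)
clone = original.clone_from_data(x + 1.0, y, z)
```

Here clone is again a ToyPersistentInTime, so any pattern-specific methods defined on the subclass remain available for the new data.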

Baseline Patterns

class CIT_DataPatterned_PersistentInTime

Bases: CIT_DataPatterned

Patterned data for mCIT. The implemented pattern captures persistent regimes in a single (eg time) direction.

x_data has shape=(N), where N is sample-size
y_data has shape=(N), where N is sample-size
z_data has shape=(N,k), where N is sample-size and k=dim(Z)

See also

See overview on custom patterns. Methods are specified and documented on CIT_DataPatterned.

view_blocks(block_size: int) BlockView

Implements functionality of interface CIT_DataPatterned.

View as blocks of given size. The layout of blocks encodes the (prior) knowledge about patterns.

Parameters:

block_size (int) – requested block-size (the block-size of the result may not exactly match this number, if the underlying pattern provider cannot construct arbitrary block-sizes).

Returns:

view as pattern-aligned blocks

Return type:

BlockView

static get_actual_block_format(requested_size: int) int

Implements functionality of interface CIT_DataPatterned.

Get the actual (possibly multi-dimensional) format of blocks produced. Used for plotting.

Parameters:

requested_size (int) – The size of blocks requested.

Returns:

Format of blocks produced.

Return type:

int

static reproject_blocks(value_per_block: ndarray, block_configuration: BlockView, data_configuration: tuple[int, ...]) ndarray

Implements functionality of interface CIT_DataPatterned.

Reproject a function \(f\) on blocks to the original index-set layout (for example time, space etc). Used for plotting.

Parameters:
  • value_per_block (np.ndarray) – values of \(f\) for each block

  • block_configuration (BlockView) – the block-configuration (eg block-size) used

  • data_configuration (tuple[int,...]) – the data-shape (per-variable) in the original data

Returns:

plottable layout of \(f\) as function of the original index-space

Return type:

np.ndarray

class CIT_DataPatterned_PesistentInSpace

Bases: CIT_DataPatterned

Patterned data for mCIT. The implemented pattern captures persistent regimes in two (eg spatial) directions.

x_data has shape=(w,h), where w, h are the width and height of the sample-grid.
y_data has shape=(w,h), where w, h are the width and height of the sample-grid.
z_data has shape=(w,h,k), where w, h are the width and height of the sample-grid, and k=dim(Z)

See also

See overview on custom patterns. Methods are specified and documented on CIT_DataPatterned.

static get_actual_block_format(requested_size: int) tuple[int, int]

Implements functionality of interface CIT_DataPatterned.

Get the actual (possibly multi-dimensional) format of blocks produced. Used for plotting.

Parameters:

requested_size (int) – The size of blocks requested.

Returns:

Format of blocks produced.

Return type:

tuple[int, int]
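A two-dimensional block format could look like the following sketch, which is an illustration rather than the library's rule. The helpers square_block_format and grid_blocks are hypothetical; choosing the smallest square tile with at least requested_size cells (rounding up) and dropping incomplete tiles at the border are assumptions.

```python
import math
import numpy as np

def square_block_format(requested_size: int) -> tuple[int, int]:
    """Assumed rule: smallest square tile with at least `requested_size` cells."""
    side = math.ceil(math.sqrt(requested_size))
    return (side, side)

def grid_blocks(grid: np.ndarray, block_shape: tuple[int, int]) -> np.ndarray:
    """Cut a (w, h) grid into non-overlapping tiles; returns (n_blocks, bw * bh).

    Assumption: rows/columns that do not fill a whole tile are dropped.
    """
    bw, bh = block_shape
    w, h = grid.shape
    nw, nh = w // bw, h // bh
    tiles = grid[: nw * bw, : nh * bh].reshape(nw, bw, nh, bh)
    # Bring the two tile indices to the front, then flatten each tile.
    return tiles.transpose(0, 2, 1, 3).reshape(nw * nh, bw * bh)

fmt = square_block_format(10)  # rounds up to a 4 x 4 tile
grid = np.arange(8 * 12).reshape(8, 12)
blocks = grid_blocks(grid, fmt)
```

Each row of blocks holds the cells of one spatially compact tile, which matches the assumption that regimes persist in both grid directions.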

view_blocks(block_size: int) BlockView

Implements functionality of interface CIT_DataPatterned.

View as blocks of given size. The layout of blocks encodes the (prior) knowledge about patterns.

Parameters:

block_size (int) – requested block-size (the block-size of the result may not exactly match this number, if the underlying pattern provider cannot construct arbitrary block-sizes).

Returns:

view as pattern-aligned blocks

Return type:

BlockView

static reproject_blocks(value_per_block: ndarray, block_configuration: BlockView, data_configuration: tuple[int, ...]) ndarray

Implements functionality of interface CIT_DataPatterned.

Reproject a function \(f\) on blocks to the original index-set layout (for example time, space etc). Used for plotting.

Parameters:
  • value_per_block (np.ndarray) – values of \(f\) for each block

  • block_configuration (BlockView) – the block-configuration (eg block-size) used

  • data_configuration (tuple[int,...]) – the data-shape (per-variable) in the original data

Returns:

plottable layout of \(f\) as function of the original index-space

Return type:

np.ndarray

Underlying Data

Data is currently stored in numpy-arrays, whose shape depends on the data-manager used:

class CIT_Data

Data for CIT.

See also

Used through CIT_DataPatterned.

x_data: ndarray

Shape specified by data-manager/pattern-provider. See CIT_DataPatterned.

y_data: ndarray

Same shape as x_data.

z_data: ndarray

Shape=(shape_xy,k), where shape_xy is the shape of x_data/y_data and k=dim(Z) is the size of the conditioning set.

cache_id: tuple | None

Unique identifier associated with the data by the data-manager, to be used for caching results. Set to None to disable caching.

__init__(x_data: ndarray, y_data: ndarray, z_data: ndarray, cache_id: tuple | None) None