Patterns

Patterns describe prior knowledge about likely regime-structures, for example temporal persistence. We provide an extension-point for customizing patterns by deriving a class from CIT_DataPatterned, and provide simple baseline implementations for temporally or spatially persistent regime-structure.

Custom Patterns

Custom patterns can be provided by deriving from CIT_DataPatterned and implementing at least

CIT_DataPatterned.view_blocks() (describing the actual prior assumption). This is the main input to provide. Given an integer block_size, the data has to be subdivided into “blocks” of approximately size block_size (to avoid degeneracies, it is recommended to round up rather than down, if necessary). A block is considered “valid” if all its data-points were generated in the same (true) regime, otherwise it is “invalid”. With this terminology, the subdivision into blocks should be such that blocks are valid with high probability (while satisfying the block-size requirement). This encodes prior knowledge about the system: For example, it is often plausible to assume that regimes in time-series are persistent in time (switch rarely). In this case choosing blocks as (contiguous) intervals in time makes sense, as then each of the (presumably few) switches invalidates at most one block. Blocks are currently considered disjoint. For providing a cache-id, see Cache IDs.
CIT_DataPatterned.reproject_blocks() (reprojecting a function evaluated on blocks back to the original index-set/-layout for plotting) and
CIT_DataPatterned.get_actual_block_format() (also used for plotting, typically trivial).
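The notion of valid and invalid blocks can be illustrated with a minimal NumPy sketch that does not use the library. The helpers contiguous_block_bounds and count_invalid_blocks are hypothetical names introduced only for this illustration, and the sketch simply drops trailing samples that do not fill a whole block. It compares contiguous blocks in time with strided (interleaved) blocks on toy regime labels containing two switches.

```python
import numpy as np


def contiguous_block_bounds(sample_count: int, block_size: int) -> list[tuple[int, int]]:
    """Split range(sample_count) into disjoint contiguous intervals [start, stop).

    Hypothetical helper, not part of the library. Trailing samples that do
    not fill a whole block are dropped.
    """
    block_count = sample_count // block_size
    return [(i * block_size, (i + 1) * block_size) for i in range(block_count)]


def count_invalid_blocks(regimes: np.ndarray, block_size: int) -> int:
    """Count contiguous blocks containing data points from more than one regime."""
    bounds = contiguous_block_bounds(len(regimes), block_size)
    return sum(1 for start, stop in bounds if np.unique(regimes[start:stop]).size > 1)


# Toy ground-truth regime labels: 100 samples, switches at t=37 and t=81.
regimes = np.concatenate([np.zeros(37, int), np.ones(44, int), np.zeros(19, int)])

# Contiguous intervals: each switch can invalidate at most one block.
invalid = count_invalid_blocks(regimes, block_size=10)

# Strided blocks (indices j, j+10, j+20, ...) ignore persistence in time,
# so every block mixes samples from both regimes.
strided_invalid = sum(1 for j in range(10) if np.unique(regimes[j::10]).size > 1)
```

With contiguous blocks, the number of invalid blocks is bounded by the number of switches, whereas the strided layout invalidates every block. This is the sense in which the block layout encodes the prior assumption.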

class CIT_DataPatterned

Bases: CIT_Data

Patterned data for mCIT.

See also

Pattern-related aspects are to be overridden by custom pattern providers, for example CIT_DataPatterned_PersistentInTime or CIT_DataPatterned_PesistentInSpace.

__init__(x_data: ndarray, y_data: ndarray, z_data: ndarray, cache_id: tuple | None) → None: Construct from raw data. Typically the user should use clone_from_data() (preserving the derived type, thus pattern specifications) instead.

view_blocks(block_size: int) → BlockView

View as blocks of given size. The layout of blocks encodes the (prior) knowledge about patterns.

Parameters:: block_size (int) – requested block-size (the block-size of the result may not exactly match this number, if the underlying pattern provider cannot construct arbitrary block-sizes).
Returns:: view as pattern-aligned blocks
Return type:: BlockView

static get_actual_block_format(requested_size: int) → int | tuple[int, ...]

Get the actual (possibly multi-dimensional) format of blocks produced. Used for plotting.

Parameters:: requested_size (int) – The size of blocks requested.
Returns:: Format of blocks produced.
Return type:: int|tuple[int,…]

static reproject_blocks(value_per_block: ndarray, block_configuration: BlockView, data_configuration: tuple[int, ...]) → ndarray

Reproject a function \(f\) on blocks to the original index-set layout (for example time, space etc). Used for plotting.

Parameters:

value_per_block (np.ndarray) – values of \(f\) for each block
block_configuration (BlockView) – the block-configuration (eg block-size) used
data_configuration (tuple[int,...]) – the data-shape (per-variable) in the original data

Returns:

plottable layout of \(f\) as function of the original index-space

Return type:

np.ndarray
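For the one-dimensional (persistent-in-time) case, reprojection can be pictured as repeating each per-block value over the time indices that the block covers. The following is a simplified sketch, not the library's implementation: the helper reproject_1d is hypothetical and takes a plain block_size and sample_count where the actual method receives a BlockView and a data-shape tuple. Filling uncovered trailing indices with NaN is likewise an assumption.

```python
import numpy as np


def reproject_1d(value_per_block: np.ndarray, block_size: int, sample_count: int) -> np.ndarray:
    """Expand one value per block to one value per time index.

    Hypothetical helper. Assumes contiguous, disjoint blocks starting at
    index 0; indices not covered by any block are set to NaN.
    """
    out = np.full(sample_count, np.nan)
    covered = len(value_per_block) * block_size
    out[:covered] = np.repeat(value_per_block, block_size)
    return out


# Three blocks of size 4 on a series of length 14 (the last 2 samples are uncovered).
values = np.array([0.1, 0.9, 0.2])
curve = reproject_1d(values, block_size=4, sample_count=14)
```

The result has the same length as the original series, so it can be plotted directly against the time axis.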

sample_count() → int

Get sample size.

Returns:: sample-size N
Return type:: int

z_dim() → int

Get dimension (number of variables) of conditioning set Z.

Returns:: dim(Z)
Return type:: int

copy_and_center() → CIT_DataPatterned

Copy and subtract mean.

Returns:: centered copy
Return type:: CIT_DataPatterned

view_blocks_trivial() → BlockView

View as trivial blocks (a single block of size N).

Returns:: view by trivial blocks
Return type:: BlockView

view_blocks_match(other: BlockView) → BlockView

View as blocks matching configuration (block-sizes etc) of another block-view.

Parameters:: other (BlockView) – The other block-view, whose configuration should be copied.
Returns:: A block-view of the data with the same configuration as other.
Return type:: BlockView

bootstrap_unaligned_blocks(rng: Generator, bootstrap_block_count: int, block_size: int) → BlockView

Bootstrap random (unaligned) blocks.

Parameters:

rng (np.random.Generator (or similar, requires .integers of numpy.random.Generator)) – A random number generator.
bootstrap_block_count (int) – Number of blocks to bootstrap
block_size (int) – Size per block

Returns:

The bootstrapped blocks (actually a copy, not a view)

Return type:

BlockView
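A minimal sketch of what bootstrapping unaligned blocks could look like for one-dimensional data is given below. It is not the library's implementation: the helper bootstrap_unaligned is hypothetical and operates on a plain array instead of returning a BlockView, and uniform sampling of start offsets with replacement is an assumption. The only requirement taken from the documentation above is that the generator provides .integers.

```python
import numpy as np


def bootstrap_unaligned(
    data: np.ndarray,
    rng: np.random.Generator,
    bootstrap_block_count: int,
    block_size: int,
) -> np.ndarray:
    """Draw blocks of consecutive samples at random start offsets.

    Hypothetical helper. Start offsets are drawn uniformly with replacement
    and are not aligned to any block grid.
    """
    # Upper bound is exclusive, so the last admissible start is len(data) - block_size.
    starts = rng.integers(0, len(data) - block_size + 1, size=bootstrap_block_count)
    # Row i holds the indices starts[i], starts[i] + 1, ..., starts[i] + block_size - 1.
    idx = starts[:, None] + np.arange(block_size)[None, :]
    return data[idx]  # integer-array indexing returns a copy, not a view


rng = np.random.default_rng(0)
data = np.arange(50.0)
blocks = bootstrap_unaligned(data, rng, bootstrap_block_count=8, block_size=5)
```

Because integer-array indexing copies the data, the result is independent of the original array, in line with the note above that the bootstrapped blocks are a copy rather than a view.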

classmethod clone_from_data(X: ndarray, Y: ndarray, Z: ndarray) → CIT_DataPatterned

Attach the currently used pattern-provider to given data.

Parameters:

X (np.ndarray) – X-data
Y (np.ndarray) – Y-data
Z (np.ndarray) – Z-data

Returns:

Patterned data.

Return type:

decltype(self), a type derived from CIT_DataPatterned

Baseline Patterns

The class CIT_DataPatterned_PersistentInTime provides an implementation for one-dimensional persistent patterns, for example persistent-in-time regimes.
The class CIT_DataPatterned_PesistentInSpace provides an implementation for two-dimensional persistent patterns, for example persistent-in-space regimes.

class CIT_DataPatterned_PersistentInTime

Bases: CIT_DataPatterned

Patterned data for mCIT. The implemented pattern captures persistent regimes in a single (eg time) direction.

x_data has shape=(N), where N is sample-size
y_data has shape=(N), where N is sample-size
z_data has shape=(N,k), where N is sample-size and k=dim(Z)

See also

See overview on custom patterns. Methods are specified and documented on CIT_DataPatterned.

view_blocks(block_size: int) → BlockView

Implements functionality of interface CIT_DataPatterned.

View as blocks of given size. The layout of blocks encodes the (prior) knowledge about patterns.

Parameters:: block_size (int) – requested block-size (the block-size of the result may not exactly match this number, if the underlying pattern provider cannot construct arbitrary block-sizes).
Returns:: view as pattern-aligned blocks
Return type:: BlockView

static get_actual_block_format(requested_size: int) → int

Implements functionality of interface CIT_DataPatterned.

Get the actual (possibly multi-dimensional) format of blocks produced. Used for plotting.

Parameters:: requested_size (int) – The size of blocks requested.
Returns:: Format of blocks produced.
Return type:: int|tuple[int,…]

static reproject_blocks(value_per_block: ndarray, block_configuration: BlockView, data_configuration: tuple[int, ...]) → ndarray

Implements functionality of interface CIT_DataPatterned.

Reproject a function \(f\) on blocks to the original index-set layout (for example time, space etc). Used for plotting.

Parameters:

value_per_block (np.ndarray) – values of \(f\) for each block
block_configuration (BlockView) – the block-configuration (eg block-size) used
data_configuration (tuple[int,...]) – the data-shape (per-variable) in the original data

Returns:

plottable layout of \(f\) as function of the original index-space

Return type:

np.ndarray

class CIT_DataPatterned_PesistentInSpace

Bases: CIT_DataPatterned

Patterned data for mCIT. The implemented pattern captures persistent regimes in two (eg spatial) directions.

x_data has shape=(w,h), where w, h are the width and height of the sample-grid.
y_data has shape=(w,h), where w, h are the width and height of the sample-grid.
z_data has shape=(w,h,k), where w, h are the width and height of the sample-grid, and k=dim(Z)

See also

See overview on custom patterns. Methods are specified and documented on CIT_DataPatterned.

static get_actual_block_format(requested_size: int) → tuple[int, int]

Implements functionality of interface CIT_DataPatterned.

Get the actual (possibly multi-dimensional) format of blocks produced. Used for plotting.

Parameters:: requested_size (int) – The size of blocks requested.
Returns:: Format of blocks produced.
Return type:: int|tuple[int,…]

view_blocks(block_size: int) → BlockView

Implements functionality of interface CIT_DataPatterned.

View as blocks of given size. The layout of blocks encodes the (prior) knowledge about patterns.

Parameters:: block_size (int) – requested block-size (the block-size of the result may not exactly match this number, if the underlying pattern provider cannot construct arbitrary block-sizes).
Returns:: view as pattern-aligned blocks
Return type:: BlockView

static reproject_blocks(value_per_block: ndarray, block_configuration: BlockView, data_configuration: tuple[int, ...]) → ndarray

Implements functionality of interface CIT_DataPatterned.

Reproject a function \(f\) on blocks to the original index-set layout (for example time, space etc). Used for plotting.

Parameters:

value_per_block (np.ndarray) – values of \(f\) for each block
block_configuration (BlockView) – the block-configuration (eg block-size) used
data_configuration (tuple[int,...]) – the data-shape (per-variable) in the original data

Returns:

plottable layout of \(f\) as function of the original index-space

Return type:

np.ndarray
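To illustrate why the block format can be multi-dimensional, the following sketch cuts a (w,h) grid into disjoint rectangular tiles. It is an illustration under stated assumptions, not the library's implementation: the helpers square_block_format and view_as_tiles are hypothetical, the use of square tiles and the round-up rule are assumptions, and grid cells that do not fill a whole tile are dropped.

```python
import math

import numpy as np


def square_block_format(requested_size: int) -> tuple[int, int]:
    """Smallest square (a, a) with a * a >= requested_size (assumed rounding rule)."""
    a = math.isqrt(requested_size - 1) + 1
    return (a, a)


def view_as_tiles(grid: np.ndarray, tile: tuple[int, int]) -> np.ndarray:
    """Cut a (w, h) grid into disjoint tiles, returned with shape (n_tiles, a, b).

    Hypothetical helper. Rows and columns that do not fill a whole tile are dropped.
    """
    a, b = tile
    w, h = grid.shape
    nw, nh = w // a, h // b
    trimmed = grid[: nw * a, : nh * b]
    # Split both axes into (tile index, offset within tile), then group tile indices.
    return trimmed.reshape(nw, a, nh, b).swapaxes(1, 2).reshape(nw * nh, a, b)


grid = np.arange(48).reshape(6, 8)
tile = square_block_format(8)      # requested 8 samples per block, rounded up to 3 x 3
tiles = view_as_tiles(grid, tile)  # 2 x 2 tiles fit into the 6 x 8 grid
```

Spatially compact tiles play the same role as contiguous intervals in time: if regimes are persistent in space, only tiles crossing a regime boundary become invalid.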

Underlying Data

Data is currently stored in numpy-arrays, whose shape depends on the data-manager used:

class CIT_Data

Data for CIT.

See also

Used through CIT_DataPatterned.

x_data: ndarray: Shape specified by data-manager/pattern-provider. See CIT_DataPatterned.

y_data: ndarray: Same shape as x_data.

z_data: ndarray: Shape=(shape_xy,k), where shape_xy is the shape of x_data/y_data and k=dim(Z) is the size of the conditioning set.

cache_id: tuple | None: Unique identifier associated to the data by the data-manager, to be used for caching results. None to disable caching.

__init__(x_data: ndarray, y_data: ndarray, z_data: ndarray, cache_id: tuple | None) → None
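The shape conventions above can be checked with a short NumPy sketch. The helper check_shapes is hypothetical and not part of the library; it only restates the rule that y_data matches x_data and that z_data carries one additional trailing axis of length k.

```python
import numpy as np


def check_shapes(x_data: np.ndarray, y_data: np.ndarray, z_data: np.ndarray) -> int:
    """Validate the documented shape convention and return k = dim(Z).

    Hypothetical helper for illustration only.
    """
    if y_data.shape != x_data.shape:
        raise ValueError("y_data must have the same shape as x_data")
    if z_data.shape[:-1] != x_data.shape:
        raise ValueError("z_data must have shape (shape_xy, k)")
    return z_data.shape[-1]


rng = np.random.default_rng(1)

# Persistent-in-time layout: shape (N) for X and Y, shape (N, k) for Z.
N, k = 200, 3
k_time = check_shapes(rng.normal(size=N), rng.normal(size=N), rng.normal(size=(N, k)))

# Persistent-in-space layout: shape (w, h) for X and Y, shape (w, h, k) for Z.
w, h = 20, 30
k_space = check_shapes(
    rng.normal(size=(w, h)), rng.normal(size=(w, h)), rng.normal(size=(w, h, k))
)
```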