============================================
GLDF: Non-Homogeneous Causal Graph Discovery
============================================

.. py:module:: GLDF


This is the reference implementation (`github repo <https://github.com/martin-rabel/Causal_GLDF>`_) of the framework described by [RR25]_
for causal graph discovery on non-homogeneous data.

.. figure:: _static/output_example.png
   :scale: 30 %
   :alt: sample-output plus interpretation

   [RR25]_, Fig. 1.1:
   On the left-hand side (a), the framework's output is shown (see tutorial 1 below).
   The primary focus so far is the (labeled/colored) graph on the far left.
   The right-hand side (b) illustrates how the model is described by different
   causal graphs in different regions of space.


Introduction
============

Algorithms for causal graph discovery (CD) on IID or stationary data are readily
available. However, real-world data is rarely IID or stationary.
This framework aims in particular to find ways in which a causal graph
*changes* over time, space, or patterns in other extraneous information attached
to the data. It does so by focusing on an approach that is local in the graph (gL)
and by testing qualitative questions directly (D), hence the name gLD-framework
(GLDF).

Besides statistical efficiency, another major strength of this approach
is its modularity. This allows the present framework to directly
build on existing CD-algorithm implementations.
At the same time, almost arbitrary "patterns" generalizing persistence
in time or space can be leveraged to account for regime-structure.
The framework extensively builds on modified conditional independence tests
(CITs), where individual (or all) modifications can easily be customized
if the need arises.
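
To make the notion of a changing graph concrete, the following toy sketch
simulates two variables whose lagged link is active only in the first half of
the sample, and compares a pooled correlation with block-wise correlations.
It only illustrates the phenomenon and is *not* the procedure used by the
framework; the function names, the block count and the coefficient are
illustrative assumptions.

.. code-block:: python

   import numpy as np


   def simulate_switching_link(n_samples=2000, coefficient=0.8, seed=0):
       """Return (x, y) where x[t-1] drives y[t] only in the first half."""
       rng = np.random.default_rng(seed)
       x = rng.standard_normal(n_samples)
       y = rng.standard_normal(n_samples)
       half = n_samples // 2
       # The link X(t-1) -> Y(t) is active for t < half and absent afterwards.
       y[1:half] += coefficient * x[: half - 1]
       return x, y


   def lagged_correlation(x, y):
       """Pearson correlation between x[t-1] and y[t]."""
       return float(np.corrcoef(x[:-1], y[1:])[0, 1])


   def blockwise_lagged_correlation(x, y, n_blocks=4):
       """Lagged correlation evaluated separately on consecutive time blocks."""
       edges = np.linspace(0, len(x), n_blocks + 1, dtype=int)
       return [
           lagged_correlation(x[start:stop], y[start:stop])
           for start, stop in zip(edges[:-1], edges[1:])
       ]


   x, y = simulate_switching_link()
   pooled = lagged_correlation(x, y)
   blocks = blockwise_lagged_correlation(x, y)
   print(f"pooled: {pooled:.2f}")
   print("per block:", [f"{value:.2f}" for value in blocks])

The pooled value averages over both regimes, whereas the block-wise values
separate the blocks in which the link is present from those in which it is absent.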


Getting Started
===============

The framework works with many algorithms from the
`tigramite <https://github.com/jakobrunge/tigramite/>`_
and `causal-learn <https://causal-learn.readthedocs.io/en/latest/>`_
python packages out of the box.
We recommend installing at least one of these.


Installation
------------

.. role:: bash(code)
   :language: bash

The package can currently be installed through pip or by cloning the `code-repository <https://github.com/martin-rabel/Causal_GLDF>`_.
Installation via pip should typically occur within a virtual environment,
see e.g. `venv <https://docs.python.org/3/library/venv.html>`_.

.. tabs::
         
   .. tab:: Full
      
      Recommended.
      Install relevant/helpful dependencies (see :ref:`label-dependencies`).

      .. code-block:: bash

         pip install 'GLDF[full]'

      
      The tutorials, provided as jupyter notebooks, additionally require `jupyter <https://jupyter.org/>`_:

      .. code-block:: bash

         pip install ipykernel jupyter


   .. tab:: Minimal

      Install only minimal dependencies (see :ref:`label-dependencies`).

      .. code-block:: bash

         pip install 'GLDF[minimal]'

      .. note::

         Currently, :bash:`pip install GLDF` also installs the minimal version.
         If `PEP 771 <https://peps.python.org/pep-0771/>`_ is adopted,
         we may change the default behavior
         to the recommended (full) setting in future versions. To enforce a minimal
         setup, you should thus *not* rely on pip's current fallback behavior.

   
   .. tab:: No Dependencies

      This package (like any other package) can be installed without pulling in dependencies
      via the :bash:`--no-deps` option:
      
      .. code-block:: bash

         pip install GLDF --no-deps


.. _label-dependencies:

Dependencies
------------

Beyond python (version 3.10 or newer, tested on 3.13.5) and the python standard-library,
this package requires:

*  numpy (version 2.0.0 or newer, tested on 2.3.2)
*  scipy (version 1.10.0 or newer, tested on 1.16.1)

For easy and direct access to plotting functionality:

*  matplotlib (version 3.7.0 or newer, tested on 3.10.5)
*  tigramite (version 5.2.0.0 or newer, tested on 5.2.8.2)
*  networkx (version 3.0 or newer, tested on 3.5)

For easy and direct access to causal discovery algorithms:

*  tigramite (time-series algorithms; version see above)
*  causal-learn (IID algorithms; tested on version 0.1.4.3)
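
The minimum versions listed above can be checked against the current
environment with a short script. The following is a minimal sketch using only
the standard library; the helper names are illustrative and not part of GLDF,
it assumes that the distribution names on PyPI match the names listed above,
and its version parsing is deliberately crude (a pre-release such as
``2.0.0rc1`` is treated like ``2.0.0``).

.. code-block:: python

   import sys
   from importlib.metadata import PackageNotFoundError, version

   # Minimum versions as listed above; causal-learn has no stated minimum.
   MINIMUM_VERSIONS = {
       "numpy": (2, 0, 0),
       "scipy": (1, 10, 0),
       "matplotlib": (3, 7, 0),
       "tigramite": (5, 2, 0, 0),
       "networkx": (3, 0),
   }


   def parse_version(text):
       """Extract leading numeric components, e.g. '1.16.1rc1' -> (1, 16, 1)."""
       parts = []
       for piece in text.split("."):
           digits = ""
           for char in piece:
               if not char.isdigit():
                   break
               digits += char
           if not digits:
               break
           parts.append(int(digits))
           if digits != piece:
               break  # stop at a suffix such as 'rc1'
       return tuple(parts)


   def meets_minimum(installed, minimum):
       """Compare version tuples after padding the shorter one with zeros."""
       found = parse_version(installed)
       width = max(len(found), len(minimum))
       padded_found = found + (0,) * (width - len(found))
       padded_minimum = tuple(minimum) + (0,) * (width - len(minimum))
       return padded_found >= padded_minimum


   def check_dependencies(minimums=MINIMUM_VERSIONS):
       """Map each name to 'ok', 'missing' or 'too old (<version>)'."""
       report = {}
       for name, minimum in minimums.items():
           try:
               installed = version(name)
           except PackageNotFoundError:
               report[name] = "missing"
               continue
           ok = meets_minimum(installed, minimum)
           report[name] = "ok" if ok else f"too old ({installed})"
       return report


   print("python ok:", sys.version_info >= (3, 10))
   for name, status in check_dependencies().items():
       print(f"{name}: {status}")

For stricter comparisons (pre-releases, local version labels), a dedicated
version parser is the more robust choice.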


Usage
-----

If :py:obj:`data` is a numpy array of shape :math:`(N, k)` for :math:`k` variables
and a sample size of :math:`N`, then
a simple analysis assuming time-series data and regimes that persist in time
can be run out of the box using :py:func:`run_hccd_temporal_regimes<GLDF.frontend.run_hccd_temporal_regimes>` as follows:

.. code-block:: python

   import GLDF
   result = GLDF.run_hccd_temporal_regimes(data)
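
If the measurements are available as separate one-dimensional series rather
than as a single array, they have to be stacked into the :math:`(N, k)` layout
first. The following is a minimal sketch using only numpy; the helper name
``stack_series`` and the random example series are illustrative assumptions
and not part of GLDF.

.. code-block:: python

   import numpy as np


   def stack_series(series_by_name):
       """Stack equally long 1D series into an array of shape (N, k).

       Returns the array together with the column names, in column order.
       """
       names = list(series_by_name)
       if not names:
           raise ValueError("no series given")
       columns = [np.asarray(series_by_name[name], dtype=float) for name in names]
       if any(column.ndim != 1 for column in columns):
           raise ValueError("each series must be one-dimensional")
       lengths = {column.shape[0] for column in columns}
       if len(lengths) != 1:
           raise ValueError(f"series differ in length: {sorted(lengths)}")
       # Rows are samples (N), columns are variables (k).
       return np.column_stack(columns), names


   rng = np.random.default_rng(0)
   data, names = stack_series(
       {name: rng.standard_normal(500) for name in ["X", "Y", "Z", "W"]}
   )
   print(data.shape, names)

Keeping the list of names alongside the array makes it easy to label the
variables in the same order as the columns later on.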

The result is structured as a python object (see :py:class:`frontend.Result<GLDF.frontend.Result>`)
with easy-to-use visualization and inspection functionality.
Suppose :math:`k=4` and the variables should be labeled :math:`X, Y, Z, W`
in the plot (if no variable names are provided, indices are used instead).
Then, for example, the method
:py:meth:`plot_labeled_union_graph<GLDF.frontend.Result.plot_labeled_union_graph>`
produces a plot of the labeled union-graph:

.. code-block:: python

   import matplotlib.pyplot as plt
   result.var_names = ["X", "Y", "Z", "W"]
   result.plot_labeled_union_graph()
   plt.show()

Similarly, the individual model-indicators (changing links) can be inspected further via
:py:meth:`model_indicators<GLDF.frontend.Result.model_indicators>`,
for example using the method :py:meth:`plot_resolution<GLDF.frontend.ResolvableModelIndicator.plot_resolution>`\ :

.. code-block:: python

   for mi in result.model_indicators():
       mi.plot_resolution()
       plt.show()

See also the tutorials below for a more detailed introduction.


Tutorials
---------

We provide short `tutorials <https://github.com/martin-rabel/Causal_GLDF/tree/main/tutorials>`_ on the use and customization of our framework in the form of jupyter-notebooks:

1. `Quickstart <https://github.com/martin-rabel/Causal_GLDF/blob/main/tutorials/01_preconfigured_quickstart.ipynb>`_: Using preconfigured setups for time-series or 2D spatial data.
2. `Detailed Configuration <https://github.com/martin-rabel/Causal_GLDF/blob/main/tutorials/02_detailed_configuration.ipynb>`_: Change modules, their parameters and other aspects of a setup.
3. `Custom Patterns <https://github.com/martin-rabel/Causal_GLDF/blob/main/tutorials/03_custom_patterns.ipynb>`_: A quick intro describing the definition of user-defined patterns.

The tutorials assume the full :ref:`label-dependencies` are available (including matplotlib, tigramite and causal-learn).


Overview
========

This documentation and the framework are divided into two main parts:

*  The backend contains code implementing functionality required for the framework
   in a modular way. The documentation of the backend in particular contains
   detailed information on what features are implemented where and how to
   customize individual modules. This typically includes a specification
   of each given (sub-)component in the form of a documented interface
   (following the naming convention of starting with a capital 'I' for interface
   followed by a verb expressing what the component generally does).
   The backend is primarily of interest to the user who wants to
   substantially customize and extend the functionality of the framework itself.
*  The frontend contains code that satisfies simple user-queries by assembling
   backend-components in a suitable way.
   The frontend is primarily of interest to the user who wants to
   interact with the features provided by the framework in a simple way
   through higher level (non-modular) tasks, like end-to-end analysis
   of provided data.


Frontend
--------

.. toctree::
   :maxdepth: 2

   frontend


Backend
-------

.. toctree::
   :maxdepth: 2

   data_mgmt
   layer_data_proc
   layer_composition
   post_processing


Bridges
-------

.. toctree::
   :maxdepth: 2

   bridges


Roadmap
=======

Features currently under consideration for future versions.
If you want to contribute or have suggestions for improvements,
feel free to use github's features or contact us directly.

*  Systematic unit-testing, also for simple sanity-checking of custom implementations.
*  Allow marked tests to "mark" for aspects other than "independent regimes",
   e.g. heterogeneity (which is already tested anyway), and aggregate per-link results
   (e.g. consider the *link* heterogeneous if *all* tests on the link [for all conditioning sets]
   are reported heterogeneous).
*  Prepare for easier and (computationally and statistically) efficient integration
   of non-linear tests.
*  Integrate with JCI/CD-NOD like arguments for obtaining additional orientation information.
*  Make state-space-construction more robust for large models by exploiting graphical constraints.
*  Extend post-processing for statistically more meaningful indicator-resolution
   (potentially including confidence-assessment).


Citation
========

Please cite as:

.. code-block:: none

   @article{rabel2025context,
      title={Context-Specific Causal Graph Discovery with Unobserved Contexts:
         Non-Stationarity, Regimes and Spatio-Temporal Patterns},
      author={Rabel, Martin and Runge, Jakob},
      journal={arXiv preprint arXiv:2511.21537},
      year={2025},
      url={https://arxiv.org/abs/2511.21537},
      doi={10.48550/arXiv.2511.21537}
   }

Literature
==========

.. [RR25]
   M. Rabel, J. Runge.
   Context-Specific Causal Graph Discovery with Unobserved Contexts: Non-Stationarity, Regimes and Spatio-Temporal Patterns.
   *arXiv preprint* `arXiv:2511.21537 <https://arxiv.org/abs/2511.21537>`_, 2025.

.. [RNK+19] J. Runge, P. Nowack, M. Kretschmer, S. Flaxman, D. Sejdinovic.
   Detecting and quantifying causal associations in large nonlinear time series datasets. `Sci. Adv. 5, eaau4996 <https://advances.sciencemag.org/content/5/11/eaau4996>`_, 2019.

.. [R20] J. Runge.
   Discovering contemporaneous and lagged causal relations in autocorrelated nonlinear time series datasets.
   `Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence <http://auai.org/uai2020/proceedings/579_main_paper.pdf>`_,
   UAI 2020, Toronto, Canada, AUAI Press, 2020.

.. [GR20] A. Gerhardus, J. Runge.
   High-recall causal discovery for autocorrelated time series with latent confounders.
   `Advances in Neural Information Processing Systems 33 <https://proceedings.neurips.cc/paper/2020/hash/94e70705efae423efda1088614128d0b-Abstract.html>`_, 2020.

.. [SGS01] P. Spirtes, C. Glymour, and R. Scheines.
   Causation, prediction, and search. MIT press, 2001.

.. [SG91] P. Spirtes, C. Glymour.
   An algorithm for fast recovery of sparse causal graphs.
   Social Science Computer Review, 9:62–72, 1991.

.. [CM+14] D. Colombo, M. H. Maathuis, et al.
   Order-independent constraint-based causal structure learning.
   J. Mach. Learn. Res., 15(1):3741–3782, 2014.

.. [ZHC+24] Y. Zheng, B. Huang, W. Chen, J. Ramsey, M. Gong, R. Cai, S. Shimizu, P. Spirtes, and K. Zhang.
   `Causal-learn: Causal discovery in python <https://causal-learn.readthedocs.io/en/latest/index.html>`_.
   Journal of Machine Learning Research, 25(60):1–8, 2024.


License
=======

Copyright (C) 2025-2026 Martin Rabel

GNU General Public License v3.0

See `license-file <https://github.com/martin-rabel/Causal_GLDF/blob/main/LICENSE>`_ in the code-repository for full text.

This package is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This package is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.