causalnex.structure.notears.from_pandas

causalnex.structure.notears.from_pandas(X, max_iter=100, h_tol=1e-08, w_threshold=0.0, tabu_edges=None, tabu_parent_nodes=None, tabu_child_nodes=None)

Learn a StructureModel, the graph structure describing conditional dependencies between variables, from data presented as a pandas DataFrame.

The optimisation minimises a score function \(F(W)\) over the graph’s weighted adjacency matrix, \(W\), subject to the constraint \(h(W) = 0\), which characterises an acyclic graph. \(h(W) \geq 0\) is a continuous, differentiable function that encapsulates how far the graph is from being acyclic (smaller values mean closer to acyclic). Full details of this approach to structure learning are provided in the publication:

Based on DAGs with NO TEARS.

@inproceedings{zheng2018dags,
    author = {Zheng, Xun and Aragam, Bryon and Ravikumar, Pradeep and Xing, Eric P.},
    booktitle = {Advances in Neural Information Processing Systems},
    title = {{DAGs with NO TEARS: Continuous Optimization for Structure Learning}},
    year = {2018},
    codebase = {https://github.com/xunzheng/notears}
}
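To make the acyclicity constraint concrete, the sketch below evaluates the function defined in the cited paper, \(h(W) = \mathrm{tr}(e^{W \circ W}) - d\), for a \(d \times d\) matrix given as nested Python lists. The helper name acyclicity and the truncated power series are illustrative assumptions, not causalnex's own implementation.

```python
def acyclicity(W, terms=30):
    """Approximate h(W) = tr(exp(W o W)) - d with a truncated power series.

    Hypothetical helper for illustration; `terms` bounds the series length.
    """
    d = len(W)
    # Elementwise (Hadamard) square, so every entry is non-negative.
    A = [[W[i][j] ** 2 for j in range(d)] for i in range(d)]
    # `term` holds A^k / k!, starting from the identity matrix (k = 0).
    term = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    trace = float(d)  # trace of the identity
    for k in range(1, terms):
        term = [
            [sum(term[i][m] * A[m][j] for m in range(d)) / k for j in range(d)]
            for i in range(d)
        ]
        trace += sum(term[i][i] for i in range(d))
    return trace - d


# A single edge 0 -> 1 is acyclic; adding 1 -> 0 creates a cycle.
dag = [[0.0, 1.0], [0.0, 0.0]]
cycle = [[0.0, 1.0], [1.0, 0.0]]
h_dag = acyclicity(dag)
h_cycle = acyclicity(cycle)
```

Here h_dag is zero and h_cycle is strictly positive, which is the property the optimiser relies on when it drives \(h(W)\) below h_tol.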

Parameters:
  • X (DataFrame) – input data.
  • max_iter (int) – max number of dual ascent steps during optimisation.
  • h_tol (float) – stop once h(W) < h_tol, rather than requiring h(W) to be exactly 0.
  • w_threshold (float) – fixed threshold on absolute edge weights; edges whose weights are smaller in magnitude are dropped from the graph.
  • tabu_edges (Optional[List[Tuple[str, str]]]) – list of edges (from, to) not to be included in the graph.
  • tabu_parent_nodes (Optional[List[str]]) – list of nodes banned from being a parent of any other nodes.
  • tabu_child_nodes (Optional[List[str]]) – list of nodes banned from being a child of any other nodes.
Returns:

graph of conditional dependencies between data variables.

Return type:

StructureModel

Raises:

ValueError – If X does not contain data.
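A call might look like the following sketch. The synthetic chain a -> b -> c, the helper names make_chain_rows and learn_structure, and the chosen values for w_threshold and the tabu arguments are all illustrative assumptions; only the from_pandas signature comes from this page.

```python
import random


def make_chain_rows(n=200, seed=0):
    """Generate synthetic rows for a linear chain a -> b -> c (made-up data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = rng.gauss(0, 1)
        b = 2.0 * a + rng.gauss(0, 0.1)
        c = -1.5 * b + rng.gauss(0, 0.1)
        rows.append({"a": a, "b": b, "c": c})
    return rows


def learn_structure(rows):
    """Fit a StructureModel to the rows (hypothetical wrapper)."""
    # Imported lazily so the data helper above works without these packages.
    import pandas as pd
    from causalnex.structure.notears import from_pandas

    df = pd.DataFrame(rows)
    return from_pandas(
        df,
        max_iter=100,
        h_tol=1e-8,
        w_threshold=0.3,            # arbitrary cut-off chosen for this sketch
        tabu_edges=[("b", "a")],    # forbid the edge b -> a
        tabu_parent_nodes=["c"],    # c may not be a parent of any node
    )
```

Calling learn_structure(make_chain_rows()) returns a StructureModel. Because StructureModel builds on a networkx directed graph, the learned edges and their weights can be inspected with sm.edges(data="weight").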