causalnex.structure.DAGClassifier
class causalnex.structure.DAGClassifier(dist_type_schema=None, alpha=0.0, beta=0.0, fit_intercept=True, hidden_layer_units=None, threshold=0.0, tabu_edges=None, tabu_parent_nodes=None, tabu_child_nodes=None, dependent_target=True, enforce_dag=False, standardize=False, target_dist_type=None, notears_mlp_kwargs=None)
Bases: sklearn.base.ClassifierMixin, causalnex.structure.pytorch.sklearn._base.DAGBase
Classifier wrapper of the StructureModel. Implements the sklearn .fit and .predict interface.
Example:
>>> from causalnex.structure import DAGClassifier
>>> # X_train, y_train and X_test are assumed to be defined already
>>> clf = DAGClassifier(threshold=0.1)
>>> clf.fit(X_train, y_train)
>>> y_preds = clf.predict(X_test)
>>> type(y_preds)
np.ndarray
>>> type(clf.feature_importances_)
np.ndarray
feature_importances_
An array of edge weights corresponding positionally to the feature X.
- Type
np.ndarray
coef_
An array of edge weights corresponding positionally to the feature X.
- Type
np.ndarray
intercept_
The target node bias value.
- Type
float
Attributes
coef_
Signed relationship between features and the target. For this linear case this is equivalent to linear regression coefficients. Returns the mean effect relationship between nodes, with shape (1, n_features) or (n_classes, n_features).
feature_importances_
Unsigned importances of the features with respect to the target. NOTE: these are used as the graph adjacency matrix. Returns the L2 relationship between nodes, with shape (1, n_features) or (n_classes, n_features).
intercept_
The bias term from the target node.
Methods
DAGClassifier.__delattr__
(name, /)Implement delattr(self, name).
DAGClassifier.__dir__
()default dir() implementation
DAGClassifier.__eq__
(value, /)Return self==value.
DAGClassifier.__format__
default object formatter
DAGClassifier.__ge__
(value, /)Return self>=value.
DAGClassifier.__getattribute__
(name, /)Return getattr(self, name).
DAGClassifier.__getstate__
()
DAGClassifier.__gt__
(value, /)Return self>value.
DAGClassifier.__hash__
()Return hash(self).
DAGClassifier.__init__
([dist_type_schema, …])
DAGClassifier.__init_subclass__
This method is called when a class is subclassed.
DAGClassifier.__le__
(value, /)Return self<=value.
DAGClassifier.__lt__
(value, /)Return self<value.
DAGClassifier.__ne__
(value, /)Return self!=value.
DAGClassifier.__new__
(**kwargs)Create and return a new object.
DAGClassifier.__reduce__
helper for pickle
DAGClassifier.__reduce_ex__
helper for pickle
DAGClassifier.__repr__
([N_CHAR_MAX])Return repr(self).
DAGClassifier.__setattr__
(name, value, /)Implement setattr(self, name, value).
DAGClassifier.__setstate__
(state)
DAGClassifier.__sizeof__
()size of object in memory, in bytes
DAGClassifier.__str__
()Return str(self).
DAGClassifier.__subclasshook__
Abstract classes can override this to customize issubclass().
DAGClassifier._check_n_features
(X, reset)Set the n_features_in_ attribute, or check against it.
DAGClassifier._get_param_names
()Get parameter names for the estimator
DAGClassifier._get_tags
()
DAGClassifier._more_tags
()
DAGClassifier._repr_html_inner
()This function is returned by the @property _repr_html_ to make hasattr(estimator, "_repr_html_") return True or False depending on get_config()["display"].
DAGClassifier._repr_mimebundle_
(**kwargs)Mime bundle used by jupyter kernels to display estimator
DAGClassifier._validate_data
(X[, y, reset, …])Validate input data and set or check the n_features_in_ attribute.
DAGClassifier.fit
(X, y)Fits the sm model using the concat of X and y.
DAGClassifier.get_edges_to_node
(name[, data])Get the edges to a specific node.
DAGClassifier.get_params
([deep])Get parameters for this estimator.
DAGClassifier.plot_dag
([enforce_dag, …])Plot the DAG of the fitted model.
DAGClassifier.predict
(X)Uses the fitted NOTEARS algorithm to reconstruct y from known X data.
DAGClassifier.predict_proba
(X)Uses the fitted NOTEARS algorithm to reconstruct y from known X data.
DAGClassifier.score
(X, y[, sample_weight])Return the mean accuracy on the given test data and labels.
DAGClassifier.set_params
(**params)Set the parameters of this estimator.
__init__(dist_type_schema=None, alpha=0.0, beta=0.0, fit_intercept=True, hidden_layer_units=None, threshold=0.0, tabu_edges=None, tabu_parent_nodes=None, tabu_child_nodes=None, dependent_target=True, enforce_dag=False, standardize=False, target_dist_type=None, notears_mlp_kwargs=None)
- Parameters
dist_type_schema (Optional[Dict[Union[str, int], str]]) – The dist type schema corresponding to the X data passed to fit or predict. It maps the pandas column name in X to the string alias of a dist type. If X is a np.ndarray, it maps the positional index to the string alias of a dist type. A list of alias names can be found in dist_type/__init__.py. If None, assumes that all data in X is continuous.
alpha (float) – l1 loss weighting. When using nonlinear layers this is only applied to the first layer.
beta (float) – l2 loss weighting. Applied across all layers. Recommended when fitting nonlinearities.
fit_intercept (bool) – Whether to fit an intercept in the structure model equation. Use this if variables are offset.
hidden_layer_units (Optional[Iterable[int]]) – An iterable whose length determines the number of layers used, and whose values determine the number of nodes in each layer, in order.
threshold (float) – The thresholding to apply to the DAG weights. If 0.0, does not apply any threshold.
tabu_edges (Optional[List]) – Tabu edges passed directly to the NOTEARS algorithm.
tabu_parent_nodes (Optional[List]) – Tabu nodes passed directly to the NOTEARS algorithm.
tabu_child_nodes (Optional[List]) – Tabu nodes passed directly to the NOTEARS algorithm.
dependent_target (bool) – If True, constrains NOTEARS so that y can only be dependent.
enforce_dag (bool) – If True, thresholds the graph until it is a DAG. NOTE: a properly trained model should be a DAG, and failure indicates other issues. Use of this is only recommended if features have similar units, otherwise comparing edge weight magnitude has limited meaning.
standardize (bool) – Whether to standardize the X and y variables before fitting. The L-BFGS algorithm used to fit the underlying NOTEARS works best on data all of the same scale, so this parameter is recommended.
notears_mlp_kwargs (Optional[Dict]) – Additional arguments for the NOTEARS MLP model.
target_dist_type (Optional[str]) – The distribution type of the target. Uses the same aliases as dist_type_schema.
- Raises
TypeError – if alpha is not numeric.
TypeError – if beta is not numeric.
TypeError – if fit_intercept is not a bool.
TypeError – if threshold is not numeric.
NotImplementedError – if target_dist_type not in supported_types
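As a rough illustration of the two keying conventions described for dist_type_schema, the sketch below builds a schema keyed by column name (for a pandas DataFrame) and converts it to the equivalent schema keyed by positional index (for a np.ndarray). The column names, the helper to_positional, and the aliases "cont", "bin" and "cat" are assumptions made for illustration; consult dist_type/__init__.py for the aliases your installed version actually supports.

```python
# Hypothetical column order of X and hypothetical dist type aliases.
columns = ["age", "smoker", "region"]

# Keyed by column name: the form described for a pandas DataFrame.
schema_by_name = {"age": "cont", "smoker": "bin", "region": "cat"}


def to_positional(schema, columns):
    """Convert a name-keyed schema into an index-keyed one for ndarray input."""
    return {columns.index(name): alias for name, alias in schema.items()}


# Keyed by positional index: the form described for a np.ndarray.
schema_by_index = to_positional(schema_by_name, columns)
print(schema_by_index)
```

Either dictionary would then be passed as dist_type_schema, depending on the type of X given to fit and predict.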
property coef_
Signed relationship between features and the target. For this linear case this is equivalent to linear regression coefficients.
- Return type
ndarray
- Returns
The mean effect relationship between nodes. Shape: (1, n_features) or (n_classes, n_features).
property feature_importances_
Unsigned importances of the features with respect to the target. NOTE: these are used as the graph adjacency matrix.
- Return type
ndarray
- Returns
The L2 relationship between nodes. Shape: (1, n_features) or (n_classes, n_features).
fit(X, y)
Fits the structure model (sm) using the concatenation of X and y.
- Raises
NotImplementedError – If an unsupported target_dist_type is provided.
ValueError – If fewer than 2 classes are provided.
- Return type
DAGClassifier
- Returns
Instance of DAGClassifier.
get_edges_to_node(name, data='weight')
Get the edges to a specific node.
- Parameters
name (str) – The name of the node to get weights towards.
data (str) – The edge parameter to get. Default is "weight" to return the adjacency matrix. Set to "mean_effect" to return the signed average effect of features on the target node.
- Return type
Series
- Returns
The specified edge data.
get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
dict
property intercept_
The bias term from the target node. Shape: (1,) or (n_classes,).
- Return type
ndarray
plot_dag(enforce_dag=False, plot_structure_kwargs=None, use_mpl=True, ax=None, pixel_size_in=0.01)
Plot the DAG of the fitted model.
- Parameters
enforce_dag (bool) – Whether to threshold the model until it is a DAG. Does not alter the underlying model.
ax (Optional[Axes]) – Matplotlib axes to plot the model on. If None, creates an axis.
pixel_size_in (float) – Scaling multiple for the plot.
plot_structure_kwargs (Optional[Dict]) – Dictionary of kwargs for the causalnex plotting module.
use_mpl (bool) – Whether to use matplotlib as the backend. If False, ax and pixel_size_in are ignored.
- Return type
Union[Tuple[Figure, Axes], Image]
- Returns
Plot of the DAG.
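The enforce_dag option is described as thresholding the model until it is a DAG. One plausible reading, sketched here in plain Python and not taken from the causalnex source, is to repeatedly drop the weakest remaining edge until no cycle is left. The helper names and the edge dictionary layout are hypothetical.

```python
def has_cycle(edges):
    """Return True if the directed graph with (parent, child) edges has a cycle."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True  # reached a node already on the current path
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        found = any(visit(nxt) for nxt in children.get(node, []))
        state[node] = "done"
        return found

    return any(visit(node) for node in list(children))


def threshold_until_dag(edges):
    """Drop the smallest-magnitude edge until the graph is acyclic.

    Works on a copy, so the input weights are left unchanged.
    """
    remaining = dict(edges)
    while has_cycle(remaining):
        weakest = min(remaining, key=lambda edge: abs(remaining[edge]))
        del remaining[weakest]
    return remaining


# Made-up edge weights containing the cycle a -> b -> c -> a.
weights = {("a", "b"): 0.9, ("b", "c"): 0.7, ("c", "a"): 0.05, ("a", "y"): 0.4}
dag = threshold_until_dag(weights)
print(sorted(dag))
```

Because the weakest edge is chosen by magnitude alone, the result is only meaningful when edge weights are comparable, which matches the caveat about features needing similar units.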
predict(X)
Uses the fitted NOTEARS algorithm to reconstruct y from known X data.
- Return type
ndarray
- Returns
Predicted y values for each row of X.
predict_proba(X)
Uses the fitted NOTEARS algorithm to reconstruct y from known X data.
- Return type
ndarray
- Returns
Predicted y class probabilities for each row of X.
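predict_proba returns one row of class probabilities per sample. A common way to turn such probabilities into hard labels is to take the most probable class in each row, and the sketch below shows that step in plain Python. Whether predict does exactly this internally is an assumption; the helper name, class labels and probability values are made up.

```python
def labels_from_proba(proba, classes):
    """Pick the class with the highest probability in each row."""
    labels = []
    for row in proba:
        best = max(range(len(row)), key=row.__getitem__)  # index of the largest value
        labels.append(classes[best])
    return labels


classes = ["no", "yes"]
proba = [[0.8, 0.2], [0.3, 0.7], [0.55, 0.45]]
predicted = labels_from_proba(proba, classes)
print(predicted)
```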
score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
- Parameters
X (array-like of shape (n_samples, n_features)) – Test samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True labels for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.
- Returns
score – Mean accuracy of self.predict(X) with respect to y.
- Return type
float
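To make the two notions of accuracy concrete, the sketch below computes mean accuracy for single-label targets and subset accuracy for multi-label targets in plain Python. It ignores sample_weight, and the helper name and data are hypothetical.

```python
def mean_accuracy(y_true, y_pred):
    """Fraction of samples whose prediction matches the true value exactly.

    For multi-label rows (lists of labels) a row counts only if every
    label in it is correct, which is the subset accuracy.
    """
    matches = sum(1 for true, pred in zip(y_true, y_pred) if true == pred)
    return matches / len(y_true)


# Single-label case: three of four predictions are correct.
single = mean_accuracy([0, 1, 1, 0], [0, 1, 0, 0])

# Multi-label case: a row with one wrong label counts as fully wrong.
multi = mean_accuracy(
    [[1, 0], [0, 1], [1, 1], [0, 0]],
    [[1, 0], [0, 0], [1, 1], [0, 1]],
)
print(single, multi)
```

In the multi-label data, six of the eight individual labels are right, yet only two of the four rows match completely, which is why subset accuracy is described as a harsh metric.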
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
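The <component>__<parameter> naming can be illustrated without scikit-learn. The sketch below splits parameter names on the first double underscore to decide which component each value belongs to. The step name clf and the helper split_nested_params are hypothetical, and the code mirrors the naming convention rather than reproducing scikit-learn's implementation.

```python
def split_nested_params(params):
    """Group parameters into the object's own and per-component parameters."""
    own, nested = {}, {}
    for key, value in params.items():
        component, delim, sub_key = key.partition("__")
        if delim:
            # "clf__threshold" -> component "clf", parameter "threshold"
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params(
    {"verbose": True, "clf__threshold": 0.1, "clf__alpha": 0.01}
)
print(own, nested)
```

With a DAGClassifier placed in a Pipeline under the step name clf, a key such as clf__threshold would therefore address the classifier's threshold parameter.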