causalnex.structure.DAGClassifier
-
class causalnex.structure.DAGClassifier(dist_type_schema=None, alpha=0.0, beta=0.0, fit_intercept=True, hidden_layer_units=None, threshold=0.0, tabu_edges=None, tabu_parent_nodes=None, tabu_child_nodes=None, dependent_target=True, enforce_dag=False, standardize=False, target_dist_type=None, notears_mlp_kwargs=None)
Bases: sklearn.base.ClassifierMixin, causalnex.structure.pytorch.sklearn._base.DAGBase
Classifier wrapper of the StructureModel. Implements the sklearn .fit and .predict interface.
Example:
    from causalnex.structure import DAGClassifier
    clf = DAGClassifier(threshold=0.1)
    clf.fit(X_train, y_train)
    y_preds = clf.predict(X_test)
    type(y_preds)                    # np.ndarray
    type(clf.feature_importances_)   # np.ndarray
-
feature_importances_
An array of edge weights corresponding positionally to the features in X.
- Type
np.ndarray
-
coef_
An array of edge weights corresponding positionally to the features in X.
- Type
np.ndarray
-
intercept_
The target node bias value.
- Type
float
Attributes
coef_
Signed relationship between features and the target. In the linear case this is equivalent to linear regression coefficients. Returns the mean effect relationship between nodes as an ndarray of shape (1, n_features) or (n_classes, n_features).
feature_importances_
Unsigned importances of the features with respect to the target. NOTE: these are used as the graph adjacency matrix. Returns the L2 relationship between nodes as an ndarray of shape (1, n_features) or (n_classes, n_features).
intercept_
Returns the bias term from the target node.
Methods
DAGClassifier.__delattr__
(name, /)Implement delattr(self, name).
DAGClassifier.__dir__
()Default dir() implementation.
DAGClassifier.__eq__
(value, /)Return self==value.
DAGClassifier.__format__
(format_spec, /)Default object formatter.
DAGClassifier.__ge__
(value, /)Return self>=value.
DAGClassifier.__getattribute__
(name, /)Return getattr(self, name).
DAGClassifier.__getstate__
()
DAGClassifier.__gt__
(value, /)Return self>value.
DAGClassifier.__hash__
()Return hash(self).
DAGClassifier.__init__
([dist_type_schema, …])dist_type_schema type: Optional[Dict[Union[str, int], str]]
DAGClassifier.__init_subclass__
(**kwargs)Set the set_{method}_request methods.
DAGClassifier.__le__
(value, /)Return self<=value.
DAGClassifier.__lt__
(value, /)Return self<value.
DAGClassifier.__ne__
(value, /)Return self!=value.
DAGClassifier.__new__
(**kwargs)Create and return a new object.
DAGClassifier.__reduce__
()Helper for pickle.
DAGClassifier.__reduce_ex__
(protocol, /)Helper for pickle.
DAGClassifier.__repr__
([N_CHAR_MAX])Return repr(self).
DAGClassifier.__setattr__
(name, value, /)Implement setattr(self, name, value).
DAGClassifier.__setstate__
(state)
DAGClassifier.__sizeof__
()Size of object in memory, in bytes.
DAGClassifier.__sklearn_clone__
()
DAGClassifier.__str__
()Return str(self).
DAGClassifier.__subclasshook__
Abstract classes can override this to customize issubclass().
DAGClassifier._build_request_for_signature
(…)Build the MethodMetadataRequest for a method using its signature.
DAGClassifier._check_feature_names
(X, *, reset)Set or check the feature_names_in_ attribute.
DAGClassifier._check_n_features
(X, reset)Set the n_features_in_ attribute, or check against it.
DAGClassifier._get_default_requests
()Collect default request values.
DAGClassifier._get_metadata_request
()Get requested data properties.
DAGClassifier._get_param_names
()Get parameter names for the estimator
DAGClassifier._get_tags
()
DAGClassifier._more_tags
()
DAGClassifier._repr_html_inner
()This function is returned by the @property _repr_html_ to make hasattr(estimator, "_repr_html_") return True or False depending on get_config()["display"].
DAGClassifier._repr_mimebundle_
(**kwargs)Mime bundle used by jupyter kernels to display estimator
DAGClassifier._validate_data
([X, y, reset, …])Validate input data and set or check the n_features_in_ attribute.
DAGClassifier._validate_params
()Validate types and values of constructor parameters
DAGClassifier.fit
(X, y)Fits the sm model using the concat of X and y.
DAGClassifier.get_edges_to_node
(name[, data])Get the edges to a specific node.
DAGClassifier.get_metadata_routing
()Get metadata routing of this object.
DAGClassifier.get_params
([deep])Get parameters for this estimator.
DAGClassifier.plot_dag
(output_filename[, …])Plot the DAG of the fitted model.
DAGClassifier.predict
(X)Uses the fitted NOTEARS algorithm to reconstruct y from known X data.
DAGClassifier.predict_proba
(X)Uses the fitted NOTEARS algorithm to reconstruct y from known X data.
DAGClassifier.score
(X, y[, sample_weight])Return the mean accuracy on the given test data and labels.
DAGClassifier.set_params
(**params)Set the parameters of this estimator.
DAGClassifier.set_score_request
(*[, …])Request metadata passed to the score method.
-
__init__(dist_type_schema=None, alpha=0.0, beta=0.0, fit_intercept=True, hidden_layer_units=None, threshold=0.0, tabu_edges=None, tabu_parent_nodes=None, tabu_child_nodes=None, dependent_target=True, enforce_dag=False, standardize=False, target_dist_type=None, notears_mlp_kwargs=None)
- Parameters
dist_type_schema (Optional[Dict[Union[str, int], str]]) – The dist type schema corresponding to the X data passed to fit or predict. It maps the pandas column name in X to the string alias of a dist type. If X is a np.ndarray, it maps the positional index to the string alias of a dist type. A list of alias names can be found in dist_type/__init__.py. If None, assumes that all data in X is continuous.
alpha (float) – l1 loss weighting. When using nonlinear layers this is only applied to the first layer.
beta (float) – l2 loss weighting. Applied across all layers. Recommended when fitting nonlinearities.
fit_intercept (bool) – Whether to fit an intercept in the structure model equation. Use this if variables are offset.
hidden_layer_units (Optional[Iterable[int]]) – An iterable whose length determines the number of layers used, and whose values determine the number of nodes used for each layer, in order.
threshold (float) – The thresholding to apply to the DAG weights. If 0.0, does not apply any threshold.
tabu_edges (Optional[List]) – Tabu edges passed directly to the NOTEARS algorithm.
tabu_parent_nodes (Optional[List]) – Tabu nodes passed directly to the NOTEARS algorithm.
tabu_child_nodes (Optional[List]) – Tabu nodes passed directly to the NOTEARS algorithm.
dependent_target (bool) – If True, constrains NOTEARS so that y can only be dependent.
enforce_dag (bool) – If True, thresholds the graph until it is a DAG. NOTE: a properly trained model should be a DAG, and failure indicates other issues. Use of this is only recommended if features have similar units, otherwise comparing edge weight magnitude has limited meaning.
standardize (bool) – Whether to standardize the X and y variables before fitting. The L-BFGS algorithm used to fit the underlying NOTEARS works best on data all of the same scale, so this parameter is recommended.
notears_mlp_kwargs (Optional[Dict]) – Additional arguments for the NOTEARS MLP model.
target_dist_type (Optional[str]) – The distribution type of the target. Uses the same aliases as dist_type_schema.
- Raises
TypeError – if alpha is not numeric.
TypeError – if beta is not numeric.
TypeError – if fit_intercept is not a bool.
TypeError – if threshold is not numeric.
NotImplementedError – if target_dist_type not in supported_types
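The two forms of dist_type_schema described above (column-name keys for a pandas DataFrame, positional-index keys for an np.ndarray) can be sketched with plain dictionaries. The column names, the alias strings "cont", "bin" and "cat", and the helper positional_schema are assumptions made for illustration; the authoritative alias list is the one in dist_type/__init__.py.

```python
# Hypothetical column names and dist type aliases, for illustration only.
# Check dist_type/__init__.py for the aliases your causalnex version accepts.
schema_by_name = {"age": "cont", "smoker": "bin", "region": "cat"}


def positional_schema(columns, schema):
    """Re-key a column-name schema by position, for use with an ndarray X.

    Columns that are missing from `schema` are left out of the result.
    """
    return {i: schema[col] for i, col in enumerate(columns) if col in schema}


columns = ["age", "income", "smoker", "region"]
schema_by_index = positional_schema(columns, schema_by_name)
print(schema_by_index)
```

Either dictionary would then be passed as dist_type_schema, depending on whether X is a DataFrame or an array.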
-
property coef_
Signed relationship between features and the target. In the linear case this is equivalent to linear regression coefficients.
- Return type
ndarray
- Returns
The mean effect relationship between nodes. Shape: (1, n_features) or (n_classes, n_features).
-
property feature_importances_
Unsigned importances of the features with respect to the target. NOTE: these are used as the graph adjacency matrix.
- Return type
ndarray
- Returns
The L2 relationship between nodes. Shape: (1, n_features) or (n_classes, n_features).
-
fit(X, y)
Fits the structure model (sm) using the concatenation of X and y.
- Raises
NotImplementedError – If unsupported target_dist_type provided.
ValueError – If less than 2 classes provided.
- Return type
DAGClassifier
- Returns
Instance of DAGClassifier.
-
get_edges_to_node(name, data='weight')
Get the edges to a specific node.
- Parameters
name (str) – The name of the node which to get weights towards.
data (str) – The edge parameter to get. Default is "weight" to return the adjacency matrix. Set to "mean_effect" to return the signed average effect of features on the target node.
- Return type
Series
- Returns
The specified edge data.
-
get_metadata_routing()
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns
routing – A MetadataRequest encapsulating routing information.
- Return type
MetadataRequest
-
get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
dict
-
property intercept_
- Returns
The bias term from the target node. Shape: (1,) or (n_classes,).
- Return type
ndarray
-
plot_dag(output_filename, enforce_dag=False, plot_structure_kwargs=None, layout_kwargs=None)
Plot the DAG of the fitted model.
- Parameters
output_filename (str) – If provided, write html to a given path, e.g. "./plot.html".
enforce_dag (bool) – Whether to threshold the model until it is a DAG. Does not alter the underlying model.
plot_structure_kwargs (Optional[Dict[str, Dict]]) – Dictionary of kwargs for the causalnex plotting module.
layout_kwargs (Optional[Dict[str, Dict]]) – Dictionary to set the layout and physics of the graph. Example:
    layout_kwargs = {
        "physics": {"solver": "repulsion"},
        "layout": {"hierarchical": {"enabled": True}},
    }
- Return type
IFrame
- Returns
Plot of the DAG with the proper encoding to run on Windows machines.
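The idea behind enforce_dag, thresholding the model until it is a DAG, can be sketched in plain Python: keep dropping the smallest-magnitude edge until no cycle remains. This is a minimal sketch under that assumption, not causalnex's implementation; the function names and the (parent, child) edge dictionary are hypothetical.

```python
def has_cycle(edges):
    """Return True if the directed graph given by `edges` contains a cycle."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in children.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:  # back edge: nxt is an ancestor of node
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(children))


def threshold_until_dag(weights):
    """Remove the smallest-magnitude edges until the graph is acyclic.

    `weights` maps (parent, child) pairs to edge weights. The input is not
    modified; a pruned copy is returned.
    """
    kept = dict(weights)
    while has_cycle(kept):
        weakest = min(kept, key=lambda edge: abs(kept[edge]))
        del kept[weakest]
    return kept


weights = {("a", "b"): 0.9, ("b", "c"): 0.7, ("c", "a"): 0.05, ("a", "y"): 0.4}
dag = threshold_until_dag(weights)
print(sorted(dag))
```

Because edges are ranked by magnitude alone, the result is only meaningful when weights are comparable, which matches the note that enforce_dag is recommended only when features have similar units.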
-
predict(X)
Uses the fitted NOTEARS algorithm to reconstruct y from known X data.
- Return type
ndarray
- Returns
Predicted y values for each row of X.
-
predict_proba(X)
Uses the fitted NOTEARS algorithm to reconstruct y from known X data.
- Return type
ndarray
- Returns
Predicted y class probabilities for each row of X.
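How per-class probabilities relate to hard labels can be sketched as follows. It assumes, as is conventional for scikit-learn style classifiers but not stated on this page, that the predicted label is the class with the highest probability; labels_from_proba and the classes list are hypothetical names.

```python
def labels_from_proba(proba, classes):
    """Pick, for each row, the class whose probability is largest.

    Assumption: column j of `proba` corresponds to classes[j].
    """
    labels = []
    for row in proba:
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(classes[best])
    return labels


proba = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
predicted = labels_from_proba(proba, classes=["low", "mid", "high"])
print(predicted)
```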
-
score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
- Parameters
X (array-like of shape (n_samples, n_features)) – Test samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True labels for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.
- Returns
score – Mean accuracy of self.predict(X) w.r.t. y.
- Return type
float
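The mean accuracy returned by score, including the effect of sample_weight, can be illustrated with a small stand-alone function. This is a sketch of the metric only; mean_accuracy is a hypothetical helper, and the labels and weights are made-up data.

```python
def mean_accuracy(y_true, y_pred, sample_weight=None):
    """Fraction of correct predictions, optionally weighted per sample."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    # Sum the weights of the samples whose prediction matches the label.
    correct = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return correct / sum(sample_weight)


y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
unweighted = mean_accuracy(y_true, y_pred)
weighted = mean_accuracy(y_true, y_pred, sample_weight=[1, 1, 2, 1])
print(unweighted, weighted)
```

Giving the misclassified third sample twice the weight lowers the score from 3/4 to 3/5.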
-
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
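The <component>__<parameter> naming used for nested objects can be illustrated by splitting a parameter name at its first double underscore. This is a simplified sketch of the convention, not scikit-learn's code; split_param and the step name "clf" are hypothetical.

```python
def split_param(name):
    """Split 'component__parameter' into (component, parameter).

    A name without '__' belongs to the estimator itself, so the
    component part is returned as None.
    """
    component, sep, rest = name.partition("__")
    if not sep:
        return None, name
    return component, rest


direct = split_param("threshold")
nested = split_param("clf__threshold")
deeper = split_param("clf__inner__alpha")
print(direct, nested, deeper)
```

Under this convention, a DAGClassifier placed in a Pipeline step named "clf" would have its threshold updated with set_params(clf__threshold=0.1), while the classifier on its own takes set_params(threshold=0.1).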
-
set_score_request(*, sample_weight: Union[bool, None, str] = '$UNCHANGED$') → causalnex.structure.pytorch.sklearn.clf.DAGClassifier
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a pipeline.Pipeline. Otherwise it has no effect.
- Parameters
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns
self – The updated object.
- Return type
object