causalnex.structure.DAGRegressor
-
class causalnex.structure.DAGRegressor(dist_type_schema=None, alpha=0.0, beta=0.0, fit_intercept=True, hidden_layer_units=None, threshold=0.0, tabu_edges=None, tabu_parent_nodes=None, tabu_child_nodes=None, dependent_target=True, enforce_dag=False, standardize=False, target_dist_type=None, notears_mlp_kwargs=None)
Bases: sklearn.base.RegressorMixin, causalnex.structure.pytorch.sklearn._base.DAGBase
Regressor wrapper of the StructureModel. Implements the sklearn .fit and .predict interface.
Example:
    from causalnex.structure import DAGRegressor
    reg = DAGRegressor(threshold=0.1)
    reg.fit(X_train, y_train)
    y_preds = reg.predict(X_test)
    type(y_preds)  # np.ndarray
    type(reg.feature_importances_)  # np.ndarray
feature_importances_
An array of edge weights corresponding positionally to the feature X.
- Type
np.ndarray
-
coef_
An array of edge weights corresponding positionally to the feature X.
- Type
np.ndarray
-
intercept_
The target node bias value.
- Type
float
Attributes
coef_ – Signed relationship between features and the target.
feature_importances_ – Unsigned importances of the features with respect to the target.
intercept_ – The bias term from the target node.
Methods
DAGRegressor.__delattr__(name, /) – Implement delattr(self, name).
DAGRegressor.__dir__() – default dir() implementation
DAGRegressor.__eq__(value, /) – Return self==value.
DAGRegressor.__format__ – default object formatter
DAGRegressor.__ge__(value, /) – Return self>=value.
DAGRegressor.__getattribute__(name, /) – Return getattr(self, name).
DAGRegressor.__getstate__()
DAGRegressor.__gt__(value, /) – Return self>value.
DAGRegressor.__hash__() – Return hash(self).
DAGRegressor.__init__([dist_type_schema, …])
DAGRegressor.__init_subclass__ – This method is called when a class is subclassed.
DAGRegressor.__le__(value, /) – Return self<=value.
DAGRegressor.__lt__(value, /) – Return self<value.
DAGRegressor.__ne__(value, /) – Return self!=value.
DAGRegressor.__new__(**kwargs) – Create and return a new object.
DAGRegressor.__reduce__ – helper for pickle
DAGRegressor.__reduce_ex__ – helper for pickle
DAGRegressor.__repr__([N_CHAR_MAX]) – Return repr(self).
DAGRegressor.__setattr__(name, value, /) – Implement setattr(self, name, value).
DAGRegressor.__setstate__(state)
DAGRegressor.__sizeof__() – size of object in memory, in bytes
DAGRegressor.__str__() – Return str(self).
DAGRegressor.__subclasshook__ – Abstract classes can override this to customize issubclass().
DAGRegressor._check_n_features(X, reset) – Set the n_features_in_ attribute, or check against it.
DAGRegressor._get_param_names() – Get parameter names for the estimator
DAGRegressor._get_tags()
DAGRegressor._more_tags()
DAGRegressor._repr_html_inner() – This function is returned by the @property _repr_html_ to make hasattr(estimator, “_repr_html_”) return True or False depending on get_config()[“display”].
DAGRegressor._repr_mimebundle_(**kwargs) – Mime bundle used by jupyter kernels to display estimator
DAGRegressor._validate_data(X[, y, reset, …]) – Validate input data and set or check the n_features_in_ attribute.
DAGRegressor.fit(X, y) – Fits the sm model using the concat of X and y.
DAGRegressor.get_edges_to_node(name[, data]) – Get the edges to a specific node.
DAGRegressor.get_params([deep]) – Get parameters for this estimator.
DAGRegressor.plot_dag([enforce_dag, …]) – Plot the DAG of the fitted model.
DAGRegressor.predict(X) – Uses the fitted NOTEARS algorithm to reconstruct y from known X data.
DAGRegressor.score(X, y[, sample_weight]) – Return the coefficient of determination \(R^2\) of the prediction.
DAGRegressor.set_params(**params) – Set the parameters of this estimator.
-
__init__(dist_type_schema=None, alpha=0.0, beta=0.0, fit_intercept=True, hidden_layer_units=None, threshold=0.0, tabu_edges=None, tabu_parent_nodes=None, tabu_child_nodes=None, dependent_target=True, enforce_dag=False, standardize=False, target_dist_type=None, notears_mlp_kwargs=None)
- Parameters
dist_type_schema (Optional[Dict[Union[str, int], str]]) – The dist type schema corresponding to the X data passed to fit or predict. It maps the pandas column name in X to the string alias of a dist type. If X is a np.ndarray, it maps the positional index to the string alias of a dist type. A list of alias names can be found in dist_type/__init__.py. If None, assumes that all data in X is continuous.
alpha (float) – L1 loss weighting. When using nonlinear layers this is only applied to the first layer.
beta (float) – L2 loss weighting. Applied across all layers. Recommended when fitting nonlinearities.
fit_intercept (bool) – Whether to fit an intercept in the structure model equation. Use this if variables are offset.
hidden_layer_units (Optional[Iterable[int]]) – An iterable whose length determines the number of layers used, and whose values determine the number of nodes in each layer, in order.
threshold (float) – The threshold to apply to the DAG weights. If 0.0, does not apply any threshold.
tabu_edges (Optional[List]) – Tabu edges passed directly to the NOTEARS algorithm.
tabu_parent_nodes (Optional[List]) – Tabu nodes passed directly to the NOTEARS algorithm.
tabu_child_nodes (Optional[List]) – Tabu nodes passed directly to the NOTEARS algorithm.
dependent_target (bool) – If True, constrains NOTEARS so that y can only be dependent.
enforce_dag (bool) – If True, thresholds the graph until it is a DAG. NOTE: a properly trained model should be a DAG, and failure indicates other issues. Use of this is only recommended if features have similar units; otherwise comparing edge weight magnitude has limited meaning.
standardize (bool) – Whether to standardize the X and y variables before fitting. The L-BFGS algorithm used to fit the underlying NOTEARS works best on data all of the same scale, so this parameter is recommended.
notears_mlp_kwargs (Optional[Dict]) – Additional arguments for the NOTEARS MLP model.
target_dist_type (Optional[str]) – The distribution type of the target. Uses the same aliases as dist_type_schema.
- Raises
TypeError – if alpha is not numeric.
TypeError – if beta is not numeric.
TypeError – if fit_intercept is not a bool.
TypeError – if threshold is not numeric.
NotImplementedError – if target_dist_type not in supported_types
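The threshold and enforce_dag parameters both prune weak edges from the learned weight matrix. The sketch below illustrates the idea on a plain NumPy array. It is not causalnex's implementation: the helper names (apply_threshold, is_dag, threshold_until_dag), the convention that weights[i, j] is the edge from node i to node j, and the choice to drop edges at or below the threshold are all assumptions made for illustration.

```python
import numpy as np


def apply_threshold(weights: np.ndarray, threshold: float) -> np.ndarray:
    """Zero out edges whose absolute weight is at or below `threshold` (assumed rule)."""
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned


def is_dag(weights: np.ndarray) -> bool:
    """Return True if the nonzero entries of `weights` form an acyclic graph."""
    adjacency = (weights != 0).astype(int)
    n = adjacency.shape[0]
    walks = np.eye(n, dtype=int)
    for _ in range(n):
        # walks[i, j] == 1 when a walk of the current length exists from i to j.
        walks = (walks @ adjacency > 0).astype(int)
        if np.trace(walks) > 0:  # a closed walk means there is a cycle
            return False
    return True


def threshold_until_dag(weights: np.ndarray) -> np.ndarray:
    """Drop the weakest remaining edge(s) repeatedly until the graph is acyclic."""
    pruned = weights.copy()
    while not is_dag(pruned):
        weakest = np.abs(pruned[pruned != 0]).min()
        pruned = apply_threshold(pruned, weakest)
    return pruned


# Hypothetical 3-node weight matrix with a weak back-edge 1 -> 0 that creates a cycle.
weights = np.array([
    [0.00, 0.80, 0.00],
    [0.05, 0.00, 0.60],
    [0.00, 0.00, 0.00],
])

unchanged = apply_threshold(weights, 0.0)   # threshold=0.0 removes nothing
dag_weights = threshold_until_dag(weights)  # removes only the 0.05 edge
```

Because pruning is driven purely by magnitude, the weak 1 -> 0 edge is removed first. This is why the parameter description warns that comparing edge weights is only meaningful when features share similar units.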
-
property coef_
Signed relationship between features and the target. In the linear case this is equivalent to the linear regression coefficients.
- Return type
ndarray
- Returns
The mean effect relationship between nodes.
-
property feature_importances_
Unsigned importances of the features with respect to the target. NOTE: these are used as the graph adjacency matrix.
- Return type
ndarray
- Returns
The L2 relationship between nodes.
-
fit(X, y)
Fits the structure model (sm) using the concatenation of X and y.
- Raises
NotImplementedError – If an unsupported _target_dist_type is provided.
- Return type
DAGRegressor
- Returns
Instance of DAGRegressor.
-
get_edges_to_node(name, data='weight')
Get the edges to a specific node.
- Parameters
name (str) – The name of the node to get weights towards.
data (str) – The edge parameter to get. Default is “weight” to return the adjacency matrix. Set to “mean_effect” to return the signed average effect of features on the target node.
- Return type
Series
- Returns
The specified edge data.
-
get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
dict
-
property intercept_
The bias term from the target node.
- Return type
float
-
plot_dag(enforce_dag=False, plot_structure_kwargs=None, use_mpl=True, ax=None, pixel_size_in=0.01)
Plot the DAG of the fitted model.
- Parameters
enforce_dag (bool) – Whether to threshold the model until it is a DAG. Does not alter the underlying model.
ax (Optional[Axes]) – Matplotlib axes to plot the model on. If None, creates axis.
pixel_size_in (float) – Scaling multiple for the plot.
plot_structure_kwargs (Optional[Dict]) – Dictionary of kwargs for the causalnex plotting module.
use_mpl (bool) – Whether to use matplotlib as the backend. If False, ax and pixel_size_in are ignored.
- Return type
Union[Tuple[Figure, Axes], Image]
- Returns
Plot of the DAG.
-
predict(X)
Uses the fitted NOTEARS algorithm to reconstruct y from known X data.
- Return type
ndarray
- Returns
Predicted y values for each row of X.
-
score(X, y, sample_weight=None)
Return the coefficient of determination \(R^2\) of the prediction.
The coefficient \(R^2\) is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.
- Parameters
X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.
- Returns
score – \(R^2\) of self.predict(X) with respect to y.
- Return type
float
Notes
The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with the default value of r2_score(). This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
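As a worked illustration of the \(R^2\) definition above, the following sketch computes \(1 - \frac{u}{v}\) directly with NumPy for a single-output target. The function name r2_manual and the sample arrays are hypothetical; sample weights and multi-output targets are deliberately left out.

```python
import numpy as np


def r2_manual(y_true, y_pred) -> float:
    """Compute R^2 = 1 - u / v for a single-output target, without sample weights."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    u = ((y_true - y_pred) ** 2).sum()         # residual sum of squares
    v = ((y_true - y_true.mean()) ** 2).sum()  # total sum of squares
    return float(1.0 - u / v)


y_true = np.array([1.0, 2.0, 3.0, 4.0])

perfect = r2_manual(y_true, y_true)                        # 1.0: no residual error
constant = r2_manual(y_true, np.full(4, y_true.mean()))    # 0.0: always predicts the mean
reversed_ = r2_manual(y_true, np.array([4.0, 3.0, 2.0, 1.0]))  # -3.0: worse than the mean
```

The three cases match the statements in the description: a perfect fit scores 1.0, a constant mean prediction scores 0.0, and a sufficiently bad model scores below zero.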
-
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
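The <component>__<parameter> naming convention can be illustrated with a few lines of plain Python. This is only a sketch of how such keys split into a component name and a parameter name, not scikit-learn's own code. The helper split_nested_params and the component name "reg" are hypothetical.

```python
def split_nested_params(params: dict) -> tuple:
    """Separate plain parameters from `<component>__<parameter>` ones."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first double underscore only.
        component, delimiter, sub_key = key.partition("__")
        if delimiter:
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params(
    {"threshold": 0.1, "reg__alpha": 0.5, "reg__beta": 0.2}
)
# own holds parameters set on the object itself;
# nested groups the remaining ones by the component they target.
```

Here "threshold" would be set on the outer object, while "alpha" and "beta" would be forwarded to the component registered under the name "reg".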