causalnex.structure.DAGRegressor
class causalnex.structure.DAGRegressor(dist_type_schema=None, alpha=0.0, beta=0.0, fit_intercept=True, hidden_layer_units=None, threshold=0.0, tabu_edges=None, tabu_parent_nodes=None, tabu_child_nodes=None, dependent_target=True, enforce_dag=False, standardize=False, target_dist_type=None, notears_mlp_kwargs=None)
Bases: sklearn.base.RegressorMixin, causalnex.structure.pytorch.sklearn._base.DAGBase
Regressor wrapper of the StructureModel. Implements the sklearn .fit and .predict interface.
Example:
>>> from causalnex.structure import DAGRegressor
>>> reg = DAGRegressor(threshold=0.1)
>>> reg.fit(X_train, y_train)
>>>
>>> y_preds = reg.predict(X_test)
>>> type(y_preds)
<class 'numpy.ndarray'>
>>>
>>> type(reg.feature_importances_)
<class 'numpy.ndarray'>
feature_importances_
An array of edge weights corresponding positionally to the features in X.
- Type
np.ndarray
-
coef_
An array of edge weights corresponding positionally to the features in X.
- Type
np.ndarray
-
intercept_
The target node bias value.
- Type
float
Attributes
DAGRegressor.coef_
Signed relationship between features and the target.
DAGRegressor.feature_importances_
Unsigned importances of the features with respect to the target.
DAGRegressor.intercept_
The bias term from the target node.
Methods
DAGRegressor.__delattr__
(name, /) Implement delattr(self, name).
DAGRegressor.__dir__
() Default dir() implementation.
DAGRegressor.__eq__
(value, /) Return self==value.
DAGRegressor.__format__
(format_spec, /) Default object formatter.
DAGRegressor.__ge__
(value, /) Return self>=value.
DAGRegressor.__getattribute__
(name, /) Return getattr(self, name).
DAGRegressor.__getstate__
()
DAGRegressor.__gt__
(value, /) Return self>value.
DAGRegressor.__hash__
() Return hash(self).
DAGRegressor.__init__
([dist_type_schema, …])
DAGRegressor.__init_subclass__
(**kwargs) Set the set_{method}_request methods.
DAGRegressor.__le__
(value, /) Return self<=value.
DAGRegressor.__lt__
(value, /) Return self<value.
DAGRegressor.__ne__
(value, /) Return self!=value.
DAGRegressor.__new__
(**kwargs) Create and return a new object.
DAGRegressor.__reduce__
() Helper for pickle.
DAGRegressor.__reduce_ex__
(protocol, /) Helper for pickle.
DAGRegressor.__repr__
([N_CHAR_MAX]) Return repr(self).
DAGRegressor.__setattr__
(name, value, /) Implement setattr(self, name, value).
DAGRegressor.__setstate__
(state)
DAGRegressor.__sizeof__
() Size of object in memory, in bytes.
DAGRegressor.__sklearn_clone__
()
DAGRegressor.__str__
() Return str(self).
DAGRegressor.__subclasshook__
Abstract classes can override this to customize issubclass().
DAGRegressor._build_request_for_signature
(…) Build the MethodMetadataRequest for a method using its signature.
DAGRegressor._check_feature_names
(X, *, reset) Set or check the feature_names_in_ attribute.
DAGRegressor._check_n_features
(X, reset) Set the n_features_in_ attribute, or check against it.
DAGRegressor._get_default_requests
() Collect default request values.
DAGRegressor._get_metadata_request
() Get requested data properties.
DAGRegressor._get_param_names
() Get parameter names for the estimator.
DAGRegressor._get_tags
()
DAGRegressor._more_tags
()
DAGRegressor._repr_html_inner
() This function is returned by the @property _repr_html_ to make hasattr(estimator, "_repr_html_") return True or False depending on get_config()["display"].
DAGRegressor._repr_mimebundle_
(**kwargs) Mime bundle used by jupyter kernels to display estimator.
DAGRegressor._validate_data
([X, y, reset, …]) Validate input data and set or check the n_features_in_ attribute.
DAGRegressor._validate_params
() Validate types and values of constructor parameters.
DAGRegressor.fit
(X, y) Fits the sm model using the concat of X and y.
DAGRegressor.get_edges_to_node
(name[, data]) Get the edges to a specific node.
DAGRegressor.get_metadata_routing
() Get metadata routing of this object.
DAGRegressor.get_params
([deep]) Get parameters for this estimator.
DAGRegressor.plot_dag
(output_filename[, …]) Plot the DAG of the fitted model.
DAGRegressor.predict
(X) Uses the fitted NOTEARS algorithm to reconstruct y from known X data.
DAGRegressor.score
(X, y[, sample_weight]) Return the coefficient of determination of the prediction.
DAGRegressor.set_params
(**params) Set the parameters of this estimator.
DAGRegressor.set_score_request
(*[, …]) Request metadata passed to the score method.
-
__init__
(dist_type_schema=None, alpha=0.0, beta=0.0, fit_intercept=True, hidden_layer_units=None, threshold=0.0, tabu_edges=None, tabu_parent_nodes=None, tabu_child_nodes=None, dependent_target=True, enforce_dag=False, standardize=False, target_dist_type=None, notears_mlp_kwargs=None)
- Parameters
dist_type_schema (Optional[Dict[Union[str, int], str]]) – The dist type schema corresponding to the X data passed to fit or predict. It maps the pandas column name in X to the string alias of a dist type. If X is a np.ndarray, it maps the positional index to the string alias of a dist type. A list of alias names can be found in dist_type/__init__.py. If None, assumes that all data in X is continuous.
alpha (float) – l1 loss weighting. When using nonlinear layers this is only applied to the first layer.
beta (float) – l2 loss weighting. Applied across all layers. Recommended to use this when fitting nonlinearities.
fit_intercept (bool) – Whether to fit an intercept in the structure model equation. Use this if variables are offset.
hidden_layer_units (Optional[Iterable[int]]) – An iterable whose length determines the number of layers used, and whose numbers determine the number of nodes used for each layer, in order.
threshold (float) – The thresholding to apply to the DAG weights. If 0.0, does not apply any threshold.
tabu_edges (Optional[List]) – Tabu edges passed directly to the NOTEARS algorithm.
tabu_parent_nodes (Optional[List]) – Tabu nodes passed directly to the NOTEARS algorithm.
tabu_child_nodes (Optional[List]) – Tabu nodes passed directly to the NOTEARS algorithm.
dependent_target (bool) – If True, constrains NOTEARS so that y can only be dependent.
enforce_dag (bool) – If True, thresholds the graph until it is a DAG. NOTE: a properly trained model should be a DAG, and failure indicates other issues. Use of this is only recommended if features have similar units, otherwise comparing edge weight magnitude has limited meaning.
standardize (bool) – Whether to standardize the X and y variables before fitting. The L-BFGS algorithm used to fit the underlying NOTEARS works best on data all of the same scale, so this parameter is recommended.
notears_mlp_kwargs (Optional[Dict]) – Additional arguments for the NOTEARS MLP model.
target_dist_type (Optional[str]) – The distribution type of the target. Uses the same aliases as dist_type_schema.
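The two keying conventions for dist_type_schema (column names for pandas input, positional indices for np.ndarray input) can be illustrated in plain Python. This is only a sketch: the column names, the helper schema_to_positional, and the alias strings "cont" and "bin" are assumptions made for the example and are not taken from this page; the accepted aliases are the ones listed in dist_type/__init__.py.

```python
from typing import Dict, List, Union

# Hypothetical column names and alias strings, used only for illustration.
# Check dist_type/__init__.py for the aliases your causalnex version accepts.
columns: List[str] = ["age", "income", "is_member"]
schema_by_name: Dict[Union[str, int], str] = {
    "age": "cont",
    "income": "cont",
    "is_member": "bin",
}


def schema_to_positional(
    schema: Dict[Union[str, int], str], column_order: List[str]
) -> Dict[int, str]:
    """Re-key a column-name schema by positional index (the np.ndarray form)."""
    positional: Dict[int, str] = {}
    for key, alias in schema.items():
        # Integer keys are already positional; string keys are looked up by name.
        index = key if isinstance(key, int) else column_order.index(key)
        positional[index] = alias
    return positional


schema_by_index = schema_to_positional(schema_by_name, columns)
print(schema_by_index)  # {0: 'cont', 1: 'cont', 2: 'bin'}
```

The first dictionary is the shape one would pass alongside a DataFrame, the second alongside an array holding the same columns in the same order.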
- Raises
TypeError – if alpha is not numeric.
TypeError – if beta is not numeric.
TypeError – if fit_intercept is not a bool.
TypeError – if threshold is not numeric.
NotImplementedError – if target_dist_type is not in supported_types.
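To make the difference between threshold and enforce_dag concrete, the following sketch works on a small dictionary of edge weights. It is not the causalnex implementation: the function names, the rule of dropping edges whose absolute weight is below the threshold, and the rule of removing the weakest edge until no cycle remains are all assumptions chosen to mirror the parameter descriptions above.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]  # (parent, child)


def has_cycle(edges: Dict[Edge, float]) -> bool:
    """Return True if the directed graph defined by `edges` contains a cycle."""
    nodes = {node for edge in edges for node in edge}
    indegree = {node: 0 for node in nodes}
    for _, child in edges:
        indegree[child] += 1
    # Kahn's algorithm: repeatedly peel off nodes with no incoming edges.
    queue = [node for node in nodes if indegree[node] == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    queue.append(child)
    # Nodes that were never peeled off sit on a cycle.
    return visited != len(nodes)


def drop_below(edges: Dict[Edge, float], threshold: float) -> Dict[Edge, float]:
    """Keep only edges whose absolute weight is at least `threshold`."""
    return {edge: w for edge, w in edges.items() if abs(w) >= threshold}


def threshold_until_dag(edges: Dict[Edge, float]) -> Dict[Edge, float]:
    """Remove the weakest remaining edge until the graph is acyclic."""
    pruned = dict(edges)
    while has_cycle(pruned):
        weakest = min(pruned, key=lambda edge: abs(pruned[edge]))
        del pruned[weakest]
    return pruned


# Hypothetical learned weights: a -> b -> y, a -> y, plus a weak back edge y -> a.
weights = {("a", "b"): 0.8, ("b", "y"): -0.5, ("y", "a"): 0.05, ("a", "y"): 0.3}

print(drop_below(weights, 0.0) == weights)  # True: a threshold of 0.0 keeps every edge
print(sorted(threshold_until_dag(weights)))  # [('a', 'b'), ('a', 'y'), ('b', 'y')]
```

Because the pruning compares raw weight magnitudes across edges, it only makes sense when features share a scale, which is the caveat stated for enforce_dag.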
-
property
coef_
Signed relationship between features and the target. For the linear case this is equivalent to linear regression coefficients.
- Return type
ndarray
- Returns
The mean effect relationship between nodes.
-
property
feature_importances_
Unsigned importances of the features with respect to the target. NOTE: these are used as the graph adjacency matrix.
- Return type
ndarray
- Returns
The L2 relationship between nodes.
-
fit
(X, y) Fits the StructureModel (sm) using the concatenation of X and y.
- Raises
NotImplementedError – If an unsupported _target_dist_type is provided.
- Return type
DAGRegressor
- Returns
Instance of DAGRegressor.
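As a rough picture of what "the concatenation of X and y" means, the sketch below joins a feature table and a target vector into one table, so that the target becomes one more column (and hence one more node) for structure learning. The helper name and the choice to append the target as the last column are assumptions made for illustration, not a description of the library's internals.

```python
from typing import List


def concat_features_and_target(
    X: List[List[float]], y: List[float]
) -> List[List[float]]:
    """Append each target value to its feature row (hypothetical helper)."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")
    return [row + [target] for row, target in zip(X, y)]


X_rows = [[1.0, 2.0], [3.0, 4.0]]
y_vals = [10.0, 20.0]

combined = concat_features_and_target(X_rows, y_vals)
print(combined)  # [[1.0, 2.0, 10.0], [3.0, 4.0, 20.0]]
```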
-
get_edges_to_node
(name, data='weight') Get the edges to a specific node.
- Parameters
name (str) – The name of the node to get weights towards.
data (str) – The edge parameter to get. Default is "weight" to return the adjacency matrix. Set to "mean_effect" to return the signed average effect of features on the target node.
- Return type
Series
- Returns
The specified edge data.
-
get_metadata_routing
() Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns
routing – A MetadataRequest encapsulating routing information.
- Return type
MetadataRequest
-
get_params
(deep=True) Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
dict
-
property
intercept_
The bias term from the target node.
- Return type
float
-
plot_dag
(output_filename, enforce_dag=False, plot_structure_kwargs=None, layout_kwargs=None) Plot the DAG of the fitted model.
- Parameters
output_filename (str) – If provided, write HTML to a given path, e.g. "./plot.html".
enforce_dag (bool) – Whether to threshold the model until it is a DAG. Does not alter the underlying model.
plot_structure_kwargs (Optional[Dict[str, Dict]]) – Dictionary of kwargs for the causalnex plotting module.
layout_kwargs (Optional[Dict[str, Dict]]) – Dictionary to set the layout and physics of the graph. Example:
    layout_kwargs = {
        "physics": {"solver": "repulsion"},
        "layout": {"hierarchical": {"enabled": True}},
    }
- Return type
IFrame
- Returns
Plot of the DAG with the proper encoding to run on Windows machines.
-
predict
(X) Uses the fitted NOTEARS algorithm to reconstruct y from known X data.
- Return type
ndarray
- Returns
Predicted y values for each row of X.
-
score
(X, y, sample_weight=None) Return the coefficient of determination of the prediction.
The coefficient of determination \(R^2\) is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a \(R^2\) score of 0.0.
- Parameters
X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.
- Returns
score – \(R^2\) of self.predict(X) w.r.t. y.
- Return type
float
Notes
The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with the default value of r2_score(). This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
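The definition of \(R^2\) given for score can be checked by hand with a few lines of plain Python. The function name r2_score_sketch and the sample values are made up for illustration; the arithmetic follows the formula \(1 - \frac{u}{v}\) stated above, for the single-output, unweighted case only.

```python
from typing import Sequence


def r2_score_sketch(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Compute 1 - u/v for one output, without sample weights."""
    mean_true = sum(y_true) / len(y_true)
    # u: residual sum of squares, v: total sum of squares.
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    v = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - u / v


y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r2_score_sketch(y_true, y_true)                 # exact predictions
constant = r2_score_sketch(y_true, [2.5, 2.5, 2.5, 2.5])  # always predicts the mean
worse = r2_score_sketch(y_true, [4.0, 3.0, 2.0, 1.0])     # reversed predictions

print(perfect, constant, worse)  # 1.0 0.0 -3.0
```

The three cases line up with the text: a perfect fit scores 1.0, predicting the mean of y scores 0.0, and a model worse than the mean goes negative.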
-
set_params
(**params) Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
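The <component>__<parameter> naming used by set_params can be pictured as splitting each key at its first double underscore. The sketch below only illustrates that naming convention and is not scikit-learn's code; the helper name and the step name "reg" are hypothetical.

```python
from typing import Any, Dict, Tuple


def split_nested_params(
    params: Dict[str, Any]
) -> Tuple[Dict[str, Any], Dict[str, Dict[str, Any]]]:
    """Separate top-level parameters from <component>__<parameter> ones."""
    own: Dict[str, Any] = {}
    nested: Dict[str, Dict[str, Any]] = {}
    for key, value in params.items():
        # partition() splits at the first "__" only, so deeper nesting
        # such as "a__b__c" stays intact in the remainder "b__c".
        name, delimiter, sub_key = key.partition("__")
        if delimiter:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


# "reg" stands for a hypothetical pipeline step holding a DAGRegressor.
own, nested = split_nested_params(
    {"verbose": True, "reg__alpha": 0.5, "reg__threshold": 0.1}
)
print(own)     # {'verbose': True}
print(nested)  # {'reg': {'alpha': 0.5, 'threshold': 0.1}}
```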
-
set_score_request
(*, sample_weight: Union[bool, None, str] = '$UNCHANGED$') → causalnex.structure.pytorch.sklearn.reg.DAGRegressor
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a pipeline.Pipeline. Otherwise it has no effect.
- Parameters
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns
self – The updated object.
- Return type
object