causalnex.discretiser.MDLPSupervisedDiscretiserMethod
class causalnex.discretiser.MDLPSupervisedDiscretiserMethod(mdlp_args=None)
Bases: causalnex.discretiser.abstract_discretiser.AbstractSupervisedDiscretiserMethod
Allows discretisation of continuous features using the MDLP (minimum description length principle) algorithm.
Example:
    >>> import pandas as pd
    >>> import numpy as np
    >>> from causalnex.discretiser.discretiser_strategy import MDLPSupervisedDiscretiserMethod
    >>> from sklearn.datasets import load_iris
    >>> iris = load_iris()
    >>> X, y = iris["data"], iris["target"]
    >>> names = iris["feature_names"]
    >>> data = pd.DataFrame(X, columns=names)
    >>> data["target"] = y
    >>> discretiser = MDLPSupervisedDiscretiserMethod(
    ...     {"min_depth": 0, "random_state": 2020, "min_split": 1e-3, "dtype": int}
    ... )
    >>> discretiser.fit(
    ...     feat_names=["sepal length (cm)"],
    ...     dataframe=data,
    ...     target="target",
    ...     target_continuous=False,
    ... )
    >>> discretised_data = discretiser.transform(data[["sepal length (cm)"]])
    >>> discretised_data.values.ravel()
    array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0,
           0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           2, 2, 2, 1, 2, 1, 2, 0, 2, 0, 0, 2, 2, 2, 1, 2, 1, 1, 2, 1, 2, 2, 2, 2, 2,
           2, 2, 2, 2, 1, 1, 1, 1, 2, 0, 2, 2, 2, 1, 1, 1, 2, 1, 0, 1, 1, 1, 2, 0, 1,
           2, 1, 2, 2, 2, 2, 0, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2,
           2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2])
Attributes
Methods
MDLPSupervisedDiscretiserMethod.__delattr__(name, /) – Implement delattr(self, name).
MDLPSupervisedDiscretiserMethod.__dir__() – Default dir() implementation.
MDLPSupervisedDiscretiserMethod.__eq__(value, /) – Return self==value.
MDLPSupervisedDiscretiserMethod.__format__(…) – Default object formatter.
MDLPSupervisedDiscretiserMethod.__ge__(value, /) – Return self>=value.
MDLPSupervisedDiscretiserMethod.__getattribute__(name, /) – Return getattr(self, name).
MDLPSupervisedDiscretiserMethod.__getstate__()
MDLPSupervisedDiscretiserMethod.__gt__(value, /) – Return self>value.
MDLPSupervisedDiscretiserMethod.__hash__() – Return hash(self).
MDLPSupervisedDiscretiserMethod.__init__(mdlp_args=None) – This method of discretisation applies MDLP to discretise the data.
MDLPSupervisedDiscretiserMethod.__init_subclass__(…) – Set the set_{method}_request methods.
MDLPSupervisedDiscretiserMethod.__le__(value, /) – Return self<=value.
MDLPSupervisedDiscretiserMethod.__lt__(value, /) – Return self<value.
MDLPSupervisedDiscretiserMethod.__ne__(value, /) – Return self!=value.
MDLPSupervisedDiscretiserMethod.__new__(**kwargs) – Create and return a new object.
MDLPSupervisedDiscretiserMethod.__reduce__() – Helper for pickle.
MDLPSupervisedDiscretiserMethod.__reduce_ex__(…) – Helper for pickle.
MDLPSupervisedDiscretiserMethod.__repr__([…]) – Return repr(self).
MDLPSupervisedDiscretiserMethod.__setattr__(…) – Implement setattr(self, name, value).
MDLPSupervisedDiscretiserMethod.__setstate__(state)
MDLPSupervisedDiscretiserMethod.__sizeof__() – Size of object in memory, in bytes.
MDLPSupervisedDiscretiserMethod.__sklearn_clone__()
MDLPSupervisedDiscretiserMethod.__str__() – Return str(self).
MDLPSupervisedDiscretiserMethod.__subclasshook__ – Abstract classes can override this to customize issubclass().
MDLPSupervisedDiscretiserMethod._build_request_for_signature(…) – Build the MethodMetadataRequest for a method using its signature.
MDLPSupervisedDiscretiserMethod._check_feature_names(X, …) – Set or check the feature_names_in_ attribute.
MDLPSupervisedDiscretiserMethod._check_n_features(X, …) – Set the n_features_in_ attribute, or check against it.
MDLPSupervisedDiscretiserMethod._get_default_requests() – Collect default request values.
MDLPSupervisedDiscretiserMethod._get_metadata_request() – Get requested data properties.
MDLPSupervisedDiscretiserMethod._get_param_names() – Get parameter names for the estimator.
MDLPSupervisedDiscretiserMethod._get_tags()
MDLPSupervisedDiscretiserMethod._more_tags()
MDLPSupervisedDiscretiserMethod._repr_html_inner() – This function is returned by the @property _repr_html_ to make hasattr(estimator, "_repr_html_") return True or False depending on get_config()["display"].
MDLPSupervisedDiscretiserMethod._repr_mimebundle_(…) – Mime bundle used by jupyter kernels to display estimator.
MDLPSupervisedDiscretiserMethod._transform_one_column(…) – Given one “original” feature (continuous), discretise it.
MDLPSupervisedDiscretiserMethod._validate_data([…]) – Validate input data and set or check the n_features_in_ attribute.
MDLPSupervisedDiscretiserMethod._validate_params() – Validate types and values of constructor parameters.
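MDLPSupervisedDiscretiserMethod.fit(feat_names, target, dataframe, target_continuous) – The fit method allows MDLP to learn split thresholds from the input data.
MDLPSupervisedDiscretiserMethod.fit_transform(*args, **kwargs) – Raises NotImplementedError: fit_transform is not implemented.
MDLPSupervisedDiscretiserMethod.get_metadata_routing() – Get metadata routing of this object.
MDLPSupervisedDiscretiserMethod.get_params(deep=True) – Get parameters for this estimator.
MDLPSupervisedDiscretiserMethod.set_fit_request(…) – Request metadata passed to the fit method.
MDLPSupervisedDiscretiserMethod.set_params(**params) – Set the parameters of this estimator.
MDLPSupervisedDiscretiserMethod.set_transform_request(…) – Request metadata passed to the transform method.
MDLPSupervisedDiscretiserMethod.transform(data) – Given one “original” dataframe, discretise it.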
__init__(mdlp_args=None)
This method of discretisation applies MDLP to discretise the data.
- Parameters
mdlp_args (dict) – keyword arguments passed on as parameters to mdlp.discretization.MDLP, including:
min_depth – The minimum depth of the interval splitting.
min_split – The minimum size to split a bin.
dtype – The type of the array returned by the transform() method.
- Raises
ImportError – if mdlp-discretization is not installed successfully
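Because the constructor raises ImportError when mdlp-discretization is missing, it can help to check for the dependency up front. The snippet below is a minimal sketch using only the standard library; the helper name `mdlp_is_installed` is hypothetical, and it assumes the package is importable as `mdlp`, as the reference to mdlp.discretization.MDLP above suggests.

```python
import importlib.util


def mdlp_is_installed() -> bool:
    """Hypothetical helper: True if the `mdlp` module can be located."""
    # find_spec returns None for a top-level module that is not installed.
    return importlib.util.find_spec("mdlp") is not None


# Argument dictionary with the same keys as the example on this page.
mdlp_args = {"min_depth": 0, "random_state": 2020, "min_split": 1e-3, "dtype": int}

if mdlp_is_installed():
    print("mdlp found: the constructor should not raise ImportError")
else:
    print("mdlp missing: expect ImportError from the constructor")
```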
fit(feat_names, target, dataframe, target_continuous)
The fit method allows MDLP to learn split thresholds from the input data. The target feature cannot be continuous.
- Parameters
feat_names (List[str]) – a list of features to be discretised
target (str) – name of the variable that is going to be used as the target for MDLP
dataframe (pd.DataFrame) – pandas dataframe of input data
target_continuous (bool) – boolean that indicates whether the target variable is continuous
- Returns
MDLPSupervisedDiscretiserMethod object with learned split thresholds from mdlp algorithm
- Return type
self
- Raises
ValueError – if the target is continuous
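To give a feel for how split thresholds can be learned from a discrete target, the sketch below applies the MDLP acceptance criterion as it is commonly described (Fayyad and Irani) to a single candidate cut. It is not the code used by causalnex or mdlp.discretization.MDLP; the function names `entropy`, `mdlp_accepts` and `best_cut` are hypothetical, and a full implementation would recurse on both sides of each accepted cut.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mdlp_accepts(left, right):
    """True if splitting into `left` and `right` passes the MDLP criterion."""
    whole = list(left) + list(right)
    n = len(whole)
    ent, ent_l, ent_r = entropy(whole), entropy(left), entropy(right)
    gain = ent - (len(left) / n) * ent_l - (len(right) / n) * ent_r
    k, k_l, k_r = len(set(whole)), len(set(left)), len(set(right))
    delta = math.log2(3 ** k - 2) - (k * ent - k_l * ent_l - k_r * ent_r)
    return gain > (math.log2(n - 1) + delta) / n


def best_cut(values, labels):
    """Return the lowest-entropy cut point if MDLP accepts it, else None."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    best = None  # (weighted entropy, split index)
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue  # no boundary between equal values
        score = (i * entropy(ys[:i]) + (len(ys) - i) * entropy(ys[i:])) / len(ys)
        if best is None or score < best[0]:
            best = (score, i)
    if best is None or not mdlp_accepts(ys[:best[1]], ys[best[1]:]):
        return None
    i = best[1]
    return (xs[i - 1] + xs[i]) / 2  # midpoint between neighbouring values
```

With cleanly separated classes the cut is accepted, whereas alternating labels give too little information gain to justify a split. This also illustrates why the target must be discrete: the criterion is built on class entropies.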
fit_transform(*args, **kwargs)
- Raises
NotImplementedError – fit_transform is not implemented
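Since fit_transform is not available, the two steps have to be called separately. The helper below is a minimal sketch of that pattern; `fit_then_transform` is a hypothetical name, and it only assumes the fit and transform signatures documented on this page and a dataframe that supports column selection with a list of names.

```python
def fit_then_transform(discretiser, feat_names, target, dataframe,
                       target_continuous=False):
    """Hypothetical helper: fit on the dataframe, then transform the features."""
    discretiser.fit(
        feat_names=feat_names,
        target=target,
        dataframe=dataframe,
        target_continuous=target_continuous,
    )
    # Select only the columns that were fitted, as in the example on this page.
    return discretiser.transform(dataframe[feat_names])


# Usage, with the names from the example on this page:
# discretised = fit_then_transform(discretiser, ["sepal length (cm)"], "target", data)
```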
get_metadata_routing()
Get metadata routing of this object.
Please check the scikit-learn User Guide on how the routing mechanism works.
- Returns
routing – A MetadataRequest encapsulating routing information.
- Return type
MetadataRequest
get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
dict
set_fit_request(*, dataframe: Union[bool, None, str] = '$UNCHANGED$', feat_names: Union[bool, None, str] = '$UNCHANGED$', target: Union[bool, None, str] = '$UNCHANGED$', target_continuous: Union[bool, None, str] = '$UNCHANGED$') → causalnex.discretiser.discretiser_strategy.MDLPSupervisedDiscretiserMethod
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the scikit-learn User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a pipeline.Pipeline. Otherwise it has no effect.
- Parameters
dataframe (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for dataframe parameter in fit.
feat_names (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for feat_names parameter in fit.
target (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for target parameter in fit.
target_continuous (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for target_continuous parameter in fit.
- Returns
self – The updated object.
- Return type
object
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
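The <component>__<parameter> naming convention can be illustrated with a few lines of plain Python. This is only a sketch of how such keys separate into an estimator's own parameters and those of nested components, not scikit-learn's implementation; the function name `split_nested_params` and the component name `step` are hypothetical.

```python
from collections import defaultdict


def split_nested_params(params):
    """Split keys of the form <component>__<parameter> from plain keys."""
    own = {}
    nested = defaultdict(dict)
    for key, value in params.items():
        # Split at the first double underscore only.
        component, sep, sub_key = key.partition("__")
        if sep:
            nested[component][sub_key] = value
        else:
            own[key] = value
    return own, dict(nested)


own, nested = split_nested_params(
    {"mdlp_args": {"min_depth": 1}, "step__mdlp_args": {"min_depth": 2}}
)
print(own)     # parameters addressed to the estimator itself
print(nested)  # parameters addressed to the component named "step"
```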
set_transform_request(*, data: Union[bool, None, str] = '$UNCHANGED$') → causalnex.discretiser.discretiser_strategy.MDLPSupervisedDiscretiserMethod
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the scikit-learn User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a pipeline.Pipeline. Otherwise it has no effect.
- Parameters
data (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for data parameter in transform.
- Returns
self – The updated object.
- Return type
object
transform(data)
Given one “original” dataframe, discretise it.
- Parameters
data (DataFrame) – dataframe with continuous features, to be transformed into discrete features
- Returns
discretised version of the input data
- Return type
array
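Conceptually, discretising a column means mapping each continuous value to the index of the interval it falls into, given the learned split thresholds. The sketch below shows that idea with the standard library only. The thresholds are made-up values rather than anything learned by MDLP, `apply_thresholds` is a hypothetical name, and placing a value equal to a threshold in the upper bin is an assumption, not documented behaviour of transform.

```python
from bisect import bisect_right


def apply_thresholds(values, thresholds):
    """Map each value to the index of the bin defined by the sorted thresholds."""
    cuts = sorted(thresholds)
    # bisect_right counts how many cut points are <= the value.
    return [bisect_right(cuts, v) for v in values]


# Hypothetical cut points for illustration only.
print(apply_thresholds([5.0, 5.5, 6.0, 7.0], [5.5, 6.2]))  # [0, 1, 1, 2]
```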