causalnex.discretiser.MDLPSupervisedDiscretiserMethod

class causalnex.discretiser.MDLPSupervisedDiscretiserMethod(mdlp_args=None)[source]

Bases: causalnex.discretiser.abstract_discretiser.AbstractSupervisedDiscretiserMethod

Allows discretisation of continuous features using the MDLP algorithm.

Example:

 import pandas as pd
 import numpy as np
 from causalnex.discretiser.discretiser_strategy import MDLPSupervisedDiscretiserMethod
 from sklearn.datasets import load_iris
 iris = load_iris()
 X, y = iris["data"], iris["target"]
 names = iris["feature_names"]
 data = pd.DataFrame(X, columns=names)
 data["target"] = y
 discretiser = MDLPSupervisedDiscretiserMethod(
     {"min_depth": 0, "random_state": 2020, "min_split": 1e-3, "dtype": int}
 )
 discretiser.fit(
     feat_names=["sepal length (cm)"],
     dataframe=data,
     target="target",
     target_continuous=False,
 )
 discretised_data = discretiser.transform(data[["sepal length (cm)"]])
 discretised_data.values.ravel()

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 2, 2, 2, 1, 2, 1, 2, 0, 2, 0, 0, 2, 2, 2, 1, 2,
       1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 2, 0, 2, 2, 2,
       1, 1, 1, 2, 1, 0, 1, 1, 1, 2, 0, 1, 2, 1, 2, 2, 2, 2, 0, 2, 2, 2,
       2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2])

Attributes

Methods

MDLPSupervisedDiscretiserMethod.__delattr__(name, /)

Implement delattr(self, name).

MDLPSupervisedDiscretiserMethod.__dir__()

Default dir() implementation.

MDLPSupervisedDiscretiserMethod.__eq__(value, /)

Return self==value.

MDLPSupervisedDiscretiserMethod.__format__(…)

Default object formatter.

MDLPSupervisedDiscretiserMethod.__ge__(value, /)

Return self>=value.

MDLPSupervisedDiscretiserMethod.__getattribute__(name, /)

Return getattr(self, name).

MDLPSupervisedDiscretiserMethod.__getstate__()

MDLPSupervisedDiscretiserMethod.__gt__(value, /)

Return self>value.

MDLPSupervisedDiscretiserMethod.__hash__()

Return hash(self).

MDLPSupervisedDiscretiserMethod.__init__([…])

This method of discretisation applies MDLP to discretise the data

MDLPSupervisedDiscretiserMethod.__init_subclass__(…)

Set the set_{method}_request methods.

MDLPSupervisedDiscretiserMethod.__le__(value, /)

Return self<=value.

MDLPSupervisedDiscretiserMethod.__lt__(value, /)

Return self<value.

MDLPSupervisedDiscretiserMethod.__ne__(value, /)

Return self!=value.

MDLPSupervisedDiscretiserMethod.__new__(**kwargs)

Create and return a new object.

MDLPSupervisedDiscretiserMethod.__reduce__()

Helper for pickle.

MDLPSupervisedDiscretiserMethod.__reduce_ex__(…)

Helper for pickle.

MDLPSupervisedDiscretiserMethod.__repr__([…])

Return repr(self).

MDLPSupervisedDiscretiserMethod.__setattr__(…)

Implement setattr(self, name, value).

MDLPSupervisedDiscretiserMethod.__setstate__(state)

MDLPSupervisedDiscretiserMethod.__sizeof__()

Size of object in memory, in bytes.

MDLPSupervisedDiscretiserMethod.__sklearn_clone__()

MDLPSupervisedDiscretiserMethod.__str__()

Return str(self).

MDLPSupervisedDiscretiserMethod.__subclasshook__

Abstract classes can override this to customize issubclass().

MDLPSupervisedDiscretiserMethod._build_request_for_signature(…)

Build the MethodMetadataRequest for a method using its signature.

MDLPSupervisedDiscretiserMethod._check_feature_names(X, …)

Set or check the feature_names_in_ attribute.

MDLPSupervisedDiscretiserMethod._check_n_features(X, …)

Set the n_features_in_ attribute, or check against it.

MDLPSupervisedDiscretiserMethod._get_default_requests()

Collect default request values.

MDLPSupervisedDiscretiserMethod._get_metadata_request()

Get requested data properties.

MDLPSupervisedDiscretiserMethod._get_param_names()

Get parameter names for the estimator

MDLPSupervisedDiscretiserMethod._get_tags()

MDLPSupervisedDiscretiserMethod._more_tags()

MDLPSupervisedDiscretiserMethod._repr_html_inner()

This function is returned by the @property _repr_html_ to make hasattr(estimator, "_repr_html_") return True or False depending on get_config()["display"].

MDLPSupervisedDiscretiserMethod._repr_mimebundle_(…)

Mime bundle used by jupyter kernels to display estimator

MDLPSupervisedDiscretiserMethod._transform_one_column(…)

Given one “original” feature (continuous), discretise it.

MDLPSupervisedDiscretiserMethod._validate_data([…])

Validate input data and set or check the n_features_in_ attribute.

MDLPSupervisedDiscretiserMethod._validate_params()

Validate types and values of constructor parameters

MDLPSupervisedDiscretiserMethod.fit(…)

The fit method allows MDLP to learn split thresholds from the input data.

MDLPSupervisedDiscretiserMethod.fit_transform(…)

Raises NotImplementedError; fit_transform is not implemented.

MDLPSupervisedDiscretiserMethod.get_metadata_routing()

Get metadata routing of this object.

MDLPSupervisedDiscretiserMethod.get_params([deep])

Get parameters for this estimator.

MDLPSupervisedDiscretiserMethod.set_fit_request(*)

Request metadata passed to the fit method.

MDLPSupervisedDiscretiserMethod.set_params(…)

Set the parameters of this estimator.

MDLPSupervisedDiscretiserMethod.set_transform_request(*)

Request metadata passed to the transform method.

MDLPSupervisedDiscretiserMethod.transform(data)

Given one “original” dataframe, discretise it.

__init__(mdlp_args=None)[source]

This method of discretisation applies MDLP to discretise the data

Parameters
  • min_depth – The minimum depth of the interval splitting.

  • min_split – The minimum size to split a bin.

  • dtype – The type of the array returned by the transform() method.

  • mdlp_args – dictionary of keyword arguments (such as min_depth, min_split and dtype above) used as parameters for mdlp.discretization.MDLP

Raises

ImportError – if mdlp-discretization is not installed successfully

fit(feat_names, target, dataframe, target_continuous)[source]

The fit method allows MDLP to learn split thresholds from the input data. The target feature cannot be continuous.

Parameters
  • feat_names (List[str]) – a list of features to be discretised

  • target (str) – name of the variable to be used as the target for MDLP

  • dataframe (pd.DataFrame) – pandas dataframe of input data

  • target_continuous (bool) – boolean that indicates if target variable is continuous.

Returns

MDLPSupervisedDiscretiserMethod object with learned split thresholds from mdlp algorithm

Return type

self

Raises

ValueError – if the target is continuous
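To give a sense of what "learning split thresholds" with MDLP involves, the following is a minimal sketch of a single split decision using the Fayyad and Irani MDLP acceptance criterion. It is an illustration under stated assumptions, not the code used by causalnex or mdlp-discretization: the function names entropy and best_mdlp_cut are hypothetical, and a full discretiser would apply the same step recursively to each side of an accepted cut.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_mdlp_cut(values, labels):
    """Return the entropy-minimising cut if MDLP accepts it, else None.

    Hypothetical helper for illustration only.
    """
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    ys = [label for _, label in pairs]
    n = len(ys)
    if n < 2:
        return None

    ent_s = entropy(ys)
    best = None  # (weighted entropy, cut point, left labels, right labels)
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # a cut is only possible between distinct values
        left, right = ys[:i], ys[i:]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if best is None or weighted < best[0]:
            best = (weighted, (xs[i - 1] + xs[i]) / 2, left, right)
    if best is None:
        return None

    weighted, cut, left, right = best
    gain = ent_s - weighted
    # Number of distinct classes in the whole set and in each side of the cut.
    k, k1, k2 = len(set(ys)), len(set(left)), len(set(right))
    delta = log2(3 ** k - 2) - (k * ent_s - k1 * entropy(left) - k2 * entropy(right))
    # MDLP criterion: accept the cut only if the gain exceeds the coding cost.
    threshold = (log2(n - 1) + delta) / n
    return cut if gain > threshold else None
```

With two well-separated classes the cut between them is accepted; with labels that alternate along the feature, no cut passes the criterion and the feature stays in a single bin.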

fit_transform(*args, **kwargs)

Raises

NotImplementedError – fit_transform is not implemented
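Because fit_transform is not implemented, fitting and transforming have to be done as two calls. A minimal sketch of a wrapper is shown here; fit_then_transform is a hypothetical helper, not part of causalnex, and it relies only on the documented behaviour that fit returns the fitted object and that transform accepts a dataframe of the feature columns.

```python
def fit_then_transform(discretiser, dataframe, feat_names, target):
    """Fit on the labelled dataframe, then discretise the selected columns.

    Hypothetical helper: `discretiser` is assumed to expose the documented
    fit(feat_names, target, dataframe, target_continuous) and transform(data).
    """
    fitted = discretiser.fit(
        feat_names=feat_names,
        dataframe=dataframe,
        target=target,
        target_continuous=False,  # a continuous target raises ValueError
    )
    # Only the feature columns are passed on, as in the class-level example.
    return fitted.transform(dataframe[feat_names])
```

With the iris example above, this would be called as fit_then_transform(discretiser, data, ["sepal length (cm)"], "target").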

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns

routing – A MetadataRequest encapsulating routing information.

Return type

MetadataRequest

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

dict

set_fit_request(*, dataframe: Union[bool, None, str] = '$UNCHANGED$', feat_names: Union[bool, None, str] = '$UNCHANGED$', target: Union[bool, None, str] = '$UNCHANGED$', target_continuous: Union[bool, None, str] = '$UNCHANGED$') → causalnex.discretiser.discretiser_strategy.MDLPSupervisedDiscretiserMethod

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a pipeline.Pipeline. Otherwise it has no effect.

Parameters
  • dataframe (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for dataframe parameter in fit.

  • feat_names (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for feat_names parameter in fit.

  • target (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for target parameter in fit.

  • target_continuous (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for target_continuous parameter in fit.

Returns

self – The updated object.

Return type

object

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

estimator instance

set_transform_request(*, data: Union[bool, None, str] = '$UNCHANGED$') → causalnex.discretiser.discretiser_strategy.MDLPSupervisedDiscretiserMethod

Request metadata passed to the transform method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a pipeline.Pipeline. Otherwise it has no effect.

Parameters

data (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for data parameter in transform.

Returns

self – The updated object.

Return type

object

transform(data)

Given one “original” dataframe, discretise it.

Parameters

data (DataFrame) – dataframe with continuous features to be transformed into discrete ones

Returns

discretised version of the input data

Return type

array
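Conceptually, discretising a column means mapping each continuous value to the index of the interval it falls into, given a sorted list of split thresholds. The sketch below shows that mapping with the standard library only. It is not the library's implementation: apply_thresholds and the threshold values are hypothetical, and the choice that a value equal to a threshold goes to the upper bin is an assumption.

```python
from bisect import bisect_right


def apply_thresholds(values, thresholds):
    """Map each value to the index of the bin defined by the thresholds.

    Hypothetical helper: bin 0 is everything below the first threshold,
    and a value equal to a threshold is assigned to the upper bin.
    """
    cuts = sorted(thresholds)
    return [bisect_right(cuts, v) for v in values]


# Two made-up thresholds give three bins: 0, 1 and 2.
bins = apply_thresholds([4.9, 5.8, 6.7], [5.5, 6.2])
```

Two thresholds produce three bin labels, which matches the shape of the example output at the top of this page, where the discretised column takes the values 0, 1 and 2.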