causalnex.discretiser.MDLPSupervisedDiscretiserMethod

class causalnex.discretiser.MDLPSupervisedDiscretiserMethod(mdlp_args=None)[source]

Bases: causalnex.discretiser.abstract_discretiser.AbstractSupervisedDiscretiserMethod

Discretises continuous features using the MDLP (Minimum Description Length Principle) algorithm.

Example:

 import pandas as pd
 from causalnex.discretiser.discretiser_strategy import MDLPSupervisedDiscretiserMethod
 from sklearn.datasets import load_iris

 # Build a dataframe holding the iris features plus a discrete "target" column
 iris = load_iris()
 X, y = iris["data"], iris["target"]
 names = iris["feature_names"]
 data = pd.DataFrame(X, columns=names)
 data["target"] = y

 # mdlp_args is a dictionary of keyword arguments for mdlp.discretization.MDLP
 discretiser = MDLPSupervisedDiscretiserMethod(
     {"min_depth": 0, "random_state": 2020, "min_split": 1e-3, "dtype": int}
 )

 # Learn split thresholds for one feature; the target must be discrete
 discretiser.fit(
     feat_names=["sepal length (cm)"],
     dataframe=data,
     target="target",
     target_continuous=False,
 )

 # fit_transform is not implemented, so transform is called separately
 discretised_data = discretiser.transform(data[["sepal length (cm)"]])
 discretised_data.values.ravel()

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 2, 2, 2, 1, 2, 1, 2, 0, 2, 0, 0, 2, 2, 2, 1, 2,
       1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 2, 0, 2, 2, 2,
       1, 1, 1, 2, 1, 0, 1, 1, 1, 2, 0, 1, 2, 1, 2, 2, 2, 2, 0, 2, 2, 2,
       2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2])

Methods

MDLPSupervisedDiscretiserMethod.__delattr__(name, /)

Implement delattr(self, name).

MDLPSupervisedDiscretiserMethod.__dir__()

default dir() implementation

MDLPSupervisedDiscretiserMethod.__eq__(value, /)

Return self==value.

MDLPSupervisedDiscretiserMethod.__format__

default object formatter

MDLPSupervisedDiscretiserMethod.__ge__(value, /)

Return self>=value.

MDLPSupervisedDiscretiserMethod.__getattribute__(name, /)

Return getattr(self, name).

MDLPSupervisedDiscretiserMethod.__getstate__()

MDLPSupervisedDiscretiserMethod.__gt__(value, /)

Return self>value.

MDLPSupervisedDiscretiserMethod.__hash__()

Return hash(self).

MDLPSupervisedDiscretiserMethod.__init__([…])

This method of discretisation applies MDLP to discretise the data

MDLPSupervisedDiscretiserMethod.__init_subclass__

This method is called when a class is subclassed.

MDLPSupervisedDiscretiserMethod.__le__(value, /)

Return self<=value.

MDLPSupervisedDiscretiserMethod.__lt__(value, /)

Return self<value.

MDLPSupervisedDiscretiserMethod.__ne__(value, /)

Return self!=value.

MDLPSupervisedDiscretiserMethod.__new__(**kwargs)

Create and return a new object.

MDLPSupervisedDiscretiserMethod.__reduce__

helper for pickle

MDLPSupervisedDiscretiserMethod.__reduce_ex__

helper for pickle

MDLPSupervisedDiscretiserMethod.__repr__([…])

Return repr(self).

MDLPSupervisedDiscretiserMethod.__setattr__(…)

Implement setattr(self, name, value).

MDLPSupervisedDiscretiserMethod.__setstate__(state)

MDLPSupervisedDiscretiserMethod.__sizeof__()

size of object in memory, in bytes

MDLPSupervisedDiscretiserMethod.__str__()

Return str(self).

MDLPSupervisedDiscretiserMethod.__subclasshook__

Abstract classes can override this to customize issubclass().

MDLPSupervisedDiscretiserMethod._check_n_features(X, …)

Set the n_features_in_ attribute, or check against it.

MDLPSupervisedDiscretiserMethod._get_param_names()

Get parameter names for the estimator

MDLPSupervisedDiscretiserMethod._get_tags()

MDLPSupervisedDiscretiserMethod._more_tags()

MDLPSupervisedDiscretiserMethod._repr_html_inner()

This function is returned by the @property _repr_html_ to make hasattr(estimator, "_repr_html_") return True or False depending on get_config()["display"].

MDLPSupervisedDiscretiserMethod._repr_mimebundle_(…)

Mime bundle used by jupyter kernels to display estimator

MDLPSupervisedDiscretiserMethod._transform_one_column(…)

Given one “original” feature (continuous), discretise it.

MDLPSupervisedDiscretiserMethod._validate_data(X)

Validate input data and set or check the n_features_in_ attribute.

MDLPSupervisedDiscretiserMethod.fit(…)

The fit method allows MDLP to learn split thresholds from the input data.

MDLPSupervisedDiscretiserMethod.fit_transform(…)

Raises NotImplementedError; fit_transform is not implemented.

MDLPSupervisedDiscretiserMethod.get_params([deep])

Get parameters for this estimator.

MDLPSupervisedDiscretiserMethod.set_params(…)

Set the parameters of this estimator.

MDLPSupervisedDiscretiserMethod.transform(data)

Given one “original” dataframe, discretise it.

__init__(mdlp_args=None)[source]

This method of discretisation applies MDLP to discretise the data

Parameters
  • min_depth – The minimum depth of the interval splitting.

  • min_split – The minimum size to split a bin

  • dtype – The type of the array returned by the transform() method

  • mdlp_args – dictionary of keyword arguments, which are the parameters passed to mdlp.discretization.MDLP (for example min_depth, min_split and dtype)

Raises

ImportError – if mdlp-discretization is not installed successfully
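Because the constructor raises ImportError when mdlp-discretization is missing, it can help to check for the dependency up front. The following is a minimal sketch, assuming the package installs a top-level module named mdlp (as the mdlp.discretization.MDLP reference above suggests); the helper name mdlp_available is hypothetical and not part of causalnex.

```python
from importlib.util import find_spec


def mdlp_available() -> bool:
    """Return True if a top-level module named ``mdlp`` can be located.

    Hypothetical helper: it only checks that the module can be found,
    not that it imports cleanly.
    """
    return find_spec("mdlp") is not None


if not mdlp_available():
    print("mdlp-discretization does not appear to be installed")
```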

fit(feat_names, target, dataframe, target_continuous)[source]

The fit method allows MDLP to learn split thresholds from the input data. The target feature must be discrete: a continuous target raises ValueError.

Parameters
  • feat_names (List[str]) – a list of features to be discretised

  • target (str) – name of the variable that is going to be used as the target for MDLP

  • dataframe (pd.DataFrame) – pandas dataframe of input data

  • target_continuous (bool) – whether the target variable is continuous.

Returns

MDLPSupervisedDiscretiserMethod object with split thresholds learned by the MDLP algorithm

Return type

self

Raises

ValueError – if the target is continuous
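To give a feel for what "learning split thresholds" involves, here is a simplified, pure-Python sketch of a single step of the Fayyad and Irani MDLP criterion: find the binary cut with the highest information gain and accept it only if the gain exceeds the MDL threshold. It is an illustration under stated assumptions, not the implementation used by mdlp.discretization.MDLP. The function names entropy and best_cut are hypothetical, and the recursion into the two halves as well as options such as min_depth and min_split are omitted.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Return (cut_point, gain, accepted) for the best binary split, or None.

    ``accepted`` applies the MDLP stopping rule:
        gain > (log2(n - 1) + delta) / n
    with delta = log2(3**k - 2) - (k*Ent(S) - k1*Ent(S1) - k2*Ent(S2)).
    """
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    ys = [label for _, label in pairs]
    n = len(ys)
    ent_s = entropy(ys)
    k = len(set(ys))

    best = None
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # no cut is possible between identical values
        left, right = ys[:i], ys[i:]
        ent_l, ent_r = entropy(left), entropy(right)
        gain = ent_s - (len(left) / n) * ent_l - (len(right) / n) * ent_r
        if best is None or gain > best[1]:
            k1, k2 = len(set(left)), len(set(right))
            delta = log2(3**k - 2) - (k * ent_s - k1 * ent_l - k2 * ent_r)
            threshold = (log2(n - 1) + delta) / n
            cut_point = (xs[i - 1] + xs[i]) / 2
            best = (cut_point, gain, gain > threshold)
    return best


# Made-up data: two well-separated classes, so the cut should be accepted
print(best_cut([1.0, 1.1, 1.2, 1.3, 5.0, 5.1, 5.2, 5.3], [0, 0, 0, 0, 1, 1, 1, 1]))
```

The class counts k, k1 and k2 require discrete labels, which is one way to see why a continuous target cannot be used.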

fit_transform(*args, **kwargs)

Raises

NotImplementedError – fit_transform is not implemented

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

dict

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

estimator instance
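The <component>__<parameter> convention is inherited from scikit-learn. The sketch below illustrates it on a generic scikit-learn Pipeline rather than on the discretiser itself; the step name "scaler" is an arbitrary choice made for this example.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A one-step pipeline; "scaler" is the component name chosen for this example
pipe = Pipeline([("scaler", StandardScaler())])

# Update a parameter of the nested component via <component>__<parameter>
pipe.set_params(scaler__with_mean=False)

# get_params(deep=True) exposes nested parameters under the same naming scheme
params = pipe.get_params(deep=True)
print(params["scaler__with_mean"])
```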

transform(data)

Given one “original” dataframe, discretise it.

Parameters

data (DataFrame) – dataframe with continuous features, to be transformed into discrete features

Return type

array

Returns

discretised version of the input data
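Conceptually, once split thresholds have been learned, transforming a column amounts to looking up the interval each value falls into. The following is a minimal pure-Python sketch of that idea. The function assign_bins and the cut points are made up for illustration, and the handling of values that fall exactly on a threshold (here they go to the lower bin) is an assumption that may differ from the library's behaviour.

```python
from bisect import bisect_left


def assign_bins(values, cut_points):
    """Map each value to the index of the interval it falls into.

    With cut points c0 < c1 < ..., bin 0 holds values <= c0, bin 1 holds
    values in (c0, c1], and so on. This edge convention is an assumption.
    """
    cuts = sorted(cut_points)
    return [bisect_left(cuts, v) for v in values]


# Hypothetical thresholds, not taken from a fitted discretiser
print(assign_bins([4.9, 5.6, 7.0], [5.55, 6.15]))
```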