RAkELd: random label space partitioning with Label Powerset

class skmultilearn.ensemble.RakelD(base_classifier=None, labelset_size=3, base_classifier_require_dense=None)[source]

Bases: skmultilearn.base.base.MLClassifierBase

Distinct RAndom k-labELsets multi-label classifier.

Divides the label space into disjoint partitions of size k, trains a Label Powerset classifier per partition, and predicts by summing the results of all trained classifiers.
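The partitioning step can be sketched in plain Python. This is an illustrative helper, not the library's internal API (internally, skmultilearn delegates this to skmultilearn.ensemble.RandomLabelSpaceClusterer); the function name and seed parameter are made up for the example:

```python
import random

def random_disjoint_labelsets(n_labels, labelset_size, seed=None):
    """Shuffle label indices and cut them into disjoint subsets of at
    most `labelset_size` labels each (illustrative sketch of RAkELd)."""
    rng = random.Random(seed)
    labels = list(range(n_labels))
    rng.shuffle(labels)
    return [labels[i:i + labelset_size]
            for i in range(0, n_labels, labelset_size)]

# e.g. 7 labels with k=3 yield three disjoint sets of sizes 3, 3 and 1
partitions = random_disjoint_labelsets(7, 3, seed=42)
```

Note that when k does not divide the number of labels, the last partition is smaller; each label still appears in exactly one partition.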

Parameters:
  • base_classifier (sklearn.base) – the base classifier that will be used in each sub-classifier, will be automatically put under self.classifier for future access.
  • base_classifier_require_dense ([bool, bool]) – whether the base classifier requires [input, output] matrices in dense representation, will be automatically put under self.require_dense
  • labelset_size (int) – the desired size of each partition, the parameter k from the paper. Defaults to 3, which the paper reports as giving the best results
_label_count

the number of labels the classifier is fit to, set by fit()

Type: int
model_count_

the number of sub-classifiers trained, set by fit()

Type: int
classifier_

the underlying classifier that performs the label space partitioning using a random clusterer skmultilearn.ensemble.RandomLabelSpaceClusterer

Type: skmultilearn.ensemble.LabelSpacePartitioningClassifier

References

If you use this class please cite the paper introducing the method:

@ARTICLE{5567103,
    author={G. Tsoumakas and I. Katakis and I. Vlahavas},
    journal={IEEE Transactions on Knowledge and Data Engineering},
    title={Random k-Labelsets for Multilabel Classification},
    year={2011},
    volume={23},
    number={7},
    pages={1079-1089},
    doi={10.1109/TKDE.2010.164},
    ISSN={1041-4347},
    month={July},
}

Examples

Here’s a simple example of how to use this class with a base classifier from scikit-learn to train non-overlapping classifiers, each on at most four labels:

from sklearn.naive_bayes import GaussianNB
from skmultilearn.ensemble import RakelD

classifier = RakelD(
    base_classifier=GaussianNB(),
    base_classifier_require_dense=[True, True],
    labelset_size=4
)

classifier.fit(X_train, y_train)
prediction = classifier.predict(X_train)
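What fit and predict do can be approximated with scikit-learn alone: each partition's label columns are encoded into a single multiclass target (the Label Powerset transform), one base classifier is trained per partition, and each multiclass prediction is decoded back into binary label columns. The following is a minimal sketch under assumed dense numpy data and a fixed seed, not the library's actual implementation, which delegates to LabelSpacePartitioningClassifier:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import make_multilabel_classification

# toy data: 100 samples with 6 labels (sizes here are illustrative)
X, y = make_multilabel_classification(n_samples=100, n_features=8,
                                      n_classes=6, random_state=0)

# randomly partition the 6 label indices into disjoint sets of size k=3
rng = np.random.default_rng(0)
order = rng.permutation(y.shape[1])
partitions = [order[i:i + 3] for i in range(0, y.shape[1], 3)]

prediction = np.zeros_like(y)
for labels in partitions:
    # Label Powerset transform: each binary label combination in this
    # partition becomes one class of a single multiclass problem
    powers = 2 ** np.arange(len(labels))
    target = y[:, labels] @ powers
    clf = GaussianNB().fit(X, target)
    # decode the multiclass predictions back into binary label columns
    decoded = clf.predict(X)[:, None]
    prediction[:, labels] = (decoded // powers) % 2
```

Because the partitions are disjoint, every label column is predicted by exactly one sub-classifier, which is what distinguishes RAkELd from the overlapping variant RAkELo.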
fit(X, y)[source]

Fit classifier to multi-label data

Parameters:
  • X (numpy.ndarray or scipy.sparse) – input features, can be a dense or sparse matrix of size (n_samples, n_features)
  • y (numpy.ndarray or scipy.sparse {0,1}) – binary indicator matrix with label assignments, shape (n_samples, n_labels)
Returns:fitted instance of self
Return type:RakelD

predict(X)[source]

Predict label assignments

Parameters:X (numpy.ndarray or scipy.sparse.csc_matrix) – input features of shape (n_samples, n_features)
Returns:binary indicator matrix with label assignments with shape (n_samples, n_labels)
Return type:scipy.sparse of int
predict_proba(X)[source]

Predict label probabilities

Parameters:X (numpy.ndarray or scipy.sparse.csc_matrix) – input features of shape (n_samples, n_features)
Returns:matrix with probabilities of label assignment with shape (n_samples, n_labels)
Return type:scipy.sparse of float
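The sparse probability matrix returned by predict_proba can be thresholded to obtain hard label assignments. A small sketch with made-up probabilities; the 0.5 cutoff is an assumed convention, not one the library prescribes:

```python
import numpy as np
from scipy import sparse

# hypothetical predict_proba output for two samples and three labels
proba = sparse.csr_matrix(np.array([[0.9, 0.2, 0.6],
                                    [0.1, 0.8, 0.4]]))

# threshold at 0.5 to obtain a binary indicator matrix
prediction = sparse.csr_matrix(proba.toarray() >= 0.5)
```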