skmultilearn.cluster.fixed module

class skmultilearn.cluster.FixedLabelSpaceClusterer(clusters=None)[source]

Bases: skmultilearn.cluster.base.LabelSpaceClustererBase

Return a fixed label space partition

This clusterer takes a predefined fixed clustering of the label space and returns it in fit_predict as the label space division. This is useful for employing expert knowledge about label space division or partitions in ensemble classifiers such as: LabelSpacePartitioningClassifier or MajorityVotingClassifier.

Parameters:clusters (array of arrays of int) – provided partition of the label space in the for of numpy array of numpy arrays of indexes for each partition, ex. [[0,1],[2,3]]

An example use of the fixed clusterer with a label partitioning classifier to train randomforests for a set of subproblems defined upon expert knowledge:

from skmultilearn.ensemble import LabelSpacePartitioningClassifier
from skmultilearn.cluster import FixedLabelSpaceClusterer
from skmultilearn.problem_transform import LabelPowerset
from sklearn.ensemble import RandomForestClassifier

classifier = LabelSpacePartitioningClassifier(
    classifier = LabelPowerset(
        classifier=RandomForestClassifier(n_estimators=100),
        require_dense = [False, True]
    ),
    require_dense = [True, True],
    clusterer = FixedLabelSpaceClusterer(clustering=[[1,2,3], [0,4]])
)

# train
classifier.fit(X_train, y_train)

# predict
predictions = classifier.predict(X_test)
fit_predict(X, y)[source]

Returns the provided label space division

Parameters:
  • X (None) – currently unused, left for scikit compatibility
  • y (scipy.sparse) – label space of shape (n_samples, n_labels)
Returns:

label space division, each sublist represents labels that are in that community

Return type:

arrray of arrays of label indexes (numpy.ndarray)