Network-based label space partition ensemble classification

class skmultilearn.ensemble.LabelSpacePartitioningClassifier(classifier=None, clusterer=None, require_dense=None)[source]

Bases: skmultilearn.problem_transform.br.BinaryRelevance

Partition label space and classify each subspace separately

This classifier performs classification by:

1. partitioning the label space into separate, smaller multi-label subproblems, using the supplied label space clusterer
2. training an instance of the supplied base multi-label classifier for each label space subset in the partition
3. predicting the result with each of the subclassifiers and returning the sum of their results
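
The three steps above can be sketched in plain NumPy. This is only an illustration of the idea, not the library's implementation: `MajorityLabelModel`, `fit_partitioned` and `predict_partitioned` are hypothetical names introduced here, and the toy base learner simply predicts each label's majority value from the training data.

```python
import numpy as np


class MajorityLabelModel:
    """Hypothetical toy base learner: predicts each label's training majority value."""

    def fit(self, X, y):
        # A label is predicted as 1 if at least half of the training rows have it.
        self.majority_ = (y.mean(axis=0) >= 0.5).astype(int)
        return self

    def predict(self, X):
        # Repeat the per-label majority vector once per input row.
        return np.tile(self.majority_, (X.shape[0], 1))


def fit_partitioned(X, y, partition):
    # Step 2: train one model per label subset, using only that subset's columns.
    return [MajorityLabelModel().fit(X, y[:, labels]) for labels in partition]


def predict_partitioned(models, partition, X, n_labels):
    # Step 3: place each model's output in its own columns and sum the results.
    result = np.zeros((X.shape[0], n_labels), dtype=int)
    for model, labels in zip(models, partition):
        block = np.zeros_like(result)
        block[:, labels] = model.predict(X)
        result += block
    return result


X = np.zeros((4, 2))                       # features are ignored by the toy learner
y = np.array([[1, 0, 1],
              [1, 0, 0],
              [1, 1, 0],
              [0, 0, 0]])
partition = [[0, 2], [1]]                  # Step 1: assumed given by a clusterer

models = fit_partitioned(X, y, partition)
predictions = predict_partitioned(models, partition, X[:2], n_labels=3)
```

Because the subsets of a partition do not overlap, each output column is written by exactly one model, so the sum simply reassembles the full label matrix.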
Parameters:
  • classifier (BaseEstimator) – the base classifier that is trained on each label subset; stored as self.classifier.
  • clusterer (LabelSpaceClustererBase) – the object that partitions the output space; stored as self.clusterer.
  • require_dense ([bool, bool]) – whether the base classifier requires the [input, output] matrices in dense representation; stored as self.require_dense.
model_count_

number of trained models, in this classifier equal to the number of partitions

Type: int
partition_

list of lists of label indexes, used to index the output space matrix, set in _generate_partition() via fit()

Type: List[List[int]], shape=(model_count_,)
classifiers

list of classifiers trained per partition, set in fit()

Type: List[BaseEstimator], shape=(model_count_,)
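
As a rough illustration of how a list of lists of label indexes can be used to index the output space matrix, the sketch below slices a small label matrix by column. The variable names are hypothetical stand-ins; `partition` only mimics the shape described for partition_, and nothing here calls the library.

```python
import numpy as np

# Stand-in with the same structure as partition_: one list of label indexes per model.
partition = [[1, 3, 4], [0, 2, 5]]

# The number of models equals the number of label subsets.
model_count = len(partition)

# Toy output matrix with 2 samples and 6 labels; values are just column markers.
y = np.arange(12).reshape(2, 6)

# Each model only sees the columns listed in its subset.
sub_outputs = [y[:, labels] for labels in partition]

# In a partition, every label index appears exactly once across all subsets.
covered = sorted(label for labels in partition for label in labels)
```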

References

If you use this classifier, please cite the clustering paper:

@Article{datadriven,
    author = {Szymański, Piotr and Kajdanowicz, Tomasz and Kersting, Kristian},
    title = {How Is a Data-Driven Approach Better than Random Choice in
    Label Space Division for Multi-Label Classification?},
    journal = {Entropy},
    volume = {18},
    year = {2016},
    number = {8},
    article_number = {282},
    url = {http://www.mdpi.com/1099-4300/18/8/282},
    issn = {1099-4300},
    doi = {10.3390/e18080282}
}

Examples

Here’s an example of building a partitioned ensemble of Classifier Chains:

from skmultilearn.ensemble import LabelSpacePartitioningClassifier
from skmultilearn.cluster import FixedLabelSpaceClusterer
from skmultilearn.problem_transform import ClassifierChain
from sklearn.naive_bayes import GaussianNB

# X_train, y_train and X_test are assumed to be defined already;
# y_train must have 6 label columns to match the clusters below.
classifier = LabelSpacePartitioningClassifier(
    # two disjoint label subsets, one Classifier Chain is trained per subset
    clusterer=FixedLabelSpaceClusterer(clusters=[[1, 3, 4], [0, 2, 5]]),
    classifier=ClassifierChain(classifier=GaussianNB())
)
classifier.fit(X_train, y_train)
predictions = classifier.predict(X_test)

More advanced examples can be found in the label relations exploration guide.

predict(X)[source]

Predict labels for X

Parameters: X (numpy.ndarray or scipy.sparse.csc_matrix) – input features of shape (n_samples, n_features)
Returns: binary indicator matrix with label assignments with shape (n_samples, n_labels)
Return type: scipy.sparse of int
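
Since the result is a sparse binary indicator matrix, it usually needs a conversion step before it can be inspected. The sketch below shows one way to do that with SciPy. The `predictions` matrix is a hand-built stand-in for the value returned by predict(), and converting with `tocsr()` first is an assumption made so the code does not depend on which sparse format is returned.

```python
import numpy as np
from scipy import sparse

# Stand-in for the return value of predict(): 3 samples, 4 labels.
predictions = sparse.csr_matrix(np.array([[1, 0, 0, 1],
                                          [0, 0, 0, 0],
                                          [0, 1, 1, 0]]))

# Dense (n_samples, n_labels) array, convenient for small outputs.
dense = predictions.toarray()

# List of assigned label indexes per sample, read from the CSR row structure.
csr = predictions.tocsr()
labels_per_sample = [csr.getrow(i).indices.tolist() for i in range(csr.shape[0])]
```

For large label spaces, working with the per-sample index lists avoids materializing the full dense matrix.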