skmultilearn.cluster.base module

class skmultilearn.cluster.base.GraphBuilderBase
Bases: future.types.newobject.newobject
An abstract base class for graph builders used in label space clustering.
Inherit from it as described in the developer guide (../developer.ipynb).

class skmultilearn.cluster.base.LabelCooccurrenceGraphBuilder(weighted=None, include_self_edges=None, normalize_self_edges=None)
Bases: skmultilearn.cluster.base.GraphBuilderBase
Base class providing the API and common functions for all label co-occurrence based multi-label classifiers.
This graph builder constructs a label graph from the output matrix: two label nodes are connected when at least one sample is labeled with both of them. If the graph is weighted, the weight of an edge between two label nodes is the number of samples labeled with both labels. Self-edge weights contain the number of samples with a given label.
Parameters:
    weighted (bool) – whether to generate a weighted or an unweighted graph.
    include_self_edges (bool) – whether to include self-edges, i.e. label 1 - label 1, in the co-occurrence graph.
    normalize_self_edges (bool) – if self-edges are included, divide the (i, i) edge weight by 2.0; requires include_self_edges=True.
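The counting rule described above can be illustrated with a small pure-Python sketch. This is not the library's implementation: `cooccurrence_edge_map` is a hypothetical helper, the label matrix is given as a plain list of rows, and applying the self-edge normalization only in the weighted case is an assumption made here for simplicity.

```python
from itertools import combinations


def cooccurrence_edge_map(y, weighted=True, include_self_edges=False,
                          normalize_self_edges=False):
    """Hypothetical helper: count label co-occurrences in a binary label matrix.

    y is a list of rows, each row a list of 0/1 flags of length n_labels.
    Returns a dict mapping (i, j) label index pairs (i <= j) to edge weights.
    """
    if normalize_self_edges and not include_self_edges:
        raise ValueError("normalize_self_edges requires include_self_edges=True")

    edge_map = {}
    for row in y:
        # Indexes of the labels assigned to this sample.
        labels = [j for j, flag in enumerate(row) if flag]
        pairs = list(combinations(labels, 2))
        if include_self_edges:
            pairs += [(j, j) for j in labels]
        # Each sample adds 1 to every pair of labels it carries.
        for pair in pairs:
            edge_map[pair] = edge_map.get(pair, 0.0) + 1.0

    if not weighted:
        # Unweighted graph: keep only the presence of an edge.
        return {pair: 1.0 for pair in edge_map}

    if normalize_self_edges:
        # Assumption of this sketch: halve only the (i, i) weights.
        return {
            (i, j): (w / 2.0 if i == j else w)
            for (i, j), w in edge_map.items()
        }
    return edge_map


# Three samples, three labels.
y = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
]
edges = cooccurrence_edge_map(y, weighted=True, include_self_edges=True)
```

In this toy matrix labels 0 and 1 appear together in two samples, so the edge (0, 1) has weight 2.0, while each self-edge holds the number of samples carrying that label.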
References
If you use this graph builder please cite the clustering paper:
    @Article{datadriven,
        author = {Szymański, Piotr and Kajdanowicz, Tomasz and Kersting, Kristian},
        title = {How Is a Data-Driven Approach Better than Random Choice in Label Space Division for Multi-Label Classification?},
        journal = {Entropy},
        volume = {18},
        year = {2016},
        number = {8},
        article_number = {282},
        url = {http://www.mdpi.com/1099-4300/18/8/282},
        issn = {1099-4300},
        doi = {10.3390/e18080282}
    }
Examples
A full example of building a modularity-based label space division from the label co-occurrence graph and classifying with a separate classifier chain per subspace:
    from skmultilearn.cluster import LabelCooccurrenceGraphBuilder, NetworkXLabelGraphClusterer
    from skmultilearn.ensemble import LabelSpacePartitioningClassifier
    from skmultilearn.problem_transform import ClassifierChain
    from sklearn.naive_bayes import GaussianNB

    # X_train, y_train and X_test are assumed to be loaded already.
    graph_builder = LabelCooccurrenceGraphBuilder(weighted=True, include_self_edges=False, normalize_self_edges=False)
    clusterer = NetworkXLabelGraphClusterer(graph_builder, method='louvain')
    classifier = LabelSpacePartitioningClassifier(
        classifier=ClassifierChain(classifier=GaussianNB()),
        clusterer=clusterer
    )
    classifier.fit(X_train, y_train)
    prediction = classifier.predict(X_test)
For more use cases see the label relations exploration guide.

transform(y)
Generate an adjacency matrix from a label matrix.
This function generates a weighted or unweighted co-occurrence label graph adjacency matrix in dictionary-of-keys format from the input binary label vectors.
Parameters: y (numpy.ndarray or scipy.sparse) – dense or sparse binary matrix with shape (n_samples, n_labels)
Returns: weight map with a tuple of label indexes as keys and the number of samples in which the two labels co-occurred as values
Return type: Dict[(int, int), float]
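To make the dictionary-of-keys return format more concrete, the sketch below expands such a weight map into a dense symmetric adjacency matrix. `edge_map_to_adjacency` is a hypothetical helper, not part of skmultilearn, and the example map is hand-written; the sketch assumes each unordered label pair is stored once and mirrors it across the diagonal.

```python
def edge_map_to_adjacency(edge_map, n_labels):
    """Hypothetical helper: expand a {(i, j): weight} map into a dense matrix."""
    adjacency = [[0.0] * n_labels for _ in range(n_labels)]
    for (i, j), weight in edge_map.items():
        # The label graph is undirected, so fill both triangles.
        adjacency[i][j] = weight
        adjacency[j][i] = weight
    return adjacency


# A hand-written weight map in the Dict[(int, int), float] shape described above.
edge_map = {(0, 1): 2.0, (0, 2): 1.0, (1, 2): 1.0}
adjacency = edge_map_to_adjacency(edge_map, n_labels=3)

for row in adjacency:
    print(row)
```

Label pairs that never co-occur have no key in the map, so their entries stay at 0.0 in the dense matrix.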

class skmultilearn.cluster.base.LabelGraphClustererBase(graph_builder)
Bases: future.types.newobject.newobject
An abstract base class for label graph clustering.
Inherit from it as described in the developer guide (../developer.ipynb).

class skmultilearn.cluster.base.LabelSpaceClustererBase
Bases: sklearn.base.BaseEstimator
An abstract base class for label space clustering.
Inherit from it as described in the developer guide (../developer.ipynb).