skmultilearn.cluster.graphtool module

class skmultilearn.cluster.StochasticBlockModel(nested, use_degree_correlation, allow_overlap, weight_model)[source]

Bases: object

A Stochastic Blockmodel fit to Label Graph

This class holds a stochastic block model instance for the block model variant specified by the parameters. It can be fit to a graph and a set of edge weights. More information on how to select the parameters can be found in the extensive introduction to Stochastic Block Models in the graph-tool documentation.

Parameters:
  • nested (boolean) – whether to build a nested Stochastic Block Model or the regular (flat) variant; stored as self.nested.
  • use_degree_correlation (boolean) – whether to correct for degree correlation in the model; stored as self.use_degree_correlation.
  • allow_overlap (boolean) – whether to allow overlapping clusters; stored as self.allow_overlap.
  • weight_model (string or None) – the weight model to use when fitting a weighted graph, or None to fit an unweighted graph; stored as self.weight_model.

model_ – an instance of the fitted model obtained from graph-tool, set by fit_predict

Type: graph_tool.inference.BlockState or its subclass

fit_predict(graph, weights)[source]

Fits model to a given graph and weights list

Sets self.model_ to the state of graph-tool’s Stochastic Block Model after fitting.


Parameters:
  • graph – the graph to fit the model to
  • weights – the property map edge -> weight (double) to fit the model to, used only if a weighted variant is selected

Returns: partition of labels; each sublist contains label indices related to label positions in y
Return type: numpy.ndarray

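To make the returned partition format concrete, here is a minimal sketch that does not depend on graph-tool. It assumes the fitted model yields one block id per label and shows how such an assignment could be grouped into sublists of label indices. The helper name blocks_to_partition and the sample block ids are hypothetical, and this is not scikit-multilearn's own implementation.

```python
from collections import defaultdict


def blocks_to_partition(block_of_label):
    """Group label indices by the block each label was assigned to.

    block_of_label[i] is the (hypothetical) block id inferred for label i.
    """
    members = defaultdict(list)
    for label_index, block_id in enumerate(block_of_label):
        members[block_id].append(label_index)
    # Order communities by their smallest label index for a stable result.
    return sorted(members.values(), key=lambda labels: labels[0])


# Five labels assigned to three blocks (made-up assignment).
partition = blocks_to_partition([0, 1, 0, 2, 1])
print(partition)
```

Each sublist is one community, and every label index appears in exactly one sublist when overlapping clusters are not allowed.
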
class skmultilearn.cluster.GraphToolLabelGraphClusterer(graph_builder, model)[source]

Bases: skmultilearn.cluster.base.LabelGraphClustererBase

Fits a Stochastic Block Model to the Label Graph and infers the communities

This clusterer clusters the label space by fitting a stochastic block model to the label network and inferring the community structure using graph-tool. The obtained community structure is returned as the label clustering. More information on the inference itself can be found in the extensive introduction to Stochastic Block Models in the graph-tool documentation.

Parameters:
  • graph_builder (a GraphBuilderBase inherited transformer) – the graph builder that provides the adjacency matrix and weight map for the underlying graph
  • model (StochasticBlockModel) – the desired stochastic block model variant to use

graph_ – object representing a label co-occurrence graph

weights_ – edge weights defined by the graph builder, stored in a graph-tool compatible format



This functionality is still undergoing research.


This clusterer is GPL-licenced and will taint your code with GPL restrictions.


If you use this class please cite:


Example code for using this clusterer with a classifier:

from sklearn.ensemble import RandomForestClassifier
from skmultilearn.problem_transform import LabelPowerset
from skmultilearn.cluster import LabelCooccurrenceGraphBuilder
from skmultilearn.cluster.graphtool import GraphToolLabelGraphClusterer, StochasticBlockModel
from skmultilearn.ensemble import LabelSpacePartitioningClassifier

# X_train, y_train and X_test are assumed to be loaded already

# construct base forest classifier
base_classifier = RandomForestClassifier(n_estimators=1000)

# construct a graph builder that will include
# label relations weighted by how many times they
# co-occurred in the data, without self-edges
graph_builder = LabelCooccurrenceGraphBuilder(
    weighted=True,
    include_self_edges=False
)

# select parameters for the model: we fit a flat,
# non-degree-correlated, partitioning (non-overlapping) model
# that uses the normal distribution as the weights model
model = StochasticBlockModel(
    nested=False,
    use_degree_correlation=False,
    allow_overlap=False,
    # assumed value: graph-tool's name for normally distributed real weights
    weight_model='real-normal'
)

# setup problem transformation approach with sparse matrices for random forest
problem_transform_classifier = LabelPowerset(classifier=base_classifier,
    require_dense=[False, False])

# setup the clusterer to use, based on the stochastic block model configured above
clusterer = GraphToolLabelGraphClusterer(graph_builder=graph_builder, model=model)

# setup the ensemble metaclassifier
classifier = LabelSpacePartitioningClassifier(problem_transform_classifier, clusterer)

# train
classifier.fit(X_train, y_train)

# predict
predictions = classifier.predict(X_test)

For more use cases see the label relations exploration guide.

fit_predict(X, y)[source]

Performs clustering on y and returns list of label lists

Builds a label graph using the provided graph builder’s transform method on y and then detects communities using the selected method.

Sets self.weights_ and self.graph_.

Parameters:
  • X (None) – currently unused, left for scikit-learn compatibility
  • y (scipy.sparse) – label space of shape (n_samples, n_labels)

Returns: label space division; each sublist represents the labels that are in that community

Return type: array of arrays of label indexes (numpy.ndarray)
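
The label graph that fit_predict builds from y can be illustrated with a small, dependency-free sketch. It counts how often each pair of labels co-occurs in a sample and skips self-edges, which mirrors the weighted, no-self-edge setting described for the graph builder. The function name cooccurrence_edges and the toy label matrix are hypothetical. The real clusterer receives a scipy.sparse matrix and delegates this step to the graph builder's transform method, so treat this as an approximation of the idea and not the library's code.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(y_rows):
    """Count how often each pair of labels appears together in a sample.

    y_rows is a dense list of 0/1 rows, one row per sample.
    Returns a dict mapping (label_a, label_b) with label_a < label_b
    to the number of samples in which both labels are set.
    """
    edge_weights = Counter()
    for row in y_rows:
        active = [j for j, value in enumerate(row) if value]
        # combinations() yields each unordered pair once and never (j, j),
        # so no self-edges are produced.
        for a, b in combinations(active, 2):
            edge_weights[(a, b)] += 1
    return dict(edge_weights)


# Toy label matrix: 3 samples, 3 labels.
y = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
]
edges = cooccurrence_edges(y)
print(edges)
```

The resulting edge weights play the role of the weight map passed to the block model; the community detection step then partitions the labels that form the vertices of this graph.
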