skmultilearn.cluster.networkx module

class skmultilearn.cluster.NetworkXLabelGraphClusterer(graph_builder, method)
Bases: skmultilearn.cluster.base.LabelGraphClustererBase
Cluster label space with NetworkX community detection
This clusterer constructs a NetworkX representation of the label graph generated by the graph builder and detects communities in it using methods from the NetworkX library. The detected communities are then converted to a label space clustering.
Parameters:
- graph_builder (a GraphBuilderBase inherited transformer) – the graph builder that provides the adjacency matrix and weight map for the underlying graph
- method (string) – the community detection method to use; this clusterer supports the following method name strings:
  - louvain – detecting communities with largest modularity using incremental greedy search
  - label_propagation – detecting communities from multiple async label propagation on the graph
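To make the objective behind the louvain option more concrete, the sketch below computes the modularity of a given partition on a small weighted graph. It is a dependency-free illustration only, not the code that this clusterer or NetworkX runs; the `modularity` helper, the edge-map format, and the toy graph are hypothetical names introduced for this example.

```python
from collections import defaultdict


def modularity(edge_weights, communities):
    """Modularity Q of a partition of an undirected weighted graph.

    edge_weights: dict mapping (u, v) pairs to a positive weight,
                  each undirected edge listed once (hypothetical format).
    communities:  list of lists of node ids, covering every node once.
    """
    total_weight = sum(edge_weights.values())  # m, the total edge weight
    community_of = {
        node: idx for idx, nodes in enumerate(communities) for node in nodes
    }

    internal = defaultdict(float)  # weight of edges inside each community
    degree = defaultdict(float)    # summed weighted degree per community
    for (u, v), w in edge_weights.items():
        degree[community_of[u]] += w
        degree[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w

    # Q = sum over communities c of [ L_c / m - (d_c / 2m)^2 ]
    return sum(
        internal[c] / total_weight - (degree[c] / (2.0 * total_weight)) ** 2
        for c in range(len(communities))
    )


# Toy label graph: two triangles joined by the single edge (2, 3).
edges = {
    (0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0,
    (3, 4): 1.0, (4, 5): 1.0, (3, 5): 1.0,
    (2, 3): 1.0,
}

q_split = modularity(edges, [[0, 1, 2], [3, 4, 5]])
q_single = modularity(edges, [[0, 1, 2, 3, 4, 5]])
```

In this toy graph, splitting the labels into the two triangles scores higher than keeping them in one community, which is the kind of partition a modularity-maximizing search favors.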
- graph_ – the networkx Graph object containing the graph representation of the graph builder’s adjacency matrix and weights
  Type: networkx.Graph
- weights_ – edge weights stored in a format recognizable by the networkx module
  Type: {'weight': list of values in edge order of graph edges}
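As a rough illustration of the data behind graph_ and weights_, the sketch below counts label co-occurrences in a small label matrix and stores the weights in edge order. It assumes the edge map is a dict keyed by pairs of label indexes; the helper name `cooccurrence_edge_map` and the toy matrix are hypothetical, and the library's graph builder may differ in detail.

```python
from itertools import combinations


def cooccurrence_edge_map(label_rows):
    """Count how often each pair of labels is assigned to the same sample.

    label_rows: list of 0/1 lists, one row per sample, one column per label.
    Returns a dict {(i, j): count} with i < j, so no self-edges are produced.
    """
    edge_map = {}
    for row in label_rows:
        active = [idx for idx, value in enumerate(row) if value]
        for i, j in combinations(active, 2):
            edge_map[(i, j)] = edge_map.get((i, j), 0) + 1
    return edge_map


# Toy label matrix: 4 samples, 4 labels.
y = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]

edge_map = cooccurrence_edge_map(y)
edges = sorted(edge_map)  # fix an edge order
weights = {"weight": [edge_map[e] for e in edges]}  # values follow `edges`
```

Here the weight list lines up position by position with the sorted edge list, mirroring the "list of values in edge order" shape described for weights_.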
References
If you use this clusterer please cite the NetworkX paper and the clustering paper:
@unknown{networkx,
    author = {Hagberg, Aric and Swart, Pieter and S Chult, Daniel},
    year = {2008},
    month = {01},
    title = {Exploring Network Structure, Dynamics, and Function Using NetworkX},
    booktitle = {Proceedings of the 7th Python in Science Conference}
}

@article{blondel2008fast,
    title={Fast unfolding of communities in large networks},
    author={Blondel, Vincent D and Guillaume, Jean-Loup and Lambiotte, Renaud and Lefebvre, Etienne},
    journal={Journal of statistical mechanics: theory and experiment},
    volume={2008},
    number={10},
    pages={P10008},
    year={2008},
    publisher={IOP Publishing}
}
Examples
Example code for using this clusterer with a classifier:
from sklearn.ensemble import RandomForestClassifier
from skmultilearn.problem_transform import LabelPowerset
from skmultilearn.cluster import NetworkXLabelGraphClusterer, LabelCooccurrenceGraphBuilder
from skmultilearn.ensemble import LabelSpacePartitioningClassifier

# construct base forest classifier
base_classifier = RandomForestClassifier(n_estimators=1000)

# construct a graph builder that will include
# label relations weighted by how many times they
# co-occurred in the data, without self-edges
graph_builder = LabelCooccurrenceGraphBuilder(
    weighted=True,
    include_self_edges=False
)

# setup problem transformation approach with sparse matrices for random forest
problem_transform_classifier = LabelPowerset(classifier=base_classifier,
    require_dense=[False, False])

# setup the clusterer to use, we selected the modularity-based approach
clusterer = NetworkXLabelGraphClusterer(graph_builder=graph_builder, method='louvain')

# setup the ensemble metaclassifier
classifier = LabelSpacePartitioningClassifier(problem_transform_classifier, clusterer)

# train
classifier.fit(X_train, y_train)

# predict
predictions = classifier.predict(X_test)
For more use cases see the label relations exploration guide.
fit_predict(X, y)
Performs clustering on y and returns a list of label lists.

Builds a label graph using the provided graph builder’s transform method on y and then detects communities using the selected method.

Sets self.weights_ and self.graph_.

Parameters:
- X (None) – currently unused, left for scikit compatibility
- y (scipy.sparse) – label space of shape (n_samples, n_labels)

Returns: label space division, each sublist represents labels that are in that community
Return type: array of arrays of label indexes (numpy.ndarray)
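The shape of the returned label space division can be sketched as follows. This assumes the community detection step yields a mapping from label index to community id; that input shape, the helper name `partition_to_label_lists`, and the toy partition are assumptions made for illustration, not the library's internals.

```python
from collections import defaultdict


def partition_to_label_lists(node_to_community):
    """Group label indexes by community id.

    node_to_community: dict {label_index: community_id} (assumed input shape).
    Returns a list of sorted label-index lists, ordered by community id.
    """
    groups = defaultdict(list)
    for label, community in node_to_community.items():
        groups[community].append(label)
    return [sorted(groups[c]) for c in sorted(groups)]


# Toy result of a community detection run over 5 labels.
partition = {0: 0, 1: 0, 2: 1, 3: 1, 4: 0}
label_space_division = partition_to_label_lists(partition)
```

Each sublist holds the label indexes that ended up in the same community, so every label appears in exactly one sublist.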