Multi-label embedding classification

class skmultilearn.embedding.EmbeddingClassifier(embedder, regressor, classifier, regressor_per_dimension=False, require_dense=None)

Bases: skmultilearn.base.problem_transformation.ProblemTransformationBase

Embedding-based classifier

Implements the general scheme presented in "LNEMLC: label network embeddings for multi-label classification". The classifier embeds the label space with the embedder, trains either one multi-variate regressor or a set of single-variate regressors to predict embeddings for unseen cases, and trains a base classifier to predict labels from the input features together with the embeddings.
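
The three stages described above can be sketched with plain scikit-learn estimators. This is an illustrative sketch only, not the library's implementation: PCA stands in for the label-space embedder, KNeighborsClassifier stands in for the base multi-label classifier, and the random data and all variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsClassifier

# Toy data (assumed for illustration): 8 features, 5 binary labels.
rng = np.random.RandomState(0)
X_train = rng.rand(60, 8)
y_train = (rng.rand(60, 5) > 0.5).astype(int)
X_test = rng.rand(15, 8)

# 1. Embed the label space (PCA is a stand-in embedder here).
embedder = PCA(n_components=3, random_state=0)
y_embedded = embedder.fit_transform(y_train)

# 2. Train a regressor that maps input features to label embeddings.
regressor = RandomForestRegressor(n_estimators=10, random_state=0)
regressor.fit(X_train, y_embedded)

# 3. Train the classifier on features concatenated with the embeddings.
classifier = KNeighborsClassifier(n_neighbors=5)
classifier.fit(np.hstack([X_train, y_embedded]), y_train)

# Prediction: regress embeddings for unseen cases, then classify.
test_embedded = regressor.predict(X_test)
predictions = classifier.predict(np.hstack([X_test, test_embedded]))
```

At prediction time the true label embeddings are unknown, which is why the regressor is needed to supply them for unseen cases.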

Parameters:
  • embedder (BaseEstimator) – the class to embed the label space
  • regressor (BaseEstimator) – the base regressor to predict embeddings from input features
  • classifier (BaseEstimator) – the base classifier to predict labels from input features and embeddings
  • regressor_per_dimension (bool) – whether to train one joint multi-variate regressor for all embedding dimensions (False) or a separate single-variate regressor for each embedding dimension (True)
  • require_dense ([bool, bool], optional) – whether the base classifier requires dense representations for input features and classes/labels matrices in fit/predict.
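
The difference between the two regressor_per_dimension settings can be illustrated outside the library. The sketch below is an assumption-laden illustration, not the class's internal code: Ridge is an arbitrary base regressor, y_embedded is random data standing in for embedded labels, and the variable names are hypothetical.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.rand(40, 6)
y_embedded = rng.rand(40, 3)  # stand-in for a 3-dimensional label embedding

base_regressor = Ridge()

# Joint setting: one multi-variate regressor predicts all dimensions at once.
joint = clone(base_regressor).fit(X, y_embedded)
joint_pred = joint.predict(X)

# Per-dimension setting: one single-variate regressor per embedding dimension.
per_dim = [
    clone(base_regressor).fit(X, y_embedded[:, i])
    for i in range(y_embedded.shape[1])
]
per_dim_pred = np.column_stack([r.predict(X) for r in per_dim])
```

The per-dimension variant also works with base regressors that do not support multi-output targets, at the cost of training one model per dimension.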

n_regressors_

number of trained regressors

Type: int

partition_

list of lists of label indexes, used to index the output space matrix, set in _generate_partition() via fit()

Type: List[List[int]], shape=(model_count_,)

classifiers_

list of classifiers trained per partition, set in fit()

Type: List[BaseEstimator], shape=(model_count_,)

If you use this classifier, please cite the relevant embedding method paper and the label network embedding for multi-label classification paper:

@article{zhang2007ml,
  title={ML-KNN: A lazy learning approach to multi-label learning},
  author={Zhang, Min-Ling and Zhou, Zhi-Hua},
  journal={Pattern recognition},
  volume={40},
  number={7},
  pages={2038--2048},
  year={2007},
  publisher={Elsevier}
}

Example

An example use case for EmbeddingClassifier:

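from skmultilearn.embedding import SKLearnEmbedder, EmbeddingClassifier
from sklearn.manifold import SpectralEmbedding
from sklearn.ensemble import RandomForestRegressor
from skmultilearn.adapt import MLkNN

# X_train, y_train and X_test are assumed to be defined already:
# feature matrices and a binary indicator label matrix, as described under fit().

clf = EmbeddingClassifier(
    SKLearnEmbedder(SpectralEmbedding(n_components=10)),  # embeds the label space
    RandomForestRegressor(n_estimators=10),               # predicts embeddings from features
    MLkNN(k=5)                                            # predicts labels from features and embeddings
)

clf.fit(X_train, y_train)

predictions = clf.predict(X_test)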
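
The example above uses X_train, y_train and X_test without defining them. One possible way to obtain such matrices for experimentation is sketched below with scikit-learn's synthetic data generator. The dataset sizes and the split ratio are arbitrary assumptions, not values from this documentation.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split

# Synthetic multi-label problem: y is a dense binary indicator matrix.
X, y = make_multilabel_classification(
    n_samples=200, n_features=20, n_classes=5, random_state=0
)

# Hold out a quarter of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
```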

fit(X, y)

Fits classifier to training data

Parameters:
  • X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
  • y (array_like, numpy.matrix or scipy.sparse matrix of {0, 1}, shape=(n_samples, n_labels)) – binary indicator matrix with label assignments
Returns: fitted instance of self
Return type: self
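
The y argument is a binary indicator matrix with one column per label. When labels are available as per-sample collections of names, scikit-learn's MultiLabelBinarizer is one way to build such a matrix. The label names below are made up for illustration.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# One collection of label names per sample (hypothetical data).
label_sets = [("news", "sports"), ("tech",), (), ("news", "tech")]

mlb = MultiLabelBinarizer()
y = mlb.fit_transform(label_sets)

# Columns follow the sorted label names in mlb.classes_.
print(mlb.classes_)
print(y)
```

Each row of y has a 1 in the columns of the labels assigned to that sample and 0 elsewhere; a sample with no labels becomes an all-zero row.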

predict(X)

Predict labels for X

Parameters: X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
Returns: binary indicator matrix with label assignments
Return type: scipy.sparse matrix of {0, 1}, shape=(n_samples, n_labels)
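
Because the predictions come back as a scipy.sparse matrix, it is often convenient to convert them to a dense array before inspecting or scoring them. The sketch below uses a small hand-made sparse matrix as a stand-in for the output of predict(); the values and variable names are assumptions for illustration.

```python
import numpy as np
from scipy import sparse
from sklearn.metrics import accuracy_score, hamming_loss

y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0]])

# Stand-in for the sparse matrix returned by predict().
predictions = sparse.lil_matrix(np.array([[1, 0, 1],
                                          [0, 1, 1],
                                          [1, 1, 0]]))

y_pred = predictions.toarray()  # dense (n_samples, n_labels) array

# Fraction of individual label assignments that are wrong (1 of 9 here).
loss = hamming_loss(y_true, y_pred)

# Subset accuracy: fraction of samples with every label correct (2 of 3 here).
subset_acc = accuracy_score(y_true, y_pred)
```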

predict_proba(X)

Predict probabilities of label assignments for X

Parameters: X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
Returns: matrix with label assignment probabilities
Return type: scipy.sparse matrix of float in [0.0, 1.0], shape=(n_samples, n_labels)
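
Label probabilities can be turned into label assignments with a custom decision rule. The sketch below applies a 0.5 threshold and picks the most probable label per sample. The probability matrix is a hand-made stand-in for the output of predict_proba(), and the threshold is an arbitrary assumption.

```python
import numpy as np
from scipy import sparse

# Stand-in for the sparse matrix returned by predict_proba().
proba = sparse.csr_matrix(np.array([[0.9, 0.2, 0.6],
                                    [0.1, 0.7, 0.4]]))

dense_proba = proba.toarray()

# Assign every label whose probability reaches the chosen threshold.
threshold = 0.5
labels = (dense_proba >= threshold).astype(int)

# Index of the single most probable label for each sample.
top_label = dense_proba.argmax(axis=1)
```

Lowering the threshold trades precision for recall, so it is usually chosen on a validation set rather than fixed in advance.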