Multi-label embedding classification

class skmultilearn.embedding.EmbeddingClassifier(embedder, regressor, classifier, regressor_per_dimension=False, require_dense=None)

Bases: skmultilearn.base.problem_transformation.ProblemTransformationBase

Embedding-based classifier

Implements the general scheme presented in "LNEMLC: label network embeddings for multi-label classification". The classifier embeds the label space with the embedder, trains either one multi-variate regressor or a set of single-variate regressors to predict embeddings for unseen cases, and trains a base classifier to predict labels from the input features together with the embeddings.
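The three stages described above can be sketched with plain scikit-learn estimators. This is only an illustration of the scheme under stated assumptions, not the library's implementation: PCA stands in for the label-space embedder, KNeighborsClassifier stands in for the base classifier, and the synthetic data set is made up for the example.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsClassifier

# Synthetic multi-label data (assumption: stands in for a real data set).
X, y = make_multilabel_classification(
    n_samples=200, n_features=20, n_classes=5, random_state=0
)
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# 1. Embed the label space: one embedding vector per training sample.
embedder = PCA(n_components=3, random_state=0)  # stand-in embedder
emb_train = embedder.fit_transform(y_train)

# 2. Train a regressor that predicts embeddings from input features.
regressor = RandomForestRegressor(n_estimators=10, random_state=0)
regressor.fit(X_train, emb_train)

# 3. Train the base classifier on features concatenated with embeddings.
classifier = KNeighborsClassifier(n_neighbors=5)  # stand-in classifier
classifier.fit(np.hstack([X_train, emb_train]), y_train)

# Prediction: unseen samples have no labels to embed, so the
# embeddings are estimated by the regressor first.
emb_test = regressor.predict(X_test)
predictions = classifier.predict(np.hstack([X_test, emb_test]))
```

The regressor is needed because the embedding is computed from labels, which are unavailable at prediction time.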

Parameters:

embedder (BaseEstimator) – the embedder used to embed the label space
regressor (BaseEstimator) – the base regressor to predict embeddings from input features
classifier (BaseEstimator) – the base classifier to predict labels from input features and embeddings
regressor_per_dimension (bool) – whether to train one joint multi-variate regressor (False, the default) or one single-variate regressor per embedding dimension (True)
require_dense ([bool, bool], optional) – whether the base classifier requires dense representations for input features and classes/labels matrices in fit/predict.
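The difference between the two regressor_per_dimension settings can be illustrated with plain scikit-learn. This is a sketch under assumptions, not the library's code: Ridge is an arbitrary base regressor, and X and E are random stand-ins for the input features and the label-space embeddings.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.rand(40, 6)  # stand-in input features
E = rng.rand(40, 3)  # stand-in embeddings, 3 dimensions

base = Ridge(alpha=1.0)  # hypothetical choice of base regressor

# Joint mode: a single regressor predicts all embedding dimensions.
joint = clone(base).fit(X, E)
E_joint = joint.predict(X)

# Per-dimension mode: one regressor per embedding dimension,
# with the predictions stacked back into a matrix column by column.
per_dim = [clone(base).fit(X, E[:, i]) for i in range(E.shape[1])]
E_per_dim = np.column_stack([r.predict(X) for r in per_dim])
```

Both modes produce an embedding matrix of the same shape. Per-dimension mode trains as many regressors as there are embedding dimensions, which matters for base regressors that do not support multi-output targets.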

n_regressors_

number of trained regressors

Type:	int

partition_

list of lists of label indexes, used to index the output space matrix, set in _generate_partition() via fit()

Type:	List[List[int]], shape=(model_count_,)

classifiers_

list of classifiers trained per partition, set in fit()

Type:	List[`BaseEstimator`], shape=(model_count_,)

If you use this classifier, please cite the relevant embedding method paper and the label network embedding for multi-label classification paper:

@article{zhang2007ml,
  title={ML-KNN: A lazy learning approach to multi-label learning},
  author={Zhang, Min-Ling and Zhou, Zhi-Hua},
  journal={Pattern recognition},
  volume={40},
  number={7},
  pages={2038--2048},
  year={2007},
  publisher={Elsevier}
}

Example

An example use case for EmbeddingClassifier:

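from skmultilearn.embedding import SKLearnEmbedder, EmbeddingClassifier
from sklearn.manifold import SpectralEmbedding
from sklearn.ensemble import RandomForestRegressor
from skmultilearn.adapt import MLkNN

# X_train, y_train and X_test are assumed to be loaded already:
# feature matrices and a binary label indicator matrix.
clf = EmbeddingClassifier(
    SKLearnEmbedder(SpectralEmbedding(n_components=10)),  # embeds the label space
    RandomForestRegressor(n_estimators=10),  # predicts embeddings from features
    MLkNN(k=5)  # predicts labels from features and embeddings
)

# train the embedder, the regressor and the classifier
clf.fit(X_train, y_train)

# predict label assignments for unseen samples
predictions = clf.predict(X_test)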

fit(X, y)

Fits classifier to training data

Parameters:
X (array_like, `numpy.matrix` or `scipy.sparse` matrix, shape=(n_samples, n_features)) – input feature matrix
y (array_like, `numpy.matrix` or `scipy.sparse` matrix of {0, 1}, shape=(n_samples, n_labels)) – binary indicator matrix with label assignments
Returns:	fitted instance of self
Return type:	self

predict(X)

Predict labels for X

Parameters:	X (array_like, `numpy.matrix` or `scipy.sparse` matrix, shape=(n_samples, n_features)) – input feature matrix
Returns:	binary indicator matrix with label assignments
Return type:	`scipy.sparse` matrix of {0, 1}, shape=(n_samples, n_labels)
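Because the return type is a sparse matrix, it is often convenient to convert it to a dense array before inspecting or scoring it. The sketch below assumes a small hand-made csr_matrix in place of a real predict() result, and uses scikit-learn's hamming_loss as one possible metric.

```python
import numpy as np
from scipy import sparse
from sklearn.metrics import hamming_loss

# Ground-truth labels for two samples and three labels (made-up data).
y_true = np.array([[1, 0, 1],
                   [0, 1, 0]])

# Stand-in for the sparse indicator matrix returned by predict().
predictions = sparse.csr_matrix(np.array([[1, 0, 0],
                                          [0, 1, 0]]))

# Convert to a dense array before comparing with dense ground truth.
dense_predictions = predictions.toarray()

# Fraction of label assignments that differ: 1 of 6 entries here.
loss = hamming_loss(y_true, dense_predictions)
```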

predict_proba(X)

Predict probabilities of label assignments for X

Parameters:	X (array_like, `numpy.matrix` or `scipy.sparse` matrix, shape=(n_samples, n_features)) – input feature matrix
Returns:	matrix with label assignment probabilities
Return type:	`scipy.sparse` matrix of float in [0.0, 1.0], shape=(n_samples, n_labels)
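Probabilities can be turned into label assignments by thresholding. The following sketch assumes a hand-made sparse matrix in place of a real predict_proba() result, and the 0.5 cut-off is an arbitrary choice for illustration rather than a library default.

```python
import numpy as np
from scipy import sparse

# Stand-in for the sparse probability matrix returned by predict_proba().
proba = sparse.csr_matrix(np.array([[0.9, 0.2, 0.5],
                                    [0.1, 0.7, 0.0]]))

threshold = 0.5  # hypothetical cut-off, tune per application

# Dense binary indicator matrix: 1 where probability >= threshold.
labels = (proba.toarray() >= threshold).astype(int)
```

Lowering the threshold trades precision for recall, which can be useful when missing a label is costlier than assigning a spurious one.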