Multi-label embedding classification
class skmultilearn.embedding.EmbeddingClassifier(embedder, regressor, classifier, regressor_per_dimension=False, require_dense=None)
Bases: skmultilearn.base.problem_transformation.ProblemTransformationBase
Embedding-based classifier
Implements the general scheme presented in "LNEMLC: label network embeddings for multi-label classification". The classifier embeds the label space with the embedder, trains either a set of single-variate regressors or one multi-variate regressor to predict embeddings for unseen cases, and trains a base classifier to predict labels from the input features and the embeddings.
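The three steps described above can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the library's implementation: the SVD-based `embed_labels`, the least-squares regressor, and the 1-nearest-neighbour classifier are hypothetical stand-ins for whatever embedder, regressor, and classifier are passed in, and the names `fit_sketch` and `predict_sketch` are invented for this example.

```python
import numpy as np

def embed_labels(Y, n_components):
    """Stand-in embedder: project the centered label matrix onto its top singular directions."""
    Y_centered = Y - Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Y_centered, full_matrices=False)
    return Y_centered @ Vt[:n_components].T  # shape (n_samples, n_components)

def fit_sketch(X, Y, n_components=2):
    # 1. Embed the label space.
    E = embed_labels(Y, n_components)
    # 2. Train a regressor from features to embeddings (least squares with a bias column).
    X_bias = np.hstack([X, np.ones((X.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X_bias, E, rcond=None)
    # 3. "Train" a 1-NN classifier on features concatenated with embeddings
    #    by memorising the training rows and their label vectors.
    return {"W": W, "train_XE": np.hstack([X, E]), "train_Y": Y}

def predict_sketch(model, X):
    # Predict embeddings for unseen rows, then classify on [features, embeddings].
    X_bias = np.hstack([X, np.ones((X.shape[0], 1))])
    E_hat = X_bias @ model["W"]
    XE = np.hstack([X, E_hat])
    dists = ((XE[:, None, :] - model["train_XE"][None, :, :]) ** 2).sum(axis=2)
    return model["train_Y"][dists.argmin(axis=1)]

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
Y = (X @ rng.normal(size=(5, 4)) > 0).astype(int)

model = fit_sketch(X, Y)
Y_pred = predict_sketch(model, X[:10])
```

At prediction time the true embeddings are unavailable, so the classifier sees the regressor's estimates instead; that is the reason the scheme needs a regressor at all.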
Parameters:
- embedder (BaseEstimator) – the embedder used to embed the label space
- regressor (BaseEstimator) – the base regressor to predict embeddings from input features
- classifier (BaseEstimator) – the base classifier to predict labels from input features and embeddings
- regressor_per_dimension (bool) – whether to train one joint multi-variate regressor (False) or one single-variate regressor per embedding dimension (True)
- require_dense ([bool, bool], optional) – whether the base classifier requires dense representations for input features and classes/labels matrices in fit/predict.
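The difference between the two settings of regressor_per_dimension can be illustrated with plain least squares. This is a sketch under assumptions: the helper names `fit_joint`, `fit_per_dimension`, and `predict_per_dimension` are hypothetical, the embeddings `E` are random placeholders, and ordinary least squares stands in for whatever regressor is supplied.

```python
import numpy as np

def fit_joint(X, E):
    """One multi-variate least-squares regressor covering all embedding dimensions."""
    W, *_ = np.linalg.lstsq(X, E, rcond=None)
    return W  # shape (n_features, n_dims)

def fit_per_dimension(X, E):
    """One single-variate least-squares regressor per embedding dimension."""
    return [np.linalg.lstsq(X, E[:, i], rcond=None)[0] for i in range(E.shape[1])]

def predict_per_dimension(weights, X):
    # Stack the per-dimension predictions back into an (n_samples, n_dims) matrix.
    return np.column_stack([X @ w for w in weights])

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
E = rng.normal(size=(30, 3))  # hypothetical 3-dimensional label embeddings

W_joint = fit_joint(X, E)
weights = fit_per_dimension(X, E)

E_joint = X @ W_joint
E_split = predict_per_dimension(weights, X)
```

For ordinary least squares both variants give the same predictions, because each output column is solved independently anyway. With regressors that share structure across outputs, such as a multi-output random forest, the two settings generally produce different models, and the per-dimension variant also allows regressors that only support a single target.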
partition_
list of lists of label indexes, used to index the output space matrix, set in _generate_partition() via fit()
Type: List[List[int]], shape=(model_count_,)
classifiers_
list of classifiers trained per partition, set in fit()
Type: List[BaseEstimator] of shape model_count
If you use this classifier please cite the relevant embedding method paper and the label network embedding for multi-label classification paper:
@article{zhang2007ml,
  title={ML-KNN: A lazy learning approach to multi-label learning},
  author={Zhang, Min-Ling and Zhou, Zhi-Hua},
  journal={Pattern recognition},
  volume={40},
  number={7},
  pages={2038--2048},
  year={2007},
  publisher={Elsevier}
}
Example
An example use case for EmbeddingClassifier:
from skmultilearn.embedding import SKLearnEmbedder, EmbeddingClassifier
from sklearn.manifold import SpectralEmbedding
from sklearn.ensemble import RandomForestRegressor
from skmultilearn.adapt import MLkNN

clf = EmbeddingClassifier(
    SKLearnEmbedder(SpectralEmbedding(n_components=10)),
    RandomForestRegressor(n_estimators=10),
    MLkNN(k=5)
)

# X_train, y_train and X_test are assumed to be loaded beforehand
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)
fit(X, y)
Fits classifier to training data
Parameters:
- X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
- y (array_like, numpy.matrix or scipy.sparse matrix of {0, 1}, shape=(n_samples, n_labels)) – binary indicator matrix with label assignments
Returns: fitted instance of self
Return type: self
predict(X)
Predict labels for X
Parameters: X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
Returns: binary indicator matrix with label assignments
Return type: scipy.sparse matrix of {0, 1}, shape=(n_samples, n_labels)
predict_proba(X)
Predict probabilities of label assignments for X
Parameters: X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
Returns: matrix with label assignment probabilities
Return type: scipy.sparse matrix of float in [0.0, 1.0], shape=(n_samples, n_labels)
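Because both predict and predict_proba are documented as returning scipy.sparse matrices, downstream code often needs to densify or threshold the result. The sketch below works on a hand-written probability matrix rather than real classifier output. The 0.5 cut-off is an assumption made for illustration and is not necessarily the rule predict applies, which depends on the base classifier.

```python
import numpy as np
from scipy import sparse

# Hypothetical predict_proba output for 3 samples and 4 labels.
proba = sparse.lil_matrix(np.array([
    [0.9, 0.1, 0.0, 0.6],
    [0.2, 0.7, 0.0, 0.0],
    [0.0, 0.0, 0.8, 0.4],
]))

# Densify, then threshold into a binary indicator matrix (cut-off is an assumption).
proba_dense = proba.toarray()
labels = (proba_dense >= 0.5).astype(int)

# Back to a sparse indicator matrix, the shape documented for predict.
labels_sparse = sparse.csr_matrix(labels)

# Per-sample lists of assigned label indexes.
label_sets = [np.flatnonzero(row).tolist() for row in labels]
```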