Binary Relevance

class skmultilearn.problem_transform.BinaryRelevance(classifier=None, require_dense=None)[source]

Bases: skmultilearn.base.problem_transformation.ProblemTransformationBase

Performs classification per label

Transforms a multi-label classification problem with L labels into L separate single-label binary classification problems, each using the same base classifier provided in the constructor. The prediction output is the union of all per-label classifiers' outputs.
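Conceptually, the decomposition can be sketched as follows (an illustrative sketch rather than the library's internal code; the helper names are hypothetical and dense numpy arrays are assumed):

import numpy as np
from sklearn.base import clone

def binary_relevance_fit(base_classifier, X, Y):
    # hypothetical helper: train one independent copy of the base classifier
    # per label column of the binary indicator matrix Y (n_samples, n_labels)
    return [clone(base_classifier).fit(X, Y[:, i]) for i in range(Y.shape[1])]

def binary_relevance_predict(classifiers, X):
    # the multi-label prediction is the column-wise union of all per-label outputs
    return np.column_stack([clf.predict(X) for clf in classifiers])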

Parameters:
  • classifier (BaseEstimator) – scikit-learn compatible base classifier
  • require_dense ([bool, bool], optional) – whether the base classifier requires dense representations for the input feature and class/label matrices in fit/predict. If not provided, sparse representations are used when the base classifier is an instance of MLClassifierBase and dense representations otherwise.
model_count_

number of trained models; for this classifier it equals n_labels

Type: int
partition_

list of lists of label indexes, used to index the output space matrix; set in _generate_partition() via fit()

Type: List[List[int]], shape=(model_count_,)
classifiers_

list of classifiers trained per partition, set in fit()

Type: List[BaseEstimator], shape=(model_count_,)
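These attributes can be inspected after fitting; for example (a small illustrative snippet on a toy dense dataset):

import numpy as np
from sklearn.svm import SVC
from skmultilearn.problem_transform import BinaryRelevance

X = np.random.rand(4, 6)              # 4 samples, 6 features
y = np.array([[1, 0, 1],              # 3 labels, each column contains both classes
              [0, 1, 0],
              [1, 1, 0],
              [0, 0, 1]])

br = BinaryRelevance(classifier=SVC(), require_dense=[True, True])
br.fit(X, y)

print(br.model_count_)       # 3 -- one trained model per label
print(br.partition_)         # label index partition used to slice the output space
print(len(br.classifiers_))  # 3 fitted SVC instances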

Notes

Note

This is one of the most basic approaches to multi-label classification; it ignores relationships between labels.

Examples

An example use case for Binary Relevance classification with an sklearn.svm.SVC base classifier, which supports sparse input:

from skmultilearn.problem_transform import BinaryRelevance
from sklearn.svm import SVC

# initialize Binary Relevance multi-label classifier
# with an SVM classifier
# SVC in scikit-learn accepts the X matrix in sparse representation,
# so features are kept sparse while labels are passed densely

classifier = BinaryRelevance(
    classifier=SVC(),
    require_dense=[False, True]
)

# train
classifier.fit(X_train, y_train)

# predict
predictions = classifier.predict(X_test)

Another way to use this classifier is to select the best base classifier from a set of single-label classifiers used with Binary Relevance; this can be done with cross-validation grid search. In the example below, the model with the highest accuracy is selected from either a sklearn.naive_bayes.MultinomialNB or a sklearn.svm.SVC base classifier, along with the best parameters for that base classifier.

from skmultilearn.problem_transform import BinaryRelevance
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC

parameters = [
    {
        'classifier': [MultinomialNB()],
        'classifier__alpha': [0.7, 1.0],
    },
    {
        'classifier': [SVC()],
        'classifier__kernel': ['rbf', 'linear'],
    },
]

clf = GridSearchCV(BinaryRelevance(), parameters, scoring='accuracy')
clf.fit(X, y)

print(clf.best_params_, clf.best_score_)

# result:
#
# {
#   'classifier': SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
#   decision_function_shape='ovr', degree=3, gamma='auto', kernel='linear',
#   max_iter=-1, probability=False, random_state=None, shrinking=True,
#   tol=0.001, verbose=False), 'classifier__kernel': 'linear'
# } 0.17
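With the default refit=True, the estimator found by the search is refitted on the whole dataset and can be used directly for prediction, e.g. (assuming a held-out X_test matrix):

# use the refitted best estimator for prediction on new data
best_clf = clf.best_estimator_
predictions = best_clf.predict(X_test)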
fit(X, y)[source]

Fits classifier to training data

Parameters:
  • X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
  • y (array_like, numpy.matrix or scipy.sparse matrix of {0, 1}, shape=(n_samples, n_labels)) – binary indicator matrix with label assignments
Returns:

fitted instance of self

Return type:

self

Notes

Note

Input matrices are converted to sparse format internally if a numpy representation is passed
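For example, both dense and already-sparse inputs are accepted (a brief sketch; X_train and y_train are assumed to be defined as in the examples above):

import numpy as np
from scipy import sparse

# dense numpy inputs are converted to sparse format internally
classifier.fit(np.asarray(X_train), np.asarray(y_train))

# already-sparse inputs can be passed directly
classifier.fit(sparse.csr_matrix(X_train), sparse.lil_matrix(y_train))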

predict(X)[source]

Predict labels for X

Parameters: X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
Returns: binary indicator matrix with label assignments
Return type: scipy.sparse matrix of {0, 1}, shape=(n_samples, n_labels)
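Because the result is returned as a scipy.sparse matrix, a dense numpy view can be obtained with toarray(), e.g.:

predictions = classifier.predict(X_test)
# convert the sparse indicator matrix to a dense numpy array if needed
dense_predictions = predictions.toarray()   # shape (n_samples, n_labels)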
predict_proba(X)[source]

Predict probabilities of label assignments for X

Parameters: X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
Returns: matrix with label assignment probabilities
Return type: scipy.sparse matrix of float in [0.0, 1.0], shape=(n_samples, n_labels)
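A common follow-up is to threshold the returned probabilities to obtain hard label assignments (a sketch; the 0.5 cut-off is an arbitrary choice):

probabilities = classifier.predict_proba(X_test)
# threshold each per-label probability at 0.5 to get a binary indicator matrix
hard_assignments = (probabilities.toarray() >= 0.5).astype(int)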