Binary Relevance
class skmultilearn.problem_transform.BinaryRelevance(classifier=None, require_dense=None)
Bases: skmultilearn.base.problem_transformation.ProblemTransformationBase
Performs classification per label.
Transforms a multi-label classification problem with L labels into L separate single-label binary classification problems, each using the same base classifier provided in the constructor. The prediction output is the union of the predictions of all per-label classifiers.
Parameters:
- classifier (BaseEstimator) – scikit-learn compatible base classifier
- require_dense ([bool, bool], optional) – whether the base classifier requires dense representations for input features and classes/labels matrices in fit/predict. If a value is not provided, sparse representations are used if the base classifier is an instance of MLClassifierBase, and dense otherwise.
- partition_ – list of lists of label indexes, used to index the output space matrix, set in _generate_partition() via fit()
  Type: List[List[int]], shape=(model_count_,)
- classifiers_ – list of classifiers trained per partition, set in fit()
  Type: List[BaseEstimator], shape=(model_count_,)
Notes
Note: This is one of the most basic approaches to multi-label classification. It ignores relationships between labels.
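The per-label transformation described above can be sketched by hand with plain scikit-learn. This is only an illustration of the idea, not the library's implementation: the helper names `fit_binary_relevance` and `predict_binary_relevance`, the toy data, and the choice of `LogisticRegression` as base classifier are all assumptions made for the example.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression


def fit_binary_relevance(base_classifier, X, y):
    """Fit one independent copy of base_classifier per label column of y."""
    return [clone(base_classifier).fit(X, y[:, j]) for j in range(y.shape[1])]


def predict_binary_relevance(classifiers, X):
    """Stack per-label predictions into an (n_samples, n_labels) matrix."""
    return np.column_stack([clf.predict(X) for clf in classifiers])


# Toy data (made up): label 0 follows feature 0, label 1 follows feature 1.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]] * 5)
y = (X > 0.5).astype(int)

models = fit_binary_relevance(LogisticRegression(), X, y)
predictions = predict_binary_relevance(models, X)
print(predictions.shape)  # (20, 2)
```

Because each column is fitted in isolation, no classifier ever sees the other labels, which is why correlations between labels are lost.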
Examples
An example use case for Binary Relevance classification with an sklearn.svm.SVC base classifier, which supports sparse input:

    from skmultilearn.problem_transform import BinaryRelevance
    from sklearn.svm import SVC

    # initialize Binary Relevance multi-label classifier
    # with an SVM classifier
    # SVM in scikit-learn supports sparse representation only for the X matrix
    classifier = BinaryRelevance(
        classifier=SVC(),
        require_dense=[False, True]
    )

    # train
    classifier.fit(X_train, y_train)

    # predict
    predictions = classifier.predict(X_test)
Another way to use this classifier is to select the best scenario from a set of single-label classifiers used with Binary Relevance. This can be done using cross-validated grid search. In the example below, the model with the highest accuracy is selected from either a sklearn.naive_bayes.MultinomialNB or a sklearn.svm.SVC base classifier, along with the best parameters for that base classifier.

    from skmultilearn.problem_transform import BinaryRelevance
    from sklearn.model_selection import GridSearchCV
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.svm import SVC

    parameters = [
        {
            'classifier': [MultinomialNB()],
            'classifier__alpha': [0.7, 1.0],
        },
        {
            'classifier': [SVC()],
            'classifier__kernel': ['rbf', 'linear'],
        },
    ]

    clf = GridSearchCV(BinaryRelevance(), parameters, scoring='accuracy')
    clf.fit(x, y)

    print(clf.best_params_, clf.best_score_)

    # result:
    #
    # {
    #   'classifier': SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    #   decision_function_shape='ovr', degree=3, gamma='auto', kernel='linear',
    #   max_iter=-1, probability=False, random_state=None, shrinking=True,
    #   tol=0.001, verbose=False), 'classifier__kernel': 'linear'
    # } 0.17
- fit(X, y)
  Fits classifier to training data
  Parameters:
  - X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
  - y (array_like, numpy.matrix or scipy.sparse matrix of {0, 1}, shape=(n_samples, n_labels)) – binary indicator matrix with label assignments
  Returns: fitted instance of self
  Return type: self
  Notes
  Note: Input matrices are converted to sparse format internally if a numpy representation is passed.
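The binary indicator matrix expected for y can be built from per-sample label sets. The sketch below uses scikit-learn's `MultiLabelBinarizer`. The label names are hypothetical and only there to show the shape and encoding.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical label sets, one tuple per sample (an empty tuple means no labels).
label_sets = [("news", "sports"), ("sports",), (), ("news", "politics")]

binarizer = MultiLabelBinarizer()
y = binarizer.fit_transform(label_sets)  # dense array of {0, 1}

print(list(binarizer.classes_))  # column order: ['news', 'politics', 'sports']
print(y.shape)                   # (4, 3), i.e. (n_samples, n_labels)
```

Each row of y corresponds to one sample and each column to one label, matching the shape=(n_samples, n_labels) requirement above.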
- predict(X)
  Predict labels for X
  Parameters: X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
  Returns: binary indicator matrix with label assignments
  Return type: scipy.sparse matrix of {0, 1}, shape=(n_samples, n_labels)
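Since the return type is a scipy.sparse matrix, it often needs converting before inspection. The sketch below shows one way to read label indexes out of such a matrix. The `predictions` matrix is a made-up stand-in for the output of predict(), not real model output.

```python
import numpy as np
from scipy import sparse

# Stand-in for the matrix returned by predict(); values are invented for illustration.
predictions = sparse.csr_matrix(np.array([[1, 0, 1], [0, 0, 0], [0, 1, 1]]))

dense = predictions.toarray()  # (n_samples, n_labels) numpy array of {0, 1}

# Indexes of the labels assigned to each sample.
labels_per_sample = [row.nonzero()[0].tolist() for row in dense]
print(labels_per_sample)  # [[0, 2], [], [1, 2]]
```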
- predict_proba(X)
  Predict probabilities of label assignments for X
  Parameters: X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
  Returns: matrix with label assignment probabilities
  Return type: scipy.sparse matrix of float in [0.0, 1.0], shape=(n_samples, n_labels)
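A probability matrix of this shape can be turned into label assignments by thresholding. In the sketch below, the `probabilities` values and the 0.5 cut-off are assumptions chosen for illustration. The source does not state which threshold, if any, predict() applies.

```python
import numpy as np
from scipy import sparse

# Stand-in for the matrix returned by predict_proba(); values are invented.
probabilities = sparse.csr_matrix(np.array([[0.9, 0.2], [0.4, 0.6]]))

threshold = 0.5  # hypothetical cut-off, tune per application
assignments = (probabilities.toarray() >= threshold).astype(int)
print(assignments.tolist())  # [[1, 0], [0, 1]]
```

Raising the threshold trades recall for precision on every label at once. Per-label thresholds are a possible variation.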