skmultilearn.problem_transform package

The skmultilearn.problem_transform module provides classifiers that follow the problem transformation approaches to multi-label classification:

  • BinaryRelevance - treats each label as a separate single-label binary classification problem
  • ClassifierChain - treats each label as a link in a conditioned chain of single-label binary classification problems
  • LabelPowerset - treats each label combination as a separate class of a single multi-class classification problem
class skmultilearn.problem_transform.BinaryRelevance(classifier=None, require_dense=None)[source]

Bases: skmultilearn.base.problem_transformation.ProblemTransformationBase

Binary Relevance Multi-Label Classifier.

Transforms a multi-label classification problem with L labels into L separate single-label binary classification problems, each solved with a clone of the base classifier provided in the constructor. The prediction output is the union of the predictions of all per-label classifiers.
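
The decomposition can be illustrated without the library. The following is a minimal pure-Python sketch of the idea, not skmultilearn's implementation; OneNearestNeighbour, binary_relevance_fit and binary_relevance_predict are hypothetical names introduced only for this illustration.

```python
class OneNearestNeighbour:
    """Hypothetical stand-in base classifier (not part of skmultilearn)."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        def closest(row):
            # target of the training row with the smallest squared distance
            distances = [sum((a - b) ** 2 for a, b in zip(row, seen))
                         for seen in self.X_]
            return self.y_[distances.index(min(distances))]
        return [closest(row) for row in X]


def binary_relevance_fit(X, y, make_classifier):
    n_labels = len(y[0])
    # one independent binary problem per label column
    return [make_classifier().fit(X, [row[j] for row in y])
            for j in range(n_labels)]


def binary_relevance_predict(models, X):
    per_label = [model.predict(X) for model in models]
    # stack the per-label columns back into (n_samples, n_labels)
    return [list(row) for row in zip(*per_label)]


X_train = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y_train = [[0, 0], [0, 1], [1, 0], [1, 1]]
models = binary_relevance_fit(X_train, y_train, OneNearestNeighbour)
predictions = binary_relevance_predict(models, [[0.9, 0.1], [0.1, 0.8]])
print(predictions)
```

Each label gets its own model, so correlations between labels are ignored; that is the trade-off this transformation makes for simplicity.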

Parameters:
  • classifier (sklearn.base.BaseEstimator or compatible) – clonable scikit-compatible base classifier
  • require_dense ([bool, bool]) – whether the base classifier requires dense representations for input features and classes/labels matrices in fit/predict.
BRIEFNAME = 'BR'
fit(X, y)[source]

Fit classifier with training data

Internally this method uses a sparse CSR representation for X (scipy.sparse.csr_matrix) and sparse CSC representation for y (scipy.sparse.csc_matrix).

Parameters:
  • X (dense or sparse matrix (n_samples, n_features)) – input features
  • y (dense or sparse matrix of {0, 1} (n_samples, n_labels)) – binary indicator matrix with label assignments
Returns:

Fitted instance of self

generate_partition(X, y)[source]

Partitions the label space into singletons

Parameters:
  • X – not used
  • y (matrix or sparse matrix) – binary indicator matrix with label assignments, used only to determine the number of labels

Sets self.partition (list of single item lists) and self.model_count (equal to number of labels)
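
As a rough illustration of what a singleton partition looks like, the sketch below builds one from a small label matrix. The function singleton_partition is a hypothetical helper written for this example; it returns the values that the method is described as storing in self.partition and self.model_count.

```python
def singleton_partition(y):
    # only the number of label columns matters, not the assignments
    n_labels = len(y[0])
    partition = [[label] for label in range(n_labels)]
    model_count = len(partition)
    return partition, model_count


y = [[1, 0, 1],
     [0, 1, 1]]
partition, model_count = singleton_partition(y)
print(partition, model_count)
```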

predict(X)[source]

Predict labels for X

Internally this method uses a sparse CSR representation for X (scipy.sparse.csr_matrix).

Parameters: X (dense or sparse matrix (n_samples, n_features)) – input features
Returns: binary indicator matrix with label assignments
Return type: sparse matrix of int (n_samples, n_labels)
predict_proba(X)[source]

Predict probabilities of label assignments for X

Internally this method uses a sparse CSR representation for X (scipy.sparse.csr_matrix).

Parameters: X (dense or sparse matrix (n_samples, n_features)) – input features
Returns: matrix with label assignment probabilities
Return type: sparse matrix of float (n_samples, n_labels)
class skmultilearn.problem_transform.ClassifierChain(classifier=None, require_dense=None)[source]

Bases: skmultilearn.base.problem_transformation.ProblemTransformationBase

Classifier Chains Multi-Label Classifier.

This class provides an implementation of Jesse Read’s problem transformation method called Classifier Chains. For L labels it trains L classifiers ordered in a chain according to the Bayesian chain rule. The first classifier is trained on the input space alone, and each subsequent classifier is trained on the input space extended with the labels handled by all previous classifiers in the chain.

The default classifier chains follow the same ordering as provided in the training set, i.e. label in column 0, then 1, etc.

You can find more information about this method in Jesse Read’s ECML presentation or journal paper.
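
The chaining idea can be sketched in pure Python as follows. This is a simplified illustration under stated assumptions, not the library's code: OneNearestNeighbour, chain_fit and chain_predict are hypothetical names, training is assumed to use the true values of earlier labels, and prediction is assumed to feed each link's output to the next link.

```python
class OneNearestNeighbour:
    """Hypothetical stand-in base classifier (not part of skmultilearn)."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        def closest(row):
            # target of the training row with the smallest squared distance
            distances = [sum((a - b) ** 2 for a, b in zip(row, seen))
                         for seen in self.X_]
            return self.y_[distances.index(min(distances))]
        return [closest(row) for row in X]


def chain_fit(X, y, make_classifier):
    models = []
    for j in range(len(y[0])):
        # link j sees the inputs plus the true values of labels 0..j-1
        augmented = [list(x_row) + list(y_row[:j])
                     for x_row, y_row in zip(X, y)]
        models.append(make_classifier().fit(augmented, [row[j] for row in y]))
    return models


def chain_predict(models, X):
    n_features = len(X[0])
    rows = [list(x_row) for x_row in X]
    for model in models:
        predicted = model.predict(rows)
        # append this link's output so the next link can condition on it
        rows = [row + [p] for row, p in zip(rows, predicted)]
    return [row[n_features:] for row in rows]


X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [[0, 0], [0, 1], [1, 0], [1, 1]]
models = chain_fit(X_train, y_train, OneNearestNeighbour)
predictions = chain_predict(models, [[0.9], [2.2]])
print(predictions)
```

Because later links depend on earlier ones, the column order of y determines the chain order, which is why the default ordering follows the training set.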

BRIEFNAME = 'CC'
fit(X, y)[source]

Fit classifier with training data

Internally this method uses a sparse CSC representation (scipy.sparse.csc_matrix) of the X & y matrices.

Parameters:
  • X (dense or sparse matrix (n_samples, n_features)) – input features
  • y (dense or sparse matrix of {0, 1} (n_samples, n_labels)) – binary indicator matrix with label assignments
Returns:

Fitted instance of self

predict(X)[source]

Predict labels for X

Internally this method uses a sparse CSC representation (scipy.sparse.csc_matrix) of the X matrix.

Parameters: X (dense or sparse matrix (n_samples, n_features)) – input features
Returns: binary indicator matrix with label assignments
Return type: sparse matrix of int (n_samples, n_labels)
predict_proba(X)[source]

Predict probabilities of label assignments for X

Internally this method uses a sparse CSC representation (scipy.sparse.csc_matrix) of the X matrix.

Parameters: X (dense or sparse matrix (n_samples, n_features)) – input features
Returns: matrix with label assignment probabilities
Return type: sparse matrix of float (n_samples, n_labels)
class skmultilearn.problem_transform.LabelPowerset(classifier=None, require_dense=None)[source]

Bases: skmultilearn.base.problem_transformation.ProblemTransformationBase

Label Powerset Multi-Label Classifier.

Label Powerset is a problem transformation approach to multi-label classification that transforms a multi-label problem to a multi-class problem with 1 multi-class classifier trained on all unique label combinations found in the training data.

More information about this method can be found in an introduction to multi-label classification by Tsoumakas et al.

BRIEFNAME = 'LP'
clean()[source]

Reset classifier internals before refitting

fit(X, y)[source]

Fit classifier with training data

Internally this method uses a sparse CSR representation (scipy.sparse.csr_matrix) of the X matrix and a sparse LIL representation (scipy.sparse.lil_matrix) of the y matrix.

Parameters:
  • X (dense or sparse matrix (n_samples, n_features)) – input features
  • y (dense or sparse matrix of {0, 1} (n_samples, n_labels)) – binary indicator matrix with label assignments
Returns:

Fitted instance of self

inverse_transform(y)[source]

Transforms multi-class assignment to multi-label

Transforms a single-label multi-class assignment back into a multi-label assignment by mapping each class to the label combination it represents.

Parameters: y (numpy.ndarray of size n_samples) – multi-class output space as produced by transform()
Returns: assignments following the label combinations of the original multi-label classification problem
Return type: sparse indicator matrix of {0, 1} of shape (n_samples, n_labels)
predict(X)[source]

Predict labels for X

Internally this method uses a sparse CSR representation for X (scipy.sparse.csr_matrix).

Parameters: X (dense or sparse matrix (n_samples, n_features)) – input features
Returns: binary indicator matrix with label assignments
Return type: sparse matrix of int (n_samples, n_labels)
predict_proba(X)[source]

Predict probabilities of label assignments for X

Internally this method uses a sparse CSR representation for X (scipy.sparse.csr_matrix).

Parameters: X (dense or sparse matrix (n_samples, n_features)) – input features
Returns: matrix with label assignment probabilities
Return type: sparse matrix of float (n_samples, n_labels)
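
One natural way to turn the multi-class probabilities of the underlying classifier into per-label probabilities is to add each class probability to every label active in that class's label combination. The sketch below assumes this scheme for illustration; label_probabilities and the class_to_combination mapping are hypothetical names, and the library's internals may differ.

```python
def label_probabilities(class_probabilities, class_to_combination, n_labels):
    result = []
    for row in class_probabilities:
        label_row = [0.0] * n_labels
        for class_id, probability in enumerate(row):
            # every label active in this combination receives the class mass
            for label in class_to_combination[class_id]:
                label_row[label] += probability
        result.append(label_row)
    return result


# class id -> indices of the labels that are set in that combination
class_to_combination = {0: [], 1: [0], 2: [0, 2]}
class_probabilities = [[0.5, 0.25, 0.25],
                       [0.0, 0.0, 1.0]]
probabilities = label_probabilities(class_probabilities, class_to_combination, 3)
print(probabilities)
```

Under this scheme the values in one row are label marginals and need not sum to 1.
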
transform(y)[source]

Transform multi-label output space to multi-class

Transforms a multi-label problem into a single-label multi-class problem where each label combination is a separate class.

Parameters: y (matrix) – output space of a multi-label classification problem, of shape (n_samples, n_labels) with values in {0, 1}
Returns: a numpy array with the multi-class output space
Return type: numpy.ndarray of int of size n_samples
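
The round trip between the two output spaces can be sketched in pure Python. This is an illustration only: lp_transform and lp_inverse_transform are hypothetical helpers, plain lists stand in for matrices, and numbering classes in order of first appearance is an assumption rather than the library's documented encoding.

```python
def lp_transform(y):
    combination_to_class = {}
    classes = []
    for row in y:
        key = tuple(row)
        # a label combination not seen before gets the next free class id
        if key not in combination_to_class:
            combination_to_class[key] = len(combination_to_class)
        classes.append(combination_to_class[key])
    return classes, combination_to_class


def lp_inverse_transform(classes, combination_to_class):
    class_to_combination = {class_id: list(combination)
                            for combination, class_id in combination_to_class.items()}
    return [class_to_combination[class_id] for class_id in classes]


y = [[1, 0, 1],
     [0, 1, 0],
     [1, 0, 1],
     [0, 0, 0]]
classes, mapping = lp_transform(y)
restored = lp_inverse_transform(classes, mapping)
print(classes, restored)
```

Only combinations present in the training data receive a class, so a model built this way cannot predict a label combination it has never seen.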

Submodules

skmultilearn.problem_transform.br module

class skmultilearn.problem_transform.br.BinaryRelevance(classifier=None, require_dense=None)[source]

Bases: skmultilearn.base.problem_transformation.ProblemTransformationBase

Binary Relevance Multi-Label Classifier.

Transforms a multi-label classification problem with L labels into L separate single-label binary classification problems, each solved with a clone of the base classifier provided in the constructor. The prediction output is the union of the predictions of all per-label classifiers.

Parameters:
  • classifier (sklearn.base.BaseEstimator or compatible) – clonable scikit-compatible base classifier
  • require_dense ([bool, bool]) – whether the base classifier requires dense representations for input features and classes/labels matrices in fit/predict.
BRIEFNAME = 'BR'
fit(X, y)[source]

Fit classifier with training data

Internally this method uses a sparse CSR representation for X (scipy.sparse.csr_matrix) and sparse CSC representation for y (scipy.sparse.csc_matrix).

Parameters:
  • X (dense or sparse matrix (n_samples, n_features)) – input features
  • y (dense or sparse matrix of {0, 1} (n_samples, n_labels)) – binary indicator matrix with label assignments
Returns:

Fitted instance of self

generate_partition(X, y)[source]

Partitions the label space into singletons

Parameters:
  • X – not used
  • y (matrix or sparse matrix) – binary indicator matrix with label assignments, used only to determine the number of labels

Sets self.partition (list of single item lists) and self.model_count (equal to number of labels)

predict(X)[source]

Predict labels for X

Internally this method uses a sparse CSR representation for X (scipy.sparse.csr_matrix).

Parameters: X (dense or sparse matrix (n_samples, n_features)) – input features
Returns: binary indicator matrix with label assignments
Return type: sparse matrix of int (n_samples, n_labels)
predict_proba(X)[source]

Predict probabilities of label assignments for X

Internally this method uses a sparse CSR representation for X (scipy.sparse.csr_matrix).

Parameters: X (dense or sparse matrix (n_samples, n_features)) – input features
Returns: matrix with label assignment probabilities
Return type: sparse matrix of float (n_samples, n_labels)

skmultilearn.problem_transform.cc module

class skmultilearn.problem_transform.cc.ClassifierChain(classifier=None, require_dense=None)[source]

Bases: skmultilearn.base.problem_transformation.ProblemTransformationBase

Classifier Chains Multi-Label Classifier.

This class provides an implementation of Jesse Read’s problem transformation method called Classifier Chains. For L labels it trains L classifiers ordered in a chain according to the Bayesian chain rule. The first classifier is trained on the input space alone, and each subsequent classifier is trained on the input space extended with the labels handled by all previous classifiers in the chain.

The default classifier chains follow the same ordering as provided in the training set, i.e. label in column 0, then 1, etc.

You can find more information about this method in Jesse Read’s ECML presentation or journal paper.

BRIEFNAME = 'CC'
fit(X, y)[source]

Fit classifier with training data

Internally this method uses a sparse CSC representation (scipy.sparse.csc_matrix) of the X & y matrices.

Parameters:
  • X (dense or sparse matrix (n_samples, n_features)) – input features
  • y (dense or sparse matrix of {0, 1} (n_samples, n_labels)) – binary indicator matrix with label assignments
Returns:

Fitted instance of self

predict(X)[source]

Predict labels for X

Internally this method uses a sparse CSC representation (scipy.sparse.csc_matrix) of the X matrix.

Parameters: X (dense or sparse matrix (n_samples, n_features)) – input features
Returns: binary indicator matrix with label assignments
Return type: sparse matrix of int (n_samples, n_labels)
predict_proba(X)[source]

Predict probabilities of label assignments for X

Internally this method uses a sparse CSC representation (scipy.sparse.csc_matrix) of the X matrix.

Parameters: X (dense or sparse matrix (n_samples, n_features)) – input features
Returns: matrix with label assignment probabilities
Return type: sparse matrix of float (n_samples, n_labels)

skmultilearn.problem_transform.lp module

class skmultilearn.problem_transform.lp.LabelPowerset(classifier=None, require_dense=None)[source]

Bases: skmultilearn.base.problem_transformation.ProblemTransformationBase

Label Powerset Multi-Label Classifier.

Label Powerset is a problem transformation approach to multi-label classification that transforms a multi-label problem to a multi-class problem with 1 multi-class classifier trained on all unique label combinations found in the training data.

More information about this method can be found in an introduction to multi-label classification by Tsoumakas et al.

BRIEFNAME = 'LP'
clean()[source]

Reset classifier internals before refitting

fit(X, y)[source]

Fit classifier with training data

Internally this method uses a sparse CSR representation (scipy.sparse.csr_matrix) of the X matrix and a sparse LIL representation (scipy.sparse.lil_matrix) of the y matrix.

Parameters:
  • X (dense or sparse matrix (n_samples, n_features)) – input features
  • y (dense or sparse matrix of {0, 1} (n_samples, n_labels)) – binary indicator matrix with label assignments
Returns:

Fitted instance of self

inverse_transform(y)[source]

Transforms multi-class assignment to multi-label

Transforms a single-label multi-class assignment back into a multi-label assignment by mapping each class to the label combination it represents.

Parameters: y (numpy.ndarray of size n_samples) – multi-class output space as produced by transform()
Returns: assignments following the label combinations of the original multi-label classification problem
Return type: sparse indicator matrix of {0, 1} of shape (n_samples, n_labels)
predict(X)[source]

Predict labels for X

Internally this method uses a sparse CSR representation for X (scipy.sparse.csr_matrix).

Parameters: X (dense or sparse matrix (n_samples, n_features)) – input features
Returns: binary indicator matrix with label assignments
Return type: sparse matrix of int (n_samples, n_labels)
predict_proba(X)[source]

Predict probabilities of label assignments for X

Internally this method uses a sparse CSR representation for X (scipy.sparse.csr_matrix).

Parameters: X (dense or sparse matrix (n_samples, n_features)) – input features
Returns: matrix with label assignment probabilities
Return type: sparse matrix of float (n_samples, n_labels)
transform(y)[source]

Transform multi-label output space to multi-class

Transforms a multi-label problem into a single-label multi-class problem where each label combination is a separate class.

Parameters: y (matrix) – output space of a multi-label classification problem, of shape (n_samples, n_labels) with values in {0, 1}
Returns: a numpy array with the multi-class output space
Return type: numpy.ndarray of int of size n_samples