Twin Multi-Label Support Vector Machines
class skmultilearn.adapt.MLTSVM(c_k=0, sor_omega=1.0, threshold=1e-06, lambda_param=1.0, max_iteration=500)[source]

Twin multi-label Support Vector Machines.
Parameters:
- c_k (float) – the empirical risk penalty parameter that determines the trade-off between the loss terms
- sor_omega (float (default is 1.0)) – the smoothing parameter used in successive overrelaxation
- threshold (float (default is 1e-6)) – threshold above which a label is assigned
- lambda_param (float (default is 1.0)) – the regularization parameter
- max_iteration (int (default is 500)) – maximum number of iterations to use in successive overrelaxation
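For orientation, here is a minimal construction sketch spelling out every constructor parameter; the values are plausible defaults rather than tuned choices (the c_k value mirrors the example further below):

from skmultilearn.adapt import MLTSVM

classifier = MLTSVM(
    c_k=2**-1,          # empirical risk penalty
    sor_omega=1.0,      # successive overrelaxation smoothing parameter
    threshold=1e-6,     # decision threshold for assigning a label
    lambda_param=1.0,   # regularization strength
    max_iteration=500,  # cap on successive overrelaxation iterations
)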
References
If you use this classifier, please cite the original paper introducing the method:
@article{chen2016mltsvm,
  title={MLTSVM: a novel twin support vector machine to multi-label learning},
  author={Chen, Wei-Jie and Shao, Yuan-Hai and Li, Chun-Na and Deng, Nai-Yang},
  journal={Pattern Recognition},
  volume={52},
  pages={61--74},
  year={2016},
  publisher={Elsevier}
}
Examples
Here’s a very simple example of using MLTSVM with a fixed empirical risk penalty c_k:
from skmultilearn.adapt import MLTSVM

classifier = MLTSVM(c_k=2**-1)

# train
classifier.fit(X_train, y_train)

# predict
predictions = classifier.predict(X_test)
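The snippet assumes X_train, y_train, and X_test already exist. One way to obtain them is skmultilearn's bundled benchmark loader; a sketch, assuming the 'emotions' dataset can be downloaded:

from skmultilearn.dataset import load_dataset

# Each call returns (X, y, feature_names, label_names).
X_train, y_train, _, _ = load_dataset('emotions', 'train')
X_test, y_test, _, _ = load_dataset('emotions', 'test')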
You can also use GridSearchCV to find an optimal set of parameters:

from skmultilearn.adapt import MLTSVM
from sklearn.model_selection import GridSearchCV

parameters = {'c_k': [2**i for i in range(-5, 5, 2)]}
score = 'f1_macro'

clf = GridSearchCV(MLTSVM(), parameters, scoring=score)
clf.fit(X, y)

print(clf.best_params_, clf.best_score_)

# output
{'c_k': 0.03125} 0.347518217573
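Since c_k acts as a penalty parameter, the grid above is log-spaced in powers of two, the conventional way to search such parameters; widen the exponent range if the best value lands on an edge of the grid.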
fit(X, y)[source]

Fit the classifier to training data and return a fitted instance of self.

Parameters:
- X (numpy.ndarray or scipy.sparse) – input features, can be a dense or sparse matrix of size (n_samples, n_features)
- y (numpy.ndarray or scipy.sparse {0,1}) – binary indicator matrix with label assignments

Returns: fitted instance of self
Return type: MLTSVM
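A minimal fit sketch on toy random data; the shapes are illustrative and the labels are meaningless, so this only demonstrates the accepted input types, not a meaningful model:

import numpy as np
import scipy.sparse as sp
from skmultilearn.adapt import MLTSVM

rng = np.random.RandomState(0)
X = sp.csr_matrix(rng.rand(20, 5))                  # 20 samples, 5 features
y = sp.csr_matrix(rng.randint(0, 2, size=(20, 3)))  # 3 binary labels

clf = MLTSVM(c_k=2**-1)
clf.fit(X, y)  # returns the fitted classifier itself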
get_params(deep=True)

Get parameters of this classifier and its sub-objects.

Introspection of the classifier for search models like cross-validation and grid search.

Parameters: deep (bool) – if True, parameters of sub-objects are also introspected and appended to the output dictionary
Returns: out – dictionary of all parameters and their values. If deep=True, the dictionary also holds the parameters of the parameters
Return type: dict
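A quick introspection sketch; the printed keys mirror the constructor parameters (the output shown is what one would expect, not captured from a run):

from skmultilearn.adapt import MLTSVM

clf = MLTSVM(c_k=2**-1, max_iteration=200)
print(clf.get_params())
# expected: {'c_k': 0.5, 'sor_omega': 1.0, 'threshold': 1e-06,
#            'lambda_param': 1.0, 'max_iteration': 200}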
predict(X)[source]

Predict labels for X.

Parameters: X (numpy.ndarray or scipy.sparse.csc_matrix) – input features of shape (n_samples, n_features)
Returns: binary indicator matrix with label assignments, of shape (n_samples, n_labels)
Return type: scipy.sparse matrix of int
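Continuing the fit sketch above: the returned matrix is sparse, so densify it for inspection (clf and X as defined earlier):

predictions = clf.predict(X)
print(predictions.toarray())  # dense (n_samples, n_labels) 0/1 view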
score(X, y, sample_weight=None)

Returns the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set be predicted correctly for each sample.

Parameters:
- X (array-like, shape = (n_samples, n_features)) – test samples
- y (array-like, shape = (n_samples) or (n_samples, n_outputs)) – true labels for X
- sample_weight (array-like, shape = [n_samples], optional) – sample weights

Returns: score – mean accuracy of self.predict(X) w.r.t. y
Return type: float
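For intuition, subset accuracy coincides with scikit-learn's accuracy_score on label indicator matrices, since both count only exact label-set matches. A sketch reusing clf, X, and y from the fit example, densifying y for the inherited scikit-learn scorer:

from sklearn.metrics import accuracy_score

# Both lines compute exact-match (subset) accuracy over full label sets.
print(clf.score(X, y.toarray()))
print(accuracy_score(y.toarray(), clf.predict(X).toarray()))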