5. Multi-label deep learning with scikit-multilearn¶
Deep learning methods have expanded rapidly in the Python community, with many tutorials on performing classification using neural networks; however, few out-of-the-box solutions exist for multi-label classification with deep learning. scikit-multilearn allows you to deploy single-class and multi-class DNNs to solve multi-label problems via problem transformation methods. Two main deep learning frameworks exist for Python: Keras and PyTorch; this tutorial shows how to use either of them for multi-label problems with scikit-multilearn. Let’s start with loading some data.
In [1]:
import numpy
import sklearn.metrics as metrics
from skmultilearn.dataset import load_dataset
X_train, y_train, feature_names, label_names = load_dataset('emotions', 'train')
X_test, y_test, _, _ = load_dataset('emotions', 'test')
emotions:train - exists, not redownloading
emotions:test - exists, not redownloading
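Before building any models, it is worth a quick look at what the problem transformation methods will receive; a minimal sanity check (the exact shapes depend on the train/test split downloaded above):

# X and y are scipy sparse matrices: (n_samples, n_features) and (n_samples, n_labels)
print(X_train.shape, y_train.shape)
print(label_names)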
5.1. Keras¶
Keras is a neural network library that supports multiple backends, most notably the well-established TensorFlow, but also CNTK, which is popular on Windows. As scikit-multilearn supports Windows, Linux and macOS, you can use a backend of your choice, as described in the backend selection tutorial. To install Keras run:
pip install -U keras
5.1.1. Single-class Keras classifier¶
We train a two-layer neural network using Keras with TensorFlow as the backend (feel free to use others). The network is a fairly simple 12 x 8 RELU stack that finishes with a sigmoid activation, optimized via binary cross-entropy. This is a case from the Keras example page. Note that the model creation function must create a model that accepts an input dimension and outputs a relevant output dimension. The Keras wrapper from scikit-multilearn will pass the relevant dimensions upon fitting.
In [2]:
from keras.models import Sequential
from keras.layers import Dense
def create_model_single_class(input_dim, output_dim):
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=input_dim, activation='relu'))
    model.add(Dense(8, activation='relu'))
    model.add(Dense(output_dim, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
Using TensorFlow backend.
Let’s use it with a problem transformation method that converts multi-label classification problems into single-label, single-class problems, e.g. Binary Relevance, which trains one classifier per label. We will use 10 epochs and disable verbosity.
In [8]:
from skmultilearn.problem_transform import BinaryRelevance
from skmultilearn.ext import Keras
KERAS_PARAMS = dict(epochs=10, batch_size=100, verbose=0)
clf = BinaryRelevance(classifier=Keras(create_model_single_class, False, KERAS_PARAMS), require_dense=[True,True])
clf.fit(X_train, y_train)
result = clf.predict(X_test)
metrics.accuracy_score(y_test, result)
Out[8]:
0.42574257425742573
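Subset accuracy, as reported above, only credits a prediction when the whole label set matches. For a per-label view, Hamming loss is a common complement; a minimal sketch using the sklearn.metrics module imported earlier:

# fraction of misclassified label assignments, lower is better
metrics.hamming_loss(y_test, result)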
5.1.2. Multi-class Keras classifier¶
We now train a multi-class neural network using Keras with TensorFlow as the backend (feel free to use others), optimized via categorical cross-entropy. This is a case from the Keras multi-class tutorial. Note again that the model creation function must create a model that accepts an input dimension and outputs a relevant output dimension. The Keras wrapper from scikit-multilearn will pass the relevant dimensions upon fitting.
In [9]:
def create_model_multiclass(input_dim, output_dim):
    # create model
    model = Sequential()
    model.add(Dense(8, input_dim=input_dim, activation='relu'))
    model.add(Dense(output_dim, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
We use the Label Powerset multi-label to multi-class transformation approach, but this can also be used with all the advanced label space division methods available in scikit-multilearn. Note that we set the second parameter of our Keras wrapper to True, as the base problem is now multi-class.
In [10]:
from skmultilearn.problem_transform import LabelPowerset
clf = LabelPowerset(classifier=Keras(create_model_multiclass, True, KERAS_PARAMS), require_dense=[True,True])
clf.fit(X_train,y_train)
y_pred = clf.predict(X_test)
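The Label Powerset model can be scored the same way; a minimal sketch reusing the metrics module imported in the first cell (exact values vary between runs due to random weight initialization):

# subset accuracy: a sample counts as correct only if every label matches
metrics.accuracy_score(y_test, y_pred)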
5.2. Pytorch¶
PyTorch is another frequently used library. It is compatible with scikit-multilearn via the skorch wrapping library. To use it, you must first install the required libraries:
pip install -U skorch torch
To start, import:
In [47]:
import torch
from torch import nn
import torch.nn.functional as F
from skorch import NeuralNetClassifier
5.2.1. Single-class pytorch classifier¶
We train a two-layer neural network using PyTorch, based on a simple example from the PyTorch example page. Note that the model’s first layer has to agree in size with the input data, and the model’s last layer has two dimensions, as there are two classes: 0 or 1.
In [99]:
input_dim = X_train.shape[1]
In [100]:
class SingleClassClassifierModule(nn.Module):
    def __init__(
            self,
            num_units=10,
            nonlin=F.relu,
            dropout=0.5,
    ):
        super(SingleClassClassifierModule, self).__init__()
        self.num_units = num_units
        self.dense0 = nn.Linear(input_dim, num_units)
        self.dense1 = nn.Linear(num_units, 10)
        self.output = nn.Linear(10, 2)

    def forward(self, X, **kwargs):
        X = F.relu(self.dense0(X))
        X = F.relu(self.dense1(X))
        X = torch.sigmoid(self.output(X))
        return X
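Before handing the module to skorch, a quick forward pass on random data confirms the wiring; a minimal sketch (the probe variable and the batch size of 4 are purely illustrative):

# the output should have one column per class, i.e. torch.Size([4, 2])
probe = SingleClassClassifierModule()
print(probe(torch.rand(4, input_dim)).shape)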
We now wrap the model with skorch and use scikit-multilearn for Binary Relevance classification.
In [101]:
net = NeuralNetClassifier(
    SingleClassClassifierModule,
    max_epochs=20,
    verbose=0
)
In [96]:
from skmultilearn.problem_transform import BinaryRelevance
clf = BinaryRelevance(classifier=net, require_dense=[True,True])
clf.fit(X_train.astype(numpy.float32),y_train)
y_pred = clf.predict(X_test.astype(numpy.float32))
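As with the Keras variant, the predictions can be scored with sklearn metrics; a minimal sketch:

# subset accuracy and per-label Hamming loss for the PyTorch-based model
print(metrics.accuracy_score(y_test, y_pred))
print(metrics.hamming_loss(y_test, y_pred))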
5.2.2. Multi-class pytorch classifier¶
Similarly, we can train a multi-class DNN; this time the last layer must agree in size with the number of classes.
In [102]:
nodes = 8
input_dim = X_train.shape[1]
hidden_dim = int(input_dim / nodes)
# Label Powerset assigns one class per unique label combination in y_train
output_dim = len(numpy.unique(y_train.rows))
In [103]:
class MultiClassClassifierModule(nn.Module):
    def __init__(
            self,
            input_dim=input_dim,
            hidden_dim=hidden_dim,
            output_dim=output_dim,
            dropout=0.5,
    ):
        super(MultiClassClassifierModule, self).__init__()
        self.dropout = nn.Dropout(dropout)
        self.hidden = nn.Linear(input_dim, hidden_dim)
        self.output = nn.Linear(hidden_dim, output_dim)

    def forward(self, X, **kwargs):
        X = F.relu(self.hidden(X))
        X = self.dropout(X)
        X = F.softmax(self.output(X), dim=-1)
        return X
Now let’s skorch-wrap it:
In [104]:
net = NeuralNetClassifier(
    MultiClassClassifierModule,
    max_epochs=20,
    verbose=0
)
In [105]:
from skmultilearn.problem_transform import LabelPowerset
clf = LabelPowerset(classifier=net, require_dense=[True,True])
clf.fit(X_train.astype(numpy.float32),y_train)
y_pred = clf.predict(X_test.astype(numpy.float32))
/opt/conda/lib/python3.6/site-packages/sklearn/model_selection/_split.py:626: Warning: The least populated class in y has only 1 members, which is too few. The minimum number of members in any class cannot be less than n_splits=5.
% (min_groups, self.n_splits)), Warning)
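The warning above comes from skorch’s internal stratified train/validation split: after the Label Powerset transformation, some label combinations occur only once in the training data, so they cannot be stratified into 5 folds. If you do not need the internal validation set, one option is to disable the split via skorch’s train_split parameter; a sketch:

# train on the full training data, skipping skorch's internal validation split
net = NeuralNetClassifier(
    MultiClassClassifierModule,
    max_epochs=20,
    train_split=None,
    verbose=0
)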