**3. DNN Architectures**

We create five DNNs with diverse architectures, each suited to a multilabel classification problem. Model 1, Figure **??**, is a simple CNN with one fully connected layer. Model 2, Figure **??**, combines a Gated Recurrent Unit (GRU) and a convolutional layer, similar to [**?** ]. Model 3, Figure **??**, uses Term Frequency-Inverse Document Frequency (Tf-Idf) embeddings and three fully connected layers, inspired by [**?** ]. The architecture of Model 4, Figure **??**, is based on the top-performing single model of the Toxic Comment Classification Challenge on Kaggle (https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge). Model 5, Figure **??**, combines uni/bi/trigrams followed by three interconnected CNN processes, as presented in [**?** ]. Each of the modules used in these models is presented in this section.
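
To make the simplest of these designs concrete, the following is a minimal NumPy sketch of a forward pass through a CNN with a single fully connected layer and a multilabel output. It is an illustration under stated assumptions, not the configuration used for Model 1: the kernel width, the number of filters, the ReLU activation, the global max pooling, the random weights, and the names `init_params` and `simple_cnn_forward` are all hypothetical. The one property that follows directly from the multilabel setting is the output layer, which applies an independent sigmoid to each label rather than a softmax across labels.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(vocab_size, emb_dim, width, n_filters, n_labels, seed=0):
    """Random weights for illustration only; all sizes are assumptions."""
    rng = np.random.default_rng(seed)
    return {
        "embedding": rng.normal(0.0, 0.1, (vocab_size, emb_dim)),
        "conv_kernel": rng.normal(0.0, 0.1, (width, emb_dim, n_filters)),
        "conv_bias": np.zeros(n_filters),
        "dense_w": rng.normal(0.0, 0.1, (n_filters, n_labels)),
        "dense_b": np.zeros(n_labels),
    }


def simple_cnn_forward(token_ids, params):
    """Embed -> 1D convolution -> ReLU -> global max pool -> dense -> sigmoid."""
    emb = params["embedding"][token_ids]              # (seq_len, emb_dim)
    kernel = params["conv_kernel"]                    # (width, emb_dim, n_filters)
    width = kernel.shape[0]
    n_windows = emb.shape[0] - width + 1
    # Slide the kernel over the token sequence (no padding).
    conv = np.stack([
        np.tensordot(emb[i:i + width], kernel, axes=([0, 1], [0, 1]))
        for i in range(n_windows)
    ]) + params["conv_bias"]                          # (n_windows, n_filters)
    pooled = np.maximum(conv, 0.0).max(axis=0)        # ReLU, then max over positions
    logits = pooled @ params["dense_w"] + params["dense_b"]  # the single dense layer
    return sigmoid(logits)                            # one probability per label


# Hypothetical sizes: a 50-word vocabulary and 6 labels.
params = init_params(vocab_size=50, emb_dim=8, width=3, n_filters=4, n_labels=6)
probs = simple_cnn_forward(np.array([3, 17, 42, 5, 9]), params)
predicted = probs >= 0.5  # each label is thresholded independently
```

Because the labels are scored independently, the entries of `probs` need not sum to one, and a single input may be assigned several labels or none.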

**Figure 2.** A Recurrent CNN with pre-trained Embeddings.

**Figure 3.** A Deep Neural Network with Tf-Idf Embeddings.

**Figure 4.** A Long Short-Term Memory network.

**Figure 5.** A CNN with Uni/Bi/Trigrams consideration.
