*5.1. SemEval 2018*

The SEM2018 Train dataset is a collection of 6838 Tweets labeled with 11 emotion classes: *anger*, *anticipation*, *disgust*, *fear*, *joy*, *love*, *optimism*, *pessimism*, *sadness*, *surprise* and *trust*. Some examples of Tweets included in SEM2018 are:


The Development dataset consists of 886 Tweets labeled with the same 11 classes. The class distribution in the Train dataset is skewed in favor of five emotions: anger, disgust, joy, optimism, and sadness. The same skew is evident in the Development dataset, which is dominated by the same five emotions (Figure **??**a).
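
The per-class skew described above can be checked with a simple count. The following is a minimal sketch, assuming each Tweet's annotation is available as a set of emotion names; the function name `class_distribution` and the toy data are illustrative and are not taken from SEM2018.

```python
from collections import Counter

# The 11 emotion classes named in this section.
EMOTIONS = [
    "anger", "anticipation", "disgust", "fear", "joy", "love",
    "optimism", "pessimism", "sadness", "surprise", "trust",
]


def class_distribution(label_sets):
    """Return, for each emotion, the fraction of Tweets that carry it.

    `label_sets` is an assumed input format: one collection of emotion
    names per Tweet. Because Tweets are multi-label, the fractions do
    not need to sum to 1.
    """
    counts = Counter()
    for labels in label_sets:
        counts.update(set(labels))  # count each class once per Tweet
    total = len(label_sets)
    if total == 0:
        return {emotion: 0.0 for emotion in EMOTIONS}
    return {emotion: counts[emotion] / total for emotion in EMOTIONS}


# Hypothetical toy annotations, NOT real SEM2018 data.
toy_labels = [
    {"anger", "disgust"},
    {"joy", "optimism"},
    {"anger", "disgust", "sadness"},
    {"joy"},
]

toy_distribution = class_distribution(toy_labels)
```

Running the same function on the Train and Development annotations and comparing the two dictionaries is one way to confirm that both splits are dominated by the same classes.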

SEM2018 contains 329 unique class combinations. The frequency of these combinations follows a power-law distribution for both the Train and Development datasets (Figure **??**b). The most frequent class combination in both Train and Development was anger and disgust, followed by joy and optimism. One third of the class combinations appear only once, and these often combine contradictory emotions, such as joy and sadness.
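
The combination statistics above (number of unique combinations, most frequent combination, share of combinations seen only once) can be derived as in the following minimal sketch. It assumes the same per-Tweet set-of-labels format; the helper names `combination_counts` and `singleton_share` and the toy data are hypothetical and do not reproduce the SEM2018 figures.

```python
from collections import Counter


def combination_counts(label_sets):
    """Count how often each exact set of emotions occurs.

    A frozenset is used as the key so that label order does not matter:
    {"anger", "disgust"} and {"disgust", "anger"} are the same combination.
    """
    return Counter(frozenset(labels) for labels in label_sets)


def singleton_share(combo_counts):
    """Fraction of unique combinations that appear exactly once."""
    if not combo_counts:
        return 0.0
    singles = sum(1 for count in combo_counts.values() if count == 1)
    return singles / len(combo_counts)


# Hypothetical toy annotations, NOT real SEM2018 data.
toy_labels = [
    {"anger", "disgust"},
    {"disgust", "anger"},
    {"joy", "optimism"},
    {"joy", "sadness"},
]

combos = combination_counts(toy_labels)
n_unique = len(combos)                       # number of unique combinations
top_combo, top_count = combos.most_common(1)[0]
share_once = singleton_share(combos)         # share of combinations seen once
```

Sorting the values of `combos` in descending order and plotting them against their rank on log-log axes is a common way to inspect whether the frequencies are consistent with a power law.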
