*3.5. Initial Training*

We split the data into training and testing sets: 80% was used for training and 20% was set aside for testing, following the Pareto principle [33]. From the 80% training portion, we held out 10% of the full dataset for validation and used the remaining 70% to train the model. Initial training gave us a baseline for what the model can achieve. Before training, we applied several preprocessing techniques to improve the model's performance. Because the proposed model is a custom CNN, it was trained from scratch. In the proposed COV-Net model, we used the softmax function together with the cross-entropy cost function for classification.
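The split and the classification loss described here can be illustrated with a minimal sketch. It is not the COV-Net training code: the helper names (`split_indices`, `softmax`, `cross_entropy`), the random seed, the dataset size of 1000, and the three-class logits are all hypothetical choices made for illustration, and the validation share is assumed to be 10% of the full dataset.

```python
import math
import random


def split_indices(n_samples, train_frac=0.70, val_frac=0.10, seed=0):
    """Shuffle sample indices and split them into train/validation/test parts.

    The test part receives whatever remains (20% with the default fractions).
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for a reproducible split
    n_train = int(round(train_frac * n_samples))
    n_val = int(round(val_frac * n_samples))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_class, eps=1e-12):
    """Negative log of the probability assigned to the correct class."""
    return -math.log(probs[true_class] + eps)


# Hypothetical dataset of 1000 samples split 70/10/20.
train_idx, val_idx, test_idx = split_indices(1000)

# Hypothetical raw network outputs for one image and three classes.
probs = softmax([2.0, 1.0, 0.1])
loss = cross_entropy(probs, true_class=0)
```

In practice a deep learning framework would compute the softmax and the cross-entropy in one fused, numerically stable operation over a whole batch; the pure-Python version above only makes the per-sample arithmetic explicit.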
