*5.2. Experiments*

To evaluate human detection and human segmentation on the MADS dataset, we divide the MADS database into training and testing sets at the following ratios: 50% for training and 50% for testing (rate\_50\_50), 60% for training and 40% for testing (rate\_60\_40), 70% for training and 30% for testing (rate\_70\_30), and 80% for training and 20% for testing (rate\_80\_20). The images are assigned to each set at random. The number of frames in each split is shown in Table 8.

**Table 8.** The number of frames in each training/testing split (%) of the MASK MADS dataset.
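The random assignment described above can be sketched as a simple shuffle-and-cut over frame identifiers. This is only an illustration of the splitting procedure, not the paper's actual code; the function and variable names (`split_frames`, `frame_ids`, the seed, and the placeholder frame count) are assumptions.

```python
import random

def split_frames(frame_ids, train_fraction, seed=0):
    """Randomly partition frame identifiers into training and testing sets."""
    ids = list(frame_ids)
    random.Random(seed).shuffle(ids)      # random assignment of images
    cut = int(len(ids) * train_fraction)  # training share of the frames
    return ids[:cut], ids[cut:]           # (training set, testing set)

# The four splits used in the paper: 50/50, 60/40, 70/30, 80/20.
splits = [(0.5, "rate_50_50"), (0.6, "rate_60_40"),
          (0.7, "rate_70_30"), (0.8, "rate_80_20")]
for frac, name in splits:
    train, test = split_frames(range(1000), frac)
    print(name, len(train), len(test))
```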


In this paper, we used a Colab notebook with a Tesla P100 GPU (16 GB) for fine-tuning, training, and testing the CNNs on the MASK MADS dataset. The processing steps, fine-tuning, training, testing, and development were implemented in Python (≥3.6) with the support of the OpenCV, PyTorch, and CUDA/cuDNN libraries and gcc/g++ (≥5.4). In addition, we used a number of other libraries, such as NumPy, SciPy, Pillow, Cython, matplotlib, scikit-image, TensorFlow ≥ 1.3.0, Keras ≥ 2.0.8, opencv-python, h5py, imgaug, and IPython. The parameters that we use are as follows: the batch size is 2, the networks are trained for 90 thousand iterations, the learning rate is 0.02, the weight decay is 0.0001, and the momentum is 0.9; all other parameters keep the default training values of Mask R-CNN [40] and Detectron2 [41].
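The solver settings above (learning rate 0.02, momentum 0.9, weight decay 0.0001) correspond to SGD with momentum and L2 weight decay, the default optimizer in Detectron2's training loop. A minimal sketch of one such update step follows; the function and variable names are illustrative, not taken from the paper's or Detectron2's code.

```python
def sgd_step(w, grad, velocity, lr=0.02, momentum=0.9, weight_decay=0.0001):
    """One SGD update with momentum and L2 weight decay, using the
    hyperparameters reported in the paper (scalar weight for simplicity)."""
    g = grad + weight_decay * w         # add L2 regularization gradient
    velocity = momentum * velocity + g  # accumulate momentum
    w = w - lr * velocity               # parameter update
    return w, velocity

# One illustrative step from w = 1.0 with gradient 0.5.
w, v = 1.0, 0.0
w, v = sgd_step(w, grad=0.5, velocity=v)
print(w, v)
```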

In this paper, we also use the standard COCO metrics, including *AP* (Average Precision, averaged over IoU thresholds), *AP*50, *AP*75, and *APS*, *APM*, *APL* (*AP* at small, medium, and large object scales) [40].
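All of these metrics rest on the intersection over union (IoU) between a predicted and a ground-truth region; for instance, *AP*50 counts a detection as correct when its IoU with a matched ground-truth box exceeds 0.5. A minimal IoU computation for axis-aligned boxes in `(x1, y1, x2, y2)` form is shown below as a generic sketch, not the COCO evaluation API itself.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # intersection top-left
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])   # intersection bottom-right
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two 10x10 boxes overlapping by half; IoU = 50 / 150, below the 0.5
# threshold, so this match would not count toward AP50.
print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))
```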
