*4.4. Evaluation*

The performance of the system was assessed using precision (*P*), recall (*R*), and F-measure (*F*), as these are the standard evaluation measures for NER.

$$P = \frac{X}{Y}, \tag{17}$$

$$R = \frac{X}{Z}, \tag{18}$$

$$F = \frac{2 \times P \times R}{P + R}, \tag{19}$$

where *X* is the total number of correctly extracted entities, *Y* is the total number of entities recognized by the system, and *Z* is the total number of correct entities in the gold standard.
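The three measures can be computed directly from these counts. A minimal sketch (the function name and argument names are illustrative, not from the original):

```python
def prf(x, y, z):
    """Compute precision, recall, and F-measure from entity counts.

    x: number of correctly extracted entities (X),
    y: total number of entities recognized by the system (Y),
    z: total number of correct entities in the gold standard (Z).
    """
    p = x / y if y else 0.0          # precision, Eq. (17)
    r = x / z if z else 0.0          # recall, Eq. (18)
    f = 2 * p * r / (p + r) if (p + r) else 0.0  # F-measure, Eq. (19)
    return p, r, f
```

For example, a system that recognizes 10 entities, 5 of them correct, against 10 gold entities obtains *P* = *R* = *F* = 0.5.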

#### **5. Results**

We ran our experiments on the ANERCorp dataset described above. To evaluate the performance of our Bi-LSTM/GRU models, we ran each model 5 times and report the average of each score. The experimental results are summarized in Table 1. RNNs can handle the sequence labeling problem efficiently without the need for additional information, such as chunks or gazetteers, which are otherwise essential for the NER task.


**Table 1.** Results obtained by our model with different architectures.

The experimental results show that our proposed model with the embedding attention layer performs considerably better than the other neural NE recognition systems and previous baselines. The best F-scores of our model are 88.01% for BLSTM and 87.12% for BGRU.

#### **6. Discussion**
