*4.1. Ethiopic Script*

Ethiopic script, which is derived from Geez, is one of the most ancient scripts in the world. It is used as a writing system for more than 43 languages, including Amharic, Geez, and Tigrigna. The script has been used predominantly for Geez and Amharic, which are the liturgical and official languages of Ethiopia, respectively. Amharic is the second most widely spoken Semitic language after Arabic. The script is conventionally laid out as a table in which the first column holds the base characters and the remaining columns hold the vowel forms derived from them by slightly deforming or modifying the base characters. The script has a total of 466 characters, of which 20 are digits, 9 are punctuation marks, and the remaining 437 are letters of the alphabet. Developing a scene text recognition system for Ethiopic script is challenging because of the large number of characters and their visual similarity, especially between base characters and their derived vowel forms. The lack of training and testing datasets is a further obstacle to developing a scene text reading system for Ethiopic script. In this paper, we propose an end-to-end trainable bilingual scene text reading model using FPN, RPN, and time-restricted self-attention CTC.
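
The tabular layout described above can be illustrated with a minimal Python sketch. It assumes the Unicode encoding of the Ethiopic block (U+1200 to U+137F), in which each consonant row occupies eight consecutive code points starting at the base character. The helper name `ethiopic_row` and the constant `ROW_WIDTH` are hypothetical, introduced only for this illustration. The Unicode repertoire is not necessarily identical to the 466-character set counted in this section.

```python
import unicodedata

# Assumption: in the Unicode Ethiopic block, every consonant row spans
# 8 consecutive code points, beginning with the base character.
ROW_WIDTH = 8
BLOCK_START, BLOCK_END = 0x1200, 0x137F


def ethiopic_row(base_char):
    """Return the assigned characters in the Unicode row that starts at base_char."""
    start = ord(base_char)
    if not BLOCK_START <= start <= BLOCK_END:
        raise ValueError("character is outside the Ethiopic block")
    if start % ROW_WIDTH != 0:
        raise ValueError("character is not the first column of its row")
    row = []
    for code_point in range(start, start + ROW_WIDTH):
        char = chr(code_point)
        # Unassigned code points have no name; skip them.
        if unicodedata.name(char, ""):
            row.append(char)
    return row


if __name__ == "__main__":
    # U+1208 is the base character of the "l" row.
    for char in ethiopic_row("\u1208"):
        print(f"U+{ord(char):04X}  {char}  {unicodedata.name(char)}")
```

Printing one row this way shows how closely the derived forms resemble their base character, which is the visual similarity that makes recognition difficult.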
