**1. Introduction**

Transcription factors are ubiquitous in plants. They play crucial roles in various growth and developmental processes and in responses to abiotic stresses [1]. Previous studies reported more than 60 transcription factor families in plants [2,3]. However, little is known about several important transcription factor families. Trihelix transcription factors occur only in plants. They were first identified and isolated from pea (*Pisum sativum*) in the 1990s. They bind to the core sequence 5'-G-Pu-(T/A)-A-A-(T/A)-3' in the promoter region of the rbcS-3A gene to regulate light-dependent expression [4]. They were initially called GT factors because they bind to light-responsive GT elements. The DNA-binding domain of the GT factors has a typical tandem trihelix (helix-loop-helix-loop-helix) structure, and the GT factors were therefore later renamed trihelix transcription factors. Subsequent research revealed that the trihelix structure of the GT factors resembles the solution structure of the Myb/SANT-LIKE DNA-binding domain [5]. GT factors evolved from Myb/SANT-LIKE proteins in plants. Gaps between the helix pairs account for the different recognition sequences of GT factors and Myb/SANT-LIKE proteins [5,6]. In databases such as Pfam, the Myb/SANT-LIKE domain represents the conserved trihelix domain.
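To make the core sequence notation concrete, the following is a minimal sketch of how 5'-G-Pu-(T/A)-A-A-(T/A)-3' could be read as a pattern and scanned for in a DNA string. It assumes that Pu denotes a purine (A or G) and scans only the strand given, without considering the reverse complement. The function name `find_gt_cores` and the example sequence are hypothetical and are used only for illustration; they are not part of the analysis in this study.

```python
import re

# 5'-G-Pu-(T/A)-A-A-(T/A)-3' written as a regular expression.
# Assumption: Pu = purine, i.e. A or G.
# The lookahead allows overlapping matches to be reported.
GT_CORE = re.compile(r"(?=(G[AG][TA]AA[TA]))")


def find_gt_cores(seq):
    """Return (0-based start, hexamer) for each core-sequence match on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in GT_CORE.finditer(seq)]


# Hypothetical sequence fragment, invented for illustration only.
example = "ccGGTAATtgtGATAATcc"
print(find_gt_cores(example))  # [(2, 'GGTAAT'), (11, 'GATAAT')]
```

A pattern match of this kind only flags candidate sites; whether a GT factor actually binds there has to be established experimentally.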

The trihelix family of transcription factors has only recently received attention. Trihelix genes have been systematically studied mainly in dicotyledonous plants such as *Arabidopsis*, tomato, and chrysanthemum, whereas almost no thorough research has been carried out in monocotyledonous plants. In *Arabidopsis*, 30 GT family members were identified and divided into the GT-1, GT-2, GTγ, SH4, and SIP1 subfamilies, which are named after their founding members [7]. The 96 trihelix proteins of tomato (*Solanum lycopersicum*) were classified into six subfamilies (clades GT-1, GT-2, SH4, SIP1, GTγ, and GTδ). The GTδ subfamily is apparently missing in *Arabidopsis* [8]. The gene structures of most trihelix subfamilies vary substantially, especially at the *C*-terminus. The exceptions are GT-1 and GT-2.

Earlier studies identified the trihelix family genes as a class of light regulators. Nevertheless, the roles of GT factors in light regulation have yet to be systematically established. In *Arabidopsis*, the GT-1 subfamily genes may participate in salt stress and pathogen responses, and their expression was induced by light in 3-day-old seedlings [9]. In contrast, the rice GT-1 gene *RML1* (*OsMSL21* in the present study) was repressed by light in etiolated seedlings [10]. The soybean trihelix transcription factor genes *GmGT-2A* and *GmGT-2B* were induced by ABA (abscisic acid), drought, high salt levels, and cold in seedlings [11]. Loss-of-function analysis of *GTL1* revealed that *gtl1* mutants had fewer stomata than wild-type plants and, consequently, lower water loss and higher drought tolerance [12]. The expression of the rice GTγ clade gene *OsGTγ-1* increased 2.5- to 10-fold in response to salt stress and was also upregulated by ABA treatment [13]. On the other hand, the expression of several trihelix genes in chrysanthemum was downregulated by ABA [14]. Trihelix transcription factors are also associated with plant morphogenesis. The trihelix transcription factor *PETAL LOSS* (*PTL*) determines the number of petals per flower and sepal fusion in *Arabidopsis*. The rice SH4 clade gene *SH4* promotes the development and function of the abscission layer in mature seed peduncles [15]. However, the function of the SH4 clade as a whole has not yet been investigated. The *Arabidopsis* SIP1 genes *ASIL1* and *ASIL2* downregulated the LEA (Late Embryogenesis Abundant) genes in *Arabidopsis* seedlings [7]. Thus, the trihelix genes have multiple functions throughout plant development. The molecular mechanisms of their stress responses and their involvement in signaling pathways still require elucidation.

Rice (*Oryza sativa* L.) is both a major global cereal crop and an important model in plant research. In this study, we identified 41 rice trihelix genes in silico by searching for the Myb/SANT-LIKE domain with an HMM search. We analyzed their chromosomal distributions, gene synteny, phylogenetic relationships, gene structures, motif compositions, *cis*-elements, and expression patterns in different tissues, at different developmental stages, and under environmental stress. The aim of this study was to analyze the structure and function of the rice trihelix genes and the phylogenetic relationships between the rice trihelix proteins and those of other species, including dicotyledonous and monocotyledonous plants. To establish the roles of the trihelix genes in stress responses, we evaluated their responses to abiotic stress factors, including drought and high salt, and to stress signal molecules such as abscisic acid and hydrogen peroxide. Our results provide a theoretical basis for the functional analysis of the rice trihelix family genes, especially in abiotic stress responses.
