Article

A Residual LSTM and Seq2Seq Neural Network Based on GPT for Chinese Rice-Related Question and Answer System

1 College of Computer Science and Technology, Inner Mongolia Minzu University, Tongliao 028043, China
2 Research Center of Information Technology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
3 Agriculture Key Laboratory of Digital Village, Ministry of Agriculture and Rural Affairs of the People’s Republic of China, Beijing 100097, China
4 National Engineering Research Center for Information Technology in Agriculture, Beijing 100097, China
5 School of Information and Electrical Engineering, Shenyang Agricultural University, Shenyang 110866, China
6 CangZhou Academy of Agriculture and Forestry Sciences, Cangzhou 061001, China
* Authors to whom correspondence should be addressed.
Agriculture 2022, 12(6), 813; https://doi.org/10.3390/agriculture12060813
Submission received: 23 April 2022 / Revised: 1 June 2022 / Accepted: 2 June 2022 / Published: 4 June 2022
(This article belongs to the Section Digital Agriculture)

Abstract

Rice is one of the essential food crops in China and is planted over a wide area. Diseases and pests in rice production have always been among the main factors affecting its quality and yield, so it is essential to provide treatment methods for rice diseases and pests quickly and accurately during production. We therefore used the rice question-and-answer (Q&A) community as an example. This paper addresses the critical technical problems faced by the agricultural Q&A community: the accuracy of existing agricultural Q&A models is low, making it difficult to meet users’ need to obtain answers in real time during production. A network based on Attention-ResLSTM-Seq2seq was used to construct the rice question and answer model. First, the text representation of rice question-and-answer pairs was obtained using a GPT pre-training model based on a 12-layer transformer. Then, ResLSTM (Residual Long Short-Term Memory) was used to extract text features in the encoder and decoder, and the output projection matrix and output gate of the LSTM were used to control the spatial information flow. When the network reaches the optimal state, it retains only the identity mapping of the input vector, which effectively reduces the network parameters and improves performance. Next, an attention mechanism was connected between the encoder and the decoder, which effectively strengthens the weight of the keyword feature information in the question. The results showed that the BLEU and ROUGE of the Attention-ResLSTM-Seq2seq model reached the highest scores, 35.3% and 37.8%, compared with six other rice-related generative question answering models.

1. Introduction

With the advent of big data, people face problems such as information explosion and information overload. Search engines [1], as the primary sources of information, provide users with a large amount of useful information, but they can also bring considerable inconvenience: keyword matching and shallow analysis of a question cannot satisfy complex search needs, and the results are often neither concise nor accurate, so it is difficult for users to locate the information they need quickly. As an alternative to search engines, the question-answering model is mainly limited by the size of the question-answering data set, which cannot fully cover all questions in a particular field. This easily leads to wrong answers, multiple answers, or no retrieved answer, and it is also difficult to meet the question-answering needs of professional fields. In agriculture, diseases and pests in rice production have always been among the main factors affecting quality and yield [2]. Therefore, it is vital to provide rapid and accurate control methods for rice diseases and pests during production. At the same time, farmers often encounter a series of agricultural problems concerning breeding and planting, diseases and pests, and water and fertilizer management; most solutions come from network queries and expert telephone consultations [3]. However, both methods have problems: network query answers are redundant and lag, and expert resources are limited. There are question data with different expressions but the same semantics, and repeated expert answers to such questions consume many resources. With the penetration of big data technology into agricultural production, agricultural data have shown explosive growth [4]. Still, most agricultural data are messy and unclassified, making them challenging for people to recognize and use directly. The agricultural science and technology information service platform comprises the China Agricultural Technology Extension app and a web-based agricultural technology extension platform. It is a comprehensive service platform that includes an agricultural technology question-and-answer (Q&A) [5] community, agricultural science and technology services, guidance, market price services, and other functions.
In the agricultural science and technology information service platform, the agricultural question-and-answer (Q&A) community plays a vital role in the technical exchange between farmers and agricultural technicians and in users’ access to solutions to problems encountered in agricultural production. Users ask more than 1000 questions in the rice Q&A module every day. Agricultural data are mainly managed through manually screened features and shallow learning models. However, due to the high dimensionality, sparsity, and poor standardization and professionalism of agricultural data, it is not easy to find high-quality Q&A pairs. At present, the critical technical problems faced by the agricultural Q&A community are the low accuracy of the existing agricultural Q&A models and the difficulty of meeting users’ requirement to obtain solutions in real time during production. A question answering system generally includes semantic analysis of user questions, answer extraction, and answer generation [6], and the question answering model is a vital research hotspot for answer generation. Therefore, using deep learning and natural language processing technology to automatically and quickly mine the text features of rice knowledge for the construction of a question-answering model is a significant problem to be solved [7]. At the same time, rice is one of the essential food sources and has a wide planting area in China, so we used the rice Q&A community as an example. This paper constructs a semantic model for the rice Q&A community by using deep learning and natural language processing technology to deeply mine the semantic information of rice-related text, which is of great significance for improving the performance of the Q&A community and provides ideas for the development of an intelligent rice-related Q&A system.
Early question answering systems could only accept questions following specific templates, and question-answering data were scarce. With the development of the Internet, data acquisition became simple, and retrieval-based question-answering technology developed rapidly. Methods based on logical reasoning, template matching, machine learning, and data redundancy were proposed to retrieve answers according to the shallow semantics of the question. However, retrieval-based question answering has the limitation that answers and questions must share common keywords. With the rise of encyclopedia websites, high-quality structured data became easier to obtain, and a large number of knowledge bases were established; the rise of machine learning technology then promoted research on question answering systems based on knowledge bases. The purpose of Q&A is to answer a given natural-language question accurately. Q&A does not need a formatted query and does not return a list of relevant documents but enables users to interact with the machine more naturally. It is a classical research problem in natural language processing involving question analysis, answer retrieval, and answer ranking. Question answering has been studied for many years and implemented in various systems: Baseball is considered the earliest question answering system [8], while recent conversational agents such as Siri, Alexa, and Cortana are also popular. However, many challenges remain, from early Q&A templates to recent Q&A systems. Common paradigms of classical machine learning combine feature engineering with learning algorithms, which requires a lot of effort to build handcrafted features [9]. As a machine learning method involving deep architectures, deep learning has attracted extensive attention in recent years and has been verified in various fields. Generally speaking, intelligent Q&A can be divided into retrieval-based intelligent agricultural Q&A and generative intelligent agricultural Q&A.
For the questions users raise, a retrieval-based intelligent question answering system usually retrieves the most closely matching answer from a knowledge base. It has evolved from traditional handwritten rule matching to retrieval schemes based on deep learning semantic processing. Through semantic processing over the knowledge base, the system identifies the user’s knowledge acquisition intention, accurately and concisely answers natural language questions, matches the knowledge precisely, and gives feedback to the user. However, the semantics of agricultural problems are complex and changeable, which poses significant challenges to intelligent question answering. More and more scholars have researched retrieval-based intelligent question answering methods. The traditional retrieval-based agricultural question answering system often performs simple matching against the knowledge base, which cannot find the deep semantic relationships between problems and has clear limitations. At this stage, researchers use deep learning to mine the semantics of question sentences to improve the accuracy and reliability of matching in the question answering system. Medelyan et al. [10] proposed a new algorithm to automatically extract index words from documents related to the agricultural field for document indexing based on a thesaurus. Through machine learning technology and text semantic information processing, the controlled vocabulary eliminates meaningless or incorrect phrases and effectively improves system performance. To improve the efficiency of literature retrieval in the agricultural field, Wang et al. [11] constructed an agricultural domain ontology by segmenting and cleaning the literature and using association analysis and improved hierarchical clustering to find the relationships between domain concepts. Experiments show that this method can improve the clustering of relationships between concepts in the agricultural field and the construction of the domain ontology. Tao et al. [12] proposed a deep retrieval model for context retrieval to address the low utilization of dialogue interaction information in retrieval-based question and answer models. As an essential part of the model, the interaction block comprises a self-attention module, an interaction module, and a compression module. The depth of the model comes from stacking multiple interaction blocks, which interact iteratively. Evaluation against benchmark models shows that the deep retrieval model has higher accuracy. A retrieval-based question answering model generates no new text; the reply can only be a suitable answer retrieved from the predefined knowledge base. Once the user’s questions exceed the scope of the knowledge base, the question answering system cannot answer the user accurately [13]. Compared with retrieval-based question answering, the generation model of generative question answering does not depend on a constructed knowledge base; instead, the model learns from the knowledge base to generate new responses. The advantage of this method is that it is more intelligent and closer to human dialogue. Suktarachan et al. [14] developed a question-and-answer service system for farmers through a Thai short message service, focusing on technologies previously applied to the original system of language generation, thematic roles, and conceptual vocabulary structure.
There has been significant research in the field of Q&A systems. Still, there is no specific Q&A system in the agricultural field that gives practical answers by analyzing the questions raised by farmers. To address this limitation, Gaikwad et al. [15] developed a question and answer system for agriculture. The system’s input is a corpus of documents related to agriculture and a set of predefined question templates. The question-and-answer task is divided into sub-tasks such as information database preparation, question processing, answer retrieval, and evaluation. The system makes it convenient for farmers to obtain information. Muller et al. [16] proposed an open-retrieval generative question answering method for the multilingual environment to solve the problems of multilingualism, multi-source data, and open domains in natural language processing. First, the pre-trained multilingual T5 model is used to model the task; the questions corresponding to the candidate samples are input into the model for optimization and adjustment, and finally, the trained model generates an appropriate answer. This can serve as a reference for research on Q&A systems in agriculture. The above research suffers from small vocabularies, sparse features, and single-source text feature extraction, and cannot integrate rich features. In this paper, the traditional end-to-end question answering model is optimized, and the LSTM neural network in the encoder and decoder is improved: a residual-connected LSTM extracts the deep semantic features of rice question-and-answer pairs. Then, an attention mechanism is added between the encoder and decoder to strengthen the weight of key information in the question and mine the semantic information of user questions. The main work of this paper is as follows:
(1) A method based on Residual LSTM and Seq2Seq is proposed.
(2) The pre-training model GPT is transferred to the context embedding layer, which provides a new idea for sentence-pair representation.
(3) A deep neural network is used to extract text features in the encoder and decoder, and an attention mechanism is connected between the encoder and decoder to strengthen the weight information of question keywords in the sentence.

2. Materials and Methods

2.1. Corpus Preparation

We derived the rice-related Q&A data from the “China Agricultural Technology Extension” Q&A community [17]. A total of 15,000 commonly used rice Q&A pairs were identified by agricultural experts, covering five categories (diseases and insect pests, weeds and pesticides, storage and preservation, cultivation management, and others); the amounts of data in these categories are 6557, 1425, 1128, 3509, and 2381, respectively. A sample of the rice-related Q&A text data is shown in Table 1:
The above rice-related Q&A data sets differ from text data sets in general fields and have the following characteristics:
(1) High domain specificity. All the text data are question-and-answer pairs related to rice, and the semantic boundaries of sentences are fuzzy. Therefore, it is difficult to transfer a general-domain model to the semantic domain of rice knowledge text for training.
(2) The sequences of question-and-answer pairs are short. It is challenging to extract semantic information from the features generated during short-text training, which increases the difficulty of model recognition.
(3) The categories are unevenly distributed. In the corpus, the data volumes of the diseases and insect pests, cultivation management, and other categories are large, while those of weeds and pesticides and storage and preservation are small. The category distribution is shown in Figure 1.

2.2. Methods

This paper used Sequence to Sequence (Seq2seq) [18] with an attention mechanism and ResLSTM to construct a rice-related question answering model. The model consists of four parts: the embedding layer, the encoder, the attention mechanism, and the decoder. The pre-trained GPT model provides the text representation of the rice-related Q&A text, and the attention mechanism attends to the correlation between the encoded information and the current output state, as shown in Figure 2.

2.2.1. Embedding Layer

A GPT pre-training model based on a 12-layer transformer was used to vectorize the rice question-and-answer text data. The BERT pre-training model is built from the transformer encoder, while the GPT pre-training model is built from the transformer decoder. BERT is a bidirectional language model, which can better capture the context of a word and handle polysemy; GPT is a unidirectional language model and is more suitable for tasks such as text generation, text translation, and logical reasoning. Therefore, this paper used GPT as the vectorization method for the rice question-and-answer text data. The word vector dimension was set to 300.
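To make this step concrete, the sketch below shows one way to obtain contextual token vectors from a 12-layer Chinese GPT checkpoint and project them to the 300-dimensional word vectors used here; the checkpoint name and the linear projection are illustrative assumptions, not the authors’ released code.

```python
# Minimal embedding sketch: a public 12-layer Chinese GPT-2 checkpoint is
# assumed here; the 300-d projection mirrors the paper's word vector size.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("uer/gpt2-chinese-cluecorpussmall")
gpt = AutoModel.from_pretrained("uer/gpt2-chinese-cluecorpussmall")
proj = torch.nn.Linear(gpt.config.hidden_size, 300)  # map hidden size to 300-d vectors

def embed(text: str) -> torch.Tensor:
    """Return a (seq_len, 300) matrix of contextual token vectors."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = gpt(**inputs).last_hidden_state     # (1, seq_len, hidden_size)
    return proj(hidden).squeeze(0)

print(embed("水稻苗床如何判断缺水?").shape)  # "How to judge water shortage in a rice seedbed?"
```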

2.2.2. Residual Long-Term and Short-Term Memory Networks

Researchers proposed the LSTM (Long Short-Term Memory) [19] neural network by improving and optimizing the RNN (Recurrent Neural Network). When an RNN [20] processes text sequences, gradient explosion or vanishing easily occurs, leading to the loss of text features and an inability to capture long-distance dependencies when extracting text features. Researchers therefore optimized the RNN model and introduced a gating mechanism, which effectively handles tasks that are difficult for a plain RNN [21] and alleviates its vanishing gradients. LSTM mainly includes an input gate, an output gate, a forget gate, and a memory cell. Combining the three gate structures with the memory cell dramatically improves the ability of the recurrent neural network to process long-range data [22] and largely solves the problems of long-range dependence and gradient loss. The calculation formulas are:
$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$$
$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$$
$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$$
$$\hat{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \hat{c}_t$$
$$h_t = o_t \odot \tanh(c_t)$$
In the formulas, $x_t$ represents the current input, $h_{t-1}$ the previous hidden state, $\sigma$ the sigmoid function, $\tanh$ the hyperbolic tangent function, $\odot$ element-wise (bit-by-bit) multiplication, and $W$, $U$, and $b$ the trainable parameters of the model. The LSTM network structure is shown in Figure 3:
When an RNN is used for sequential tasks, gradients often vanish or explode. LSTM, a deep neural network optimized and improved on the basis of the RNN, alleviates the vanishing gradient problem and has advantages over the RNN in processing long, complex sequences; in particular, it avoids the gradient vanishing caused by longer training sequences. However, as the data dimension and volume grow, increasing the number of LSTM layers and the amount of training data tends to overfit the model. Therefore, in this paper, a residual LSTM is used as the feature extractor to overcome overfitting. When training a multi-layer LSTM, ResLSTM provides an additional low-dimensional spatial shortcut by using the output layer to separate the spatial and temporal fast paths. ResLSTM uses the output projection matrix and output gate of the LSTM to control the spatial information flow. When the training network reaches the optimal state, the network keeps only the identity mapping of the input features, effectively reducing the network parameters and improving feature extraction. The structure of ResLSTM is shown in Figure 4.
The residual network’s output consists of two parts: the input of the previous layer and the output of this layer’s mapping. The two features are added, and the calculation formula is as follows:
$$y = F(x; W) + x$$
where $y$ is the output, $x$ is the input from the previous layer, $F(x; W)$ is the feature mapping, and $W$ is the internal weight of the LSTM network; $F(x; W)$ means that $y$ is obtained from the input $x$ mapped by the LSTM neural network. Given the identity mapping of the input $x$, $F(x; W)$ needs to learn only the residual mapping $y - x$. When the loss function of the model stabilizes, the LSTM network transmits the input directly through the identity mapping, thus simplifying the parameters of the training model and improving its effect. The residual LSTM model includes three main gate structures: the forget gate, the input gate, and the output gate. The calculation formula for each state is as follows:
$$f_t = \sigma(W_f h_{t-1} + V_f x_t + b_f)$$
$$i_t = \sigma(W_i h_{t-1} + V_i x_t + b_i)$$
$$\tilde{c}_t = \tanh(W_c h_{t-1} + V_c x_t + b_c)$$
$$c_t = c_{t-1} \odot f_t + i_t \odot \tilde{c}_t$$
$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$$
$$r_t = \tanh(c_t)$$
$$m_t = W_p r_t$$
$$h_t = o_t \odot (m_t + W_h x_t)$$
Here, $x_t$ denotes the input, $h_t$ the hidden state, and $c_t$ the cell state; $f_t$, $i_t$, and $o_t$ are the updated states of the forget, input, and output gates, and $\tilde{c}_t$ is the candidate cell state. $W$ and $V$ are the corresponding weight matrices, $b$ is the bias, and $\sigma$ and $\tanh$ are activation functions.
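The following is a minimal PyTorch sketch of this residual cell as we read the equations above; it fuses the separate $W$ and $V$ matrices of the forget, input, and candidate gates into one linear layer over the concatenation $[h_{t-1}, x_t]$, which is mathematically equivalent. It is an illustration, not the authors’ code.

```python
# Residual LSTM cell sketch following the equations above.
import torch
import torch.nn as nn

class ResLSTMCell(nn.Module):
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # Fused f, i, c-tilde gates over [h_{t-1}, x_t]
        self.gates = nn.Linear(hidden_size + input_size, 3 * hidden_size)
        self.out_gate = nn.Linear(hidden_size + input_size, hidden_size)  # o_t
        self.W_p = nn.Linear(hidden_size, hidden_size, bias=False)        # m_t = W_p r_t
        self.W_h = nn.Linear(input_size, hidden_size, bias=False)         # residual path W_h x_t

    def forward(self, x, state):
        h, c = state
        hx = torch.cat([h, x], dim=-1)
        f, i, c_hat = self.gates(hx).chunk(3, dim=-1)
        f, i, c_hat = torch.sigmoid(f), torch.sigmoid(i), torch.tanh(c_hat)
        c = c * f + i * c_hat                      # c_t = c_{t-1} (.) f_t + i_t (.) c~_t
        o = torch.sigmoid(self.out_gate(hx))       # o_t
        m = self.W_p(torch.tanh(c))                # m_t = W_p tanh(c_t)
        h = o * (m + self.W_h(x))                  # h_t = o_t (.) (m_t + W_h x_t)
        return h, c

cell = ResLSTMCell(300, 100)                       # 300-d GPT vectors, 100 hidden nodes
h, c = cell(torch.randn(2, 300), (torch.zeros(2, 100), torch.zeros(2, 100)))
```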

2.2.3. Encoding-Decoding Structure

Seq2seq is currently the primary neural network model for processing sequence tasks. It was initially applied in machine translation [23] and is suitable for sequence-to-sequence applications. Seq2seq is composed of an encoder and a decoder, which can extract the features of the input data effectively, and the model has been widely used in machine translation [24]. The input sequence collects all the words that make up the question. The encoder converts the input sequence into a feature vector of fixed size. The encoding process has three steps: first, the string is tokenized into a list; each token is then associated with a vector from the input vocabulary, and the hidden state is updated according to the input at each time step; finally, the encoder network produces an intermediate state. During training and testing, the decoder works differently from the encoder. The goal of the decoder is to generate an answer according to the hidden state of the encoder; it calculates the current hidden state from the previous hidden state, the previous output vector, and the context vector. Finally, the loss is calculated from the predicted output at each time step of the model. The structure of Seq2seq is shown in Figure 5. The codec is usually a multi-layer RNN or LSTM structure, where the intermediate vector C encodes the information of x1, x2, …, xm. At time t, the output yt−1 at the previous time and the previous hidden state st−1, together with c, are input to the decoder; the decoder’s hidden state st is then obtained, thereby predicting the output value.
This paper used the Seq2Seq structure for the rice question answering model. For the Seq2Seq model, many different neural network structures can be used in the encoder and decoder, and they are often selected according to the model’s characteristics. In this paper, the encoder and decoder use the residual LSTM neural network, and the encoder encodes the variable-length input sequence into a fixed-length representation. In the rice question-and-answer text data set, a question-and-answer pair can be expressed as <H, Y>, where H represents the question raised by the user and Y represents the corresponding answer, so a question-and-answer pair can be expressed as follows:
$$H = (h_1, h_2, h_3, \ldots, h_T)$$
$$Y = (y_1, y_2, y_3, \ldots, y_K)$$
In the formulas, $T$ and $K$ represent the maximum lengths of question $H$ and answer $Y$.
The encoder transforms the variable-length input sequence $H$ into a context vector $C$ of fixed dimension through a series of nonlinear transformations. In the decoder, the output sequence generated so far and the context vector $C$ are used to predict the next output word $y_k$. The calculation formula is as follows:
$$y_k = \arg\max P(y_k) = \prod_{k=1}^{K} P(y_k \mid \{y_1, y_2, \ldots, y_{k-1}\}, C)$$
According to maximum likelihood estimation, the conditional probability of the output sequence given the input sequence and the corresponding loss function are as follows:
$$P(Y \mid X) = \prod_{k=1}^{N} P(y_k \mid C, y_1, y_2, \ldots, y_{k-1})$$
$$L = -\frac{1}{n} \log P(y_1, \ldots, y_m \mid h_1, \ldots, h_t) = -\frac{1}{n} \sum_{t=1}^{n} \log P(y_m \mid y_1, \ldots, y_{m-1}, C)$$
To determine the specific output at each step, the decoder output is mapped to the dimension of the vocabulary, and the Softmax function computes the probability distribution over the words; the word with the highest probability is usually selected as the output at that step. However, this Seq2Seq model still has some defects. First, the encoder compresses the whole input sequence into a single context vector, and this high compression ratio loses semantic information. Second, the word information at the back end of the input sequence conceals the information at the front end, so the transmission of information deviates and critical information in the sequence is weakened. To solve these problems, we introduce an attention mechanism into the model.
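As a small illustration of this vocabulary projection and greedy word selection (the sizes are illustrative; 100 matches the hidden layer size reported in Section 3.2):

```python
# Greedy decoding step sketch: project a decoder state to vocabulary
# logits, take Softmax, and pick the most probable word id.
import torch
import torch.nn as nn

vocab_size, hidden_size = 20000, 100
proj = nn.Linear(hidden_size, vocab_size)    # decoder state -> vocabulary logits

def greedy_step(decoder_hidden: torch.Tensor) -> torch.Tensor:
    probs = torch.softmax(proj(decoder_hidden), dim=-1)  # distribution over words
    return probs.argmax(dim=-1)                          # greedy word choice

next_word_id = greedy_step(torch.randn(1, hidden_size))
```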

2.2.4. Attention Mechanism

The sequence-to-sequence model has two parts: an encoder and a decoder. In the traditional Seq2Seq model, the encoder and decoder exchange data through an intermediate semantic vector of fixed length. When a long text sequence is input into the encoder, the feature information at the back of the text sequence covers the feature information before it as encoding and decoding proceed. We therefore used an attention mechanism between the encoder and the decoder, so the feature information of keywords in the input sequence can be retained in the context vectors fed to the decoder, such as c2 in Figure 6. The training model selectively learns these outputs and finally associates them with the corresponding output sequence [25]. The structure of the attention mechanism based on Seq2Seq is as follows:
The calculation formulas of the attention mechanism are as follows:
$$s_i = f(s_{i-1}, y_{i-1}, c_i)$$
$$c_i = \sum_{j=1}^{T_x} \alpha_{ij} h_j$$
$$\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$$
$$e_{ij} = a(s_{i-1}, h_j)$$
$$e_{ij} = v^{T} \tanh(W_s s_{i-1} + W_h h_j)$$
In the formulas, $s_i$ is the hidden state of the decoder RNN at time $i$, $c_i$ is the context vector formed as the weighted sum of the encoder outputs $h_j$, and $\alpha_{ij}$ is the attention weight assigned to each output $h_j$; $v$, $W_s$, and $W_h$ denote the trainable weights of the attention layer connecting the encoder and decoder recurrent neural networks.
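A compact sketch of this additive attention, following the equations above (the dimensions are illustrative assumptions):

```python
# Additive attention sketch: e_ij = v^T tanh(W_s s_{i-1} + W_h h_j),
# alpha_ij = softmax(e_ij), c_i = sum_j alpha_ij h_j.
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    def __init__(self, dec_size: int, enc_size: int, attn_size: int):
        super().__init__()
        self.W_s = nn.Linear(dec_size, attn_size, bias=False)
        self.W_h = nn.Linear(enc_size, attn_size, bias=False)
        self.v = nn.Linear(attn_size, 1, bias=False)

    def forward(self, s_prev, enc_outputs):
        # s_prev: (batch, dec_size); enc_outputs: (batch, T_x, enc_size)
        e = self.v(torch.tanh(self.W_s(s_prev).unsqueeze(1) + self.W_h(enc_outputs)))
        alpha = torch.softmax(e.squeeze(-1), dim=1)     # alpha_ij over encoder steps
        c = torch.bmm(alpha.unsqueeze(1), enc_outputs)  # c_i = weighted sum of h_j
        return c.squeeze(1), alpha

attn = AdditiveAttention(100, 100, 64)
context, weights = attn(torch.randn(2, 100), torch.randn(2, 15, 100))
```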

3. Results

3.1. Hardware, Software Environment and Evaluation Indicators

The server’s GPU was an NVIDIA GeForce RTX 2080 Ti (NVIDIA Corporation, Santa Clara, CA, USA), and both the research and control experiments were run under Ubuntu 18.04 (Canonical, London, UK). For training, the PyTorch deep learning framework was used in conjunction with CUDA 10.1.
We used the BLEU [26] score to evaluate the model’s performance. The BLEU score measures the similarity between the model output and the reference sentence; it is usually a number from 0 to 1, where a value close to 1 indicates a better model and 0 indicates no match. BLEU1, BLEU2, BLEU3, and BLEU4 represent the precision of 1-grams, 2-grams, 3-grams, and 4-grams, respectively. BLEU is the average of BLEU1, BLEU2, BLEU3, and BLEU4, and its formula is as follows:
$$\mathrm{BLEU} = \frac{\sum_{c \in \mathrm{candidates}} \sum_{n\text{-}\mathrm{gram} \in c} \mathrm{Count}_{\mathrm{clip}}(n\text{-}\mathrm{gram})}{\sum_{c \in \mathrm{candidates}} \sum_{n\text{-}\mathrm{gram} \in c} \mathrm{Count}(n\text{-}\mathrm{gram})}$$
In the formula, $\mathrm{Count}_{\mathrm{clip}}(n\text{-}\mathrm{gram})$ is the count of each $n$-gram in the output clipped by its maximum count in the actual (reference) sentence, $\mathrm{Count}(n\text{-}\mathrm{gram})$ is the total number of $n$-grams in the output, and $n$ is the length of the context word.
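A hedged sketch of this clipped n-gram precision (the paper’s BLEU is then the average over n = 1 to 4); the whitespace tokenization and toy sentences are illustrative assumptions:

```python
# Clipped n-gram precision (BLEU-n) sketch following the formula above.
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu_n(candidate, reference, n):
    cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
    clipped = sum(min(cnt, ref[g]) for g, cnt in cand.items())  # Count_clip
    total = sum(cand.values())                                  # Count
    return clipped / total if total else 0.0

cand = "rice seedlings emerge 7 to 10 days after sowing".split()
ref = "rice seedlings generally emerge from 7 to 10 days after sowing".split()
print(round(sum(bleu_n(cand, ref, n) for n in (1, 2, 3, 4)) / 4, 3))  # averaged BLEU
```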
The ROUGE score [27] is used to evaluate the answers output by the model; it calculates the proportion of $n$-grams in the standard answer sequence that also appear in the answer sequence output by the rice Q&A model. The word overlap between the output answer sequence and the standard answer sequence is the standard used to measure their similarity. The calculation formula is as follows:
$$\mathrm{ROUGE}\text{-}N(c) = \frac{\sum_{s \in S} \sum_{\mathrm{gram}_n \in s} \mathrm{Count}_{\mathrm{match}}(\mathrm{gram}_n)}{\sum_{s \in S} \sum_{\mathrm{gram}_n \in s} \mathrm{Count}(\mathrm{gram}_n)}$$
In the formula, $c$ represents the generated answer, $S$ represents the standard answers, $\mathrm{Count}_{\mathrm{match}}(\mathrm{gram}_n)$ represents the number of $n$-grams co-occurring in the output answer and the standard answer, and $\mathrm{Count}(\mathrm{gram}_n)$ represents the number of $n$-grams in the standard answer.
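The matching ROUGE-N recall under the same assumptions (self-contained; the n-gram helper mirrors the one in the BLEU sketch):

```python
# ROUGE-N recall sketch: overlap counted against the standard answer.
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def rouge_n(candidate, reference, n):
    cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
    matched = sum(min(cnt, cand[g]) for g, cnt in ref.items())  # Count_match
    total = sum(ref.values())                                   # n-grams in the standard answer
    return matched / total if total else 0.0

cand = "rice seedlings emerge 7 to 10 days after sowing".split()
ref = "rice seedlings generally emerge from 7 to 10 days after sowing".split()
print(round(rouge_n(cand, ref, 1), 3))
```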

3.2. Model Training and Parameter Setting

We used the ResLSTM neural network for encoding and decoding. The initial learning rate of the model was set to 0.01, the number of hidden layer nodes to 100, and the number of training epochs to 50. We used the PyTorch deep learning framework to build the neural network. In the experiment, we divided the 15,000 question-and-answer pairs into training and test sets at a ratio of 9:1, giving 13,500 training pairs and 1500 test pairs, and the model weights were updated by stochastic gradient descent. We trained the ResLSTM-Attention-Seq2seq model on the constructed data set and saved the model and related parameters once it began to converge. The specific steps are as follows: first, the 13,500 standard rice question-and-answer pairs are input as the training set, and the neural network encodes the question text; then, the loss between the generated answer and the correct answer is calculated, and the model parameters are updated; finally, when the loss function converges, the network weights are saved. The model training and testing processes are shown in Figure 7.
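The following is a condensed, runnable sketch of this training loop with the reported hyperparameters (learning rate 0.01, 100 hidden nodes, 50 epochs, SGD); the tiny model and random tensors merely stand in for the ResLSTM/attention components and the GPT-embedded data described above:

```python
# Training loop sketch; TinySeq2Seq is a placeholder for the full
# ResLSTM-Attention-Seq2seq assembled from the earlier sketches.
import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    def __init__(self, vocab=200, emb=32, hid=100):        # 100 hidden nodes, as in the paper
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.enc = nn.LSTM(emb, hid, batch_first=True)
        self.dec = nn.LSTM(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)

    def forward(self, q, a_in):
        _, state = self.enc(self.emb(q))                   # encode the question
        dec_out, _ = self.dec(self.emb(a_in), state)       # decode conditioned on it
        return self.out(dec_out)

model = TinySeq2Seq()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # initial learning rate 0.01
criterion = nn.CrossEntropyLoss()

q = torch.randint(0, 200, (32, 12))                        # batch size 32 (Section 3.3)
a = torch.randint(0, 200, (32, 10))
for epoch in range(50):                                    # 50 training epochs
    logits = model(q, a[:, :-1])                           # teacher forcing on the answer prefix
    loss = criterion(logits.reshape(-1, 200), a[:, 1:].reshape(-1))
    optimizer.zero_grad(); loss.backward(); optimizer.step()
torch.save(model.state_dict(), "seq2seq.pt")               # save once the loss converges
```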

3.3. Model Hyper-Parameter Setting

Firstly, we trained the model with batch sizes of 32, 64, and 128. The batch size is a crucial hyperparameter of a deep learning model, and too large a batch size leads to poor generalization. The BLEU scores of the model are shown in Table 2. When the batch size was set to 32, the ResLSTM-Attention-Seq2seq model had the highest BLEU values and the best answer generation effect, with BLEU1, BLEU2, BLEU3, and BLEU4 reaching 36.7%, 35.6%, 34.3%, and 35.1%, respectively. Therefore, we set the batch size to 32.
Then, we set the dropout to 0.1, 0.3, and 0.05, leaving the other parameters unchanged. The BLEU scores of the ResLSTM-Attention-Seq2seq model are shown in Table 3. With dropout set to 0.1, the model had the highest BLEU values and achieved the best answer generation effect, so this paper set the dropout to 0.1.

3.4. Comparative Analysis of Text Vectorization Test

A 12-layer Chinese GPT model was used to vectorize the text data of the rice question-and-answer pairs. We compared it with the Glove [28], BERT [29], and ELMo [30] pre-training models. The text features trained by the four models were input into the Seq2Seq-LSTM neural network. As Table 4 shows, among the four text pre-training models used in the embedding layer, the GPT pre-training model obtained the highest BLEU and ROUGE, with BLEU1, BLEU2, BLEU3, BLEU4, and ROUGE reaching 32.9%, 31.8%, 30.7%, 27.6%, and 32.7%, respectively. Compared with Glove’s BLEU1, BLEU2, BLEU3, BLEU4, and ROUGE, the GPT pre-trained method improved by 3.2, 4.1, 5.3, 4.5, and 4.0 percentage points. Glove cannot resolve words that have different meanings in different contexts and cannot consider the correlation of all words in the whole sentence. After the GPT model was pre-trained on the rice-related question corpus to obtain text features, it captured context and word-order information simultaneously, which improved the BLEU and ROUGE of the neural network and showed that GPT can resolve words with different meanings in different contexts. Compared with the BLEU1, BLEU2, BLEU3, BLEU4, and ROUGE of BERT, the GPT pre-trained method improved by 1.1, 1.7, 3.1, 0.5, and 1.1 percentage points, respectively. Although BERT and GPT are both based on the Transformer architecture, BERT is a bidirectional language model [31], while GPT is a unidirectional language model that is more suitable for logical reasoning and text generation tasks [32]. Therefore, this paper used the GPT pre-training model to transform rice-related question-and-answer pairs into word vectors for input into the neural network model.

3.5. Comparative Analysis of the Results of Rice Generative Question and Answer Model

We compared the ResLSTM-Seq2seq model with four other question answering models (LSTM-Seq2seq, BiLSTM-Seq2seq, GRU-Seq2seq, and BiGRU-Seq2seq) on the rice-related Q&A data. All five basic models used the encoding-decoding structure but different neural networks in the encoder and decoder to extract text features, and the GPT text pre-training model was used in the embedding layer to represent the question-and-answer pairs. Table 5 compares the five deep learning question and answer models in BLEU and ROUGE. As Table 5 shows, BiLSTM-Seq2seq improved on LSTM-Seq2seq by 0.8, 0.3, 0.9, 0.5, and 1.9 percentage points in BLEU1, BLEU2, BLEU3, BLEU4, and ROUGE, respectively, and BiGRU-Seq2seq improved on GRU-Seq2seq by 0.9, 0.1, 2.3, 1.0, and 1.2 percentage points. This showed that the answers generated by BiLSTM-Seq2seq and BiGRU-Seq2seq were more similar to the standard answers; the generated answers were more comprehensive because the bidirectional recurrent neural networks in the coding and decoding units extracted more accurate rice-related question features. Compared with BiLSTM-Seq2seq and BiGRU-Seq2seq, ResLSTM-Seq2seq had the higher BLEUs and ROUGE of 35.1%, 33.2%, 32.1%, 29.8%, and 36.1%. Compared with a bidirectional recurrent neural network, ResLSTM can better extract text features, express text information, and reduce feature loss through the residual connection, thus improving the accuracy of rice-related answer generation.
We then added the attention mechanism [33] to BiLSTM-Seq2seq, BiGRU-Seq2seq, and ResLSTM-Seq2seq, obtaining Attention-BiLSTM-Seq2seq, Attention-BiGRU-Seq2seq, and Attention-ResLSTM-Seq2seq. We kept the other parameters unchanged, used the GPT pre-training model to obtain the text representation of the rice-related question-and-answer pairs, and trained on the rice-related Q&A data set. Figure 8 shows the loss trends of the six rice-related knowledge question and answer models on the training set. Compared with the other five rice-related generative question and answer models, the Attention-ResLSTM-Seq2seq model converged faster after 30 rounds of training. As Table 6 shows, the BLEUs and ROUGE of Attention-BiLSTM-Seq2seq increased by 1.9, 1.1, 1.3, 0.5, and 1.1 percentage points, respectively, compared with BiLSTM-Seq2seq; those of Attention-BiGRU-Seq2seq increased by 2.3, 2.1, 0.3, 0.8, and 2.1 percentage points compared with BiGRU-Seq2seq; and those of Attention-ResLSTM-Seq2seq increased by 2.6, 3.4, 2.8, 2.3, and 1.7 percentage points compared with ResLSTM-Seq2seq. This showed that the attention mechanism can effectively improve the accuracy and comprehensiveness of the answers generated by the question-answering model: it strengthens the weight information of keywords in rice questions between the encoder and decoder and helps the neural network better understand the semantics and characteristics of the questions. Attention-ResLSTM-Seq2seq achieved the best effect among the six models, with BLEU1, BLEU2, BLEU3, BLEU4, and ROUGE reaching 37.7%, 36.6%, 34.9%, 32.1%, and 37.8%.

4. Discussion

As Table 7 shows, compared with the five other models (BiLSTM-Seq2seq, BiGRU-Seq2seq, ResLSTM-Seq2seq, Attention-BiLSTM-Seq2seq, and Attention-BiGRU-Seq2seq), Attention-ResLSTM-Seq2seq had the highest BLEU on all five categories of data (diseases and insect pests, weeds and pesticides, storage and preservation, cultivation management, and others). Its BLEU exceeded 31.6% in every category, and its answer generation effect was better than that of the other models. On the categories with sufficient experimental data, diseases and insect pests and cultivation management, the average BLEU of this model reached 38.7% and 37.1%, significantly higher than the other five answer generation models. Table 7 also shows that each model mostly performed better on these two categories than on the others, because the deep learning answer generation model is trained by continuous iteration: the larger the data set, the better the training effect of the model. The BLEUs of this model were 34.1%, 33.5%, and 31.6% on the weeds and pesticides, storage and preservation, and others data sets, respectively, which were still higher than those of the other models. This showed that the Attention-ResLSTM-Seq2seq model can effectively extract the features of short texts for answer generation even when data are insufficient, and that the model has good robustness.
Table 8 shows the BLEU and ROUGE of the six sequence-to-sequence rice-related question and answer models under the four text representations of GPT, BERT, Glove, and ELMo. For all six neural network models, the BLEU and ROUGE with the GPT text representation were higher than with the other three text representation methods. The Attention-ResLSTM-Seq2seq model achieved the best results under all four text representation methods, with BLEU/ROUGE reaching 35.3%/37.8% (GPT), 34.1%/35.7% (BERT), 32.7%/34.8% (ELMo), and 30.6%/31.6% (Glove), and its answer generation effect was better than that of the other five neural network models. Table 8 also shows that the average BLEU and ROUGE of the BERT and GPT text representations were higher than those of Glove and ELMo. The Glove text representation is a static word vector method that ignores polysemy and long-distance semantic relations in different contexts. ELMo handles polysemy dynamically, but because ELMo uses LSTM for feature extraction, its feature extraction ability is far weaker than that of BERT and GPT, which are built on the Transformer. At the same time, GPT is a unidirectional language model, while BERT is a bidirectional one, so GPT is more suitable for text generation, text translation, and question answering systems. Therefore, GPT as the text representation method can improve the accuracy and comprehensiveness of the answers generated by the rice-related question and answer model.
Table 9 shows the response time, BLEU, and ROUGE of the three attention-based sequence-to-sequence rice-related question answering models on the 1500 test pairs. The proposed Attention-ResLSTM-Seq2seq model meets the requirement of quick answers to rice questions. Attention-BiGRU-Seq2seq had the fastest response time, owing to the simple structure of the GRU and its few layers and parameters. The proposed Attention-ResLSTM-Seq2seq model generated the 1500 answers in 21 s, and its BLEU and ROUGE reached the highest scores of 29.8% and 27.7%; it thus had the best effect, with little difference in response time.

5. Conclusions

In this paper, we used a network based on Attention-ResLSTM-Seq2seq to construct the rice generative question answering model. First, the GPT pre-training model based on a 12-layer transformer was used to obtain the text representation of the rice question-and-answer pairs. Then, ResLSTM was used to extract text features in the encoder and decoder, and the attention mechanism was connected between the encoder and decoder to strengthen the weight of keyword feature information in the questions. The results showed that the rice question and answer model based on Attention-ResLSTM-Seq2seq achieved the highest BLEU and ROUGE scores of 35.3% and 37.8% compared with the other six question and answer models. Based on this work, the following aspects can be studied further:
(1) The representation of the rice knowledge text dataset is still based on manual annotation, which requires a lot of supervision. Semi-supervised and unsupervised models will become one of the main development directions, reducing the complexity of semantic understanding, improving processing efficiency, and enabling edge computing, so as to improve the universality of the model.
(2) For the fusion of multi-modal agricultural semantic analysis and image analysis, image and text features can be used to assist text semantic feature recognition, and complex semantics can be processed through unified dimension mapping and model construction.
(3) We will transfer the rice knowledge question answering text semantic model to the construction of text semantic models for other crops, providing key technical support for building intelligent question answering models for other crops in the future.

Author Contributions

Conceptualization, J.Z. and H.W. (Haoriqin Wang); methodology, H.W. (Haoriqin Wang) and H.Z. (Huaji Zhu); software, H.W. (Huarui Wu); validation, H.W. (Haoriqin Wang), H.Z. (Haiyan Zhao) and C.C.; formal analysis, H.W. (Haoriqin Wang); investigation, Q.W.; resources, H.W. (Haoriqin Wang) and H.Z. (Huaji Zhu); data curation, H.W. (Haoriqin Wang); writing—original draft preparation, H.W. (Haoriqin Wang) and H.Z. (Haiyan Zhao); writing—review and editing, H.W. (Haoriqin Wang), H.W. (Huarui Wu) and J.Z.; visualization, H.W. (Haoriqin Wang) and H.Z. (Huaji Zhu); supervision, S.Q. and Y.M.; project administration, S.Q. and Y.M.; funding acquisition, Q.W., H.Z. (Huaji Zhu) and C.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 61871041; Project of Agricultural Equipment Department of Jiangsu University, grant number 4111680005; Youth Foundation of Beijing Academy of Agriculture and Forestry Sciences, grant number QNJJ202030; Natural Science Foundation of Inner Mongolia Autonomous Region, grant number 2021LHMS06006; Automatic Classification of Massive Solar Data Based on Convolutional Deep Confidence Network, grant number KLSA201905; The Design and Realization of Intelligent Garbage Management, Science and Technology Innovation Guidance Project, grant number KCBJ2018029; Science and Technology Plan Project of Inner Mongolia Autonomous Region of China, grant number 2020GG0189; The Central Government Guided Local Science and Technology Development Fund project, grant number 2020ZY0003.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Chen, L.; Ma, R.; Hannák, A.; Wilson, C. Investigating the impact of gender on rank in resume search engines. In Proceedings of the 2018 Chi Conference on Human Factors in Computing Systems, Montreal, QC, Canada, 21–26 April 2018; pp. 1–14. [Google Scholar]
  2. Wang, H.; Zhu, H.; Wu, H.; Wang, X.; Han, X.; Xu, T. A Densely Connected GRU Neural Network Based on Coattention Mechanism for Chinese Rice-Related Question Similarity Matching. Agronomy 2021, 11, 1307. [Google Scholar] [CrossRef]
  3. Muthayya, S.; Sugimoto, J.D.; Montgomery, S.; Maberly, G.F. An overview of global rice production, supply, trade, and consumption. Ann. N. Y. Acad. Sci. 2014, 1324, 7–14. [Google Scholar] [CrossRef]
  4. Weersink, A.; Fraser, E.; Pannell, D.; Duncan, E.; Rotz, S. Opportunities and challenges for big data in agricultural and environmental analysis. Annu. Rev. Resour. Econ. 2018, 10, 19–37. [Google Scholar] [CrossRef]
  5. Li, M.; Li, Y.; Peng, Q.; Wang, J.; Yu, C. Evaluating community question-answering websites using interval-valued intuitionistic fuzzy DANP and TODIM methods. Appl. Soft Comput. 2021, 99, 106918. [Google Scholar] [CrossRef]
  6. Dwivedi, S.K.; Singh, V. Research and reviews in question answering system. Procedia Technol. 2013, 10, 417–424. [Google Scholar] [CrossRef] [Green Version]
  7. Young, T.; Hazarika, D.; Poria, S.; Cambria, E. Recent trends in deep learning based natural language processing. IEEE Comput. Intell. Mag. 2018, 13, 55–75. [Google Scholar] [CrossRef]
  8. Green, B.F., Jr.; Wolf, A.K.; Chomsky, C.; Laughery, K. Baseball: An automatic question-answerer. In Proceedings of the Western Joint IRE-AIEE-ACM Computer Conference, Los Angeles, CA, USA, 9–11 May 1961; pp. 219–224. [Google Scholar]
  9. Dargan, S.; Kumar, M.; Ayyagari, M.R.; Kumar, G. A survey of deep learning and its applications: A new paradigm to machine learning. Arch. Comput. Methods Eng. 2020, 27, 1071–1092. [Google Scholar] [CrossRef]
  10. Medelyan, O.; Witten, I.H. Thesaurus-based index term extraction for agricultural documents. In Proceedings of the 2005 EFITA/WCCA Joint Congress on IT in Agriculture, Vila Real, Portugal, 25–28 July 2005. [Google Scholar]
  11. Chao, W.; Li, S.; Hong, X. Research On Literature-Based Automatic Ontology Construction Method For Agricultural Domain. Comput. Appl. Softw. 2014, 31, 71–74. [Google Scholar]
  12. Tao, C.; Wu, W.; Xu, C.; Hu, W.; Zhao, D.; Yan, R. One time of interaction may not be enough: Go deep with an interaction-over-interaction network for response selection in dialogues. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 1–11. [Google Scholar]
  13. Wang, B.; Cao, H. A Summary of Research on Intelligent Dialogue Systems. J. Phys. Conf. Ser. 2020, 1651, 012020. [Google Scholar] [CrossRef]
  14. Suktarachan, M.; Kawtrakul, A. The Development of a Question-Answering Services System for the Farmer through SMS: Query Analysis. In Proceedings of the 2009 Workshop on Knowledge and Reasoning for Answering Questions (KRAQ 2009), Singapore, 6 August 2009. [Google Scholar]
  15. Gaikwad, S.; Asodekar, R.; Gadia, S.; Attar, V.Z. AGRI-QAS question-answering system for agriculture domain. In Proceedings of the 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Kochi, India, 10–13 August 2015; pp. 1474–1478. [Google Scholar]
  16. Muller, B.; Soldaini, L.; Koncel-Kedziorski, R.; Lind, E.; Moschitti, A. Cross-Lingual GenQA: A Language-Agnostic Generative Question Answering Approach for Open-Domain Question Answering. arXiv 2021, arXiv:2110.07150. [Google Scholar]
  17. Wang, H.; Wu, H.; Wang, Q.; Qiao, S.; Xu, T.; Zhu, H. A Dynamic Attention and Multi-Strategy-Matching Neural Network Based on Bert for Chinese Rice-Related Answer Selection. Agriculture 2022, 12, 176. [Google Scholar] [CrossRef]
  18. Zhang, Y.; Li, Y.; Zhang, G. Short-term wind power forecasting approach based on Seq2Seq model using NWP data. Energy 2020, 213, 118371. [Google Scholar] [CrossRef]
  19. Yu, Y.; Si, X.; Hu, C.; Zhang, J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 2019, 31, 1235–1270. [Google Scholar] [CrossRef] [PubMed]
  20. Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078. [Google Scholar]
  21. Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306. [Google Scholar] [CrossRef] [Green Version]
  22. Sundermeyer, M.; Schlüter, R.; Ney, H. LSTM neural networks for language modeling. In Proceedings of the Thirteenth Annual Conference of the International Speech Communication Association, Portland, OR, USA, 9–13 September 2012. [Google Scholar]
  23. Huang, J.; Sun, Y.; Zhang, W.; Wang, H.; Liu, T. Entity highlight generation as statistical and neural machine translation. IEEE/ACM Trans. Audio Speech Lang. Processing 2018, 26, 1860–1872. [Google Scholar] [CrossRef]
  24. Jang, M.; Seo, S.; Kang, P. Recurrent neural network-based semantic variational autoencoder for sequence-to-sequence learning. Inf. Sci. 2019, 490, 59–73. [Google Scholar] [CrossRef] [Green Version]
  25. Hastie, T.; Tibshirani, R.; Friedman, J.H.; Friedman, J.H. The Elements of Statistical learning: Data Mining, Inference, and Prediction; Springer: Berlin/Heidelberg, Germany, 2009. [Google Scholar]
  26. Papineni, K.; Roukos, S.; Ward, T.; Zhu, W.J. Bleu: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, 7–12 July 2002; pp. 311–318. [Google Scholar]
  27. Lin, C.Y. Rouge: A package for automatic evaluation of summaries. In Text Summarization Branches Out; University of Southern California: Marina del Rey, CA, USA, 2004; pp. 74–81. [Google Scholar]
  28. Pennington, J.; Socher, R.; Manning, C.D. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543. [Google Scholar]
  29. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
  30. Peters, M.; Neumann, M.; Iyyer, M.; Gardner, M.; Clark, C.; Lee, K.; Zettlemoyer, L. Deep Contextualized Word Representations. arXiv 2018, arXiv:1802.05365. [Google Scholar]
  31. Yang, B.; Wu, L.; Zhu, J.; Shao, B.; Lin, X.; Liu, T.Y. Multimodal Sentiment Analysis with Two-Phase Multi-Task Learning. IEEE/ACM Trans. Audio Speech Lang. Processing 2022, 1. [Google Scholar] [CrossRef]
  32. Dou, Y.; Forbes, M.; Koncel-Kedziorski, R.; Smith, N.; Choi, Y. Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; pp. 7250–7274. [Google Scholar]
  33. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Processing Syst. 2017, 30, 1–11. [Google Scholar]
Figure 1. Category distribution of rice question and answer corpus.
Figure 2. ResLSTM-Attention-Seq2seq Structure.
Figure 3. Long short-term memory structure.
Figure 4. ResLSTM Structure.
Figure 5. Seq2seq Structure.
Figure 6. Attention Structure.
Figure 7. Model training and testing process.
Figure 8. Training set loss rate of rice-related knowledge Q&A model trend chart.
Table 1. Sample of rice-related Q&A pair data.

Question | Answer
How to determine the water shortage of the rice seedbed in seedbed management? | If there are water drops (guttation) on the leaf tips in the morning, watering is not needed; if there are no water drops, the bed must be watered thoroughly.
When can rice use herbicides correctly? | Weeding should be carried out reasonably before and after rice transplanting or at the 4–6 leaf stage of rice seedlings.
Rice seedlings usually emerge several days after sowing. | Rice seedlings generally emerge from 7 to 10 days after sowing.
Rice damping blight mostly occurs at several leaf stages of rice seedlings. | Rice damping blight mostly occurs at the stage of two leaves and one heart.
Why should rice be dried before sowing? | Sun-drying can promote the post-ripening of seeds and improve the activity of enzymes.
What are the fungicides commonly used in rice seedbeds? | There are mainly 30% metalaxyl, 85% oxidil, dixone, ethylallicin and so on.
What are the advantages of rice seedling tray raising? | Rice seedling trays are beneficial to centralized management and convenient for throwing seedlings, with many roots and fast growth after transplanting.
What are the symptoms of rice sheath blight? | Brown striped patches mainly appear on the basal stems.
What are the main diseases of rice at the seedling stage? | The main diseases of the rice seedling stage are damping wilt, bakanae disease and seedling blast.
How to choose excellent rice varieties? | Choose varieties suitable for local conditions with high yield, high quality, strong stress resistance and high efficiency.
When can rice use herbicides correctly? | Weeding should be carried out reasonably before and after rice transplanting or at the 4–6 leaf stage of rice seedlings.
When will barnyard grass in paddy fields be controlled? | Barnyard grass in paddy fields is best removed when the seedlings are ploughed and weeded after turning green.
What problems should be paid attention to when using a rice germinator? | Set the germination acceleration time and heating upper limit, keep a reasonable water level, and check the pipes frequently.
Which soaking agent is better for rice seed soaking? | Soak rice seeds with strong chlorine essence or potassium permanganate.
What chemical fertilizer should be applied at the rice seedling stage? | On the basis of basal tiller fertilizer, recovery fertilizer should be properly applied according to seedling conditions, mainly compound fertilizer containing phosphorus and potassium.
What are the precautions for rice seed coating? | Selection of seed coating agent, coating temperature, film solidification, safe storage and strict prevention of seed drying.
What causes the dead stalks of rice in the later stage? | Sheath blight is the main cause of rice stalk death, and rice planthoppers can also cause it.
What is the cause of rice bakanae disease? | Seeds carry bacteria, and spores of pathogens sneak into seeds and spread in the first year.
Table 2. BLEU values of ResLSTM-Attention-Seq2seq under different batch sizes.

Batch size | BLEU1 (%) | BLEU2 (%) | BLEU3 (%) | BLEU4 (%)
32 | 36.7 | 35.6 | 34.3 | 35.1
64 | 34.6 | 33.1 | 32.5 | 33.9
128 | 33.7 | 31.5 | 29.8 | 32.7
Table 3. BLEU values of ResLSTM-Attention-Seq2seq under different dropout values.

Dropout | BLEU1 (%) | BLEU2 (%) | BLEU3 (%) | BLEU4 (%)
0.1 | 36.9 | 35.8 | 34.9 | 33.7
0.3 | 34.8 | 33.9 | 34.1 | 30.5
0.05 | 33.6 | 32.8 | 31.8 | 29.6
Table 4. Effects of the Seq2seq-LSTM Q&A model under different pre-training models.

Pre-Training Model | BLEU1 (%) | BLEU2 (%) | BLEU3 (%) | BLEU4 (%) | ROUGE (%)
GPT | 32.9 | 31.8 | 30.7 | 27.6 | 32.7
Glove | 29.7 | 27.7 | 25.4 | 23.1 | 28.7
BERT | 31.8 | 30.1 | 27.6 | 27.1 | 31.6
ELMo | 28.6 | 25.1 | 23.6 | 22.7 | 27.7
Table 5. BLEU and ROUGE comparison of different deep learning Q&A models.

Question Answering Model | BLEU1 (%) | BLEU2 (%) | BLEU3 (%) | BLEU4 (%) | ROUGE (%)
ResLSTM-Seq2seq | 35.1 | 33.2 | 32.1 | 29.8 | 36.1
LSTM-Seq2seq | 32.9 | 31.8 | 28.7 | 27.6 | 32.7
BiLSTM-Seq2seq | 33.7 | 32.1 | 29.6 | 28.1 | 34.6
GRU-Seq2seq | 29.7 | 28.6 | 25.3 | 24.1 | 30.5
BiGRU-Seq2seq | 30.6 | 28.7 | 27.6 | 25.1 | 31.7
Table 6. BLEU and ROUGE comparison of different deep learning Q&A models with and without the attention mechanism.

Question Answering Model | BLEU1 (%) | BLEU2 (%) | BLEU3 (%) | BLEU4 (%) | ROUGE (%)
ResLSTM-Seq2seq | 35.1 | 33.2 | 32.1 | 29.8 | 36.1
Attention-ResLSTM-Seq2seq | 37.7 | 36.6 | 34.9 | 32.1 | 37.8
BiLSTM-Seq2seq | 33.7 | 32.1 | 29.6 | 28.1 | 34.6
Attention-BiLSTM-Seq2seq | 35.6 | 33.2 | 30.9 | 28.6 | 35.7
BiGRU-Seq2seq | 30.6 | 28.7 | 27.6 | 25.1 | 31.7
Attention-BiGRU-Seq2seq | 32.9 | 30.8 | 27.9 | 25.9 | 33.8
Table 7. BLEU and ROUGE of different models and different types of rice-related Q&A pairs.

Question Answering Model | Diseases and Insect Pests (%) | Weeds and Pesticides (%) | Storage and Preservation (%) | Cultivation Management (%) | Other (%)
ResLSTM-Seq2seq | 35.1 | 32.7 | 36.1 | 29.8 | 30.2
Attention-ResLSTM-Seq2seq | 38.7 | 34.1 | 37.1 | 33.5 | 31.6
BiLSTM-Seq2seq | 34.1 | 28.7 | 35.9 | 27.8 | 25.1
Attention-BiLSTM-Seq2seq | 36.6 | 28.8 | 35.9 | 29.8 | 26.9
BiGRU-Seq2seq | 32.7 | 25.8 | 30.1 | 25.6 | 25.8
Attention-BiGRU-Seq2seq | 32.4 | 27.7 | 31.5 | 26.5 | 26.7
Table 8. BLEU and ROUGE of six models under different text vector presentation methods.

Model | GPT BLEU | GPT ROUGE | BERT BLEU | BERT ROUGE | ELMo BLEU | ELMo ROUGE | Glove BLEU | Glove ROUGE
ResLSTM-Seq2seq | 32.5 | 36.1 | 31.2 | 33.6 | 30.9 | 32.5 | 28.7 | 32.1
Attention-ResLSTM-Seq2seq | 35.3 | 37.8 | 34.1 | 35.7 | 32.7 | 34.8 | 30.6 | 31.6
BiLSTM-Seq2seq | 30.9 | 34.6 | 27.6 | 31.7 | 26.6 | 30.1 | 25.9 | 29.6
Attention-BiLSTM-Seq2seq | 32.1 | 35.7 | 30.5 | 32.9 | 30.8 | 31.6 | 28.6 | 30.7
BiGRU-Seq2seq | 28.1 | 31.7 | 25.7 | 30.6 | 23.7 | 29.7 | 23.1 | 28.7
Attention-BiGRU-Seq2seq | 29.3 | 33.8 | 28.5 | 32.1 | 27.8 | 30.6 | 26.9 | 28.5
Table 9. Comparison of response time and effect of three network models.

Model | BLEU (%) | ROUGE (%) | Response Time (s)
Attention-ResLSTM-Seq2seq | 29.8 | 27.7 | 21
Attention-BiLSTM-Seq2seq | 26.5 | 25.1 | 23
Attention-BiGRU-Seq2seq | 23.7 | 23.7 | 19
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
