*Article* **A Convolution Neural Network-Based Representative Spatio-Temporal Documents Classification for Big Text Data**

**Byoungwook Kim 1, Yeongwook Yang 2, Ji Su Park 3 and Hong-Jun Jang 3,\***


**\*** Correspondence: hongjunjang@jj.ac.kr; Tel.: +82-63-220-2372

**Abstract:** With the proliferation of mobile devices, the number of social media users and online news articles is rapidly increasing, and online text is accumulating as big data. As spatio-temporal information becomes more important, research on extracting spatio-temporal information from online text data and utilizing it for event analysis has been actively conducted. However, if spatio-temporal information that does not describe the core subject of a document is extracted, it is difficult to guarantee the accuracy of core event analysis. It is therefore important to extract the spatio-temporal information that describes the core topic of a document. In this study, spatio-temporal information describing the core topic of a document is defined as 'representative spatio-temporal information', and documents containing representative spatio-temporal information are defined as 'representative spatio-temporal documents'. We propose a character-level Convolutional Neural Network (CNN)-based document classifier to classify representative spatio-temporal documents. To train the proposed CNN model, 7400 training documents were constructed. The experimental results show that the proposed CNN model outperforms traditional machine learning classifiers and existing CNN-based classifiers.

**Keywords:** convolution neural network; spatio-temporal document; document classification; big text data

#### **1. Introduction**

Since social media data and online media data consist of natural language, they have a much larger and more complex structure than existing transaction data [1,2]. Because the media now distribute news articles online in order to deliver news to consumers quickly, online news articles can reveal current social trends and the behavioral patterns of members of society [3]. Social trend analysis of content published in online media has the advantage of being cheaper and faster than analysis by expert groups. Therefore, research that detects and monitors current major issues by analyzing unstructured text from social media or online news posts and extracting useful knowledge is being actively conducted.

For social trend analysis, it is important to identify event sentences in text documents such as social media posts or online news articles [4]. An event sentence is a sentence that expresses specific content about a specific topic, i.e., who, where, when, what, etc. The temporal and spatial information included in news articles is used, for example, to detect the early onset of disease and to determine the time and location of disease outbreaks [5]. The temporal and spatial information presented in online news articles therefore plays a decisive role in understanding social trends.

Existing research on detecting spatial and temporal information from text focuses on how accurately all temporal and spatial information contained in a document can be extracted [6–8]. A document can contain many pieces of information about time and space.


In this study, among the various spatial and temporal information included in a document, the temporal and spatial information describing the core topic of the document is defined as '*representative spatio-temporal information*'. A document including representative spatio-temporal information is defined as a '*representative spatio-temporal document*'. If, in addition to the representative spatio-temporal information, a large amount of general spatio-temporal information is extracted from a document, the accuracy of core event analysis based on spatio-temporal information can be lowered. To increase the accuracy of core event analysis through artificial intelligence, it is necessary to remove unnecessary spatio-temporal information and extract only the representative spatio-temporal information that accurately describes the core event in the document. Since extracting representative spatio-temporal information from a single document is a high-cost task, it is difficult to treat all documents from big data sources such as social media or online news as analysis targets. Therefore, in order to analyze core events efficiently through representative spatio-temporal information, it is important to first select the documents from which representative spatio-temporal information should be extracted.

Research using machine learning (Naïve Bayes [9,10], SVM [11,12], Random Forest [13,14], etc.) for automatic document classification has been conducted for some time. Recently, as deep learning-based Convolutional Neural Networks (CNNs) have been used for document classification, the performance of automatic document classification has greatly improved [15]. CNNs first attracted attention in the field of artificial intelligence by showing excellent performance in image classification and object detection [16–18]. Classification technology using CNNs has since expanded its field of application from images to text [19]. Recent work on document classification with CNNs focuses on classifying documents of specific domains (patent documents [20], contracts [21], infectious disease documents [22], etc.).

In this paper, we propose a character-level CNN-based representative spatio-temporal document classification model. First, we built 7400 training documents from online news articles provided by the National Institute of the Korean Language [23]. We then developed a character-level CNN-based document classifier (a.k.a. RepSTDoc\_ConvNet) that can classify representative spatio-temporal documents. RepSTDoc\_ConvNet has deeper convolutional and fully connected layers than existing CNN-based document classification models. To demonstrate the performance of the proposed model, we compared RepSTDoc\_ConvNet with three baseline machine learning classifiers (Gaussian Naïve Bayes, linear SVM, and random forest) and three deep learning-based models (ConvNet, DocClass\_ConvNet [22], and DocClass\_ConvNet\_Mod).

The final goal of our study is to extract representative spatio-temporal information from a large number of documents. To do so, it is first necessary to identify, among many documents, the representative spatio-temporal documents that contain such information; this paper corresponds to that classification stage. The extracted representative spatio-temporal information can then be used for natural disaster detection and for analyzing factors influencing business districts (events such as urban planning, building construction, traffic control, and store openings).

Our main contributions are summarized as follows.


The rest of the paper is organized as follows. Section 2 presents the literature review. In Section 3, we define the research problem. Section 4 is the proposed CNN-based document classifier model. In Section 5, we provide the experimental results and discuss the detailed implications along with their results. Section 6 presents the conclusion.

#### **2. Literature Review**

#### *2.1. Traditional Machine Learning-Based Document Classification*

The task of automatically classifying documents into given classes, rather than having humans read and sort them, has long been studied with traditional machine learning. Among the various document classification tasks, detecting whether or not an email is spam was treated as an early document classification problem. The most common machine learning algorithms used to detect spam emails are Naive Bayes, Support Vector Machines (SVMs), and Neural Networks. Gaussian Naive Bayes (GNB) is one of the earliest document classification algorithms applied to spam filtering because it has few false positives and simple processing [9,10]. GNB uses a conditional probability function combined with a simple bag-of-words feature to determine the overall probability of whether a given email is spam or not. First, stop words are deleted from the message, and the message is split into individual words. Across all messages in the data set, the total frequency of occurrence of each word is calculated. A threshold is applied to delete the least frequent words and produce the vocabulary of the data. The spam or non-spam labels are then used to calculate the probability of each word appearing in a spam message. Finally, the probability that a message is spam is calculated by combining the spam probabilities of the words in the message. Mitra et al. [24] present a least-squares support vector machine (LS-SVM) that classifies noisy document titles into predetermined categories. Random Forest (RF) classifiers are suitable for text classification on high-dimensional noisy data. Islam et al. [25] proposed a dynamic ensemble selection method to improve the performance of a random forest classifier in text classification.
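A minimal sketch of this bag-of-words Naive Bayes pipeline with scikit-learn follows. The tiny inline dataset is purely illustrative, and we use the multinomial variant, which is the usual choice for word counts (the experiments later in this paper use the Gaussian variant):

```python
# Sketch of the spam pipeline described above: stop-word removal, word
# splitting, rare-word thresholding (min_df), and per-word probabilities
# combined under the naive independence assumption.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

messages = [
    "win a free prize now",          # spam
    "free cash click now",           # spam
    "meeting moved to monday",       # ham
    "see you at the office monday",  # ham
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = ham

vectorizer = CountVectorizer(stop_words="english", min_df=1)
X = vectorizer.fit_transform(messages)

clf = MultinomialNB().fit(X, labels)
print(clf.predict_proba(vectorizer.transform(["free prize monday"])))
```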

#### *2.2. Deep Learning-Based Document Classification*

Deep learning uses multi-layered artificial neural networks and learns useful features directly from data. It is changing the paradigm of machine learning research, showing remarkable performance gains in many areas of computer vision. Deep learning has been applied to computer vision since 1989, when Yann LeCun [26] proposed a Convolutional Neural Network, which divides an image into several local regions and shares weights, for character recognition in an automatic postal classification system. A CNN takes tensors as input, learns features of the input data, passes the data through layers of neurons in multiple stages, and computes weights to pass as input to the next layer. The main components that distinguish CNNs from ordinary neural networks are three layer types: the convolutional layer, the pooling layer, and the fully connected layer. The convolutional layer convolves the multidimensional features of the input tensor and outputs a reduced vectorization to pass to the pooling layer. The max-pooling layer extracts the maximum from each neuron cluster in the previous layer, reducing dimensionality while retaining important information from the convolution. The final fully connected layer connects the final nodes to each specified output class. Recently, in the field of computer vision, Recurrent Neural Networks (RNNs) have also been used for image and video description generation, handwriting recognition, and translation of text or sound in images or videos [27].

Deep learning is also being actively applied to text classification, which identifies the category to which an input text belongs. Word2Vec is used to transform text into tensors or vectorized representations for processing in CNNs. CNNs have shown higher performance in spam classification than traditional machine learning methods. Huang [28] proposed a CNN model for Chinese SMS (Short Message Service) spam detection; that study also discusses the influence of hyper-parameters on CNN models and proposes optimal combinations of hyper-parameters. Liu et al. [29] proposed a modified deep CNN model for email sentiment classification. Mutabazi et al. [30] reviewed medical text question-answering systems using deep learning. Kim et al. [22] developed a document classification model for infectious diseases using deep learning; the model was constructed using two deep learning algorithms (ConvNet and BiLSTM) and two classification methods, DocClass and SenClass. Given a specific text extraction system, it was shown to be comparable to the classification performance of human experts, demonstrating the potential of deep learning for identifying epidemic outbreaks. Table 1 presents a summary of methods for text classification.

**Table 1.** Summary of methods for text classification.


#### **3. Problem**

In this section, we first define several concepts as well as the problem of representative spatio-temporal documents.

**Subject of the document**. Let *D* = {*d*1, ... , *d*n} be a set of documents. Each document has a core subject, which is the message the author wants to convey to the reader. For example, consider a news article reporting the damage of a typhoon that struck Jeju Island, South Korea on September 7. *d*i.*subject* = {'typhoon damage'} denotes that the subject of *d*i is the damage caused by the typhoon that occurred on Jeju Island on September 7.

**Spatio-temporal word.** *d*i = {*s*1, ... , *s*m} is a sequence of sentences and *s*i = {*w*1, ... , *w*l} is a sequence of words. Among the words contained in a document, there are words denoting the specific time and place at which an event occurred. *w*i.*time* = {'September 7'} denotes that an event occurred on September 7. *w*j.*place* = {'Jeju Island'} denotes that the place where an event occurred is Jeju Island.

**Representativeness of spatio-temporal word.** Several spatio-temporal words can exist in one document. Some of the spatio-temporal words are related to the subject of the document, and some are not. Among spatio-temporal words, we consider the words most relevant to the subject of a document to be 'representative spatio-temporal words'. We denote a representative spatio-temporal word by *w*i.*representativeness* = *true*.

**Representative spatio-temporal document.** We define a document containing both a representative spatial word and a representative temporal word among words included in one document as a representative spatio-temporal document.
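These definitions can be summarized in a small data-structure sketch (our own illustration; the field names are not from the paper):

```python
# Minimal model of the definitions above: documents have a subject, words may
# carry time/place information, and a representative spatio-temporal document
# must contain both a representative temporal and a representative spatial word.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class STWord:
    text: str
    time: Optional[str] = None       # e.g., w_i.time = 'September 7'
    place: Optional[str] = None      # e.g., w_j.place = 'Jeju Island'
    representative: bool = False     # w_i.representativeness

@dataclass
class Document:
    subject: str                     # d_i.subject
    words: List[STWord] = field(default_factory=list)

    def is_representative_st_document(self) -> bool:
        has_time = any(w.representative and w.time for w in self.words)
        has_place = any(w.representative and w.place for w in self.words)
        return has_time and has_place

doc = Document(subject="typhoon damage", words=[
    STWord("September 7", time="September 7", representative=True),
    STWord("Jeju Island", place="Jeju Island", representative=True),
])
print(doc.is_representative_st_document())  # True
```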

#### **4. Materials and Methods**

#### *4.1. Datasets*

In this study, training data for the classification of representative spatio-temporal documents were constructed from a published Korean corpus. The National Institute of Korean Language [23] releases various Korean-language datasets, and we used its newspaper corpus, provided for research purposes, to construct the training data. This newspaper corpus is a collection of 3,536,491 articles published over the 10 years from 2009 to 2018. The corpus consists of 363 files with a total size of 15.6 GB. The original files are JSON (UTF-8 encoding). The raw data contains article content in the document tag. One article consists of a metadata tag holding the article's metadata (title, article name, newspaper company, publication date, and subject) and a paragraph tag holding the article body. Within the paragraph tag, the article body is divided into paragraphs represented by form tags.
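A sketch of reading article bodies from such files is given below. The key names (document, metadata, paragraph, form) are taken from the description above, but the exact schema is an assumption and should be checked against the corpus files:

```python
# Sketch: iterate over articles in one corpus file, yielding (title, body).
import json

def load_articles(path):
    with open(path, encoding="utf-8") as f:   # files are UTF-8 JSON
        data = json.load(f)
    for doc in data["document"]:
        meta = doc["metadata"]                # title, publisher, date, subject, ...
        # each paragraph entry carries its text in a form tag
        body = " ".join(p["form"] for p in doc["paragraph"])
        yield meta.get("title", ""), body
```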

#### *4.2. Data Preprocessing*

We constructed representative spatio-temporal training data for 7400 articles out of the 3,536,491 articles. Eight workers read the content of each news article and judged whether the article has representative spatio-temporal information. The quality of training data is important for the performance of artificial intelligence systems, so to keep data quality consistent across workers, the workers cross-checked each other's results three times.

#### *4.3. Deep Learning Model*

Determining whether or not a news article is a representative spatio-temporal document is a binary classification problem. We used a character-level convolutional neural network called ConvNet [15]. For text classification, models typically divide sentences/paragraphs/documents into word-level tokens. However, Zhang et al. [15] argue that by using character (alphabet-level) tokens instead of word-level tokens, good performance on Natural Language Processing (NLP) tasks can be achieved without word-level units; character-level tokenization was first presented in that work. We likewise used an embedding matrix created by tokenizing the text in character units, as shown in Figure 1.

**Figure 1.** Character-level embedding. ('나는 학교에 간다' in Korean means 'I go to school' in English).
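As a minimal illustration of character-level tokenization, each character (spaces included) is mapped to an integer index that selects a row of the embedding matrix; the vocabulary below is built from the example sentence only:

```python
# Character-level tokenization sketch: one integer id per character.
text = "나는 학교에 간다"  # "I go to school"

vocab = {ch: i + 1 for i, ch in enumerate(sorted(set(text)))}  # 0 reserved for padding
ids = [vocab[ch] for ch in text]
print(ids)  # index sequence that selects embedding-matrix rows
```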

ConvNet treats each document as a sequence of characters, which is passed through 6 convolutional and max-pooling layers and 3 fully connected layers to determine the probability that the document belongs to the positive class. Because this model does not require pre-trained word embeddings, it trains quickly with reasonable performance compared to word-level models.

We developed a character-level CNN-based document classifier for representative spatio-temporal documents, RepSTDoc\_ConvNet, which uses the entire document as input. We used the layers of the CNN model DocClass\_ConvNet from [22] as our baseline. Figure 2 shows a comparison of the two models.

ConvNet is 9 layers deep, with 6 convolutional layers and 3 fully connected layers. DocClass\_ConvNet is 6 layers deep, with 4 convolutional layers and 2 fully connected layers. RepSTDoc\_ConvNet is 12 layers deep, with 9 convolutional layers and 3 fully connected layers.

In order to train a ConvNet model, documents of various lengths must be brought to a constant length. Considering the hardware memory constraints and the length distribution of the training data, the number of characters per document was set to 4700. Longer text is truncated and shorter text is padded.
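A Keras sketch of a model with RepSTDoc\_ConvNet's overall shape (9 convolutional and 3 fully connected layers, 4700-character input) is shown below. The filter counts, kernel sizes, and pooling positions are our own illustrative assumptions, not the paper's exact configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers

MAX_LEN, VOCAB_SIZE, EMBED_DIM = 4700, 2000, 16   # 4700-character input, as above

inputs = tf.keras.Input(shape=(MAX_LEN,), dtype="int32")
x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inputs)     # character embedding
for i in range(9):                                      # 9 convolutional layers
    x = layers.Conv1D(128, kernel_size=7 if i < 2 else 3, activation="relu")(x)
    if i in (0, 1, 8):                                  # pool occasionally to shrink the sequence
        x = layers.MaxPooling1D(3)(x)
x = layers.Flatten()(x)
x = layers.Dense(1024, activation="relu")(x)            # 3 fully connected layers
x = layers.Dense(1024, activation="relu")(x)
outputs = layers.Dense(1, activation="sigmoid")(x)      # binary: RepSTDoc or not
model = tf.keras.Model(inputs, outputs)

# Longer documents are truncated, shorter ones padded, to exactly MAX_LEN characters.
padded = tf.keras.preprocessing.sequence.pad_sequences(
    [[5, 9, 2]], maxlen=MAX_LEN, padding="post", truncating="post")
```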

**Figure 2.** A comparison of DocClass\_ConvNet [22] and RepSTDoc\_ConvNet.

#### **5. Results and Discussion**

In this section, we present comprehensive experimental results for the deep learning model. The purpose of this paper is to develop a deep learning-based classifier for representative spatio-temporal documents. To put the proposed classifier in context, we first evaluated three traditional machine learning algorithms: Gaussian Naïve Bayes, Linear SVM, and Random Forest. For comparison with our CNN model (RepSTDoc\_ConvNet), we also evaluated DocClass\_ConvNet, an existing CNN-based binary document classifier, and DocClass\_ConvNet\_Mod, which adjusts the hyper-parameters of DocClass\_ConvNet to fit our dataset.

To confirm that our CNN model works properly, we pre-tested the performance of binary classification using the benchmark spam dataset from the UCI Repository [31]. The spam dataset contained 5572 messages in English. This spam dataset was fed to our proposed CNN model and the experimental results were as follows: accuracy (0.982), precision (0.962), recall (0.916), and F1-score (0.938). This result is not significantly different from that of the recently published CNN model [32].

All experiments were carried out on a GeForce RTX 2080 Ti GPU with 11 GB of memory and an Intel(R) Xeon CPU with 64 GB of memory.

#### *5.1. Performance Evaluation*

For the experiments, we divided the collected data into training (60%), validation (20%), and test (20%) sets, as shown in Table 2. Positive (target) documents make up about 25.23% of each split. The training data was used to train the model, the validation data was used to select the best-performing model during training, and the test set was used to evaluate the finally selected model.
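A minimal sketch of such a stratified 60/20/20 split is given below; the placeholder arrays stand in for the documents and labels of Section 4:

```python
# Stratification keeps the ~25% positive ratio identical in every split.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(-1, 1)    # placeholder features
y = np.array([1, 0, 0, 0] * 5)      # ~25% positive, as in Table 2

X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=42)      # 60% train
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=42)  # 20% / 20%
```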


**Table 2.** Statistics of training, validation, and test data.

#### *5.2. Hyper-Parameter Tuning*

A CNN has several hyper-parameters, such as kernel size, batch size, dropout rate, learning rate, pooling window size, pooling type, activation function, number of neurons in the dense layer, and optimization function. We found suitable parameter values for the proposed model by manually adjusting each parameter, using the learning curves for accuracy and loss on the training and validation data in every experiment. The parameters used in the experiments are summarized in Table 3, with the best-performing values shown in bold. During training, we ran our CNN model for up to 1000 epochs with an early stopping patience of 220.

**Table 3.** Hyper-parameters for the experiments.


Overfitted deep learning models cannot be trusted to predict well on new data. Therefore, training should stop when the loss on the validation data no longer decreases. Early stopping is a regularization technique that helps neural networks avoid overfitting [33]. The EarlyStopping callback terminates training early when the monitored metric does not improve within the set number of epochs. Combining the EarlyStopping and ModelCheckpoint callbacks makes it possible to stop non-improving training early and to resume from the best model saved by ModelCheckpoint. Both training loss and validation loss decrease until overfitting occurs; once it does, training loss keeps decreasing while validation loss increases. We therefore set the monitor option of the EarlyStopping callback to stop training when the validation loss increases.
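A sketch of this callback setup in Keras follows; the model and data objects from the surrounding experiments are assumed to exist, so the training call is left commented:

```python
import tensorflow as tf

callbacks = [
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss",      # stop when validation loss stops decreasing
        patience=220,            # the patience used in this study
        mode="min"),
    tf.keras.callbacks.ModelCheckpoint(
        "best_model.h5",
        monitor="val_loss",
        save_best_only=True),    # keep the best model for reloading
]
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=1000, callbacks=callbacks)
```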

#### *5.3. Experimental Results*

We compared RepSTDoc\_ConvNet with three baseline machine learning classifiers (Gaussian naïve Bayes, linear SVM, and random forest) and three deep learning models (ConvNet, DocClass\_ConvNet, and DocClass\_ConvNet\_Mod). DocClass\_ConvNet reproduces the CNN layers and hyper-parameters presented in [22]. DocClass\_ConvNet\_Mod keeps the same CNN layers as DocClass\_ConvNet but optimizes the hyper-parameter values for our experimental data. Deep learning training involves random weight initialization, so to compensate for this randomness, we report the average performance over 10 runs of each experiment. The experimental results are presented in Table 4.

The accuracy of the machine learning algorithms in classifying representative spatio-temporal documents ranged from 0.74 to 0.79. This is far below the performance of machine learning on general document classification problems, even though the same CNN layers achieve relatively high performance on the spam classification problem. These results indicate that classifying representative spatio-temporal documents is a difficult problem.


**Table 4.** Comparison of evaluation based on the precision, recall, F1 score, and accuracy.

Random Forest showed the highest precision with 0.729, and DocClass\_ConvNet\_Mod showed the highest accuracy with 0.794. RepSTDoc\_ConvNet showed the highest recall and F1-score with 0.673 and 0.612, respectively. In terms of accuracy, DocClass\_ConvNet\_Mod appears to perform best with 0.794. However, considering the confusion matrix, it is not appropriate to evaluate performance on this problem with accuracy alone. Figure 3 shows the confusion matrices of Linear SVM, Random Forest, and RepSTDoc\_ConvNet.

In the validation data used to evaluate the proposed CNN model, the proportion of representative spatio-temporal documents (RepSTDoc) is only 25.20%. Therefore, even a model that has learned nothing can reach an accuracy of 74.80%. In this case, high accuracy is maintained even if the model predicts few documents as RepSTDoc. In Figure 3a, Linear SVM classified 123 documents (46 false positives, 77 true positives) as RepSTDoc. Even though the model was not trained properly, the high true negative count (471) yields high accuracy. Random Forest, with the second-highest accuracy, behaves similarly: its accuracy is 0.770 even though it classified few documents (48) as RepSTDoc, because the model is hardly trained; this is confirmed by its small recall (0.191). In Figure 3c, RepSTDoc\_ConvNet classified 257 documents (123 false positives, 134 true positives) as RepSTDoc. In RepSTDoc\_ConvNet, as true positives increased, false positives also increased; that the model classified many documents as RepSTDoc can be seen from its high recall (0.609). This behavior arises because the numbers of positive and negative documents in the data are imbalanced. Therefore, to evaluate model performance accurately, the F1-score, which considers both precision and recall, should be used. In terms of the F1-score, RepSTDoc\_ConvNet yields the highest performance with 0.609.
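The following short Python sketch illustrates this point: a degenerate classifier that never predicts RepSTDoc reaches roughly the 74.8% accuracy noted above while its F1-score is 0. The label counts are illustrative, chosen only to match the class ratio:

```python
# Why accuracy misleads on imbalanced data: the "never RepSTDoc" baseline.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

y_true = np.array([1] * 252 + [0] * 748)   # ~25.2% positive documents
y_pred = np.zeros_like(y_true)             # baseline: never predict RepSTDoc

print(accuracy_score(y_true, y_pred))              # 0.748
print(f1_score(y_true, y_pred, zero_division=0))   # 0.0
```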

To verify the difficulty of the representative spatio-temporal document classification problem, we measured the classification accuracy of human workers on 1400 labeled documents, consisting of 359 representative spatio-temporal documents and 1041 non-representative ones. Four workers who had participated in building the training data classified these 1400 documents. For each document, we counted the number of workers who judged an actual representative spatio-temporal document to be representative (True Positive: TP) and the number who judged it to be non-representative (False Negative: FN).

For the actual representative spatio-temporal documents, Table 5 reports how many were judged as TP by all four workers, by three or more, by two or more, and by one or more. Of the 359 representative spatio-temporal documents, 189 (52.64%) were judged as TP by all 4 workers, 251 (69.92%) by 3 or more, 310 (86.35%) by 2 or more, and 332 (92.48%) by 1 or more.

**Table 5.** The ratio and count of actual representative spatio-temporal documents to be judged as representative spatio-temporal documents according to workers.


For the actual nonrepresentative spatio-temporal documents, Table 6 likewise reports how many were judged as FN by all four workers, by three or more, by two or more, and by one or more. Of the 1041 nonrepresentative spatio-temporal documents, 5 (0.48%) were judged as FN by all 4 workers, 24 (2.31%) by 3 or more, 66 (6.34%) by 2 or more, and 135 (12.97%) by 1 or more.

**Table 6.** The ratio and count of actual nonrepresentative spatio-temporal documents to be judged as nonrepresentative spatio-temporal documents according to workers.


We first illustrate the difficulty of the representative spatio-temporal document classification problem through the proportion of documents for which at least three of the four judges, i.e., more than half, judged an actual representative spatio-temporal document to be representative. Only about 70% of the actual representative documents were judged as TP by three or more workers, and only about 53% by all four, confirming that it is difficult even for humans to identify representative spatio-temporal documents among many documents.

#### *5.4. Effect of Learning Rate*

The learning rate determines by how much the weights are updated during model training, and thus how quickly the model adapts to the problem. A learning rate that is too large can converge quickly to a suboptimal solution, while one that is too small can cause training to stall. The learning rate is thus one of the important hyper-parameters to select in training deep learning neural network models. We experimented with the effect of the learning rate [0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001] on performance.

Figure 4 shows the effect of the learning rate on ConvNet, DocClass\_ConvNet\_Mod, and RepSTDoc\_ConvNet. Learning rates at which no training occurred (0.1, 0.01, and 0.000001) are omitted from the graph. In the range where the models do train, the F1-score tends to increase as the learning rate decreases, and each model's performance varies substantially with the learning rate. On the representative spatio-temporal training data used in this study, a learning rate of 0.00001 gives the highest performance.

**Figure 4.** The effect of learning rate.

#### *5.5. Effect of Batch Size*

Most deep learning models are trained with mini-batch stochastic gradient descent (SGD), and the batch size is one of the important hyper-parameters in training. Various studies have examined the effect of batch size on model training; although it is not yet clearly understood, several studies have observed experimentally that small batch sizes have a positive effect on generalization performance. We experimented with the effect of the batch size [16, 32, 64, 128, and 256] on performance.

Figure 5 shows the effect of batch size on ConvNet, DocClass\_ConvNet\_Mod, and RepSTDoc\_ConvNet. On the representative spatio-temporal training data used in this study, there was no consistent performance trend across models. RepSTDoc\_ConvNet tends to improve as the batch size increases over the range in which it trains [32, 64, 128, and 256], whereas for DocClass\_ConvNet\_Mod the variation in performance with batch size was not consistent. Although this result cannot be generalized, the batch size may not affect model performance, depending on the complexity of the CNN layers and the characteristics of the data.

#### *5.6. Time Efficiency*

The numbers of weights in DocClass\_ConvNet\_Mod, ConvNet, and RepSTDoc\_ConvNet are 1,410,609, 1,446,261, and 5,083,129, respectively. The overall running time is affected by the complexity of the neural network, because the amount of computation grows with the number of weights in the network. Table 7 shows the time efficiency of the three algorithms.

**Figure 5.** The effect of batch size.

**Table 7.** The comparison of the time.


#### *5.7. Data Distribution Rate*

We also investigated the performance difference according to the distribution ratio of training, validation, and test data, varying the proportion of training data while keeping the validation and test portions equal. The ratios used in the experiment were 4:3:3, 6:2:2, and 8:1:1 for training, validation, and test data, respectively. Figure 6 shows the highest performance at the 6:2:2 ratio, though there is little difference in the performance of each model across ratios.

**Figure 6.** The effect of distribution rate.

#### *5.8. Receiver Operating Characteristic*

The Receiver Operating Characteristic (ROC) curve shows the performance of a binary classifier across thresholds. Figure 7 shows the ROC curves for ConvNet, DocClass\_ConvNet\_Mod, and RepSTDoc\_ConvNet. ConvNet outperformed the other models near the lower-left corner, but where the false positive rate exceeds 0.2, RepSTDoc\_ConvNet was superior to the other models. Overall, RepSTDoc\_ConvNet showed the best performance for classifying representative spatio-temporal documents.
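For reference, a minimal sketch of how such ROC curves can be produced with scikit-learn is shown below; the labels and scores are random placeholders standing in for each model's predicted probabilities:

```python
# ROC curve sketch: one curve per model's scores on the test data.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 500)
scores = y_true * 0.3 + rng.random(500) * 0.7   # placeholder model scores

fpr, tpr, _ = roc_curve(y_true, scores)
plt.plot(fpr, tpr, label=f"model (AUC = {auc(fpr, tpr):.2f})")
plt.plot([0, 1], [0, 1], linestyle="--")        # chance line
plt.xlabel("False positive rate"); plt.ylabel("True positive rate")
plt.legend(); plt.show()
```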

#### **6. Conclusions**

The purpose of this paper was to develop a CNN-based representative spatio-temporal document classification model. Because the representative spatio-temporal document is a novel concept, we defined it as a document containing spatio-temporal information that describes the document's core topic. We built 7400 training documents and developed a character-level CNN-based classifier, RepSTDoc\_ConvNet, for representative spatio-temporal documents. To evaluate RepSTDoc\_ConvNet, we measured the performance of three traditional machine learning algorithms (Gaussian Naïve Bayes, Linear SVM, and Random Forest) and of ConvNet, DocClass\_ConvNet, and DocClass\_ConvNet\_Mod. The experimental results show that RepSTDoc\_ConvNet outperforms the traditional machine learning classifiers and the existing CNN-based classifiers.

A limitation of this work is that RepSTDoc\_ConvNet still performs below general document classifiers, which shows that classifying representative spatio-temporal documents is a difficult problem and that the features of the input data need to be diversified. To further improve the representative spatio-temporal document classifier, a way must be found to reduce false positives by identifying the characteristics that distinguish general spatio-temporal documents from representative ones.

**Author Contributions:** Conceptualization, H.-J.J. and B.K.; methodology, B.K.; software, B.K.; validation, H.-J.J.; investigation, Y.Y.; data curation, Y.Y. and B.K.; writing—original draft preparation, H.-J.J. and B.K.; writing—review and editing, J.S.P.; visualization, J.S.P.; supervision, B.K.; project administration, B.K.; funding acquisition, B.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Korean Government (MSIT) (No. 2021R1F1A1049387) and by the industry-academic Cooperation R&D program funded by the LX Spatial Information Research Institute (LXSIRI, Republic of Korea) (Project Name: A Study on the Establishment of Service Pipe Database for Safety Management of Underground Space/Project Number: 2021-502). This result was supported by the "Regional Innovation Strategy (RIS)" through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (MOE) (1345341782).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Written informed consent has been obtained from the patient(s) to publish this paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


*Article* **A Smart-Mutual Decentralized System for Long-Term Care**

**Hsien-Ming Chou**

Department of Information Management, Chung Yuan Christian University, Taoyuan City 32023, Taiwan; chou0109@cycu.edu.tw

**Abstract:** Existing long-term caretakers are assigned in a constrained and random way to care for older people, which can lead to manpower shortages and poor quality of care, especially as the proportion of older people increases year after year and long-term care becomes more and more important. In addition, because of differences in background, ill-suited caregivers may cause older people to suffer spiritual alienation under the current system. Most existing studies present a centralized architecture, but even when technological elements such as cloud center services or expert systems are incorporated, the above challenges remain unsolved. This study moves past the centralized architecture and uses a decentralized architecture with Artificial Intelligence and Blockchain technology to refine the model of providing comprehensive care for older people. Using the mapping mutual clustering algorithm proposed in this study, the assignments of caregivers and older people can be changed at any time based on four main background elements: risk level, physiology, medical record, and demography. In addition, this study uses the proposed long-term care decentralized architecture algorithm to keep care records stable and transparent, achieving continuous tracking; based on previous records, it can also dynamically change to a new matching mode. The main contribution of this research is an innovative solution to the problems of mental alienation, insufficient manpower, and privacy. This study also evaluates the proposed method through practical experiments: the cooperation features were evaluated against user perceptions with a one-sample *t*-test, and the research model with the proposed algorithm was compared against the model without it through ANOVA analysis, with all hypotheses supported. The results reveal a high level of accuracy for the proposed mutual algorithm's forecasting and positive user perceptions from the post-study questionnaire. As an emerging research topic, this study provides an important research basis for scholars and experts interested in continuing related research in the future.

**Keywords:** older people; long-term care; artificial intelligence; blockchain technology; decentralized architecture

#### **1. Introduction**

Older people normally manage their daily activities in residential aged care through family members, professional caregivers, or by themselves. However, most care agencies focus on employee costs and are chronically short of staff, limiting healthcare systems [1,2]. In addition, differences in background between generations can cause generation-gap issues such as different ideas, education, and even political leanings [3,4]. The current method of distributing human resources, as assigned by care agencies, is insufficient because older people participate in different social networks. In particular, the COVID-19 pandemic has made long-term care facilities with fixed staff much riskier than dynamic mutual arrangements for keeping sufficient caretakers [5,6]. Older people interact with others to manage spiritual loneliness and to watch out for accidents. Older people are often highly active, unlike those in nursing care with chronic diseases, so it is necessary to consider their willingness to collaborate.


In the past, long-term care homes with social connections kept stable caretakers and good relationships among long-term care residents. However, under this traditional centralized architecture, protecting people living in long-term care from COVID-19 infection requires some staff to restrict activities and interactions with older people, which can have a devastating impact on residents' social connections [5,7]. Older people have different interests and political thinking, so it is a challenge to match their needs with one particular method. The first research question (RQ1) is: what features are suitable for the mapping procedure of a mutual algorithm that addresses manpower shortages and solitary living?

An effective method should be customizable with novel technologies to satisfy personal needs and preferences. Older people may have dynamic preferences even under the same feature conditions, which may influence the chance of success when building a mutual algorithm. The second research question is therefore as follows. RQ2: What kinds of mapping architectures and technologies can help us build an effective mapping procedure, based on the proposed mutual algorithm, that accommodates human variability and privacy protection?

The motivation of this article is to clearly identify and solve the existing issues in long-term care, so that older people live safely and happily, and to offer related organizations a solution to the shortage of manpower. The expected contributions include (1) solving the existing manpower shortage in caring for older people; (2) considering the fitness of the cooperation for both sides in long-term care; (3) adjusting dynamically to varying human characteristics; (4) recording the process and outcome of care so that it can be credited in the next arrangement; and (5) helping to measure physical condition based on the records. An effective mutual algorithm should not only consider features related to personal characteristics and human variability, but also record and improve collaboration or transaction processes through a highly trusted and rigid platform. The primary target of this study is to identify more accurate personal characteristics that fit the mapping procedure. In addition, it aims to implement Artificial Intelligence (AI) on a suitable mapping architecture to make the empirical process of the system both appropriate and reliable.

#### **2. Architecture Theories**

An ideal architecture for long-term care should consider whether it can bring older people good service quality. The service quality of long-term care has been evaluated from the perceptions of those receiving the service [8]. Older people care about the quality of the long-term care system, including the health care provided, yet most studies show that long-term care providers do not always pay attention to the quality of the services provided. Service quality can be used as a strategic tool for building distinctive features. The literature divides service quality into technical and functional dimensions [9]. The technical dimension of long-term care concerns the architectural design that maintains the quality of medical diagnoses, procedures, and services, as well as conformance to professional specifications and standards, such as the centralized and decentralized architectures of long-term care [10]. The functional dimension refers to the manner in which long-term care service is delivered and the quality of the relationship between older people and their caregivers.

#### *2.1. The Centralized Architecture of Long-Term Care*

In the centralized architecture of long-term care, long-term care agencies manage the activities of the caregivers they offer. The caregivers are like insurance agents: they are trained, hold care permits, and are assigned to a set of older people. However, this is not simply a problem of financial centralization or decentralization. Agency caregivers may seek to take care of agreeable people and avoid older people at high risk levels. Centralizing access to long-term care agents through specialized data services can also raise security or privacy issues, leading to failures in protecting the personal data of older people [9,10].

Traditionally, in the centralized architecture of long-term care, decision-making power is managed directly by the long-term care agencies. To save cost, centralization of long-term care aims at effective enforcement and control of caregivers' activities for consistent operation [9,10]. As a result, the centralized architecture, unlike many security agencies or entities in the human world, can create matching problems: an unsuitable personality can leave older people feeling uncomfortable or lonely. In addition, privacy issues in personal data protection and the insufficient manpower offered by long-term care agencies are also serious problems under this centralized architecture.

#### *2.2. The Decentralized Architecture of Long-Term Care*

It is important to consider the service quality of caregivers under the centralized architecture of long-term care. However, measuring service quality in long-term care is very difficult, because understanding the real perceptions of older people and their satisfaction is quite complex [8]. Different agents may provide the same types of services with different quality. Decentralization of long-term care is the process of shifting decision making away from centralized control and closer to the older people receiving the services. In many countries, the government has opted to decentralize the health system as a means of improving the responsiveness and performance of long-term care delivery [10]. Several studies have found that the decentralized architecture of long-term care affects system performance [11,12]. Even in the decentralized architecture, a model is still needed to handle healthcare services, because three main issues (mental problems from unsuitable matches, the privacy of older people's data, and insufficient manpower) are crucial to maintaining good caregiver service quality.

Existing studies present personal information management systems, which can offer specific features such as interests or contact lists related to the characteristics of older people, to manage communication through a centralized cloud system [13–16]. The basic idea behind using the decentralized architecture of long-term care to replace or support human resource agencies or other specific local platforms as controlled centers is that good services can consider all human resources when adjusting to or coordinating needs [17]. Some existing studies still offer cloud services to share medical data among entities with minimal data privacy protection [18]. Although existing studies suggest using smart contracts to track violations of data permissions, current work has serious problems because it does not consider other impacts on older people, such as risk levels. In addition, a common limitation of those approaches is that many older people long for a social network with regular interaction to manage spiritual loneliness, and they lack a platform with a mutual algorithm and adjustment abilities based on a mapping process suitable for older people [19]. For the second research question, a major challenge in this field is to explore a process innovation [20] in search of an answer. To address the second research question and the problems mentioned above, this study proposes using the decentralized architecture of long-term care to store the mapping or transaction procedure based on AI methods, with the information secured and shared across all network candidates.

#### **3. The Mutual Algorithm**

When older people are engaged in their daily activities, they may defer or give up their current activities to take part in a mutual algorithm effort. Accordingly, a good mutual algorithm should consider whether older people prefer to continue working on their main tasks and be given the opportunity to defer participation until they have completed their current activities. This also implies that a good mutual algorithm should allow older people to flow in and out freely, helping them realize rationalization and optimization through their participation in the structure.

Clustering is one of the unsupervised learning methods in machine learning; its algorithms may differ significantly in cluster analysis but efficiently identify factors across similar features [21,22]. This study applies a mapping mutual clustering (MMC) algorithm, derived from clustering methods, to the long-term care field. Based on the mutual-features mapping selection, this study assumes that each mapping round is repeated and that each round has an outcome (i.e., optimization). Therefore, the feedback of each collaboration should consider both the feedback itself and the probability of using the proposed mutual algorithm.

To formalize the mutual algorithm, this study represents each older person as a four-element vector, *On* (*Ri*, *Pi*, *Mi*, *Di*), where *n* is the total number of older people in the mutual algorithm; *i* = 1, 2, 3; and the terms *O*, *R*, *P*, *M*, *D* represent older people, risk level, physiology, medical record, and demography, respectively. Distances are normally used to measure the similarity or dissimilarity between two older people, so *S* (*Os*, *Ot*) denotes the similarity between two older people, *s* = (*s*1, *s*2, ... , *si*) and *t* = (*t*1, *t*2, ... , *ti*). The similarity function of the mapping process for each person is defined in Equation (1):

$$S(O\_{s\prime}O\_t) = \sqrt[q]{(O\_{s1} - O\_{t1})^q + (O\_{s2} - O\_{t2})^q + \dots + (O\_{si} - O\_{ti})^q} \tag{1}$$

Based on Equation (1), this study then partitions the older people into *K* nonempty subsets by minimum similarity distance within each group, as in Equation (2):

$$K_{S(O_s, O_t)} = \min\left(\sum_{i=1}^{n} S_i\right) \tag{2}$$

Based on Equation (2), the mutual algorithm then computes seed older people as the centroids of the current clusters, *M*1, *M*2, ... , *Mk*, and subtracts *Mk* from *S* (*Os*, *Ot*) to obtain a new *S* (*Os*, *Ot*) in Equation (3) and new *K* groups in Equation (4):

$$S_{new}(O_s, O_t) = S(O_s, O_t) - M_k\left(K_{S(O_s, O_t)}\right) \tag{3}$$

$$K_{S_{new}(O_s, O_t)} = \min\left(\sum_{i=1}^{n} S_{new\,i}\right) \tag{4}$$

To test MMC, the implementation process covers initializing parameters, setting up groups, computing similarity, and building the final mutual arrangement. The basic idea is that mapping mutual clustering (MMC) is implemented as the mutual algorithm (Algorithm 1), which identifies features of older people, such as risk level, and groups them based on these features. The mutual algorithm is a finite sequence of well-defined, computer-implementable instructions for performing this computation. Its detailed description follows below, and its feasibility and correctness can be evaluated by building the measurable architecture described in the next subsection and implementing it through the evaluation plan. The mutual algorithm involves several steps: (1) initializing parameters such as the risk level and medical record of older people; (2) assigning an original group at first glance; (3) resetting groups based on the later features, using clustering to adjust dynamically to varying human characteristics; and (4) building new cooperation relationships and making arrangements based on the new matching from step three.

#### **Algorithm 1. Implementation of the Mapping Mutual Clustering (MMC) Algorithm**

**1. Require: Initialization of parameters:**
getRiskLevel, getPhysiology, getMedicalRecord, getDemography, getOlderPeopleID

**2. Set up groups:**
groupA (getOlderPeopleID) ← groupA (getRiskLevel, getPhysiology, getMedicalRecord, getDemography)
groupB (getOlderPeopleID) ← groupB (getRiskLevel, getPhysiology, getMedicalRecord, getDemography)
groupC (getOlderPeopleID) ← groupC (getRiskLevel, getPhysiology, getMedicalRecord, getDemography)

**3. Compute similarity:**
for groupA (): groupA (getOlderPeopleID) ← retrieve (minimum distance); end for
for groupB (): groupB (getOlderPeopleID) ← retrieve (minimum distance); end for
for groupC (): groupC (getOlderPeopleID) ← retrieve (minimum distance); end for

**4. Build collaboration:**
if groupA (getOlderPeopleID) > groupB (getOlderPeopleID) > groupC (getOlderPeopleID) then groupA (getOlderPeopleID) ← assign (groupC (getOlderPeopleID))
else if groupA (getOlderPeopleID) > groupC (getOlderPeopleID) > groupB (getOlderPeopleID) then groupA (getOlderPeopleID) ← assign (groupB (getOlderPeopleID))
else if groupB (getOlderPeopleID) > groupA (getOlderPeopleID) > groupC (getOlderPeopleID) then groupB (getOlderPeopleID) ← assign (groupC (getOlderPeopleID))
else if groupB (getOlderPeopleID) > groupC (getOlderPeopleID) > groupA (getOlderPeopleID) then groupB (getOlderPeopleID) ← assign (groupA (getOlderPeopleID))
else if groupC (getOlderPeopleID) > groupA (getOlderPeopleID) > groupB (getOlderPeopleID) then groupC (getOlderPeopleID) ← assign (groupB (getOlderPeopleID))
else groupC (getOlderPeopleID) ← assign (groupA (getOlderPeopleID))
end if
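As a concrete illustration of the grouping steps above, the following Python sketch (our own, not the paper's implementation) computes the Minkowski similarity of Equation (1) and iteratively regroups people around cluster centroids; the feature values and the choice of *q* = 2 are illustrative assumptions:

```python
import numpy as np

def similarity(s, t, q=2):
    # Equation (1): Minkowski distance between two older people's feature vectors
    return np.sum(np.abs(s - t) ** q) ** (1.0 / q)

def mmc(people, k=3, iters=10, q=2):
    centroids = people[:k].copy()                 # initial seed older people
    for _ in range(iters):
        # assign every person to the nearest centroid (minimum similarity distance)
        groups = np.array([np.argmin([similarity(p, c, q) for c in centroids])
                           for p in people])
        # recompute seed older people as centroids of the current clusters
        for g in range(k):
            if np.any(groups == g):
                centroids[g] = people[groups == g].mean(axis=0)
    return groups

# columns: risk level, physiology, medical record, demography (levels 1-3)
people = np.array([[1, 2, 1, 3], [3, 3, 2, 1], [1, 1, 1, 2], [2, 3, 3, 1]], dtype=float)
print(mmc(people))
```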

According to the proposed mapping mutual clustering method and its implementation as the mutual algorithm above, a framework called the decentralized self-service framework (Figure 1) is needed as a long-term, comprehensive guideline for further system applications. The basic idea of the decentralized self-service framework is that the features are clearly identified by four factors (risk level, physiology, medical records, and demography), the steps are laid out from grouping to suitability, and the service is organized as a kind of self-service that is open to social networks. Based on extant approaches and their limitations, the proposed mutual algorithm not only integrates data from diverse features but also applies a classification algorithm from Artificial Intelligence to dynamically increase accuracy. Therefore, adding a self-service mechanism is an important step before building the proposed decentralized architecture of long-term care. The self-service framework groups older people by similar backgrounds, finds relations of individual interaction, learns by dynamic adjustment, and runs suitably and successfully through mutual understanding.

The proposed architecture (LCDA) applies Blockchain technology to provide trusted and auditable computing over a decentralized network of all older people, accompanied by a public collaborative ledger (Figure 2). LCDA addresses the cost and time consumption of data acquisition and the inappropriate distribution of care relations among older people. It also resolves the existing issues of care agencies, which currently operate under third-party authorization, and saves transmission and integration costs through quick, direct data exchanges between older people. Through LCDA, the proposed mutual algorithm ensures the non-destructibility of data, so that the MMC results of older people are recorded in a more secure way. The LCDA algorithm (Algorithm 2) comprises four steps: initializing parameters from older people; setting up hash, encryption, and signature functions to keep older people's data secure; computing proof-of-work so that the system operates reliably; and adding blocks to the ledgers so that the system adjusts dynamically.
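To make these four LCDA steps concrete, the following minimal Python sketch builds a toy ledger with hashing, proof-of-work, and block appending. It is a simplified illustration under assumed names (`Block`, `mine`, `difficulty`), it omits the encryption and signature functions, and it does not reproduce the paper's Algorithm 2.

```python
import hashlib
import json
import time

class Block:
    """A toy ledger block holding one older person's MMC record."""
    def __init__(self, index, record, prev_hash):
        self.index = index
        self.record = record          # e.g., an MMC group assignment
        self.prev_hash = prev_hash
        self.timestamp = time.time()
        self.nonce = 0

    def hash(self):
        payload = json.dumps({
            "index": self.index, "record": self.record,
            "prev_hash": self.prev_hash, "timestamp": self.timestamp,
            "nonce": self.nonce,
        }, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

def mine(block, difficulty=3):
    """Proof-of-work: find a nonce whose hash starts with `difficulty` zeros."""
    while not block.hash().startswith("0" * difficulty):
        block.nonce += 1
    return block

# Append blocks to the ledger; in a decentralized deployment every
# peer would hold a synchronized copy of this chain.
ledger = [mine(Block(0, "genesis", "0"))]
ledger.append(mine(Block(1, {"older_person": "P01", "group": "A"},
                         ledger[-1].hash())))
print(ledger[-1].hash())
```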

**Figure 1.** The Decentralized Self-Service Framework.

In an existing centralized system, the local server must play the role of making fair and accurate reports available to the public in order to make transparency effective [23,24]. In the proposed decentralized architecture of long-term care, older people's information and availability are key inputs to the mutual algorithm [25]. The long-term care decentralized architecture (LCDA) not only reduces cost compared with centralized systems but also eliminates the chance of information loss due to a single point of failure, since ledger copies are synchronized across all older people.

#### **Algorithm 2. Implementation of the Long-term Care Decentralized Architecture (LCDA) Algorithm**


**Figure 2.** Long-term Care Decentralized Architecture (LCDA).

The mapping (transaction) procedure operates on a distributed ledger in which transactions can be viewed at any time, making LCDA records immutable and irreversible.

#### **4. Research Model**

This study proposes the mapping mutual clustering (MMC) algorithm and the long-term care decentralized architecture (LCDA) algorithm as an innovative way to solve the problems described in the research questions. User perceptions can include user satisfaction, ease of use, usefulness, and user intention, all of which are widely used for evaluating systems [26,27]. To evaluate the proposed system with respect to RQ1, the first hypothesis posits that considering medical records, risk level, physiology, and demography improves users' perceived usefulness, ease of use, satisfaction, and intention to use the long-term care system.

The research model (Figure 3) mainly relies on support vector machines (SVMs), a widely used algorithm [28] for risk minimization [29,30]. Other algorithms are also considered: random forest, for example, classifies by passing the input vector down each classification tree in the forest [30].
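For concreteness, the sketch below illustrates the kind of classifier comparison the research model implies, using scikit-learn. The feature matrix `X` and suitability labels `y` are random placeholders for the questionnaire-derived dataset, so the printed accuracies will not match the reported results.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Placeholder data: 66 pairwise observations with 8 features
# (RL1, Phy1, MR1, Dem1, RL2, Phy2, MR2, Dem2) and a binary suitability label.
rng = np.random.default_rng(0)
X = rng.integers(1, 4, size=(66, 8))
y = rng.integers(0, 2, size=66)

models = {
    "SVM": SVC(),
    "Logistic": LogisticRegression(max_iter=1000),
    "MLP": MLPClassifier(max_iter=2000),
    "Random forest": RandomForestClassifier(),
}
for name, model in models.items():
    # 5-fold cross-validated accuracy for each candidate classifier
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.2%}")
```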

**Figure 3.** Research Model.

This research also uses LCDA to address human variability. Following the second research question (RQ2), the second hypothesis examines whether the accuracy of LCDA is higher than that of the mutual algorithm without the proposed method. The MMC and LCDA algorithms are described in detail in the previous two sections, so their feasibility and correctness can be evaluated by building a measurable research model and implementing it through the evaluation plan.

#### **5. Experiment Design**

Twelve participants were randomly recruited from various locations (e.g., a nursing home, a hospital, a park) to fill out a questionnaire collecting the feature data of LCDA (https://drive.google.com/file/d/1rvx-T9krnsZ-ErRgl-wESieP2qFnBfos/view?usp=sharing, accessed on 1 January 2022). To help participants understand LCDA, a prototype system (Figure 4) was developed as mobile application software to assist them in completing the questionnaire.

All participants were over 65 years old and were informed that any potentially identifying information collected in this study would remain confidential and be disclosed only with the participant's permission. Each factor has three levels. For the risk level factor, older people who consider that they rarely require assistance from others are Level 1; those who estimate a roughly fifty-fifty chance of needing care from others are Level 2; the remainder are Level 3. Older people can refer to their own Barthel index, as assessed by the government, to complete this part. The other features (physiology, medical record, and demography) are likewise classified into three levels, as shown in Table 1. The survey questionnaire asked whether the proposed system can improve users' (a) perceived usefulness, (b) ease of use, (c) satisfaction, and (d) intention to use for long-term care. A five-point Likert scale [31] was used, with 1 indicating "strongly disagree," 3 indicating "neutral," and 5 indicating "strongly agree." The three-level coding of the risk factor is sketched below.
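The following is a minimal Python sketch of the three-level risk coding described above; the cut points and the function name `risk_level` are assumptions for illustration and do not reproduce the official Barthel-index thresholds.

```python
def risk_level(chance_needs_care: float) -> int:
    """Map a self-estimated probability of needing care (0-1) to Levels 1-3.

    Level 1: rarely needs assistance; Level 2: roughly fifty-fifty;
    Level 3: usually needs assistance. Cut points are illustrative.
    """
    if chance_needs_care < 0.35:
        return 1
    if chance_needs_care <= 0.65:
        return 2
    return 3

assert risk_level(0.1) == 1
assert risk_level(0.5) == 2
assert risk_level(0.9) == 3
```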

**Figure 4.** The Prototype System of Long-term Care.

The system architecture (Figure 5) uses Android Studio software for designing Java programming and mobile application services (APPs).

**Figure 5.** The System Architecture of the Prototype System.

**Table 1.** Levels of Features of LCDA.


#### **6. Results**

For the first hypothesis, Cronbach's alpha for user perception was 0.88, indicating reliable internal consistency. A one-sample *t*-test was used to evaluate whether the average user perception equals 3 (neutral), the midpoint of the Likert scale [32]. The mean differences for perceived usefulness (1.33), perceived ease of use (1.29), user satisfaction (0.96), and user intention (1.17) are all significant (*p* < 0.01). Therefore, the first hypothesis is supported. The data of all participants were collected from the questionnaire based on the proposed features, as shown in Table 2.
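A one-sample *t*-test against the neutral midpoint of 3 can be reproduced with SciPy as sketched below; the `ratings` array is a placeholder for the actual questionnaire responses, so the statistics will differ from those reported.

```python
import numpy as np
from scipy import stats

# Placeholder Likert ratings (1-5) for one perception dimension,
# e.g., perceived usefulness across the twelve participants.
ratings = np.array([5, 4, 4, 5, 4, 5, 4, 4, 5, 4, 4, 4])

# Test whether the mean rating differs from the neutral midpoint of 3.
t_stat, p_value = stats.ttest_1samp(ratings, popmean=3)
mean_diff = ratings.mean() - 3
print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.2f}, p = {p_value:.4f}")
```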

**Table 2.** Dataset of Participant Features.


Following MMC steps 1–3 (initial partition of older people, choosing seeds as temporary center members, and assigning older people to new groups based on similarity computations), the resulting datasets are presented in Tables 3–5, respectively.


**Table 3.** Similarity of Initial Partition for Older People.

**Table 4.** Temporary Groups Based on the Similarity.


**Table 5.** Center Means of Temporary Groups.


Based on the center means of the temporary groups, this study then computes the minimum similarity distance of older people for the temporary and final groups, as shown in Tables 6 and 7, respectively.


**Table 6.** Similarity of Temporary Groups for Older People.

**Table 7.** Final Groups Based on MMC.


The dataset of features of older people includes RL1, Phy1, MR1, Dem1, RL2, Phy2, MR2, and Dem2, representing the risk level, physiology, medical record, and demography of each member of a pair (Table 8). The LCDA dataset thus contains a total of 66 observations, as calculated in Equation (5):

$$\sum_{i=1,\ j=1,\ i\neq j}^{12} \text{Combination}(i, j) = \mathbb{C}_2^{12} = \frac{12!}{2! \times 10!} = 66 \tag{5}$$
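Equation (5) simply counts the unordered pairs among the twelve participants; a one-line Python check:

```python
import math

# C(12, 2) = 12! / (2! * 10!) = 66 unordered participant pairs
assert math.comb(12, 2) == 66
```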


**Table 8.** LCDA Dataset of Older People.

The accuracies are 86.36%, 96.97%, 98.48%, and 98.48% using SVMs, logistic regression, MLP, and random forest, respectively. MLP and random forest achieve the highest accuracy (Figure 6).

**Figure 6.** Accuracy of Classification Algorithms.

For comparison, collaboration was built randomly 66 times; the highest resulting accuracy was 56.06%, obtained with SVMs. The mean difference of −30.3 is significant (*p* < 0.01). Therefore, the second hypothesis is also supported.

#### **7. Discussion**

Based on the literature review, current long-term care designs rely on a centralized architecture, in which agencies assign caretakers to older people according to their manpower policies without taking the characteristics of older people into consideration. This study proposed the MMC and LCDA methods to overcome important issues in the existing architecture that can lead to manpower shortages and inappropriate cooperation between caretakers and older people. After systematic evaluation in the experiment, both hypotheses are supported, confirming that these issues can be addressed by the proposed mapping mutual clustering method and long-term care decentralized architecture. Older people sense suitable caretakers around them, cognitively group caretakers by the characteristics generated by MMC, form long-term cooperative relationships through LCDA, and generate records that allow groups to be adjusted dynamically.

With the proposed methods, older people can engage in a cooperative process in which realistic circumstances are acted out in the long-term care system, in order to better understand the performance of caretakers regardless of where they come from. Because individual caretakers may exhibit different attitudes or performance even in the same situation, attempting to statically define the qualifications of caretakers and the associated system reactions in a way desired by all older people is impossible and can create risks and difficulties for caretakers. A long-term care system should therefore provide dynamic interaction for all participants so that they can map mutual rules at any time. The decentralized architecture emphasizes understanding older people within their mutual-mapping process and incorporates characteristics such as risk level, for example when going out for a walk in a hazardous environment.

#### **8. Conclusions**

This study makes several significant contributions. First, existing mutual algorithms for long-term care mainly focus on the benefits of centralized care agencies when matching healthcare workers and older people; such approaches cannot solve the issue of manpower shortages. To address the first research question, this study proposes a novel method, the mapping mutual clustering algorithm, which considers all relevant features across all older people. Second, the proposed long-term care decentralized architecture algorithm applies Artificial Intelligence and Blockchain to address dynamic adjustment, the coordination of human variability, and privacy protection, answering the second research question. Third, this study applies an empirical evaluation process to long-term care.

This study has a few limitations that suggest directions for future work. First, the proposed decentralized architecture of long-term care was evaluated through a features-based questionnaire, which cannot immediately reflect the operational mechanisms of long-term care in practice; future studies should therefore run experiments with larger samples based on the proposed methods. Enlarging the dataset by deploying the decentralized system would also lift the current limit on classification accuracy. Second, for privacy reasons, the features categorized into risk levels 1, 2, and 3 do not correspond to the Barthel index as measured in official investigation reports. To offer practical insights for long-term care systems, the sub-item categories should be clearly identified while still addressing the issue of information disclosure.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** This article does not contain any studies involving human risk performed by any of the authors.

**Informed Consent Statement:** The informed consent statement is excluded because this article does not report any studies involving human risk performed by any of the authors.

**Data Availability Statement:** The data that support the findings of this study are available at https://drive.google.com/file/d/1rvx-T9krnsZ-ErRgl-wESieP2qFnBfos/view?usp=sharing (accessed on 1 January 2022).

**Acknowledgments:** I would like to thank my research assistant Yu-Tzu Lu for her help in collecting the long-term care data.

**Conflicts of Interest:** Author Hsien-Ming Chou declares that he has no conflict of interest.

#### **References**

