1. Introduction
With the release of Midjourney and the open-sourcing of the Stable Diffusion model [1] in 2022, “AI Art” has become a hot topic on major social media platforms worldwide. AI Art refers to the process of generating and creating images using artificial intelligence algorithms and deep neural network models. Neural network models like the Convolutional Neural Network (CNN), Variational Autoencoder (VAE), and Generative Adversarial Network (GAN) enable generative models to learn abstract features such as painting techniques and styles from existing images, thus facilitating the generation and creation of new images. In its early days, creating images with AI models required a profound understanding of neural networks for network structure design and parameter tuning, and this demand for domain-specific knowledge made popularization challenging. With the advancement of computer vision and natural language processing, models have become capable of acquiring image features at low cost, while neural network structures like the Transformer [2] have improved the understanding of intricate semantics. This has led to the rapid development of text-to-image and image-to-image algorithms, which in turn has led to the widespread public adoption of AI-generated art. Today, users can utilize the AI painting functions offered by major platforms to generate and create images using prompt keywords or text.
Nevertheless, the advancement of AI Art technology has been met with controversy. The technical intricacies of generative models are not readily comprehensible to the general public, and the limited interpretability of their training process has also raised public skepticism. The use of training data may also raise legal concerns such as copyright, leading to extensive debates on social media platforms such as X and Weibo. Because AI Art is an emerging technology, its healthy development depends on the widespread dissemination and comprehension of its technical details among the general populace. It also requires clarification of the diverse areas of interest associated with the technology, so as to effectively address the doubts and concerns of the public. Analyzing public opinion on AI Art on social media platforms can provide useful insights into the attitudes expressed by different online communities and shed light on the specific technological aspects that concern the public. Such analysis therefore has significant practical value, as it can support relevant authorities in formulating pertinent laws and regulations.
For public opinion analysis of trending topics on social media platforms, sentiment analysis can be used to investigate the sentiments of the general public, aiding the comprehension of users’ opinions and thereby providing useful insights for further work. Early studies applied lexicon-based methods, which classify text sentiment using rules. On the topic of dementia, Kong et al. [3] collected Weibo posts from 2011 to 2021 and utilized the emotion ontology dictionary provided by DLUT to explore users’ discourse and sentiment towards dementia, offering a foundation for decision-making to authorities considering the growing dementia population in China. Shi et al. [4] studied the Weibo posts created during the outbreak of COVID-19 in early 2020, and combined LDA with lexicon-based sentiment analysis to discuss the evolution of the public’s concerns, providing useful insights for public health institutions regarding public opinion. To study public opinion related to COVID-19 vaccination, Kwok et al. [5] utilized LDA and lexicon-based sentiment analysis to provide insights into the public’s attitudes from detailed topics and sentiments separately.
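As a rough illustration of the lexicon-based approach described above, the following minimal sketch scores a text by summing word polarities and flipping the sign after a negator. The toy English lexicon, the negator list, and the function name `lexicon_sentiment` are assumptions made for illustration; the cited studies rely on curated resources such as the DLUT emotion ontology and on proper Chinese word segmentation.

```python
# Hypothetical toy lexicon: word -> polarity. Real studies use curated dictionaries.
LEXICON = {"good": 1, "great": 2, "happy": 1, "bad": -1, "terrible": -2, "sad": -1}
# Hypothetical negator list; a negator flips the next sentiment word.
NEGATORS = {"not", "never", "no"}


def lexicon_sentiment(text):
    """Classify a whitespace-tokenized text as positive, negative, or neutral."""
    score = 0
    negate = False
    for token in text.lower().split():
        if token in NEGATORS:
            negate = True
            continue
        polarity = LEXICON.get(token, 0)
        if polarity != 0:
            # Apply a pending negation to this sentiment word, then reset it.
            score += -polarity if negate else polarity
            negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_sentiment("this picture is not good"))
```

Rules of this kind are transparent but cannot handle words missing from the lexicon, which is one motivation for the learned classifiers discussed next.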
Unlike lexicon-based methods, machine learning-based methods involve extracting text features and designing classifiers. To examine the antivaccine topic on X, Taeb et al. [6] utilized the Naïve Bayes algorithm for sentiment analysis, followed by topic modeling using LDA. Their findings provide policymakers with feedback on people’s attitudes and opinions on the COVID-19 vaccine. Adamu et al. [7] analyzed tweets discussing the distribution of palliatives by the Nigerian government during COVID-19. They used an SVM classifier to categorize tweets into fine-grained sentiments and a word cloud to provide a comprehensive view of people’s attitudes, giving the authorities public opinion references for the revision and promotion of related policies. To explore the public’s perception of the utilization of ICT in government operations, Risal et al. [8] utilized Naïve Bayes to conduct sentiment analysis of related tweets, followed by topic modeling using LDA. Their results provide useful insights into the usage and development of e-government.
With the development of deep learning technologies, deep neural networks and pre-trained models have been utilized to enhance the performance of sentiment analysis models. He et al. [9] collected college students’ opinions from multiple Chinese social platforms and constructed a CNN-based sentiment classification network to conduct sentiment analysis on trending topics discussed by students. They summarized trending discussions on different platforms manually, aiming to help university administrators make informed decisions related to these trending topics. El Barachi et al. [10] studied tweets on the topic of climate change and constructed a framework for analyzing public opinion that combines sentiment analysis and metadata analysis, thereby providing a correlation analysis between the two. Lyu et al. [11] utilized Weibo posts from the COVID-19 outbreak in early 2020. They conducted sentiment analysis using the ERNIE pre-trained model and developed a visualization system, offering public health departments a tool for tracking the evolution of public opinion. Focusing on the topic of ChatGPT, Su and Kabala [12] conducted sentiment analysis using multiple pre-trained models, followed by a topic classifier to provide a comprehensive view of the public’s sentiments towards ChatGPT.
With the emergence of AI Art technology, there is a growing interest in analyzing the impact of the technology, as well as the public’s attitudes towards it. In ref. [13], the authors analyze the impact of AI Art on artists. They collected comments from various artists on AI Art technology and argued that AI Art needs to be regulated for its development. In ref. [14], the authors collected literature related to generative AI and discussed the potential ethical implications of AI Art. In ref. [15], the authors studied tweets related to generative AI models. By utilizing topic modeling and sentiment analysis, they compared occupations against the topic analysis, offering insights into the varying attitudes of different occupations. In ref. [16], the authors surveyed the general public and performed an empirical analysis to examine the public’s attitude towards AI Art, arguing that it is a double-edged sword.
The related work shows that sentiment analysis is frequently combined with topic analysis; however, the results of the two analyses are typically reported independently. Aiming to enhance the results of sentiment analysis and provide a more comprehensive view of public opinion, this paper integrates text sentiment analysis with text clustering analysis to develop a framework for analyzing public opinion on the Weibo topic related to AI Art. The contributions of this work are outlined as follows:
A public opinion analysis was conducted on the Weibo platform with the aim of acquiring Weibo users’ opinions about AI Art. As stated above, the current research on public opinions about AI Art primarily focuses on English-based social media platforms like X and Reddit, with a noticeable absence of analysis on Chinese social media platforms. Therefore, this paper constructed a public opinion analysis framework based on text sentiment analysis and text clustering analysis, with the aim of facilitating the detailed examination of Weibo users’ opinions regarding AI Art and its application in various fields.
A sentiment analysis model, LERT_BiSRU++, was developed. By incorporating linguistic features of Part of Speech Tagging, Named Entity Recognition, and Dependency Parsing in the pre-training phase, LERT achieves a better representation of Weibo texts compared to traditional word embedding models such as Word2vec. To capture the long-distance dependencies in textual data, the attention-embedded bidirectional Simple Recurrent Unit, BiSRU++, was employed, thereby improving the classification accuracy and time efficiency of the model.
A text clustering analysis was conducted to enhance the sentiment analysis procedure. To investigate the nuanced opinions revealed in the sentiment analysis results, this study utilizes a text clustering analysis that leverages the LERT model and autoencoders for text representation and dimension reduction, and subsequently applies K-means clustering algorithms to generate clusters. For the detailed opinion extraction, the C-TF-IDF algorithm was employed to extract keywords from each cluster, thus facilitating a fine-grained representation of opinions with both positive and negative sentiments. By integrating sentiment analysis with clustering analysis, the interpretability of results in public opinion analysis has been improved, offering valuable insights for the advancement of AI Art technologies.
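The keyword-extraction step in the last contribution can be sketched as follows. This is a minimal illustration that assumes the class-based TF-IDF formulation popularized by BERTopic, in which each cluster is treated as one concatenated document and a term is weighted by its in-cluster frequency times log(1 + A / f_t), where A is the average number of words per cluster and f_t is the term's total frequency. The function names `c_tf_idf` and `top_keywords` and the toy clusters are hypothetical, and the exact weighting used in this paper may differ.

```python
import math
from collections import Counter


def c_tf_idf(clusters):
    """clusters: dict mapping a cluster id to a list of tokenized documents."""
    # Merge all documents of a cluster into a single bag of words.
    class_counts = {
        cid: Counter(tok for doc in docs for tok in doc)
        for cid, docs in clusters.items()
    }
    # Frequency of each term across all clusters.
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)
    # Average number of words per cluster (A in the formula).
    avg_words = sum(total_counts.values()) / len(class_counts)

    scores = {}
    for cid, counts in class_counts.items():
        n_words = sum(counts.values())
        scores[cid] = {
            term: (freq / n_words) * math.log(1 + avg_words / total_counts[term])
            for term, freq in counts.items()
        }
    return scores


def top_keywords(scores, k=3):
    """Return the k highest-scoring terms per cluster (ties broken alphabetically)."""
    return {
        cid: [t for t, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for cid, s in scores.items()
    }


# Hypothetical toy clusters, already tokenized.
toy_clusters = {
    "pos": [["fun", "image", "model"], ["fun", "creative", "image"]],
    "neg": [["copyright", "artist", "image"], ["copyright", "replace", "artist"]],
}
print(top_keywords(c_tf_idf(toy_clusters), k=2))
```

Terms shared by several clusters, such as "image" in the toy data, are down-weighted, so the extracted keywords emphasize what distinguishes one opinion cluster from the others.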
The rest of this article is organized as follows. Section 2 illustrates the overall architecture of the proposed public opinion analysis method, together with the datasets’ description and experimental design. Section 3 provides the results of experiments, followed by detailed discussions of experimental results and public opinion analysis. Finally, a brief conclusion is drawn in Section 4.
3. Results and Discussion
3.1. Sentiment Analysis Model Performance Evaluation
First, the text sentiment analysis model proposed in this paper and the baseline models are evaluated quantitatively, using the evaluation metrics described in Section 2.3.3. All the models were trained for 50 epochs, and the best performances were recorded for a benchmark comparison. The results are shown in Table 6 and Table 7.
As the results indicate, the Word2vec_BiLSTM model performs worse than the other models across the evaluation metrics. As a static text representation model, Word2vec is constrained by its shallow network architecture and limited contextual window, which hinders its ability to address challenges like polysemy and non-standard expressions in Weibo posts and thereby limits how well it extracts deep semantic features from them.
Benefiting from the MLM and bidirectional Transformer architecture, the BERT-WWM model demonstrated better efficacy in text representation than Word2vec. Moreover, the LERT pre-trained model, augmented with linguistic features and multi-task learning, achieved better performance than the BERT-WWM. The integration of linguistic features enhances the model’s capacity to comprehend and depict intricate semantics, thereby improving the performance of sentiment analysis when handling complex corpora.
Compared with the models that classify directly from the pooled text representation of the pre-trained model, adding LSTM, SRU, or SRU++ layers yielded further improvements across the evaluation metrics. When applied to the Weibo corpus, pre-trained models alone exhibited limited effectiveness in capturing the distinctive expressions and features unique to the Weibo platform. The incorporation of RNNs enhances the model’s ability to emphasize sequential characteristics in Weibo posts. By capturing long-distance dependencies, the model can effectively acquire the emotional semantic features within the text, consequently enhancing its classification accuracy.
To investigate the influence of the self-attention module embedded in the SRU++ unit, this paper visualizes how the evaluation metric value and the loss value vary during the training of LERT_BiSRU and LERT_BiSRU++ on both the weibo2018 dataset and the AI Art Weibo topic dataset, as illustrated in Figure 9 and Figure 10.
As illustrated in Figure 9 and Figure 10, the LERT_BiSRU++ model converged faster than the LERT_BiSRU model during training, displaying more consistent and smoother curves for both the evaluation metric value and the loss value. By incorporating a self-attention module in the batch multiplication process, SRU++ benefits from the feature selection provided by the self-attention calculation, enabling the model to identify crucial features in the word embedding provided by the LERT model. Consequently, the LERT_BiSRU++ model achieves a faster learning process and higher classification accuracy.
To investigate the acceleration impact of the SRU and SRU++ network, the average times taken for training and inference for the three models on the weibo2018 dataset were computed for visualization. The results are illustrated in Figure 11.
As illustrated in Figure 11, training and inference with LERT_BiSRU and LERT_BiSRU++ take less time than with LERT_BiLSTM. By leveraging lightweight recurrent units, the SRU and SRU++ units eliminate the reliance on the hidden state from the preceding time step in computations, opting for cell states instead, thereby enabling the parallelization of the calculation process. The incorporation of the Hadamard product also improves the computational efficiency of the model when dealing with high-dimensional inputs. In the SRU++, the employment of the self-attention module slightly increased the training and inference time, but with the projection techniques, the total time cost remained lower than that of the LSTM unit.
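The source of this speed-up can be made concrete with a minimal pure-Python sketch of a single SRU layer, following the commonly stated SRU equations and assuming equal input and hidden sizes. The function name `sru_forward` and all parameter names are illustrative rather than taken from this paper's implementation. The matrix products depend only on the inputs, so they can be computed for all time steps at once, while the recurrence itself uses only element-wise (Hadamard) operations on the previous cell state. In SRU++, the input projection is, roughly speaking, replaced by a self-attention block, which is not shown here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sru_forward(xs, W, Wf, Wr, vf, vr, bf, br):
    """Run one SRU layer over a sequence xs of d-dimensional vectors."""
    d = len(xs[0])
    # Step 1: input projections. They do not depend on any previous state,
    # so a real implementation batches them over all time steps.
    proj = [(matvec(W, x), matvec(Wf, x), matvec(Wr, x)) for x in xs]

    # Step 2: light recurrence that uses only element-wise operations
    # on the previous cell state c (no matrix product inside the loop).
    c = [0.0] * d
    hs = []
    for x, (u, uf, ur) in zip(xs, proj):
        f = [sigmoid(uf[i] + vf[i] * c[i] + bf[i]) for i in range(d)]  # forget gate
        r = [sigmoid(ur[i] + vr[i] * c[i] + br[i]) for i in range(d)]  # reset gate
        c = [f[i] * c[i] + (1.0 - f[i]) * u[i] for i in range(d)]      # new cell state
        # Highway connection mixes the cell state with the raw input.
        hs.append([r[i] * c[i] + (1.0 - r[i]) * x[i] for i in range(d)])
    return hs, c


# Tiny demo with all-zero parameters (hypothetical values chosen for easy checking).
zeros_m = [[0.0, 0.0], [0.0, 0.0]]
zeros_v = [0.0, 0.0]
demo_hs, demo_c = sru_forward(
    [[1.0, 2.0], [3.0, 4.0]], zeros_m, zeros_m, zeros_m,
    zeros_v, zeros_v, zeros_v, zeros_v,
)
print(demo_hs, demo_c)
```

In an LSTM, by contrast, each gate multiplies a weight matrix by the previous hidden state, so the matrix products themselves must be computed step by step.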
The results of the ablation experiments on the setting of the self-attention dimension are shown in Figure 12.
As illustrated in Figure 12, the setting of the self-attention dimension exhibits a strong correlation with the performance of the model. When the attention dimension ranged from 64 to 256, the model’s performance improved compared to the SRU. The optimal performance was achieved at 384, which is half of the input dimension. When the attention dimension was set to 512 and 768, not only did the model’s performance decrease, but the training time of the model also approached or exceeded that of the model utilizing LSTM. Hence, selecting a suitable attention dimension is crucial for enhancing the performance of the model.
3.2. Public Opinion Analysis Results
To explore the fine-grained opinions expressed by Weibo users under different sentiments, the proposed sentiment analysis model was employed to categorize the sentiment of 21,200 Weibo posts, generating a compilation of positive and negative sentiment posts. The results of the sentiment analysis are depicted in Figure 13.
As shown in Figure 13, 13,567 Weibo posts (about 64%) show positive attitudes towards AI Art, while the other 7,633 posts (about 36%) show negative attitudes. The advancements of AI Art technology have been applied in various fields, including aiding human creativity and facilitating public participation in artistic endeavors, benefiting the public with easy access to art creation. Meanwhile, the abuse of AI Art also raises public concerns and disapproval of the technology.
To derive detailed opinions with different sentiments, the text clustering analysis method described in Section 2.2 was utilized to generate opinion clusters and extract keyword representations for each cluster. By using the elbow method, the number of clusters was set to six for both positive and negative posts, providing a comprehensive analysis of the detailed aspects of the different opinions, as depicted in Table 8 and Table 9.
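The elbow method mentioned above can be sketched on toy data as follows. This is a minimal pure-Python illustration, not the paper's pipeline: the 2-D points stand in for the autoencoder-reduced text embeddings, the deterministic farthest-point initialization and the "largest second difference" rule for locating the elbow are assumptions made for reproducibility, and the function names are hypothetical. In practice the elbow is often chosen by inspecting the inertia curve visually.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centers(points, k):
    """Deterministic farthest-point initialization (an illustrative choice)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm; returns the centers and the within-cluster sum of squares."""
    centers = init_centers(points, k)
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            groups[idx].append(p)
        new_centers = [
            tuple(sum(dim) / len(g) for dim in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    inertia = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, inertia


def elbow_k(points, k_max):
    """Pick the k where the drop in inertia flattens most sharply."""
    inertias = [kmeans(points, k)[1] for k in range(1, k_max + 1)]
    best_k = max(
        range(2, k_max),
        key=lambda k: (inertias[k - 2] - inertias[k - 1]) - (inertias[k - 1] - inertias[k]),
    )
    return best_k, inertias


# Hypothetical toy data: three well-separated groups of four points each.
toy_points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 0), (20, 1), (21, 0), (21, 1),
]
best_k, inertias = elbow_k(toy_points, k_max=5)
print(best_k, [round(v, 2) for v in inertias])
```

On the toy data the inertia falls steeply until the number of clusters matches the three groups and then flattens, which is the pattern the elbow method looks for.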
As shown in Table 8, the clustering analysis results for positive Weibo posts indicate that the Weibo users holding a positive attitude towards AI Art technology mainly focus on the benefits and potential of the technology. Cluster 1 describes the process of creating images with the model, followed by cluster 4 depicting the details of the model. In clusters 3 and 5, the users who experienced creating their own images with the tool expressed their emotions, such as “happy”, “pleasant”, “lol”, “funny” and “unreasonable”, indicating that the popularization of AI Art has enabled the public to create their own images with their boundless imagination. For clusters 2 and 6, we see that the emergence of AI Art has caught the attention of companies like Tencent, followed by the launch of various models by many companies. As a result, the stock market is also paying great attention to this emerging technology for its unlimited potential.
In contrast to Table 8, Table 9 provides a comprehensive overview of detailed opinions that convey negative emotions. The fast pace of development of AI Art has aroused many concerns, such as whether human artists and illustrators will be replaced in the future, as described in cluster 1. Based on this, cluster 2 argues that the emotion artists infuse during their creation is essential for a picture or illustration, and that this emotion does not exist in images produced by models. In cluster 3, the copyright problem is discussed from the perspective of Weibo users supporting human artists (also known as “teachers”), which reflects the absence of relevant laws, regulations, and oversight about AI Art applications and their products. The fast pace of AI Art also raises concerns among human artists, as the advancement of technology may replace humans in various industries, which might result in their being laid off (depicted in clusters 4 and 5). In cluster 6, the users expressed excessive negative emotions, resulting from the concerns in clusters 1 to 5, which may be harmful to maintaining a harmonious community environment in Weibo.
The clustering results of positive posts indicate that the advancement of AI Art technology allows the public to engage in artistic creation using AI, thus reducing the barriers to entry for participation in art creation. Furthermore, AI Art technology has garnered investments from companies like Tencent, offering assistance in analyzing market trends and drawing increased interest towards AI Art technology. To conclude, the results are beneficial for the advancement of AI Art technology.
On the other hand, the results of clustering analysis on negative posts highlight the limitations of AI Art technology in its current stage. The rapid development of AI Art has sparked apprehensions regarding the future. Furthermore, being an emerging technology, the copyright and intellectual property concerns associated with AI-generated artwork have incited controversy, resulting in the proliferation of highly critical comments on social media platforms. Hence, the advancement of AI Art technology necessitates the dissemination of detailed information about relevant technologies to enhance the public comprehension of it. Additionally, it calls for authorities to enact legislation pertaining to copyright to mitigate the potential misuse of AI Art technology. In light of the polarization of public opinion resulting from the inappropriate use, it is imperative for the authorities not only to penalize those who misuse the technology but also to enhance the oversight of such extreme posts.
4. Conclusions
This study focuses on the analysis of public opinion on the Weibo topic of AI Art. Commencing with text sentiment analysis and text clustering analysis, a deep learning-driven framework was developed to examine users’ nuanced opinions on the topic. For sentiment analysis, the LERT_BiSRU++ sentiment analysis model was developed for the binary classification of Weibo posts related to the topic. The experimental results showed that the proposed model achieved better results in terms of both classification metrics and computational efficiency. To delve deeper into the nuanced opinions expressed by users carrying different sentiments, a text clustering analysis based on an autoencoder network, K-means, and C-TF-IDF keyword extraction was conducted to produce keyword lists representing the detailed opinions expressed by users. The results of the clustering analysis offer valuable data on public opinion that can be used by relevant authorities in the establishment of laws and regulations, along with the healthy development of AI Art.
The work presented in this paper provides a framework for further research. First, the LERT pre-trained model was employed to enhance text representation within the intricate semantic context of Weibo. In topics like AI Art, where timeliness plays a crucial role, novel internet terms and expressions are constantly emerging and are difficult for pre-trained models to capture. Collecting and integrating this external knowledge into pre-trained models has the potential to further improve the performance of sentiment analysis models. Second, the K-means clustering algorithm was utilized for clustering analysis, and the results demonstrated the presence of hierarchical relationships among various topics. These hierarchical relationships may be more effectively extracted through other clustering algorithms, such as DBSCAN and spectral clustering. Finally, the emergence of AI Art technology has ignited significant discussions worldwide. Conducting a multi-language analysis can enhance the comprehension of the public’s attitudes towards AI Art, offering a more comprehensive basis for the sustainable development of AI Art technology.