Harnessing AI and NLP Tools for Innovating Brand Name Generation and Evaluation: A Comprehensive Review
Abstract
:1. Introduction
2. Brand Naming
3. AI Models and Techniques
3.1. Brand Value Identification
3.2. Text Classification
3.2.1. Text Sentiment Analysis
3.2.2. Semantic Analysis
3.3. Text Generation
3.4. Text-to-Speech
3.5. Speech Analysis and Classification
Speech Sentiment Analysis
3.6. Critical Analysis
4. Datasets
5. Conclusions and Future Works
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Arora, S.; Kalro, A.D.; Sharma, D. A comprehensive framework of brand name classification. J. Brand Manag. 2015, 22, 79–116. [Google Scholar] [CrossRef]
- Moro Visconti, R. Domain Name Valuation: Internet Traffic Monetization and IT Portfolio Bundling. 2017. Available online: https://ssrn.com/abstract=3028534 (accessed on 14 June 2024).
- Eskiev, M. Naming as one of the most important elements of brand management. SHS Web Conf. 2021, 128, 01028. [Google Scholar]
- Sabeh, K.; Kacimi, M.; Gamper, J. OpenBrand: Open Brand Value Extraction from Product Descriptions. In Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5), Dublin, Ireland, 26 May 2022; Malmasi, S., Rokhlenko, O., Ueffing, N., Guy, I., Agichtein, E., Kallumadi, S., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2022; pp. 161–170. [Google Scholar] [CrossRef]
- Sabeh, K.; Kacimi, M.; Gamper, J. GAVI: A Category-Aware Generative Approach for Brand Value Identification. In Proceedings of the 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023), Virtual, 16–17 December 2023; pp. 110–119. [Google Scholar]
- Nurmambetov, D.; Dauylov, S.; Bogdanchikov, A. Kazakh Names Generator Using Deep Learning. Her. Kazakh-Br. Tech. Univ. 2021, 17, 171–177. [Google Scholar]
- Muhammad, P.F.; Kusumaningrum, R.; Wibowo, A. Sentiment analysis using Word2vec and long short-term memory (LSTM) for Indonesian hotel reviews. Procedia Comput. Sci. 2021, 179, 728–735. [Google Scholar] [CrossRef]
- Jiang, H.; Hu, C.; Jiang, F. Text Sentiment Analysis of Movie Reviews Based on Word2Vec-LSTM. In Proceedings of the 14th International Conference on Advanced Computational Intelligence (ICACI), Wuhan, China, 22 July 2022; pp. 129–134. [Google Scholar]
- Liu, Y. Transgender Community Sentiment Analysis from Social Media Data: A Natural Language Processing Approach. Gen. Surgery Clin. Med. 2023, 1, 127–131. [Google Scholar]
- Hassan, J.; Shoaib, U. Multi-class review rating classification using deep recurrent neural network. Neural Process. Lett. 2020, 51, 1031–1048. [Google Scholar] [CrossRef]
- Mendes, G.A.; Martins, B. Quantifying valence and arousal in text with multilingual pre-trained transformers. In Proceedings of the European Conference on Information Retrieval, Dublin, Ireland, 2–6 April 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 84–100. [Google Scholar]
- Wang, X.; Dong, X.; Chen, S. Text duplicated-checking algorithm implementation based on natural language semantic analysis. In Proceedings of the IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC), Chongqing, China, 5 June 2020; pp. 732–735. [Google Scholar]
- Maksutov, A.A.; Zamyatovskiy, V.I.; Vyunnikov, V.N.; Kutuzov, A.V. Knowledge base collecting using natural language processing algorithms. In Proceedings of the IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), St. Petersburg/Moscow, Russia, 27–30 January 2020; pp. 405–407. [Google Scholar]
- Li, W. Analysis of semantic comprehension algorithms of natural language based on robot’s questions and answers. In Proceedings of the IEEE International Conference on Advances in Electrical Engineering and Computer Applications (AEECA), Dalian, China, 25 August 2020; pp. 1021–1024. [Google Scholar]
- Aka, A.; Bhatia, S.; McCoy, J. Semantic determinants of memorability. Cognition 2023, 239, 105497. [Google Scholar] [CrossRef] [PubMed]
- Tuckute, G.; Mahowald, K.; Isola, P.; Oliva, A.; Gibson, E.; Fedorenko, E. Intrinsically memorable words have unique associations with their meanings. PsyArXiv 2022. [Google Scholar] [CrossRef]
- Sadhuram, M.V.; Soni, A. Natural language processing based new approach to design factoid question answering system. In Proceedings of the 2nd International Conference on Inventive Research in Computing Applications (ICIRCA), Virtual, 15–17 July 2020; pp. 276–281. [Google Scholar]
- Song, Z.; Zheng, X.; Liu, L.; Xu, M.; Huang, X.J. Generating responses with a specific emotion in dialog. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 3685–3695. [Google Scholar]
- Liu, Y.; Lin, Z.; Liu, F.; Dai, Q.; Wang, W. Generating paraphrase with topic as prior knowledge. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 2381–2384. [Google Scholar]
- He, X. Parallel Refinements for Lexically Constrained Text Generation with BART. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Online and Punta Cana, Dominican Republic, 7–11 November 2021; Moens, M.F., Huang, X., Specia, L., Yih, S.W.t., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2021; pp. 8653–8666. [Google Scholar] [CrossRef]
- Wang, Y.; Wood, I.; Wan, S.; Dras, M.; Johnson, M. Mention Flags (MF): Constraining Transformer-based Text Generators. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 2–4 August 2021; Zong, C., Xia, F., Li, W., Navigli, R., Eds.; Association for Computational Linguistic: Stroudsburg, PA, USA, 2021; pp. 103–113. [Google Scholar] [CrossRef]
- Latif, S.; Shahid, A.; Qadir, J. Generative emotional AI for speech emotion recognition: The case for synthetic emotional speech augmentation. Appl. Acoust. 2023, 210, 109425. [Google Scholar] [CrossRef]
- Rashid, M.; Priya; Singh, H. Text to speech conversion in Punjabi language using nourish forwarding algorithm. Int. J. Inf. Tecnol. 2022, 14, 559–568. [Google Scholar] [CrossRef]
- Xu, Y. English speech recognition and evaluation of pronunciation quality using deep learning. Mob. Inf. Syst. 2022, 2022, 1–12. [Google Scholar] [CrossRef]
- Mu, D.; Sun, W.; Xu, G.; Li, W. Japanese Pronunciation Evaluation Based on DDNN. IEEE Access 2020, 8, 218644–218657. [Google Scholar] [CrossRef]
- Gong, Y.; Chen, Z.; Chu, I.H.; Chang, P.; Glass, J. Transformer-based multi-aspect multi-granularity non-native English speaker pronunciation assessment. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 22–27 May 2022; pp. 7262–7266. [Google Scholar]
- Lu, Z.; Cao, L.; Zhang, Y.; Chiu, C.C.; Fan, J. Speech sentiment analysis via pre-trained features from end-to-end asr models. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), virtual-Barcelona, Spain, 4–8 May 2020; pp. 7149–7153. [Google Scholar]
- Novais, R.; Cardoso, P.J.S.; Rodrigues, J.M.F. Emotion classification from speech by an ensemble strategy. In Proceedings of the 10th International Conference on Software Development and Technologies for Enhancing Accessibility and Fighting Info-Exclusion, Lisboa, Portugal, 31 August–2 September 2022; pp. 85–90. [Google Scholar]
- Shon, S.; Brusco, P.; Pan, J.; Han, K.J.; Watanabe, S. Leveraging Pre-trained Language Model for Speech Sentiment Analysis. In Proceedings of the 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, Brno, Czech Republic, 30 August–3 September 2021; International Speech Communication Association: Brno, Czech Republic, 2021; pp. 566–570. [Google Scholar]
- Sabeh, K. Open-Brand: The Dataset Contains over 250 K Product Brand-Value Annotations with More Than 50 k Unique Values across Eight Main Categories of Amazon Product Profiles. 2022. Available online: https://github.com/kassemsabeh/open-brand (accessed on 13 March 2024).
- Ni, J.; Li, J.; McAuley, J. Justifying recommendations using distantly-labeled reviews and fine-grained aspects. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 188–197. [Google Scholar]
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 2020, 21, 1–67. [Google Scholar]
- Minaee, S.; Kalchbrenner, N.; Cambria, E.; Nikzad, N.; Chenaghlu, M.; Gao, J. Deep learning–based text classification: A comprehensive review. ACM Comput. Surv. (CSUR) 2021, 54, 1–40. [Google Scholar] [CrossRef]
- Caschera, M.C.; Grifoni, P.; Ferri, F. Emotion Classification from Speech and Text in Videos Using a Multimodal Approach. Multimodal Technol. Interact. 2022, 6, 28. [Google Scholar] [CrossRef]
- Bogdanchikov, A.; Ayazbayev, D.; Varlamis, I. Classification of Scientific Documents in the Kazakh Language Using Deep Neural Networks and a Fusion of Images and Text. Big Data Cogn. Comput. 2022, 6, 123. [Google Scholar] [CrossRef]
- Bird, S.; Klein, E.; Loper, E. Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2009. [Google Scholar]
- Manning, C.D.; Surdeanu, M.; Bauer, J.; Finkel, J.; Bethard, S.J.; McClosky, D. The Stanford CoreNLP Natural Language Processing Toolkit. In Proceedings of the Association for Computational Linguistics (ACL) System Demonstrations, Baltimore, Maryland, USA, 22–27 June 2014; pp. 55–60. [Google Scholar]
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space (2013). arXiv 2023, arXiv:1301.3781. [Google Scholar]
- Khyani, D.; Siddhartha, B.; Niveditha, N.; Divya, B. An interpretation of lemmatization and stemming in natural language processing. J. Univ. Shanghai Sci. Technol. 2021, 22, 350–357. [Google Scholar]
- Pramana, R.; Subroto, J.J.; Gunawan, A.A.S. Systematic Literature Review of Stemming and Lemmatization Performance for Sentence Similarity. In Proceedings of the IEEE 7th International Conference on Information Technology and Digital Applications (ICITDA), Yogyakarta, Indonesia, 4–5 November 2022; pp. 1–6. [Google Scholar]
- Maas, A.; Daly, R.E.; Pham, P.T.; Huang, D.; Ng, A.Y.; Potts, C. Learning word vectors for sentiment analysis. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, Oregon, USA, 19–24 June 2011; pp. 142–150. [Google Scholar]
- Datafiniti. Hotel Reviews. 2019. Available online: https://www.kaggle.com/datasets/datafiniti/hotel-reviews (accessed on 13 March 2024).
- Lakshmipathi, N. IMDB Dataset of 50 K Movie Reviews. 2019. Available online: https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews (accessed on 13 March 2024).
- Sanh, V.; Debut, L.; Chaumond, J.; Wolf, T. DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv 2019, arXiv:1910.01108. [Google Scholar]
- Conneau, A.; Khandelwal, K.; Goyal, N.; Chaudhary, V.; Wenzek, G.; Guzmán, F.; Grave, E.; Ott, M.; Zettlemoyer, L.; Stoyanov, V. Unsupervised Cross-lingual Representation Learning at Scale. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; Jurafsky, D., Chai, J., Schluter, N., Tetreault, J., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 8440–8451. [Google Scholar] [CrossRef]
- Citron, F.M.; Lee, M.; Michaelis, N. Affective and psycholinguistic norms for German conceptual metaphors (COMETA). Behav. Res. Methods 2020, 52, 1056–1072. [Google Scholar] [CrossRef] [PubMed]
- Busso, C.; Bulut, M.; Lee, C.C.; Kazemzadeh, A.; Mower, E.; Kim, S.; Chang, J.N.; Lee, S.; Narayanan, S.S. IEMOCAP: Interactive emotional dyadic motion capture database. Lang. Resour. Eval. 2008, 42, 335–359. [Google Scholar] [CrossRef]
- Zhang, X.; Zhao, J.; LeCun, Y. Character-level Convolutional Networks for Text Classification. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, Canada, 7–12 December 2015; Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R., Eds.; Curran Associates, Inc.: Glasgow, UK, 2015; Volume 28. [Google Scholar]
- Datafiniti. Consumer Reviews of Amazon Products. 2019. Available online: https://www.kaggle.com/datafiniti/consumer-reviews-of-amazon-products (accessed on 13 March 2024).
- Fairfield, B.; Ambrosini, E.; Mammarella, N.; Montefinese, M. Affective norms for Italian words in older adults: Age differences in ratings of valence, arousal and dominance. PLoS ONE 2017, 12, e0169472. [Google Scholar] [CrossRef]
- Soares, A.P.; Comesaña, M.; Pinheiro, A.P.; Simões, A.; Frade, C.S. The adaptation of the Affective Norms for English words (ANEW) for European Portuguese. Behav. Res. Methods 2012, 44, 256–269. [Google Scholar] [CrossRef] [PubMed]
- Schmidtke, D.S.; Schröder, T.; Jacobs, A.M.; Conrad, M. ANGST: Affective norms for German sentiment terms, derived from the affective norms for English words. Behav. Res. Methods 2014, 46, 1108–1118. [Google Scholar] [CrossRef]
- Imbir, K.K. The Affective Norms for Polish Short Texts (ANPST) database properties and impact of participants’ population and sex on affective ratings. Front. Psychol. 2017, 8, 251141. [Google Scholar] [CrossRef] [PubMed]
- Imbir, K.K. Affective norms for 4900 Polish words reload (ANPW_R): Assessments for valence, arousal, dominance, origin, significance, concreteness, imageability and, age of acquisition. Front. Psychol. 2016, 7, 174568. [Google Scholar] [CrossRef]
- Võ, M.L.; Conrad, M.; Kuchinke, L.; Urton, K.; Hofmann, M.J.; Jacobs, A.M. The Berlin affective word list reloaded (BAWL-R). Behav. Res. Methods 2009, 41, 534–538. [Google Scholar] [CrossRef]
- Aka, A.; Phan, T.D.; Kahana, M.J. Predicting recall of words and lists. J. Exp. Psychol. Learn. Mem. Cogn. 2021, 47, 765. [Google Scholar] [CrossRef]
- Brysbaert, M.; Warriner, A.B.; Kuperman, V. Concreteness ratings for 40 thousand generally known English word lemmas. Behav. Res. Methods 2014, 46, 904–911. [Google Scholar] [CrossRef]
- Ćoso, B.; Guasch, M.; Ferré, P.; Hinojosa, J.A. Affective and concreteness norms for 3022 Croatian words. Q. J. Exp. Psychol. 2019, 72, 2302–2312. [Google Scholar] [CrossRef] [PubMed]
- Xie, H.; Lin, W.; Lin, S.; Wang, J.; Yu, L.C. A multi-dimensional relation model for dimensional sentiment analysis. Inf. Sci. 2021, 579, 832–844. [Google Scholar] [CrossRef]
- Yu, L.C.; Lee, L.H.; Hao, S.; Wang, J.; He, Y.; Hu, J.; Lai, K.R.; Zhang, X. Building Chinese affective resources in valence-arousal dimensions. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; pp. 540–545. [Google Scholar]
- Lee, L.H.; Li, J.H.; Yu, L.C. Chinese EmoBank: Building valence-arousal resources for dimensional sentiment analysis. Trans. Asian Low-Resour. Lang. Inf. Process. 2022, 21, 1–18. [Google Scholar] [CrossRef]
- Lehmann, J.; Isele, R.; Jakob, M.; Jentzsch, A.; Kontokostas, D.; Mendes, P.N.; Hellmann, S.; Morsey, M.; Van Kleef, P.; Auer, S.; et al. Dbpedia—A large-scale, multilingual knowledge base extracted from wikipedia. Semant. Web 2015, 6, 167–195. [Google Scholar] [CrossRef]
- Eilola, T.M.; Havelka, J. Affective norms for 210 British English and Finnish nouns. Behav. Res. Methods 2010, 42, 134–140. [Google Scholar] [CrossRef] [PubMed]
- Buechel, S.; Hahn, U. Emobank: Studying the impact of annotation perspective and representation format on dimensional emotion analysis. arXiv 2022, arXiv:2205.01996. [Google Scholar]
- Buechel, S.; Hahn, U. Readers vs. writers vs. texts: Coping with different perspectives of text understanding in emotion annotation. In Proceedings of the 11th Linguistic Annotation Workshop, Valencia, Spain, 3 April 2017; pp. 1–12. [Google Scholar]
- Francisco, V.; Hervás, R.; Peinado, F.; Gervás, P. EmoTales: Creating a corpus of folk tales with emotional annotations. Lang. Resour. Eval. 2012, 46, 341–381. [Google Scholar] [CrossRef]
- Loza Mencía, E.; Fürnkranz, J. Efficient pairwise multilabel classification for large-scale problems in the legal domain. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Antwerp, Belgium, 15–19 September 2008; Springer: Berlin/Heidelberg, Germany, 2008; pp. 50–65. [Google Scholar]
- Preoţiuc-Pietro, D.; Schwartz, H.A.; Park, G.; Eichstaedt, J.; Kern, M.; Ungar, L.; Shulman, E. Modelling valence and arousal in facebook posts. In Proceedings of the 7th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, San Diego, CA, USA, 16 June 2016; pp. 9–15. [Google Scholar]
- Monnier, C.; Syssau, A. Affective norms for French words (FAN). Behav. Res. Methods 2014, 46, 1128–1137. [Google Scholar] [CrossRef] [PubMed]
- Gilet, A.L.; Grühn, D.; Studer, J.; Labouvie-Vief, G. Valence, arousal, and imagery ratings for 835 French attributes by young, middle-aged, and older adults: The French Emotional Evaluation List (FEEL). Eur. Rev. Appl. Psychol. 2012, 62, 173–181. [Google Scholar] [CrossRef]
- Cieri, C.; Miller, D.; Walker, K. The Fisher corpus: A resource for the next generations of speech-to-text. In Proceedings of the LREC, Lisbon, Portugal, 26–28 May 2004; Volume 4, pp. 69–71. [Google Scholar]
- Tang, D.; Qin, B.; Liu, T. Document modeling with gated recurrent neural network for sentiment classification. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; pp. 1422–1432. [Google Scholar]
- Wallace, B.C.; Kertz, L.; Charniak, E. Humans require context to infer ironic intent (so computers probably do, too). In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Baltimore, MD, USA, 23–25 June 2014; pp. 512–516. [Google Scholar]
- Kapucu, A.; Kılıç, A.; Özkılıç, Y.; Sarıbaz, B. Turkish emotional word norms for arousal, valence, and discrete emotion categories. Psychol. Rep. 2021, 124, 188–209. [Google Scholar] [CrossRef] [PubMed]
- Kanske, P.; Kotz, S.A. Leipzig affective norms for German: A reliability study. Behav. Res. Methods 2010, 42, 987–991. [Google Scholar] [CrossRef]
- Kahn, J.; Riviere, M.; Zheng, W.; Kharitonov, E.; Xu, Q.; Mazaré, P.E.; Karadayi, J.; Liptchinsky, V.; Collobert, R.; Fuegen, C. Libri-light: A benchmark for asr with limited or no supervision. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 7669–7673. [Google Scholar]
- Wang, C.; Chen, S.; Wu, Y.; Zhang, Z.; Zhou, L.; Liu, S.; Chen, Z.; Liu, Y.; Wang, H.; Li, J.; et al. Neural codec language models are zero-shot text to speech synthesizers. arXiv 2023, arXiv:2301.02111. [Google Scholar]
- Panayotov, V.; Chen, G.; Povey, D.; Khudanpur, S. LibriSpeech: An ARS corpus based on public domain audio books. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, 19–24 April 2015; pp. 5206–5210. [Google Scholar]
- Zen, H.; Dang, V.; Clark, R.; Zhang, Y.; Weiss, R.J.; Jia, Y.; Chen, Z.; Wu, Y. LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech. In Proceedings of the Interspeech, Graz, Austria, 15–19 September 2019; pp. 1526–1530. [Google Scholar] [CrossRef]
- Pinheiro, A.P.; Dias, M.; Pedrosa, J.; Soares, A.P. Minho Affective Sentences (MAS): Probing the roles of sex, mood, and empathy in affective ratings of verbal stimuli. Behav. Res. Methods 2017, 49, 698–716. [Google Scholar] [CrossRef]
- Moors, A.; De Houwer, J.; Hermans, D.; Wanmaker, S.; Van Schie, K.; Van Harmelen, A.L.; De Schryver, M.; De Winne, J.; Brysbaert, M. Norms of valence, arousal, dominance, and age of acquisition for 4300 Dutch words. Behav. Res. Methods 2013, 45, 169–177. [Google Scholar] [CrossRef] [PubMed]
- Socher, R.; Perelygin, A.; Wu, J.; Chuang, J.; Manning, C.D.; Ng, A.; Potts, C. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, Seattle, WA, USA, 18–21 October 2013; pp. 1631–1642. [Google Scholar]
- Deng, L.; Wiebe, J. MPQA 3.0: An Entity/Event-Level Sentiment Corpus. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, CO, USA, 31 May–5 June 2015; Mihalcea, R., Chai, J., Sarkar, A., Eds.; Curran Associates: New York, NY, USA, 2015; pp. 1323–1328. [Google Scholar] [CrossRef]
- Nguyen, T.; Rosenberg, M.; Song, X.; Gao, J.; Tiwary, S.; Majumder, R.; Deng, L. Ms marco: A human-generated machine reading comprehension dataset. In Proceedings of the 5th International Conference on Learning Representations (ICLR), Toulon, France, 24–26 April 2017. [Google Scholar]
- Dolan, W.; Quirk, C.; Brockett, C.; Dolan, B. Unsupervised construction of large paraphrase corpora: Exploiting massively parallel news sources. In Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), Geneva, Switzerland, 23–27 August 2004; pp. 350–356. [Google Scholar]
- Williams, A.; Nangia, N.; Bowman, S.R. A broad-coverage challenge corpus for sentence understanding through inference. arXiv 2017, arXiv:1704.05426. [Google Scholar]
- Riegel, M.; Wierzba, M.; Wypych, M.; Żurawski, Ł.; Jednoróg, K.; Grabowska, A.; Marchewka, A. Nencki affective word list (NAWL): The cultural adaptation of the Berlin affective word list–reloaded (BAWL-R) for Polish. Behav. Res. Methods 2015, 47, 1222–1236. [Google Scholar] [CrossRef] [PubMed]
- Mohammad, S. Obtaining reliable human ratings of valence, arousal, and dominance for 20,000 English words. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia, 15–20 July 2018; pp. 174–184. [Google Scholar]
- Ohsumed. XmdvTool Home Page: Downloads. 2005. Available online: http://davis.wpi.edu/xmdv/datasets/ohsumed.html (accessed on 13 March 2024).
- Chelba, C.; Mikolov, T.; Schuster, M.; Ge, Q.; Brants, T.; Koehn, P.; Robinson, T. One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling. 2014. Available online: https://arxiv.org/abs/1312.3005 (accessed on 13 March 2024).
- Citron, F.M.; Cacciari, C.; Kucharski, M.; Beck, L.; Conrad, M.; Jacobs, A.M. When emotions are expressed figuratively: Psycholinguistic and Affective Norms of 619 Idioms for German (PANIG). Behav. Res. Methods 2016, 48, 91–111. [Google Scholar] [CrossRef] [PubMed]
- Lu, Z. PubMed and Beyond: A Survey of Web Tools for Searching Biomedical Literature. 2011. Available online: https://pubmed.ncbi.nlm.nih.gov/21245076/ (accessed on 13 March 2024).
- Iyer, S.; Dandekar, N.; Csernai, K. First Quora Dataset Release: Question Pairs—Data @ Quora—Quora. 2012. Available online: https://quoradata.quora.com/First-Quora-Dataset-Release-Question-Pairs (accessed on 13 March 2024).
- Kahana, M.J.; Aggarwal, E.V.; Phan, T.D. The variability puzzle in human memory. J. Exp. Psychol. Learn. Mem. Cogn. 2018, 44, 1857. [Google Scholar] [CrossRef] [PubMed]
- Healey, M.K.; Crutchley, P.; Kahana, M.J. Individual differences in memory search and their relation to intelligence. J. Exp. Psychol. Gen. 2014, 143, 1553. [Google Scholar] [CrossRef] [PubMed]
- Lohnas, L.J.; Kahana, M.J. Parametric effects of word frequency in memory for mixed frequency lists. J. Exp. Psychol. Learn. Mem. Cogn. 2013, 39, 1943. [Google Scholar] [CrossRef] [PubMed]
- Thoma, M. The Reuters Dataset. 2017. Available online: https://martin-thoma.com/nlp-reuters (accessed on 13 March 2024).
- Zhang, J.; Zhang, Z.; Wang, Y.; Yan, Z.; Song, Q.; Huang, Y.; Li, K.; Povey, D.; Wang, Y. speechocean762: An open-source non-native english speech corpus for pronunciation assessment. arXiv 2021, arXiv:2104.01378. [Google Scholar]
- Stadthagen-Gonzalez, H.; Imbault, C.; Pérez Sánchez, M.A.; Brysbaert, M. Norms of valence and arousal for 14,031 Spanish words. Behav. Res. Methods 2017, 49, 111–123. [Google Scholar] [CrossRef]
- Marelli, M.; Bentivogli, L.; Baroni, M.; Bernardi, R.; Menini, S.; Zamparelli, R. Semeval-2014 task 1: Evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), Dublin, Ireland, 23–24 August 2014; pp. 1–8. [Google Scholar]
- Bowman, S.R.; Angeli, G.; Potts, C.; Manning, C.D. A large annotated corpus for learning natural language inference. arXiv 2015, arXiv:1508.05326. [Google Scholar]
- Sun, C.; Qiu, X.; Xu, Y.; Huang, X. How to fine-tune bert for text classification? In Proceedings of the Chinese Computational Linguistics: 18th China National Conference, CCL 2019, Kunming, China, 18–20 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 194–206. [Google Scholar]
- Pang, B.; Lee, L.; Vaithyanathan, S. Thumbs up? Sentiment Classification using Machine Learning Techniques. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2002), Philadelphia, PA, USA, 6 July 2002; Association for Computational Linguistics: Stroudsburg, PA, USA, 2002; pp. 79–86. [Google Scholar] [CrossRef]
- Brysbaert, M.; New, B. Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behav. Res. Methods 2009, 41, 977–990. [Google Scholar] [CrossRef] [PubMed]
- Chen, E.; Lu, Z.; Xu, H.; Cao, L.; Zhang, Y.; Fan, J. A large scale speech sentiment corpus. In Proceedings of the Twelfth Language Resources and Evaluation Conference, Marseille, France, 11–16 May 2020; pp. 6549–6555. [Google Scholar]
- Scott, G.G.; Keitel, A.; Becirspahic, M.; Yao, B.; Sereno, S.C. The Glasgow Norms: Ratings of 5500 words on nine scales. Behav. Res. Methods 2019, 51, 1258–1270. [Google Scholar] [CrossRef] [PubMed]
- Report, N. The 500 Largest Firms in the World Rated by Net Effect. 2021. Available online: https://netimpactreport.com/datasets/largest-500 (accessed on 13 March 2024).
- GauravArora1091. Top 100 Global Brands by Brandirectory-2022. 2022. Available online: https://www.kaggle.com/datasets/gauravarora1091/top-100-global-brands-by-brandirectory2022 (accessed on 13 March 2024).
- Li, X.; Roth, D. Learning question classifiers. In Proceedings of the COLING: The 19th International Conference on Computational Linguistics, Taipei, Taiwan, 24 August–1 September 2002. [Google Scholar]
- Casanova, E.; Junior, A.C.; Shulby, C.; Oliveira, F.S.d.; Teixeira, J.P.; Ponti, M.A.; Aluísio, S. TTS-Portuguese Corpus: A corpus for speech synthesis in Brazilian Portuguese. Lang. Resour. Eval. 2022, 56, 1043–1055. [Google Scholar] [CrossRef]
- Nelson, D.L.; McEvoy, C.L.; Schreiber, T.A. The University of South Florida free association, rhyme, and word fragment norms. Behav. Res. Methods Instruments Comput. 2004, 36, 402–407. [Google Scholar] [CrossRef] [PubMed]
- Veaux, C.; Yamagishi, J.; MacDonald, K. Superseded-CSTR Vctk Corpus: English Multi-Speaker Corpus for CSTR Voice Cloning Toolkit The Centre for Speech Technology Research (CSTR), University of Edinburgh. 2016. Available online: https://datashare.ed.ac.uk/handle/10283/3443 (accessed on 14 June 2024).
- Verheyen, S.; De Deyne, S.; Linsen, S.; Storms, G. Lexicosemantic, affective, and distributional norms for 1000 Dutch adjectives. Behav. Res. Methods 2020, 52, 1108–1121. [Google Scholar] [CrossRef]
- Warriner, A.B.; Kuperman, V.; Brysbaert, M. Norms of valence, arousal, and dominance for 13,915 English lemmas. Behav. Res. Methods 2013, 45, 1191–1207. [Google Scholar] [CrossRef]
- Kowsari, K.; Brown, D.E.; Heidarysafa, M.; Meimandi, K.J.; Gerber, M.S.; Barnes, L.E. Hdltex: Hierarchical deep learning for text classification. In Proceedings of the 16th IEEE International Conference on Machine Learning and Applications (ICMLA), Cancun, Mexico, 18–21 December 2017; pp. 364–371. [Google Scholar]
- Xu, X.; Li, J.; Chen, H. Valence and arousal ratings for 11,310 simplified Chinese words. Behav. Res. Methods 2022, 54, 26–41. [Google Scholar] [CrossRef] [PubMed]
- Yee, L.T. Valence, arousal, familiarity, concreteness, and imageability ratings for 292 two-character Chinese nouns in Cantonese speakers in Hong Kong. PLoS ONE 2017, 12, e0174569. [Google Scholar] [CrossRef] [PubMed]
- Yelp, I. Yelp Dataset. 2022. Available online: https://www.kaggle.com/yelp-dataset/yelp-dataset (accessed on 13 March 2024).
- Labs, P.D. 7+ Million Company Dataset. 2019. Available online: https://www.kaggle.com/datasets/peopledatalabssf/free-7-million-company-dataset (accessed on 13 March 2024).
- Lang, K. 20 Newsgroups. 2008. Available online: http://qwone.com/~jason/20Newsgroups/ (accessed on 13 March 2024).
- Lilleberg, J.; Zhu, Y.; Zhang, Y. Support vector machines and word2vec for text classification with semantic features. In Proceedings of the IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC), Beijing, China, 6–8 July 2015; pp. 136–140. [Google Scholar]
- Labs, P.D. Company Data to Get Intelligence on 22.9+ Million Companies. 2024. Available online: https://www.peopledatalabs.com/company-dataset (accessed on 13 March 2024).
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
- Yang, Z.; Dai, Z.; Yang, Y.; Carbonell, J.; Salakhutdinov, R.R.; Le, Q.V. Xlnet: Generalized autoregressive pretraining for language understanding. Adv. Neural Inf. Process. Syst. 2019, 32, 517. [Google Scholar]
- Cox, G.E.; Hemmer, P.; Aue, W.R.; Criss, A.H. Information and processes underlying semantic and episodic memory across tasks, items, and individuals. J. Exp. Psychol. Gen. 2018, 147, 545. [Google Scholar] [CrossRef] [PubMed]
- Mahowald, K.; Isola, P.; Fedorenko, E.; Gibson, E.; Oliva, A. Memorable Words Are Monogamous: The Role of Synonymy and Homonymy in Word Recognition Memory. Preprint at PsyArxiv. 2018. Available online: https://psyarxiv.com/p6kv9 (accessed on 14 June 2024).
- Maulud, D.H.; Zeebaree, S.R.; Jacksi, K.; Sadeeq, M.A.M.; Sharif, K.H. State of art for semantic analysis of natural language processing. Qubahan Acad. J. 2021, 1, 21–28. [Google Scholar] [CrossRef]
- Doyal, A.S.; Sender, D.; Nanda, M.; Serrano, R.A. ChatGPT and artificial intelligence in medical writing: Concerns and ethical considerations. Cureus 2023, 15, e43292. [Google Scholar] [CrossRef]
- Iqbal, T.; Qureshi, S. The survey: Text generation models in deep learning. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 2515–2528. [Google Scholar] [CrossRef]
- Yu, W.; Zhu, C.; Li, Z.; Hu, Z.; Wang, Q.; Ji, H.; Jiang, M. A survey of knowledge-enhanced text generation. ACM Comput. Surv. 2022, 54, 1–38. [Google Scholar] [CrossRef]
- Rajpurkar, P.; Jia, R.; Liang, P. Know what you don’t know: Unanswerable questions for SQuAD. arXiv 2018, arXiv:1806.03822. [Google Scholar]
- Sutskever, I.; Vinyals, O.; Le, Q.V. Sequence to Sequence Learning with Neural Networks. Advances in Neural Information Processing Systems, 2014; Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N., Weinberger, K., Eds.; Curran Associates, Inc.: Glasgow, UK, 2014; Volume 27. [Google Scholar]
- Shen, X.; Su, H.; Li, W.; Klakow, D. NEXUS Network: Connecting the Preceding and the Following in Dialogue Generation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; Riloff, E., Chiang, D., Hockenmaier, J., Tsujii, J., Eds.; Association for Computational Linguistics (ACL): New York, NY, USA, 2018; pp. 4316–4327. [Google Scholar] [CrossRef]
- Wang, Y.; Zhao, X.; Zhao, D. Overview of the NLPCC 2022 shared task: Multi-modal dialogue understanding and generation. In Proceedings of the CCF International Conference on Natural Language Processing and Chinese Computing, Beijing, China, 22–23 September 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 328–335. [Google Scholar]
- Lan, W.; Qiu, S.; He, H.; Xu, W. A continuously growing dataset of sentential paraphrases. arXiv 2017, arXiv:1708.00391. [Google Scholar]
- Mou, L.; Yan, R.; Li, G.; Zhang, L.; Jin, Z. Backward and forward language modeling for constrained sentence generation. arXiv 2015, arXiv:1512.06612. [Google Scholar]
- Hokamp, C.; Liu, Q. Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vancouver, Canada, 30 July–4 August 2017; Association for Computational Linguistics: Stroudsburg, PA, USA, 2017. [Google Scholar]
- Miao, N.; Zhou, H.; Mou, L.; Yan, R.; Li, L. CGMH: Constrained sentence generation by metropolis-hastings sampling. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 6834–6842. [Google Scholar]
- Zhang, Y.; Wang, G.; Li, C.; Gan, Z.; Brockett, C.; Dolan, W. POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; pp. 8649–8670. [Google Scholar]
- He, X.; Li, V. Show me how to revise: Improving lexically constrained sentence generation with XLNet. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, Canada, 2–9 February 2021; Volume 35, pp. 12989–12997. [Google Scholar]
- Lin, B.Y.; Zhou, W.; Shen, M.; Zhou, P.; Bhagavatula, C.; Choi, Y.; Ren, X. CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, Online, 16–20 November 2020; Cohn, T., He, Y., Liu, Y., Eds.; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA, 2020; pp. 1823–1840. [Google Scholar] [CrossRef]
- Dušek, O.; Novikova, J.; Rieser, V. Evaluating the state-of-the-art of end-to-end natural language generation: The e2e nlg challenge. Comput. Speech Lang. 2020, 59, 123–156. [Google Scholar] [CrossRef]
- Agrawal, H.; Desai, K.; Wang, Y.; Chen, X.; Jain, R.; Johnson, M.; Batra, D.; Parikh, D.; Lee, S.; Anderson, P. Nocaps: Novel object captioning at scale. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8948–8957. [Google Scholar]
- Lorusso, M.L.; Borasio, F.; Panetto, P.; Curioni, M.; Brotto, G.; Pons, G.; Carsetti, A.; Molteni, M. Validation of a Web App Enabling Children with Dyslexia to Identify Personalized Visual and Auditory Parameters Facilitating Online Text Reading. Multimodal Technol. Interact. 2024, 8, 5. [Google Scholar] [CrossRef]
- Abdulrahman, A.; Richards, D. Is Natural Necessary? Human Voice versus Synthetic Voice for Intelligent Virtual Agents. Multimodal Technol. Interact. 2022, 6, 51. [Google Scholar] [CrossRef]
- Pathak, A.; Velasco, C.; Spence, C. The sound of branding: An analysis of the initial phonemes of popular brand names. J. Brand Manag. 2020, 27, 339–354. [Google Scholar] [CrossRef]
- Vidal-Mestre, M.; Freire-Sánchez, A.; Calderón-Garrido, D.; Faure-Carvallo, A.; Gustems-Carnicer, J. Audio identity in branding and brand communication strategy: A systematic review of the literature on audio branding. Prof. Inf./Inf. Prof. 2022, 31. [Google Scholar] [CrossRef]
- Kalchbrenner, N.; Elsen, E.; Simonyan, K.; Noury, S.; Casagrande, N.; Lockhart, E.; Stimberg, F.; van den Oord, A.; Dieleman, S.; Kavukcuoglu, K. Efficient Neural Audio Synthesis. In Proceedings of the International Conference on Machine Learning 2018, Stockholm, Sweden, 10–15 July 2018. [Google Scholar]
- Roberts, L. Understanding the mel spectrogram. Medium 2024. Available online: https://medium.com/analytics-vidhya/understanding-the-mel-spectrogram-fca2afa2ce53 (accessed on 13 June 2024).
- Adigwe, A.; Tits, N.; Haddad, K.E.; Ostadabbas, S.; Dutoit, T. The Emotional Voices Database: Towards Controlling the Emotion Dimension in Voice Generation Systems. arXiv 2018, arXiv:1806.09514. [Google Scholar]
- Dupuis, K.; Pichora-Fuller, M.K. Toronto Emotional Speech Set (TESS); University of Toronto, Psychology Department: Toronto, ON, Canada, 2010. [Google Scholar]
- Livingstone, S.R.; Russo, F.A. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PLoS ONE 2018, 13, e0196391. [Google Scholar] [CrossRef] [PubMed]
- Cao, H.; Cooper, D.G.; Keutmann, M.K.; Gur, R.C.; Nenkova, A.; Verma, R. Crema-d: Crowd-sourced emotional multimodal actors dataset. IEEE Trans. Affect. Comput. 2014, 5, 377–390. [Google Scholar] [CrossRef] [PubMed]
- Jackson, P.; Haq, S. Surrey Audio-Visual Expressed Emotion (Savee) Database; University of Surrey: Guildford, UK, 2014. [Google Scholar]
- Burkhardt, F.; Paeschke, A.; Rolfes, M.; Sendlmeier, W.F.; Weiss, B. A database of German emotional speech. In Proceedings of the Interspeech, Lisbon, Portugal, 4–8 September 2005; Volume 5, pp. 1517–1520. [Google Scholar]
- Défossez, A.; Copet, J.; Synnaeve, G.; Adi, Y. High Fidelity Neural Audio Compression. Trans. Mach. Learn. Res. 2023, 36. [Google Scholar]
- Casanova, E.; Weber, J.; Shulby, C.D.; Junior, A.C.; Gölge, E.; Ponti, M.A. Yourtts: Towards zero-shot multi-speaker tts and zero-shot voice conversion for everyone. In Proceedings of the International Conference on Machine Learning, PMLR, Baltimore, MD, USA, 17–23 July 2022; pp. 2709–2720. [Google Scholar]
- Chen, S.; Wang, C.; Chen, Z.; Wu, Y.; Liu, S.; Chen, Z.; Li, J.; Kanda, N.; Yoshioka, T.; Xiao, X.; et al. Wavlm: Large-scale self-supervised pre-training for full stack speech processing. IEEE J. Sel. Top. Signal Process. 2022, 16, 1505–1518. [Google Scholar] [CrossRef]
- Hsu, W.N.; Bolte, B.; Tsai, Y.H.H.; Lakhotia, K.; Salakhutdinov, R.; Mohamed, A. Hubert: Self-supervised speech representation learning by masked prediction of hidden units. IEEE/ACM Trans. Audio Speech Lang. Process. 2021, 29, 3451–3460. [Google Scholar] [CrossRef]
- Bang, C.W.; Chun, C. Effective Zero-Shot Multi-Speaker Text-to-Speech Technique Using Information Perturbation and a Speaker Encoder. Sensors 2023, 23, 9591. [Google Scholar] [CrossRef]
- Alonso Martin, F.; Malfaz, M.; Castro-Gonzalez, A.; Castillo, J.C.; Salichs, M.A. Four-Features Evaluation of Text to Speech Systems for Three Social Robots. Electronics 2020, 9, 267. [Google Scholar] [CrossRef]
- Ning, Y.; He, S.; Wu, Z.; Xing, C.; Zhang, L.J. A Review of Deep Learning Based Speech Synthesis. Appl. Sci. 2019, 9, 4050. [Google Scholar] [CrossRef]
- Nazir, O.; Malik, A. Deep Learning End to End Speech Synthesis: A Review. In Proceedings of the 2nd International Conference on Secure Cyber Computing and Communications (ICSCCC), Delhi, India, 21–23 May 2021; pp. 66–71. [Google Scholar] [CrossRef]
- Han, W.; Jiang, T.; Li, Y.; Schuller, B.; Ruan, H. Ordinal learning for emotion recognition in customer service calls. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Virtual, Barcelona, 4–8 May 2020; pp. 6494–6498. [Google Scholar]
- Fox, C.B.; Israelsen-Augenstein, M.; Jones, S.; Gillam, S.L. An evaluation of expedited transcription methods for school-age children’s narrative language: Automatic speech recognition and real-time transcription. J. Speech Lang. Hear. Res. 2021, 64, 3533–3548. [Google Scholar] [CrossRef] [PubMed]
- Ling, S.; Liu, Y.; Salazar, J.; Kirchhoff, K. Deep contextualized acoustic representations for semi-supervised speech recognition. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), virtual, 4–8 May 2020; pp. 6429–6433. [Google Scholar]
- Kim, Y.; Levy, J.; Liu, Y. Speech sentiment and customer satisfaction estimation in socialbot conversations. arXiv 2020, arXiv:2008.12376. [Google Scholar]
- Singh, A.; Anand, R. Speech Recognition Using Supervised and Unsupervised Learning Techniques. In Proceedings of the International Conference on Computational Intelligence and Communication Networks (CICN), Jabalpur, MP, India, 12–14 December 2015; pp. 691–696. [Google Scholar] [CrossRef]
- Khonglah, B.; Madikeri, S.; Dey, S.; Bourlard, H.; Motlicek, P.; Billa, J. Incremental semi-supervised learning for multi-genre speech recognition. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), Virtual, Barcelona, 4–8 May 2020; pp. 7419–7423. [Google Scholar]
- Baevski, A.; Hsu, W.N.; Conneau, A.; Auli, M. Unsupervised speech recognition. Adv. Neural Inf. Process. Syst. 2021, 34, 27826–27839. [Google Scholar]
- Lin, G.T.; Hsu, C.J.; Liu, D.R.; Lee, H.Y.; Tsao, Y. Analyzing the robustness of unsupervised speech recognition. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 22–27 May 2022; pp. 8202–8206. [Google Scholar]
- Yue, X.; Li, H. Phonetically Motivated Self-Supervised Speech Representation Learning. In Proceedings of the Interspeech, Brno, Czechia, 30 August–3 September 2021; pp. 746–750. [Google Scholar]
- Hernandez, F.; Nguyen, V.; Ghannay, S.; Tomashenko, N.; Esteve, Y. TED-LIUM 3: Twice as much data and corpus repartition for experiments on speaker adaptation. In Proceedings of the Speech and Computer: 20th International Conference, SPECOM 2018, Proceedings 20, Leipzig, Germany, 18–22 September 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 198–208. [Google Scholar]
- Marcus, M.; Santorini, B.; Marcinkiewicz, M.A. Building a large annotated corpus of English: The Penn Treebank. Comput. Linguist. 1993, 19, 313–330. [Google Scholar]
- Tivarekar, R.P.; Khadye, R.M.; Chavande, S.R.; Talkatkar, P.S. Review of Deep Speech Recognizer using Transcriber. In Proceedings of the 6th International Conference on Advances in Science and Technology (ICAST), Mumbai, India, 8–9 December 2023; pp. 460–463. [Google Scholar]
- Alharbi, S.; Alrazgan, M.; Alrashed, A.; Alnomasi, T.; Almojel, R.; Alharbi, R.; Alharbi, S.; Alturki, S.; Alshehri, F.; Almojil, M. Automatic speech recognition: Systematic literature review. IEEE Access 2021, 9, 131858–131876. [Google Scholar] [CrossRef]
- Mohamed, A.; Lee, H.y.; Borgholt, L.; Havtorn, J.D.; Edin, J.; Igel, C.; Kirchhoff, K.; Li, S.W.; Livescu, K.; Maaløe, L.; et al. Self-supervised speech representation learning: A review. IEEE J. Sel. Top. Signal Process. 2022, 16, 1179–1210. [Google Scholar] [CrossRef]
- Kumar, T.; Mahrishi, M.; Nawaz, S. A review of speech sentiment analysis using machine learning. In Proceedings of the Trends in Electronics and Health Informatics: TEHI 2021, Kanpur, India, 16–17 December 2021; pp. 21–28. [Google Scholar]
- Maghilnan, S.; Kumar, M.R. Sentiment analysis on speaker specific speech data. In Proceedings of the International Conference on Intelligent Computing and Control (I2C2), Coimbatore, India, 23–24 June 2017; pp. 1–5. [Google Scholar]
- Cardoso, P.J.S.; Rodrigues, J.M.F.; Novais, R. Multimodal Emotion Classification Supported in the Aggregation of Pre-trained Classification Models. In Proceedings of the International Conference on Computational Science, Prague, Czechia, 3–5 July 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 433–447. [Google Scholar]
- Wankhade, M.; Rao, A.C.S.; Kulkarni, C. A survey on sentiment analysis methods, applications, and challenges. Artif. Intell. Rev. 2022, 55, 5731–5780. [Google Scholar] [CrossRef]
- Das, R.; Singh, T.D. Multimodal sentiment analysis: A survey of methods, trends, and challenges. ACM Comput. Surv. 2023, 55, 1–38. [Google Scholar] [CrossRef]
Brand value extraction | [4,5] |
Text sentiment analysis | [6,7,8,9,10,11] |
Semantic analysis | [12,13,14,15,16] |
Text generation | [6,17,18,19,20,21] |
Text-to-speech | [22,23] |
Speech classification | [24,25,26] |
Speech sentiment analysis | [27,28,29] |
Dataset | Description | Applied to |
---|---|---|
AG News [48] | The academic news search engine ComeToMyHead gathered news stories from over 2000 news sources to create the AG News dataset. Samples were brief sentences and used four-class classification. | News classification |
Amazon [49] | Labels for multi-class (five) classification as well as binary classification. The Amazon binary classification dataset has 3,600,000 reviews for training and 400,000 for testing. | Sentiment analysis |
ANEW-It [50] | Affective norms for words rated in terms of valence, arousal, and dominance (in Italian). | Valence and arousal computation (e.g., in [11]) |
ANEW-Pt [51] | Valence, arousal, and dominance ratings for 1034 words (in Portuguese). | Valence and arousal computation (e.g., in [11]) |
ANGST [52] | Total of 1003 words (in German) evaluated based on imageability, potency, dominance, and arousal. | Valence and arousal computation (e.g., in [11]) |
ANPST [53] | Affective ratings for valence, arousal, dominance, origin, subjective significance, and source dimensions for 718 short texts (in Polish). | Valence and arousal computation (e.g., in [11]) |
ANPW_R [54] | Assessments for valence, arousal, dominance, origin, significance, concreteness, imageability, and age of acquisition for 4900 words (in Polish). | Valence and arousal computation (e.g., in [11]) |
BAWL-R [55] | Includes evaluations for emotional valence, imageability, and emotional arousal (in German). | Valence and arousal computation (e.g., in [11]) |
Binary animacy collection [56] | Animacy ratings for all words in a word pool. | Assessment of word memorability (e.g., in [15]) |
COMETA [46] | A database of conceptual metaphors incorporating emotional ratings alongside linguistic properties, with stimuli including natural stories and isolated sentences, assessed for attributes such as valence, arousal, and metaphoricity (in German). | Valence and arousal computation (e.g., in [11]) |
Concreteness ratings [56,57] | Concreteness ratings on a five-point scale. | Assessment of word memorability (e.g., in [15]) |
Ćoso et al. [58] | Ratings of valence, arousal, and concreteness for 3022 words (in Croatian). | Valence and arousal computation (e.g., in [11]) |
CVAI [59] | Extended NTU irony corpus, which includes valence, arousal, and irony intensities on the sentence level and valence and arousal intensities on the context level (in Mandarin). | Valence and arousal computation (e.g., in [11]) |
CVAT [60] | Total of 2009 phrases taken from online sources rated for valence and arousal (in Mandarin). | Valence and arousal computation (e.g., in [11]) |
CVAW [60,61] | A sentimental vocabulary of 1653 terms with scores for valence and arousal. (in Mandarin). | Valence and arousal computation (e.g., in [11]) |
DBpedia [62] | Multilingual knowledge base assembled from Wikipedia’s most popular infoboxes. Each sample in the most widely used version of DBpedia has a 14-class label. | News classification |
Eilola et al. [63] | Valence, emotional charge, offensiveness, concreteness, and familiarity ratings for 210 nouns (in Finnish). | Valence and arousal computation (e.g., in [11]) |
EmonBank [64,65] | Text corpus manually annotated with emotions according to the psychological valence–arousal–dominance scheme (in English). | Valence and arousal computation (e.g., in [11]) |
EmoTales [66] | Annotated corpus oriented to the narrative domain which uses two different approaches to represent emotional states, emotional categories, and emotional dimensions (in English). | Valence and arousal computation (e.g., in [11]) |
EUR-Lex [67] | Several categories of documents are included in the EUR-Lex dataset. This dataset’s most-used version has 3956 categories and 19,314 documents based on various parts of EU law. | Topic classification |
Facebook Posts [68] | Social media posts rated by two psychologically trained annotators on two separate ordinal nine-point scales of valence and arousal. | Valence and arousal computation (e.g., in [11]) |
FAN [69] | Affective norms for 1031 words rated on emotional valence and arousal (in French). | Valence and arousal computation (e.g., in [11]) |
FEEL [70] | Valence, arousal, and imagery ratings for 835 words (in French). | Valence and arousal computation (e.g., in [11]) |
Fisher [71] | Corpus containing conversational telephone speech of more than 16,000 English conversations. | Speech sentiment analysis (e.g., [29]) |
Hotel Reviews [42] | Dataset comprising 14,895 data samples (reviews) divided into five classes. | Multi-class sentiment classification (e.g., in [10]) |
IEMOCAP [47] | Total of 151 videos of recorded dialogues annotated for the presence of nine emotions (angry, excited, fearful, sad, surprised, frustrated, happy, disappointed, neutral) as well as valence, arousal, and dominance. | Speech sentiment analysis (e.g., [27]) |
IMDB [72] | Dataset consisting of 50,000 data samples (movie reviews) divided into ten classes | Multi-class sentiment classification (e.g., in [10]) |
Irony [73] | Composed of the arXiv collection, Twitter dataset for topic classification of tweets, and annotated comments from the social news website Reddit. | Topic classification |
Kapucu [74] | Norms of valence and arousal for 2031 words (in Turkish). | Valence and arousal computation (e.g., in [11]) |
LANG [75] | Total of 1000 nouns rated for emotional valence, arousal, and concreteness (in German). | Valence and arousal computation (e.g., in [11]) |
Librilight [76] | Corpus consisting of 60,000 h of English speech with over 7000 unique speakers. | Text-to-speech (e.g., [77]) |
Librispeech [78] | LibriSpeech is a corpus of approximately 1000 h of read English speech. | Text-to-speech (e.g., [22]) |
LibriTTS [79] | Multi-speaker corpus of approximately 585 h of read English speech at a sampling rate of 24 kHz. | Text-to-speech (e.g., [77]) |
MAS [80] | Dimensional and categorical measures of emotion were used to score 192 sentences (in Portuguese). | Valence and arousal computation (e.g., in [11]) |
Moors et al. [81] | Norms of valence, arousal, dominance, and age of acquisition for 4300 words (in Dutch). | Valence and arousal computation (e.g., in [11]) |
Movie Review [82] | A collection of movie reviews created with the intention of identifying the sentiment attached to each review; 10,662 sentences are included, with an equal number of positive and negative samples. | Sentiment analysis |
MPQA [83] | Opinion corpus classified into two classes; 10,606 sentences were taken from news items pertaining to a broad range of news sources. | Sentiment analysis |
MS MARCO [84] | Set of questions sampled from user searches and passages from actual web documents. Includes generative replies. | Question answering |
MSRP [85] | Total of 1725 samples for testing and 4076 samples for training in MSRP. Every example consists of two statements either labeled as either paraphrases or lacking a binary label. | Natural Language Inference (NLI) |
Multi-NLI [86] | Total of 43,300 sentence pairings, each with appended textual entailment labels. The corpus is an expansion of SNLI spanning a larger variety of spoken and written text genres. | Natural Language Inference (NLI) |
NAWL [87] | A database with 2902 Polish words, including nouns, verbs, and adjectives along with ratings of emotional valence, arousal, and imageability (in Polish). | Valence and arousal computation (e.g., in [11]) |
NRC-VAD [88] | Valence, arousal, and dominance ratings for 20,000 English words (in English). | Valence and arousal computation (e.g., in [11]) |
Ohsumed [89] | Total of 7400 documents, each consisting of a medical abstract labeled by one or more classes chosen from 23 categories related to cardiovascular disorders. | Topic classification |
One-Billion word [90] | Dataset consisting of approximately one billion words extracted from various sources on the internet, such as news articles, blogs, and websites. | Text generation (e.g., [20]) |
Open-Brand [30] | Dataset containing over 250,000 product brand–value annotations, with more than 50,000 unique values across eight main categories of Amazon product profiles. | Brand values extracted from product descriptions (e.g., in [4,5]) |
PANIG [91] | Psycholinguistic and affective standards for 619 colloquial terms. Valence, arousal, familiarity, semantic transparency, figurativeness, and concreteness were evaluated for each phrase (in German). | Valence and arousal computation (e.g., in [11]) |
PubMed [92] | Collection of documents from medical and biological research publications. Every document has been tagged using the MeSH set classes. A sentence’s function in an abstract is indicated by labeling it with one of the following classes: background, objective, method, outcome, or conclusion. | News classification |
Quora [93] | Dataset consisting of over 400,000 lines of potential question duplicate pairs. | Text generation (e.g., [19]) |
Recall Memory [56,94] | A total of 98 participants attended 23 experimental sessions. In each session, participants looked over 24 lists, each of 24 words. Participants received 24 s to respond to basic math problems before having 75 s to memorize as many words as they could from the just-presented list. | Assessment of word memorability and estimation of word recall (e.g., in [15]) |
Recognition Memory [95,96] | A total of 171 subjects participated in up to 20 experimental sessions each. Participants examined 12–16 lists of 16 items. Participants finished a recognition memory task at the conclusion of each session by marking whether each of the words had been shown earlier. | Assessment of word memorability and estimation of word recognition (e.g., in [15]) |
Reuters news [97] | One of the most widely used datasets for text classification, gathered in 1987 from the Reuters financial newswire service. | News classification |
Speechocean762 [98] | Open-source speech corpus designed for pronunciation assessment use, consisting of 5000 English utterances from 250 non-native speakers. Five experts annotated each of the utterances at the sentence level, word level, and phoneme level. | Assessment of pronunciation (e.g., [26]) |
S.-Gonzalez et al. [99] | Norms of valence and arousal for 14,031 words (in Spanish). | Valence and arousal computation (e.g., in [11]) |
SICK [100] | 10,000 pairs of English sentences that have been labeled as neutral, entailment, or contradiction. | Natural Language Inference (NLI) |
SNLI [101] | A dataset of 570 k English sentence pairs labeled for entailment, contradiction, and neutrality, serving as a benchmark for text representation evaluation and NLP model development. | Natural Language Inference (NLI) |
Söderholm [79] | Valence and arousal ratings for 420 Finnish nouns by age and gender (in Finnish). | Valence and arousal computation (e.g., in [11]) |
Sogou News [102] | The SogouCA and SogouCS news corpora are combined to create the Sogou News dataset. The news items’ domain names in the URL define their classification labels. | News classifications |
SST [103] | There are two versions available, called SST-1 and SST-2, respectively, one with binary labels and the other with fine-grained labels (five-class). 11,855 movie reviews make up SST-1. SST-2 is divided into three training, development, and test sets, each with a size of 6920, 872, and 1821, respectively. | Sentiment analysis |
SUBTLEX-US [104] | A dataset that shows the proportion of movies that contain a given word. | Assess word memorability (e.g., in [15]) |
SWBD-sentiment [105] | This corpus contains a total of 49,500 labeled utterances covering 140 h of audio. Each sentiment label in this corpus can be one of three options: positive, negative, and neutral. | Speech sentiment analysis (e.g., [27,29]) |
The Glasgow Norms [106] | A set of normative ratings for 5553 words on 9 psycholinguistic dimensions: arousal, valence, dominance, concreteness, imageability, familiarity, age, semantic size, and gender association. | Valence and arousal computation (e.g., in [11]) |
Top 500 companies by net impact [107] | This dataset includes all companies on the Fortune Global 500 list, 2020 edition, i.e., 500 largest companies in the world by revenue. All companies have been ranked based on their net impact. | |
Top 100 Global Brands [108] | Top 100 companies in 2022, including 2022 and 2021 rating, company name, company founder, year of establishment, industry, location, website, additional key people, and 2022 and 2021 brand value. | |
TREC-QA [109] | Two versions: TREC-6 and TREC-50 with 6 and 50 classes, respectively. Both have 5452 training examples and 500 test examples. | Question answering |
TTS-Portuguese [110] | The dataset has approximately 10 h and 28 min of speech from a single speaker, recorded at 48 Khz, containing a total of 3632 audio files in Wave format (in Portuguese). | Text-to-speech (e.g., [77]) |
USF free association norms [111] | Word pool consisting of 576 words. | Assess word memorability (e.g., in [15]) |
VCTK [112] | Corpus including speech data uttered by 110 English speakers with various accents. Each speaker reads out about 400 sentences, which were selected from a newspaper. | Text-to-speech (e.g., [77]) |
Verheyen et al. [113] | Norms for 1000 Dutch adjectives, covering lexicosemantic variables such as age of acquisition, familiarity, concreteness, and imageability alongside affective variables such as valence, arousal, and dominance as well as distributional variables (in Dutch). | Valence and arousal computation (e.g., in [11]) |
Warriner [114] | Valence, arousal, and dominance ratings on a nine-point scale. | Assessment of word memorability (e.g., in [15]) |
WOS [115] | Data and metadata of published papers. | Topic classification |
XANEW [114] | Norms of valence, arousal, and dominance for 13,915 English lemmas. | Valence and arousal computation (e.g., in [11]) |
Xu [116] | Valence and arousal ratings for 11,310 simplified words (in Mandarin). | Valence and arousal computation (e.g., in [11]) |
Yee [117] | Valence, arousal, familiarity, concreteness, and imageability ratings for 292 two-character Chinese nouns (in Cantonese). | Valence and arousal computation (e.g., in [11]) |
Yelp [118] | Subset of businesses, reviews, and user data. | Text generation (e.g., [20]) |
7+ Million Company [119] | Dataset of over seven million companies, including company name, Linkedin URL, domain and industry, company size from 1–10,000+, company location, number of employees, and year of establishment. | |
20 newsgroups [120] | Approximately 20,000 newsgroup documents partitioned (nearly) evenly across 20 different newsgroups. | Semantic analysis (e.g., in [121]) |
22.9+ Million Company [122] | Dataset of over 22.9+ million companies, including company name, Linkedin URL, domain and industry, company location, number of employees, and year of establishment. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lemos, M.; Cardoso, P.J.S.; Rodrigues, J.M.F. Harnessing AI and NLP Tools for Innovating Brand Name Generation and Evaluation: A Comprehensive Review. Multimodal Technol. Interact. 2024, 8, 56. https://doi.org/10.3390/mti8070056
Lemos M, Cardoso PJS, Rodrigues JMF. Harnessing AI and NLP Tools for Innovating Brand Name Generation and Evaluation: A Comprehensive Review. Multimodal Technologies and Interaction. 2024; 8(7):56. https://doi.org/10.3390/mti8070056
Chicago/Turabian StyleLemos, Marco, Pedro J. S. Cardoso, and João M. F. Rodrigues. 2024. "Harnessing AI and NLP Tools for Innovating Brand Name Generation and Evaluation: A Comprehensive Review" Multimodal Technologies and Interaction 8, no. 7: 56. https://doi.org/10.3390/mti8070056