Review

AI Chatbots for Mental Health: A Scoping Review of Effectiveness, Feasibility, and Applications

by Mirko Casu 1,2,*, Sergio Triscari 2,*, Sebastiano Battiato 1,3, Luca Guarnera 1 and Pasquale Caponnetto 2,3

1 Department of Mathematics and Computer Science, University of Catania, 95125 Catania, Italy
2 Department of Educational Sciences, Section of Psychology, University of Catania, 95124 Catania, Italy
3 Center of Excellence for the Acceleration of Harm Reduction (CoEHAR), University of Catania, 95123 Catania, Italy
* Authors to whom correspondence should be addressed.
Appl. Sci. 2024, 14(13), 5889; https://doi.org/10.3390/app14135889
Submission received: 31 May 2024 / Revised: 3 July 2024 / Accepted: 5 July 2024 / Published: 5 July 2024
(This article belongs to the Special Issue Innovative Digital Health Technologies and Their Applications)

Abstract:
Mental health disorders are a leading cause of disability worldwide, and there is a global shortage of mental health professionals. AI chatbots have emerged as a potential solution, offering accessible and scalable mental health interventions. This study aimed to conduct a scoping review to evaluate the effectiveness and feasibility of AI chatbots in treating mental health conditions. A literature search was conducted across multiple databases, including MEDLINE, Scopus, and PsycNet, as well as using AI-powered tools like Microsoft Copilot and Consensus. Relevant studies on AI chatbot interventions for mental health were selected based on predefined inclusion and exclusion criteria. Data extraction and quality assessment were performed independently by multiple reviewers. The search yielded 15 eligible studies covering various application areas, such as mental health support during COVID-19, interventions for specific conditions (e.g., depression, anxiety, substance use disorders), preventive care, health promotion, and usability assessments. AI chatbots demonstrated potential benefits in improving mental and emotional well-being, addressing specific mental health conditions, and facilitating behavior change. However, challenges related to usability, engagement, and integration with existing healthcare systems were identified. AI chatbots hold promise for mental health interventions, but widespread adoption hinges on improving usability, engagement, and integration with healthcare systems. Enhancing personalization and context-specific adaptation is key. Future research should focus on large-scale trials, optimal human–AI integration, and addressing ethical and social implications.

1. Introduction

Mental health, as defined by the World Health Organization (WHO), is a state of well-being in which individuals can realize their abilities, cope with normal life stresses, work productively, and contribute to their community [1]. Mental disorders, affecting a significant portion of the global population at any given time, are a leading cause of disability worldwide [1]. Recent WHO data reveal a global shortfall in the provision of necessary mental health services [2]. Although the global median number of mental health workers has increased from nine per 100,000 population in 2014 to 13 per 100,000 in 2020 [2], this is still insufficient to meet the growing demand for mental health services [3]. The disparity is stark: developed countries like Italy have 17 psychiatrists per 100,000 people [4], whereas many low-income countries have only one psychiatrist per 1,000,000 people [5]. This shortage makes traditional one-on-one mental health interventions challenging to implement globally.
WHO reports that mental health services fail to reach a significant portion of people in both developed and developing countries [6,7]. For instance, service coverage for depression is alarmingly low: even in high-income countries, only one-third of people with depression receive formal care, with minimally adequate treatment ranging from 23% in high-income countries to just 3% in low- and lower-middle-income countries [7]. This lack of access contributes to higher rates of suicidal behavior and mortality [8,9]. Research indicates that areas with overburdened inpatient psychiatric units experience higher suicide rates, underscoring the need for a systemic approach to mental health care [9].
The rise of Cyber Health Psychology, an interdisciplinary field exploring the intersection of psychology, health, and digital technology, has significantly transformed mental health support. The field focuses on understanding how digital tools and online platforms can influence health behaviors, mental well-being, and healthcare practices. It examines the psychological impacts of using health-related technologies, such as mobile health apps, telemedicine, and online health communities, and seeks to develop digital interventions to promote healthy behaviors and improve mental health outcomes. Cyber Health Psychology thus aims to enhance healthcare delivery and patient engagement in the digital age [10,11]. Technological integration in mental health has altered social interactions, communication patterns, and even our identities [12]. Web-based psychotherapeutic interventions have proven effective for common mental health disorders such as depression, anxiety, substance abuse, and eating disorders [13]. Combining technological and psychological aspects, particularly through spatial computing and Artificial Intelligence (AI), has shown promising results [14]. The COVID-19 pandemic further highlighted the importance of digital mental health tools, which have helped address the dual impact of increased new-onset mental health disorders and the deterioration of existing conditions [15].
One notable technological advancement in this field is the use of chatbots or conversational agents [16,17]. These systems, capable of engaging with users through spoken, written, and visual languages, have the potential to expand access to mental health interventions, especially for those reluctant to seek help due to stigma [18,19]. The global market for mental health apps, including chatbots, was estimated at USD 6.2 billion in 2023, with a projected growth rate of 15.2% annually from 2024 to 2030 [20]. As of 2022, mental health, meditation, and sleep trackers accounted for less than 15% of health apps, indicating substantial potential for growth in this area [21].
AI-powered chatbots have evolved from simple rule-based systems to advanced models using natural language processing (NLP) [22,23]. They show great potential in medical contexts, offering personalized, on-demand health promotion interventions [24,25]. These chatbots mimic human interaction through written, oral, and visual communication, providing accessible health information and services. Over the past decade, research has assessed their feasibility and efficacy, particularly in improving mental health outcomes [26]. Systematic reviews have evaluated their effectiveness, their feasibility in healthcare settings, and the technical architectures used for chronic conditions [26]. Recent studies focus on using AI chatbots for health behavior changes such as physical activity, diet, and weight management [26]. Integrated into devices like robots, smartphones, and computers, they support behavioral outcomes such as smoking cessation and treatment adherence [26]. Additionally, AI chatbots aid in patient communication, diagnosis support, and other medical tasks, with studies discussing their benefits, limitations, and future directions [25]. Their potential uses include mental health self-care and health literacy education [14,25,27,28].

1.1. Technical Background

Natural Language Processing and Detailed Aspects of AI Chatbots

The effectiveness and innovation of AI chatbots in mental health interventions heavily depend on advancements and methodologies within computer science [29,30]. A chatbot is essentially a computer program designed to interact with users, addressing specific tasks or requests through AI technologies. Key to these technologies are machine learning (ML) and deep learning (DL) techniques, which, when integrated with NLP, form the foundation of AI chatbots [31]. NLP is crucial to their functionality, allowing machines to comprehend and generate human language. This combination enables chatbots to understand and interpret user inputs effectively [32,33], whether text or audio [34]. These technologies are already deployed across industries; in healthcare, for example, they are used to collect patient data and provide health education.
It is possible to distinguish three main categories of AI chatbots [35,36,37] (the first two are illustrated in the sketch after this list):
  • Menu/button-based chatbots: the most common and simplest type; they follow decision-tree logic, presenting buttons and menus from which users make selections to receive answers.
  • Keyword recognition-based chatbots: use AI to identify and respond to specific keywords in user input.
  • Contextual chatbots: use AI and machine learning to understand user intentions and sentiments through technologies like voice recognition.
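To make the first two categories concrete, the minimal Python sketch below contrasts a decision-tree menu bot with simple keyword matching; the prompts, menu options, and keyword table are invented for illustration and are not drawn from any system reviewed here. Contextual chatbots, by contrast, rely on trained language models of the kind discussed in the following paragraphs.

```python
# Minimal sketch of the first two chatbot categories; all prompts,
# options, and keywords below are illustrative placeholders.

# Menu/button-based: a decision tree that users navigate by selection.
MENU_TREE = {
    "root": {
        "prompt": "How can I help? 1) Mood support  2) Sleep tips",
        "options": {"1": "mood", "2": "sleep"},
    },
    "mood": {"prompt": "Would you like to try a breathing exercise? (y/n)",
             "options": {}},
    "sleep": {"prompt": "Try keeping a fixed bedtime for one week.",
              "options": {}},
}

def menu_reply(state: str, choice: str) -> str:
    """Follow the decision tree from `state` using the user's selection."""
    next_state = MENU_TREE[state]["options"].get(choice)
    return MENU_TREE[next_state]["prompt"] if next_state else MENU_TREE[state]["prompt"]

# Keyword recognition-based: match trigger words in free-text input.
KEYWORD_RESPONSES = {
    "anxious": "It sounds like you may be feeling anxious; let's try grounding.",
    "sleep": "Sleep problems are common; a regular routine can help.",
}

def keyword_reply(text: str) -> str:
    """Return the response for the first recognized keyword, if any."""
    for keyword, response in KEYWORD_RESPONSES.items():
        if keyword in text.lower():
            return response
    return "Could you tell me a bit more?"

print(menu_reply("root", "1"))         # decision-tree branch to mood support
print(keyword_reply("I can't sleep"))  # keyword match on 'sleep'
```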
Recent advancements in NLP, driven by deep learning models such as transformers, have greatly enhanced chatbots’ capabilities in understanding context, sentiment, and nuances in user inputs. Despite these improvements, challenges persist in achieving seamless conversational flow, accurately interpreting user emotions, and comprehending colloquial or context-specific language [29,30].
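As a small, hedged illustration of this capability, the snippet below applies a pre-trained transformer to sentiment interpretation via the Hugging Face transformers library; the use of the library's default sentiment-analysis model is an assumption made for brevity, and the example utterances are invented.

```python
# Sketch: transformer-based sentiment interpretation of user input.
# Uses the default model of the Hugging Face `pipeline` API (assumed
# installed); a domain-specific fine-tuned model could be substituted.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")

for utterance in [
    "I've been feeling a bit better since our last chat.",
    "Honestly, nothing seems to help anymore.",
]:
    result = sentiment(utterance)[0]  # e.g., {'label': 'NEGATIVE', 'score': 0.99}
    print(f"{utterance} -> {result['label']} ({result['score']:.2f})")
```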
The key elements that characterize an AI chatbot are as follows [38,39,40]:
  • Understanding: chatbots use NLP to comprehend user requests and human language complexities.
  • Contextual responses: NLP allows chatbots to provide relevant, context-aware answers.
  • Continuous learning: chatbots learn new language patterns from interactions, staying current with trends.
Machine learning powers intelligent chatbot responses in several ways [31]:
  • Training data: chatbots learn from extensive conversational data to understand user questions and provide suitable answers.
  • Pattern recognition: chatbots identify patterns in user behavior and language to predict and generate accurate responses.
  • Feedback loop: machine learning continuously refines chatbot algorithms, improving accuracy with each interaction.
Machine learning techniques enable chatbots to personalize interactions based on user data, tailoring responses to individual needs and preferences. Methods such as reinforcement learning and user feedback loops can optimize chatbot responses over time [31]. Developing adaptive learning models that dynamically adjust to changes in user behavior and mental state presents a promising area for exploration. Balancing data privacy with personalization is critical, requiring robust encryption and anonymization protocols. Training an AI chatbot involves several steps: collecting and preprocessing data, selecting a model architecture, training the model, evaluating and optimizing its performance, and deploying it [31]. Some models can continue to learn from real-time interactions through continuous learning [41]. This process involves gathering data during conversations, analyzing it, updating the model, and evaluating its performance. While not all AI chatbots use continuous learning, it can significantly enhance responsiveness and adaptability [40]. The decision to implement continuous learning depends on the specific needs of the project.
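As a deliberately compact sketch of those steps (data collection, preprocessing, model selection, training, evaluation, deployment), consider the toy intent classifier below; the utterances, intent labels, and the choice of scikit-learn are illustrative assumptions, not a recipe taken from the reviewed studies.

```python
# Toy end-to-end training pipeline for a chatbot intent classifier;
# the data and labels are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# 1. Collect conversational data: utterance -> intent label.
utterances = [
    "I feel sad all the time", "nothing makes me happy",
    "my heart races and I panic", "I worry about everything",
    "I can't fall asleep", "I keep waking up at night",
]
intents = ["low_mood", "low_mood", "anxiety", "anxiety", "sleep", "sleep"]

# 2-4. Preprocess (TF-IDF features), select a model, and train it.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(utterances, intents)

# 5. Evaluate; a real system would score a held-out test set instead.
print("training accuracy:", model.score(utterances, intents))

# 6. Deploy: route a new user message to its predicted intent.
print(model.predict(["I panic whenever I'm in a crowd"]))  # e.g., ['anxiety']
```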
Advanced chatbots can leverage multimodal data, including text, voice, and even facial expressions, to provide more holistic support. Integrating voice recognition and analysis can make interactions more natural, especially for users who may have difficulties with text-based communication [29,30]. The challenge lies in creating models that can seamlessly integrate and interpret these diverse data types, ensuring consistent and reliable outputs. Using multimodal data, we can now develop Large Language Models (LLMs) [42] that employ deep learning techniques to handle various types of data for numerous NLP tasks, such as recognition, translation, and content and text generation. ChatGPT-4, where GPT stands for Generative Pre-trained Transformer, exemplifies advanced AI chatbots built on this technology.
Training an LLM for specific tasks requires significant hardware resources and extensive data, making the creation of custom AI chatbot models costly [43]. Transfer learning provides an effective solution to this challenge. By starting with a pre-trained model like GPT, we can fine-tune it for specific tasks using targeted data. This fine-tuning process enhances the LLM’s capabilities, making it more accurate and reliable for particular applications [43,44,45]. Figure 1 illustrates a generic fine-tuning operation in the medical domain. In this context, the resulting model can also be customized to process various inputs, including text and multimedia content such as images, videos, and audio.
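A hedged sketch of this fine-tuning workflow, using the Hugging Face transformers and datasets libraries, is shown below; the base model, the two-example dataset, and the binary label scheme are placeholders chosen for brevity rather than details taken from Figure 1 or from any reviewed chatbot.

```python
# Sketch: fine-tuning a pre-trained transformer (transfer learning) on a
# tiny, invented domain dataset; real fine-tuning needs far more data.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

base = "distilbert-base-uncased"  # placeholder pre-trained model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

# Toy labeled examples: 1 = distress, 0 = neutral (illustrative labels).
data = Dataset.from_dict({
    "text": ["I feel hopeless lately", "Thanks, that exercise really helped"],
    "label": [1, 0],
}).map(lambda batch: tokenizer(batch["text"], truncation=True,
                               padding="max_length", max_length=64),
       batched=True)

# Fine-tune for one epoch: the pre-trained weights are adapted to the
# target task instead of training a model from scratch.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=data,
)
trainer.train()
```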
For chatbots to be effective in large-scale mental health interventions, they must be both scalable and robust [29,30]. Cloud computing and distributed systems are crucial for managing large volumes of concurrent interactions. Research into efficient data processing and storage solutions, as well as load balancing algorithms, is essential to ensure chatbots can operate effectively at scale. Furthermore, maintaining the reliability and uptime of these systems is critical to sustaining user trust and engagement.
Recent advances in LLMs like GPT-4 and Med-PaLM-2 have shown impressive capabilities across medical domains [46]. However, systematic evaluation of their performance in mental health applications has been lacking [47,48]. Several studies have begun exploring LLMs’ potential in mental health tasks through benchmarking and extensive evaluations. Xu et al. [49] found that while zero-shot and few-shot prompting showed limited performance, instruction fine-tuning significantly boosted LLM accuracy, with their Mental-Alpaca and Mental-FLAN-T5 models outperforming much larger models like GPT-3.5 and GPT-4. Jin et al. [50] introduced the first multi-dimensional mental health benchmark, revealing significant room for LLM improvement in this domain. Qi et al. [51] evaluated LLMs on cognitive distortion and suicide risk classification for Chinese social media, highlighting GPT-4’s strong performance.
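To clarify the prompting strategies these evaluations compare, the sketch below constructs zero-shot and few-shot prompts for a toy classification task; the instruction wording and example posts are invented. Instruction fine-tuning, by contrast, updates the model's weights on many such instruction-response pairs rather than placing them in the prompt.

```python
# Sketch: zero-shot vs. few-shot prompt construction for a toy mental
# health classification task; instruction text and examples are invented.
INSTRUCTION = ("Classify the post as 'depression risk' or 'no risk'. "
               "Answer with the label only.")

FEW_SHOT_EXAMPLES = [
    ("I have no energy and nothing feels worth doing.", "depression risk"),
    ("Had a great hike with friends this weekend.", "no risk"),
]

def zero_shot_prompt(post: str) -> str:
    # Only the task instruction, no examples.
    return f"{INSTRUCTION}\n\nPost: {post}\nLabel:"

def few_shot_prompt(post: str) -> str:
    # Prepend a handful of labeled examples before the target post.
    shots = "\n\n".join(f"Post: {p}\nLabel: {l}" for p, l in FEW_SHOT_EXAMPLES)
    return f"{INSTRUCTION}\n\n{shots}\n\nPost: {post}\nLabel:"

# Either prompt string would be sent to an LLM for completion.
print(zero_shot_prompt("I can't get out of bed anymore."))
print(few_shot_prompt("I can't get out of bed anymore."))
```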
As LLMs integrate further into mental health care [52], a dedicated, rigorous benchmark focused specifically on mental health is crucial. Such a benchmark could evaluate understanding psychological expressions, providing support resources, avoiding harmful outputs, and other key capabilities [49,50]. Further identifying LLMs’ strengths and weaknesses in this area could drive the development of more robust, reliable, and ethical models tailored for assisting those with mental health needs [48,52].

1.2. Aim of the Study

This study aims to conduct a detailed scoping review to address a critical gap in the existing literature on the effectiveness and feasibility of AI chatbots in the treatment of mental health disorders. Despite the growing prevalence of mental health issues and the global shortage of mental health professionals, the potential of AI-powered chatbots as a scalable and accessible solution remains underexplored. This study therefore evaluates the current state of research on the effectiveness of AI chatbots in improving mental and emotional well-being, as well as their ability to address specific mental health conditions. Additionally, it assesses the feasibility of AI chatbots in terms of acceptability, usability, and adoption by both users and mental health professionals. By addressing these gaps, this study contributes to a deeper understanding of the potential of AI chatbots as a viable and scalable solution to the growing mental health crisis, informing the development and implementation of more effective and accessible mental health interventions.

2. Materials and Methods

This scoping review was conducted following the guidelines of the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) Checklist [53]. The purpose was to map the existing literature on chatbot interventions in mental health, identify research gaps, and suggest directions for future studies.

2.1. Search Strategy

The literature search aimed to identify relevant studies on the use of AI chatbots in mental health interventions. The databases searched included MEDLINE, Scopus, and PsycNet. Additionally, two AI-powered tools, Microsoft Copilot and Consensus, were used to identify further studies.

2.2. Database Searches

The search terms used were the following: “Chatbot and Mental Health” OR “Chatbot and Anxiety” OR “Chatbot and PTSD” OR “Chatbot and Mood Disorder” OR “Chatbot and Depression” OR “Chatbot and DCA” OR “Chatbot and Addiction” OR “Chatbot and Personality Disorder” OR “Chatbot and Generalized Anxiety Disorder” OR “Chatbot and Social Anxiety Disorder” OR “Chatbot and Panic Disorder” OR “Chatbot and Sexual Disorders”. A custom search was executed using Microsoft Copilot with the same search string. Consensus was also utilized to find scientific articles related to the specified search topics. The search was performed on 30 April 2024, and included studies published up to that date.

2.3. Inclusion and Exclusion Criteria

Studies were selected based on the following inclusion criteria:
  • Clinical trials;
  • Randomized controlled trials (RCTs);
  • Articles written in any language;
  • Chatbot interventions mediated by modern AI architectures/frameworks;
  • Chatbots utilizing rule-based systems, natural language processing (NLP), or machine learning;
  • Pilot studies examining chatbot interventions for mental health conditions.
The exclusion criteria included the following:
  • Cross-sectional studies;
  • Reviews;
  • Commentaries;
  • Editorials;
  • Protocols;
  • Case studies;
  • Older chatbot systems not based on modern AI architectures/frameworks;
  • Human-to-human asynchronous communication platforms without AI mediation;
  • Scripted or pre-set chat systems without AI-driven conversation simulation;
  • Studies not focused on chatbot interventions or mental health conditions.

2.4. Study Selection

The authors conducted a bibliographic search and independently screened each resulting article for adherence to the eligibility criteria. No automatic screening tools were used. The articles agreed upon by all authors were included in the next screening process. In cases of disagreement, a brief discussion was held, and a joint decision was made on whether to include the articles.

2.5. Data Extraction

Two reviewers independently performed data extraction. Any inconsistencies were resolved through discussion or with the assistance of a third reviewer. Both qualitative and quantitative results from each included study were selected during data extraction.

2.6. Quality Assessment

The methodological quality of the included studies was assessed using the Cochrane risk-of-bias tool for randomized trials, version 2 (RoB 2) [54], and the Cochrane Risk Of Bias In Non-randomized Studies—of Exposure (ROBINS-E) [55].

2.7. Analysis

The extracted data were synthesized descriptively to provide an overview of the current research on chatbot interventions for mental health. Key themes and findings were identified, discussing potential benefits, challenges, and gaps in the literature. This synthesis, guided by PRISMA-ScR, ensured a transparent and systematic approach to understanding the emerging field of chatbots in mental health interventions.

3. Results

3.1. Characteristics of Included Studies

The search yielded a total of 4310 records from various databases, with 4268 records excluded after initial screening. After a detailed review, 15 studies were included in the final analysis. The article selection process, documented in the PRISMA flow diagram (Figure 2), followed stringent guidelines to ensure the inclusion of relevant and high-quality studies.
The included studies were categorized into several key areas: mental health support during the COVID-19 pandemic, interventions for specific health conditions, preventive care and well-being, addressing substance use and addiction, health promotion, panic disorder management, and usability and engagement. Two studies focused on mental health support during the COVID-19 pandemic, utilizing AI chatbots to address issues such as depression and anxiety among college students. Three studies concentrated on interventions for specific health conditions, such as depression, Parkinson’s disease, and migraines. Two studies addressed preventive care and well-being applications, examining the use of AI chatbots in preventing eating disorders and promoting well-being among young cancer survivors. Three studies explored the application of AI chatbots in tackling substance use and addiction, including interventions for problem gambling and cannabis/alcohol use. Two studies investigated the use of AI chatbots in health promotion, targeting problem-solving for older adults and HIV prevention and testing among men who have sex with men. One study focused on the management of panic disorder using a mobile app-based interactive CBT chatbot. Additionally, one study compared the usability of an anthropomorphic digital human with a text-based chatbot for responding to mental health queries. The methodologies employed in these studies included randomized controlled trials, cluster-controlled trials, open-label randomized studies, 8-week usability studies, pragmatic multicenter randomized controlled trials, pilot randomized controlled trials, and beta testing mixed methods studies. The full list of included studies is given in Table 1, along with the data extraction of each.
As summarized in Figure 3, the included studies cover a broad spectrum of AI chatbot applications in mental health, ranging from specific interventions for mental health conditions to preventive care, substance use disorders, health promotion, and usability assessments.

3.2. Mental Health Allies during COVID-19

The COVID-19 pandemic has had a significant impact on mental health, particularly among young people. Two studies have demonstrated the potential effectiveness of AI chatbots as interventions for mental health issues during this time. He et al. [58] conducted a randomized controlled trial to assess the impact of an AI chatbot on depressive symptoms in college students; the chatbot, named XiaoE, was employed as a standalone intervention. The trial found that using the CBT-based AI chatbot for one week significantly reduced depressive symptoms compared to control groups that read an e-book about depression or used a general chatbot, with a moderate effect size post-intervention and a small effect size at 1-month follow-up. XiaoE had high engagement, acceptability, and working alliance ratings. Qualitative analysis showed participants valued XiaoE’s ability to provide an emotional relationship, promote emotional expression, give personalized responses, and offer practical advice, although some criticized its inflexible content and technical glitches. The authors conclude that while XiaoE is a feasible and effective digital intervention, further research is needed on its long-term efficacy compared to other active treatments, as mental health chatbots may be best used as an adjunct to human therapists rather than a full replacement. Supporting these findings, Peuters et al. [59] evaluated the #LIFEGOALS mobile health intervention, which also included an AI chatbot component. The latter was not a standalone program, but rather offered as one component of a multi-component mHealth intervention. Their cluster-controlled trial with process evaluation interviews showed positive effects on physical activity, sleep quality, and positive moods. Interestingly, pandemic-related restrictions moderated these effects, with in-person schooling enhancing the mental health benefits. Engagement was a challenge in this study as well, but users highlighted the importance of gamification, self-regulation techniques, and personalized information from the chatbot in facilitating behavior change. These insights provide valuable directions for developing effective AI-based mental health tools for adolescents. Together, these studies suggest that AI chatbots can be a valuable component of mental health interventions, particularly during times of crisis like the COVID-19 pandemic.

3.3. Supporting Emotional Well-Being in Specific Health Conditions

The potential of AI chatbots to support individuals with specific health conditions, such as depression, Parkinson’s disease, and migraines, has been explored in several studies. Yasukawa et al. [60] focused on the role of an AI chatbot in enhancing completion rates for internet-based cognitive–behavioral therapy (iCBT) among workers with subthreshold depression. In this case, the AI chatbot was not a standalone intervention; rather, it was offered as an adjunct to the iCBT program. The addition of a chatbot that sent personalized messages encouraged program adherence, resulting in significantly higher iCBT completion rates compared to the control group. However, both groups demonstrated similar improvements in depression and anxiety symptoms, suggesting that while the chatbot improved engagement, the intensive nature of the program may have limited its impact on clinical outcomes.
In the study by Ogawa et al. [61], the impact of an AI chatbot on emotional well-being in Parkinson’s disease patients was investigated. The unique aspect of this research was the exploration of facial expressions and speech patterns as indicators of emotional changes. The intervention group interacted daily with an AI chatbot and had weekly video visits with a neurologist, while the control group received only the video visits. The AI chatbot was used as a component of a broader telemedicine approach, complementing the weekly video conferencing sessions with neurologists. It was not used as a standalone therapy but as an additional tool to enhance patient monitoring and engagement between the weekly neurologist consultations. Although clinical rating scales did not show significant differences, the chatbot group exhibited increased smile parameters and reduced filler words in speech, suggesting improved facial expressivity and fluency, respectively. The correlation between smile features and cognitive and motor ratings, along with the accuracy of machine learning models in predicting these aspects, highlights the potential for remote symptom monitoring in Parkinson’s disease.
The BalanceUP app, developed by Ulrich et al. [62], specifically targets mental health support for individuals suffering from migraines. This app utilizes a chat-based interface with predefined and free-text input options, guiding users through personalized psychoeducational content and behavioral tasks rooted in Cognitive–Behavioral Therapy for Migraine Management (MIMA). The BalanceUP chatbot program appears to have been designed and evaluated as a standalone intervention, rather than as a component or adjunct to other therapy. The app also includes various engagement strategies and tailors content based on user-specific needs. In a randomized controlled trial, the BalanceUP app significantly improved mental well-being, demonstrating its effectiveness as a digital intervention for individuals with migraines.

3.4. Tackling Substance Use and Addiction

The application of AI chatbots in addressing substance use and addiction has been explored in different studies, demonstrating their potential as accessible and effective interventions. Vereschagin et al. [63] conducted a randomized controlled trial with the Minder mobile app, which integrates an AI chatbot delivering cognitive–behavioral therapy, among university students. While an AI chatbot was a key part of the intervention, it was integrated as one component of a multi-faceted mobile app, rather than being used as a standalone chatbot program. The study evaluated the effects of the full Minder app intervention, not just the chatbot in isolation. The findings highlighted the app’s ability to reduce anxiety and depressive symptoms, improve mental well-being, and even decrease the frequency of cannabis use and alcohol consumption. While the effects were small, the self-guided nature and co-development with students make Minder a promising tool for early intervention on university campuses.
Prochaska et al. [64] contributed to this domain with their investigation of the Woebot chatbot designed specifically for substance use disorders (W-SUDs). Their 8-week program delivered cognitive–behavioral therapy for substance use problems and was found to be feasible and highly acceptable by participants. W-SUDs was offered as a standalone intervention in this study, although the researchers did analyze whether being in concurrent therapy affected outcomes. The automated therapeutic intervention led to significant improvements in self-reported substance use, cravings, mental health outcomes, and confidence to resist urges. The high engagement and positive outcomes suggest that W-SUDs has the potential to provide scalable treatment for individuals struggling with addiction.
In a similar vein, So et al. [65] compared the effectiveness of guided versus unguided chatbot interventions for problem gambling. Their randomized controlled trial evaluated the addition of minimal therapist guidance to the standalone AI chatbot intervention, GAMBOT2. The research specifically aimed to test the isolated effectiveness of the AI-based intervention by varying only the presence of researcher guidance between groups. Both groups showed significant within-group improvements in gambling outcomes, yet there were no significant between-group differences. This indicates that the guidance provided by therapists did not enhance the outcomes beyond the unguided GAMBOT2 intervention. These findings suggest that the standalone intervention is effective and that costly therapist involvement may not be necessary for positive outcomes. However, further investigation is warranted to identify the specific elements that can best or only be provided by a therapist to determine the optimal human–machine balance and synergy in such interventions.
Furthermore, Olano-Espinosa et al. [66] conducted a pragmatic, multicenter randomized controlled trial comparing an AI chatbot intervention (Dejal@bot) to usual care for smoking cessation in 513 patients across 34 primary care centers in Spain. Dejal@bot was employed as a standalone intervention in this study, replacing usual care. The primary outcome was biochemically-validated 6-month continuous abstinence, which was 26.0% in the chatbot group vs. 18.8% in usual care. The secondary outcomes showed the chatbot group had a greater total interaction time, more contacts, and a trend towards higher quality of life, especially among abstinent patients. Patients using the chatbot intensively (>4 contacts, >30 min total) had a 68.6% abstinence rate versus 40.9% for non-intensive users. While limited by a 54.8% overall dropout rate, the study suggests that the chatbot may increase long-term abstinence rates compared to usual care, with efficacy related to greater interaction intensity.

3.5. Preventive Care: Targeting Eating Disorders and HIV Prevention

The utilization of AI chatbots in preventive health and well-being interventions has been explored in studies targeting eating disorder and HIV prevention. Fitzsimmons-Craft et al. [67] conducted a randomized controlled trial with 700 women at high risk for eating disorders to assess the efficacy of an AI chatbot named “Tessa.” The chatbot delivered a cognitive–behavioral intervention aimed at preventing eating disorders, and was employed as a standalone intervention. The results showed that the intervention group had significantly greater reductions in weight and shape concerns compared to the waitlist control group, with reduced odds of developing an eating disorder. However, challenges with engagement highlight the need for further strategies to maximize the impact of such interventions.
Cheah et al. [68] shifted the focus to HIV prevention and testing, conducting a beta testing study with an AI chatbot prototype among men who have sex with men (MSM) in Malaysia. The AI chatbot was designed to be a complementary tool to existing HIV prevention and testing services, aiming to provide MSM with convenient and confidential access to information and services related to HIV self-testing, venue-based HIV testing, pre-exposure prophylaxis (PrEP), and mental health resources. The chatbot was found to be feasible and acceptable, with participants rating it highly on quality, satisfaction, intention to continue using, and willingness to refer it to others. Participants valued the chatbot’s ability to provide information on HIV testing, and locating testing venues in a stigma-free way that protected privacy. However, participants suggested adding more mental health information and resources, as mental health issues were a major concern for MSM. Improving the conversational flow to seem more natural was also recommended. In general, the study supported the feasibility and acceptability of using an AI chatbot for HIV services among MSM in Malaysia if tailored to the local context and cultures.

3.6. Enhancing Well-Being in Young Cancer Survivors

Greer et al. [69] focused on the use of the Vivibot chatbot to deliver positive psychology skills to young adults who had completed cancer treatment. Vivibot was offered as a standalone intervention in this study; the control group only had access to daily emotion ratings through Facebook Messenger and did not receive the full chatbot content until the end of the study. The authors’ 4-week pilot randomized controlled trial evaluated the feasibility, usability, and initial efficacy of the intervention. Participants found the chatbot engaging and helpful, spending a considerable amount of time interacting with it. The Vivibot group exhibited a trend toward greater reduction in anxiety symptoms compared to the control group, along with an increase in daily positive emotions. While larger trials are warranted, the study underscores the potential of AI chatbots in delivering therapeutic interventions to promote well-being in young cancer survivors.

3.7. Panic Disorder Management

Oh et al. [70] assessed a mobile app-based interactive CBT chatbot for panic disorder. The study was a randomized controlled trial comparing a newly developed mobile app chatbot for cognitive–behavioral therapy (CBT) to a paperback book with information about panic disorder. The chatbot was offered as a standalone intervention for the patients, while the control group received a paperback book with comprehensive information about panic disorder, including sections on symptoms, different types of treatments, and coping skills for emergencies. In total, 45 patients with panic disorder were randomized to either the chatbot group (n = 21) or book group (n = 20). After 4 weeks, the chatbot group showed significantly greater reductions in panic disorder severity measured by the Panic Disorder Severity Scale compared to the book group. The chatbot group also had improvements in social phobia symptoms and perceived control over feelings of helplessness. The chatbot received lower usability ratings and faced technical challenges; the chatbot group reported a lower mean System Usability Scale score (64.5) compared to the book group (69.5). However, qualitative feedback highlighted several advantages of the chatbot, including the availability of coping tools, interactive learning, and self-management features. The mobile CBT chatbot shows promise as an accessible intervention to help manage panic symptoms.

3.8. Problem-Solving in Older Adults

The application of AI chatbots in health promotion interventions has also been explored in studies targeting problem-solving for older adults. Bennion et al. [71] explored the use of web-based conversational agents to facilitate problem-solving among older adults. The study compared the usability, helpfulness, and effectiveness of two conversational AI chatbots (MYLO, based on the method of levels therapy, and ELIZA, based on Rogerian counseling) for problem-solving and reducing distress in a sample of 112 older adults without mental health disorders. The AI chatbots were offered as a standalone intervention in this study. Participants were randomly assigned to interact with either the MYLO or ELIZA chatbot and did not receive any other form of therapy or intervention as part of the study. Both chatbots enabled significant reductions in problem distress and depression/anxiety/stress from baseline to 2-week follow-up, with MYLO showing greater reductions in problem distress at follow-up compared to ELIZA. Participants rated MYLO as significantly more helpful and were more willing to use it again compared to ELIZA. MYLO had higher correlations between system usability ratings and perceived helpfulness, willingness to use again, and problem resolution. The overall system usability scores were below the acceptable threshold for both chatbots, highlighting the importance of optimizing usability for this application in older adults. The promising but mixed results suggest further research is needed on integrating chatbot support systems into clinical care pathways.

3.9. Usability and Engagement

One notable study by Thunström et al. [72] conducted a randomized controlled trial to compare the usability of an anthropomorphic digital human with a text-based chatbot for responding to mental health queries among healthy participants. Specifically, the authors conducted a study on the development and evaluation of a mental health chatbot named BETSY (Behavior, Emotion, Therapy System, and You) using a participatory design approach. This approach involved a multidisciplinary team and extensive public engagement through surveys and workshops. Two versions of BETSY were developed: a digital human interface enabling voice interaction and a text-only interface. The chatbot was offered as a standalone intervention in this study, which specifically evaluated the effectiveness of the chatbot in providing mental health support and did not combine it with any other therapeutic approaches. The study recruited 45 participants, excluding those with high anxiety scores, and divided them randomly into groups interacting with either the digital human or text-only BETSY. The pre-chat procedures included biometric measurements and questionnaires, while during the chat sessions, EEG data were recorded. Post-chat, participants completed additional questionnaires to assess usability and emotional responses. The results indicated higher usability scores for the text-only chatbot compared to the digital human interface. The emotional responses varied, with the digital human group reporting higher nervousness. EEG data revealed higher alpha wave activity in the text-only group, correlating with higher usability scores.
Table 2 provides an overview of the various application areas of the AI chatbots analyzed in this review, summarizing the corresponding studies and their strengths and flaws. The application areas covered include mental health support during COVID-19, interventions for specific health conditions, addressing substance use and addiction, preventive care and well-being, panic disorder management, health promotion, and usability and engagement. For each area, relevant studies are cited, followed by a description of the main strengths and the noted flaws.
Table 3 describes a comprehensive technical overview of the various AI chatbots analyzed in this review, with detailed information on the technologies used, protocols followed, and levels of usability and engagement.

3.10. Risk of Bias

The included studies were assessed for risk of bias using the Cochrane risk-of-bias tool for randomized trials, version 2 (RoB 2), and the Cochrane Risk Of Bias In Non-randomized Studies—of Exposure (ROBINS-E), depending on the nature of the study. The following figures summarize the implementation of these tools and the overall risk-of-bias evaluations for each included study.

3.10.1. Risk of Bias in Randomized Trials (RoB 2)

The assessment using the RoB 2 tool [54] focused on five domains:
  • D1: Bias arising from the randomization process.
  • D2: Bias due to deviations from intended interventions.
  • D3: Bias due to missing outcome data.
  • D4: Bias in measurement of the outcome.
  • D5: Bias in selection of the reported result.
Figure 4 presents the detailed evaluation for each study across these domains. Most studies demonstrated a low risk of bias across all domains, indicating a robust methodological quality. Specifically, Thunström et al. [72], Vereschagin et al. [63], Yasukawa et al. [60], Ulrich et al. [62], So et al. [65], Peuters et al. [59], Ogawa et al. [61], Olano-Espinosa et al. [66], Fitzsimmons-Craft et al. [67], He et al. [58], Bennion et al. [71], Greer et al. [69], and Oh et al. [70] exhibited a low risk of bias in all domains. Conversely, some concerns were noted in studies such as Prochaska et al. [64], which showed a high risk of bias in the randomization process and some concerns in the measurement of the outcome.

3.10.2. Risk of Bias in Non-Randomized Studies (ROBINS-E)

The ROBINS-E tool [55] assessed seven domains for non-randomized studies:
  • D1: Risk of bias due to confounding.
  • D2: Risk of bias in selection of participants into the study.
  • D3: Risk of bias in classification of interventions.
  • D4: Risk of bias due to deviations from intended interventions.
  • D5: Risk of bias due to missing data.
  • D6: Risk of bias in measurement of outcomes.
  • D7: Risk of bias in selection of the reported result.
Figure 5 provides a summary of these evaluations. Cheah et al. [68] was assessed with a low risk of bias across all domains, indicating high methodological rigor and reliable results.
In general, the risk-of-bias assessments using the RoB 2 and ROBINS-E tools indicate that most of the included studies exhibit a low risk of bias, suggesting that the findings are robust and reliable. However, attention should be given to studies with identified risks to interpret their results cautiously.

4. Discussion

This scoping review provides an overview of the current state of research on the use of AI chatbots for mental health interventions. The included studies span a wide range of application areas, including specific mental health conditions, substance use disorders, preventive care, health promotion, and usability assessments. Several key themes emerge from the findings.
The studies collectively demonstrate the potential benefits of AI chatbots in improving mental and emotional well-being, addressing specific mental health conditions, and facilitating behavior change. Chatbots have shown promise in reducing symptoms of depression, anxiety, substance use, and disordered eating behaviors. They have also been explored as preventive interventions for conditions like eating disorders and as supportive tools for individuals with chronic illnesses like Parkinson’s disease and migraines. AI chatbots offer several advantages over traditional mental health services, including increased accessibility, anonymity, and cost-effectiveness. They can provide round-the-clock support, overcome geographical barriers, and reduce the stigma associated with seeking professional help. Chatbots can also complement existing treatment modalities, serving as adjuncts to human therapists or self-help tools.
While the studies indicate the feasibility and acceptability of AI chatbots for mental health interventions, challenges related to usability and engagement persist (see Table 3). Some studies, like those by He et al. [58] and Prochaska et al. [64], utilized specific standardized tools such as the Working Alliance Questionnaire (WAQ) and the System Usability Scale (SUS) to assess usability and human interaction. Other studies, such as Peuters et al. [59] and Yasukawa et al. [60], relied on qualitative feedback from interviews without using standardized instruments. For instance, Ulrich et al. [62] employed a mix of quantitative metrics and standardized scales like the Mobile Application Rating Scale (MARS) alongside qualitative feedback. The SUS was a commonly used tool in several studies (e.g., Bennion et al. [71], Oh et al. [70], Cheah et al. [68], and Thunström et al. [72]), indicating a preference for its straightforward and well-established usability assessment. However, each study often tailored its evaluation approach to its specific context and objectives, leading to variability in the comparability of findings across different chatbots. This diversity in the evaluation methods underscores the importance of a standardized approach for assessing chatbot usability and user interaction to facilitate more consistent comparisons across studies. Furthermore, several studies reported issues with technical glitches, limited conversational flow, and inflexible content, highlighting the need for continuous improvement in natural language processing and user experience design.
Several studies suggest potential benefits and positive chatbot engagement comparable to or better than traditional methods, highlighting the need for ongoing refinement to address engagement variability and improve user experience (see Table 3). Notably, He et al. [58] found that the XiaoE chatbot demonstrated lower attrition and high initial engagement, although engagement fluctuated and appeared more suited for short-term use. Peuters et al. [59] reported initial enthusiasm but overall low engagement, with the chatbot’s ability to provide meaningful replies being a significant factor. Yasukawa et al. [60] observed improved completion rates with the chatbot-enhanced iCBT program, indicating positive engagement. Similarly, Ulrich et al. [62] noted good engagement rates with a substantial portion of participants completing the program, and Vereschagin et al. [63] found higher engagement with the chatbot compared to other app components. Oh et al. [70] and Bennion et al. [71] reported satisfactory engagement and completion rates, with the latter study indicating comparable engagement to human-delivered CBT.
The reviewed studies also emphasize tailoring chatbot interventions to specific populations and contexts. The success of mental health chatbots during the COVID-19 pandemic underscores the need for adaptive solutions in crises. Similarly, cultural sensitivity and localization are critical for chatbots addressing substance use disorders and HIV prevention in vulnerable groups. Personalization and context-specific adaptation are key to their effectiveness and acceptability. These sensitivities are also critical in human-delivered interventions, where cultural biases and misunderstandings can hinder progress. Conversational agents offer advantages in accessibility, confidentiality, and tailored information [73]. They provide consistent care and infinite patience, and can bridge literacy gaps with anthropomorphic features. Chatbots can also address disparities in healthcare provider diversity by matching racial features with patients, fostering trust. Additionally, as outlined by Han et al. [74], chatbots are perceived as less judgmental, creating a safe space for users sharing sensitive information, which is vital for managing conditions like PTSD. These advantages make chatbots a compelling complement to human health care, especially in behavior change and diverse population support.
While AI chatbots offer scalable solutions for mental health support, their integration with existing healthcare systems remains a challenge. The results indicate that AI chatbots can effectively function as standalone interventions and adjuncts to therapy for improving mental health (see Table 3). For instance, He et al. [58] demonstrated that XiaoE, used independently, reduced depressive symptoms in college students. Conversely, Yasukawa et al. [60] found that an AI chatbot paired with iCBT improved program adherence. Similarly, Ogawa et al. [61] utilized chatbots within a broader telemedicine approach for Parkinson’s patients, enhancing patient engagement. The effectiveness of standalone AI chatbots is further supported by studies from Vereschagin et al. [63], Fitzsimmons-Craft et al. [67], Oh et al. [70], and Bennion et al. [71]. However, Peuters et al. [59] and Cheah et al. [68] showcased the benefits of AI chatbots as components of multi-faceted interventions for mental health and HIV prevention, respectively. These findings suggest that AI chatbots are versatile tools in mental health interventions, and ongoing research will help optimize their use and determine their long-term efficacy relative to human therapists. Nonetheless, studies exploring the usability and acceptability of chatbots among healthcare professionals and policymakers are needed to facilitate their adoption and integration into clinical pathways. Establishing trust and addressing ethical concerns related to data privacy, safety, and accountability will be crucial for the widespread implementation of AI chatbots in mental healthcare settings.

Limitations and Future Research Directions

This scoping review has several limitations. As a scoping review, it maps the breadth of the evidence rather than appraising it in depth; future systematic reviews and meta-analyses targeting specific application areas or population groups could offer more rigorous evaluations of the evidence. Furthermore, the rapid evolution of AI technology means that the findings of this review may quickly become outdated as new advancements in natural language processing and conversational AI emerge.
Future research should prioritize large-scale, well-designed randomized controlled trials to evaluate the long-term efficacy and cost-effectiveness of AI chatbot interventions compared to standard care or other active treatments. Studies investigating the optimal combination of human and AI-based support, as well as the integration of chatbots into existing healthcare systems, are also warranted. In addition, the development of effective AI chatbots for mental health is inherently interdisciplinary, requiring collaboration between computer scientists, psychologists, healthcare professionals, and ethicists. Creating platforms for interdisciplinary research and dialogue can facilitate the integration of diverse perspectives and expertise, leading to more holistic and effective chatbot solutions. Furthermore, the deployment of AI chatbots in mental health settings raises significant ethical and security concerns.
Ensuring data privacy and protecting user information against breaches are paramount. However, the ethical concerns surrounding relational agents like chatbots go beyond data privacy. Hudlicka [75] identifies key issues: affective privacy, emotion induction, and virtual relationships. Affective privacy relates to keeping thoughts and emotions private, raising questions about the extent of chatbot probing. Emotion induction refers to the chatbots’ potential to manipulate users’ emotions, bringing up consent and impact concerns. Virtual relationships, where users bond with chatbots, also blur the lines between human and artificial connections, leading to dependency worries. Richards [76] reinforces these concerns with survey data showing user discomfort with AI’s handling of emotions and personal data. The respondents stressed the importance of transparency and user control. Developing ethical guidelines and frameworks, as well as implementing advanced security measures such as end-to-end encryption, is necessary to address these concerns. Furthermore, as these AI systems become more sophisticated in mimicking human conversation, patients using online text-based mental health services may experience heightened doubt about the authenticity of their interlocutor. This uncertainty can manifest as a persistent suspicion that they might be interacting with an AI rather than a human therapist, even when engaging with a real person. Such doubt could potentially undermine the therapeutic relationship, affecting the patient’s trust, openness, and overall treatment efficacy. Moreover, if patients become aware that some services use AI chatbots, they might extend this skepticism to all online mental health interactions, creating a broader trust issue in digital mental health services. This could be interpreted as an extension of the recently described Impostor Bias to mental health contexts [77], which further underscores the need for transparency in AI use and the importance of maintaining clear distinctions between human and AI-driven interactions in sensitive fields like mental health care.

5. Conclusions

The scoping review highlights the potential of AI chatbots in providing accessible and scalable mental health interventions. Chatbots have demonstrated effectiveness in improving mental well-being, addressing specific conditions like depression, anxiety, and substance use disorders, and facilitating preventive care and health promotion. However, challenges persist in terms of usability, engagement, and integration with existing healthcare systems. Tailoring chatbots to specific populations and contexts is crucial for enhancing their acceptability and impact. Personalization and adaptive capabilities enabled by advanced natural language processing and machine learning can further improve the therapeutic potential of AI chatbots. Future research should focus on large-scale randomized controlled trials, exploring the optimal integration of human and AI-based support, and addressing ethical, legal, and social implications. Overcoming these challenges will be essential for the widespread adoption and effective implementation of AI chatbots in mental healthcare settings.

Author Contributions

Conceptualization, M.C. and S.T.; methodology, M.C. and S.T.; validation, M.C., S.T. and L.G.; formal analysis, M.C.; investigation, M.C. and S.T.; data curation, M.C. and S.T.; writing—original draft preparation, M.C., S.T. and L.G.; writing—review and editing, M.C., L.G. and S.T.; visualization, M.C.; supervision, S.B. and P.C.; project administration, M.C., S.T., S.B. and P.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. World Health Organization. Mental Health. Available online: https://www.who.int/health-topics/mental-health (accessed on 13 May 2024).
2. Brunier, A.; WHO Media Team. WHO Report Highlights Global Shortfall in Investment in Mental Health. Available online: https://www.who.int/news/item/08-10-2021-who-report-highlights-global-shortfall-in-investment-in-mental-health (accessed on 13 May 2024).
3. Rising Demand for Mental Health Workers Globally. IHNA Blog. 2022. Available online: https://www.ihna.edu.au/blog/2022/09/rising-demand-for-mental-health-workers-globally/ (accessed on 13 May 2024).
4. Eurostat. Number of Psychiatrists: How Do Countries Compare? Available online: https://ec.europa.eu/eurostat/web/products-eurostat-news/-/DDN-20200506-1 (accessed on 13 May 2024).
5. McKenzie, K.; Patel, V.; Araya, R. Learning from Low Income Countries: Mental Health. BMJ 2004, 329, 1138–1140.
6. Mental Health in Developed vs. Developing Countries. Jacinto Convit World Organization. 2021. Available online: https://www.jacintoconvit.org/social-science-series-5-mental-health-in-developed-vs-developing-countries/ (accessed on 13 May 2024).
7. WHO Media Team. WHO Highlights Urgent Need to Transform Mental Health and Mental Health Care. Available online: https://www.who.int/news/item/17-06-2022-who-highlights-urgent-need-to-transform-mental-health-and-mental-health-care (accessed on 13 May 2024).
8. Hester, R.D. Lack of Access to Mental Health Services Contributing to the High Suicide Rates among Veterans. Int. J. Ment. Health Syst. 2017, 11, 47.
9. Kapur, N.; Gorman, L.S.; Quinlivan, L.; Webb, R.T. Mental Health Services: Quality, Safety and Suicide. BMJ Qual. Saf. 2022, 31, 419–422.
10. Caponnetto, P.; Milazzo, M. Cyber Health Psychology: The Use of New Technologies at the Service of Psychological Well Being and Health Empowerment. Health Psych. Res. 2019, 7, 8559.
11. Caponnetto, P.; Casu, M. Update on Cyber Health Psychology: Virtual Reality and Mobile Health Tools in Psychotherapy, Clinical Rehabilitation, and Addiction Treatment. Int. J. Environ. Res. Public Health 2022, 19, 3516.
12. Ancis, J.R. The Age of Cyberpsychology: An Overview. Technol. Mind Behav. 2020, 1.
13. Taylor, C.B.; Graham, A.K.; Flatt, R.E.; Waldherr, K.; Fitzsimmons-Craft, E.E. Current State of Scientific Evidence on Internet-Based Interventions for the Treatment of Depression, Anxiety, Eating Disorders and Substance Abuse: An Overview of Systematic Reviews and Meta-Analyses. Eur. J. Public Health 2021, 31, i3–i10.
14. Spiegel, B.M.R.; Liran, O.; Clark, A.; Samaan, J.S.; Khalil, C.; Chernoff, R.; Reddy, K.; Mehra, M. Feasibility of Combining Spatial Computing and AI for Mental Health Support in Anxiety and Depression. NPJ Digit. Med. 2024, 7, 22.
15. Li, J. Digital Technologies for Mental Health Improvements in the COVID-19 Pandemic: A Scoping Review. BMC Public Health 2023, 23, 413.
16. Viduani, A.; Cosenza, V.; Araújo, R.M.; Kieling, C. Chatbots in the Field of Mental Health: Challenges and Opportunities. In Digital Mental Health: A Practitioner’s Guide; Passos, I.C., Rabelo-da-Ponte, F.D., Kapczinski, F., Eds.; Springer International Publishing: Cham, Switzerland, 2023; pp. 133–148. ISBN 978-3-031-10698-9.
17. Song, I.; Pendse, S.R.; Kumar, N.; De Choudhury, M. The Typing Cure: Experiences with Large Language Model Chatbots for Mental Health Support. arXiv 2024, arXiv:2401.14362.
18. Abd-alrazaq, A.A.; Alajlani, M.; Alalwan, A.A.; Bewick, B.M.; Gardner, P.; Househ, M. An Overview of the Features of Chatbots in Mental Health: A Scoping Review. Int. J. Med. Inform. 2019, 132, 103978.
19. Moore, J.R.; Caudill, R. The Bot Will See You Now: A History and Review of Interactive Computerized Mental Health Programs. Psychiatr. Clin. N. Am. 2019, 42, 627–634.
20. Grand View Research. Mental Health Apps Market Size and Share Report, 2030. Available online: https://web.archive.org/web/20240705093326/https://www.grandviewresearch.com/industry-analysis/mental-health-apps-market-report (accessed on 13 May 2024).
21. Ceci, L. Topic: Meditation and Mental Wellness Apps. Available online: https://www.statista.com/topics/11045/meditation-and-mental-wellness-apps/ (accessed on 13 May 2024).
22. Adam, M.; Wessel, M.; Benlian, A. AI-Based Chatbots in Customer Service and Their Effects on User Compliance. Electron Mark. 2021, 31, 427–445.
23. Andre, D. What Is a Chatbot? All About AI: Golden Grove, Australia, 2023.
24. Adamopoulou, E.; Moussiades, L. An Overview of Chatbot Technology. In Proceedings of the Artificial Intelligence Applications and Innovations; Maglogiannis, I., Iliadis, L., Pimenidis, E., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 373–383.
25. Moilanen, J.; van Berkel, N.; Visuri, A.; Gadiraju, U.; van der Maden, W.; Hosio, S. Supporting Mental Health Self-Care Discovery through a Chatbot. Front. Digit. Health 2023, 5, 1034724.
26. Chakraborty, C.; Pal, S.; Bhattacharya, M.; Dash, S.; Lee, S.-S. Overview of Chatbots with Special Emphasis on Artificial Intelligence-Enabled ChatGPT in Medical Science. Front. Artif. Intell. 2023, 6, 1237704.
27. Zhong, W.; Luo, J.; Zhang, H. The Therapeutic Effectiveness of Artificial Intelligence-Based Chatbots in Alleviation of Depressive and Anxiety Symptoms in Short-Course Treatments: A Systematic Review and Meta-Analysis. J. Affect. Disord. 2024, 356, 459–469.
28. Zafar, F.; Fakhare Alam, L.; Vivas, R.R.; Wang, J.; Whei, S.J.; Mehmood, S.; Sadeghzadegan, A.; Lakkimsetti, M.; Nazir, Z. The Role of Artificial Intelligence in Identifying Depression and Anxiety: A Comprehensive Literature Review. Cureus 2024, 16, e56472.
29. Boucher, E.M.; Harake, N.R.; Ward, H.E.; Stoeckl, S.E.; Vargas, J.; Minkel, J.; Parks, A.C.; Zilca, R. Artificially Intelligent Chatbots in Digital Mental Health Interventions: A Review. Expert Rev. Med. Devices 2021, 18, 37–49.
30. Balcombe, L. AI Chatbots in Digital Mental Health. Informatics 2023, 10, 82.
31. Ali, B.; Ravi, V.; Bhushan, C.; Santhosh, M.G.; Shiva Shankar, O. Chatbot via Machine Learning and Deep Learning Hybrid. In Modern Approaches in Machine Learning and Cognitive Science: A Walkthrough; Latest Trends in AI; Gunjan, V.K., Zurada, J.M., Eds.; Springer International Publishing: Cham, Switzerland, 2021; Volume 2, pp. 255–265. ISBN 978-3-030-68291-0.
32. Azwary, F.; Indriani, F.; Nugrahadi, D.T. Question Answering System Berbasis Artificial Intelligence Markup Language Sebagai Media Informasi [Question Answering System Based on Artificial Intelligence Markup Language as an Information Medium]. KLIK—Kumpul. J. Ilmu Komput. 2016, 3, 48–60.
33. Abdul-Kader, S.A.; Woods, J. Survey on Chatbot Design Techniques in Speech Conversation Systems. Int. J. Adv. Comput. Sci. Appl. 2015, 6, 72–80.
34. Shevat, A. Designing Bots: Creating Conversational Experiences; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2017; ISBN 978-1-4919-7484-1.
35. Gupta, A.; Hathwar, D.; Vijayakumar, A. Introduction to AI Chatbots. Int. J. Eng. Res. Technol. 2020, 9, 255–258.
36. Trofymenko, O.; Prokop, Y.; Zadereyko, O.; Loginova, N. Classification of Chatbots. Syst. Technol. 2022, 2, 147–159.
37. Hussain, S.; Ameri Sianaki, O.; Ababneh, N. A Survey on Conversational Agents/Chatbots Classification and Design Techniques. In Proceedings of the Web, Artificial Intelligence and Network Applications; Barolli, L., Takizawa, M., Xhafa, F., Enokido, T., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 946–956.
38. Kuhail, M.A.; Alturki, N.; Alramlawi, S.; Alhejori, K. Interacting with Educational Chatbots: A Systematic Review. Educ. Inf. Technol. 2023, 28, 973–1018.
39. Labadze, L.; Grigolia, M.; Machaidze, L. Role of AI Chatbots in Education: Systematic Literature Review. Int. J. Educ. Technol. High. Educ. 2023, 20, 56.
40. Chen, Z.; Liu, B. Continuous Knowledge Learning in Chatbots. In Lifelong Machine Learning; Chen, Z., Liu, B., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 131–138. ISBN 978-3-031-01581-6.
41. Biesialska, M.; Biesialska, K.; Costa-jussà, M.R. Continual Lifelong Learning in Natural Language Processing: A Survey. In Proceedings of the 28th International Conference on Computational Linguistics, Barcelona, Spain, 8–13 December 2020; pp. 6523–6541.
42. Zhao, W.X.; Zhou, K.; Li, J.; Tang, T.; Wang, X.; Hou, Y.; Min, Y.; Zhang, B.; Zhang, J.; Dong, Z.; et al. A Survey of Large Language Models. arXiv 2023, arXiv:2303.18223.
43. Hoffmann, J.; Borgeaud, S.; Mensch, A.; Buchatskaya, E.; Cai, T.; Rutherford, E.; Casas, D.d.L.; Hendricks, L.A.; Welbl, J.; Clark, A.; et al. Training Compute-Optimal Large Language Models. arXiv 2022, arXiv:2203.15556.
44. Auma, D. Using Transfer Learning for Chatbots in Python. Medium. 2023. Available online: https://medium.com/@dianaauma2/using-transfer-learning-for-chatbots-in-python-93e937936ad8 (accessed on 13 May 2024).
45. Kulkarni, A.; Shivananda, A.; Kulkarni, A. Building a Chatbot Using Transfer Learning. In Natural Language Processing Projects: Build Next-Generation NLP Applications Using AI Techniques; Kulkarni, A., Shivananda, A., Kulkarni, A., Eds.; Apress: Berkeley, CA, USA, 2022; pp. 239–255. ISBN 978-1-4842-7386-9.
46. Pal, A.; Minervini, P.; Motzfeldt, A.G.; Gema, A.P.; Alex, B. Openlifescienceai/Open_medical_llm_leaderboard. 2024. Available online: https://huggingface.co/blog/leaderboard-medicalllm (accessed on 13 May 2024).
47. Jahan, I.; Laskar, M.T.R.; Peng, C.; Huang, J.X. A Comprehensive Evaluation of Large Language Models on Benchmark Biomedical Text Processing Tasks. Comput. Biol. Med. 2024, 171, 108189.
48. Wang, B.; Chen, W.; Pei, H.; Xie, C.; Kang, M.; Zhang, C.; Xu, C.; Xiong, Z.; Dutta, R.; Schaeffer, R.; et al. DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models. Adv. Neural Inf. Process. Syst. 2023, 36, 31232–31339.
49. Xu, X.; Yao, B.; Dong, Y.; Gabriel, S.; Yu, H.; Hendler, J.; Ghassemi, M.; Dey, A.K.; Wang, D. Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2024, 8, 1–32.
50. Jin, H.; Chen, S.; Wu, M.; Zhu, K.Q. PsyEval: A Comprehensive Large Language Model Evaluation Benchmark for Mental Health. arXiv 2023, arXiv:2311.09189.
51. Qi, H.; Zhao, Q.; Song, C.; Zhai, W.; Luo, D.; Liu, S.; Yu, Y.J.; Wang, F.; Zou, H.; Yang, B.X.; et al. Supervised Learning and Large Language Model Benchmarks on Mental Health Datasets: Cognitive Distortions and Suicidal Risks in Chinese Social Media. Res. Sq. 2023.
52. Obradovich, N.; Khalsa, S.S.; Khan, W.U.; Suh, J.; Perlis, R.H.; Ajilore, O.; Paulus, M.P. Opportunities and Risks of Large Language Models in Psychiatry. NPP—Digit. Psychiatry Neurosci. 2024, 2, 8.
53. Tricco, A.C.; Lillie, E.; Zarin, W.; O’Brien, K.K.; Colquhoun, H.; Levac, D.; Moher, D.; Peters, M.D.J.; Horsley, T.; Weeks, L.; et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and Explanation. Ann. Intern. Med. 2018, 169, 467–473.
54. Sterne, J.A.C.; Savović, J.; Page, M.J.; Elbers, R.G.; Blencowe, N.S.; Boutron, I.; Cates, C.J.; Cheng, H.-Y.; Corbett, M.S.; Eldridge, S.M.; et al. RoB 2: A Revised Tool for Assessing Risk of Bias in Randomised Trials. BMJ 2019, 366, l4898.
55. Higgins, J.P.T.; Morgan, R.L.; Rooney, A.A.; Taylor, K.W.; Thayer, K.A.; Silva, R.A.; Lemeris, C.; Akl, E.A.; Bateson, T.F.; Berkman, N.D.; et al. A Tool to Assess Risk of Bias in Non-Randomized Follow-up Studies of Exposure Effects (ROBINS-E). Environ. Int. 2024, 186, 108602.
56. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 Statement: An Updated Guideline for Reporting Systematic Reviews. BMJ 2021, 372, n71.
57. Haddaway, N.R.; Page, M.J.; Pritchard, C.C.; McGuinness, L.A. PRISMA2020: An R Package and Shiny App for Producing PRISMA 2020-compliant Flow Diagrams, with Interactivity for Optimised Digital Transparency and Open Synthesis. Campbell Syst. Rev. 2022, 18, e1230.
58. He, Y.; Yang, L.; Zhu, X.; Wu, B.; Zhang, S.; Qian, C.; Tian, T. Mental Health Chatbot for Young Adults with Depressive Symptoms during the COVID-19 Pandemic: Single-Blind, Three-Arm Randomized Controlled Trial. J. Med. Internet Res. 2022, 24, e40719.
59. Peuters, C.; Maenhout, L.; Cardon, G.; De Paepe, A.; DeSmet, A.; Lauwerier, E.; Leta, K.; Crombez, G. A Mobile Healthy Lifestyle Intervention to Promote Mental Health in Adolescence: A Mixed-Methods Evaluation. BMC Public Health 2024, 24, 44.
60. Yasukawa, S.; Tanaka, T.; Yamane, K.; Kano, R.; Sakata, M.; Noma, H.; Furukawa, T.A.; Kishimoto, T. A Chatbot to Improve Adherence to Internet-Based Cognitive–Behavioural Therapy among Workers with Subthreshold Depression: A Randomised Controlled Trial. BMJ Ment. Health 2024, 27, e300881.
61. Ogawa, M.; Oyama, G.; Morito, K.; Kobayashi, M.; Yamada, Y.; Shinkawa, K.; Kamo, H.; Hatano, T.; Hattori, N. Can AI Make People Happy? The Effect of AI-Based Chatbot on Smile and Speech in Parkinson’s Disease. Park. Relat. Disord. 2022, 99, 43–46.
62. Ulrich, S.; Gantenbein, A.R.; Zuber, V.; Von Wyl, A.; Kowatsch, T.; Künzli, H. Development and Evaluation of a Smartphone-Based Chatbot Coach to Facilitate a Balanced Lifestyle in Individuals with Headaches (BalanceUP App): Randomized Controlled Trial. J. Med. Internet Res. 2024, 26, e50132.
63. Vereschagin, M.; Wang, A.Y.; Richardson, C.G.; Xie, H.; Munthali, R.J.; Hudec, K.L.; Leung, C.; Wojcik, K.D.; Munro, L.; Halli, P.; et al. Effectiveness of the Minder Mobile Mental Health and Substance Use Intervention for University Students: Randomized Controlled Trial. J. Med. Internet Res. 2024, 26, e54287.
64. Prochaska, J.J.; Vogel, E.A.; Chieng, A.; Kendra, M.; Baiocchi, M.; Pajarito, S.; Robinson, A. A Therapeutic Relational Agent for Reducing Problematic Substance Use (Woebot): Development and Usability Study. J. Med. Internet Res. 2021, 23, e24850.
65. So, R.; Emura, N.; Okazaki, K.; Takeda, S.; Sunami, T.; Kitagawa, K.; Takebayashi, Y.; Furukawa, T.A. Guided versus Unguided Chatbot-Delivered Cognitive Behavioral Intervention for Individuals with Moderate-Risk and Problem Gambling: A Randomized Controlled Trial (GAMBOT2 Study). Addict. Behav. 2024, 149, 107889.
66. Olano-Espinosa, E.; Avila-Tomas, J.F.; Minue-Lorenzo, C.; Matilla-Pardo, B.; Serrano Serrano, M.E.; Martinez-Suberviola, F.J.; Gil-Conesa, M.; Del Cura-González, I.; Dejal@ Group. Effectiveness of a Conversational Chatbot (Dejal@bot) for the Adult Population to Quit Smoking: Pragmatic, Multicenter, Controlled, Randomized Clinical Trial in Primary Care. JMIR Mhealth Uhealth 2022, 10, e34273.
67. Fitzsimmons-Craft, E.E.; Chan, W.W.; Smith, A.C.; Firebaugh, M.; Fowler, L.A.; Topooco, N.; DePietro, B.; Wilfley, D.E.; Taylor, C.B.; Jacobson, N.C. Effectiveness of a Chatbot for Eating Disorders Prevention: A Randomized Clinical Trial. Int. J. Eat. Disord. 2022, 55, 343–353.
68. Cheah, M.H.; Gan, Y.N.; Altice, F.L.; Wickersham, J.A.; Shrestha, R.; Salleh, N.A.M.; Ng, K.S.; Azwa, I.; Balakrishnan, V.; Kamarulzaman, A.; et al. Testing the Feasibility and Acceptability of Using an Artificial Intelligence Chatbot to Promote HIV Testing and Pre-Exposure Prophylaxis in Malaysia: Mixed Methods Study. JMIR Hum. Factors 2024, 11, e52055.
69. Greer, S.; Ramo, D.; Chang, Y.-J.; Fu, M.; Moskowitz, J.; Haritatos, J. Use of the Chatbot “Vivibot” to Deliver Positive Psychology Skills and Promote Well-Being among Young People after Cancer Treatment: Randomized Controlled Feasibility Trial. JMIR Mhealth Uhealth 2019, 7, e15018.
70. Oh, J.; Jang, S.; Kim, H.; Kim, J.-J. Efficacy of Mobile App-Based Interactive Cognitive Behavioral Therapy Using a Chatbot for Panic Disorder. Int. J. Med. Inform. 2020, 140, 104171.
71. Bennion, M.R.; Hardy, G.E.; Moore, R.K.; Kellett, S.; Millings, A. Usability, Acceptability, and Effectiveness of Web-Based Conversational Agents to Facilitate Problem Solving in Older Adults: Controlled Study. J. Med. Internet Res. 2020, 22, e16794.
72. Thunström, A.O.; Carlsen, H.K.; Ali, L.; Larson, T.; Hellström, A.; Steingrimsson, S. Usability Comparison among Healthy Participants of an Anthropomorphic Digital Human and a Text-Based Chatbot as a Responder to Questions on Mental Health: Randomized Controlled Trial. JMIR Hum. Factors 2024, 11, e54581.
73. Lisetti, C. 10 Advantages of Using Avatars in Patient-Centered Computer-Based Interventions for Behavior Change. SIGHIT Rec. 2012, 2, 28.
74. Han, H.J.; Mendu, S.; Jaworski, B.K.; Owen, J.E.; Abdullah, S. Preliminary Evaluation of a Conversational Agent to Support Self-Management of Individuals Living with Posttraumatic Stress Disorder: Interview Study with Clinical Experts. JMIR Form. Res. 2023, 7, e45894.
75. Hudlicka, E. Chapter 4—Virtual Affective Agents and Therapeutic Games. In Artificial Intelligence in Behavioral and Mental Health Care; Luxton, D.D., Ed.; Academic Press: San Diego, CA, USA, 2016; pp. 81–115. ISBN 978-0-12-420248-1.
76. Richards, D.; Vythilingam, R.; Formosa, P. A Principlist-Based Study of the Ethical Design and Acceptability of Artificial Social Agents. Int. J. Hum.-Comput. Stud. 2023, 172, 102980.
77. Casu, M.; Guarnera, L.; Caponnetto, P.; Battiato, S. GenAI Mirage: The Impostor Bias and the Deepfake Detection Challenge in the Era of Artificial Illusions. Forensic Sci. Int. Digit. Investig. 2024, 50, 301795.
Figure 1. A schematic representation of the fine-tuning process for a GPT model in the medical domain (a brief illustrative code sketch follows this figure list).
Figure 2. PRISMA 2020 flow diagram [56] generated using Haddaway and colleagues’ online generator [57].
Figure 3. A flowchart summarizing the different areas of application for AI chatbots in mental health covered in the studies included in this review.
Figure 4. Risk-of-bias assessment of the included randomized trials using the Cochrane risk-of-bias tool for randomized trials, version 2 (RoB 2) [58,59,60,61,62,63,64,65,66,67,69,70,71,72].
Figure 5. Risk-of-bias assessment using the ROBINS-E tool for Cheah et al. [68].
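As a complement to Figure 1, the following is a minimal, hedged sketch of the kind of domain fine-tuning the figure depicts, written with the Hugging Face transformers and datasets libraries. The base checkpoint (gpt2), the corpus file medical_corpus.txt, and all hyperparameters are illustrative assumptions, not the configuration of any system covered in this review.

```python
# Illustrative sketch of continued next-token (causal LM) training of a
# GPT-style model on a medical-domain text corpus.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

base_model = "gpt2"  # stand-in for any GPT-style base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base_model)

# Hypothetical corpus: one medical-domain text example per line.
data = load_dataset("text", data_files={"train": "medical_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = data["train"].map(tokenize, batched=True, remove_columns=["text"])
# mlm=False makes the collator build labels for causal language modeling.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="gpt2-medical",
        num_train_epochs=1,
        per_device_train_batch_size=2,
        learning_rate=5e-5,
    ),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()  # continued pretraining specializes the model to the domain
```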
Table 1. Data extraction table summarizing key details from the studies included in this scoping review: name of the paper, year of publication, authors, study design, sample size, and main outcome(s) reported.

Name of the Paper | Year | Authors | Study Design | Sample (n) | Main Outcome
Mental Health Chatbot for Young Adults With Depressive Symptoms During the COVID-19 Pandemic: Single-Blind, Three-Arm Randomized Controlled Trial | 2022 | He et al. [58] | Randomized controlled trial | 148 | The AI chatbot XiaoE significantly reduced depressive symptoms in college students compared to control groups, with a moderate effect size post-intervention and a small effect size at 1-month follow-up.
A Mobile Healthy Lifestyle Intervention to Promote Mental Health in Adolescence: A Mixed-Methods Evaluation | 2024 | Peuters et al. [59] | Cluster-controlled trial with process evaluation interviews | 279 | Positive effects on physical activity, sleep quality, and positive moods. Engagement was a challenge, but users highlighted the importance of gamification, self-regulation techniques, and personalized information from the chatbot.
A Chatbot to Improve Adherence to Internet-Based Cognitive–Behavioural Therapy among Workers with Subthreshold Depression: A Randomised Controlled Trial | 2024 | Yasukawa et al. [60] | Randomized controlled trial | 142 | The addition of a chatbot sending personalized messages resulted in significantly higher iCBT completion rates compared to the control group, but both groups showed similar improvements in depression and anxiety symptoms.
Can AI Make People Happy? The Effect of AI-Based Chatbot on Smile and Speech in Parkinson’s Disease | 2022 | Ogawa et al. [61] | Open-label randomized study | 20 | The chatbot group exhibited increased smile parameters and reduced filler words in speech, suggesting improved facial expressivity and fluency, although clinical rating scales did not show significant differences.
Development and Evaluation of a Smartphone-Based Chatbot Coach to Facilitate a Balanced Lifestyle in Individuals With Headaches (BalanceUP App): Randomized Controlled Trial | 2024 | Ulrich et al. [62] | Randomized controlled trial | 198 | The BalanceUP app, utilizing a chat-based interface with predefined and free-text input options, significantly improved mental well-being for individuals with migraines.
Effectiveness of the Minder Mobile Mental Health and Substance Use Intervention for University Students: Randomized Controlled Trial | 2024 | Vereschagin et al. [63] | Randomized controlled trial | 1210 | The Minder app, integrating an AI chatbot delivering cognitive–behavioral therapy, reduced anxiety and depressive symptoms, improved mental well-being, and decreased the frequency of cannabis use and alcohol consumption among university students.
A Therapeutic Relational Agent for Reducing Problematic Substance Use (Woebot): Development and Usability Study | 2021 | Prochaska et al. [64] | 8-week usability study | 51 | The automated therapeutic intervention W-SUDs led to significant improvements in self-reported substance use, cravings, mental health outcomes, and confidence to resist urges.
Guided versus Unguided Chatbot-Delivered Cognitive Behavioral Intervention for Individuals with Moderate-Risk and Problem Gambling: A Randomized Controlled Trial (GAMBOT2 Study) | 2024 | So et al. [65] | Randomized controlled trial | 97 | Both groups (guided and unguided GAMBOT2 intervention) showed significant within-group improvements in gambling outcomes, with no significant between-group differences.
Effectiveness of a Conversational Chatbot (Dejal@bot) for the Adult Population to Quit Smoking: Pragmatic, Multicenter, Controlled, Randomized Clinical Trial in Primary Care | 2022 | Olano-Espinosa et al. [66] | Pragmatic, multicenter randomized controlled trial | 460 | The Dejal@bot chatbot intervention had a significantly higher 6-month continuous smoking abstinence rate (26.0%) compared to usual care (18.8%).
Effectiveness of a Chatbot for Eating Disorders Prevention: A Randomized Clinical Trial | 2022 | Fitzsimmons-Craft et al. [67] | Randomized controlled trial | 700 | The Tessa chatbot intervention resulted in significantly greater reductions in weight and shape concerns compared to the waitlist control group, with reduced odds of developing an eating disorder.
Testing the Feasibility and Acceptability of Using an Artificial Intelligence Chatbot to Promote HIV Testing and Pre-Exposure Prophylaxis in Malaysia: Mixed Methods Study | 2024 | Cheah et al. [68] | Beta testing mixed methods study | 14 | The chatbot was found to be feasible and acceptable, with participants rating it highly on quality, satisfaction, intention to continue using, and willingness to refer it to others.
Use of the Chatbot “Vivibot” to Deliver Positive Psychology Skills and Promote Well-Being Among Young People After Cancer Treatment: Randomized Controlled Feasibility Trial | 2019 | Greer et al. [69] | Pilot randomized controlled trial | 45 | The Vivibot chatbot group exhibited a trend toward greater reduction in anxiety symptoms compared to the control group, along with an increase in daily positive emotions among young cancer survivors.
Efficacy of Mobile App-Based Interactive Cognitive Behavioral Therapy Using a Chatbot for Panic Disorder | 2020 | Oh et al. [70] | Randomized controlled trial | 41 | The mobile app chatbot group showed significantly greater reductions in panic disorder severity measured by the Panic Disorder Severity Scale compared to the book group.
Usability, Acceptability, and Effectiveness of Web-Based Conversational Agents to Facilitate Problem Solving in Older Adults: Controlled Study | 2020 | Bennion et al. [71] | Study comparing two AI chatbots (MYLO and ELIZA) | 112 | Both chatbots enabled significant reductions in problem distress and depression/anxiety/stress, with MYLO showing greater reductions in problem distress at follow-up compared to ELIZA.
Usability Comparison Among Healthy Participants of an Anthropomorphic Digital Human and a Text-Based Chatbot as a Responder to Questions on Mental Health: Randomized Controlled Trial | 2024 | Thunström et al. [72] | Randomized controlled trial | 45 | The text-only chatbot interface had higher usability scores compared to the digital human interface. Emotional responses varied, with the digital human group reporting higher nervousness.
Table 2. Application areas of AI chatbots for mental health interventions, corresponding studies evaluated, and a summary of their strengths and flaws.

Application Area | Studies | Strengths | Flaws
Mental Health Support During COVID-19 | He et al. [58]; Peuters et al. [59] | Accessible, scalable interventions during the pandemic | Challenges with engagement, technical glitches
Interventions for Specific Health Conditions | Yasukawa et al. [60]; Ogawa et al. [61]; Ulrich et al. [62] | Personalized support, remote monitoring potential | Limited impact on clinical outcomes, usability issues
Addressing Substance Use and Addiction | Vereschagin et al. [63]; Prochaska et al. [64]; So et al. [65]; Olano-Espinosa et al. [66] | Accessible, scalable interventions, positive outcomes | Small effect sizes, need for more intensive therapist involvement
Preventive Care and Well-being | Fitzsimmons-Craft et al. [67]; Cheah et al. [68]; Greer et al. [69] | Promising for prevention, well-being promotion | Engagement challenges, need for larger trials
Panic Disorder Management | Oh et al. [70] | Accessible intervention, improvements in panic symptoms | Technical challenges, lower usability ratings
Health Promotion | Bennion et al. [71] | Potential for promoting healthy behaviors, privacy protection | Usability issues, need for cultural tailoring
Usability and Engagement | Thunström et al. [72] | Evaluation of different interface designs | Limited to healthy participants, varied emotional responses
The numbers in brackets refer to the respective studies in the reference list.
Table 3. Data extraction table summarizing the name of the chatbot, type of AI technology used, chatbot protocol, usability, and engagement.

Chatbot Name | Authors | AI Chatbot Technology | Chatbot Protocol | Usability | Engagement
XiaoE | He et al. [58] | NLP; DL; ML; Rule-Based System | Standalone | No statistically significant usability results | High
#LIFEGOALS | Peuters et al. [59] | NLP; ML | Multi-component mHealth intervention | N/A | Mid
EPO | Yasukawa et al. [60] | NLP; ML; Rule-Based System; Personalization Techniques | Chatbot and iCBT | Mid usability | High
N/A | Ogawa et al. [61] | NLP; ML (implied from functionality description) | Chatbot and neurologist consultations | N/A | N/A
BalanceUP | Ulrich et al. [62] | NLP; Rule-Based System; ML; Free-Text Input | Standalone | N/A | High
Minder | Vereschagin et al. [63] | Rule-Based System | Multi-component mHealth intervention | N/A | Low
Woebot | Prochaska et al. [64] | NLP; ML (Recurrent Neural Networks or Transformer Models); Sentiment Analysis; Emotion Detection; User Feedback Loops | Standalone | High usability | High
GAMBOT2 | So et al. [65] | NLP; ML; Reinforcement Learning | Chatbot and therapist guidance | N/A | N/A
Dejal@bot | Olano-Espinosa et al. [66] | Intelligent Dictionaries; Expert System; Bayesian System (Probabilistic Approach) | Standalone | N/A | N/A
Tessa | Fitzsimmons-Craft et al. [67] | Rule-Based System (Algorithm-Based) | Standalone | N/A | Mid
N/A | Cheah et al. [68] | Rule-Based System; NLP; ML | Chatbot and HIV prevention and testing services | High usability | N/A
Vivibot | Greer et al. [69] | Rule-Based System (Decision Tree Structure) | Standalone | High usability | High
N/A | Oh et al. [70] | NLP; ML (Supervised or Reinforcement Learning) | Standalone | Low usability | N/A
MYLO vs. ELIZA | Bennion et al. [71] | NLP (MYLO); ML (MYLO); Rule-Based System (ELIZA) | Standalone | MYLO showed better usability ratings than ELIZA | MYLO showed better engagement than ELIZA
BETSY | Thunström et al. [72] | NLP (Dialogflow); ML (Supervised Learning, inferred); Avatar and Voice Interaction; EEG Data Analysis | Standalone | Higher usability scores for the text-only chatbot than for the digital human interface | Human features elicited more social engagement
The numbers in brackets refer to the respective studies in the reference list. NLP = Natural Language Processing; ML = Machine Learning; DL = Deep Learning; iCBT = internet-based Cognitive Behavioural Therapy; N/A = Not Available.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
