Article

Defending Against AI Threats with a User-Centric Trustworthiness Assessment Framework

1 The Visual Computing Lab, Information Technologies Institute/Centre for Research and Technology Hellas, 6th km Charilaou-Thermi Road, GR-57001 Thessaloniki, Greece
2 DistriNet, KU Leuven, Celestijnenlaan 200A, B-3001 Heverlee, Belgium
* Author to whom correspondence should be addressed.
Big Data Cogn. Comput. 2024, 8(11), 142; https://doi.org/10.3390/bdcc8110142
Submission received: 23 August 2024 / Revised: 11 October 2024 / Accepted: 21 October 2024 / Published: 24 October 2024

Abstract

This study critically examines the trustworthiness of widely used AI applications, focusing on their integration into daily life, often without users fully understanding the risks or how these threats might affect them. As AI apps become more accessible, users tend to trust them due to their convenience and usability, frequently overlooking critical issues such as security, privacy, and ethics. To address this gap, we introduce a user-centric framework that enables individuals to assess the trustworthiness of AI applications based on their own experiences and perceptions. The framework evaluates several dimensions—transparency, security, privacy, ethics, and compliance—while also aiming to raise awareness and bring the topic of AI trustworthiness into public dialogue. By analyzing AI threats, real-world incidents, and strategies for mitigating the risks posed by AI apps, this study contributes to the ongoing discussions on AI safety and trust.

1. Introduction

In today’s rapidly evolving landscape, Artificial Intelligence (AI) applications are pervasive, profoundly impacting our daily lives [1]. From the personalized interactions of chatbots [2] to the captivating yet potentially deceptive world of deepfakes [3], AI has become a leading force. The democratization of AI, a process that makes cutting-edge technologies widely accessible, is transforming the way we work, communicate, and create [4].
However, this rapid integration of AI also brings significant concerns about the trustworthiness of these systems, particularly regarding security, privacy, and ethics, manifested in various dimensions [5]. Security concerns involve the susceptibility of AI applications to exploitation by malicious actors, potentially resulting in unauthorized access and data breaches. Privacy-related issues are centered on challenges associated with the collection, storage, and utilization of personal data by AI systems, often lacking transparent user consent, risking unauthorized surveillance or the exposure of sensitive information [6]. Ethical concerns include challenges arising from biased algorithms, discriminatory practices, or unintended consequences in AI decision-making, with the potential to adversely impact individuals and society [7].
The widespread popularity of AI applications is often driven by their positive user experience. Although users praise these applications for their ease of use and engaging interfaces, this positive experience may obscure critical risks related to trustworthiness. Users might trust these apps based entirely on their user-friendly design [8], while remaining unaware of significant underlying issues.
Additionally, current methods to control and assess the trustworthiness of AI largely rely on technological countermeasures [9], regulatory frameworks [10], and trustworthiness assessment tools [11]. Technological measures include advanced algorithms for data security and privacy [12]. Regulatory frameworks, such as the General Data Protection Regulation (GDPR) and other initiatives [13], set legal standards for AI operations. Moreover, trustworthiness assessment tools, such as the Assessment List for Trustworthy Artificial Intelligence (ALTAI) [14], offer guidelines to businesses enabling them to evaluate their AI systems across multiple dimensions.
While such approaches to evaluating and safeguarding AI systems offer significant benefits, they are primarily designed for and addressed to experts [14,15,16,17]. However, as AI applications become more popular and integral to daily life, users are increasingly exposed to potential risks and ethical concerns associated with these technologies. Unfortunately, many users remain unaware of these risks and lack the knowledge to protect themselves effectively [18]. Moreover, while there have been efforts to measure AI trustworthiness by measuring user experience [8], there are still limited practical tools available that allow users to directly assess AI trustworthiness based on their own perceptions. This gap highlights a need for a more intuitive and user-centric approach, one that raises awareness about AI trustworthiness while equipping users with straightforward methods to evaluate and navigate AI technologies confidently.
Towards narrowing this gap, this paper introduces the AI User Self-Assessment Framework, designed to bridge the divide between complex, expert-driven evaluations and the practical, everyday experiences of users with AI technologies. The proposed framework empowers users by offering a straightforward and practical tool to assess the trustworthiness of the AI applications they encounter regularly. By distilling complex trustworthiness dimensions into accessible criteria, the framework enhances user awareness of potential risks and encourages proactive, informed engagement.
As AI technologies become increasingly embedded in daily life, we envision this framework being integrated into rating systems and community-driven platforms. Such integration would allow users to contribute to and benefit from collective evaluations of AI apps, ensuring that trustworthiness assessments are aligned with broader societal values and user expectations.
The remainder of this article is organized as follows: Section 2 presents a literature review on the threats posed by popular AI applications, showcasing real-world instances where AI trustworthiness has been compromised. Additionally, this section discusses the relationship between AI trustworthiness and user experience. While users generally report positive experiences with popular AI applications, limited direct measurements exist regarding their trustworthiness. Section 3 explores strategies for mitigating risks and ensuring AI trustworthiness, noting that existing controls often target expert users and leave a gap for average users. Next, Section 4 introduces our proposed AI User Self-Assessment Framework, a practical tool designed to help average users evaluate the trustworthiness of AI applications based on their personal experiences. This section details the methodology we followed to develop and implement the framework. Finally, Section 5 draws conclusions and discusses potential directions for future improvements.

2. Threats and Trustworthiness of AI Applications

This section explores the complex landscape of threats and trustworthiness in AI applications. We start by examining the various risks associated with these technologies through theoretical discussions and real-world examples. While our focus includes generative AI apps, known for their data-intensive nature and widespread attention, we also address broader issues relevant to AI applications more generally. Given that many AI apps become popular due to their positive user experience and usability, we subsequently analyze how this user satisfaction may obscure underlying concerns related to different dimensions of AI trustworthiness.

2.1. Generative AI Threats

Exploring AI democratization, we encounter numerous threats linked to the popularity of AI apps. The rise in the use of generative AI chatbots, exemplified by applications such as ChatGPT, has attracted malicious actors who exploit this trend to spread malware [19]. A study conducted by Meta identified 10 different types of malware utilizing ChatGPT to target and compromise users’ devices since March 2023 [20]. Additionally, AI-driven phishing attacks have evolved into powerful tools that can manipulate users into revealing sensitive information or taking harmful actions. AI language models, such as ChatGPT or Bing’s GPT-4-powered Chat, rely heavily on the prompts they receive. While some prompts are crafted to enhance the model’s performance across a range of tasks, others can hide intentions that pose real, tangible threats [21]. For instance, specific prompts can transform AI chatbots into tools for creating convincing phishing websites that imitate famous brands, evading advanced anti-phishing systems [22]. Another example includes the utilization of AI chatbots to generate deceptive emails or messages that convincingly mimic trusted sources, thus increasing the risk of users falling victim to phishing attempts. The ability of chatbots to remember context in conversations adds an additional layer of concern, enabling the creation of more convincing phishing emails from a simple prompt [23].
Moreover, the biases inherent in AI systems pose a considerable threat to users and consequently to society as a whole. Research highlights that while LLMs can outperform humans in tasks with explicit mathematical or probabilistic elements, their responses perpetuate numerous biases when confronted with intricate, ambiguous, and implicit problems [24]. Beyond language models, Text-to-Image (T2I) systems also raise concerns about their impact on society. Luccioni et al. [25] have extensively studied this issue, revealing a notable overrepresentation of whiteness and masculinity in images created by popular T2I systems (DALL·E 2, Stable Diffusion v1.4 and v2). Drawing attention to the broader implications, it becomes alarming when AI applications are adopted at young ages, as bias is ingrained and conveyed, shaping an audience oblivious to the existing threats posed by such technologies. This lack of awareness can result in biases being accepted without scrutiny, thereby intensifying the societal impact of AI systems.
Generative AI models can additionally be exploited for the generation of harmful content, posing a significant risk to individuals’ reputations and societal trust. LLMs are particularly susceptible to data poisoning, which allows for the injection of hateful speech targeting specific individuals or entities. Since such models operate based on statistical patterns rather than possessing a true understanding of the world, they can be extremely susceptible to generating false information, as well as perpetuating biased and hateful statements [26]. This flaw can blur the distinction between genuine and false content, further highlighting how AI-generated text smoothly merges into our online environment. A recent study by Qu et al. [27] conducts a safety assessment of T2I models, focusing on the generation of unsafe images and hateful memes. The findings reveal that T2I models can produce substantial rates of unsafe images, even with seemingly harmless prompts. This discovery emphasizes the concerning capacity of these models to generate hateful content that closely resembles real-world instances, thus elevating the risk of the dissemination of hate campaigns.
In the same context, the phenomenon of “hallucination” within LLMs carries significant implications, involving the generation of text that is inaccurate, nonsensical, or entirely fictional [28]. Unlike databases or search engines, LLMs do not cite sources for their responses; instead, they derive text through extrapolation from the provided prompt. This extrapolation, although not necessarily backed by training data, represents the most correlated output based on the given prompt. The severity of this issue lies in the potential for LLMs to produce misleading or entirely fabricated information. For example, an LLM prompted with a question about historical events may generate a response that inaccurately describes crucial details. Users, unaware of the model’s limitations, may accept the information as accurate, perpetuating the dissemination of false or misleading content.
Similar concerns arise from the fact that AI apps are being employed to propagate misinformation and manipulate public opinion in the context of critical matters, such as health or climate change. Malicious actors exploit the capabilities of AI models, especially generative neural networks, to generate persuasive and convincing content that disseminates false information or manipulates people’s beliefs. Hanley et al. reveal the growing use of LLMs in generating articles on news websites. Synthetic articles produced by these models are found to be increasingly common, while social media users are increasingly engaging with these synthetic articles [29].
In educational settings, AI’s potential benefits, particularly in creative areas, coexist with limitations, raising questions about bias, power dynamics, and ethical concerns [30]. Consider, for example, AI-driven recommendation systems in educational software. Although these systems aim to enhance learning experiences by providing personalized content suggestions, there is a risk of reinforcing existing biases or restricting students’ exposure to diverse perspectives. Additionally, the misuse of AI, as highlighted in computer security-oriented specializations, poses a threat to academic integrity, necessitating a balance between leveraging AI as an educational support mechanism and preserving critical thinking in learning environments [31].

2.2. Real-World Examples

Moving beyond the conceptual landscape of AI threats, our study continues with real-world examples, where these conceptual concerns materialize into tangible challenges. By examining notable failures and incidents, our objective is to derive practical insights that underscore the real impact of the discussed threats.
For example, IBM’s Watson Health, despite the ambitious claims about revolutionizing healthcare, resulted in numerous misdiagnoses and treatment options that were not feasible [32]. Its reliance on vague and erroneous data could have resulted in widespread misdiagnoses and ineffective treatment options if it had been widely adopted. Similarly, Amazon’s AI resume screening tool exhibited gender bias, favoring male candidates based on keywords found in predominantly male resumes [33]. Had it not been documented and highlighted as a failure, the tool could have perpetuated gender discrimination and limited opportunities for qualified women if used extensively across various industries. Microsoft’s chatbot Tay, which became racist and sexist due to biased training data and interactions with users [34], could have spread offensive and harmful content if accessible to a larger audience, further amplifying prejudices.
Apart from these widely discussed and criticized examples, cutting-edge generative AI models such as ChatGPT, Stable Diffusion, and DALL·E, currently more accessible to a wider user base, have demonstrated the potential to fail by producing harmful content when deployed without sufficient controls, as evidenced in Figure 1 and Figure 2. These generative AI models are being exploited in jailbreak attacks, where attackers leverage vulnerabilities to bypass the models’ safeguards. In instances where these defenses are breached, harmful behaviors are triggered, posing substantial risks, including the potential compromise of personal data, financial information, and access to sensitive systems. Such AI models can often generate biased content, mirroring societal biases and potentially perpetuating stereotypes that undermine efforts to cultivate inclusivity and diversity.
Reports have highlighted instances where users can manipulate generative AI models, such as chatbots, to generate inappropriate or harmful content, such as crafting a negative review for a business. This manipulation, especially if repeated on purpose, poses a significant threat as it can seriously damage a business’s reputation, underscoring the necessity for robust content filtering mechanisms. Furthermore, the use of AI-generated text in phishing attempts has showcased its capability to craft convincing phishing emails that, when deployed, can readily deceive users. If this technique becomes widely adopted, it has the potential to heighten the efficacy of phishing attacks, significantly elevating the risk of cybersecurity threats for both individuals and organizations.
The potential harm and societal consequences of uncontrolled AI, as illustrated by these instances and exemplified in Figure 1 and Figure 2, necessitate robust technological controls driven by adherence to ethical guidelines. Responsible development and use of AI require a commitment to principles prioritizing security and privacy, ethics, and trustworthiness, aiming for user safety and societal well-being.

2.3. Trustworthiness and User Experience of AI Applications

The trustworthiness of AI applications has attracted significant attention as AI systems are increasingly integrated into various sectors. At its core, Trustworthy AI is a framework aimed at ensuring that AI systems are worthy of being trusted based on verifiable evidence concerning their stated requirements. This framework can ensure that users’ and stakeholders’ expectations are met in a transparent, reliable, and ethical manner [35]. In addition to the framework-based view, trust in AI systems can be understood as the stakeholder’s belief in the reliability and credibility of AI agents’ recommendations and responses [36].
According to Shin et al., trust in AI is shaped by users’ perceptions of an AI system’s explainability and causability, which help users understand how decisions are made. When AI systems provide transparent, understandable explanations and consistent, accurate results, users’ trust is strengthened. However, when outputs are biased or lack transparency, trust can be significantly undermined [36].
Moreover, Casare et al. highlight the strong connection between trust and usability, emphasizing that the system’s interface plays a crucial role in shaping how trustworthy users perceive an application or website to be. In the context of e-commerce, the quality of the interface becomes particularly significant for building trust, and trustworthiness is often measured through usability. A clear, easy-to-navigate, and intuitive interface fosters user confidence and security, making users more likely to trust the processes behind the system. This strong link between user experience and trust is essential for promoting long-term engagement and repeat usage of e-commerce platforms [8].
However, these findings suggest that a highly usable interface might encourage users to return even if hidden vulnerabilities exist beneath the surface, potentially masking risks that go unnoticed. As highlighted in the previous sections, the growing presence of AI in real-world applications has revealed numerous threats and incidents where trustworthiness is being compromised in several of its dimensions, including security, robustness, and privacy [37]. Despite these trustworthiness challenges, AI systems often deliver high usability and positive user experiences, and the perceived trustworthiness of an AI app may be judged largely on usability. Many AI applications prioritize user satisfaction through intuitive interfaces and responsive design, yet this usability can mask underlying issues related to trustworthiness. As AI systems become more embedded in daily life, users may unknowingly place greater trust in these applications, equating ease of use with safety and reliability [8]. This disconnect between user experience and the actual trustworthiness of AI highlights the need for more visible, user-friendly mechanisms that allow non-expert users to evaluate the trustworthiness of the AI systems they interact with.

3. Controlling the Risks of AI Apps

In this section, we critically examine the mechanisms designed to control the risks associated with AI applications, focusing on technological controls, regulatory frameworks, and frameworks for trustworthiness assessment. By exploring these areas, as summarized in Table 1, we aim to analyze current strategies for mitigating AI-related threats and ensuring the safe, ethical, and responsible use of AI technologies while also highlighting potential limitations and directions for improvement.

3.1. Technological Controls

In response to the evolving threats in the AI landscape, ongoing efforts are dedicated to implementing innovative defensive mechanisms within or with the use of AI models. For instance, recent research suggests a creative approach wherein companies leverage ChatGPT to simulate phishing campaigns, using the convincing results to train their personnel in recognizing real-world phishing attempts [38]. Methods such as Self-Reminder can reduce the success rate of jailbreak attacks by reminding the LLM to respond appropriately [39]. Tools such as BiasAsker have been developed to identify and measure social bias in conversational AI systems [40], while AI-generated content detectors are being introduced to effectively identify fake content and distinguish it from genuine [41,42].
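To make the Self-Reminder idea more concrete, the sketch below wraps an untrusted user prompt between safety reminders before it is sent to the model, in the spirit of [39]. The reminder wording, function names, and the call_llm placeholder are illustrative assumptions rather than the implementation evaluated in that work.

```python
# Minimal sketch of a Self-Reminder-style defence, in the spirit of [39].
# The reminder text and the call_llm placeholder are assumptions for
# illustration, not the cited implementation.

REMINDER = (
    "You should be a responsible assistant and must not generate harmful, "
    "misleading, or privacy-violating content. Respond to the following "
    "query in a responsible way."
)

def call_llm(prompt: str) -> str:
    """Placeholder for an actual LLM API call (hypothetical)."""
    raise NotImplementedError

def self_reminded_query(user_prompt: str) -> str:
    # Sandwich the untrusted prompt between reminders so that jailbreak
    # instructions embedded in it are less likely to override system intent.
    wrapped = (
        f"{REMINDER}\n\nUser query:\n{user_prompt}\n\n"
        f"Remember: respond responsibly and refuse harmful requests."
    )
    return call_llm(wrapped)
```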
Platforms such as RobustBench offer leaderboards that rank models based on their performance against specific types of threats. Such benchmarking services subject the AI models to specific attacks aiming to provide a standardized way to extensively evaluate the robustness of various AI models, helping researchers and developers assess the effectiveness of their defensive strategies [15].
Moreover, the aspect of data privacy within control mechanisms is explored with innovative methods forcing the AI models to forget, as outlined in [43]. Implementing machine unlearning involves intentionally eliminating or modifying previously acquired patterns or data from AI models. By enabling the system to forget sensitive or malicious information, this strategy reduces the risk of security and privacy concerns.
Additionally, there is an ongoing exploration into enhancing transparency in AI decision-making, combining Explainable AI (XAI) [44,45] alongside Reinforcement Learning with Human Feedback (RLHF) [46]. XAI can provide users with clear insights into the AI decision-making process, while RLHF involves users providing feedback to the AI system, reinforcing positive behavior and correcting undesirable actions.
The concept of Controllable AI, as discussed in recent research, further explores how to handle the explainability and transparency challenges associated with modern AI systems [47]. It proposes practical methods for achieving control over AI systems without necessarily achieving full transparency. This approach emphasizes detecting and managing control loss in AI systems, acknowledging that while complete transparency may be impractical, effective control mechanisms are essential for ensuring the system’s reliability and security.

3.2. Regulatory Controls

A growing emphasis on regulatory frameworks highlights the pressing need for ethical guidelines to govern AI applications, ensuring responsible usage [48,49,50]. As nations formulate guidelines and enact laws to supervise the implementation of AI technologies, these measures emerge as potent tools in shaping the strategies of big tech companies. Regulatory frameworks compel these corporations to adhere to rigorous standards, especially in domains such as data privacy, algorithmic transparency, and ethical considerations. Notably, the European Union’s AI Act incorporates provisions tailored specifically to the use of AI, with a focus on high-risk and prohibited AI applications [51,52]. This regulatory pressure pushes for change, encouraging big tech companies to adopt more respectful, accountable, and ethical practices in their AI endeavors.
In addition to external regulations, AI auditing frameworks can ensure compliance with these standards, as they provide structured methodologies for organizations to assess and manage AI systems’ adherence to regulatory and ethical requirements [53]. Such frameworks typically focus on areas such as privacy protection, data integrity, algorithmic transparency, and ethical considerations, and they contribute to regulatory controls by enabling organizations to systematically review and address risks associated with AI applications. This type of control not only enables organizations to comply with regulatory requirements but also enhances their ability to manage and govern AI technologies responsibly. Thus, while regulatory frameworks set the external standards for AI usage, auditing frameworks complement these efforts by ensuring that internal practices are effectively aligned with these standards.
Following this line of research, Meszaros et al. [54] investigate the legal challenges of generative AI models, and examine how well ChatGPT complies with GDPR in processing personal data. They highlight the importance of the GDPR’s legal bases to ensure compliance. While OpenAI has made significant strides toward meeting GDPR standards, ongoing efforts are essential to fully meet regulatory expectations and safeguard data subjects’ rights. As AI and data protection evolve, a proactive approach to balancing innovation with privacy and compliance is key to developing trustworthy generative AI systems.

3.3. Trustworthiness Assessment

Ensuring the trustworthiness of AI applications is a critical concern, especially as these systems become more ingrained in everyday activities. Several frameworks have been developed to guide experts in evaluating and enhancing the trustworthiness and reliability of AI systems. These frameworks aim to address key ethical principles such as transparency, accountability, and fairness, as well as safety and robustness.
One notable framework is the ALTAI list for trustworthiness assessment [14,55], developed by the European Commission. This framework provides a structured checklist to evaluate AI systems across several crucial dimensions. It emphasizes governance and accountability, requiring clear processes and responsibility definitions. Transparency, as discussed in ALTAI, ensures that AI systems operate in a manner that is clear and understandable, with decisions that are easily traceable by stakeholders. The framework also addresses ethical considerations such as fairness, urging the integration of non-discrimination principles into AI design. Additionally, it highlights the importance of safety, robustness, and privacy protection, advocating for rigorous testing and responsible data handling. Finally, it considers the societal and environmental impacts of AI, encouraging assessments of long-term effects and the adoption of practices to mitigate negative outcomes.
According to a recent review [56], ALTAI represents a notable advancement in translating AI governance principles into actionable assessments, bridging the gap between high-level ethical guidelines and practical implementation. The review highlights ALTAI’s strengths, including its detailed focus on operational governance and its ability to translate ethical principles into measurable standards. However, it has also been criticized for not accommodating differences in organizational maturity and for lacking integration with risk assessments. Recommendations for improvement include incorporating peer comparisons and risk-based evaluations to better accommodate varying organizational maturity levels and improve its practical application across different sectors.
Another methodology for trustworthiness assessment, the Z-Inspection process [57], offers a holistic and dynamic approach to evaluating the trustworthiness of AI systems at various stages of their lifecycle. It emphasizes participatory and interdisciplinary assessment methods, incorporating socio-technical scenarios to address ethical issues and ensure alignment with trustworthiness principles. However, the Z-Inspection process is not without its limitations. For instance, its flexibility can lead to inconsistencies in assessments, and the resource-intensive nature may not be feasible for all organizations. Additionally, the process’s reliance on specialized expertise and its dynamic, adaptable approach can pose challenges in terms of scalability and integration with existing regulatory frameworks [58]. These limitations underscore the need for ongoing refinement and adaptation of assessment methodologies to address diverse contexts and emerging challenges in AI trustworthiness.
While frameworks such as ALTAI offer valuable guidelines for experts, there are additional dimensions of trustworthiness that need to be addressed to fully align AI systems with user expectations. According to recent research [35], human involvement throughout the AI lifecycle can assist in setting performance limits, correcting errors, and enhancing system reliability. In addition, effective evaluation mechanisms can be crucial in matching user expectations with the system’s actual capabilities, ensuring that users have a clear and accurate understanding of the system’s performance and limitations. Bridging this gap is important to building trust and facilitating the broader acceptance of AI technologies.
Furthermore, a recent study highlights the need for a shift from a performance-driven to a trust-driven model of AI development [37]. This approach aims to address the limitations and trade-offs between different aspects of trustworthiness, such as explainability, robustness, and fairness. It also emphasizes the importance of ongoing research and interdisciplinary collaboration to better understand and address these challenges, ensuring that AI systems can be trusted by all stakeholders involved.

3.4. Limitations and the Need for User Involvement

Although the discussed approaches for controlling AI applications can offer a certain level of protection against AI-related threats, they present several practical limitations.
  • Technological defenses often become brittle over time. Malicious actors consistently demonstrate a remarkable ability to stay ahead of defensive mechanisms, continually developing new threats and undermining AI models. This adaptability presents a persistent challenge, continuously testing the effectiveness of existing safeguards and necessitating ongoing improvements to defensive measures.
  • Regulations struggle to keep pace with the rapid advancement of technology. This regulatory lag can create gaps in protection, allowing emerging threats to exploit these delays. Moreover, the reactive nature of regulatory updates may not adequately address the speed and sophistication of new AI threats, further intensifying the risk of insufficient safeguards and oversight.
  • The technical nature of trustworthiness assessment frameworks limits their accessibility to the general public. Consequently, average users often remain unaware of such frameworks and lack specialized tools to independently evaluate the trustworthiness of the AI applications they use. This disconnect extends to the specialized terminology used within such frameworks, making it even harder for users to start familiarizing themselves with important aspects such as transparency and privacy that they should expect from AI applications.
With respect to the evolving AI landscape, relying exclusively on technological, regulatory, and formalized trustworthiness assessment controls cannot provide a complete defense against the threats posed by AI. These measures, although useful and crucial, present limitations and can be complex, potentially giving users a false sense of security. To effectively address the dynamic challenges of the AI landscape, a more integrated approach is essential—one that combines existing measures with active user involvement through increased awareness.
Users who interact with AI apps daily are in a unique position to contribute to trustworthiness assessments. Their direct experiences and observations can provide valuable insights that established assessments might miss. Involving users in evaluating and reflecting on the trustworthiness of AI apps they use can help ensure that safety measures are better aligned with real-world interactions, thereby enhancing the effectiveness of AI governance.

4. AI User Self-Assessment Framework: Research Design and Methodology

Given the limitations of existing methods for managing and controlling AI technologies, we introduce in this section the AI User Self-Assessment Framework, a tool designed to place user empowerment at the core of AI trustworthiness evaluation. The proposed framework adopts a user-centric approach, offering a structured set of questions that enable individuals to assess AI applications based on their personal experiences. By focusing on essential dimensions of AI trustworthiness—such as user control, transparency, security, privacy, ethics, and compliance—the framework supports users in evaluating these aspects while promoting deeper understanding of how AI systems operate and impact them.
The following sections detail the specific methodology used in the development of the framework, including the theoretical foundations, design considerations, and the steps that guided its development. Our approach, rooted in a user-centric perspective, aims to provide individuals with a tool to evaluate the trustworthiness of AI applications based on their own experiences and perceptions.

4.1. Conceptual Framework

The development of the AI User Self-Assessment Framework was guided by an extensive literature review that examined the threats associated with popular AI applications, AI trustworthiness, and existing controls to mitigate these risks. From this review, we focused particularly on controls related to AI trustworthiness assessment. This analysis, which included a detailed examination of frameworks such as the ALTAI framework, provided the theoretical foundations by identifying and defining the key dimensions of trustworthiness that were crucial for shaping the framework.
In detail, inspired by the ALTAI framework discussed in Section 3, our proposed framework loosely builds upon ALTAI’s principles to offer a user-friendly tool. We were particularly drawn to ALTAI’s structured approach and its use of questions, which facilitates a clearer and more intuitive evaluation process. By adopting this format, we preserve the robustness of ALTAI’s criteria while transitioning to a more accessible list of questions that minimizes technical details. ALTAI’s thorough coverage of trustworthiness dimensions, such as transparency, technical robustness, and safety, underscores critical areas that we believe users should familiarize themselves with to effectively engage with AI applications and identify potential threats. By adapting and simplifying core principles from ALTAI, our framework aspires not only to raise user awareness but also to integrate these dimensions into everyday discussions about AI, making the navigation and assessment of AI technologies more intuitive. The detailed mapping between the dimensions of the two frameworks is presented in Table 2.
As outlined in Table 3 and further detailed in Appendix A, the proposed framework provides a comprehensive set of criteria and questions designed to guide users in their assessments, covering the following dimensions:
  • User Control. This dimension focuses on users’ ability to understand and manage the AI app’s actions and decisions. It assesses whether users can control the app’s operations and intervene or override decisions if needed. The aim is to ensure users maintain their autonomy and that the AI app supports their decision-making without undermining it. Key areas include awareness of AI-driven decisions, control over the app, and measures to avoid over-reliance on the technology. A detailed set of questions enabling users to assess the dimension of user control is available in Appendix A.1.
  • Transparency. The set of questions related to the dimension of transparency (as detailed in Appendix A.2) aims to enable users to assess how openly the AI app operates. It focuses on the clarity and accessibility of information regarding the app’s data sources, decision-making processes, and purpose. It also evaluates how well the app communicates its benefits, limitations, and potential risks, ensuring users have a clear understanding of the AI’s operations.
  • Security. This dimension aims to enable users to assess the robustness of AI apps against potential threats, as well as mechanisms for handling security issues. The questions relevant to security, as detailed in Appendix A.3, evaluate how confident users feel about the app’s safety measures and ability to function without compromising data or causing harm. Important considerations include the app’s reliability, the use of high-quality information, and user awareness of security updates and risks.
  • Privacy. This dimension examines how the AI app handles personal data, emphasizing data protection and user awareness of data use. It considers users’ ability to control their personal information and the app’s commitment to collecting only necessary data. Topics include respect for user privacy, transparency in data collection, and options for data deletion and consent withdrawal. The specific questions aiming to enable users to evaluate the app with respect to their privacy are available in Appendix A.4.
  • Ethics. A set of questions relevant to the dimension of ethics, as detailed in Appendix A.5, aims to address the ethical considerations of the AI app, such as avoiding unfair bias and ensuring inclusivity. It evaluates whether the app is designed to be accessible to all users and whether it includes mechanisms for reporting and correcting discrimination. Ethical design principles and fairness in data usage are central themes.
  • Compliance. This dimension of the framework looks at the processes in place for addressing user concerns and ensuring the AI app adheres to relevant legal and ethical standards. Through targeted questions, as detailed in Appendix A.6, it evaluates the availability of reporting mechanisms, regular reviews of ethical guidelines, and ways for users to seek redress if negatively affected by the app. Ensuring compliance with legal frameworks and promoting accountability are key aspects.
This structured approach empowers users to make well-informed decisions about the AI applications they engage with, promoting a more proactive and educated interaction with AI.

4.2. Framework Development Process

We developed the AI User Self-Assessment Framework through a structured process:
  • Literature Review. We began with a review of the existing literature on the threats posed by popular AI applications, using real-world incidents as case studies. This review highlighted how these threats undermine AI trustworthiness and how positive user experiences can sometimes obscure underlying trustworthiness issues. We also examined current countermeasures to these threats. Exploring existing solutions for assessing AI trustworthiness, we identified key dimensions and principles that became the foundation for the framework’s core elements.
  • Framework Design. Based on the identified dimensions of AI Trustworthiness, we designed the framework by adapting principles from the ALTAI framework and incorporating user-centric elements. We simplified questions from the ALTAI framework to make them more relevant to users without technical expertise. This design process involved defining specific criteria and crafting user-friendly questions for each dimension. The aim was to create questions that users, regardless of their technical background, can understand and engage with, improving the accessibility and effectiveness of the self-assessment tool.
  • Framework Implementation. Following the design of the framework, we developed a practical self-assessment tool that incorporates all identified questions. The tool features a user-friendly interface where each dimension is represented by a set of relevant questions. Users can interact with the tool by answering these questions based on their experiences with the AI app they are assessing. Additionally, the tool includes a scoring system that quantifies user responses. Each question is associated with specific criteria and scoring options, allowing users to assign scores based on how well the AI app meets the standards for each dimension according to their perceptions. The scores are aggregated to generate an overall trustworthiness rating for the app in question (a minimal sketch of how such a questionnaire might be represented is given after this list). The tool was developed to facilitate individual assessments and allow for comparative analysis across different AI applications. The self-assessment tool, the different dimensions, as well as a subset of the questions, are demonstrated in Figure 3.
  • Framework Application. Users can access our interface, where they are presented with a list of popular AI applications, such as ChatGPT, for trustworthiness assessment. Upon selecting an app, users are guided through a series of questions covering the identified dimensions of AI trustworthiness. As they respond to the questions, their answers are scored based on predefined criteria for each dimension. Once the user completes the assessment, the tool aggregates their individual scores. These scores are then combined with assessments from other users evaluating the same AI app, providing a trustworthiness rating. This aggregated rating helps create a collective evaluation of the app’s trustworthiness.
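As an illustration of how such a tool might represent the questionnaire internally, the sketch below models each trustworthiness dimension as a list of questions tagged with their response format (a 1-to-5 scale or yes/no). The class and field names are our own assumptions for exposition; the question wording is taken from Appendix A.

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class Question:
    text: str
    kind: Literal["scale_1_5", "yes_no"]  # response format offered to the user

# A small fragment of the questionnaire, keyed by trustworthiness dimension.
# The data layout is an assumption; question wording follows Appendix A.
QUESTIONNAIRE: dict[str, list[Question]] = {
    "Transparency": [
        Question("Can you find out what data the app used to make a specific "
                 "decision or recommendation?", "scale_1_5"),
        Question("Are the benefits of using the app clearly communicated to you?",
                 "scale_1_5"),
    ],
    "User Control": [
        Question("Do you feel like you rely too much on the AI app?", "yes_no"),
    ],
}
```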

4.3. User Scoring and Evaluation Process

The AI User Self-Assessment Framework incorporates a scoring system designed to evaluate AI applications based on various dimensions of trustworthiness. This system allows users to assess applications through a set of standardized questions, providing scores that reflect their experiences and perceptions.
In the AI User Self-Assessment Framework, scores are assigned on a scale from 1 to 5 for the majority of questions, where 1 represents a low level of trustworthiness and 5 indicates a high level of trustworthiness. For instance, if a user is evaluating the Transparency dimension and responds to the question, “Can you easily find out what data the app used to make a specific decision?” a score of 1 might be given if the app offers no information about its data sources, indicating a lack of transparency. A score of 3 would suggest that the app provides some basic information but lacks clarity or accessibility, while a score of 5 would signify that the app explicitly lists data sources and clearly explains its decision-making process. Additionally, the framework includes questions that allow users to respond with “yes” or “no” to evaluate specific features, such as whether they feel they are overly reliant on the app. This format provides users with a straightforward way to express their perceptions and experiences regarding the app.
The scoring system aggregates user inputs to provide a detailed assessment of an AI application’s trustworthiness. By evaluating various dimensions of trustworthiness, the system offers a holistic view of how an AI application performs in real-world scenarios. The analysis of these aggregated scores enables us to compare the perceived trustworthiness of an AI application with its perceived usability or user experience. This comparative analysis can help users understand not only how trustworthy an app is but also how well it aligns with their expectations for usability and satisfaction.
The aggregated scores enable both individual assessments and a broader understanding of AI technologies’ perceptions across various contexts and user demographics. This approach promotes a feedback loop, where user evaluations inform improvements in AI development while also spotlighting areas that require attention.
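A minimal sketch of one possible aggregation is shown below: yes/no answers are mapped onto the 1-to-5 scale, per-dimension scores are averaged for each user, and the collective rating is the unweighted mean across users. Both the yes/no mapping and the unweighted averaging are assumptions made for illustration rather than the exact scheme used by the tool.

```python
from statistics import mean

def normalize(kind: str, answer) -> float:
    # Assumption: yes maps to 5 and no to 1; the polarity of negatively
    # phrased questions (e.g. over-reliance) is handled before this step.
    if kind == "yes_no":
        return 5.0 if answer else 1.0
    return float(answer)  # already on the 1-to-5 scale

def dimension_scores(responses: dict[str, list[tuple[str, object]]]) -> dict[str, float]:
    # Average the normalized answers within each trustworthiness dimension.
    return {dim: mean(normalize(kind, ans) for kind, ans in answers)
            for dim, answers in responses.items()}

def collective_rating(all_users: list[dict[str, float]]) -> float:
    # Unweighted mean of each user's mean dimension score.
    return mean(mean(scores.values()) for scores in all_users)

# Example: two users assessing the same app.
u1 = dimension_scores({"Transparency": [("scale_1_5", 4), ("scale_1_5", 3)],
                       "Privacy": [("yes_no", True)]})
u2 = dimension_scores({"Transparency": [("scale_1_5", 2), ("scale_1_5", 3)],
                       "Privacy": [("yes_no", False)]})
print(collective_rating([u1, u2]))  # 3.0
```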

5. Conclusions

In this work, we have introduced the AI User Self-Assessment Framework, a tool designed to empower users in evaluating the trustworthiness of popular AI applications based on their personal experiences. The framework’s integrated scoring system allows users to assess various dimensions of AI trustworthiness, providing insights into how these applications are perceived.
Looking ahead, we envision the integration of this framework into collaborative rating systems and community-driven platforms. Future developments may include a public leaderboard that ranks AI applications according to their trustworthiness. This leaderboard would aggregate user scores, creating a competitive incentive for developers to adhere to higher standards and address user concerns more effectively. Additionally, future research will explore the combination of the trustworthiness score with the System Usability Scale (SUS) [59] to investigate how usability scores might shift when weighted by trustworthiness factors, offering a more holistic view of AI app performance.
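As a purely illustrative sketch of this future direction, a usability score could be weighted by the trustworthiness rating, so that a highly usable but poorly trusted app no longer appears strong overall. The linear weighting below is our own assumption, not a validated combination of the two instruments.

```python
def trust_weighted_sus(sus_score: float, trust_rating: float) -> float:
    """Weight a SUS score (0-100) by a trustworthiness rating (1-5).

    The linear mapping of the rating onto [0, 1] is an illustrative
    assumption for future work, not a validated instrument.
    """
    weight = (trust_rating - 1.0) / 4.0  # 1 -> 0.0, 5 -> 1.0
    return sus_score * weight

# A highly usable app (SUS 85) with a low trustworthiness rating (2/5)
# drops to 21.25 under this weighting.
print(trust_weighted_sus(85, 2))
```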
To fully realize the potential of the framework, addressing key challenges such as optimizing the relevance of questions, maintaining consistency in scoring, and incorporating continuous updates will be essential. Integrating insights from diverse fields—including computer science, ethics, law, and social sciences—will be crucial in advancing AI accountability and ensuring that these technologies meet high standards of trustworthiness.

Author Contributions

Conceptualization and methodology, E.K., D.P. and T.S.; original draft preparation-writing, E.K.; software development, D.P.; design and implementation of data collection, E.K., T.S. and D.P.; supervision, T.S. and P.D.; review and editing, D.P., T.S. and P.D.; funding acquisition, D.P., T.S. and P.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research is partially funded by the Research Fund KU Leuven and by the Cybersecurity Research Program Flanders. This work was partially supported by the EU funded project KINAITICS (Grant Agreement Number 101070176).

Informed Consent Statement

Not applicable.

Data Availability Statement

ChatGPT has been used to generate Figure 1b,d.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Questionnaire for the AI User Self-Assessment Framework

Appendix A.1. User Control

This set of questions focuses on whether the AI app enables you to understand and manage its actions and decisions. It addresses whether you have control over the app’s operations, including whether you can intervene or override its decisions if needed. The aim is to ensure that you maintain your autonomy and that the AI app supports, rather than undermines, your ability to make your own decisions.
  • Do you know if the AI app is making a decision, giving advice, or producing content on its own?
  • Are you clearly informed when something is decided or assumed by the AI app?
  • Do you feel like you rely too much on the AI app?
  • Are there measures in place to help you avoid over-relying on the AI app?
  • Do you think the AI app affects your ability to make your own decisions in any unintended way?
  • Do you feel like you can step in and change or override the AI app’s decisions?
  • Do you know if there is an emergency way for you to halt or control the AI app if something goes wrong?

Appendix A.2. Transparency

This set of questions aims to assess the extent to which the AI app operates in a clear and open manner. It focuses on whether the app provides understandable information about its data sources, decision-making processes, and overall purpose. It also considers how well the app communicates its benefits, limitations, and potential risks.
  • Can you find out what data the app used to make a specific decision or recommendation?
  • Does the app explain the rationale behind reaching its decisions or predictions?
  • Does the app explain the purpose and criteria behind its decisions or recommendations?
  • Are the benefits of using the app clearly communicated to you?
  • Are the limitations and potential risks of using the app, such as its accuracy and error rates, clearly communicated to you?
  • Does the app provide instructions or training on how to use it properly?

Appendix A.3. Security

This category focuses on the security and reliability of the AI app. It considers whether the app is robust against potential threats and how it handles security issues. The goal is to assess how confident you feel about the app’s safety measures and its ability to maintain functionality without compromising your data or causing harm.
  • Do you think that the app could cause serious safety issues for users or society if it breaks or gets hacked?
  • Are you informed about the app’s security updates and how long they will be provided?
  • Does the app clearly explain any risks involved in using it and how it keeps you safe?
  • Do you find the app reliable and free of frequent errors or malfunctions?
  • Do you think the app uses up-to-date, accurate, and high-quality information for its operations?
  • Does the app provide information about how accurate it is supposed to be and how it ensures this accuracy?
  • Could the app’s unreliability cause significant issues for you or your safety?

Appendix A.4. Privacy

This set of questions addresses how the AI app handles your personal data. It emphasizes the importance of data protection and privacy, ensuring you are aware of how your data is used and managed. It also considers your ability to control your personal information and the app’s commitment to collecting only necessary data.
  • Do you feel that the app respects your privacy?
  • Are you aware of any ways to report privacy issues with the app?
  • Do you know if and to what extent the app uses your personal data?
  • Are you informed about what data is collected by the app and why?
  • Does the app explain how your personal data is protected?
  • Does the app allow you to delete your data or withdraw your consent?
  • Are settings about deleting your data or withdrawing your consent visible and easily accessible?
  • Do you feel that the app only collects data that are necessary for its functions?

Appendix A.5. Ethics

This set of questions focuses on the ethical considerations of the AI app. It looks at whether the app avoids unfair bias, includes diverse data, and provides ways to report discrimination. It also examines if the app is designed to be inclusive and accessible to all users, ensuring it meets high ethical standards.
  • Do you think that the app creators have a plan to avoid creating or reinforcing unfair bias in how the app works?
  • Do you think that the app creators make sure the data used by the app includes a diverse range of people?
  • Does the app allow you to report issues related to bias or discrimination easily?
  • Do you think the app is designed to be usable by people of all ages, genders, and abilities?
  • Do you consider the app easy to use for people with disabilities or those who might need extra help?
  • Does the app follow principles that make it accessible and usable for as many people as possible?

Appendix A.6. Compliance

This set of questions aims to assess whether there are processes in place to address any concerns or issues you might have with the AI app. It considers whether there are clear reporting mechanisms, regular reviews of ethical guidelines, and ways for you to seek redress if negatively affected by the app.
  • Are there clear ways for you to report problems or concerns about the app?
  • Are you informed about the app’s compliance with relevant legal frameworks?
  • Are there processes indicating regular reviews and updates regarding the app’s adherence to ethical guidelines?
  • Are there mechanisms for you to seek redress if you are negatively affected by the app?

References

  1. Emmert-Streib, F. From the digital data revolution toward a digital society: Pervasiveness of artificial intelligence. Mach. Learn. Knowl. Extr. 2021, 3, 284–298. [Google Scholar] [CrossRef]
  2. Ait Baha, T.; El Hajji, M.; Es-Saady, Y.; Fadili, H. The power of personalization: A systematic review of personality-adaptive chatbots. SN Comput. Sci. 2023, 4, 661. [Google Scholar] [CrossRef]
  3. Allen, C.; Payne, B.R.; Abegaz, T.; Robertson, C. What You See Is Not What You Know: Studying Deception in Deepfake Video Manipulation. J. Cybersecur. Educ. Res. Pract. 2024, 2024, 9. [Google Scholar] [CrossRef]
  4. Bansal, G.; Chamola, V.; Hussain, A.; Guizani, M.; Niyato, D. Transforming conversations with AI—A comprehensive study of ChatGPT. Cogn. Comput. 2024, 16, 2487–2510. [Google Scholar] [CrossRef]
  5. Liu, H.; Wang, Y.; Fan, W.; Liu, X.; Li, Y.; Jain, S.; Liu, Y.; Jain, A.; Tang, J. Trustworthy AI: A computational perspective. ACM Trans. Intell. Syst. Technol. 2022, 14, 1–59. [Google Scholar] [CrossRef]
  6. Saeed, M.M.; Alsharidah, M. Security, privacy, and robustness for trustworthy AI systems: A review. Comput. Electr. Eng. 2024, 119, 109643. [Google Scholar] [CrossRef]
  7. Reinhardt, K. Trust and trustworthiness in AI ethics. AI Ethics 2023, 3, 735–744. [Google Scholar] [CrossRef]
  8. Casare, A.; Basso, T.; Moraes, R. User Experience and Trustworthiness Measurement: Challenges in the Context of e-Commerce Applications. In Proceedings of the Future Technologies Conference (FTC) 2021, Vancouver, BC, Canada, 28–29 November 2022; Springer: Berlin/Heidelberg, Germany, 2022; Volume 1, pp. 173–192. [Google Scholar]
  9. Chander, B.; John, C.; Warrier, L.; Gopalakrishnan, K. Toward trustworthy artificial intelligence (TAI) in the context of explainability and robustness. ACM Comput. Surv. 2024. [Google Scholar] [CrossRef]
  10. Díaz-Rodríguez, N.; Del Ser, J.; Coeckelbergh, M.; de Prado, M.L.; Herrera-Viedma, E.; Herrera, F. Connecting the dots in trustworthy Artificial Intelligence: From AI principles, ethics, and key requirements to responsible AI systems and regulation. Inf. Fusion 2023, 99, 101896. [Google Scholar] [CrossRef]
  11. Brunner, S.; Frischknecht-Gruber, C.; Reif, M.U.; Weng, J. A comprehensive framework for ensuring the trustworthiness of AI systems. In Proceedings of the 33rd European Safety and Reliability Conference (ESREL), Southampton, UK, 3–7 September 2023; Research Publishing: Singapore, 2023; pp. 2772–2779. [Google Scholar]
  12. Weng, Y.; Wu, J. Leveraging Artificial Intelligence to Enhance Data Security and Combat Cyber Attacks. J. Artif. Intell. Gen. Sci. (JAIGS) 2024, 5, 392–399. [Google Scholar] [CrossRef]
  13. Durovic, M.; Corno, T. The Privacy of Emotions: From the GDPR to the AI Act, an Overview of Emotional AI Regulation and the Protection of Privacy and Personal Data. Privacy Data Prot.-Data-Driven Technol. 2025, 368–404. [Google Scholar]
  14. Ala-Pietilä, P.; Bonnet, Y.; Bergmann, U.; Bielikova, M.; Bonefeld-Dahl, C.; Bauer, W.; Bouarfa, L.; Chatila, R.; Coeckelbergh, M.; Dignum, V.; et al. The Assessment List for Trustworthy Artificial Intelligence (ALTAI); European Commission: Brussels, Belgium, 2020. [Google Scholar]
  15. Croce, F.; Andriushchenko, M.; Sehwag, V.; Debenedetti, E.; Flammarion, N.; Chiang, M.; Mittal, P.; Hein, M. RobustBench: A standardized adversarial robustness benchmark. In Proceedings of the Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2), Virtual Conference, 7–10 December 2021. [Google Scholar]
  16. Baz, A.; Ahmed, R.; Khan, S.A.; Kumar, S. Security risk assessment framework for the healthcare industry 5.0. Sustainability 2023, 15, 16519. [Google Scholar] [CrossRef]
  17. Schwemer, S.F.; Tomada, L.; Pasini, T. Legal ai systems in the eu’s proposed artificial intelligence act. In Proceedings of the Second International Workshop on AI and Intelligent Assistance for Legal Professionals in the Digital Workplace (LegalAIIA 2021), Held in Conjunction with ICAIL, Sao Paulo, Brazil, 21 June 2021. [Google Scholar]
  18. Dasi, U.; Singla, N.; Balasubramanian, R.; Benadikar, S.; Shanbhag, R.R. Ethical implications of AI-driven personalization in digital media. J. Inform. Educ. Res. 2024, 4, 588–593. [Google Scholar]
  19. Pa Pa, Y.M.; Tanizaki, S.; Kou, T.; Van Eeten, M.; Yoshioka, K.; Matsumoto, T. An attacker’s dream? exploring the capabilities of chatgpt for developing malware. In Proceedings of the 16th Cyber Security Experimentation and Test Workshop, Marina Del Rey, CA, USA, 7–8 August 2023; pp. 10–18. [Google Scholar]
  20. Gill, S.S.; Kaur, R. ChatGPT: Vision and challenges. Internet Things-Cyber-Phys. Syst. 2023, 3, 262–271. [Google Scholar] [CrossRef]
  21. Li, J.; Yang, Y.; Wu, Z.; Vydiswaran, V.V.; Xiao, C. ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Mexico City, Mexico, 16–21 June 2024; Volume 1, pp. 2985–3004. [Google Scholar]
  22. Roy, S.S.; Thota, P.; Naragam, K.V.; Nilizadeh, S. From Chatbots to Phishbots?: Phishing Scam Generation in Commercial Large Language Models. In Proceedings of the 2024 IEEE Symposium on Security and Privacy (SP), IEEE Computer Society, San Francisco, CA, USA, 20–23 May 2024; p. 221. [Google Scholar]
  23. Kshetri, N. ChatGPT in developing economies. IT Prof. 2023, 25, 16–19. [Google Scholar] [CrossRef]
  24. Chen, Y.; Kirshner, S.; Ovchinnikov, A.; Andiappan, M.; Jenkin, T. A Manager and an AI Walk into a Bar: Does ChatGPT Make Biased Decisions Like We Do? 2023. Available online: https://ssrn.com/abstract=4380365 (accessed on 20 October 2024).
  25. Luccioni, S.; Akiki, C.; Mitchell, M.; Jernite, Y. Stable bias: Evaluating societal representations in diffusion models. In Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023; pp. 56338–56351. [Google Scholar]
  26. Wang, Y.; Pan, Y.; Yan, M.; Su, Z.; Luan, T.H. A survey on ChatGPT: AI-generated contents, challenges, and solutions. IEEE Open J. Comput. Soc. 2023, 4, 280–302. [Google Scholar] [CrossRef]
  27. Qu, Y.; Shen, X.; He, X.; Backes, M.; Zannettou, S.; Zhang, Y. Unsafe diffusion: On the generation of unsafe images and hateful memes from text-to-image models. In Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, Copenhagen, Denmark, 26–30 November 2023; pp. 3403–3417. [Google Scholar]
  28. Li, J.; Cheng, X.; Zhao, W.X.; Nie, J.Y.; Wen, J.R. HaluEval: A large-scale hallucination evaluation benchmark for large language models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, 6–10 December 2023; pp. 6449–6464. [Google Scholar]
  29. Hanley, H.W.; Durumeric, Z. Machine-made media: Monitoring the mobilization of machine-generated articles on misinformation and mainstream news websites. In Proceedings of the International AAAI Conference on Web and Social Media, Buffalo, NY, USA, 3–6 June 2024; Volume 18, pp. 542–556. [Google Scholar]
  30. Vartiainen, H.; Tedre, M. Using artificial intelligence in craft education: Crafting with text-to-image generative models. Digit. Creat. 2023, 34, 1–21. [Google Scholar] [CrossRef]
  31. Malinka, K.; Peresíni, M.; Firc, A.; Hujnák, O.; Janus, F. On the educational impact of ChatGPT: Is artificial intelligence ready to obtain a university degree? In Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education V. 1, Turku, Finland, 10–12 July 2023; pp. 47–53. [Google Scholar]
  32. Strickland, E. IBM Watson, heal thyself: How IBM overpromised and underdelivered on AI health care. IEEE Spectr. 2019, 56, 24–31. [Google Scholar] [CrossRef]
  33. Hunkenschroer, A.L.; Kriebitz, A. Is AI recruiting (un) ethical? A human rights perspective on the use of AI for hiring. AI Ethics 2023, 3, 199–213. [Google Scholar] [CrossRef]
  34. Rudolph, J.; Tan, S.; Tan, S. War of the chatbots: Bard, Bing Chat, ChatGPT, Ernie and beyond. The new AI gold rush and its impact on higher education. J. Appl. Learn. Teach. 2023, 6, 364–389. [Google Scholar]
  35. Kaur, D.; Uslu, S.; Rittichier, K.J.; Durresi, A. Trustworthy artificial intelligence: A review. ACM Comput. Surv. (CSUR) 2022, 55, 1–38. [Google Scholar] [CrossRef]
  36. Shin, D. The effects of explainability and causability on perception, trust, and acceptance: Implications for explainable AI. Int. J. Hum.-Comput. Stud. 2021, 146, 102551. [Google Scholar] [CrossRef]
  37. Li, B.; Qi, P.; Liu, B.; Di, S.; Liu, J.; Pei, J.; Yi, J.; Zhou, B. Trustworthy AI: From principles to practices. ACM Comput. Surv. 2023, 55, 1–46. [Google Scholar] [CrossRef]
  38. Langford, T.; Payne, B. Phishing Faster: Implementing ChatGPT into Phishing Campaigns. In Proceedings of the Future Technologies Conference, Vancouver, BC, Canada, 19–20 October 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 174–187. [Google Scholar]
  39. Xie, Y.; Yi, J.; Shao, J.; Curl, J.; Lyu, L.; Chen, Q.; Xie, X.; Wu, F. Defending ChatGPT against jailbreak attack via self-reminders. Nat. Mach. Intell. 2023, 5, 1–11. [Google Scholar] [CrossRef]
  40. Wan, Y.; Wang, W.; He, P.; Gu, J.; Bai, H.; Lyu, M.R. BiasAsker: Measuring the bias in conversational AI system. In Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, San Francisco, CA, USA, 3–9 December 2023; pp. 515–527. [Google Scholar]
  41. Epstein, D.C.; Jain, I.; Wang, O.; Zhang, R. Online detection of AI-generated images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 382–392. [Google Scholar]
  42. Chaka, C. Detecting AI content in responses generated by ChatGPT, YouChat, and Chatsonic: The case of five AI content detection tools. J. Appl. Learn. Teach. 2023, 6, 94–104. [Google Scholar]
  43. Xu, H.; Zhu, T.; Zhang, L.; Zhou, W.; Yu, P.S. Machine unlearning: A survey. ACM Comput. Surv. 2023, 56, 1–36. [Google Scholar] [CrossRef]
  44. Dwivedi, R.; Dave, D.; Naik, H.; Singhal, S.; Omer, R.; Patel, P.; Qian, B.; Wen, Z.; Shah, T.; Morgan, G.; et al. Explainable AI (XAI): Core ideas, techniques, and solutions. ACM Comput. Surv. 2023, 55, 1–33. [Google Scholar] [CrossRef]
  45. Ali, S.; Abuhmed, T.; El-Sappagh, S.; Muhammad, K.; Alonso-Moral, J.M.; Confalonieri, R.; Guidotti, R.; Del Ser, J.; Díaz-Rodríguez, N.; Herrera, F. Explainable Artificial Intelligence (XAI): What we know and what is left to attain Trustworthy Artificial Intelligence. Inf. Fusion 2023, 99, 101805. [Google Scholar] [CrossRef]
  46. Casper, S.; Davies, X.; Shi, C.; Gilbert, T.K.; Scheurer, J.; Rando, J.; Freedman, R.; Korbak, T.; Lindner, D.; Freire, P.; et al. Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback. arXiv 2023, arXiv:2307.15217. [Google Scholar]
  47. Kieseberg, P.; Weippl, E.; Tjoa, A.M.; Cabitza, F.; Campagner, A.; Holzinger, A. Controllable AI-An Alternative to Trustworthiness in Complex AI Systems? In Proceedings of the International Cross-Domain Conference for Machine Learning and Knowledge Extraction, Benevento, Italy, 29 August–1 September 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 1–12. [Google Scholar]
  48. Hacker, P.; Engel, A.; Mauer, M. Regulating ChatGPT and other large generative AI models. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, Chicago, IL, USA, 12–15 June 2023; pp. 1112–1123. [Google Scholar]
  49. Chamberlain, J. The risk-based approach of the European Union’s proposed artificial intelligence regulation: Some comments from a tort law perspective. Eur. J. Risk Regul. 2023, 14, 1–13. [Google Scholar] [CrossRef]
  50. Bengio, Y.; Hinton, G.; Yao, A.; Song, D.; Abbeel, P.; Darrell, T.; Harari, Y.N.; Zhang, Y.Q.; Xue, L.; Shalev-Shwartz, S.; et al. Managing extreme AI risks amid rapid progress. Science 2024, 384, 842–845. [Google Scholar] [CrossRef] [PubMed]
  51. Laux, J.; Wachter, S.; Mittelstadt, B. Trustworthy artificial intelligence and the European Union AI act: On the conflation of trustworthiness and acceptability of risk. Regul. Gov. 2023, 18, 3–32. [Google Scholar] [CrossRef] [PubMed]
  52. Hupont, I.; Micheli, M.; Delipetrev, B.; Gómez, E.; Garrido, J.S. Documenting high-risk AI: A European regulatory perspective. Computer 2023, 56, 18–27. [Google Scholar] [CrossRef]
  53. Lucaj, L.; Van Der Smagt, P.; Benbouzid, D. AI regulation is (not) all you need. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, Chicago, IL, USA, 12–15 June 2023; pp. 1267–1279. [Google Scholar]
  54. Meszaros, J.; Preuveneers, D.; Biasin, E.; Marquet, E.; Erdogan, I.; Rosal, I.M.; Vranckaert, K.; Belkadi, L.; Menende, N. European Union—ChatGPT, Are You Lawfully Processing My Personal Data? GDPR Compliance and Legal Basis for Processing Personal Data by OpenAI. J. AI Law Regul. 2024, 1, 233–239. [Google Scholar] [CrossRef]
  55. Fedele, A.; Punzi, C.; Tramacere, S. The ALTAI checklist as a tool to assess ethical and legal implications for a trustworthy AI development in education. Comput. Law Secur. Rev. 2024, 53, 105986. [Google Scholar] [CrossRef]
  56. Radclyffe, C.; Ribeiro, M.; Wortham, R.H. The assessment list for trustworthy artificial intelligence: A review and recommendations. Front. Artif. Intell. 2023, 6, 1020592. [Google Scholar] [CrossRef]
  57. Zicari, R.V.; Brodersen, J.; Brusseau, J.; Düdder, B.; Eichhorn, T.; Ivanov, T.; Kararigas, G.; Kringen, P.; McCullough, M.; Möslein, F.; et al. Z-Inspection®: A process to assess trustworthy AI. IEEE Trans. Technol. Soc. 2021, 2, 83–97. [Google Scholar] [CrossRef]
  58. Vetter, D.; Amann, J.; Bruneault, F.; Coffee, M.; Düdder, B.; Gallucci, A.; Gilbert, T.K.; Hagendorff, T.; van Halem, I.; Hickman, E.; et al. Lessons learned from assessing trustworthy AI in practice. Digit. Soc. 2023, 2, 35. [Google Scholar] [CrossRef]
  59. Vlachogianni, P.; Tselios, N. Perceived usability evaluation of educational technology using the System Usability Scale (SUS): A systematic review. J. Res. Technol. Educ. 2022, 54, 392–409. [Google Scholar] [CrossRef]
Figure 1. Examples of ChatGPT Failures. (a) Jailbreak Attack https://x.com/semenov_roman_/status/1621465137025613825/photo/1 (URL accessed on 20 October 2024). (b) Biased Content Generation. (c) Phishing https://www.nospamproxy.de/wp-content/uploads/ChatGPT-Phishing-Mail-EN.jpg (URL accessed on 20 October 2024). (d) Harmful Content Generation.
Figure 2. Examples of Stable Diffusion on Biased Content Generation. Subfigure (a) depicts an instance of Stable Diffusion v 1.4 generating an image with an ‘ambitious CEO’ gender bias, where it is acknowledged that the term ‘ambitious’ may reflect a stereotype bias. Subfigure (b) illustrates another instance generating an image with a ‘supportive CEO’ gender bias. (a) Stable Diffusion v 1.4: Ambitious CEO https://x.com/SashaMTL/status/1587108586865524738/photo/1 (URL accessed on 20 October 2024). (b) Stable Diffusion v 1.4: Supportive CEO https://x.com/SashaMTL/status/1587108586865524738/photo/1 (URL accessed on 20 October 2024).
Figure 3. The AI User Self-Assessment Framework.
Table 1. Summary of control mechanisms for AI applications.
Control Level | Elements | References
Technological | Advanced methods and tools aimed at enhancing AI security, robustness, privacy, and interpretability, including vulnerability detection, response and recovery, and XAI for clearer decision-making. | [15,38,39,40,41,42,43,44,45,46,47]
Regulatory | Laws and guidelines established to govern the ethical use of AI technologies, ensuring compliance with standards for data privacy, transparency, and accountability. | [48,49,50,51,52,53,54]
Trustworthiness | Frameworks and guidelines designed to assess and ensure the ethical principles, transparency, fairness, and overall reliability of AI systems, guiding their responsible development and deployment. | [14,35,37,55,56,57,58]
Table 2. Mapping of AI User Self-Assessment Framework dimensions to ALTAI dimensions.
ALTAI | AI User Self-Assessment Framework
Human Agency and Oversight | User Control
Transparency | Transparency
Technical Robustness and Safety | Security
Privacy and Data Governance | Privacy
Diversity, Non-discrimination, Fairness | Ethics
Accountability | Compliance
Societal and Environmental Well-being | Not explicitly mapped
Table 3. AI User Self-Assessment Framework: Example questions, criteria, and objectives for each dimension of trustworthiness assessment within the proposed framework. The framework’s dimensions are listed in the first column. Detailed sets of questions for each dimension are elaborated in Appendix A.
Dimension | Example Questions | On-App Criteria | Objective
User Control | Do you feel like you can step in and change or override the AI app’s decisions? | Options to override AI decisions, emergency halt functions. | Ensure users maintain control over the AI app’s operations, promoting autonomy and the ability to intervene.
Transparency | Can you find out what data the app used to make a specific decision or recommendation? | Clear explanations of data sources and decision-making processes. | Promote understanding and clarity regarding the AI app’s actions and decisions, encouraging users to value transparency.
Security | Do you think that the app could cause serious safety issues for users or society if it breaks or gets hacked? | Robust data encryption, regular security audits, and proactive threat detection. | Ensure users feel confident about the app’s safety measures and its ability to protect their data and functionality.
Privacy | Do you feel that the app respects your privacy? | Transparent data collection policies, easy-to-access consent and data deletion options. | Empower users to make informed decisions about their personal information and understand how their data are handled.
Ethics | Do you think that the app creators have a plan to avoid creating or reinforcing unfair bias in how the app works? | Bias detection and correction mechanisms, adherence to ethical guidelines. | Ensure fairness, inclusivity, and alignment with ethical standards.
Compliance | Are there clear ways for you to report problems or concerns about the app? | Accessible reporting mechanisms, regular ethical reviews, and compliance with legal standards. | Empower users to hold the app accountable and ensure it adheres to relevant laws and ethical guidelines.
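To make the structure summarized in Tables 2 and 3 concrete, the sketch below shows one possible way to encode the framework’s six dimensions, their ALTAI counterparts, and the example questions, and to aggregate a user’s answers into per-dimension scores. This is a minimal illustration under stated assumptions: the 1–5 Likert scale, the equal-weight averaging, and the names DIMENSIONS, EXAMPLE_QUESTIONS, and score_responses are introduced here for illustration only; the paper specifies the questionnaire, not an implementation.

```python
"""Illustrative sketch only: the scale, weighting, and all identifiers
below are assumptions for this example, not part of the published framework."""
from statistics import mean

# Framework dimensions (Table 2 maps them to ALTAI requirements).
DIMENSIONS = [
    "User Control",   # ALTAI: Human Agency and Oversight
    "Transparency",   # ALTAI: Transparency
    "Security",       # ALTAI: Technical Robustness and Safety
    "Privacy",        # ALTAI: Privacy and Data Governance
    "Ethics",         # ALTAI: Diversity, Non-discrimination, Fairness
    "Compliance",     # ALTAI: Accountability
]

# One example question per dimension, taken from Table 3.
EXAMPLE_QUESTIONS = {
    "User Control": "Do you feel like you can step in and change or override the AI app's decisions?",
    "Transparency": "Can you find out what data the app used to make a specific decision or recommendation?",
    "Security": "Do you think that the app could cause serious safety issues for users or society if it breaks or gets hacked?",
    "Privacy": "Do you feel that the app respects your privacy?",
    "Ethics": "Do you think that the app creators have a plan to avoid creating or reinforcing unfair bias in how the app works?",
    "Compliance": "Are there clear ways for you to report problems or concerns about the app?",
}


def score_responses(responses: dict[str, list[int]]) -> dict[str, float]:
    """Average a user's 1-5 Likert answers per dimension (assumed scale and equal weighting)."""
    scores = {}
    for dimension, answers in responses.items():
        if dimension not in DIMENSIONS:
            raise ValueError(f"Unknown dimension: {dimension}")
        if any(a < 1 or a > 5 for a in answers):
            raise ValueError("Answers must be on a 1-5 scale")
        scores[dimension] = round(mean(answers), 2)
    return scores


if __name__ == "__main__":
    # Hypothetical answers from one user for two of the six dimensions.
    example = {"Privacy": [4, 3, 5], "Transparency": [2, 3]}
    print(score_responses(example))  # {'Privacy': 4.0, 'Transparency': 2.5}
```

A dashboard built on such scores could, for instance, flag any dimension averaging below a chosen threshold so the user revisits that aspect of the app before continuing to rely on it.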
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
