Search Results (3)

Search Parameters:
Keywords = Facebook AI Similarity Search (FAISS)

18 pages, 1570 KB  
Article
A Study on Broker-Assisted Blockchain Trust Chains for Provenance and Integrity Verification of Generative Media Using Watermarking, Semantic Fingerprinting, and C2PA
by Chaelin Yang and Minchul Kim
Appl. Sci. 2026, 16(7), 3391; https://doi.org/10.3390/app16073391 - 31 Mar 2026
Viewed by 783
Abstract
The widespread availability of generative artificial intelligence has increased the volume of images and videos shared online, while making it difficult to verify origin and integrity after routine post-processing such as re-encoding, resizing, and transcoding. This research proposes a broker-assisted trust chain architecture that treats authenticity verification as an evidence registration and validation workflow rather than a single-signal decision. A trust chain broker seals submitted media by embedding a robust hidden watermark, deriving an embedding-based semantic fingerprint, and producing standardized provenance metadata, then stores the sealed media off-chain using content-addressed storage and anchors only compact evidence on an immutable ledger. The anchored evidence binds the content identifier of the sealed artifact with semantic and provenance hashes, timestamps, and the broker signature, while scalable candidate discovery is supported through an off-chain Facebook AI Similarity Search (FAISS)-based nearest-neighbor similarity index. We evaluate the retrieval stage on a COCO 2017 validation subset (N = 200) under representative post-processing transformations (JPEG compression, resizing, and center cropping), and observe near-perfect candidate identification performance with Recall@1 = 0.9988 and Recall@5/10 = 1.000. During verification, the broker retrieves candidates by embedding similarity, validates ledger inclusion and broker signatures, applies consistency checks across evidence fields, and issues an operational verdict with a signed verification report that is independently checkable. We also implement an EVM-based proof-of-concept for on-chain anchoring and report low ledger-side overhead for a representative registration transaction (gasUsed = 25,380) when recording fixed-size compact evidence fields. 
The proposed architecture does not prevent copying itself, but improves traceability and auditability under realistic transformation and redistribution conditions by combining watermarking, semantic association, provenance binding, and tamper-evident evidence anchoring within a clear service accountability boundary.
(This article belongs to the Special Issue Advanced Blockchain Technologies and Their Applications)
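
The retrieval evaluation described in the abstract above (Recall@1, Recall@5/10 over transformed copies) can be illustrated with a minimal sketch. It uses a brute-force cosine search in plain Python as a stand-in for a FAISS index; the vectors, the `cid-*` identifiers, and all function names are hypothetical and are not taken from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(index, query, k):
    """Return the ids of the k indexed vectors most similar to the query."""
    ranked = sorted(index, key=lambda item_id: cosine(index[item_id], query), reverse=True)
    return ranked[:k]

def recall_at_k(index, queries, k):
    """Fraction of queries whose true source id appears in the top-k results."""
    hits = sum(1 for true_id, vec in queries if true_id in top_k(index, vec, k))
    return hits / len(queries)

# Toy embeddings of registered media, keyed by a hypothetical content id.
index = {
    "cid-a": [1.0, 0.0, 0.0],
    "cid-b": [0.0, 1.0, 0.0],
    "cid-c": [0.0, 0.0, 1.0],
}
# Queries stand in for embeddings of post-processed copies of each item.
queries = [
    ("cid-a", [0.9, 0.1, 0.0]),
    ("cid-b", [0.1, 0.8, 0.1]),
    ("cid-c", [0.6, 0.0, 0.5]),  # distorted enough that cid-a ranks first
]
r1 = recall_at_k(index, queries, 1)
r2 = recall_at_k(index, queries, 2)
```

In this toy setup the third query misses at rank 1 but is recovered at rank 2, which is the kind of gap between Recall@1 and Recall@5/10 that the abstract reports. A real deployment would replace `top_k` with an approximate or exact index search.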

22 pages, 6241 KB  
Article
Using Large Language Models to Detect and Debunk Climate Change Misinformation
by Zeinab Shahbazi and Sara Behnamian
Big Data Cogn. Comput. 2026, 10(1), 34; https://doi.org/10.3390/bdcc10010034 - 17 Jan 2026
Cited by 3 | Viewed by 1888
Abstract
The rapid spread of climate change misinformation across digital platforms undermines scientific literacy, public trust, and evidence-based policy action. Advances in Natural Language Processing (NLP) and Large Language Models (LLMs) create new opportunities for automating the detection and correction of misleading climate-related narratives. This study presents a multi-stage system that employs state-of-the-art large language models such as Generative Pre-trained Transformer 4 (GPT-4), Large Language Model Meta AI (LLaMA) version 3 (LLaMA-3), and RoBERTa-large (Robustly optimized BERT pretraining approach large) to identify, classify, and generate scientifically grounded corrections for climate misinformation. The system integrates several complementary techniques, including transformer-based text classification, semantic similarity scoring using Sentence-BERT, stance detection, and retrieval-augmented generation (RAG) for evidence-grounded debunking. Misinformation instances are detected through a fine-tuned RoBERTa–Multi-Genre Natural Language Inference (MNLI) classifier (RoBERTa-MNLI), grouped using BERTopic, and verified against curated climate-science knowledge sources using BM25 and dense retrieval via FAISS (Facebook AI Similarity Search). The debunking component employs RAG-enhanced GPT-4 to produce accurate and persuasive counter-messages aligned with authoritative scientific reports such as those from the Intergovernmental Panel on Climate Change (IPCC). A diverse dataset of climate misinformation categories covering denialism, cherry-picking of data, false causation narratives, and misleading comparisons is compiled for evaluation. Benchmarking experiments demonstrate that LLM-based models substantially outperform traditional machine-learning baselines such as Support Vector Machines, Logistic Regression, and Random Forests in precision, contextual understanding, and robustness to linguistic variation. 
Expert assessment further shows that generated debunking messages exhibit higher clarity, scientific accuracy, and persuasive effectiveness compared to conventional fact-checking text. These results highlight the potential of advanced LLM-driven pipelines to provide scalable, real-time mitigation of climate misinformation while offering guidelines for responsible deployment of AI-assisted debunking systems.
(This article belongs to the Special Issue Natural Language Processing Applications in Big Data)
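
The abstract above names BM25 as the lexical half of its evidence retrieval. As a rough illustration of what that scoring does, here is a self-contained sketch of Okapi BM25 in plain Python. The parameter values (`k1=1.5`, `b=0.75`), the whitespace tokenization, and the three toy documents are assumptions for illustration only; the paper's corpus, settings, and the way it combines BM25 with dense FAISS retrieval are not stated in the abstract.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always non-negative).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy corpus (hypothetical text, not from the paper's knowledge sources).
docs = [
    "warming trend observed in surface temperature records".split(),
    "sea ice extent varies by season".split(),
    "museum opening hours and tickets".split(),
]
query = "temperature records".split()
scores = bm25_scores(query, docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

Only the first document shares terms with the query, so it is the only one with a non-zero score. A dense retriever would complement this by matching paraphrases that share no surface tokens.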

30 pages, 21387 KB  
Article
An Intelligent Docent System with a Small Large Language Model (sLLM) Based on Retrieval-Augmented Generation (RAG)
by Taemoon Jung and Inwhee Joe
Appl. Sci. 2025, 15(17), 9398; https://doi.org/10.3390/app15179398 - 27 Aug 2025
Cited by 6 | Viewed by 3448
Abstract
This study designed and empirically evaluated a method to enhance information accessibility for museum and art gallery visitors using a small Large Language Model (sLLM) based on the Retrieval-Augmented Generation (RAG) framework. Over 199,000 exhibition descriptions were collected and refined, and a question-answering dataset consisting of 102,000 pairs reflecting user personas was constructed to develop DocentGemma, a domain-optimized language model. This model was fine-tuned through Low-Rank Adaptation (LoRA) based on Google’s Gemma2-9B and integrated with FAISS and OpenSearch-based document retrieval systems within the LangChain framework. Performance evaluation was conducted using a dedicated Q&A benchmark for the docent domain, comparing the model against five commercial and open-source LLMs (including GPT-3.5 Turbo, LLaMA3.3-70B, and Gemma2-9B). DocentGemma achieved an accuracy of 85.55% and a perplexity of 3.78, demonstrating competitive performance in language generation and response accuracy within the domain-specific context. To enhance retrieval relevance, a Spatio-Contextual Retriever (SC-Retriever) was introduced, which combines semantic similarity and spatial proximity based on the user’s query and location. An ablation study confirmed that integrating both modalities improved retrieval quality, with the SC-Retriever achieving a recall@1 of 53.45% and a Mean Reciprocal Rank (MRR) of 68.12, representing a 17.5–20% gain in search accuracy compared to baseline models such as GTE and SpatialNN. System performance was further validated through field deployment at three major exhibition venues in Seoul (the Seoul History Museum, the Hwan-ki Museum, and the Hanseong Baekje Museum). A user test involving 110 participants indicated high response credibility and an average satisfaction score of 4.24. To ensure accessibility, the system supports various output formats, including multilingual speech and subtitles.
This work illustrates a practical application of integrating LLM-based conversational capabilities into traditional docent services and suggests potential for further development toward location-aware interactive systems and AI-driven cultural content services.
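
The abstract above describes a retriever that combines semantic similarity with spatial proximity and reports Mean Reciprocal Rank, but it does not give the scoring formula. The sketch below shows one plausible way such a fusion could work and how MRR is computed. The weighted-sum form, the exponential distance decay, the `alpha` and `scale_m` parameters, and the exhibit names are all assumptions made for illustration, not the SC-Retriever's actual design.

```python
import math

def fused_score(semantic_sim, distance_m, alpha=0.7, scale_m=20.0):
    """Blend semantic similarity with a spatial proximity term (assumed form)."""
    proximity = math.exp(-distance_m / scale_m)  # 1.0 at zero distance, decays with distance
    return alpha * semantic_sim + (1 - alpha) * proximity

def rank(candidates, alpha=0.7):
    """Sort (doc_id, semantic_sim, distance_m) tuples by fused score, best first."""
    ordered = sorted(candidates, key=lambda c: fused_score(c[1], c[2], alpha), reverse=True)
    return [doc_id for doc_id, _, _ in ordered]

def mean_reciprocal_rank(rankings, relevant):
    """Average of 1/rank of the relevant id per query; a missing id contributes 0."""
    total = 0.0
    for ranked, target in zip(rankings, relevant):
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(rankings)

# Hypothetical candidates: (id, semantic similarity, distance from the visitor in meters).
candidates = [
    ("far-exhibit", 0.80, 200.0),
    ("near-exhibit", 0.75, 2.0),
]
semantic_only = rank(candidates, alpha=1.0)  # ignores location entirely
fused = rank(candidates, alpha=0.7)          # location can outweigh a small similarity gap
mrr = mean_reciprocal_rank([semantic_only, fused], ["near-exhibit", "near-exhibit"])
```

With `alpha=1.0` the slightly more similar but distant exhibit ranks first; with `alpha=0.7` the nearby exhibit wins. Note that `mean_reciprocal_rank` returns a fraction between 0 and 1, whereas the abstract reports MRR on a 0 to 100 scale.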