Text Mining and Data Mining

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 November 2024

Special Issue Editor


Guest Editor
Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
Interests: text mining; entity relationship calculation; text clustering; data mining

Special Issue Information

Dear Colleagues,

In today's world, gathering, analyzing, and extracting information from huge amounts of data is a complex task, and many efficient methods are used for the practical integration of these data. Text mining applies knowledge discovery techniques to unstructured text; it is also termed knowledge discovery in text or text data mining. Data mining technology gives us the ability to extract meaningful patterns from large quantities of structured data. Information retrieval systems have made large quantities of textual data available, and the scope now extends to multimodal data. This includes multimodal information extraction, which focuses knowledge discovery on data from various modalities such as image, text, and video; image-aided information extraction, which enhances the performance of information extraction with image information; and image information extraction, which extracts structured data from images.

Text mining and data mining use diverse techniques such as natural language processing, computer vision, machine learning, information retrieval, and knowledge management for the automated analysis of digital content. By doing so, text mining and data mining can extract information, identify patterns, and discover new trends, insights, and correlations.

This Special Issue seeks original, unpublished articles that address recent advances in data mining and text mining techniques as well as their applications. Topics of interest include, but are not limited to, the following: text mining; knowledge discovery in text; content analysis; text analysis; text classification; audio-to-text mining; video-to-text mining; image-to-text mining; big data; data mining; artificial intelligence; machine learning; information retrieval; applications; multimodal information extraction; image-aided information extraction; image information extraction.

Prof. Dr. Ming Liu
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • text mining
  • knowledge discovery in text
  • content analysis
  • text analysis
  • text classification
  • audio-to-text mining
  • video-to-text mining
  • image-to-text mining
  • big data
  • data mining
  • artificial intelligence
  • machine learning
  • information retrieval
  • applications
  • multimodal information extraction
  • image-aided information extraction
  • image information extraction

Published Papers (1 paper)

Research

25 pages, 811 KiB  
Article
Contextual Hypergraph Networks for Enhanced Extractive Summarization: Introducing Multi-Element Contextual Hypergraph Extractive Summarizer (MCHES)
by Aytuğ Onan and Hesham Alhumyani
Appl. Sci. 2024, 14(11), 4671; https://doi.org/10.3390/app14114671 - 29 May 2024
Abstract
Extractive summarization, a pivotal task in natural language processing, aims to distill essential content from lengthy documents efficiently. Traditional methods often struggle with capturing the nuanced interdependencies between different document elements, which is crucial to producing coherent and contextually rich summaries. This paper introduces Multi-Element Contextual Hypergraph Extractive Summarizer (MCHES), a novel framework designed to address these challenges through an advanced hypergraph-based approach. MCHES constructs a contextual hypergraph where sentences form nodes interconnected by multiple types of hyperedges, including semantic, narrative, and discourse hyperedges. This structure captures complex relationships and maintains narrative flow, enhancing semantic coherence across the summary. The framework incorporates a Contextual Homogenization Module (CHM), which harmonizes features from diverse hyperedges, and a Hypergraph Contextual Attention Module (HCA), which employs a dual-level attention mechanism to focus on the most salient information. The innovative Extractive Read-out Strategy selects the optimal set of sentences to compose the final summary, ensuring that the latter reflects the core themes and logical structure of the original text. Our extensive evaluations demonstrate significant improvements over existing methods. Specifically, MCHES achieves an average ROUGE-1 score of 44.756, a ROUGE-2 score of 24.963, and a ROUGE-L score of 42.477 on the CNN/DailyMail dataset, surpassing the best-performing baseline by 3.662%, 3.395%, and 2.166% respectively. Furthermore, MCHES achieves BERTScore values of 59.995 on CNN/DailyMail, 88.424 on XSum, and 89.285 on PubMed, indicating superior semantic alignment with human-generated summaries. Additionally, MCHES achieves MoverScore values of 87.432 on CNN/DailyMail, 60.549 on XSum, and 59.739 on PubMed, highlighting its effectiveness in maintaining content movement and ordering. 
These results confirm that the MCHES framework sets a new standard for extractive summarization by leveraging contextual hypergraphs for better narrative and thematic fidelity.
(This article belongs to the Special Issue Text Mining and Data Mining)
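
For readers unfamiliar with the ROUGE-1 metric cited in the abstract above, the following is a minimal sketch of how a unigram-overlap score can be computed. It is not the authors' evaluation code: the function name `rouge_1`, the lowercase whitespace tokenization, and the example sentences are assumptions made for illustration, and standard ROUGE toolkits typically add steps such as stemming that are omitted here.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: unigram overlap between a candidate and a reference summary.

    Assumption: tokens are lowercase whitespace-separated words (no stemming).
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it appears in both texts.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical example sentences, not taken from any dataset named above.
    scores = rouge_1("the cat sat on the mat", "the cat lay on the mat")
    print(scores)
```

In this sketch, five of the six candidate words also occur in the reference, so precision, recall, and F1 all equal 5/6. Papers often report such scores multiplied by 100.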