Text Mining in Complex Domains

A special issue of Multimodal Technologies and Interaction (ISSN 2414-4088).

Deadline for manuscript submissions: closed (30 June 2019) | Viewed by 26133

Special Issue Editors

Dr. Suzan Verberne

Guest Editor

Leiden Institute of Advanced Computer Science, Leiden University, Leiden, The Netherlands
Interests: text mining; information retrieval; professional search; user aspects; evaluation

Dr. Iris Hendrickx

Guest Editor

Faculty of Arts, Radboud University, Nijmegen, The Netherlands
Interests: text mining; digital humanities; lexical and relational semantics; co-reference resolution; named entity recognition; automatic summarization

Special Issue Information

Dear Colleagues,

There is an abundance of text data available in a variety of domains. These data offer great potential for knowledge discovery if the texts can be made accessible effectively with data mining techniques. However, text data are challenging for data mining because they are typically unstructured, often noisy, and open-ended: newly added documents bring new vocabulary and thus new features.

In developing Text Mining methods, every domain has its own unique challenges. Examples of complex text types that have gained the attention of researchers in the past decade are scientific publications, historical documents, patents, electronic health records, policy documents, and social media data. Text Mining research has its roots in the Natural Language Processing community as well as the Information Retrieval community, and receives attention from many application domains. We seek more coherence in Text Mining research by bringing together papers that approach text mining from different angles.

This special issue invites submissions on the following topics:

Text mining methods, including named entity recognition, relation extraction, text categorization, text summarization, authorship detection, and sentiment analysis
Text mining applications for complex domains
Domain adaptation for text mining methods
Evaluation of text mining methods
Pre-processing pipelines for text mining
Natural Language Processing for text mining
Information Retrieval for text mining
User interfacing for text mining
User studies addressing text mining applications
Methods for mining text with images

Dr. Suzan Verberne
Dr. Iris Hendrickx
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Multimodal Technologies and Interaction is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

Text Mining
Natural Language Processing
Information Retrieval
Domain adaptation
Evaluation
Knowledge Discovery

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (4 papers)

Research

20 pages, 2632 KiB

Open Access Article

Reorganize Your Blogs: Supporting Blog Re-visitation with Natural Language Processing and Visualization

by Shuo Niu, D. Scott McCrickard, Timothy L. Stelter, Alan Dix and G. Don Taylor

Multimodal Technol. Interact. 2019, 3(4), 66; https://doi.org/10.3390/mti3040066 - 7 Oct 2019

Cited by 2 | Viewed by 5432

Abstract

Temporally connected personal blogs contain voluminous textual content, presenting challenges in re-visiting and reflecting on experiences. Other data repositories have benefited from natural language processing (NLP) and interactive visualizations (VIS) to support exploration, but little is known about how these techniques could be used with blogs to present experiences and support multimodal interaction with blogs, particularly for authors. This paper presents the effect of reorganization (reorganizing the large blog set with NLP and presenting abstract topics with VIS) to support novel re-visitation experiences with blogs. The BlogCloud tool, a blog re-visitation tool that reorganizes blog paragraphs around user-searched keywords, implements reorganization and similarity-based content grouping. Through a public use session with bloggers who wrote about extended hikes, we observed the effect of NLP-based reorganization in delivering novel re-visitation experiences. Findings suggest that the re-presented topics provide new reflection materials and re-visitation paths, enabling interaction with symbolic items in memory.

(This article belongs to the Special Issue Text Mining in Complex Domains)

15 pages, 1533 KiB

Open Access Article

Text Mining in Cybersecurity: Exploring Threats and Opportunities

by Maaike H. T. de Boer, Babette J. Bakker, Erik Boertjes, Mike Wilmer, Stephan Raaijmakers and Rick van der Kleij

Multimodal Technol. Interact. 2019, 3(3), 62; https://doi.org/10.3390/mti3030062 - 15 Sep 2019

Cited by 10 | Viewed by 8454

Abstract

The number of cyberattacks on organizations is growing. To increase cyber resilience, organizations need to obtain foresight to anticipate cybersecurity vulnerabilities, developments, and potential threats. This paper describes a tool that combines state-of-the-art text mining and information retrieval techniques to explore the opportunities of using these techniques in the cybersecurity domain. Our tool, the Horizon Scanner, can scrape and store data from websites, blogs and PDF articles, search a database based on a user query, show textual entities in a graph, and provide and visualize potential trends. The aim of the Horizon Scanner is to help experts explore relevant data sources for potential threats and trends and to speed up the process of foresight. In a requirements session and user evaluation of the tool with cyber experts from the Dutch Defense Cyber Command, we explored whether the Horizon Scanner tool has the potential to fulfill its aim in the cybersecurity domain. Although the overall evaluation of the tool was not as good as expected, some aspects of the tool were found to have added value, providing us with valuable insights into how to design decision support for forecasting analysts.

(This article belongs to the Special Issue Text Mining in Complex Domains)

27 pages, 792 KiB

Open Access Article

Data-Driven Lexical Normalization for Medical Social Media

by Anne Dirkson, Suzan Verberne, Abeed Sarker and Wessel Kraaij

Multimodal Technol. Interact. 2019, 3(3), 60; https://doi.org/10.3390/mti3030060 - 20 Aug 2019

Cited by 11 | Viewed by 4907

Abstract

In the medical domain, user-generated social media text is increasingly used as a valuable complementary knowledge source to scientific medical literature. The extraction of this knowledge is complicated by colloquial language use and misspellings. However, lexical normalization of such data has not been addressed effectively. This paper presents a data-driven lexical normalization pipeline with a novel spelling correction module for medical social media. Our method significantly outperforms state-of-the-art spelling correction methods and can detect mistakes with an F₁ of 0.63 despite extreme imbalance in the data. We also present the first corpus for spelling mistake detection and correction in a medical patient forum.

(This article belongs to the Special Issue Text Mining in Complex Domains)

12 pages, 1099 KiB

Open Access Article

Unsupervised Keyphrase Extraction for Web Pages

by Tim Haarman, Bastiaan Zijlema and Marco Wiering

Multimodal Technol. Interact. 2019, 3(3), 58; https://doi.org/10.3390/mti3030058 - 31 Jul 2019

Cited by 5 | Viewed by 6441

Abstract

Keyphrase extraction is an important part of natural language processing (NLP) research, although little research has been done in the domain of web pages. The World Wide Web contains billions of pages that are potentially interesting for various NLP tasks, yet it remains largely untouched in scientific research. Current research is often only applied to clean corpora such as abstracts and articles from academic journals or sets of scraped texts from a single domain. However, textual data from web pages differ from normal text documents, as they are structured using HTML elements and often consist of many small fragments. These elements are furthermore used in a highly inconsistent manner and are likely to contain noise. We evaluated the keyphrases extracted by several state-of-the-art extraction methods and found that they did not transfer well to web pages. We therefore propose WebEmbedRank, an adaptation of a recently proposed extraction method that can make use of structural information in web pages in a robust manner. We compared this novel method to other baselines and state-of-the-art methods using a manually annotated dataset and found that WebEmbedRank achieved significant improvements over existing extraction methods on web pages.

(This article belongs to the Special Issue Text Mining in Complex Domains)
