Selected Papers from the Text Mining and Applications Track at EPIA 2017

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Information Processes".

Deadline for manuscript submissions: closed (12 January 2018) | Viewed by 12440

Special Issue Editors


Guest Editor
Instituto Superior Técnico and INESC-ID, University of Lisbon, Lisbon, Portugal
Interests: geographic text analysis; geographic information sciences; applied data science and machine learning

Guest Editor
Instituto Superior Técnico, University of Lisbon, Lisbon, Portugal
Interests: information retrieval; information extraction; machine learning; digital libraries

Guest Editor
NOVA Laboratory for Computer Science and Informatics, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Caparica, Portugal
Interests: machine translation; natural language processing

Guest Editor
Department of Informatics Engineering, University of Coimbra, Portugal
Interests: natural language processing; information extraction; computational creativity; creative systems

Guest Editor
NOVA Laboratory for Computer Science and Informatics, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Caparica, Portugal
Interests: text mining; machine learning; data mining

Guest Editor
Instituto de Computação, Universidade Federal do Amazonas, Brazil
Interests: information retrieval; information extraction; text mining; social networks

Special Issue Information

Dear Colleagues,

This Special Issue focuses on extended versions of selected papers from the upcoming 7th edition of the Text Mining and Applications (TeMA 2017) track of the EPIA Conference on Artificial Intelligence. This track will feature contributions from researchers working in Human Language Technologies (HLT), i.e., Natural Language Processing (NLP), Computational Linguistics (CL), Natural Language Engineering (NLE), Text Mining (TM), Information Retrieval (IR), and related areas.

A detailed description of the track, together with the topics of interest that constitute the focus of its call for papers, is available online at https://web.fe.up.pt/~epia2017/thematic-tracks/tema/. The track’s description emphasizes that, especially on the Web, a huge amount of textual information is openly published every day, offering new insights and many opportunities for innovative applications of Human Language Technologies. Both hidden and new knowledge can be discovered by using text mining methods, at multiple levels and in multiple dimensions, and often with high commercial value. By seeking contributions related to this general theme, TeMA is closely related to particular subject areas listed for the Information journal, such as knowledge management, social media and social networks, and information extraction.

The authors of the best papers presented at TeMA 2017, selected after the review of the original papers by members of the track’s program committee, and thus corresponding to high-quality work describing significant advances in the area, will be invited to submit extended versions of their work to this Special Issue. Members of the TeMA program committee will act as reviewers for the extended versions submitted to this Special Issue (i.e., each submission will be blindly reviewed by three researchers in its specific area).

Assist. Prof. Bruno Martins
Assoc. Prof. Pável Pereira Calado
Assist. Prof. Hugo Gonçalo Oliveira
Dr. José Gabriel Pereira Lopes
Assist. Prof. Joaquim Ferreira Silva
Assoc. Prof. Altigran Soares da Silva
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Text Mining
  • Natural Language Processing
  • Natural Language Engineering
  • Information Extraction and Information Retrieval
  • Mining Web and Social Media Contents

Published Papers (2 papers)


Research

19 pages, 494 KiB  
Article
Recognizing Textual Entailment: Challenges in the Portuguese Language
by Gil Rocha and Henrique Lopes Cardoso
Information 2018, 9(4), 76; https://doi.org/10.3390/info9040076 - 29 Mar 2018
Cited by 17 | Viewed by 7393
Abstract
Recognizing textual entailment comprises the task of determining semantic entailment relations between text fragments. A text fragment entails another text fragment if, from the meaning of the former, one can infer the meaning of the latter. If such relation is bidirectional, then we are in the presence of a paraphrase. Automatically recognizing textual entailment relations captures major semantic inference needs in several natural language processing (NLP) applications. As in many NLP tasks, textual entailment corpora for English abound, while the same is not true for more resource-scarce languages such as Portuguese. Exploiting what seems to be the only Portuguese corpus for textual entailment and paraphrases (the ASSIN corpus), in this paper, we address the task of automatically recognizing textual entailment (RTE) and paraphrases from text written in the Portuguese language, by employing supervised machine learning techniques. We employ lexical, syntactic and semantic features, and analyze the impact of using semantic-based approaches in the performance of the system. We then try to take advantage of the bi-dialect nature of ASSIN to compensate its limited size. With the same aim, we explore modeling the task of recognizing textual entailment and paraphrases as a binary classification problem by considering the bidirectional nature of paraphrases as entailment relationships. Addressing the task as a multi-class classification problem, we achieve results in line with the winner of the ASSIN Challenge. In addition, we conclude that semantic-based approaches are promising in this task, and that combining data from European and Brazilian Portuguese is less straightforward than it may initially seem. The binary classification modeling of the problem does not seem to bring advantages to the original multi-class model, despite the outstanding results obtained by the binary classifier for recognizing textual entailments.

21 pages, 374 KiB  
Article
Distributional and Knowledge-Based Approaches for Computing Portuguese Word Similarity
by Hugo Gonçalo Oliveira
Information 2018, 9(2), 35; https://doi.org/10.3390/info9020035 - 08 Feb 2018
Cited by 4 | Viewed by 4520
Abstract
Identifying similar and related words is not only key in natural language understanding but also a suitable task for assessing the quality of computational resources that organise words and meanings of a language, compiled by different means. This paper, which aims to be a reference for those interested in computing word similarity in Portuguese, presents several approaches for this task and is motivated by the recent availability of state-of-the-art distributional models of Portuguese words, which add to several lexical knowledge bases (LKBs) for this language, available for a longer time. The previous resources were exploited to answer word similarity tests, which also became recently available for Portuguese. We conclude that there are several valid approaches for this task, but not one that outperforms all the others in every single test. Distributional models seem to capture relatedness better, while LKBs are better suited for computing genuine similarity, but, in general, better results are obtained when knowledge from different sources is combined.