Advances and Challenges in Multimodal Machine Learning 2nd Edition

A special issue of Journal of Imaging (ISSN 2313-433X). This special issue belongs to the section "AI in Imaging".

Deadline for manuscript submissions: 30 April 2024 | Viewed by 1904

Special Issue Editor


Dr. Georgina Cosma
Guest Editor
Department of Computer Science, Loughborough University, Loughborough LE11 3TU, UK
Interests: cross-modal information retrieval; continual lifelong learning; explainable and ethical AI; sensitivity analysis in machine vision and text; natural language processing; machine vision

Special Issue Information

Dear Colleagues,

The emerging field of multimodal machine learning has experienced much progress in the past few years; however, several core challenges remain. These challenges are mainly around learning how to represent and summarise multimodal data (representation); translating (mapping) data from one modality to another (translation); identifying direct relations between elements from different modalities (alignment); joining or fusing information from two or more modalities to perform a prediction task (fusion); and transferring knowledge between modalities, their representations, and predictive models (co-learning).

Within the field of information retrieval, the large and continually growing volume of data has created a need for solutions that use one modality as a query to retrieve related information in another modality, a task known as cross-modal retrieval. In recent years, cross-modal retrieval methods have attracted considerable attention due to the learning capabilities of deep learning methods; however, most of these methods assume that data examples in different modalities are fully paired, when in reality, these data are often not paired.

Furthermore, the continually growing volume of data has given rise to the additional challenge of developing lifelong learning models that can continue to learn efficiently from new volumes of data. Lifelong learning remains a challenge for machine learning models, and most research on the topic focuses on classification tasks. There is a need to focus on lifelong learning for information retrieval and to propose methods for dealing with the continuous growth of information, which can lead to catastrophic forgetting or interference. This limitation represents a major drawback for models that typically learn representations from batches of training data, when in reality, information becomes incrementally available over time. The challenge of lifelong learning increases when dealing with cross-modal learning.

We invite contributions presenting techniques that address the challenges of multimodal machine learning, and we strongly encourage contributions that propose advances in the field of continual lifelong learning for multimodal machine learning applications.

Dr. Georgina Cosma
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Journal of Imaging is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • neural information retrieval
  • multi-modal and cross-modal information retrieval
  • relevance feedback and query expansion in multimodal retrieval
  • sensitivity analysis of images or multi-modal data
  • visual semantic embedding for information retrieval and other tasks
  • continual lifelong learning for information retrieval
  • temporal modelling of multi-modal data

Published Papers (1 paper)

Research

33 pages, 3096 KiB  
Article
Efficient Retrieval of Images with Irregular Patterns Using Morphological Image Analysis: Applications to Industrial and Healthcare Datasets
by Jiajun Zhang, Georgina Cosma, Sarah Bugby and Jason Watkins
J. Imaging 2023, 9(12), 277; https://doi.org/10.3390/jimaging9120277 - 13 Dec 2023
Viewed by 1453
Abstract
Image retrieval is the process of searching and retrieving images from a datastore based on their visual content and features. Recently, much attention has been directed towards the retrieval of irregular patterns within industrial or healthcare images by extracting features from the images, such as deep features, colour-based features, shape-based features, and local features. This has applications across a spectrum of industries, including fault inspection, disease diagnosis, and maintenance prediction. This paper proposes an image retrieval framework to search for images containing similar irregular patterns by extracting a set of morphological features (DefChars) from images. The datasets employed in this paper contain wind turbine blade images with defects, chest computerised tomography scans with COVID-19 infections, heatsink images with defects, and lake ice images. The proposed framework was evaluated with different feature extraction methods (DefChars, resized raw image, local binary pattern, and scale-invariant feature transforms) and distance metrics to determine the most efficient parameters in terms of retrieval performance across datasets. The retrieval results show that the proposed framework using the DefChars and the Manhattan distance metric achieves a mean average precision of 80% and a low standard deviation of ±0.09 across classes of irregular patterns, outperforming alternative feature–metric combinations across all datasets. Our proposed ImR framework performed better (by 8.71%) than Super Global, a state-of-the-art deep-learning-based image retrieval approach, across all datasets.
(This article belongs to the Special Issue Advances and Challenges in Multimodal Machine Learning 2nd Edition)
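
For readers unfamiliar with the evaluation terms in the abstract above, the following is a minimal sketch of distance-based retrieval scored with mean average precision. It is not the authors' implementation: the function names, the two-dimensional feature vectors, and the class labels are all hypothetical, and the DefChars feature extraction itself is not reproduced. The sketch only assumes that each image has already been reduced to a numeric feature vector with a class label.

```python
from typing import List, Sequence, Tuple

Item = Tuple[Sequence[float], str]  # (feature vector, class label), hypothetical layout


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_distance(query: Sequence[float],
                     features: List[Sequence[float]]) -> List[int]:
    """Return gallery indices ordered from nearest to farthest from the query."""
    return sorted(range(len(features)),
                  key=lambda i: manhattan(query, features[i]))


def average_precision(ranked_labels: Sequence[str], query_label: str) -> float:
    """Average of precision@k over every rank k that holds a relevant item."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries: List[Item], gallery: List[Item]) -> float:
    """Mean of the per-query average precision over all queries."""
    features = [f for f, _ in gallery]
    labels = [l for _, l in gallery]
    scores = []
    for q_feat, q_label in queries:
        order = rank_by_distance(q_feat, features)
        scores.append(average_precision([labels[i] for i in order], q_label))
    return sum(scores) / len(scores) if scores else 0.0


# Toy data (invented for illustration only).
gallery = [([0.0, 0.0], "crack"), ([1.0, 1.0], "crack"),
           ([5.0, 5.0], "erosion"), ([6.0, 5.0], "erosion")]
queries = [([0.5, 0.5], "crack"), ([5.5, 5.0], "erosion")]
print(mean_average_precision(queries, gallery))
```

In this toy setup, each query is nearest to the gallery items of its own class, so every relevant item is ranked ahead of every irrelevant one. Swapping `manhattan` for another distance function is the kind of feature–metric comparison the abstract describes.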