Information Processing in Multimedia Applications

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Information Applications".

Deadline for manuscript submissions: 31 May 2025

Special Issue Editors


Guest Editor
Faculty of Electrical Engineering, Institute of Control and Industrial Electronics, Warsaw University of Technology, Warsaw, Poland
Interests: computer vision; machine vision; image processing; machine learning; deep learning

Guest Editor
Institute of Information Technology, Warsaw University of Life Sciences-SGGW, Warsaw, Poland
Interests: machine vision; intelligent robotics; digital signal processing; image processing

Special Issue Information

Dear Colleagues,

Multimedia, computer vision, graphics, and machine learning have become ubiquitous in modern information systems. They create new challenges for detection, recognition, indexing, access, search, retrieval, automated understanding, and processing, and have given rise to many applications based on image and signal processing, machine learning, and various multimedia technologies.

Recent advances in pervasive computers, networks, telecommunications, and information technology, along with the proliferation of multimedia mobile devices, have stimulated the rapid development of intelligent applications. These key technologies, using virtual reality, augmented reality, and computational intelligence, are gradually creating a multimedia revolution that will significantly impact a broad spectrum of consumer, business, healthcare, educational, and governmental domains. Advancements in artificial intelligence have resulted in the rapid growth of both machine learning methods and applications in computer vision, image processing, and analysis. The development of parallel computing capabilities in the first decade of the 21st century, which boosted the development of deep neural networks, became a real game-changer in machine vision. This Special Issue covers a range of AI-based theories, methods, algorithms, technologies, and systems for diversified and heterogeneous digital multimedia, imaging, computer graphics, and machine learning areas.

This Special Issue will provide an opportunity for researchers and professionals to discuss the present and future challenges associated with this topic and foster potential collaboration for future progress in these fields. We welcome submissions of original papers concerning all aspects of multimedia, vision, and graphics, ranging from concepts and theoretical developments to advanced technologies and innovative applications. The acceptance and publication of papers will be based on their relevance to the topics below, their clarity of presentation, their originality, and the accuracy of the reported results and proposed solutions.

Topics

Topics of interest are related to the following areas (though this list is not exhaustive):

  • Multimedia processing:
    • Audio, image, and video joint processing;
    • Cloud computing and multimedia applications;
    • Multimedia file systems and databases: indexing, recognition and retrieval;
    • Multimedia in internet and web-based systems;
    • Human–computer interactions, interfaces, and multimedia;
    • Distributed multimedia systems;
    • Network and operating system support for multimedia;
    • Machine learning for mobile network architectures;
    • Trends in multimedia information processing;
    • Multimedia ontology and perceptions for multimedia users.
  • Machine vision, image processing, and analysis:
    • Image enhancement;
    • Linear and non-linear filtering;
    • Object detection and segmentation;
    • Shape analysis;
    • Scene understanding, analysis, and modeling;
    • Image acquisition;
    • Stereo and multispectral imaging;
    • Embedded vision;
    • Robotic vision;
    • Image modeling and transforming;
    • The modeling of human visual perception;
    • Visual knowledge representation and reasoning.
  • Visualization and computer graphics:
    • Computational geometry;
    • Data-driven image synthesis;
    • Graphical data presentation;
    • Computer-aided graphic arts and animation;
    • Virtual and augmented reality;
    • Entertainment, personalized systems, and games.
  • Machine learning for multimedia, vision, and graphics:
    • Pattern recognition;
    • Deep neural models;
    • Convolutional networks;
    • Recurrent networks;
    • Graph networks;
    • Generative adversarial networks;
    • Neural style transfer;
    • Deep reinforcement learning;
    • Big data and multimedia systems;
    • Machine learning and computational intelligence for information retrieval in multimedia systems;
    • Data mining, warehousing, and knowledge extraction.
  • Applications:
    • Innovative uses of graphic and vision systems;
    • Image retrieval;
    • Autonomous driving systems;
    • Remote sensing;
    • Digital microscopy;
    • Security and surveillance systems;
    • Document analysis;
    • OCR systems;
    • Medical applications and computational biology;
    • Security in multimedia applications—authentication and watermarking;
    • E-learning, e-commerce, and e-society applications;
    • Intelligent multimedia network applications;
    • Future trends in multimedia systems technologies and applications.

Dr. Marcin Iwanowski
Prof. Dr. Andrzej Śluzek
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • computer vision
  • multimedia processing
  • machine learning
  • deep learning
  • image processing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (5 papers)

Research

17 pages, 6436 KiB  
Article
One-Shot Learning from Prototype Stock Keeping Unit Images
by Aleksandra Kowalczyk and Grzegorz Sarwas
Information 2024, 15(9), 526; https://doi.org/10.3390/info15090526 - 28 Aug 2024
Viewed by 837
Abstract
This paper highlights the importance of one-shot learning from prototype Stock Keeping Unit (SKU) images for efficient product recognition in retail and inventory management. Traditional methods require large supervised datasets to train deep neural networks, which can be costly and impractical. One-shot learning techniques mitigate this issue by enabling classification from a single prototype image per product class, thus reducing data annotation efforts. We introduce the Variational Prototyping Encoder (VPE), a novel deep neural network for one-shot classification. Utilizing a support set of prototype SKU images, VPE learns to classify query images by capturing image similarity and prototypical concepts. Unlike metric learning-based approaches, VPE pre-learns image translation from real-world object images to prototype images as a meta-task, facilitating efficient one-shot classification with minimal supervision. Our research demonstrates that VPE effectively reduces the need for extensive datasets by utilizing a single image per class while accurately classifying query images into their respective categories, thus providing a practical solution for product classification tasks.
(This article belongs to the Special Issue Information Processing in Multimedia Applications)

26 pages, 10462 KiB  
Article
The Optimal Choice of the Encoder–Decoder Model Components for Image Captioning
by Mateusz Bartosiewicz and Marcin Iwanowski
Information 2024, 15(8), 504; https://doi.org/10.3390/info15080504 - 21 Aug 2024
Viewed by 1074
Abstract
Image captioning aims at generating meaningful verbal descriptions of a digital image. This domain is rapidly growing due to the enormous increase in available computational resources. The most advanced methods are, however, resource-demanding. In our paper, we return to the encoder–decoder deep-learning model and investigate how replacing its components with newer equivalents improves overall effectiveness. The primary motivation of our study is to obtain the highest possible level of improvement of classic methods, which are applicable in less computational environments where most advanced models are too heavy to be efficiently applied. We investigate image feature extractors, recurrent neural networks, word embedding models, and word generation layers and discuss how each component influences the captioning model’s overall performance. Our experiments are performed on the MS COCO 2014 dataset. As a result of our research, replacing components improves the quality of generating image captions. The results will help design efficient models with optimal combinations of their components.
(This article belongs to the Special Issue Information Processing in Multimedia Applications)

15 pages, 4913 KiB  
Article
Deep Learning-Based Monocular Estimation of Distance and Height for Edge Devices
by Jan Gąsienica-Józkowy, Bogusław Cyganek, Mateusz Knapik, Szymon Głogowski and Łukasz Przebinda
Information 2024, 15(8), 474; https://doi.org/10.3390/info15080474 - 9 Aug 2024
Cited by 1 | Viewed by 1632
Abstract
Accurately estimating the absolute distance and height of objects in open areas is quite challenging, especially when based solely on single images. In this paper, we tackle these issues and propose a new method that blends traditional computer vision techniques with advanced neural network-based solutions. Our approach combines object detection and segmentation, monocular depth estimation, and homography-based mapping to provide precise and efficient measurements of absolute height and distance. This solution is implemented on an edge device, allowing for real-time data processing using both visual and thermal data sources. Experimental tests on a height estimation dataset we created show an accuracy of 98.86%, confirming the effectiveness of our method.
(This article belongs to the Special Issue Information Processing in Multimedia Applications)

17 pages, 1746 KiB  
Article
Examining the Roles, Sentiments, and Discourse of European Interest Groups in the Ukrainian War through X (Twitter)
by Aritz Gorostiza-Cerviño, Álvaro Serna-Ortega, Andrea Moreno-Cabanillas, Ana Almansa-Martínez and Antonio Castillo-Esparcia
Information 2024, 15(7), 422; https://doi.org/10.3390/info15070422 - 22 Jul 2024
Viewed by 1095
Abstract
This research focuses on examining the responses of interest groups listed in the European Transparency Register to the ongoing Russia–Ukraine war. Its aim is to investigate the nuanced reactions of 2579 commercial and business associations and 2957 companies and groups to the recent conflict, as expressed through their X (Twitter) activities. Utilizing advanced text mining and NLP and LDA techniques, this study conducts a comprehensive analysis encompassing language dynamics, thematic shifts, sentiment variations, and activity levels exhibited by these entities both before and after the outbreak of the war. The results obtained reflect a gradual decrease in negative emotions regarding the conflict over time. Likewise, multiple forms of outside lobbying are identified in the communication strategies of interest groups. All in all, this empirical inquiry into how interest groups adapt their messaging in response to complex geopolitical events holds the potential to provide invaluable insights into the multifaceted role of lobbying in shaping public policies.
(This article belongs to the Special Issue Information Processing in Multimedia Applications)

Review

52 pages, 2296 KiB  
Review
Digital Sentinels and Antagonists: The Dual Nature of Chatbots in Cybersecurity
by Hannah Szmurlo and Zahid Akhtar
Information 2024, 15(8), 443; https://doi.org/10.3390/info15080443 - 29 Jul 2024
Viewed by 1954
Abstract
Advancements in artificial intelligence, machine learning, and natural language processing have culminated in sophisticated technologies such as transformer models, generative AI models, and chatbots. Chatbots are sophisticated software applications created to simulate conversation with human users. Chatbots have surged in popularity owing to their versatility and user-friendly nature, which have made them indispensable across a wide range of tasks. This article explores the dual nature of chatbots in the realm of cybersecurity and highlights their roles as both defensive tools and offensive tools. On the one hand, chatbots enhance organizational cyber defenses by providing real-time threat responses and fortifying existing security measures. On the other hand, adversaries exploit chatbots to perform advanced cyberattacks, since chatbots have lowered the technical barrier to generate phishing, malware, and other cyberthreats. Despite the implementation of censorship systems, malicious actors find ways to bypass these safeguards. Thus, this paper first provides an overview of the historical development of chatbots and large language models (LLMs), including their functionality, applications, and societal effects. Next, we explore the dualistic applications of chatbots in cybersecurity by surveying the most representative works on both attacks involving chatbots and chatbots’ defensive uses. We also present experimental analyses to illustrate and evaluate different offensive applications of chatbots. Finally, open issues and challenges regarding the duality of chatbots are highlighted and potential future research directions are discussed to promote responsible usage and enhance both offensive and defensive cybersecurity strategies.
(This article belongs to the Special Issue Information Processing in Multimedia Applications)