Search Results (8)

Search Parameters:
Keywords = VidBlock

19 pages, 2647 KB  
Article
FDI-VSR: Video Super-Resolution Through Frequency-Domain Integration and Dynamic Offset Estimation
by Donghun Lim and Janghoon Choi
Sensors 2025, 25(8), 2402; https://doi.org/10.3390/s25082402 - 10 Apr 2025
Cited by 1 | Viewed by 1124
Abstract
The increasing adoption of high-resolution imaging sensors across various fields has led to a growing demand for techniques to enhance video quality. Video super-resolution (VSR) addresses this need by reconstructing high-resolution videos from lower-resolution inputs; however, directly applying single-image super-resolution (SISR) methods to video sequences neglects temporal information, resulting in inconsistent and unnatural outputs. In this paper, we propose FDI-VSR, a novel framework that integrates spatiotemporal dynamics and frequency-domain analysis into conventional SISR models without extensive modifications. We introduce two key modules: the Spatiotemporal Feature Extraction Module (STFEM), which employs dynamic offset estimation, spatial alignment, and multi-stage temporal aggregation using residual channel attention blocks (RCABs); and the Frequency–Spatial Integration Module (FSIM), which transforms deep features into the frequency domain to effectively capture global context beyond the limited receptive field of standard convolutions. Extensive experiments on the Vid4, SPMCs, REDS4, and UDM10 benchmarks, supported by detailed ablation studies, demonstrate that FDI-VSR not only surpasses conventional VSR methods but also achieves competitive results compared to recent state-of-the-art methods, with improvements of up to 0.82 dB in PSNR on the SPMCs benchmark and notable reductions in visual artifacts, all while maintaining lower computational complexity and faster inference. Full article
(This article belongs to the Section Sensing and Imaging)
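As a rough illustration of the frequency-domain integration idea described in this abstract, the sketch below mixes a Fourier-domain branch (global context) with an ordinary convolutional branch; the module name, layer sizes, and fusion scheme are assumptions, not the authors' FSIM implementation.

```python
import torch
import torch.nn as nn

class FrequencySpatialBlock(nn.Module):
    """Illustrative frequency-spatial fusion: process features in the Fourier
    domain to capture global context, then merge with a local spatial branch.
    Layer names and sizes are hypothetical."""
    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convs acting on the real/imaginary parts of the spectrum
        self.freq_conv = nn.Conv2d(2 * channels, 2 * channels, kernel_size=1)
        self.spatial_conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Global branch: 2D FFT over the spatial dims, so every output pixel
        # of freq_conv sees the whole frame through the spectrum.
        spec = torch.fft.rfft2(x, norm="ortho")
        spec = self.freq_conv(torch.cat([spec.real, spec.imag], dim=1))
        real, imag = spec.chunk(2, dim=1)
        global_feat = torch.fft.irfft2(torch.complex(real, imag),
                                       s=x.shape[-2:], norm="ortho")
        # Local branch: ordinary convolution with a limited receptive field.
        local_feat = self.spatial_conv(x)
        return x + self.fuse(torch.cat([global_feat, local_feat], dim=1))
```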

32 pages, 9318 KB  
Article
VidBlock: A Web3.0-Enabled Decentralized Blockchain Architecture for Live Video Streaming
by Hyunjoo Yang and Sejin Park
Appl. Sci. 2025, 15(3), 1289; https://doi.org/10.3390/app15031289 - 26 Jan 2025
Cited by 1 | Viewed by 2511
Abstract
In the digital era, the demand for real-time streaming services highlights the scalability, data sovereignty, and privacy limitations of traditional centralized systems. VidBlock introduces a novel decentralized blockchain architecture that leverages the blockchain’s immutable and transparent characteristics along with direct communication capabilities. This ecosystem revolutionizes content delivery and storage, ensuring high data integrity and user trust. VidBlock’s architecture emphasizes serverless operation, aligning with the principles of decentralization to enhance efficiency and reduce costs. Our contributions include decentralized data management, user-controlled privacy, cost reduction through a serverless architecture, and improved global accessibility. Experiments show that VidBlock is superior in reducing latency and utilizing bandwidth, demonstrating its potential to redefine live video streaming in the Web3.0 era. Full article
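The abstract does not specify VidBlock's on-chain data format, but a minimal sketch of the general pattern, hash-linked blocks that anchor the integrity of peer-delivered video segments, might look like this (all field and function names are hypothetical):

```python
import hashlib
import json
import time
from dataclasses import dataclass, field, asdict

@dataclass
class StreamBlock:
    """Hypothetical ledger block anchoring live-stream segments; VidBlock's
    actual on-chain format is not described in the abstract."""
    index: int
    prev_hash: str
    segment_hashes: list[str] = field(default_factory=list)  # SHA-256 of video chunks served peer-to-peer
    timestamp: float = field(default_factory=time.time)

    def hash(self) -> str:
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

def verify_chain(chain: list[StreamBlock]) -> bool:
    """A viewer can check that the segment index received from peers has not
    been tampered with by replaying the hash links."""
    return all(chain[i].prev_hash == chain[i - 1].hash()
               for i in range(1, len(chain)))
```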

19 pages, 5481 KB  
Article
Real-Time Semantic Segmentation Algorithm for Street Scenes Based on Attention Mechanism and Feature Fusion
by Bao Wu, Xingzhong Xiong and Yong Wang
Electronics 2024, 13(18), 3699; https://doi.org/10.3390/electronics13183699 - 18 Sep 2024
Cited by 3 | Viewed by 2237
Abstract
In computer vision, the task of semantic segmentation is crucial for applications such as autonomous driving and intelligent surveillance. However, achieving a balance between real-time performance and segmentation accuracy remains a significant challenge. Although Fast-SCNN is favored for its efficiency and low computational complexity, it still faces difficulties when handling complex street scene images. To address this issue, this paper presents an improved Fast-SCNN, aiming to enhance the accuracy and efficiency of semantic segmentation by incorporating a novel attention mechanism and an enhanced feature extraction module. Firstly, the integrated SimAM (Simple, Parameter-Free Attention Module) increases the network’s sensitivity to critical regions of the image and effectively adjusts the feature space weights across channels. Additionally, the refined pyramid pooling module in the global feature extraction module captures a broader range of contextual information through refined pooling levels. During the feature fusion stage, the introduction of an enhanced DAB (Depthwise Asymmetric Bottleneck) block and SE (Squeeze-and-Excitation) attention optimizes the network’s ability to process multi-scale information. Furthermore, the classifier module is extended by incorporating deeper convolutions and more complex convolutional structures, leading to a further improvement in model performance. These enhancements significantly improve the model’s ability to capture details and overall segmentation performance. Experimental results demonstrate that the proposed method excels in processing complex street scene images, achieving a mean Intersection over Union (mIoU) of 71.7% and 69.4% on the Cityscapes and CamVid datasets, respectively, while maintaining inference speeds of 81.4 fps and 113.6 fps. These results indicate that the proposed model effectively improves segmentation quality in complex street scenes while ensuring real-time processing capabilities. Full article
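The SimAM module named in the abstract is parameter-free and fits in a few lines; the sketch below follows the original SimAM energy formulation, though the exact variant integrated into this improved Fast-SCNN may differ.

```python
import torch

def simam(x: torch.Tensor, e_lambda: float = 1e-4) -> torch.Tensor:
    """Parameter-free SimAM attention (sketch of the original formulation).
    Each neuron is weighted by an energy term measuring how much it stands
    out from the other neurons in its channel. x: (N, C, H, W)."""
    n = x.shape[2] * x.shape[3] - 1
    d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)   # squared deviation per neuron
    v = d.sum(dim=(2, 3), keepdim=True) / n             # channel-wise variance estimate
    e_inv = d / (4 * (v + e_lambda)) + 0.5
    return x * torch.sigmoid(e_inv)
```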

21 pages, 6785 KB  
Article
Multi-Granularity Aggregation with Spatiotemporal Consistency for Video-Based Person Re-Identification
by Hean Sung Lee, Minjung Kim, Sungjun Jang, Han Byeol Bae and Sangyoun Lee
Sensors 2024, 24(7), 2229; https://doi.org/10.3390/s24072229 - 30 Mar 2024
Cited by 2 | Viewed by 1780
Abstract
Video-based person re-identification (ReID) aims to exploit relevant features from spatial and temporal knowledge. Widely used methods include the part- and attention-based approaches for suppressing irrelevant spatial–temporal features. However, it is still challenging to overcome inconsistencies across video frames due to occlusion and imperfect detection. These mismatches make temporal processing ineffective and create an imbalance of crucial spatial information. To address these problems, we propose the Spatiotemporal Multi-Granularity Aggregation (ST-MGA) method, which is specifically designed to accumulate relevant features with spatiotemporally consistent cues. The proposed framework consists of three main stages: extraction, which extracts spatiotemporally consistent partial information; augmentation, which augments the partial information with different granularity levels; and aggregation, which effectively aggregates the augmented spatiotemporal information. We first introduce the consistent part-attention (CPA) module, which extracts spatiotemporally consistent and well-aligned attentive parts. Sub-parts derived from CPA provide temporally consistent semantic information, solving misalignment problems in videos due to occlusion or inaccurate detection, and maximize the efficiency of aggregation through uniform partial information. To enhance the diversity of spatial and temporal cues, we introduce the Multi-Attention Part Augmentation (MA-PA) block, which incorporates fine parts at various granular levels, and the Long-/Short-term Temporal Augmentation (LS-TA) block, designed to capture both long- and short-term temporal relations. Using densely separated part cues, ST-MGA fully exploits and aggregates the spatiotemporal multi-granular patterns by comparing relations between parts and scales. In the experiments, the proposed ST-MGA renders state-of-the-art performance on several video-based ReID benchmarks (i.e., MARS, DukeMTMC-VideoReID, and LS-VID). Full article
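As a loose illustration of pooling parts at several granularities from video features, a sketch along these lines could be used; the stripe-based split and granularity levels are assumptions and do not reproduce the paper's CPA or MA-PA modules.

```python
import torch

def multi_granularity_parts(feat: torch.Tensor,
                            granularities=(1, 2, 4)) -> list[torch.Tensor]:
    """Illustrative multi-granularity part pooling for video ReID features.
    feat: (B, T, C, H, W). For each granularity g, frame features are split
    into g horizontal stripes and average-pooled into per-part descriptors."""
    b, t, c, h, w = feat.shape
    parts = []
    for g in granularities:
        assert h % g == 0, "feature height must be divisible by granularity"
        stripes = feat.reshape(b, t, c, g, h // g, w)   # split height into g stripes
        parts.append(stripes.mean(dim=(4, 5)))          # (B, T, C, g) per-stripe descriptor
    return parts
```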

16 pages, 2701 KB  
Article
Muti-Frame Point Cloud Feature Fusion Based on Attention Mechanisms for 3D Object Detection
by Zhenyu Zhai, Qiantong Wang, Zongxu Pan, Zhentong Gao and Wenlong Hu
Sensors 2022, 22(19), 7473; https://doi.org/10.3390/s22197473 - 2 Oct 2022
Cited by 12 | Viewed by 4113
Abstract
Object detection from continuous frames of point clouds is a new research direction. Currently, most studies fuse multi-frame point clouds with concatenation-based methods, aligning the frames using GPS, IMU, and similar information. However, this kind of fusion can align only static objects, not moving ones. In this paper, we propose a non-local-based multi-scale feature fusion method that handles both moving and static objects without GPS- or IMU-based registration. Because non-local methods are resource-consuming, we also propose a novel simplified non-local block that exploits the sparsity of the point cloud; by filtering out empty units, memory consumption decreases by 99.93%. In addition, triple attention is adopted to enhance key information on the object and suppress background noise, further benefiting non-local feature fusion. Finally, we verify the method on PointPillars and CenterPoint. Experimental results show that the proposed method improves mAP by 3.9% and 4.1% compared with the concatenation-based fusion baselines PointPillars-2 and CenterPoint-2, respectively, and outperforms the powerful 3D-VID by 1.2% mAP. Full article
(This article belongs to the Special Issue Artificial Intelligence and Smart Sensors for Autonomous Driving)
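The abstract's key memory saving comes from restricting non-local attention to non-empty cells of the sparse point-cloud grid; a minimal sketch of that idea (the threshold, feature layout, and normalization are assumptions) is shown below.

```python
import torch

def sparse_non_local(feat: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Sketch of a non-local block restricted to non-empty BEV cells, in the
    spirit of the simplified block described in the abstract.
    feat: (C, H, W) pillar/BEV features."""
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)                        # (C, N)
    occupied = flat.abs().sum(dim=0) > eps               # keep only non-empty cells
    x = flat[:, occupied]                                # (C, M), M << N in sparse scenes
    attn = torch.softmax(x.t() @ x / c ** 0.5, dim=-1)   # (M, M) affinity among occupied cells only
    out = flat.clone()
    out[:, occupied] = x + x @ attn.t()                  # residual non-local aggregation
    return out.reshape(c, h, w)
```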

18 pages, 2706 KB  
Article
An Intelligent Tracking System for Moving Objects in Dynamic Environments
by Nada Ali Hakami, Hanan Ahmed Hosni Mahmoud and Abeer Abdulaziz AlArfaj
Actuators 2022, 11(10), 274; https://doi.org/10.3390/act11100274 - 25 Sep 2022
Cited by 2 | Viewed by 2189
Abstract
Localization of suspicious moving objects in dynamic environments requires high-accuracy mapping. A deep learning model is proposed to track moving objects that cross in the opposite direction. Trajectory measurements of moving objects are computed from the space within the image boundaries of the intersecting cameras, and object appearance is described by color and texture histograms in the intersecting camera views. Incorrect mapping of moving objects through synchronized localization can increase considerably in complex areas because of unfit points triggered by moving targets. To address this problem, a robust model using the dynamic province rejection (DPR) technique is presented. The proposed model combines a deep learning method with a tracking system that rejects dynamic areas lying outside the environment boundary of interest. The technique detects dynamic points from sequential video images, partitions the current image into super blocks, and tags the border differences. In the last stage, dynamic areas are computed from the dynamic points and super-block boundaries, and the static regions are used to compute positions, enhancing the path computation precision of the model. Simulation results show that the introduced model outperforms comparable state-of-the-art models on both the VID and MOVSD4 datasets and surpasses state-of-the-art tracking systems in speed. The experiments show that the computed path error in the dynamic setting can be decreased by 81%. Full article
(This article belongs to the Special Issue Advanced Technologies and Applications in Robotics)
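A minimal sketch of the super-block-based dynamic-region rejection described in the abstract might look like the following; the block size and difference threshold are hypothetical, not the paper's tuned values.

```python
import numpy as np

def reject_dynamic_blocks(prev_gray: np.ndarray, curr_gray: np.ndarray,
                          block: int = 32, thresh: float = 12.0) -> np.ndarray:
    """Illustrative dynamic-region rejection: tile the frame into super blocks,
    flag blocks whose mean inter-frame difference exceeds a threshold, and
    return a boolean mask of static pixels to keep for localization."""
    h, w = curr_gray.shape
    diff = np.abs(curr_gray.astype(np.float32) - prev_gray.astype(np.float32))
    keep = np.ones((h, w), dtype=bool)
    for y in range(0, h, block):
        for x in range(0, w, block):
            if diff[y:y + block, x:x + block].mean() > thresh:
                keep[y:y + block, x:x + block] = False   # dynamic block: excluded
    return keep
```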

12 pages, 268 KB  
Article
Productivity and Quality of Garlic Produced Using Below-Zero Temperatures When Treating Seed Cloves
by José Magno Queiroz Luz, Breno Nunes Rodrigues de Azevedo, Sérgio Macedo Silva, Carlos Inácio Garcia de Oliveira, Túlio Garcia de Oliveira, Roberta Camargos de Oliveira and Renata Castoldi
Horticulturae 2022, 8(2), 96; https://doi.org/10.3390/horticulturae8020096 - 21 Jan 2022
Cited by 9 | Viewed by 4007
Abstract
Garlic cultivation has increased in Brazil in recent years primarily due to the adoption of appropriate technologies, such as the use of low temperatures during the maintenance of garlic seeds to overcome dormancy. However, there is no information on the effects of below-zero temperatures when treating seed cloves on garlic development. Therefore, this study’s objective was to evaluate the effects of below-zero temperatures and different visual indices of overcoming dormancy (VIDs) on garlic performance in Cristalina County, Goias State, Brazil. The experiment was conducted in a randomized block design with four replicates in a 2 × 3 factorial scheme: with two VIDs (40% and 60%), and three temperature ranges (−1 to −3 °C, 1 to 3 °C, and 2 to 4 °C). Vegetative characteristics, bulbar ratios, and commercial bulb yields were evaluated. The results showed that below-zero temperatures resulted in better vegetative characteristics. The yield increased after using below-zero temperatures to treat seed cloves with a VID of 60%. The garlic produced had a higher market value. We concluded that there is an enormous potential for using below-zero temperatures to improve the performance of the “Ito” garlic variety, and more studies should be conducted with other varieties of economic importance to enhance Brazilian garlic production. Full article
15 pages, 854 KB  
Article
DSTnet: Deformable Spatio-Temporal Convolutional Residual Network for Video Super-Resolution
by Anusha Khan, Allah Bux Sargano and Zulfiqar Habib
Mathematics 2021, 9(22), 2873; https://doi.org/10.3390/math9222873 - 12 Nov 2021
Cited by 1 | Viewed by 2839
Abstract
Video super-resolution (VSR) aims at generating high-resolution (HR) video frames with plausible and temporally consistent details using their low-resolution (LR) counterparts, and neighboring frames. The key challenge for VSR lies in the effective exploitation of intra-frame spatial relation and temporal dependency between consecutive frames. Many existing techniques utilize spatial and temporal information separately and compensate motion via alignment. These methods cannot fully exploit the spatio-temporal information that significantly affects the quality of resultant HR videos. In this work, a novel deformable spatio-temporal convolutional residual network (DSTnet) is proposed to overcome the issues of separate motion estimation and compensation methods for VSR. The proposed framework consists of 3D convolutional residual blocks decomposed into spatial and temporal (2+1) D streams. This decomposition can simultaneously utilize input video’s spatial and temporal features without a separate motion estimation and compensation module. Furthermore, the deformable convolution layers have been used in the proposed model that enhances its motion-awareness capability. Our contribution is twofold; firstly, the proposed approach can overcome the challenges in modeling complex motions by efficiently using spatio-temporal information. Secondly, the proposed model has fewer parameters to learn than state-of-the-art methods, making it a computationally lean and efficient framework for VSR. Experiments are conducted on a benchmark Vid4 dataset to evaluate the efficacy of the proposed approach. The results demonstrate that the proposed approach achieves superior quantitative and qualitative performance compared to the state-of-the-art methods. Full article
(This article belongs to the Special Issue Computer Graphics, Image Processing and Artificial Intelligence)
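The (2+1)D decomposition mentioned in the abstract factorizes a 3D convolution into a spatial and a temporal convolution; a minimal residual-block sketch (channel widths and the placement of DSTnet's deformable layers are assumptions) follows.

```python
import torch.nn as nn

class R2Plus1DBlock(nn.Module):
    """Sketch of a (2+1)D residual block: a 3D convolution factorized into a
    spatial (1x3x3) conv followed by a temporal (3x1x1) conv, so spatial and
    temporal features are modeled without a separate motion-compensation step."""
    def __init__(self, channels: int):
        super().__init__()
        self.spatial = nn.Conv3d(channels, channels, kernel_size=(1, 3, 3), padding=(0, 1, 1))
        self.temporal = nn.Conv3d(channels, channels, kernel_size=(3, 1, 1), padding=(1, 0, 0))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):  # x: (B, C, T, H, W)
        out = self.temporal(self.act(self.spatial(x)))
        return self.act(x + out)
```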
