Search Results (5)

Search Parameters:
Keywords = portrait matting

17 pages, 6315 KB  
Article
RVM+: An AI-Driven Vision Sensor Framework for High-Precision, Real-Time Video Portrait Segmentation with Enhanced Temporal Consistency and Optimized Model Design
by Na Tang, Yuehui Liao, Yu Chen, Guang Yang, Xiaobo Lai and Jing Chen
Sensors 2025, 25(5), 1278; https://doi.org/10.3390/s25051278 - 20 Feb 2025
Viewed by 2123
Abstract
Video portrait segmentation is essential for intelligent sensing systems, including human-computer interaction, autonomous navigation, and augmented reality. However, dynamic video environments introduce significant challenges, such as temporal variations, occlusions, and computational constraints. This study introduces RVM+, an enhanced video segmentation framework based on the Robust Video Matting (RVM) architecture. By incorporating Convolutional Gated Recurrent Units (ConvGRU), RVM+ improves temporal consistency and captures intricate temporal dynamics across video frames. Additionally, a novel knowledge distillation strategy reduces computational demands while maintaining high segmentation accuracy, making the framework well suited to real-time applications in resource-constrained environments. Comprehensive evaluations on challenging datasets show that RVM+ outperforms state-of-the-art methods in both segmentation accuracy and temporal consistency, with key performance indicators such as MIoU, SAD, and dtSSD verifying the robustness and efficiency of the model. The integration of knowledge distillation yields a streamlined design with negligible accuracy trade-offs, highlighting its suitability for practical deployment. This study makes significant strides in intelligent sensor technology, providing a high-performance, efficient, and scalable solution for video segmentation. RVM+ offers potential for applications in fields such as augmented reality, robotics, and real-time video analysis, while also advancing the development of AI-enabled vision sensors.
(This article belongs to the Section Intelligent Sensors)
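
For readers unfamiliar with the recurrent unit the abstract credits for temporal consistency, here is a minimal PyTorch sketch of a ConvGRU cell using the standard gate formulation; all names are illustrative and not taken from the paper's code.

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """Minimal convolutional GRU cell: convolutions replace the dense
    matrix products of a standard GRU, so the hidden state keeps its
    spatial layout across video frames."""
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        padding = kernel_size // 2
        # Update and reset gates, computed jointly from [input, hidden].
        self.gates = nn.Conv2d(2 * channels, 2 * channels, kernel_size, padding=padding)
        # Candidate hidden state.
        self.candidate = nn.Conv2d(2 * channels, channels, kernel_size, padding=padding)

    def forward(self, x, h=None):
        if h is None:
            h = torch.zeros_like(x)
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        h_tilde = torch.tanh(self.candidate(torch.cat([x, r * h], dim=1)))
        # Blend previous state and candidate under the update gate.
        return (1 - z) * h + z * h_tilde
```

Applied per frame, the cell carries a hidden feature map forward in time, which is the mechanism that lets such a model smooth predictions across frames rather than segmenting each frame independently.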

13 pages, 2924 KB  
Article
Matting Algorithm with Improved Portrait Details for Images with Complex Backgrounds
by Rui Li, Dan Zhang, Sheng-Ling Geng and Ming-Quan Zhou
Appl. Sci. 2024, 14(5), 1942; https://doi.org/10.3390/app14051942 - 27 Feb 2024
Cited by 1 | Viewed by 4145
Abstract
With the continuous development of virtual reality and digital image applications, video of complex scenes has proliferated, and portrait matting has accordingly become a popular topic. In this paper, a new matting algorithm with improved portrait details for images with complex backgrounds (MORLIPO) is proposed. This work combines a background restoration module (BRM) and a fine-grained matting module (FGMatting) to achieve high-detail matting for images with complex backgrounds. The background is recovered from a single input image or video and serves as a prior that aids in generating a more accurate alpha matte. The main framework combines the image matting model MODNet, the lightweight MobileNetV2 network, and the background restoration module, which preserves the background information of the current image and, for video, supplies the background prior of the previous frame to predict the alpha matte of the current frame more accurately. The fine-grained matting module extracts and retains fine-grained foreground details and is combined with the semantic module to achieve more accurate matting. The design allows end-to-end training on a single NVIDIA 3090 GPU, and experiments on publicly available datasets show that the method performs well on both visual quality and objective evaluation metrics.
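
The central idea of the abstract, conditioning the matting network on a recovered background, can be sketched as follows. This is a minimal illustration, not MORLIPO's actual interface: `BackgroundPriorMatting`, the injected backbone, and the 6-channel layout are all assumptions.

```python
import torch
import torch.nn as nn

class BackgroundPriorMatting(nn.Module):
    """Sketch: predict an alpha matte from the current frame plus a
    recovered background prior. The backbone is a stand-in for the
    MODNet/MobileNetV2 stack described in the abstract."""
    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.backbone = backbone  # maps 6-channel input -> 1-channel logits

    def forward(self, frame, recovered_bg):
        # Stack the frame with the background prior along channels, so the
        # network can treat the foreground as "what differs from background".
        x = torch.cat([frame, recovered_bg], dim=1)  # (N, 6, H, W)
        return torch.sigmoid(self.backbone(x))       # alpha in [0, 1]

# Trivial stand-in backbone, only to make the sketch runnable:
model = BackgroundPriorMatting(nn.Conv2d(6, 1, 3, padding=1))
alpha = model(torch.rand(1, 3, 256, 256), torch.rand(1, 3, 256, 256))
```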

15 pages, 2064 KB  
Article
Portrait Semantic Segmentation Method Based on Dual Modal Information Complementarity
by Guang Feng and Chong Tang
Appl. Sci. 2024, 14(4), 1439; https://doi.org/10.3390/app14041439 - 9 Feb 2024
Viewed by 1585
Abstract
Semantic segmentation of human images is a research hotspot in the field of computer vision. At present, semantic segmentation models based on U-net generally lack the ability to capture the spatial information of images, and semantic incompatibility arises because the feature maps of the encoder and decoder are directly connected in the skip connection stage. In addition, in low-light scenes such as at night, false segmentation occurs easily and segmentation accuracy degrades. To solve these problems, a portrait semantic segmentation method based on dual-modal information complementarity is proposed. The encoder adopts a dual-branch structure and uses an SK-ASSP module that can adaptively adjust the convolution weights of different receptive fields to extract features from the RGB and grayscale image modalities respectively, carrying out cross-modal information complementation and feature fusion. A hybrid attention mechanism is used in the skip connection stage to capture both the channel and coordinate context information of the image. Experiments on a human matting dataset show that the PA and MIoU of this model reach 96.58% and 94.48% respectively, better than the U-net baseline and other mainstream semantic segmentation models.
(This article belongs to the Section Computing and Artificial Intelligence)
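
The dual-branch idea can be illustrated with a small sketch: encode the RGB and grayscale views separately, then fuse the features so each modality complements the other. The plain convolutions below are simplified stand-ins for the paper's SK-ASSP blocks and hybrid attention; only the two-branch structure is taken from the abstract.

```python
import torch
import torch.nn as nn

class DualModalFusion(nn.Module):
    """Sketch of a dual-branch encoder stage: separate RGB and grayscale
    branches whose features are concatenated and fused."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.rgb_branch = nn.Conv2d(3, channels, 3, padding=1)
        self.gray_branch = nn.Conv2d(1, channels, 3, padding=1)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, rgb):
        # Derive the grayscale modality from RGB (ITU-R BT.601 weights).
        gray = 0.299 * rgb[:, 0:1] + 0.587 * rgb[:, 1:2] + 0.114 * rgb[:, 2:3]
        f_rgb = torch.relu(self.rgb_branch(rgb))
        f_gray = torch.relu(self.gray_branch(gray))
        # Cross-modal fusion: both feature sets contribute to the output.
        return self.fuse(torch.cat([f_rgb, f_gray], dim=1))
```

The grayscale branch is less sensitive to color shifts, which is a plausible reason a luminance-driven branch helps in low-light scenes where RGB cues degrade.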

12 pages, 3600 KB  
Article
Bust Portraits Matting Based on Improved U-Net
by Honggang Xie, Kaiyuan Hou, Di Jiang and Wanjie Ma
Electronics 2023, 12(6), 1378; https://doi.org/10.3390/electronics12061378 - 14 Mar 2023
Cited by 4 | Viewed by 3142
Abstract
Extracting complete portrait foregrounds from natural images is widely used in image editing and high-definition map generation. When making high-definition maps, it is often necessary to matte out passers-by to guarantee their privacy. Current matting methods that do not require additional trimap inputs often suffer from inaccurate global predictions or blurred local details. As a soft segmentation method, portrait matting can create excess regions during segmentation, which inevitably introduces noise and superfluous foreground information into the resulting alpha image, so not all of these regions need to be kept. To overcome these problems, this paper designs a contour sharpness refining network (CSRN) that modifies the weight of the alpha values in uncertain regions of the prediction map, and combines it with an end-to-end bust-matting network based on a U-Net architecture containing Residual U-blocks. The network effectively reduces image noise without affecting the complete foreground information obtained by the deeper network, thus producing foreground images with fine edge details. The network has been tested on PPM-100, RealWorldPortrait-636, and a self-built dataset, showing excellent performance in both edge refinement and global prediction for half-figure portraits.
(This article belongs to the Special Issue Computer Vision for Modern Vehicles)
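
To make the notion of "re-weighting alpha values in uncertain regions" concrete, here is a hand-written sketch: pixels whose alpha sits far from 0 or 1 are pushed toward the nearer extreme to sharpen contours. The thresholds and power-law rule are illustrative assumptions, not CSRN's learned refinement.

```python
import torch

def refine_uncertain_alpha(alpha, low=0.05, high=0.95, gamma=2.0):
    """Sharpen an alpha matte only inside its uncertain band.

    alpha: tensor of alpha values in [0, 1].
    low/high: bounds of the "uncertain" band (hypothetical values).
    gamma: sharpening exponent; larger pushes harder toward 0 or 1.
    """
    uncertain = (alpha > low) & (alpha < high)
    # Power-law sharpening, symmetric about 0.5.
    sharpened = torch.where(
        alpha < 0.5,
        0.5 * (2 * alpha) ** gamma,            # push low values toward 0
        1 - 0.5 * (2 * (1 - alpha)) ** gamma,  # push high values toward 1
    )
    # Confident pixels (near 0 or 1) pass through unchanged.
    return torch.where(uncertain, sharpened, alpha)
```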

18 pages, 21778 KB  
Article
Semi-Supervised Portrait Matting via the Collaboration of Teacher–Student Network and Adaptive Strategies
by Xinyue Zhang, Guodong Wang, Chenglizhao Chen, Hao Dong and Mingju Shao
Electronics 2022, 11(24), 4080; https://doi.org/10.3390/electronics11244080 - 8 Dec 2022
Cited by 1 | Viewed by 1942
Abstract
In the portrait matting domain, existing methods rely entirely on annotated images for learning. However, delicate manual annotations are time-consuming, and few detailed datasets are available. To reduce complete dependency on labeled datasets, we design a semi-supervised network (ASSN) with two kinds of innovative adaptive strategies for portrait matting. Three pivotal sub-modules are embedded in our architecture: a static teacher network (S-TN), a static student network (S-SN), and an adaptive student network (A-SN). S-TN and S-SN are trained with a small number of high-quality labeled samples, and A-SN shares its module parameters with S-SN. When processing unlabeled datasets, A-SN adopts the proposed adaptive strategies to discard the dependence on labeled data. The adaptive strategies include: (i) an auxiliary adaptation, in which the more elaborate teacher network not only provides alpha mattes for the adaptive student network but also transmits rough segmentation results and edge maps as optimization reference standards; and (ii) a self-adjusting adaptation, in which the adaptive network applies self-supervision suited to the characteristics of its different layers. In addition, we have produced a finely annotated dataset for scholars in the field. Compared with existing datasets, ours adds two types of data neglected previously: (i) images containing multiple people and (ii) images under low-light conditions.
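
The core teacher-student signal on unlabeled data can be sketched in a few lines: the frozen teacher's alpha matte acts as a pseudo-label for the student. This is a generic semi-supervised pattern, with ASSN's auxiliary cues (rough segmentation, edge maps) and adaptive weighting omitted; `teacher` and `student` stand for any matting networks mapping images to alpha mattes.

```python
import torch
import torch.nn.functional as F

def unlabeled_consistency_loss(teacher, student, images):
    """Teacher-student loss on unlabeled images.

    The teacher runs without gradients, so only the student is updated
    toward the teacher's predictions (pseudo-labels).
    """
    with torch.no_grad():
        pseudo_alpha = teacher(images)   # pseudo-label, no gradient flow
    student_alpha = student(images)
    return F.l1_loss(student_alpha, pseudo_alpha)
```

In a full training loop this term would be added to the supervised loss computed on the small labeled set, so the student benefits from both sources.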
