Visual Sensors

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Physical Sensors".

Deadline for manuscript submissions: closed (31 January 2019) | Viewed by 177726

Printed Edition Available!
A printed edition of this Special Issue is available here.

Special Issue Editors


Guest Editor
System Engineering and Automation Department, Miguel Hernandez University, 03202 Elche, Spain
Interests: computer vision; omnidirectional imaging; appearance descriptors; image processing; mobile robotics; environment modeling; visual localization


Special Issue Information

Dear Colleagues,

Visual sensors are able to capture a large quantity of information from the environment around them. Nowadays, a wide variety of visual systems can be found, from classical monocular systems to omnidirectional, RGB-D and more sophisticated 3D systems. Each configuration presents specific characteristics that make it useful for solving different problems. Their range of applications is wide and varied. Amongst them, we can find robotics, industry, agriculture, quality control, visual inspection, surveillance, autonomous driving and navigation aid systems.

Visual systems can be used to obtain relevant information from the environment, which can be processed to solve a specific problem. The aim of this Special Issue is to present some of the possibilities that vision systems offer, focusing on the different configurations that can be used and novel applications in any field. Furthermore, reviews presenting a deep analysis of a specific problem and the use of vision systems to address it would also be appropriate.

This Special Issue invites contributions on the following topics (but is not limited to them):

  • Image analysis.
  • Visual pattern recognition.
  • Object recognition by visual sensors.
  • Movement estimation or registration from images.
  • Visual sensors in robotics.
  • Visual sensors in industrial applications.
  • Computer vision for quality evaluation.
  • Visual sensors in agriculture.
  • Computer vision in autonomous driving.
  • Environment modeling and reconstruction from images.
  • Visual Localization.
  • Visual SLAM.

Prof. Dr. Oscar Reinoso
Dr. Luis Payá
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • 3D imaging
  • Stereo visual systems
  • Omnidirectional visual systems
  • Quality assessment
  • Pattern recognition
  • Visual registration
  • Visual navigation
  • Visual mapping
  • LiDAR/vision system
  • Multi-visual sensors
  • RGB-D cameras
  • Fusion of visual information
  • Networks of visual sensors

Published Papers (37 papers)


Editorial

Jump to: Research

5 pages, 157 KiB  
Editorial
Special Issue on Visual Sensors
by Oscar Reinoso and Luis Payá
Sensors 2020, 20(3), 910; https://doi.org/10.3390/s20030910 - 08 Feb 2020
Cited by 8 | Viewed by 2223
Abstract
Visual sensors have characteristics that make them interesting as sources of information for any process or system [...] Full article
(This article belongs to the Special Issue Visual Sensors)

Research

Jump to: Editorial

15 pages, 6147 KiB  
Article
An Improved Point Cloud Descriptor for Vision Based Robotic Grasping System
by Fei Wang, Chen Liang, Changlei Ru and Hongtai Cheng
Sensors 2019, 19(10), 2225; https://doi.org/10.3390/s19102225 - 14 May 2019
Cited by 17 | Viewed by 3524
Abstract
In this paper, a novel global point cloud descriptor is proposed for reliable object recognition and pose estimation, which can be effectively applied to robot grasping operations. The viewpoint feature histogram (VFH) is widely used in three-dimensional (3D) object recognition and pose estimation in real scenes obtained by depth sensors because of its recognition performance and computational efficiency. However, when the object has a mirrored structure, it is often difficult to distinguish the mirrored poses relative to the viewpoint using VFH. To address this difficulty, this study presents an improved feature descriptor named the orthogonal viewpoint feature histogram (OVFH), which contains two components: a surface shape component and an improved viewpoint direction component. The improved viewpoint component is calculated from the orthogonal vector of the viewpoint direction, which is obtained based on the reference frame estimated for the entire point cloud. The evaluation of OVFH using a publicly available data set indicates that it enhances the ability to distinguish between mirrored poses while ensuring object recognition performance. The proposed method uses OVFH to recognize and register objects in the database and obtains precise poses by using the iterative closest point (ICP) algorithm. The experimental results show that the proposed approach can be effectively applied to guide the robot to grasp objects with mirrored poses. Full article
(This article belongs to the Special Issue Visual Sensors)
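
To make the descriptor discussion concrete, below is a minimal NumPy sketch of a viewpoint-direction histogram, the component that OVFH modifies by histogramming normals against a vector orthogonal to the viewpoint direction. The bin count, toy normals and direction vectors are placeholders; this illustrates the general idea, not the authors' implementation.

```python
import numpy as np

def direction_histogram(normals, direction, bins=45):
    """Histogram of cos(angle) between unit normals (N, 3) and a unit direction (3,)."""
    direction = direction / np.linalg.norm(direction)
    cosines = normals @ direction                 # one value per point, in [-1, 1]
    hist, _ = np.histogram(cosines, bins=bins, range=(-1.0, 1.0))
    return hist / max(hist.sum(), 1)              # normalise so clouds of any size compare

# Toy cloud: random unit normals, a viewpoint direction along +z and an orthogonal reference.
rng = np.random.default_rng(0)
normals = rng.normal(size=(1000, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
viewpoint_dir = np.array([0.0, 0.0, 1.0])
orthogonal_dir = np.array([1.0, 0.0, 0.0])        # an orthogonal reference, as OVFH suggests

h_vfh_like = direction_histogram(normals, viewpoint_dir)
h_ovfh_like = direction_histogram(normals, orthogonal_dir)
```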

24 pages, 7366 KiB  
Article
An Optimized Tightly-Coupled VIO Design on the Basis of the Fused Point and Line Features for Patrol Robot Navigation
by Linlin Xia, Qingyu Meng, Deru Chi, Bo Meng and Hanrui Yang
Sensors 2019, 19(9), 2004; https://doi.org/10.3390/s19092004 - 29 Apr 2019
Cited by 6 | Viewed by 3820
Abstract
The development and maturation of simultaneous localization and mapping (SLAM) in robotics opens the door to the application of a visual inertial odometry (VIO) to the robot navigation system. For a patrol robot with no available Global Positioning System (GPS) support, the embedded VIO components, which are generally composed of an Inertial Measurement Unit (IMU) and a camera, fuse the inertial recursion with SLAM calculation tasks, and enable the robot to estimate its location within a map. The highlights of the optimized VIO design lie in the simplified VIO initialization strategy as well as the fused point and line feature-matching based method for efficient pose estimates in the front-end. With a tightly-coupled VIO architecture, the system state is explicitly expressed in a vector and further estimated by the state estimator. The consequent problems associated with the data association, state optimization, sliding window and timestamp alignment in the back-end are discussed in detail. Dataset tests and real substation scene tests are conducted, and the experimental results indicate that the proposed VIO can realize accurate pose estimation with favorable initialization efficiency and good map representations in the environments of concern. The proposed VIO design can therefore serve as a reference for visual-inertial SLAM applications in which no external location reference is available. Full article
(This article belongs to the Special Issue Visual Sensors)

23 pages, 6641 KiB  
Article
Star Image Prediction and Restoration under Dynamic Conditions
by Di Liu, Xiyuan Chen, Xiao Liu and Chunfeng Shi
Sensors 2019, 19(8), 1890; https://doi.org/10.3390/s19081890 - 20 Apr 2019
Cited by 15 | Viewed by 3050
Abstract
The star sensor is widely used in attitude control systems of spacecraft for attitude measurement. However, under high dynamic conditions, frame loss and smearing of the star image may appear and result in decreased accuracy or even failure of the star centroid extraction and attitude determination. To improve the performance of the star sensor under dynamic conditions, a gyroscope-assisted star image prediction method and an improved Richardson-Lucy (RL) algorithm based on the ensemble back-propagation neural network (EBPNN) are proposed. First, for the frame loss problem of the star sensor, considering the distortion of the star sensor lens, a prediction model of the star spot position is obtained from the angular rates of the gyroscope. Second, to restore the smeared star image, the point spread function (PSF) is calculated from the angular velocity of the gyroscope. Then, we use the EBPNN to predict the number of iterations required by the RL algorithm to complete the star image deblurring. Finally, simulation experiments are performed to verify the effectiveness and real-time performance of the proposed algorithm. Full article
(This article belongs to the Special Issue Visual Sensors)
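
As background for the restoration step above, the following is a minimal Richardson-Lucy deconvolution loop. In the paper the PSF is computed from gyroscope angular velocity and the iteration count is predicted by the EBPNN; this sketch replaces both with fixed placeholders and is not the authors' code.

```python
import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy(blurred, psf, n_iter=30):
    """Plain RL iterations: estimate *= conv(blurred / conv(estimate, psf), psf_mirror)."""
    estimate = np.full_like(blurred, 0.5)
    psf_mirror = psf[::-1, ::-1]
    for _ in range(n_iter):
        denom = fftconvolve(estimate, psf, mode="same")
        ratio = blurred / np.maximum(denom, 1e-12)
        estimate *= fftconvolve(ratio, psf_mirror, mode="same")
    return estimate

# Placeholder linear-motion PSF (a short horizontal streak) and a stand-in smeared image.
psf = np.zeros((9, 9))
psf[4, 2:7] = 1.0
psf /= psf.sum()
blurred = np.random.default_rng(1).random((64, 64))
restored = richardson_lucy(blurred, psf, n_iter=30)
```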

15 pages, 24172 KiB  
Article
Vision for Robust Robot Manipulation
by Ester Martinez-Martin and Angel P. del Pobil
Sensors 2019, 19(7), 1648; https://doi.org/10.3390/s19071648 - 06 Apr 2019
Cited by 9 | Viewed by 4745
Abstract
Advances in Robotics are leading to a new generation of assistant robots working in ordinary, domestic settings. This evolution raises new challenges in the tasks to be accomplished by the robots. This is the case for object manipulation where the detect-approach-grasp loop requires a robust recovery stage, especially when the held object slides. Several proprioceptive sensors have been developed in the last decades, such as tactile sensors or contact switches, that can be used for that purpose; nevertheless, their implementation may considerably restrict the gripper’s flexibility and functionality, increasing their cost and complexity. Alternatively, vision can be used since it is an undoubtedly rich source of information, and in particular, depth vision sensors. We present an approach based on depth cameras to robustly evaluate the manipulation success, continuously reporting about any object loss and, consequently, allowing it to robustly recover from this situation. For that, a Lab-colour segmentation allows the robot to identify potential robot manipulators in the image. Then, the depth information is used to detect any edge resulting from two-object contact. The combination of those techniques allows the robot to accurately detect the presence or absence of contact points between the robot manipulator and a held object. An experimental evaluation in realistic indoor environments supports our approach. Full article
(This article belongs to the Special Issue Visual Sensors)
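
The two visual cues combined above (Lab-space colour segmentation of the manipulator and depth discontinuities as contact evidence) can be sketched with OpenCV as follows; the threshold band, decision rule and synthetic frames are placeholders, not the authors' values.

```python
import numpy as np
import cv2

bgr = np.zeros((120, 160, 3), dtype=np.uint8)           # stand-in for the RGB frame
depth_mm = np.full((120, 160), 800, dtype=np.uint16)    # stand-in for the depth frame

lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
gripper_mask = cv2.inRange(lab, (0, 110, 110), (255, 145, 145))   # placeholder Lab band

depth_8u = cv2.convertScaleAbs(depth_mm, alpha=255.0 / 4000.0)    # compress 0-4 m into 8 bits
depth_edges = cv2.Canny(depth_8u, 30, 90)                         # contact/occlusion discontinuities

near_gripper = cv2.dilate(gripper_mask, np.ones((5, 5), np.uint8))
contact_evidence = cv2.bitwise_and(depth_edges, near_gripper)
holding_object = cv2.countNonZero(contact_evidence) > 50          # placeholder decision rule
```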

17 pages, 2804 KiB  
Article
2D Rotation-Angle Measurement Utilizing Least Iterative Region Segmentation
by Chenguang Cao and Qi Ouyang
Sensors 2019, 19(7), 1634; https://doi.org/10.3390/s19071634 - 05 Apr 2019
Cited by 4 | Viewed by 4051
Abstract
When geometric moments are used to measure the rotation-angle of plane workpieces, the same rotation angle would be obtained with dissimilar poses. Such a case would be shown as an error in an automatic sorting system. Here, we present an improved rotation-angle measurement method based on geometric moments, which is suitable for automatic sorting systems. The method can overcome this limitation to obtain accurate results. The accuracy, speed, and generality of this method are analyzed in detail. In addition, a rotation-angle measurement error model is established to study the effect of camera pose on the rotation-angle measurement accuracy. We find that a rotation-angle measurement error will occur with a non-ideal camera pose. Thus, a correction method is proposed to increase accuracy and reduce the measurement error caused by camera pose. Finally, an automatic sorting system is developed, and experiments are conducted to verify the effectiveness of our methods. The experimental results show that the rotation angles are accurately obtained and workpieces could be correctly placed by this system. Full article
(This article belongs to the Special Issue Visual Sensors)
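
For reference, the classical geometric-moment orientation estimate that this method builds on fits in a few lines; the binary test image is a placeholder, and the paper's region-segmentation refinement and camera-pose correction are not reproduced here.

```python
import numpy as np

def moment_orientation(binary_image):
    """Orientation from second-order central moments: 0.5 * atan2(2*mu11, mu20 - mu02)."""
    ys, xs = np.nonzero(binary_image)
    xc, yc = xs.mean(), ys.mean()                       # centroid
    mu20 = ((xs - xc) ** 2).mean()
    mu02 = ((ys - yc) ** 2).mean()
    mu11 = ((xs - xc) * (ys - yc)).mean()
    return 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)    # radians

img = np.zeros((100, 100), dtype=np.uint8)
img[40:60, 20:80] = 1                                   # an elongated, roughly horizontal workpiece
print(np.degrees(moment_orientation(img)))              # close to 0 degrees for this shape
```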

20 pages, 5941 KiB  
Article
A Convenient Calibration Method for LRF-Camera Combination Systems Based on a Checkerboard
by Zhuang Zhang, Rujin Zhao, Enhai Liu, Kun Yan and Yuebo Ma
Sensors 2019, 19(6), 1315; https://doi.org/10.3390/s19061315 - 15 Mar 2019
Cited by 7 | Viewed by 3244
Abstract
In this paper, a simple and easy high-precision calibration method is proposed for the LRF-camera combined measurement system which is widely used at present. This method can be applied not only to mainstream 2D and 3D LRF-cameras, but also to calibrate newly developed 1D LRF-camera combined systems. It only needs a calibration board to record at least three sets of data. First, the camera parameters and distortion coefficients are decoupled by the distortion center. Then, the spatial coordinates of laser spots are solved using line and plane constraints, and the estimation of LRF-camera extrinsic parameters is realized. In addition, we establish a cost function for optimizing the system. Finally, the calibration accuracy and characteristics of the method are analyzed through simulation experiments, and the validity of the method is verified through the calibration of a real system. Full article
(This article belongs to the Special Issue Visual Sensors)

18 pages, 8450 KiB  
Article
A Vision Based Detection Method for Narrow Butt Joints and a Robotic Seam Tracking System
by Boce Xue, Baohua Chang, Guodong Peng, Yanjun Gao, Zhijie Tian, Dong Du and Guoqing Wang
Sensors 2019, 19(5), 1144; https://doi.org/10.3390/s19051144 - 06 Mar 2019
Cited by 49 | Viewed by 4759
Abstract
Automatic joint detection is of vital importance for the teaching of robots before welding and the seam tracking during welding. For narrow butt joints, the traditional structured light method may be ineffective, and many existing detection methods designed for narrow butt joints can only detect their 2D position. However, for butt joints with narrow gaps and 3D trajectories, their 3D position and orientation of the workpiece surface are required. In this paper, a vision based detection method for narrow butt joints is proposed. A crosshair laser is projected onto the workpiece surface and an auxiliary light source is used to illuminate the workpiece surface continuously. Then, images with an appropriate grayscale distribution are grabbed with the auto exposure function of the camera. The 3D position of the joint and the normal vector of the workpiece surface are calculated by the combination of the 2D and 3D information in the images. In addition, the detection method is applied in a robotic seam tracking system for GTAW (gas tungsten arc welding). Different filtering methods are used to smooth the detection results, and compared with the moving average method, the Kalman filter can reduce the dithering of the robot and improve the tracking accuracy significantly. Full article
(This article belongs to the Special Issue Visual Sensors)
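
The filtering comparison mentioned at the end of the abstract can be illustrated with a scalar constant-position Kalman filter applied to a noisy joint-position sequence; the noise parameters and simulated measurements below are placeholders, not values from the paper.

```python
import numpy as np

def kalman_smooth(measurements, q=1e-4, r=4e-2):
    """Scalar Kalman filter with a constant-position model: predict, then correct."""
    x, p = measurements[0], 1.0          # state estimate and its variance
    out = []
    for z in measurements:
        p += q                           # predict: process noise only
        k = p / (p + r)                  # Kalman gain
        x += k * (z - x)                 # correct with the new joint-position measurement
        p *= 1.0 - k
        out.append(x)
    return np.array(out)

rng = np.random.default_rng(2)
true_path = np.linspace(0.0, 5.0, 200)                  # mm, a slowly drifting seam
noisy = true_path + rng.normal(scale=0.2, size=200)     # detector output with jitter
smoothed = kalman_smooth(noisy)
moving_avg = np.convolve(noisy, np.ones(9) / 9, mode="same")   # the baseline it is compared against
```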

13 pages, 9072 KiB  
Article
RGB-D SLAM with Manhattan Frame Estimation Using Orientation Relevance
by Liang Wang and Zhiqiu Wu
Sensors 2019, 19(5), 1050; https://doi.org/10.3390/s19051050 - 01 Mar 2019
Cited by 14 | Viewed by 2600
Abstract
Due to image noise, image blur, and inconsistency between depth data and color image, the accuracy and robustness of the pairwise spatial transformation computed by matching extracted features of detected key points in existing sparse Red Green Blue-Depth (RGB-D) Simultaneous Localization And Mapping (SLAM) algorithms are poor. Considering that most indoor environments follow the Manhattan World assumption and the Manhattan Frame can be used as a reference to compute the pairwise spatial transformation, a new RGB-D SLAM algorithm is proposed. It first performs the Manhattan Frame Estimation using the introduced concept of orientation relevance. Then the pairwise spatial transformation between two RGB-D frames is computed with the Manhattan Frame Estimation. Finally, the Manhattan Frame Estimation using orientation relevance is incorporated into the RGB-D SLAM to improve its performance. Experimental results show that the proposed RGB-D SLAM algorithm has definite improvements in accuracy, robustness, and runtime. Full article
(This article belongs to the Special Issue Visual Sensors)

22 pages, 3501 KiB  
Article
Boosting Texture-Based Classification by Describing Statistical Information of Gray-Levels Differences
by Óscar García-Olalla, Laura Fernández-Robles, Enrique Alegre, Manuel Castejón-Limas and Eduardo Fidalgo
Sensors 2019, 19(5), 1048; https://doi.org/10.3390/s19051048 - 01 Mar 2019
Cited by 7 | Viewed by 2668
Abstract
This paper presents a new texture descriptor booster, Complete Local Oriented Statistical Information Booster (CLOSIB), based on statistical information of the image. Our proposal uses the statistical information of the texture provided by the image gray-levels differences to increase the discriminative capability of Local Binary Patterns (LBP)-based and other texture descriptors. We demonstrated that Half-CLOSIB and M-CLOSIB versions are more efficient and precise than the general one. H-CLOSIB may eliminate redundant statistical information and the multi-scale version, M-CLOSIB, is more robust. We evaluated our method using four datasets: KTH TIPS (2-a) for material recognition, UIUC and USPTex for general texture recognition and JAFFE for face recognition. The results show that when we combine CLOSIB with well-known LBP-based descriptors, the hit rate increases in all the cases, introducing in this way the idea that CLOSIB can be used to enhance the description of texture in a significant number of situations. Additionally, a comparison with recent algorithms demonstrates that a combination of LBP methods with CLOSIB variants obtains comparable results to those of the state-of-the-art. Full article
(This article belongs to the Special Issue Visual Sensors)
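
A rough sketch of the kind of combination described above, an LBP histogram concatenated with simple statistics of gray-level differences, is shown below; the radius, bin count and random test image are placeholders, and the actual CLOSIB formulation differs in detail.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_plus_difference_stats(gray, radius=1, n_points=8):
    # Uniform LBP histogram (n_points + 2 possible codes).
    lbp = local_binary_pattern(gray, n_points, radius, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=n_points + 2, range=(0, n_points + 2), density=True)

    # Statistics of gray-level differences at the given radius (horizontal neighbours here).
    diffs = np.abs(gray[:, radius:].astype(float) - gray[:, :-radius].astype(float))
    stats = np.array([diffs.mean(), diffs.std()])       # the "statistical booster" idea
    return np.concatenate([lbp_hist, stats])

img = (np.random.default_rng(5).random((64, 64)) * 255).astype(np.uint8)
descriptor = lbp_plus_difference_stats(img)
```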

12 pages, 5003 KiB  
Article
A Stereo-Vision System for Measuring the Ram Speed of Steam Hammers in an Environment with a Large Field of View and Strong Vibrations
by Ran Chen, Zhongwei Li, Kai Zhong, Xingjian Liu, Yonghui Wu, Congjun Wang and Yusheng Shi
Sensors 2019, 19(5), 996; https://doi.org/10.3390/s19050996 - 26 Feb 2019
Cited by 15 | Viewed by 3566
Abstract
The ram speed of a steam hammer is an important parameter that directly affects the forming performance of forgings. This parameter must be monitored regularly in practical applications in industry. Because of the complex and dangerous industrial environment of forging equipment, non-contact measurement methods, such as stereo vision, might be optimal. However, in actual application, the field of view (FOV) required to measure the steam hammer is extremely large (2–3 m), and the heavy steam hammer, moving at high speed, usually causes strong vibration. These two factors degrade measurement accuracy and can even cause measurement failure. To solve these issues, a bundle-adjustment-principle-based system calibration method is proposed to realize high-accuracy calibration for a large FOV, which can obtain accurate calibration results even when the calibration target is not precisely manufactured. To decrease the influence of strong vibration, a stationary world coordinate system was built, and the external parameters were recalibrated during the entire measurement process. The accuracy and effectiveness of the proposed technique were verified by an experiment to measure the ram speed of a counterblow steam hammer in a die forging device. Full article
(This article belongs to the Special Issue Visual Sensors)

15 pages, 22543 KiB  
Article
High-Accuracy Globally Consistent Surface Reconstruction Using Fringe Projection Profilometry
by Xu Cheng, Xingjian Liu, Zhongwei Li, Kai Zhong, Liya Han, Wantao He, Wanbing Gan, Guoqing Xi, Congjun Wang and Yusheng Shi
Sensors 2019, 19(3), 668; https://doi.org/10.3390/s19030668 - 06 Feb 2019
Cited by 13 | Viewed by 4110
Abstract
This paper presents a high-accuracy method for globally consistent surface reconstruction using a single fringe projection profilometry (FPP) sensor. To solve the accumulated sensor pose estimation error problem encountered in a long scanning trajectory, we first present a novel 3D registration method which fuses both dense geometric and curvature consistency constraints to improve the accuracy of relative sensor pose estimation. Then we perform global sensor pose optimization by modeling the surface consistency information as a pre-computed covariance matrix and formulating the multi-view point cloud registration problem in a pose graph optimization framework. Experiments on reconstructing a 1300 mm × 400 mm workpiece with an FPP sensor are performed, verifying that our method can substantially reduce the accumulated error and achieve industrial-level surface model reconstruction without any external positional assistance, using only a single FPP sensor. Full article
(This article belongs to the Special Issue Visual Sensors)

22 pages, 2172 KiB  
Article
Appearance-Based Salient Regions Detection Using Side-Specific Dictionaries
by Mian Muhammad Sadiq Fareed, Qi Chun, Gulnaz Ahmed, Adil Murtaza, Muhammad Rizwan Asif and Muhammad Zeeshan Fareed
Sensors 2019, 19(2), 421; https://doi.org/10.3390/s19020421 - 21 Jan 2019
Cited by 2 | Viewed by 3536
Abstract
Image saliency detection is a very helpful step in many computer vision-based smart systems to reduce the computational complexity by only focusing on the salient parts of the image. Currently, the image saliency is detected through representation-based generative schemes, as these schemes are helpful for extracting the concise representations of the stimuli and to capture the high-level semantics in visual information with a small number of active coefficients. In this paper, we propose a novel framework for salient region detection that uses appearance-based and regression-based schemes. The framework segments the image and forms reconstructive dictionaries from four sides of the image. These side-specific dictionaries are further utilized to obtain the saliency maps of the sides. A unified version of these maps is subsequently employed by a representation-based model to obtain a contrast-based salient region map. The map is used to obtain two regression-based maps with LAB and RGB color features that are unified through the optimization-based method to achieve the final saliency map. Furthermore, the side-specific reconstructive dictionaries are extracted from the boundary and the background pixels, which are enriched with geometrical and visual information. The approach has been thoroughly evaluated on five datasets and compared with the seven most recent approaches. The simulation results reveal that our model performs favorably in comparison with the current saliency detection schemes. Full article
(This article belongs to the Special Issue Visual Sensors)

20 pages, 5411 KiB  
Article
Pose Estimation for Straight Wing Aircraft Based on Consistent Line Clustering and Planes Intersection
by Xichao Teng, Qifeng Yu, Jing Luo, Xiaohu Zhang and Gang Wang
Sensors 2019, 19(2), 342; https://doi.org/10.3390/s19020342 - 16 Jan 2019
Cited by 7 | Viewed by 3984
Abstract
Aircraft pose estimation is a necessary technology in aerospace applications, and accurate pose parameters are the foundation for many aerospace tasks. In this paper, we propose a novel pose estimation method for straight wing aircraft without relying on 3D models or other datasets, and two widely separated cameras are used to acquire the pose information. Because of the large baseline and long-distance imaging, feature point matching is difficult and inaccurate in this configuration. In our method, line features are extracted to describe the structure of straight wing aircraft in images, and pose estimation is performed based on the common geometry constraints of straight wing aircraft. The spatial and length consistency of the line features is used to exclude irrelevant line segments belonging to the background or other parts of the aircraft, and density-based parallel line clustering is utilized to extract the aircraft’s main structure. After identifying the orientation of the fuselage and wings in images, planes intersection is used to estimate the 3D localization and attitude of the aircraft. Experimental results show that our method estimates the aircraft pose accurately and robustly. Full article
(This article belongs to the Special Issue Visual Sensors)

21 pages, 4328 KiB  
Article
Local Parallel Cross Pattern: A Color Texture Descriptor for Image Retrieval
by Qinghe Feng, Qiaohong Hao, Mateu Sbert, Yugen Yi, Ying Wei and Jiangyan Dai
Sensors 2019, 19(2), 315; https://doi.org/10.3390/s19020315 - 14 Jan 2019
Cited by 7 | Viewed by 3253
Abstract
Riding the wave of visual sensor equipment (e.g., personal smartphones, home security cameras, vehicle cameras, and camcorders), image retrieval (IR) technology has received increasing attention due to its potential applications in e-commerce, visual surveillance, and intelligent traffic. However, determining how to design an effective feature descriptor has been proven to be the main bottleneck for retrieving a set of images of interest. In this paper, we first construct a six-layer color quantizer to extract a color map. Then, motivated by the human visual system, we design a local parallel cross pattern (LPCP) in which the local binary pattern (LBP) map is amalgamated with the color map in “parallel” and “cross” manners. Finally, to reduce the computational complexity and improve the robustness to image rotation, the LPCP is extended to the uniform local parallel cross pattern (ULPCP) and the rotation-invariant local parallel cross pattern (RILPCP), respectively. Extensive experiments are performed on eight benchmark datasets. The experimental results validate the effectiveness, efficiency, robustness, and computational complexity of the proposed descriptors against eight state-of-the-art color texture descriptors to produce an in-depth comparison. Additionally, compared with a series of Convolutional Neural Network (CNN)-based models, the proposed descriptors still achieve competitive results. Full article
(This article belongs to the Special Issue Visual Sensors)

11 pages, 2863 KiB  
Article
Camera Calibration Using Gray Code
by Seppe Sels, Bart Ribbens, Steve Vanlanduit and Rudi Penne
Sensors 2019, 19(2), 246; https://doi.org/10.3390/s19020246 - 10 Jan 2019
Cited by 30 | Viewed by 6638
Abstract
In order to determine camera parameters, a calibration procedure involving the camera recordings of a checkerboard is usually performed. In this paper, we propose an alternative approach that uses Gray-code patterns displayed on an LCD screen. Gray-code patterns allow us to decode the 3D location information of points on the LCD screen at every pixel in the camera image. This is in contrast to checkerboard patterns, where the number of corresponding locations is limited to the number of checkerboard corners. We show that, for the case of a UEye CMOS camera, focal-length estimation is 1.5 times more precise than with a standard calibration using a checkerboard pattern. Full article
(This article belongs to the Special Issue Visual Sensors)
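
The Gray-code indexing that lets every LCD pixel be labelled, which is the stated advantage over checkerboard corners, can be sketched as follows; the screen width, bit count and simulated camera observation are placeholders.

```python
import numpy as np

def int_to_gray(n):
    return n ^ (n >> 1)

def gray_to_int(g):
    n = 0
    while g:                 # XOR of all right shifts inverts the Gray code
        n ^= g
        g >>= 1
    return n

width, n_bits = 1024, 10                       # 2**10 columns on a hypothetical screen
gray_codes = int_to_gray(np.arange(width))
# One binary pattern per bit plane, as displayed on the LCD (most significant bit first).
patterns = [((gray_codes >> b) & 1).astype(np.uint8) for b in range(n_bits - 1, -1, -1)]

# Decode the stack of bits observed at one camera pixel back to a screen column.
observed_bits = [p[300] for p in patterns]     # pretend the camera pixel saw column 300
code = 0
for bit in observed_bits:
    code = (code << 1) | int(bit)
assert gray_to_int(code) == 300
```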

25 pages, 6256 KiB  
Article
Motion-Aware Correlation Filters for Online Visual Tracking
by Yihong Zhang, Yijin Yang, Wuneng Zhou, Lifeng Shi and Demin Li
Sensors 2018, 18(11), 3937; https://doi.org/10.3390/s18113937 - 14 Nov 2018
Cited by 18 | Viewed by 3409
Abstract
Discriminative correlation filter-based methods struggle to deal with fast motion and heavy occlusion, which can severely degrade the performance of trackers and ultimately lead to tracking failures. In this paper, a novel Motion-Aware Correlation Filters (MACF) framework is proposed for online visual object tracking, where a motion-aware strategy based on joint instantaneous motion estimation Kalman filters is integrated into the Discriminative Correlation Filters (DCFs). The proposed motion-aware strategy is used to predict the possible region and scale of the target in the current frame by utilizing the previously estimated 3D motion information. This strategy can prevent model drift caused by fast motion. On the basis of the predicted region and scale, the MACF detects the position and scale of the target by using the DCFs-based method in the current frame. Furthermore, an adaptive model updating strategy is proposed to address the problem of corrupted models caused by occlusions, where the learning rate is determined by the confidence of the response map. Extensive experiments on the popular Object Tracking Benchmarks OTB-100 and OTB-50 and on unmanned aerial vehicle (UAV) video have demonstrated that the proposed MACF tracker performs better than most of the state-of-the-art trackers and achieves high real-time performance. In addition, the proposed approach can be integrated easily and flexibly into other visual tracking algorithms. Full article
(This article belongs to the Special Issue Visual Sensors)

21 pages, 3915 KiB  
Article
Video-Based Person Re-Identification by an End-To-End Learning Architecture with Hybrid Deep Appearance-Temporal Feature
by Rui Sun, Qiheng Huang, Miaomiao Xia and Jun Zhang
Sensors 2018, 18(11), 3669; https://doi.org/10.3390/s18113669 - 29 Oct 2018
Cited by 6 | Viewed by 3258
Abstract
Video-based person re-identification is an important task with the challenges of lighting variation, low-resolution images, background clutter, occlusion, and human appearance similarity in multi-camera visual sensor networks. In this paper, we propose a video-based person re-identification method called the end-to-end learning architecture with hybrid deep appearance-temporal feature. It can learn the appearance features of pivotal frames, the temporal features, and the independent distance metric of different features. This architecture consists of a two-stream deep feature structure and two Siamese networks. For the first-stream structure, we propose the Two-branch Appearance Feature (TAF) sub-structure to obtain the appearance information of persons, and use one of the two Siamese networks to learn the similarity of appearance features of person pairs. To utilize the temporal information, we designed the second-stream structure, consisting of the Optical flow Temporal Feature (OTF) sub-structure and another Siamese network, to learn the person's temporal features and the distances of pairwise features. In addition, we select the pivotal frames of video as inputs to the Inception-V3 network on the Two-branch Appearance Feature sub-structure, and employ the salience-learning fusion layer to fuse the learned global and local appearance features. Extensive experimental results on the PRID2011, iLIDS-VID, and Motion Analysis and Re-identification Set (MARS) datasets showed that the proposed architecture reached 79%, 59% and 72% Rank-1 accuracy, respectively, and had advantages over state-of-the-art algorithms. Meanwhile, it also improved the feature representation ability of persons. Full article
(This article belongs to the Special Issue Visual Sensors)

20 pages, 5004 KiB  
Article
Improved Point-Line Feature Based Visual SLAM Method for Indoor Scenes
by Runzhi Wang, Kaichang Di, Wenhui Wan and Yongkang Wang
Sensors 2018, 18(10), 3559; https://doi.org/10.3390/s18103559 - 20 Oct 2018
Cited by 19 | Viewed by 3651
Abstract
In the study of indoor simultaneous localization and mapping (SLAM) problems using a stereo camera, two types of primary features—point and line segments—have been widely used to calculate the pose of the camera. However, many feature-based SLAM systems are not robust when the camera moves sharply or turns too quickly. In this paper, an improved indoor visual SLAM method to better utilize the advantages of point and line segment features and achieve robust results in difficult environments is proposed. First, point and line segment features are automatically extracted and matched to build two kinds of projection models. Subsequently, for the optimization problem of line segment features, we add minimization of angle observation in addition to the traditional re-projection error of endpoints. Finally, our model of motion estimation, which is adaptive to the motion state of the camera, is applied to build a new combinational Hessian matrix and gradient vector for iterated pose estimation. Furthermore, our proposal has been tested on EuRoC MAV datasets and sequence images captured with our stereo camera. The experimental results demonstrate the effectiveness of our improved point-line feature based visual SLAM method in improving localization accuracy when the camera moves with rapid rotation or violent fluctuation. Full article
(This article belongs to the Special Issue Visual Sensors)

18 pages, 8577 KiB  
Article
Dense RGB-D Semantic Mapping with Pixel-Voxel Neural Network
by Cheng Zhao, Li Sun, Pulak Purkait, Tom Duckett and Rustam Stolkin
Sensors 2018, 18(9), 3099; https://doi.org/10.3390/s18093099 - 14 Sep 2018
Cited by 20 | Viewed by 6185
Abstract
In this paper, a novel Pixel-Voxel network is proposed for dense 3D semantic mapping, which can perform dense 3D mapping while simultaneously recognizing and labelling the semantic category of each point in the 3D map. In our approach, we fully leverage the advantages of different modalities. That is, the PixelNet can learn the high-level contextual information from 2D RGB images, and the VoxelNet can learn 3D geometrical shapes from the 3D point cloud. Unlike existing architectures that fuse score maps from different modalities with equal weights, we propose a softmax weighted fusion stack that adaptively learns the varying contributions of PixelNet and VoxelNet and fuses the score maps according to their respective confidence levels. Our approach achieved competitive results on both the SUN RGB-D and NYU V2 benchmarks, while the runtime of the proposed system is boosted to around 13 Hz, enabling near-real-time performance using an eight-core i7 PC with a single Titan X GPU. Full article
(This article belongs to the Special Issue Visual Sensors)
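
The softmax-weighted fusion contrasted above with fixed equal-weight fusion reduces to a small computation; the sketch below uses random score maps and made-up fusion logits in place of the trained network's outputs.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

h, w, n_classes = 4, 4, 13
pixel_scores = np.random.default_rng(3).random((h, w, n_classes))   # 2D (RGB) branch scores
voxel_scores = np.random.default_rng(4).random((h, w, n_classes))   # 3D (point cloud) branch scores

fusion_logits = np.array([0.7, 0.2])           # would be learned; placeholder values here
w_pixel, w_voxel = softmax(fusion_logits)      # adaptive, confidence-like weights summing to 1
fused = w_pixel * pixel_scores + w_voxel * voxel_scores
labels = fused.argmax(axis=-1)                 # per-pixel semantic class
```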

14 pages, 4094 KiB  
Article
Pose Estimation of Sweet Pepper through Symmetry Axis Detection
by Hao Li, Qibing Zhu, Min Huang, Ya Guo and Jianwei Qin
Sensors 2018, 18(9), 3083; https://doi.org/10.3390/s18093083 - 13 Sep 2018
Cited by 20 | Viewed by 3909
Abstract
The space pose of fruits is necessary for accurate detachment in automatic harvesting. This study presents a novel pose estimation method for sweet pepper detachment. In this method, the normal to the local plane at each point in the sweet-pepper point cloud was first calculated. The point cloud was separated by a number of candidate planes, and the scores of each plane were then separately calculated using the scoring strategy. The plane with the lowest score was selected as the symmetry plane of the point cloud. The symmetry axis could be finally calculated from the selected symmetry plane, and the pose of sweet pepper in the space was obtained using the symmetry axis. The performance of the proposed method was evaluated by simulated and sweet-pepper cloud dataset tests. In the simulated test, the average angle error between the calculated symmetry and real axes was approximately 6.5°. In the sweet-pepper cloud dataset test, the average error was approximately 7.4° when the peduncle was removed. When the peduncle of sweet pepper was complete, the average error was approximately 6.9°. These results suggested that the proposed method was suitable for pose estimation of sweet peppers and could be adjusted for use with other fruits and vegetables. Full article
(This article belongs to the Special Issue Visual Sensors)

26 pages, 7034 KiB  
Article
Automatic Calibration of an Around View Monitor System Exploiting Lane Markings
by Kyoungtaek Choi, Ho Gi Jung and Jae Kyu Suhr
Sensors 2018, 18(9), 2956; https://doi.org/10.3390/s18092956 - 05 Sep 2018
Cited by 23 | Viewed by 7011
Abstract
This paper proposes a method that automatically calibrates four cameras of an around view monitor (AVM) system in a natural driving situation. The proposed method estimates orientation angles of four cameras composing the AVM system, and assumes that their locations and intrinsic parameters are known in advance. This method utilizes lane markings because they exist in almost all on-road situations and appear across images of adjacent cameras. It starts by detecting lane markings from images captured by four cameras of the AVM system in a cost-effective manner. False lane markings are rejected by analyzing the statistical properties of the detected lane markings. Once the correct lane markings are sufficiently gathered, this method first calibrates the front and rear cameras, and then calibrates the left and right cameras with the help of the calibration results of the front and rear cameras. This two-step approach is essential because side cameras cannot be fully calibrated by themselves, due to insufficient lane marking information. After this initial calibration, this method collects corresponding lane markings appearing across images of adjacent cameras and simultaneously refines the initial calibration results of four cameras to obtain seamless AVM images. In the case of a long image sequence, this method conducts the camera calibration multiple times, and then selects the medoid as the final result to reduce computational resources and dependency on a specific place. In the experiment, the proposed method was quantitatively and qualitatively evaluated in various real driving situations and showed promising results. Full article
(This article belongs to the Special Issue Visual Sensors)
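
The medoid selection used as the final step for long image sequences can be illustrated directly; the orientation triplets and the plain Euclidean distance below are illustrative placeholders.

```python
import numpy as np

def medoid(samples):
    """Return the sample with the smallest summed distance to all other samples."""
    d = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=-1)
    return samples[d.sum(axis=1).argmin()]

runs = np.array([[0.51, -0.02, 1.03],   # (roll, pitch, yaw) estimates from repeated calibrations
                 [0.49,  0.01, 1.01],
                 [0.50,  0.00, 1.02],
                 [0.80,  0.20, 1.40]])  # an outlier run, e.g. from poor lane markings
print(medoid(runs))                     # a consensus run; the outlier does not pull the result
```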

14 pages, 10016 KiB  
Article
Lightweight Visual Odometry for Autonomous Mobile Robots
by Mohamed Aladem and Samir A. Rawashdeh
Sensors 2018, 18(9), 2837; https://doi.org/10.3390/s18092837 - 28 Aug 2018
Cited by 30 | Viewed by 11549
Abstract
Vision-based motion estimation is an effective means for mobile robot localization and is often used in conjunction with other sensors for navigation and path planning. This paper presents a low-overhead real-time ego-motion estimation (visual odometry) system based on either a stereo or RGB-D sensor. The algorithm’s accuracy outperforms typical frame-to-frame approaches by maintaining a limited local map, while requiring significantly less memory and computational power in contrast to using global maps common in full visual SLAM methods. The algorithm is evaluated on common publicly available datasets that span different use-cases and performance is compared to other comparable open-source systems in terms of accuracy, frame rate and memory requirements. This paper accompanies the release of the source code as a modular software package for the robotics community compatible with the Robot Operating System (ROS). Full article
(This article belongs to the Special Issue Visual Sensors)

17 pages, 11756 KiB  
Article
Handshape Recognition Using Skeletal Data
by Tomasz Kapuscinski and Patryk Organisciak
Sensors 2018, 18(8), 2577; https://doi.org/10.3390/s18082577 - 06 Aug 2018
Cited by 6 | Viewed by 3508
Abstract
In this paper, a method of handshapes recognition based on skeletal data is described. A new feature vector is proposed. It encodes the relative differences between vectors associated with the pointing directions of the particular fingers and the palm normal. Different classifiers are tested on the demanding dataset, containing 48 handshapes performed 500 times by five users. Two different sensor configurations and significant variation in the hand rotation are considered. The late fusion at the decision level of individual models, as well as a comparative study carried out on a publicly available dataset, are also included. Full article
(This article belongs to the Special Issue Visual Sensors)
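
The feature vector described above, built from relative differences between finger pointing directions and the palm normal, can be sketched as a set of angles that do not change when the whole hand rotates; the direction vectors below are arbitrary placeholders, not skeletal data from a sensor.

```python
import numpy as np

def unit(v):
    return v / np.linalg.norm(v)

palm_normal = unit(np.array([0.0, 0.0, 1.0]))
finger_dirs = [unit(np.array(d)) for d in
               [(0.1, 0.9, 0.2), (0.0, 1.0, 0.1), (-0.1, 1.0, 0.1),
                (-0.2, 0.9, 0.2), (0.6, 0.7, 0.0)]]        # thumb .. little finger

# One angle per finger relative to the palm normal, plus angles between adjacent fingers.
feat = [np.arccos(np.clip(f @ palm_normal, -1.0, 1.0)) for f in finger_dirs]
feat += [np.arccos(np.clip(a @ b, -1.0, 1.0))
         for a, b in zip(finger_dirs[:-1], finger_dirs[1:])]
feature_vector = np.array(feat)                            # input to the tested classifiers
```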

25 pages, 15184 KiB  
Article
Fast Visual Odometry for a Low-Cost Underwater Embedded Stereo System
by Mohamad Motasem Nawaf, Djamal Merad, Jean-Philip Royer, Jean-Marc Boï, Mauro Saccone, Mohamed Ben Ellefi and Pierre Drap
Sensors 2018, 18(7), 2313; https://doi.org/10.3390/s18072313 - 17 Jul 2018
Cited by 16 | Viewed by 3972
Abstract
This paper provides details of hardware and software conception and realization of a stereo embedded system for underwater imaging. The system provides several functions that facilitate underwater surveys and run smoothly in real-time. A first post-image acquisition module provides direct visual feedback on the quality of the taken images which helps appropriate actions to be taken regarding movement speed and lighting conditions. Our main contribution is a light visual odometry method adapted to the underwater context. The proposed method uses the captured stereo image stream to provide real-time navigation and a site coverage map which is necessary to conduct a complete underwater survey. The visual odometry uses a stochastic pose representation and semi-global optimization approach to handle large sites and provides long-term autonomy, whereas a novel stereo matching approach adapted to underwater imaging and system attached lighting allows fast processing and suitability to low computational resource systems. The system is tested in a real context and shows its robustness and promising future potential. Full article
(This article belongs to the Special Issue Visual Sensors)

24 pages, 3051 KiB  
Article
Visual Information Fusion through Bayesian Inference for Adaptive Probability-Oriented Feature Matching
by David Valiente, Luis Payá, Luis M. Jiménez, Jose M. Sebastián and Óscar Reinoso
Sensors 2018, 18(7), 2041; https://doi.org/10.3390/s18072041 - 26 Jun 2018
Cited by 25 | Viewed by 3535
Abstract
This work presents a visual information fusion approach for robust probability-oriented feature matching. It is sustained by omnidirectional imaging, and it is tested in a visual localization framework, in mobile robotics. General visual localization methods have been extensively studied and optimized in terms of performance. However, one of the main threats that jeopardizes the final estimation is the presence of outliers. In this paper, we present several contributions to deal with that issue. First, 3D information data, associated with SURF (Speeded-Up Robust Feature) points detected on the images, is inferred under the Bayesian framework established by Gaussian processes (GPs). Such information represents a probability distribution for the feature points’ existence, which is successively fused and updated throughout the robot’s poses. Secondly, this distribution can be properly sampled and projected onto the next 2D image frame in t+1, by means of a filter-motion prediction. This strategy permits obtaining relevant areas in the image reference system, from which probable matches could be detected, in terms of the accumulated probability of feature existence. This approach entails an adaptive probability-oriented matching search, which accounts for significant areas of the image, but it also considers unseen parts of the scene, thanks to an internal modulation of the probability distribution domain, computed in terms of the current uncertainty of the system. The main outcomes confirm a robust feature matching, which permits producing consistent localization estimates, aided by the odometer’s prior to estimate the scale factor. Publicly available datasets have been used to validate the design and operation of the approach. Moreover, the proposal has been compared, firstly with a standard feature matching and secondly with a localization method, based on an inverse depth parametrization. The results confirm the validity of the approach in terms of feature matching, localization accuracy, and time consumption. Full article
(This article belongs to the Special Issue Visual Sensors)

22 pages, 4027 KiB  
Article
Hybrid Histogram Descriptor: A Fusion Feature Representation for Image Retrieval
by Qinghe Feng, Qiaohong Hao, Yuqi Chen, Yugen Yi, Ying Wei and Jiangyan Dai
Sensors 2018, 18(6), 1943; https://doi.org/10.3390/s18061943 - 15 Jun 2018
Cited by 17 | Viewed by 3830
Abstract
Currently, visual sensors are becoming increasingly affordable and fashionable, accelerating the growth of image data. Image retrieval has attracted increasing interest due to space exploration, industrial, and biomedical applications. Nevertheless, designing an effective feature representation is acknowledged as a hard yet fundamental issue. This paper presents a fusion feature representation called a hybrid histogram descriptor (HHD) for image retrieval. The proposed descriptor comprises two histograms jointly: a perceptually uniform histogram which is extracted by exploiting the color and edge orientation information in perceptually uniform regions; and a motif co-occurrence histogram which is acquired by calculating the probability of a pair of motif patterns. To evaluate the performance, we benchmarked the proposed descriptor on the RSSCN7, AID, Outex-00013, Outex-00014 and ETHZ-53 datasets. Experimental results suggest that the proposed descriptor is more effective and robust than ten recent fusion-based descriptors under the content-based image retrieval framework. The computational complexity was also analyzed to give an in-depth evaluation. Furthermore, compared with state-of-the-art convolutional neural network (CNN)-based descriptors, the proposed descriptor also achieves comparable performance, but does not require any training process. Full article
(This article belongs to the Special Issue Visual Sensors)

20 pages, 5256 KiB  
Article
Segment-Tube: Spatio-Temporal Action Localization in Untrimmed Videos with Per-Frame Segmentation
by Le Wang, Xuhuan Duan, Qilin Zhang, Zhenxing Niu, Gang Hua and Nanning Zheng
Sensors 2018, 18(5), 1657; https://doi.org/10.3390/s18051657 - 22 May 2018
Cited by 10 | Viewed by 10958
Abstract
Inspired by the recent spatio-temporal action localization efforts with tubelets (sequences of bounding boxes), we present a new spatio-temporal action localization detector Segment-tube, which consists of sequences of per-frame segmentation masks. The proposed Segment-tube detector can temporally pinpoint the starting/ending frame of each action category in the presence of preceding/subsequent interference actions in untrimmed videos. Simultaneously, the Segment-tube detector produces per-frame segmentation masks instead of bounding boxes, offering superior spatial accuracy to tubelets. This is achieved by alternating iterative optimization between temporal action localization and spatial action segmentation. Experimental results on three datasets validated the efficacy of the proposed method, including (1) temporal action localization on the THUMOS 2014 dataset; (2) spatial action segmentation on the Segtrack dataset; and (3) joint spatio-temporal action localization on the newly proposed ActSeg dataset. It is shown that our method compares favorably with existing state-of-the-art methods. Full article
(This article belongs to the Special Issue Visual Sensors)
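The alternating optimization the abstract describes can be pictured with a toy sketch like the one below (Python, with synthetic per-frame action scores and masks standing in for the real detection and segmentation networks; the thresholds and weighting are illustrative assumptions):

    # Toy sketch of alternating temporal localization and spatial segmentation.
    # Synthetic per-frame scores/masks stand in for the real networks.
    import numpy as np

    rng = np.random.default_rng(0)
    T, H, W = 100, 32, 32
    frame_scores = rng.random(T)           # per-frame action confidence (stand-in)
    frame_masks = rng.random((T, H, W))    # per-frame foreground probability (stand-in)

    def longest_active_run(active):
        # Return (start, end) of the longest consecutive run of True values.
        best_len, cur_len, start, best = 0, 0, 0, (0, 0)
        for t, a in enumerate(active):
            cur_len = cur_len + 1 if a else 0
            if cur_len == 1:
                start = t
            if cur_len > best_len:
                best_len, best = cur_len, (start, t)
        return best

    span = (0, T - 1)
    for _ in range(5):  # alternate until the temporal span stabilizes
        # Spatial step: binarize masks, but only inside the current temporal span.
        seg = np.zeros((T, H, W), dtype=bool)
        seg[span[0]:span[1] + 1] = frame_masks[span[0]:span[1] + 1] > 0.6
        # Temporal step: combine frame scores with the foreground area of the masks.
        combined = 0.5 * frame_scores + 0.5 * seg.mean(axis=(1, 2)) / 0.4
        new_span = longest_active_run(combined > 0.5)
        if new_span == span:
            break
        span = new_span
    print("estimated action span:", span)

In the actual detector each step is driven by learned models rather than thresholds, but the loop structure (refine the window, re-segment inside it, repeat) is the same idea.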
19 pages, 4052 KiB  
Article
Automated Field-of-View, Illumination, and Recognition Algorithm Design of a Vision System for Pick-and-Place Considering Colour Information in Illumination and Images
by Yibing Chen, Taiki Ogata, Tsuyoshi Ueyama, Toshiyuki Takada and Jun Ota
Sensors 2018, 18(5), 1656; https://doi.org/10.3390/s18051656 - 22 May 2018
Cited by 7 | Viewed by 4070
Abstract
Machine vision is playing an increasingly important role in industrial applications, and the automated design of image recognition systems has been a subject of intense research. This study proposes a system for automatically designing the field-of-view (FOV) of a camera, the illumination strength and the parameters of a recognition algorithm. We formulated the design problem as an optimisation problem and solved it using an experiment-based hierarchical algorithm. Evaluation experiments using translucent plastic objects showed that the proposed system produced an effective solution, with a wide FOV, recognition of all objects, and maximal positional and angular errors of 0.32 mm and 0.4°, when all the RGB (red, green and blue) channels were used for illumination and the R-channel image was used for recognition. Although using all the RGB illumination channels with a grey-scale image also allowed recognition of all the objects, only a narrow FOV was selected. Moreover, full recognition was not achieved using only G illumination and a grey-scale image. The results show that the proposed method can automatically design the FOV, illumination and recognition-algorithm parameters, and that tuning all the RGB illumination channels is desirable even when single-channel or grey-scale images are used for recognition. Full article
(This article belongs to the Special Issue Visual Sensors)
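The abstract frames the design task as an optimisation over FOV, illumination and recognition parameters. A minimal coarse two-level search in that spirit is sketched below; the candidate grids and the scoring function are invented for illustration and are not the experiment-based procedure described in the paper:

    # Minimal sketch of a two-level search over FOV, illumination strength and a
    # recognition threshold. evaluate() is a hypothetical stand-in for running the
    # real recognition experiment and measuring positional/angular errors.
    import itertools

    def evaluate(fov_mm, illum, threshold):
        # Hypothetical score: prefer a wide FOV, penalize weak illumination and
        # extreme thresholds (a real system would run the vision pipeline here).
        recognition_ok = illum >= 0.4 and 0.2 <= threshold <= 0.8
        if not recognition_ok:
            return float("-inf")
        return fov_mm - 10.0 * abs(threshold - 0.5) - 5.0 * (1.0 - illum)

    def hierarchical_design(fovs, illums, thresholds):
        # Upper level: choose FOV and illumination; lower level: tune the
        # recognition-algorithm parameter for each upper-level candidate.
        best = (float("-inf"), None)
        for fov, illum in itertools.product(fovs, illums):
            for thr in thresholds:
                score = evaluate(fov, illum, thr)
                if score > best[0]:
                    best = (score, (fov, illum, thr))
        return best

    score, params = hierarchical_design(
        fovs=[100, 200, 300], illums=[0.2, 0.5, 1.0], thresholds=[0.1, 0.3, 0.5, 0.7])
    print("best design (FOV mm, illumination, threshold):", params, "score:", score)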
22 pages, 6618 KiB  
Article
Lane Marking Detection and Reconstruction with Line-Scan Imaging Data
by Lin Li, Wenting Luo and Kelvin C.P. Wang
Sensors 2018, 18(5), 1635; https://doi.org/10.3390/s18051635 - 20 May 2018
Cited by 29 | Viewed by 6576
Abstract
Lane marking detection and localization are crucial for autonomous driving and lane-based pavement surveys. Numerous studies have been conducted to detect and locate lane markings for advanced driver assistance systems, in which image data are usually captured by vision-based cameras. However, a limited number of studies have identified lane markings using high-resolution laser images for road condition evaluation. In this study, the laser images are acquired with a digital highway data vehicle (DHDV). Subsequently, a novel methodology is presented for automated lane marking identification and reconstruction, implemented in four phases: (1) binarization of the laser images with a new threshold method (a multi-box segmentation based threshold method); (2) determination of candidate lane markings with closing operations and a marching squares algorithm; (3) identification of true lane markings by eliminating false positives (FPs) with a linear support vector machine; and (4) reconstruction of damaged and dashed lane marking segments to form continuous lane markings based on geometric features such as adjacent lane marking location and lane width. Finally, a case study is presented to validate the effectiveness of the methodology. The findings indicate that the new strategy is robust in image binarization and lane marking localization. This study would be beneficial for road lane-based pavement condition evaluation, such as lane-based rutting measurement and crack classification. Full article
(This article belongs to the Special Issue Visual Sensors)
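A highly simplified Python/OpenCV sketch of such a binarize-then-filter pipeline is shown below; the per-tile Otsu threshold, the candidate features and the synthetic input are placeholders chosen for illustration, not the DHDV processing chain from the paper:

    # Simplified sketch: per-box thresholding, morphological closing, candidate
    # extraction, and geometry features that a trained linear SVM would use to
    # reject false positives.
    import numpy as np
    import cv2

    laser_img = np.random.randint(0, 256, (512, 512), dtype=np.uint8)  # synthetic stand-in

    # (1) "Multi-box" binarization: Otsu threshold computed independently per tile.
    binary = np.zeros_like(laser_img)
    box = 128
    for y in range(0, 512, box):
        for x in range(0, 512, box):
            tile = laser_img[y:y + box, x:x + box]
            _, bw = cv2.threshold(tile, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
            binary[y:y + box, x:x + box] = bw

    # (2) Closing to merge broken marking fragments, then candidate contours.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (7, 7))
    closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    # (3) Geometry features per candidate (area, aspect ratio, fill ratio); a linear
    # SVM trained on labelled examples would separate true markings from FPs here.
    features = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        area = cv2.contourArea(c)
        features.append([area, w / max(h, 1), area / max(w * h, 1)])
    print("candidate regions:", len(features))

Step (4), reconstruction of dashed or damaged segments, would then extend the surviving candidates along the estimated lane geometry.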
30 pages, 6054 KiB  
Article
IrisDenseNet: Robust Iris Segmentation Using Densely Connected Fully Convolutional Networks in the Images by Visible Light and Near-Infrared Light Camera Sensors
by Muhammad Arsalan, Rizwan Ali Naqvi, Dong Seop Kim, Phong Ha Nguyen, Muhammad Owais and Kang Ryoung Park
Sensors 2018, 18(5), 1501; https://doi.org/10.3390/s18051501 - 10 May 2018
Cited by 101 | Viewed by 7933
Abstract
Recent advancements in computer vision have opened new horizons for deploying biometric recognition algorithms in mobile and handheld devices. Similarly, accurate iris recognition is now much needed in unconstrained scenarios. These environments make the acquired iris image exhibit occlusion, low resolution, blur, unusual glint, ghost effect and off-angles. Prevailing segmentation algorithms cannot cope with these constraints. In addition, owing to the unavailability of near-infrared (NIR) light, iris segmentation in visible-light environments is challenging because of the noise of visible light. Deep learning with convolutional neural networks (CNN) has brought a considerable breakthrough in various applications. To address iris segmentation in challenging situations with visible light and near-infrared light camera sensors, this paper proposes a densely connected fully convolutional network (IrisDenseNet), which can determine the true iris boundary even with inferior-quality images by using better information gradient flow between the dense blocks. In the experiments, five datasets from visible light and NIR environments were used. For the visible-light environment, the noisy iris challenge evaluation part-II (NICE-II, selected from the UBIRIS.v2 database) and mobile iris challenge evaluation (MICHE-I) datasets were used. For the NIR environment, the Institute of Automation, Chinese Academy of Sciences (CASIA) v4.0 interval, CASIA v4.0 distance and IIT Delhi v1.0 iris datasets were used. Experimental results showed the optimal segmentation of the proposed IrisDenseNet and its excellent performance over existing algorithms on all five datasets. Full article
(This article belongs to the Special Issue Visual Sensors)
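For readers unfamiliar with dense connectivity, the sketch below shows the generic DenseNet-style block that the abstract refers to, written in PyTorch; the channel counts and depth are arbitrary illustrative choices and the block is not the exact IrisDenseNet architecture:

    # Generic dense block: each layer receives the concatenation of all previous
    # feature maps, which is the "better information gradient flow" the abstract mentions.
    import torch
    import torch.nn as nn

    class DenseLayer(nn.Module):
        def __init__(self, in_channels, growth_rate):
            super().__init__()
            self.body = nn.Sequential(
                nn.BatchNorm2d(in_channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels, growth_rate, kernel_size=3, padding=1, bias=False),
            )

        def forward(self, x):
            return torch.cat([x, self.body(x)], dim=1)  # concatenate new features

    class DenseBlock(nn.Module):
        def __init__(self, in_channels, growth_rate=16, num_layers=4):
            super().__init__()
            layers = []
            for i in range(num_layers):
                layers.append(DenseLayer(in_channels + i * growth_rate, growth_rate))
            self.block = nn.Sequential(*layers)

        def forward(self, x):
            return self.block(x)

    x = torch.randn(1, 32, 64, 64)     # e.g. an iris feature map
    print(DenseBlock(32)(x).shape)     # torch.Size([1, 96, 64, 64])

In a fully convolutional segmentation network, several such blocks are stacked with transition layers, and the final feature map is decoded into a per-pixel iris/non-iris mask.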
20 pages, 4274 KiB  
Article
Textile Retrieval Based on Image Content from CDC and Webcam Cameras in Indoor Environments
by Oscar García-Olalla, Enrique Alegre, Laura Fernández-Robles, Eduardo Fidalgo and Surajit Saikia
Sensors 2018, 18(5), 1329; https://doi.org/10.3390/s18051329 - 25 Apr 2018
Cited by 13 | Viewed by 5545
Abstract
Textile-based image retrieval for indoor environments can be used to retrieve images that contain the same textile, which may indicate that the scenes are related. This constitutes a useful approach for law enforcement agencies that want to find evidence based on matching between textiles. In this paper, we propose a novel pipeline that allows searching and retrieving textiles that appear in pictures of real scenes. Our approach is based on first obtaining regions containing textiles by applying MSER to high-pass filtered versions of the RGB, HSV and Hue channels of the original photo. To describe the textile regions, we demonstrate that the combination of HOG and HCLOSIB is the best option for our proposal when the correlation distance is used to match the query textile patch with the candidate regions. Furthermore, we introduce a new dataset, TextilTube, which comprises a total of 1913 textile regions labelled within 67 classes. The method yields 84.94% success within the 40 nearest coincidences and 37.44% precision taking into account only the first coincidence, which outperforms the deep learning methods evaluated. Experimental results show that this pipeline can be used to set up an effective textile-based image retrieval system in indoor environments. Full article
(This article belongs to the Special Issue Visual Sensors)
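A rough sketch of the region-proposal half of such a pipeline (MSER on a high-pass filtered channel, HOG description, correlation-distance ranking) might look like the following; the filter, parameters and single-channel input are simplified assumptions rather than the published configuration, and HCLOSIB is omitted:

    # Sketch: MSER regions on a high-pass filtered gray channel, HOG per region,
    # ranking by correlation distance to a query patch. Parameters are illustrative.
    import numpy as np
    import cv2
    from skimage.feature import hog
    from scipy.spatial.distance import correlation

    def high_pass(gray):
        # Simple high-pass filter: image minus its Gaussian-blurred version.
        return cv2.subtract(gray, cv2.GaussianBlur(gray, (0, 0), sigmaX=3))

    def region_descriptors(image_gray, patch_size=(64, 64)):
        regions, bboxes = cv2.MSER_create().detectRegions(high_pass(image_gray))
        descs = []
        for (x, y, w, h) in bboxes:
            patch = cv2.resize(image_gray[y:y + h, x:x + w], patch_size)
            descs.append(hog(patch, pixels_per_cell=(16, 16), cells_per_block=(2, 2)))
        return bboxes, descs

    scene = np.random.randint(0, 256, (480, 640), dtype=np.uint8)   # stand-in image
    query = np.random.randint(0, 256, (64, 64), dtype=np.uint8)     # stand-in textile patch

    q_desc = hog(query, pixels_per_cell=(16, 16), cells_per_block=(2, 2))
    bboxes, descs = region_descriptors(scene)
    ranked = sorted(zip(bboxes, descs), key=lambda bd: correlation(q_desc, bd[1]))
    print("best-matching region bbox:", ranked[0][0] if ranked else None)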
30 pages, 4998 KiB  
Article
Presentation Attack Detection for Iris Recognition System Using NIR Camera Sensor
by Dat Tien Nguyen, Na Rae Baek, Tuyen Danh Pham and Kang Ryoung Park
Sensors 2018, 18(5), 1315; https://doi.org/10.3390/s18051315 - 24 Apr 2018
Cited by 16 | Viewed by 6564
Abstract
Among biometric recognition systems such as fingerprint, finger-vein or face, the iris recognition system has proven to be effective in achieving a high recognition accuracy and security level. However, several recent studies have indicated that an iris recognition system can be fooled by presentation attack images that are recaptured from high-quality printed images or by contact lenses with printed iris patterns. As a result, this potential threat can reduce the security level of an iris recognition system. In this study, we propose a new presentation attack detection (PAD) method for an iris recognition system (iPAD) using a near-infrared (NIR) light camera image. To detect presentation attack images, we first localized the iris region of the input iris image using circular edge detection (CED). Based on the result of iris localization, we extracted image features using deep learning-based and handcrafted methods. The input iris images were then classified into real and presentation attack categories using support vector machines (SVM). Through extensive experiments with two public datasets, we show that our proposed method effectively solves the iris presentation attack detection problem and produces detection accuracy superior to previous studies. Full article
(This article belongs to the Special Issue Visual Sensors)
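As a loose illustration of this localize-describe-classify structure (not the authors' CED localization or feature set), the sketch below uses a Hough-circle stand-in for iris localization, crude texture statistics as features, and a plain SVM; the training data are synthetic placeholders:

    # Loose sketch of a presentation-attack-detection pipeline: circular localization,
    # simple handcrafted features, SVM classification. Hough circles replace the CED
    # step, and the features and "real vs attack" images below are placeholders.
    import numpy as np
    import cv2
    from sklearn.svm import SVC

    def localize_iris(gray):
        # Stand-in localization: strongest Hough circle, else the image centre.
        circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=2, minDist=100,
                                   param1=100, param2=30, minRadius=20, maxRadius=120)
        if circles is None:
            h, w = gray.shape
            return w // 2, h // 2, min(h, w) // 4
        x, y, r = circles[0][0]
        return int(x), int(y), int(r)

    def iris_features(gray):
        # Crude features over the localized region: mean, std and high-frequency
        # energy (printed or recaptured irises tend to differ in such statistics).
        x, y, r = localize_iris(gray)
        roi = gray[max(y - r, 0):y + r, max(x - r, 0):x + r].astype(float)
        lap = cv2.Laplacian(roi, cv2.CV_64F)
        return [roi.mean(), roi.std(), np.abs(lap).mean()]

    rng = np.random.default_rng(1)
    reals = [rng.integers(0, 256, (240, 320), dtype=np.uint8) for _ in range(10)]
    attacks = [cv2.GaussianBlur(img, (7, 7), 0) for img in reals]   # blurrier, like reprints
    X = [iris_features(im) for im in reals + attacks]
    y = [0] * len(reals) + [1] * len(attacks)
    clf = SVC(kernel="linear").fit(X, y)
    print("predicted label for first attack image:", clf.predict([iris_features(attacks[0])])[0])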
15 pages, 56074 KiB  
Article
Improved Seam-Line Searching Algorithm for UAV Image Mosaic with Optical Flow
by Weilong Zhang, Bingxuan Guo, Ming Li, Xuan Liao and Wenzhuo Li
Sensors 2018, 18(4), 1214; https://doi.org/10.3390/s18041214 - 16 Apr 2018
Cited by 21 | Viewed by 5481
Abstract
Ghosting and seams are two major challenges in creating unmanned aerial vehicle (UAV) image mosaics. In response to these problems, this paper proposes an improved method for UAV image seam-line searching. First, an image matching algorithm is used to extract and match the features of adjacent images, so that they can be transformed into the same coordinate system. Then, the gray-scale difference, the gradient minimum and the optical flow value of pixels within a neighborhood of the overlapping area of adjacent images are calculated, and these are used to create an energy function for seam-line searching. Based on that, an improved dynamic programming algorithm is proposed to search for the optimal seam-lines to complete the UAV image mosaic. This algorithm adopts a more adaptive energy aggregation and traversal strategy, which can find a more suitable splicing path between adjacent UAV images and better avoid ground objects. The experimental results show that the proposed method can effectively solve the problems of ghosting and seams in panoramic UAV images. Full article
(This article belongs to the Special Issue Visual Sensors)
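The dynamic-programming search over an energy map is the classic seam-finding recursion; a compact version over a synthetic energy map (gray-level difference only, without the gradient and optical-flow terms the paper adds to its energy function) looks like this:

    # Compact dynamic-programming seam search over an energy map. Here the energy is
    # just the absolute gray-level difference of the two aligned overlap images.
    import numpy as np

    def find_seam(energy):
        h, w = energy.shape
        cost = energy.copy()
        back = np.zeros((h, w), dtype=int)
        for i in range(1, h):
            for j in range(w):
                lo, hi = max(j - 1, 0), min(j + 2, w)
                k = np.argmin(cost[i - 1, lo:hi]) + lo   # best of the 3 predecessors
                back[i, j] = k
                cost[i, j] += cost[i - 1, k]
        seam = [int(np.argmin(cost[-1]))]
        for i in range(h - 1, 0, -1):                    # trace the minimal path back up
            seam.append(back[i, seam[-1]])
        return seam[::-1]                                # one column index per row

    rng = np.random.default_rng(0)
    overlap_a = rng.random((100, 80))
    overlap_b = rng.random((100, 80))
    energy = np.abs(overlap_a - overlap_b)
    print("seam columns for first 5 rows:", find_seam(energy)[:5])

Pixels on either side of the returned seam are then taken from one image or the other, which is what suppresses ghosting in the mosaic.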
16 pages, 45969 KiB  
Article
Comparative Analysis of Warp Function for Digital Image Correlation-Based Accurate Single-Shot 3D Shape Measurement
by Xiao Yang, Xiaobo Chen and Juntong Xi
Sensors 2018, 18(4), 1208; https://doi.org/10.3390/s18041208 - 16 Apr 2018
Cited by 9 | Viewed by 3993
Abstract
Digital image correlation (DIC)-based stereo 3D shape measurement is a single-shot method that can achieve high precision and is robust to vibration as well as environmental noise. The efficiency of DIC has been greatly improved with the proposal of inverse compositional Gauss-Newton (IC-GN) operators for both first-order and second-order warp functions. Beyond the algorithm itself, both the registration accuracy and the efficiency of DIC-based stereo matching for shapes of different complexities are closely related to the selection of the warp function, subset size and convergence criteria. Understanding the similarities and differences in the impacts of the prescribed subset size and convergence criteria on first-order and second-order warp functions, and how to choose a proper warp function and set an optimal subset size as well as convergence criteria for different shapes, are fundamental problems in realizing efficient and accurate 3D shape measurement. In this work, we present a comparative analysis of first-order and second-order warp functions for DIC-based 3D shape measurement using the IC-GN algorithm. The effects of subset size and convergence criteria of first-order and second-order warp functions on the accuracy and efficiency of DIC are comparatively examined with both simulation tests and real experiments. Reference standards for the selection of the warp function for different kinds of 3D shape measurement and the setting of proper convergence criteria are recommended. The effects of subset size on the measurement precision using different warp functions are also summarized. Full article
(This article belongs to the Special Issue Visual Sensors)
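For reference, the subset warp functions commonly used in DIC (the kind compared in this paper) map a point offset $(\Delta x, \Delta y)$ from the subset centre $(x_0, y_0)$ as follows; the notation is the standard DIC convention and may differ from the paper's exact symbols. The first-order warp is

    \begin{aligned}
    x' &= x_0 + \Delta x + u + u_x\,\Delta x + u_y\,\Delta y,\\
    y' &= y_0 + \Delta y + v + v_x\,\Delta x + v_y\,\Delta y,
    \end{aligned}

and the second-order warp adds the quadratic terms

    \begin{aligned}
    \tfrac{1}{2}u_{xx}\,\Delta x^2 + u_{xy}\,\Delta x\,\Delta y + \tfrac{1}{2}u_{yy}\,\Delta y^2 \quad \text{to } x',\\
    \tfrac{1}{2}v_{xx}\,\Delta x^2 + v_{xy}\,\Delta x\,\Delta y + \tfrac{1}{2}v_{yy}\,\Delta y^2 \quad \text{to } y'.
    \end{aligned}

The extra six parameters let the second-order warp follow curved surfaces within a subset, at the cost of a larger parameter vector to converge in the IC-GN iterations.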
17 pages, 1820 KiB  
Article
Delving Deep into Multiscale Pedestrian Detection via Single Scale Feature Maps
by Xinchuan Fu, Rui Yu, Weinan Zhang, Jie Wu and Shihai Shao
Sensors 2018, 18(4), 1063; https://doi.org/10.3390/s18041063 - 02 Apr 2018
Cited by 6 | Viewed by 4517
Abstract
The standard pipeline in pedestrian detection is to slide a pedestrian model over an image feature pyramid to detect pedestrians of different scales. In this pipeline, feature pyramid construction is time consuming and becomes the bottleneck for fast detection. Recently, a method called multiresolution filtered channels (MRFC) was proposed which uses only single-scale feature maps to achieve fast detection. However, there are two shortcomings in MRFC which limit its accuracy. One is that the receptive field correspondence across different scales is weak. The other is that the features used are not scale invariant. In this paper, two solutions are proposed to tackle these two shortcomings. Specifically, scale-aware pooling is proposed to achieve a better receptive field correspondence, and a soft decision tree is proposed to relieve the scale variance problem. When coupled with an efficient sliding window classification strategy, our detector achieves a fast detection speed while maintaining state-of-the-art accuracy. Full article
(This article belongs to the Special Issue Visual Sensors)
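Scale-aware pooling can be pictured as pooling the single-scale feature map over cells whose pixel size grows with the candidate detection's scale, so that every candidate yields a descriptor of the same length; a toy version is sketched below (the grid size and feature map are invented for illustration and are not the detector's actual channels):

    # Toy scale-aware pooling: average-pool one single-scale feature channel over a
    # grid of cells whose pixel size is proportional to the candidate window size.
    import numpy as np

    def scale_aware_pool(feature_map, box, grid=(4, 2)):
        # box = (x, y, w, h) in feature-map coordinates; grid = (rows, cols)
        x, y, w, h = box
        rows, cols = grid
        pooled = np.empty((rows, cols))
        for i in range(rows):
            for j in range(cols):
                y0, y1 = y + i * h // rows, y + (i + 1) * h // rows
                x0, x1 = x + j * w // cols, x + (j + 1) * w // cols
                pooled[i, j] = feature_map[y0:max(y1, y0 + 1), x0:max(x1, x0 + 1)].mean()
        return pooled.ravel()

    feat = np.random.rand(60, 80)                       # single-scale feature channel
    small = scale_aware_pool(feat, (10, 10, 8, 16))     # small (distant) candidate
    large = scale_aware_pool(feat, (30, 5, 24, 48))     # large (nearby) candidate
    print(small.shape, large.shape)                     # same descriptor length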
18 pages, 5227 KiB  
Article
Dynamic Non-Rigid Objects Reconstruction with a Single RGB-D Sensor
by Sen Wang, Xinxin Zuo, Chao Du, Runxiao Wang, Jiangbin Zheng and Ruigang Yang
Sensors 2018, 18(3), 886; https://doi.org/10.3390/s18030886 - 16 Mar 2018
Cited by 17 | Viewed by 5317
Abstract
This paper deals with the 3D reconstruction problem for dynamic non-rigid objects with a single RGB-D sensor. It is a challenging task given the almost inevitable accumulation-error issue in some previous sequential fusion methods and the possible failure of surface tracking in a long sequence. Therefore, we propose a global non-rigid registration framework and tackle the drifting problem via an explicit loop closure. Our scheme starts with a fusion step to obtain multiple partial scans from the input sequence, followed by a pairwise non-rigid registration and loop detection step to obtain correspondences between neighboring partial pieces and those pieces that form a loop. Then, we perform a global registration procedure to align all the pieces together into a consistent canonical space, guided by the matches that we have established. Finally, our proposed model-update step helps fix potential misalignments that still exist after the global registration. Both geometric and appearance constraints are enforced during the alignment; therefore, we are able to obtain a recovered model with accurate geometry as well as high-fidelity color maps for the mesh. Experiments on both synthetic and various real datasets have demonstrated the capability of our approach to reconstruct complete and watertight deformable objects. Full article
(This article belongs to the Special Issue Visual Sensors)
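The drift-correction idea behind the explicit loop closure can be shown with a one-dimensional toy example: pairwise estimates are chained, the loop residual is measured, and the error is distributed back over the chain. This is a crude rigid stand-in for the paper's global non-rigid registration, with invented offsets:

    # Toy loop closure on a chain of 1-D pairwise alignments. Each pairwise offset is
    # noisy; naive chaining drifts, and closing the loop redistributes the residual.
    import numpy as np

    rng = np.random.default_rng(0)
    true_offsets = np.array([1.0, 2.0, -0.5, 1.5, -4.0])      # last edge closes the loop
    measured = true_offsets + rng.normal(0, 0.05, size=5)      # noisy pairwise registrations

    # Drifted poses from naive chaining (pose of scan 0 is fixed at 0).
    poses = np.concatenate([[0.0], np.cumsum(measured[:-1])])

    # Loop constraint: chaining all five edges should return to pose 0.
    loop_residual = np.cumsum(measured)[-1]
    print("loop residual before closure:", loop_residual)

    # Distribute the residual evenly along the chain (simple pose-graph relaxation).
    correction = -loop_residual * np.arange(len(poses)) / len(measured)
    closed_poses = poses + correction
    print("poses after loop closure:", np.round(closed_poses, 3))

In the full system, the "poses" are non-rigid deformation fields and the global registration is a joint optimization over all partial scans, but the role of the loop constraint is the same: it prevents accumulated error from breaking the final watertight model.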