Sensors Signal Processing and Visual Computing

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Intelligent Sensors".

Deadline for manuscript submissions: closed (31 December 2018) | Viewed by 259977

Special Issue Information

Dear Colleagues,

Signal processing and visual computing research plays an important role in industrial and scientific applications. With the rapid advance of sensor technology, a vast and ever-growing amount of data (i.e., Big Data) in various domains and modalities is readily available, for example, videos captured by a camera network. The emergence of Big Data has brought about a paradigm shift to many fields of data analytics such as signal and image processing and computer vision, namely from handcrafted feature extraction to high level feature learning through deep learning techniques. Therefore, the primary goal of this Special Issue of Sensors is to provide the opportunity for researchers and product developers to discuss the state-of-the-art and trends of architectures, techniques and systems for signal processing and visual understanding.

Topics of Interest:

This Special Issue aims to solicit contributions reporting the most recent progress in signal processing and visual computing. The list of possible topics includes, but is not limited to, the following:

  • Speech analysis
  • Radar signal processing
  • Remote sensing image processing
  • Biomedical signal/image analysis
  • High dimensional signal processing
  • Real-time signal/image processing algorithms and architectures (e.g., FPGA, DSP, GPU)
  • Wearable sensor signal processing and its applications
  • Sensor data fusion and integration
  • Image and video processing (e.g., denoising, deblurring, super-resolution, etc.)
  • Image and video understanding (e.g., novel feature extraction, classification, semantic segmentation, object detection and recognition, action recognition, tracking, etc.)
  • Machine learning (e.g., deep learning) in signal processing and visual computing
  • Big data processing infrastructures and systems, such as cloud computing, high performance computing, Web computing

Dr. Chen Chen
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (49 papers)


Research

19 pages, 5298 KiB  
Article
Spatio-Temporal Action Detection in Untrimmed Videos by Using Multimodal Features and Region Proposals
by Yeongtaek Song and Incheol Kim
Sensors 2019, 19(5), 1085; https://doi.org/10.3390/s19051085 - 03 Mar 2019
Cited by 5 | Viewed by 3618
Abstract
This paper proposes a novel deep neural network model for solving the spatio-temporal action-detection problem, by localizing all multiple-action regions and classifying the corresponding actions in an untrimmed video. The proposed model uses a spatio-temporal region proposal method to effectively detect multiple-action regions. First, in the temporal region proposal, anchor boxes are generated by targeting regions expected to potentially contain actions. Unlike conventional temporal region proposal methods, the proposed method uses a complementary two-stage method to effectively detect the temporal regions of the respective actions occurring asynchronously. In addition, a spatial region proposal process is used to detect the principal agent performing an action among the people appearing in a video. Further, coarse-level features contain comprehensive information of the whole video and have been frequently used in conventional action-detection studies; however, they cannot provide detailed information on each person performing an action in a video. To overcome this limitation of coarse-level features, the proposed model additionally learns fine-level features from the proposed action tubes in the video. Various experiments conducted using the LIRIS-HARL and UCF-10 datasets confirm the high performance and effectiveness of the proposed deep neural network model. Full article

14 pages, 2432 KiB  
Article
Real-Time Vehicle Make and Model Recognition with the Residual SqueezeNet Architecture
by Hyo Jong Lee, Ihsan Ullah, Weiguo Wan, Yongbin Gao and Zhijun Fang
Sensors 2019, 19(5), 982; https://doi.org/10.3390/s19050982 - 26 Feb 2019
Cited by 85 | Viewed by 6548
Abstract
Make and model recognition (MMR) of vehicles plays an important role in automatic vision-based systems. This paper proposes a novel deep learning approach for MMR using the SqueezeNet architecture. The frontal views of vehicle images are first extracted and fed into a deep network for training and testing. The SqueezeNet architecture with bypass connections between the Fire modules, a variant of the vanilla SqueezeNet, is employed for this study, which makes our MMR system more efficient. The experimental results on our collected large-scale vehicle datasets indicate that the proposed model achieves a 96.3% recognition rate at rank 1 with an economical time slice of 108.8 ms. For inference tasks, the deployed deep model requires less than 5 MB of space and thus has great viability in real-time applications. Full article
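For orientation, a minimal PyTorch sketch of a SqueezeNet-style Fire module with the bypass (residual) connection the paper adds between modules; the channel sizes here are illustrative, not those of the paper's network:

```python
import torch
import torch.nn as nn

class Fire(nn.Module):
    """SqueezeNet Fire module: 1x1 squeeze, then parallel 1x1/3x3 expand paths."""
    def __init__(self, in_ch: int, squeeze_ch: int, expand_ch: int, bypass: bool = False):
        super().__init__()
        self.bypass = bypass  # residual connection around the module
        self.squeeze = nn.Conv2d(in_ch, squeeze_ch, kernel_size=1)
        self.expand1 = nn.Conv2d(squeeze_ch, expand_ch // 2, kernel_size=1)
        self.expand3 = nn.Conv2d(squeeze_ch, expand_ch // 2, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        s = self.relu(self.squeeze(x))
        out = torch.cat([self.relu(self.expand1(s)), self.relu(self.expand3(s))], dim=1)
        if self.bypass:            # simple bypass requires in_ch == expand_ch
            out = out + x
        return out

x = torch.randn(1, 128, 56, 56)
print(Fire(128, 16, 128, bypass=True)(x).shape)  # torch.Size([1, 128, 56, 56])
```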

13 pages, 6274 KiB  
Article
A Novel Hierarchical Coding Progressive Transmission Method for WMSN Wildlife Images
by Wenzhao Feng, Chunhe Hu, Yuan Wang, Junguo Zhang and Hao Yan
Sensors 2019, 19(4), 946; https://doi.org/10.3390/s19040946 - 23 Feb 2019
Cited by 8 | Viewed by 3076
Abstract
In the wild, wireless multimedia sensor network (WMSN) communication has limited bandwidth, and the transmission of wildlife monitoring images always suffers signal interference, which is time-consuming or sometimes even causes failure. Generally, only part of each wildlife image is valuable; therefore, if we could transmit the images according to the importance of their content, the above issues could be avoided. Inspired by the progressive transmission strategy, we propose a hierarchical coding progressive transmission method in this paper, which can transmit the saliency object region (i.e., the animal) and its background with different coding strategies and priorities. Specifically, we first construct a convolutional neural network via the MobileNet model to detect the saliency object region and obtain the mask on the wildlife. Then, according to the importance of wavelet coefficients, set partitioning in hierarchical trees (SPIHT) lossless coding is utilized to transmit the saliency image, which ensures the transmission accuracy of the wildlife region. After that, the remaining background region is transmitted via the embedded zerotree wavelet (EZW) lossy coding strategy to improve transmission efficiency. To verify the efficiency of our algorithm, a demonstration of the transmission of field-captured wildlife images is presented. Further, comparison with the existing EZW and discrete cosine transform (DCT) algorithms shows that the proposed algorithm improves the peak signal-to-noise ratio (PSNR) by 21.11% and 14.72% and the structural similarity index (SSIM) by 9.47% and 6.25%, respectively. Full article
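Of the two quality metrics quoted above, PSNR reduces to a few lines of numpy; a minimal reference implementation for 8-bit images:

```python
import numpy as np

def psnr(reference: np.ndarray, reconstructed: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    mse = np.mean((reference.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

img = np.full((64, 64), 128, dtype=np.uint8)
noisy = np.clip(img + np.random.default_rng(0).normal(0, 5, img.shape), 0, 255)
print(round(psnr(img, noisy), 1))  # roughly 34 dB for sigma = 5 noise
```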

13 pages, 2446 KiB  
Article
A Hardware-Friendly Optical Flow-Based Time-to-Collision Estimation Algorithm
by Cong Shi, Zhuoran Dong, Shrinivas Pundlik and Gang Luo
Sensors 2019, 19(4), 807; https://doi.org/10.3390/s19040807 - 16 Feb 2019
Cited by 4 | Viewed by 3653
Abstract
This work proposes a hardware-friendly, dense optical flow-based Time-to-Collision (TTC) estimation algorithm intended to be deployed on smart video sensors for collision avoidance. The hardware-optimized algorithm first extracts biological visual motion features (motion energies), and then utilizes a Random Forests regressor to predict robust and dense optical flow. Finally, TTC is reliably estimated from the divergence of the optical flow field. This algorithm involves only feed-forward data flows with simple pixel-level operations, and hence has inherent parallelism for hardware acceleration. The algorithm offers good scalability, allowing for flexible tradeoffs among estimation accuracy, processing speed and hardware resources. Experimental evaluation shows that the accuracy of the optical flow estimation is improved by the use of Random Forests compared to existing voting-based approaches. Furthermore, results show that the TTC values estimated by the algorithm closely follow the ground truth. The specifics of the hardware design to implement the algorithm on a real-time embedded system are laid out. Full article
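The final step described above, TTC from the divergence of the flow field, is compact enough to show directly. A numpy sketch, assuming dense flow arrays u, v in pixels per frame and a radially expanding (looming) pattern, for which divergence is approximately 2/TTC:

```python
import numpy as np

def ttc_from_flow(u: np.ndarray, v: np.ndarray, frame_rate: float = 30.0) -> float:
    """Estimate time-to-collision (seconds) from a dense optical flow field."""
    du_dx = np.gradient(u, axis=1)             # horizontal flow derivative
    dv_dy = np.gradient(v, axis=0)             # vertical flow derivative
    divergence = np.mean(du_dx + dv_dy)        # units: 1/frame
    return 2.0 / (divergence * frame_rate)     # convert frames to seconds

# synthetic looming flow with a TTC of 60 frames (2 s at 30 fps)
h, w = 120, 160
ys, xs = np.mgrid[0:h, 0:w]
u = (xs - w / 2) / 60.0
v = (ys - h / 2) / 60.0
print(round(ttc_from_flow(u, v), 2))  # ~2.0
```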

23 pages, 845 KiB  
Article
Detecting Multi-Resolution Pedestrians Using Group Cost-Sensitive Boosting with Channel Features
by Chao Zhu and Xu-Cheng Yin
Sensors 2019, 19(4), 780; https://doi.org/10.3390/s19040780 - 14 Feb 2019
Cited by 3 | Viewed by 2973
Abstract
Significant progress has been achieved in the past few years on the challenging task of pedestrian detection. Nevertheless, a major bottleneck of existing state-of-the-art approaches lies in a great drop in performance with decreasing resolution of the detected targets. For the boosting-based detectors that are popular in the pedestrian detection literature, a possible cause of this drop is that in the boosting training process, low-resolution samples, which are usually harder to detect due to their missing details, are treated as equally important as high-resolution samples; this results in false negatives, since low-resolution samples are more easily rejected in the early stages and can hardly be recovered in the later stages. To address this problem, we propose a robust multi-resolution detection approach with a novel group cost-sensitive boosting algorithm, which is derived from the standard AdaBoost algorithm to further explore different costs for different resolution groups of samples in the boosting process, and to place greater emphasis on low-resolution groups in order to better handle the detection of multi-resolution targets. The effectiveness of the proposed approach is evaluated on the Caltech pedestrian benchmark and the KAIST (Korea Advanced Institute of Science and Technology) multispectral pedestrian benchmark, and validated by its promising performance on the resolution-specific test sets of both benchmarks. Full article
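The flavor of the weight update in cost-sensitive boosting can be sketched in a few lines: the standard AdaBoost exponential update is scaled by a per-resolution-group cost, so misclassified low-resolution samples are re-weighted more aggressively. The costs and the {-1, +1} stump outputs below are illustrative, not the paper's group costs:

```python
import numpy as np

def cost_sensitive_weight_update(w, y, pred, group, group_cost, alpha):
    """One boosting round: w_i <- w_i * exp(-alpha * c_i * y_i * h(x_i)).

    y, pred in {-1, +1}; group labels each sample's resolution group;
    group_cost maps a group to its misclassification cost (>1 emphasizes it).
    """
    c = np.asarray([group_cost[g] for g in group])
    w = w * np.exp(-alpha * c * y * pred)
    return w / w.sum()          # renormalize to a distribution

w = np.full(4, 0.25)
y     = np.array([+1, +1, -1, -1])    # positives: one hi-res, one low-res
pred  = np.array([+1, -1, -1, +1])    # the low-res positive is missed
group = ["high", "low", "high", "low"]
print(cost_sensitive_weight_update(w, y, pred, group,
                                   {"high": 1.0, "low": 2.0}, alpha=0.5))
# the missed low-res positive receives the largest new weight
```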

18 pages, 28319 KiB  
Article
FeinPhone: Low-cost Smartphone Camera-based 2D Particulate Matter Sensor
by Matthias Budde, Simon Leiner, Marcel Köpke, Johannes Riesterer, Till Riedel and Michael Beigl
Sensors 2019, 19(3), 749; https://doi.org/10.3390/s19030749 - 12 Feb 2019
Cited by 10 | Viewed by 8928
Abstract
Precise, location-specific fine dust measurement is central to the assessment of urban air quality. Classic measurement approaches require dedicated hardware, of which professional equipment is still prohibitively expensive (>$10k) for dense measurements, and inexpensive sensors do not meet accuracy demands. As a step towards filling this gap, we propose FeinPhone, a phone-based fine dust measurement system that uses the camera and flashlight functions readily available on today's off-the-shelf smartphones. We introduce a cost-effective passive hardware add-on together with a novel counting approach based on light-scattering particle sensors. Since our approach features a 2D sensor (the camera) instead of a single photodiode, we can employ it to capture the scatter traces from individual particles rather than just retaining a summed light intensity signal as in simple photometers. This is a more direct way of assessing the particle count, it is robust against side effects, e.g., from camera image compression, and it enables gaining information on the size spectrum of the particles. Our proof-of-concept evaluation, comparing several FeinPhone sensors with data from a high-quality APS/SMPS (Aerodynamic Particle Sizer/Scanning Mobility Particle Sizer) reference device at the World Calibration Center for Aerosol Physics, shows that the collected data correlate excellently with the inhalable coarse fraction of fine dust particles (r > 0.9) and successfully capture its levels under realistic conditions. Full article
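Counting individual scatter traces in a 2D frame, rather than summing intensity as a photometer does, is essentially thresholding plus connected-component labeling; a minimal sketch with scipy, on a synthetic frame:

```python
import numpy as np
from scipy import ndimage

def count_scatter_traces(frame: np.ndarray, threshold: float) -> int:
    """Count distinct bright blobs (particle scatter traces) in one frame."""
    mask = frame > threshold
    _, n_blobs = ndimage.label(mask)  # default 4-connectivity
    return n_blobs

rng = np.random.default_rng(0)
frame = rng.normal(10, 2, (240, 320))            # dark background noise
for y, x in [(50, 60), (120, 200), (200, 100)]:
    frame[y:y + 3, x:x + 3] += 100               # three synthetic traces
print(count_scatter_traces(frame, threshold=40))  # 3
```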

14 pages, 19797 KiB  
Article
Three-Dimensional Visualization System with Spatial Information for Navigation of Tele-Operated Robots
by Seung-Hun Kim, Chansung Jung and Jaeheung Park
Sensors 2019, 19(3), 746; https://doi.org/10.3390/s19030746 - 12 Feb 2019
Cited by 3 | Viewed by 3922
Abstract
This study describes a three-dimensional visualization system with spatial information for the effective control of a tele-operated robot. The environmental visualization system for operating the robot is very important: the tele-operated robot performs tasks in a disaster area that is not accessible to humans. The visualization system should perform in real time to cope with rapidly changing situations, and it should also provide accurate and high-level information so that the tele-operator can make the right decisions. The proposed system consists of four fisheye cameras and a 360° laser scanner. When the robot moves into an unknown space, a spatial model is created using the spatial information data of the laser scanner, and a single stitched image is created from the four camera images and mapped in real time. The visualized image contains the surrounding spatial information; hence, the tele-operator can not only grasp the surrounding space easily, but also know the relative position of the robot in space. In addition, the system provides various angles of view without moving the robot or sensor, thereby coping with various situations. The experimental results show that the proposed method produces a more natural appearance than conventional methods. Full article

12 pages, 2760 KiB  
Article
A Piezoelectric Sensor Signal Analysis Method for Identifying Persons Groups
by Hitoshi Ueno
Sensors 2019, 19(3), 733; https://doi.org/10.3390/s19030733 - 12 Feb 2019
Cited by 1 | Viewed by 4085
Abstract
There is an increasing number of elderly single-person households, and the lonely deaths that result are a social problem. We study a watching system for elderly families that works by laying piezoelectric sensors inside the house. The system raises few privacy issues because a piezoelectric sensor detects only a person's vibration signal. Furthermore, it is able to sense bio-signals, including the respiration cycle and the cardiac cycle. We propose a method of identifying the person who is on the sensor by analyzing the frequency spectrum of the bio-signal. Multiple harmonic peaks originating from the heartbeat appear in the frequency spectrum, and we use the peak shape as the discrimination criterion for identifying people. Full article
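The harmonic "comb" that the identification method relies on is easy to reproduce: take the FFT of the vibration signal and look for peaks at integer multiples of the heart rate. A sketch with a synthetic 1.2 Hz heartbeat (all parameters illustrative):

```python
import numpy as np

fs = 100.0                       # sampling rate (Hz)
t = np.arange(0, 60, 1 / fs)     # one minute of sensor data
f0 = 1.2                         # fundamental: ~72 beats per minute

# synthetic heartbeat vibration: fundamental plus decaying harmonics + noise
signal = sum((0.8 ** k) * np.sin(2 * np.pi * k * f0 * t) for k in range(1, 6))
signal += 0.2 * np.random.default_rng(1).normal(size=t.size)

spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(t.size, d=1 / fs)

# the largest peaks land near k * f0: the harmonic comb used for identification
top = freqs[np.argsort(spectrum)[-5:]]
print(np.sort(np.round(top, 2)))   # approximately [1.2, 2.4, 3.6, 4.8, 6.0]
```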

20 pages, 8021 KiB  
Article
Incorporating Negative Sample Training for Ship Detection Based on Deep Learning
by Lianru Gao, Yiqun He, Xu Sun, Xiuping Jia and Bing Zhang
Sensors 2019, 19(3), 684; https://doi.org/10.3390/s19030684 - 07 Feb 2019
Cited by 24 | Viewed by 4877
Abstract
While ship detection using high-resolution optical satellite images plays an important role in various civilian fields (including maritime traffic surveys and maritime rescue), it is a difficult task due to the influence of complex backgrounds, especially when ships are near land. In the current literature, land masking is generally required before ship detection to avoid many false alarms on land. However, sea–land segmentation not only carries the risk of segmentation errors, but also requires expertise to adjust parameters. In this study, the Faster Region-based Convolutional Neural Network (Faster R-CNN) is applied to detect ships without the need for land masking. We propose an effective training strategy for the Faster R-CNN by incorporating a large number of images containing only terrestrial regions as negative samples, without any manual marking, which differs from the targeted selection of negative samples in other detection methods. Experiments using Gaofen-1 satellite (GF-1), Gaofen-2 satellite (GF-2), and Jilin-1 satellite (JL-1) images as testing datasets under different ship detection conditions were carried out to evaluate the effectiveness of the proposed strategy in avoiding false alarms on land. The results show that the method incorporating negative sample training can largely reduce false alarms in terrestrial areas, and is superior in detection performance, algorithm complexity, and time consumption. Compared with the method based on sea–land segmentation, the proposed method achieves an absolute increase of 70% in F1-measure when the image contains a large land area, such as the GF-1 image, and an absolute increase of 42.5% for images with complex harbors and many coastal ships, such as the JL-1 images. Full article

14 pages, 4325 KiB  
Article
Low-Dose Computed Tomography Image Super-Resolution Reconstruction via Random Forests
by Peijian Gu, Changhui Jiang, Min Ji, Qiyang Zhang, Yongshuai Ge, Dong Liang, Xin Liu, Yongfeng Yang, Hairong Zheng and Zhanli Hu
Sensors 2019, 19(1), 207; https://doi.org/10.3390/s19010207 - 08 Jan 2019
Cited by 13 | Viewed by 4309
Abstract
Aiming at reducing computed tomography (CT) scan radiation while ensuring CT image quality, a new low-dose CT super-resolution reconstruction method based on combining a random forest with coupled dictionary learning is proposed. The random forest classifier finds the optimal solution of the mapping relationship between low-dose CT (LDCT) images and high-dose CT (HDCT) images and then completes CT image reconstruction by coupled dictionary learning. An iterative method is developed to improve robustness, the important coefficients for the tree structure are discussed and the optimal solutions are reported. The proposed method is further compared with a traditional interpolation method. The results show that the proposed algorithm can obtain a higher peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) and has better ability to reduce noise and artifacts. This method can be applied to many different medical imaging fields in the future, and the addition of multithreaded computing can reduce time consumption. Full article
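The core mapping described above, regressing from degraded patches to their clean counterparts with a random forest, can be sketched with scikit-learn on synthetic data; the patch pipeline here is greatly simplified relative to the paper's coupled dictionary learning:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# synthetic pairs: smooth "high-dose" 5x5 patches and noisy "low-dose" versions
n, p = 2000, 5
base = rng.normal(size=(n, 1, 1))
hi = base + 0.1 * rng.normal(size=(n, p, p))   # clean, spatially coherent patches
lo = hi + 0.5 * rng.normal(size=(n, p, p))     # heavy "low-dose" noise

X = lo.reshape(n, -1)            # flattened low-dose patch = feature vector
y = hi[:, p // 2, p // 2]        # regression target: the clean center pixel

forest = RandomForestRegressor(n_estimators=50, random_state=0)
forest.fit(X[:1500], y[:1500])
pred = forest.predict(X[1500:])

print("MSE, noisy center pixel:", round(float(np.mean((X[1500:, (p * p) // 2] - y[1500:]) ** 2)), 3))
print("MSE, forest prediction :", round(float(np.mean((pred - y[1500:]) ** 2)), 3))
```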

17 pages, 2027 KiB  
Article
Dynamic Pose Estimation Using Multiple RGB-D Cameras
by Sungjin Hong and Yejin Kim
Sensors 2018, 18(11), 3865; https://doi.org/10.3390/s18113865 - 10 Nov 2018
Cited by 16 | Viewed by 5419
Abstract
Human poses are difficult to estimate due to the complicated body structure and the self-occlusion problem. In this paper, we introduce a marker-less system for human pose estimation by detecting and tracking key body parts, namely the head, hands, and feet. Given color and depth images captured by multiple red, green, blue, and depth (RGB-D) cameras, our system constructs a graph model with segmented regions from each camera and detects the key body parts as a set of extreme points based on accumulative geodesic distances in the graph. During the search process, local detection using a supervised learning model is utilized to match local body features. A final set of extreme points is selected with a voting scheme and tracked with physical constraints from the unified data received from the multiple cameras. During the tracking process, a Kalman filter-based method is introduced to reduce positional noise and to recover from failures in tracking the extremes. Our system shows an average of 87% accuracy against the commercial system, outperforms the previous multi-Kinect system, and can be applied to recognize a human action or to synthesize a motion sequence from a few key poses using a small set of extremes as input data. Full article
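For reference, the Kalman filtering step used to denoise the tracked extreme points can be sketched with a constant-velocity model; the state layout and noise covariances below are illustrative assumptions, not the paper's values:

```python
import numpy as np

class ConstantVelocityKF:
    """Kalman filter with state [x, y, vx, vy] and position-only measurements."""
    def __init__(self, q: float = 1e-3, r: float = 1e-1):
        self.x = np.zeros(4)                                     # state estimate
        self.P = np.eye(4)                                       # state covariance
        self.F = np.eye(4); self.F[0, 2] = self.F[1, 3] = 1.0    # dt = 1 frame
        self.H = np.zeros((2, 4)); self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = q * np.eye(4)                                   # process noise
        self.R = r * np.eye(2)                                   # measurement noise

    def step(self, z):
        # predict
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # update (skipped when the detector lost the extreme point: z is None)
        if z is not None:
            S = self.H @ self.P @ self.H.T + self.R
            K = self.P @ self.H.T @ np.linalg.inv(S)
            self.x = self.x + K @ (np.asarray(z) - self.H @ self.x)
            self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]

kf = ConstantVelocityKF()
rng = np.random.default_rng(3)
for t in range(10):
    z = None if t == 5 else (t + 0.1 * rng.standard_normal(), 2.0 * t)  # one dropout
    print(np.round(kf.step(z), 2))
```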

16 pages, 5556 KiB  
Article
Three-Dimensional Holographic Electromagnetic Imaging for Accessing Brain Stroke
by Lulu Wang
Sensors 2018, 18(11), 3852; https://doi.org/10.3390/s18113852 - 09 Nov 2018
Cited by 5 | Viewed by 2789
Abstract
The authors recently developed a two-dimensional (2D) holographic electromagnetic induction imaging (HEI) method for biomedical imaging applications. However, this method was unable to detect small inclusions accurately. For example, only one of two inclusions could be detected in the reconstructed image if the two inclusions were located at the same XY plane but in different Z-directions. This paper provides a theoretical framework for three-dimensional (3D) HEI to accurately and effectively detect inclusions embedded in a biological object. A numerical system, including a realistic head phantom, a 16-element excitation sensor array, a 16-element receiving sensor array, and an image processing model, has been developed to evaluate the effectiveness of the proposed method for detecting small strokes. The achieved 3D HEI images have been compared with 2D HEI images. Simulation results show that the 3D HEI method can accurately and effectively identify small inclusions even when two inclusions are located at the same XY plane but in different Z-directions. This preliminary study shows that the proposed method has the potential to be developed into a useful imaging tool for the diagnosis of neurological diseases and injuries in the future. Full article

17 pages, 7449 KiB  
Article
Fusing Infrared and Visible Images of Different Resolutions via Total Variation Model
by Qinglei Du, Han Xu, Yong Ma, Jun Huang and Fan Fan
Sensors 2018, 18(11), 3827; https://doi.org/10.3390/s18113827 - 08 Nov 2018
Cited by 52 | Viewed by 3538
Abstract
In infrared and visible image fusion, existing methods typically have a prerequisite that the source images share the same resolution. However, due to limitations of hardware devices and application environments, infrared images constantly suffer from markedly lower resolution compared with the corresponding visible images. In this case, current fusion methods inevitably cause texture information loss in visible images or blur thermal radiation information in infrared images. Moreover, the principle of existing fusion rules typically focuses on preserving texture details in source images, which may be inappropriate for fusing infrared thermal radiation information because it is characterized by pixel intensities, possibly neglecting the prominence of targets in fused images. Faced with such difficulties and challenges, we propose a novel method to fuse infrared and visible images of different resolutions and generate high-resolution resulting images to obtain clear and accurate fused images. Specifically, the fusion problem is formulated as a total variation (TV) minimization problem. The data fidelity term constrains the pixel intensity similarity of the downsampled fused image with respect to the infrared image, and the regularization term compels the gradient similarity of the fused image with respect to the visible image. The fast iterative shrinkage-thresholding algorithm (FISTA) framework is applied to improve the convergence rate. Our resulting fused images are similar to super-resolved infrared images, which are sharpened by the texture information from visible images. Advantages and innovations of our method are demonstrated by the qualitative and quantitative comparisons with six state-of-the-art methods on publicly available datasets. Full article
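Spelled out, the variational model sketched above plausibly takes the following form (the notation here is assumed for illustration, not copied from the paper):

$$\min_{f}\; \big\lVert D(f) - u_{\mathrm{ir}} \big\rVert_2^2 \;+\; \lambda\, \big\lVert \nabla f - \nabla u_{\mathrm{vis}} \big\rVert_1$$

where f is the high-resolution fused image, D(·) downsamples f to the infrared resolution, u_ir and u_vis are the infrared and visible inputs, and λ balances thermal-intensity fidelity against visible texture. FISTA applies because the objective splits into a smooth quadratic data term and a non-smooth l1 gradient term.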

15 pages, 2611 KiB  
Article
High-Resolution Aerial Imagery Semantic Labeling with Dense Pyramid Network
by Xuran Pan, Lianru Gao, Bing Zhang, Fan Yang and Wenzhi Liao
Sensors 2018, 18(11), 3774; https://doi.org/10.3390/s18113774 - 05 Nov 2018
Cited by 31 | Viewed by 4677
Abstract
Semantic segmentation of high-resolution aerial images is of great importance in certain fields, but the increasing spatial resolution brings large intra-class variance and small inter-class differences that can lead to classification ambiguities. Based on high-level contextual features, the deep convolutional neural network (DCNN) is an effective method for semantic segmentation of high-resolution aerial imagery. In this work, a novel dense pyramid network (DPN) is proposed for semantic segmentation. The network starts with group convolutions to process multi-sensor data channel-wise, extracting the feature maps of each channel separately; by doing so, more information from each channel can be preserved. This process is followed by a channel shuffle operation to enhance the representation ability of the network. Then, four densely connected convolutional blocks are utilized to both extract features and take full advantage of them. The pyramid pooling module, combined with two convolutional layers, is set to fuse multi-resolution and multi-sensor features through an effective global scene prior, producing the probability map for each class. Moreover, a median frequency balanced focal loss is proposed to replace the standard cross-entropy loss in the training phase to deal with the class imbalance problem. We evaluate the dense pyramid network on the International Society for Photogrammetry and Remote Sensing (ISPRS) Vaihingen and Potsdam 2D semantic labeling datasets, and the results demonstrate that the proposed framework exhibits better performance compared to state-of-the-art baselines. Full article
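The proposed loss combines two standard ingredients, and a small numpy sketch makes the combination concrete: focal loss weighted per class by median frequency balancing (weight = median class frequency / class frequency). Shapes, counts and γ below are illustrative:

```python
import numpy as np

def median_frequency_weights(label_counts: np.ndarray) -> np.ndarray:
    """w_c = median(freq) / freq_c: rare classes get weights > 1."""
    freq = label_counts / label_counts.sum()
    return np.median(freq) / freq

def balanced_focal_loss(probs: np.ndarray, labels: np.ndarray,
                        weights: np.ndarray, gamma: float = 2.0) -> float:
    """Mean of -w_c * (1 - p_t)^gamma * log(p_t) over pixels."""
    p_t = probs[np.arange(labels.size), labels]   # probability of the true class
    w = weights[labels]
    return float(np.mean(-w * (1.0 - p_t) ** gamma * np.log(p_t + 1e-12)))

counts = np.array([900_000, 90_000, 10_000])      # imbalanced pixel counts
w = median_frequency_weights(counts)              # [0.1, 1.0, 9.0]
probs = np.array([[0.7, 0.2, 0.1],
                  [0.3, 0.6, 0.1],
                  [0.5, 0.3, 0.2]])                # per-pixel class probabilities
labels = np.array([0, 1, 2])
print(np.round(w, 3), round(balanced_focal_loss(probs, labels, w), 4))
```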

19 pages, 2702 KiB  
Article
Hot Anchors: A Heuristic Anchors Sampling Method in RCNN-Based Object Detection
by Jinpeng Zhang, Jinming Zhang and Shan Yu
Sensors 2018, 18(10), 3415; https://doi.org/10.3390/s18103415 - 11 Oct 2018
Cited by 8 | Viewed by 2969
Abstract
In the image object detection task, a huge number of candidate boxes are generated to match a relatively small number of ground-truth boxes, and through this matching the learning samples are created. In fact, however, the vast majority of the candidate boxes do not contain valid object instances and should be recognized and rejected during the training and evaluation of the network. This leads to an extra high computational burden and a serious imbalance between object and non-object samples, thereby impeding the algorithm's performance. Here we propose a new heuristic sampling method to generate candidate boxes for two-stage detection algorithms. It is generally applicable to current two-stage detection algorithms and improves their detection performance. Experiments on the COCO dataset showed that, relative to the baseline model, this new method significantly increases detection accuracy and efficiency. Full article
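One way to picture such a heuristic, sketched below as an assumption rather than the paper's exact rule: instead of tiling anchors at every feature-map location, sample anchor centers only where the backbone activations are strong ("hot"):

```python
import numpy as np

def hot_anchor_centers(feature_map: np.ndarray, k: int = 100) -> np.ndarray:
    """Pick the k highest-activation locations of a (C, H, W) feature map
    as anchor centers, instead of placing anchors at every position."""
    act = np.abs(feature_map).sum(axis=0)        # aggregate over channels
    flat = np.argsort(act.ravel())[-k:]          # top-k activations
    rows, cols = np.unravel_index(flat, act.shape)
    return np.stack([rows, cols], axis=1)        # (k, 2) anchor centers

fmap = np.random.default_rng(0).normal(size=(256, 32, 32))
print(hot_anchor_centers(fmap, k=5))
```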

16 pages, 29486 KiB  
Article
Efficient Patch-Wise Semantic Segmentation for Large-Scale Remote Sensing Images
by Yan Liu, Qirui Ren, Jiahui Geng, Meng Ding and Jiangyun Li
Sensors 2018, 18(10), 3232; https://doi.org/10.3390/s18103232 - 25 Sep 2018
Cited by 57 | Viewed by 6460
Abstract
Efficient and accurate semantic segmentation is the key technique for automatic remote sensing image analysis. While there have been many segmentation methods based on traditional hand-crafted feature extractors, it is still challenging to process high-resolution and large-scale remote sensing images. In this work, a novel patch-wise semantic segmentation method with a new training strategy based on fully convolutional networks is presented to segment common land resources. First, to handle the high-resolution imagery, the images are split into local patches and a patch-wise network is built. Second, the training data are preprocessed in several ways to meet the specific characteristics of remote sensing images, i.e., color imbalance, object rotation variations and lens distortion. Third, a multi-scale training strategy is developed to address the severe scale variation problem. In addition, the impact of the conditional random field (CRF) on precision is studied. The proposed method was evaluated on a dataset collected from a capital city in West China with the Gaofen-2 satellite. The dataset contains ten common land resources (grassland, road, etc.). The experimental results show that the proposed algorithm achieves 54.96% mean intersection over union (MIoU) and outperforms other state-of-the-art methods in remote sensing image segmentation. Full article
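The first step, splitting an oversized scene into overlapping local patches, looks roughly like the following numpy sketch; patch and stride sizes are illustrative, and the overlap ensures objects cut by one tile boundary appear whole in a neighboring tile:

```python
import numpy as np

def _starts(size: int, patch: int, stride: int) -> list:
    """Tile start offsets; the last tile is pinned to the image border."""
    s = list(range(0, size - patch + 1, stride))
    if s[-1] != size - patch:
        s.append(size - patch)
    return s

def split_into_patches(image: np.ndarray, patch: int = 512, stride: int = 384):
    """Yield (row, col, tile) covering the whole image with overlap."""
    h, w = image.shape[:2]
    for r in _starts(h, patch, stride):
        for c in _starts(w, patch, stride):
            yield r, c, image[r:r + patch, c:c + patch]

img = np.zeros((1024, 1536, 3), dtype=np.uint8)
print(sum(1 for _ in split_into_patches(img)))  # 12 tiles
```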

25 pages, 6498 KiB  
Article
A Novel Parameter Estimation Method Based on a Tuneable Sigmoid in Alpha-Stable Distribution Noise Environments
by Li Li, Nicolas H. Younan and Xiaofei Shi
Sensors 2018, 18(9), 3012; https://doi.org/10.3390/s18093012 - 08 Sep 2018
Cited by 4 | Viewed by 3097
Abstract
In this paper, a novel method that employs a fractional Fourier transform and a tuneable Sigmoid transform is proposed in order to estimate the Doppler stretch and time delay of wideband echoes for a linear frequency modulation (LFM) pulse radar in an alpha-stable distribution noise environment. Two novel functions, a tuneable Sigmoid fractional correlation function (TS-FC) and a tuneable Sigmoid fractional power spectrum density (TS-FPSD), are presented in this paper. A novel algorithm based on the TS-FPSD is then proposed to estimate the Doppler stretch and the time delay. Then, the derivation of unbiasedness and consistency is presented. Furthermore, the boundedness of the TS-FPSD in symmetric alpha-stable (SαS) noise, the parameter selection of the TS-FPSD, and the feasibility analysis of the TS-FPSD are presented to evaluate the performance of the proposed method. In addition, the Cramér–Rao bound for parameter estimation is derived and computed in closed form, which shows that better performance has been achieved. Simulation results and theoretical analysis are presented to demonstrate the applicability of the foregoing method. It is shown that the proposed method not only effectively suppresses impulsive noise interference, but also achieves higher estimation accuracy without requiring a priori knowledge of the noise in alpha-stable distribution noise environments. Full article
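For intuition, one common form of such a tuneable Sigmoid nonlinearity is sketched below; the exact definition used in the paper may differ, so treat the formula and λ as assumptions. Because the output is bounded in (-1, 1), the heavy-tailed spikes of alpha-stable noise cannot dominate a subsequent correlation:

```python
import numpy as np

def tuneable_sigmoid(x: np.ndarray, lam: float = 1.0) -> np.ndarray:
    """Bounded zero-memory nonlinearity: maps R to (-1, 1), taming the
    impulsive spikes of alpha-stable noise; lam tunes the slope at zero."""
    return 2.0 / (1.0 + np.exp(-lam * x)) - 1.0

spiky = np.array([0.1, -0.3, 250.0, 0.2, -900.0])   # impulsive samples
print(np.round(tuneable_sigmoid(spiky), 3))          # outliers clipped near +/-1
```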

16 pages, 9179 KiB  
Article
High-Fidelity Inhomogeneous Ground Clutter Simulation of Airborne Phased Array PD Radar Aided by Digital Elevation Model and Digital Land Classification Data
by Hai Li, Jie Wang, Yi Fan and Jungong Han
Sensors 2018, 18(9), 2925; https://doi.org/10.3390/s18092925 - 03 Sep 2018
Cited by 8 | Viewed by 4112
Abstract
This paper presents a high-fidelity inhomogeneous ground clutter simulation method for airborne phased array Pulse Doppler (PD) radar aided by a digital elevation model (DEM) and digital land classification data (DLCD). The method starts by extracting the basic geographic information of the Earth's surface scattering points from the DEM data, then reads the surface classification codes of those scattering points from the DLCD. After determining the landform types, different backscattering coefficient models are selected to calculate the backscattering coefficient of each surface scattering point. Finally, the high-fidelity inhomogeneous ground clutter simulation of airborne phased array PD radar is realized based on the Ward model. The simulation results show that the classifications of landform types obtained by the proposed method are more abundant, and the ground clutter simulated with the different backscattering coefficient models is more realistic and effective. Full article

16 pages, 4748 KiB  
Article
Unmanned Aerial Vehicle Object Tracking by Correlation Filter with Adaptive Appearance Model
by Xizhe Xue, Ying Li and Qiang Shen
Sensors 2018, 18(9), 2751; https://doi.org/10.3390/s18092751 - 21 Aug 2018
Cited by 18 | Viewed by 3326
Abstract
With the increasing availability of low-cost, commercially available unmanned aerial vehicles (UAVs), visual tracking using UAVs has become more and more important due to its many new applications, including automatic navigation, obstacle avoidance, traffic monitoring, and search and rescue. However, real-world aerial tracking poses many challenges due to platform motion and image instability, such as aspect ratio change, viewpoint change, fast motion, and scale variation. In this paper, an efficient object tracking method for UAV videos is proposed to tackle these challenges. We construct fused features to capture gradient information and color characteristics simultaneously. Furthermore, a cellular automaton is introduced to update the appearance template of the target accurately and sparsely. In particular, a high-confidence model updating strategy is developed according to a stability function. Systematic comparative evaluations performed on the popular UAV123 dataset show the efficiency of the proposed approach. Full article
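High-confidence model updating is commonly scored with the peak-to-sidelobe ratio (PSR) of the response map; the sketch below gates the template update on PSR as an illustration of the idea, noting that the paper's actual stability function may differ:

```python
import numpy as np

def peak_to_sidelobe_ratio(response: np.ndarray, exclude: int = 5) -> float:
    """PSR = (peak - mean(sidelobe)) / std(sidelobe); higher = more confident."""
    r0, c0 = np.unravel_index(np.argmax(response), response.shape)
    mask = np.ones_like(response, dtype=bool)
    mask[max(r0 - exclude, 0):r0 + exclude + 1,
         max(c0 - exclude, 0):c0 + exclude + 1] = False  # cut out the peak area
    side = response[mask]
    return float((response[r0, c0] - side.mean()) / (side.std() + 1e-12))

def maybe_update(template, patch, response, lr=0.02, psr_threshold=6.0):
    """Blend the new patch into the template only on confident frames."""
    if peak_to_sidelobe_ratio(response) > psr_threshold:
        template = (1 - lr) * template + lr * patch
    return template
```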

20 pages, 17920 KiB  
Article
A High Precision Quality Inspection System for Steel Bars Based on Machine Vision
by Xinman Zhang, Jiayu Zhang, Mei Ma, Zhiqi Chen, Shuangling Yue, Tingting He and Xuebin Xu
Sensors 2018, 18(8), 2732; https://doi.org/10.3390/s18082732 - 20 Aug 2018
Cited by 24 | Viewed by 5252
Abstract
Steel bars play an important role in modern construction projects, and their quality enormously affects the safety of buildings. It is urgent to detect whether steel bars meet specifications or not. However, the existing manual detection methods are costly, slow and offer poor precision. In order to solve these problems, a high-precision quality inspection system for steel bars based on machine vision is developed. We propose two algorithms: the sub-pixel boundary location method (SPBLM) and the fast stitch method (FSM). A total of five sensors, including a CMOS, a level sensor, a proximity switch, a voltage sensor, and a current sensor, are used to detect the device conditions and capture images or video. The device can capture abundant, high-definition images and video taken by a smartphone held uniformly and stably at the construction site, and the data can be processed in real time on the smartphone. Furthermore, the detection results, including steel bar diameter, spacing, and quantity, are given by a practical app. The system has rather high accuracy (absolute error as low as 0.04 mm and relative error as low as 0.002% when calculating diameter and spacing; zero error in counting the number of steel bars) when performing inspection tasks, and the three parameters can be detected at the same time. None of these features are available in existing systems, and the device and method can be widely used for steel bar quality inspection at construction sites. Full article

22 pages, 28562 KiB  
Article
Automated Calibration Method for Eye-Tracked Autostereoscopic Display
by Hyoseok Hwang
Sensors 2018, 18(8), 2614; https://doi.org/10.3390/s18082614 - 09 Aug 2018
Cited by 1 | Viewed by 3060
Abstract
In this paper, we propose an automated calibration system for an eye-tracked autostereoscopic display (ETAD). Instead of calibrating each device sequentially and individually, our method calibrates all parameters of the devices at the same time in a fixed environment. To achieve this, we first identify and classify all parameters by establishing a physical model of the ETAD and describe a rendering method based on the viewer's eye position. Then, we propose a calibration method that estimates all parameters at the same time using two images. To automate the proposed method, we use a calibration module of our own design. Consequently, the calibration process is performed by analyzing two images, captured by the onboard camera of the ETAD and the external camera of the calibration module. For validation, we conducted two types of experiments, one with simulation for quantitative evaluation, and the other with a real prototype ETAD device for qualitative assessment. Experimental results demonstrate that the crosstalk of the ETAD was improved to 8.32%, and that the visual quality was improved by 30.44% in peak signal-to-noise ratio (PSNR) and 40.14% in the structural similarity (SSIM) index when the proposed calibration method was applied. The whole calibration process was carried out within 1.5 s without any external manipulation. Full article

18 pages, 1669 KiB  
Article
Modeling and Control of a Micro AUV: Objects Follower Approach
by Jesus Arturo Monroy-Anieva, Cyril Rouviere, Eduardo Campos-Mercado, Tomas Salgado-Jimenez and Luis Govinda Garcia-Valdovinos
Sensors 2018, 18(8), 2574; https://doi.org/10.3390/s18082574 - 06 Aug 2018
Cited by 11 | Viewed by 3719
Abstract
This work describes the modeling, control and development of a low-cost Micro Autonomous Underwater Vehicle (μ-AUV), named AR2D2. The main objective of this work is to make the vehicle detect and follow an object of a defined color by means of the readings of a depth sensor and the information provided by an artificial vision system. A nonlinear PD (Proportional-Derivative) controller is implemented on the vehicle in order to stabilize the heave and surge movements. A formal stability proof of the closed-loop system using Lyapunov's theory is given. Furthermore, the performance of the μ-AUV is validated through numerical simulations in MatLab and real-time experiments. Full article
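A toy simulation conveys the depth (heave) regulation loop; this is a linear PD sketch with made-up vehicle parameters, not the paper's nonlinear controller or the real AR2D2 dynamics:

```python
import numpy as np

# toy heave dynamics: m * z'' = u - d * z'   (illustrative mass and drag)
m, d, dt = 5.0, 8.0, 0.01
kp, kd = 40.0, 18.0                       # PD gains (illustrative)
z, vz, z_ref = 0.0, 0.0, 1.5              # start at surface, target 1.5 m depth

for step in range(1500):                  # 15 s of simulated time
    e = z_ref - z
    u = kp * e - kd * vz                  # PD law: proportional on depth error,
    az = (u - d * vz) / m                 # derivative on vertical velocity
    vz += az * dt                         # forward-Euler integration
    z += vz * dt

print(round(z, 3))   # settles near 1.5 (well-damped closed loop)
```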

24 pages, 49671 KiB  
Article
High-Precision Detection of Defects of Tire Texture Through X-ray Imaging Based on Local Inverse Difference Moment Features
by Guo Zhao and Shiyin Qin
Sensors 2018, 18(8), 2524; https://doi.org/10.3390/s18082524 - 02 Aug 2018
Cited by 25 | Viewed by 5460
Abstract
Automatic defect detection is an important and challenging issue in tire industrial quality control. As is well known, the production quality of tires is directly related to vehicle running safety and passenger security. However, it is difficult to inspect the inner structure of a tire from its surface. This paper proposes a high-precision method for detecting defects in tire texture images obtained by an X-ray image sensor for non-destructive tire inspection. In this paper, the feature distribution generated by local inverse difference moment (LIDM) features is proposed as an effective representation of a tire X-ray texture image. Further, a defect feature map (DFM) may be constructed by computing the Hausdorff distance between the LIDM feature distributions of the original tire image and each sliding image patch. Moreover, the DFM may be enhanced by background suppression to improve the robustness of the defect detection algorithm. Finally, an effective defect detection algorithm is proposed to achieve pixel-level detection of defects with high precision over the enhanced DFM. In addition, the defect detection algorithm is not only robust to noise in the background, but also has a more powerful capability of handling defects of different shapes. To validate the performance of the proposed method, two kinds of experiments, on the defect feature map and on defect detection, are conducted to demonstrate its good performance. Moreover, a series of comparative analyses demonstrates that the proposed algorithm accurately detects defects and outperforms other algorithms in terms of various quantitative metrics. Full article
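The inverse difference moment underlying the LIDM features comes from the classic gray-level co-occurrence matrix (GLCM); a numpy sketch for a single horizontal offset (quantization level and offset are illustrative):

```python
import numpy as np

def glcm_horizontal(img: np.ndarray, levels: int = 16) -> np.ndarray:
    """Normalized co-occurrence matrix of horizontally adjacent pixel pairs."""
    q = np.clip((img.astype(np.float64) / 256.0 * levels).astype(int), 0, levels - 1)
    m = np.zeros((levels, levels))
    np.add.at(m, (q[:, :-1].ravel(), q[:, 1:].ravel()), 1.0)
    return m / m.sum()

def inverse_difference_moment(p: np.ndarray) -> float:
    """IDM (homogeneity): sum_ij p(i, j) / (1 + (i - j)^2)."""
    i, j = np.indices(p.shape)
    return float(np.sum(p / (1.0 + (i - j) ** 2)))

rng = np.random.default_rng(0)
smooth = np.tile(np.linspace(0, 255, 64), (64, 1)).astype(np.uint8)  # ramp texture
noise = rng.integers(0, 256, (64, 64)).astype(np.uint8)
print(round(inverse_difference_moment(glcm_horizontal(smooth)), 3))  # high, ~0.88
print(round(inverse_difference_moment(glcm_horizontal(noise)), 3))   # low,  ~0.15
```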

18 pages, 5915 KiB  
Article
In-Process Monitoring of Lack of Fusion in Ultra-Thin Sheets Edge Welding Using Machine Vision
by Yuxiang Hong, Baohua Chang, Guodong Peng, Zhang Yuan, Xiangchun Hou, Boce Xue and Dong Du
Sensors 2018, 18(8), 2411; https://doi.org/10.3390/s18082411 - 25 Jul 2018
Cited by 21 | Viewed by 5454
Abstract
Lack of fusion can often occur during the edge welding of ultra-thin sheets, severely degrading joint quality and leading to seal failure. This paper presents a vision-based weld pool monitoring method for detecting lack of fusion during micro plasma arc welding (MPAW) of ultra-thin sheet edge welds. A passive micro-vision sensor is developed to acquire clear images of the mesoscale weld pool under MPAW conditions, continuously and stably. Then, an image processing algorithm is proposed to extract the characteristics of the weld pool geometry from the acquired images in real time. The relations between the presence of lack of fusion in an edge weld and dynamic changes in the weld pool characteristic parameters are investigated. The experimental results indicate that abrupt changes of the extracted weld pool centroid position along the weld length are highly correlated with occurrences of lack of fusion. By using such weld pool characteristic information, lack of fusion in MPAW of ultra-thin sheet edge welds can be detected in real time. The proposed in-process monitoring method makes early warning possible. It can also provide feedback for real-time control and can serve as a basis for intelligent defect identification. Full article
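The monitored cue, the weld pool centroid, and an abrupt-jump test on it can be sketched in a few numpy lines; the jump threshold is illustrative:

```python
import numpy as np

def pool_centroid(mask: np.ndarray) -> np.ndarray:
    """Centroid (x, y) of a binary weld-pool mask."""
    ys, xs = np.nonzero(mask)
    return np.array([xs.mean(), ys.mean()])

def lack_of_fusion_alarm(centroids: np.ndarray, jump_px: float = 4.0) -> np.ndarray:
    """Flag frames where the centroid moves abruptly between consecutive frames."""
    steps = np.linalg.norm(np.diff(centroids, axis=0), axis=1)
    return steps > jump_px

c = np.array([[100.0, 50.0], [100.5, 50.1], [108.0, 50.2], [108.3, 50.2]])
print(lack_of_fusion_alarm(c))   # [False  True False]: one abrupt jump flagged
```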

20 pages, 3878 KiB  
Article
A 3D Relative-Motion Context Constraint-Based MAP Solution for Multiple-Object Tracking Problems
by Zhongli Wang, Litong Fan and Baigen Cai
Sensors 2018, 18(7), 2363; https://doi.org/10.3390/s18072363 - 20 Jul 2018
Cited by 1 | Viewed by 4009
Abstract
Multi-object tracking (MOT), especially using a moving monocular camera, is a very challenging task in the field of visual object tracking. The traditional tracking-by-detection approach to this problem is heavily dependent on detection results, and occlusions and mis-detections often lead to fragmented tracklets or drift. In this paper, the tasks of MOT and camera motion estimation are formulated as finding a maximum a posteriori (MAP) solution of the joint probability and are solved synchronously in a unified framework. To improve performance, we incorporate a three-dimensional (3D) relative-motion model into a sequential Bayesian framework to track multiple objects and estimate the camera's ego-motion. The 3D relative-motion model, which describes spatial relations among objects, is exploited for predicting object states robustly and recovering objects when occlusions and mis-detections occur. Reversible jump Markov chain Monte Carlo (RJMCMC) particle filtering is applied to solve the posterior estimation problem. Both quantitative and qualitative experiments with benchmark datasets and video collected on campus were conducted, confirming that the proposed method outperforms alternatives on many evaluation metrics. Full article

20 pages, 32642 KiB  
Article
Accelerating the K-Nearest Neighbors Filtering Algorithm to Optimize the Real-Time Classification of Human Brain Tumor in Hyperspectral Images
by Giordana Florimbi, Himar Fabelo, Emanuele Torti, Raquel Lazcano, Daniel Madroñal, Samuel Ortega, Ruben Salvador, Francesco Leporati, Giovanni Danese, Abelardo Báez-Quevedo, Gustavo M. Callicó, Eduardo Juárez, César Sanz and Roberto Sarmiento
Sensors 2018, 18(7), 2314; https://doi.org/10.3390/s18072314 - 17 Jul 2018
Cited by 29 | Viewed by 6462
Abstract
The use of hyperspectral imaging (HSI) in the medical field is an emerging approach to assist physicians in diagnostic or surgical guidance tasks. However, HSI data processing involves very high computational requirements due to the huge amount of information captured by the sensors. One of the stages with the highest computational load is the K-Nearest Neighbors (KNN) filtering algorithm. The main goal of this study is to optimize and parallelize the KNN algorithm by exploiting GPU technology to obtain real-time processing during brain cancer surgical procedures. This parallel version of the KNN performs the neighbor filtering of a classification map (obtained from a supervised classifier), evaluating the different classes simultaneously. The undertaken optimizations and the computational capabilities of the GPU device yield a speedup of up to 66.18× compared to a sequential implementation. Full article
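For intuition, a brute-force CPU baseline of KNN label filtering is easy to state; this is the O(N²) computation that the paper offloads to the GPU (the joint feature space below is an assumption for illustration):

```python
import numpy as np

def knn_filter(prob_map: np.ndarray, guide: np.ndarray, k: int = 40) -> np.ndarray:
    """Smooth per-pixel class probabilities by averaging over the K nearest
    pixels in a joint (row, col, guide-intensity) feature space."""
    h, w, n_classes = prob_map.shape
    rows, cols = np.mgrid[0:h, 0:w]
    feats = np.stack([rows.ravel().astype(np.float64),
                      cols.ravel().astype(np.float64),
                      50.0 * guide.ravel()], axis=1)
    flat = prob_map.reshape(-1, n_classes)
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)  # all-pairs distances
    nn = np.argsort(d2, axis=1)[:, :k]                           # K nearest per pixel
    return flat[nn].mean(axis=1).reshape(h, w, n_classes)

rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(4), size=(24, 24))   # noisy 24x24, 4-class map
guide = rng.random((24, 24))                       # e.g., one spectral band
print(knn_filter(probs, guide).shape)              # (24, 24, 4)
```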

18 pages, 4784 KiB  
Article
Amplitude-Based Filtering for Video Magnification in Presence of Large Motion
by Xiu Wu, Xuezhi Yang, Jing Jin and Zhao Yang
Sensors 2018, 18(7), 2312; https://doi.org/10.3390/s18072312 - 17 Jul 2018
Cited by 19 | Viewed by 4215
Abstract
Video magnification reveals important and informative subtle variations in the world. These signals are often combined with large motions, which results in significant blurring artifacts and haloes when conventional video magnification approaches are used. To counter these issues, this paper presents an amplitude-based filtering algorithm that can magnify small changes in video in the presence of large motions. We seek to understand the amplitude characteristics of small changes and large motions with the goal of extracting accurate signals for visualization. Based on spectrum amplitude filtering, the large motions can be removed while small changes can still be magnified by the Eulerian approach. An advantage of this algorithm is that it can handle large motions, whether they are linear or nonlinear. Our experimental results show that the proposed method can amplify subtle variations in the presence of large motion, as well as significantly reduce artifacts. We demonstrate the presented algorithm by comparing it to the state of the art and provide subjective and objective evidence for the proposed method. Full article
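A one-pixel, 1D sketch conveys the idea of amplitude-based filtering: subtle variations occupy low-amplitude spectral components and large motions occupy high-amplitude ones, so magnify only the former. The thresholds below are illustrative:

```python
import numpy as np

def magnify_small_changes(pixel_ts: np.ndarray, alpha: float = 20.0,
                          amp_thresh: float = 100.0) -> np.ndarray:
    """Amplify only the low-amplitude temporal spectrum of one pixel's time
    series; large-amplitude components (large motion) pass through unchanged."""
    mean = pixel_ts.mean()
    spec = np.fft.rfft(pixel_ts - mean)
    small = np.abs(spec) < amp_thresh      # subtle variation -> small amplitude
    spec[small] *= (1.0 + alpha)           # Eulerian-style magnification
    return np.fft.irfft(spec, n=pixel_ts.size) + mean

t = np.arange(0, 10, 0.02)                      # 500 samples at 50 Hz
large = 8.0 * np.sin(2 * np.pi * 0.3 * t)       # large, slow motion
subtle = 0.02 * np.sin(2 * np.pi * 5.0 * t)     # tiny 5 Hz variation
out = magnify_small_changes(large + subtle)
# the 5 Hz component grows ~21x; the 0.3 Hz motion is left as-is
```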
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)
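A minimal one-pixel sketch of the idea, assuming a temporal Eulerian pipeline: frequency components whose spectral amplitude exceeds a threshold are treated as large motion and excluded, and the remaining band-passed small changes are amplified. The threshold, names and exact filter form are assumptions, not the paper's implementation.

    import numpy as np

    def magnify(signal, fs, f_lo, f_hi, alpha, amp_thresh):
        """Amplitude-based Eulerian magnification of one pixel's time series.

        Components with spectral amplitude above amp_thresh are treated as
        large motion and suppressed; remaining in-band small changes are
        amplified by alpha and added back to the input.
        """
        spec = np.fft.rfft(signal)
        freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
        small = np.abs(spec) < amp_thresh           # amplitude-based filtering
        band = (freqs >= f_lo) & (freqs <= f_hi)    # temporal band of interest
        gain = np.where(small & band, alpha, 0.0)
        return signal + np.fft.irfft(spec * gain, n=len(signal))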

12 pages, 3217 KiB  
Article
Dense RGB-D SLAM with Multiple Cameras
by Xinrui Meng, Wei Gao and Zhanyi Hu
Sensors 2018, 18(7), 2118; https://doi.org/10.3390/s18072118 - 02 Jul 2018
Cited by 15 | Viewed by 3913
Abstract
A multi-camera dense RGB-D SLAM (simultaneous localization and mapping) system has the potential both to speed up scene reconstruction and to improve localization accuracy, thanks to multiple mounted sensors and an enlarged effective field of view. To effectively tap the potential of the system, two issues must be understood: first, how to calibrate a system whose sensors usually share a small or no common field of view, so as to maximally increase the effective field of view; second, how to fuse the location information from the different sensors. In this work, a three-Kinect system is reported. For system calibration, two kinds of calibration methods are proposed: one is suitable for systems with an inertial measurement unit (IMU), using an improved hand–eye calibration method; the other is for pure visual SLAM without any other auxiliary sensors. In the RGB-D SLAM stage, we extend and improve a state-of-the-art single RGB-D SLAM method to the multi-camera case. We track the multiple cameras' poses independently and, at each moment, select the one with the minimal pose error as the reference pose to correct the other cameras' poses. To optimize the initial estimated pose, we improve the deformation graph by adding a device-number attribute to distinguish surfels built by different cameras and perform deformations according to the device number. We verify the accuracy of our extrinsic calibration methods in the experiment section and show the satisfactory models reconstructed by our multi-camera dense RGB-D SLAM. The RMSE (root-mean-square error) of the lengths measured in our reconstructed models is 1.55 cm (similar to state-of-the-art single-camera RGB-D SLAM systems).
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)
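A small sketch of the reference-pose correction step, under assumed conventions (4×4 world-from-camera poses, fixed rig-from-camera extrinsics from the calibration stage, and a scalar per-camera tracking error); the matrix conventions and error metric are assumptions, not the paper's definitions.

    import numpy as np

    def correct_poses(poses, errors, rig_from_cam):
        """Select the camera with minimal pose error as the reference at
        this frame and re-derive the other cameras' poses through the
        calibrated extrinsics.

        poses:        list of 4x4 world-from-camera_i matrices
                      (from independent per-camera tracking).
        errors:       per-camera tracking error for the current frame.
        rig_from_cam: fixed 4x4 rig-from-camera_i extrinsics.
        """
        r = int(np.argmin(errors))                  # reference camera index
        world_from_rig = poses[r] @ np.linalg.inv(rig_from_cam[r])
        return [world_from_rig @ rig_from_cam[i] for i in range(len(poses))]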

17 pages, 18598 KiB  
Article
Multi-Object Tracking with Correlation Filter for Autonomous Vehicle
by Dawei Zhao, Hao Fu, Liang Xiao, Tao Wu and Bin Dai
Sensors 2018, 18(7), 2004; https://doi.org/10.3390/s18072004 - 22 Jun 2018
Cited by 46 | Viewed by 5889
Abstract
Multi-object tracking is a crucial problem for autonomous vehicles. Most state-of-the-art approaches adopt the tracking-by-detection strategy, a two-step procedure consisting of a detection module and a tracking module. In this paper, we improve both steps. We improve the detection module by incorporating temporal information, which is beneficial for detecting small objects. For the tracking module, we propose a novel Correlation Filter tracker based on compressed deep Convolutional Neural Network (CNN) features. By carefully integrating these two modules, the proposed multi-object tracking approach can re-identify (ReID) a tracked object once it gets lost. Extensive experiments were performed on the KITTI and MOT2015 tracking benchmarks. Results indicate that our approach outperforms most state-of-the-art tracking approaches.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)
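The correlation-filter machinery underneath such trackers fits in a few lines; shown here is a MOSSE-style linear filter on a single feature channel for illustration only (the paper's tracker operates on compressed deep CNN features and adds further machinery).

    import numpy as np

    def train_filter(feat, target, lam=1e-2):
        """Closed-form correlation filter in the Fourier domain.

        feat:   (H, W) feature channel of the training patch.
        target: (H, W) desired (e.g., Gaussian-shaped) response.
        """
        F = np.fft.fft2(feat)
        G = np.fft.fft2(target)
        return (G * np.conj(F)) / (F * np.conj(F) + lam)   # H* as in MOSSE

    def detect(filt, feat):
        """Correlate the filter with a new patch; the response peak
        gives the object's translation."""
        resp = np.real(np.fft.ifft2(filt * np.fft.fft2(feat)))
        return np.unravel_index(np.argmax(resp), resp.shape)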

15 pages, 1788 KiB  
Article
Spectral-Spatial Feature Extraction of Hyperspectral Images Based on Propagation Filter
by Zhikun Chen, Junjun Jiang, Xinwei Jiang, Xiaoping Fang and Zhihua Cai
Sensors 2018, 18(6), 1978; https://doi.org/10.3390/s18061978 - 20 Jun 2018
Cited by 22 | Viewed by 4322
Abstract
Recently, image-filtering-based hyperspectral image (HSI) feature extraction has been widely studied. However, due to limited spatial resolution and the complexity of feature distributions, the problems of cross-region mixing after filtering and reduced spectral discriminability still remain. To address these issues, this paper proposes a spectral-spatial propagation filter (PF) based HSI feature extraction method that can effectively handle both problems. Since the dimensionality/band count of an HSI is typically high, principal component analysis (PCA) is first used to reduce the HSI dimensionality. Then, the principal components of the HSI are filtered with the PF. When cross-region mixture occurs in the image, the filter template reduces the weight assignments of the cross-region mixed pixels, handling the issue of cross-region mixed pixels simply and effectively. To validate the effectiveness of the proposed method, experiments are carried out on three common HSIs using support vector machine (SVM) classifiers with features learned by the PF. The experimental results demonstrate that the proposed method effectively extracts the spectral-spatial features of HSIs and significantly improves the accuracy of HSI classification.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)
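The first two pipeline stages can be sketched as below. Note the smoothing here uses a bilateral-style range weight as a simple stand-in for the propagation filter (whose weights are propagated recursively between adjacent pixels), so this shows the shape of the computation, not the PF itself; all parameter values are assumptions.

    import numpy as np

    def pca(hsi, n_comp=10):
        """Project an (H, W, B) hyperspectral cube onto its first
        principal components."""
        H, W, B = hsi.shape
        X = hsi.reshape(-1, B).astype(np.float64)
        X -= X.mean(axis=0)
        _, _, Vt = np.linalg.svd(X, full_matrices=False)
        return (X @ Vt[:n_comp].T).reshape(H, W, n_comp)

    def edge_aware_smooth(pc, radius=3, sigma_r=0.1):
        """Edge-aware smoothing of one principal component; the range
        weight shrinks for cross-region (dissimilar) pixels."""
        H, W = pc.shape
        pad = np.pad(pc, radius, mode='reflect')
        out = np.zeros_like(pc)
        norm = np.zeros_like(pc)
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                sh = pad[radius + dy:radius + dy + H,
                         radius + dx:radius + dx + W]
                w = np.exp(-(sh - pc) ** 2 / (2 * sigma_r ** 2))
                out += w * sh
                norm += w
        return out / norm

Each channel of pca(hsi) would be smoothed this way and the stacked results fed to the SVM classifier.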

24 pages, 8331 KiB  
Article
An Improved Randomized Local Binary Features for Keypoints Recognition
by Jinming Zhang, Zuren Feng, Jinpeng Zhang and Gang Li
Sensors 2018, 18(6), 1937; https://doi.org/10.3390/s18061937 - 14 Jun 2018
Cited by 5 | Viewed by 3036
Abstract
In this paper, we study randomized local binary features. Randomized local binary features have been used in many methods, such as RandomForests, RandomFerns, BRIEF, ORB and AKAZE, to match keypoints. However, in those existing methods, the randomness of the feature operators is reflected only in the sampling positions. In this paper, we find that the quality of the binary feature space can be greatly improved by increasing the randomness of the basic sampling operator. The key idea of our method is to use a Randomized Intensity Difference operator (we call it the RID operator) as the basic sampling operator to observe image patches. The randomness of RID operators is reflected in five aspects: grids, position, aperture, weights and channels. Compared with the traditional, incompletely randomized binary features (we call them RIT features), a completely randomized sampling manner can generate a higher-quality binary feature space. The RID operator can be used on both gray and color images. We embed different kinds of RID operators into RandomFerns and RandomForests classifiers to test their recognition rates on both image and video datasets. The experimental results show the excellent quality of our feature method. We also propose evaluation criteria for robustness and distinctiveness to observe the effects of randomization on the binary feature space.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)
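A hedged guess at how such an operator could be constructed, showing only the position, aperture and weight randomness (the grid and channel aspects and the paper's exact parameterization are omitted):

    import numpy as np

    rng = np.random.default_rng(0)

    def make_rid(patch_size, n_bits, max_aperture=4):
        """Draw one randomized fern: each bit compares two randomly
        placed, randomly sized mean apertures with a random weight."""
        ops = []
        for _ in range(n_bits):
            a = int(rng.integers(1, max_aperture + 1))   # aperture size
            p1 = rng.integers(0, patch_size - a, size=2)
            p2 = rng.integers(0, patch_size - a, size=2)
            w = rng.uniform(0.5, 1.5)                    # random weight
            ops.append((a, tuple(p1), tuple(p2), w))
        return ops

    def rid_index(patch, ops):
        """Evaluate the fern on a grayscale patch -> integer bin index
        usable by a RandomFerns/RandomForests classifier."""
        idx = 0
        for a, (y1, x1), (y2, x2), w in ops:
            m1 = patch[y1:y1 + a, x1:x1 + a].mean()
            m2 = patch[y2:y2 + a, x2:x2 + a].mean()
            idx = (idx << 1) | int(w * m1 > m2)
        return idx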

22 pages, 3097 KiB  
Article
A Bayesian Scene-Prior-Based Deep Network Model for Face Verification
by Huafeng Wang, Wenfeng Song, Wanquan Liu, Ning Song, Yuehai Wang and Haixia Pan
Sensors 2018, 18(6), 1906; https://doi.org/10.3390/s18061906 - 11 Jun 2018
Cited by 6 | Viewed by 3606
Abstract
Face recognition/verification has received great attention in both theory and application over the past two decades. Recently, deep learning has been considered a very powerful tool for improving the performance of face recognition/verification. With large labeled training datasets, the features obtained from deep learning networks can achieve higher accuracy than those from shallow networks. However, many reported face recognition/verification approaches rely heavily on a large and representative training set, and most of them tend to suffer a serious performance drop, or even fail to work, if fewer training samples per person are available. Hence, a small number of training samples may cause the deep features to vary greatly. We aim to solve this critical problem in this paper. Inspired by recent research in scene domain transfer, for a given face image, a new series of possible scenarios about this face can be deduced from the scene semantics extracted from other individuals in a face dataset. We believe that the "scene" or background in an image (that is, having samples with more different scenes for a given person) may determine the intrinsic features among the faces of the same individual. To validate this belief, we propose a Bayesian scene-prior-based deep learning model with the aim of extracting important features from background scenes. By learning a scene model on the basis of a labeled face dataset via the Bayesian idea, the proposed method transforms a face image into new face images by referring to the given face with the learnt scene dictionary. Because the newly derived faces may have scenes similar to the input face, the face-verification performance can be improved without background variance, while the number of training samples is significantly reduced. Experiments conducted on the Labeled Faces in the Wild (LFW) dataset view #2 subset illustrate that this model can increase the verification accuracy to 99.2% by means of scene transfer learning (99.12% in the literature with an unsupervised protocol). Meanwhile, our model achieves 94.3% accuracy on the YouTube Faces database (DB) (93.2% in the literature with an unsupervised protocol).
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)

18 pages, 2502 KiB  
Article
Crack Damage Detection Method via Multiple Visual Features and Efficient Multi-Task Learning Model
by Baoxian Wang, Weigang Zhao, Po Gao, Yufeng Zhang and Zhe Wang
Sensors 2018, 18(6), 1796; https://doi.org/10.3390/s18061796 - 02 Jun 2018
Cited by 27 | Viewed by 5141
Abstract
This paper proposes an effective and efficient model for concrete crack detection. The presented work consists of two modules: multi-view image feature extraction and multi-task crack region detection. Specifically, multiple visual features (such as texture, edge, etc.) of image regions are calculated, which can suppress various background noises (such as illumination, pockmarks, stripes, blurring, etc.). With the computed multiple visual features, a novel crack region detector is advocated using a multi-task learning framework, which involves restraining the variability of different crack region features and emphasizing the separability between crack region features and complex background ones. Furthermore, the extreme learning machine is utilized to construct this multi-task learning model, leading to high computing efficiency and good generalization. Experimental results on practical concrete images demonstrate that the developed algorithm achieves favorable crack detection performance compared with traditional crack detectors.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)
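The extreme learning machine at the core of the model trains in closed form, which is where the computing efficiency comes from. A basic single-task ELM sketch (the paper's multi-task regularization terms for crack-feature compactness and crack/background separability are not reproduced here):

    import numpy as np

    def train_elm(X, Y, n_hidden=200, lam=1e-3, seed=0):
        """Extreme learning machine: random fixed hidden layer plus
        closed-form ridge-regression output weights."""
        rng = np.random.default_rng(seed)
        W = rng.standard_normal((X.shape[1], n_hidden))
        b = rng.standard_normal(n_hidden)
        H = np.tanh(X @ W + b)                      # random feature map
        beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ Y)
        return W, b, beta

    def predict_elm(X, W, b, beta):
        """Score image-region feature vectors; rows of X would be the
        multi-view visual features of candidate regions."""
        return np.tanh(X @ W + b) @ beta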

21 pages, 10769 KiB  
Article
Combining Motion Compensation with Spatiotemporal Constraint for Video Deblurring
by Jing Li, Weiguo Gong and Weihong Li
Sensors 2018, 18(6), 1774; https://doi.org/10.3390/s18061774 - 01 Jun 2018
Cited by 2 | Viewed by 3005
Abstract
We propose a video deblurring method that combines motion compensation with a spatiotemporal constraint for restoring blurry video caused by camera shake. The proposed method makes full use of spatiotemporal information, not only in the blur kernel estimation but also in the latent sharp frame restoration. Firstly, we estimate a motion vector between the current and previous blurred frames and use it to derive a motion-compensated frame from the previously restored frame. Secondly, we propose a blur kernel estimation strategy that applies the derived motion-compensated frame to an improved regularization model, improving the quality of the estimated blur kernel and reducing the processing time. Thirdly, we propose a spatiotemporal constraint algorithm that introduces a temporal regularization term, which not only enhances temporal consistency but also suppresses noise and ringing artifacts in the deblurred video. Finally, we extend Fast Total Variation de-convolution (FTVd) to solve the minimization problem of the proposed spatiotemporal constraint energy function. Extensive experiments demonstrate that the proposed method achieves state-of-the-art results in both subjective and objective evaluations.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)

19 pages, 40946 KiB  
Article
Real-Time Large-Scale Dense Mapping with Surfels
by Xingyin Fu, Feng Zhu, Qingxiao Wu, Yunlei Sun, Rongrong Lu and Ruigang Yang
Sensors 2018, 18(5), 1493; https://doi.org/10.3390/s18051493 - 09 May 2018
Cited by 10 | Viewed by 5284
Abstract
Real-time dense mapping systems have been developed since the birth of consumer RGB-D cameras. Currently, two models are commonly used in dense mapping systems: the truncated signed distance function (TSDF) and the surfel. State-of-the-art dense mapping systems usually work fine within small-sized regions, but the generated dense surface may be unsatisfactory around loop closures when the system's tracking drift grows large. In addition, the efficiency of surfel-based systems drops when the number of model points in the map becomes large. In this paper, we propose to use two maps in the dense mapping system. The RGB-D images are integrated into a local surfel map. Old surfels that were reconstructed earlier and lie far away from the camera frustum are moved from the local map to the global map, so the number of surfels updated in the local map as each frame arrives stays bounded. Therefore, our system can reconstruct very large scenes while its frame rate remains high. We detect loop closures and optimize the pose graph to distribute the system's tracking drift. The positions and normals of the surfels in the map are also corrected using an embedded deformation graph so that they are consistent with the updated poses. To deal with large surface deformations, we propose a new method for constructing constraints from the system trajectories and loop closure keyframes, which stabilizes large-scale surface deformation. Experimental results show that our system outperforms prior state-of-the-art dense mapping systems.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)
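The local/global split reduces to a per-frame migration predicate over the local map; a minimal sketch, where the age and distance tests and their thresholds are assumptions chosen for illustration:

    import numpy as np

    def migrate_surfels(local_pts, local_stamps, frame_id, cam_center,
                        max_age=100, max_dist=5.0):
        """Flag surfels that are both old and far from the camera for
        migration from the bounded local map to the global map, keeping
        the per-frame update cost roughly constant.

        local_pts:    (N, 3) surfel positions in the local map.
        local_stamps: (N,) frame index at which each surfel was created.
        Returns (keep_local, move_to_global) boolean masks.
        """
        age = frame_id - local_stamps
        dist = np.linalg.norm(local_pts - cam_center, axis=1)
        move = (age > max_age) & (dist > max_dist)
        return ~move, move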

17 pages, 7945 KiB  
Article
Automated Inspection of Defects in Optical Fiber Connector End Face Using Novel Morphology Approaches
by Shuang Mei, Yudan Wang, Guojun Wen and Yang Hu
Sensors 2018, 18(5), 1408; https://doi.org/10.3390/s18051408 - 03 May 2018
Cited by 5 | Viewed by 6148
Abstract
The increasing deployment of optical fiber networks and the need for reliable high bandwidth make the inspection of optical fiber connector end faces a crucial process that must not be neglected. Traditional end face inspections are usually performed by manual visual methods, which are low in efficiency and poor in precision for long-term industrial applications. More seriously, the inspection results cannot be quantified for subsequent analysis. Addressing the characteristics of typical defects in the inspection process for optical fiber end faces, we propose a novel method, "difference of min-max ranking filtering" (DO2MR), for the detection of region-based defects, e.g., dirt, oil, contamination, pits, and chips, and a special model, the "linear enhancement inspector" (LEI), for the detection of scratches. DO2MR is a morphology method that determines whether a pixel belongs to a defective region by comparing the differences in gray values within the neighborhood around the pixel. LEI is also a morphology method, designed to search for scratches at different orientations with a special linear detector. These two approaches can easily be integrated into optical inspection equipment for automatic quality verification. As far as we know, this is the first time that complete defect detection methods for optical fiber end faces have been reported in the literature. Experimental results demonstrate that the proposed DO2MR and LEI models yield good comprehensive performance with high precision and acceptable recall rates, with image-level detection accuracies reaching 96.0% and 89.3%, respectively.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)
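The region-defect detector reads naturally as a rank-filter residual; a minimal sketch, where the window size and the mean-plus-gamma-sigma threshold are assumptions rather than the paper's tuned values:

    import numpy as np
    from scipy.ndimage import maximum_filter, minimum_filter

    def do2mr(img, size=5, gamma=3.0):
        """Difference of min-max ranking filtering: the max-minus-min
        residual is large inside region-based defects (dirt, pits,
        chips) and small on a clean end-face surface."""
        r = maximum_filter(img, size) - minimum_filter(img, size)
        mask = r > r.mean() + gamma * r.std()       # adaptive threshold
        return mask.astype(np.uint8)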

10 pages, 3456 KiB  
Article
Super-Resolution for “Jilin-1” Satellite Video Imagery via a Convolutional Network
by Aoran Xiao, Zhongyuan Wang, Lei Wang and Yexian Ren
Sensors 2018, 18(4), 1194; https://doi.org/10.3390/s18041194 - 13 Apr 2018
Cited by 54 | Viewed by 7089
Abstract
Super-resolution for satellite video is of great significance to Earth-observation accuracy, and the special imaging and transmission conditions on a video satellite pose great challenges to this task. Existing deep convolutional neural-network-based methods require pre-processing or post-processing to adapt to a high-resolution size or pixel format, leading to reduced performance and extra complexity. To this end, this paper proposes a five-layer end-to-end network structure without any pre-processing or post-processing, which instead imposes a reshape or deconvolution layer at the end of the network to retain the distribution of ground objects within the image. Meanwhile, we formulate a joint loss function by combining the output and the high-dimensional features of a non-linear mapping network to precisely learn the desirable mapping relationship between low-resolution images and their high-resolution counterparts. Also, we use the satellite video data itself as the training set, which favors consistency between training and testing images and promotes the method's practicality. Experimental results on "Jilin-1" satellite video imagery show that this method demonstrates superior performance, in terms of both visual effects and quantitative metrics, over competing methods.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)

15 pages, 3101 KiB  
Article
Defocus Blur Detection and Estimation from Imaging Sensors
by Jinyang Li, Zhijing Liu and Yong Yao
Sensors 2018, 18(4), 1135; https://doi.org/10.3390/s18041135 - 08 Apr 2018
Cited by 5 | Viewed by 6297
Abstract
Sparse representation has been proven to be a very effective technique for various image restoration applications. In this paper, an improved sparse-representation-based method is proposed to detect and estimate the defocus blur of imaging sensors. Considering that patterns usually vary remarkably across different images, or across different patches of a single image, sparse representation over a single over-complete dictionary is unstable and time-consuming. We propose an adaptive domain selection scheme that pre-learns a set of compact dictionaries and adaptively selects the optimal dictionary for each image patch. Then, using nonlocal structure similarity, the proposed method learns nonzero-mean coefficient distributions that are much closer to the real ones. More accurate sparse coefficients can thus be obtained, further improving the quality of the results. Experimental results validate that the proposed method outperforms existing defocus blur estimation approaches, both qualitatively and quantitatively.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)

18 pages, 3745 KiB  
Article
Automatic Fabric Defect Detection with a Multi-Scale Convolutional Denoising Autoencoder Network Model
by Shuang Mei, Yudan Wang and Guojun Wen
Sensors 2018, 18(4), 1064; https://doi.org/10.3390/s18041064 - 02 Apr 2018
Cited by 222 | Viewed by 18791
Abstract
Fabric defect detection is a necessary and essential step of quality control in the textile manufacturing industry. Traditional fabric inspections are usually performed by manual visual methods, which are low in efficiency and poor in precision for long-term industrial applications. In this paper, we propose an unsupervised learning-based automated approach to detect and localize fabric defects without any manual intervention. The approach reconstructs image patches with a convolutional denoising autoencoder network at multiple Gaussian pyramid levels and synthesizes the detection results from the corresponding resolution channels. The reconstruction residual of each image patch is used as the indicator for direct pixel-wise prediction. By segmenting and synthesizing the reconstruction residual map at each resolution level, the final inspection result can be generated. This newly developed method has several prominent advantages for fabric defect detection. First, it can be trained with only a small amount of defect-free samples. This is especially important for situations in which collecting large amounts of defective samples is difficult and impracticable. Second, owing to the multi-modal integration strategy, it is relatively more robust and accurate compared to general inspection methods (the results at each resolution level can be viewed as a modality). Third, according to our results, it can handle multiple types of textile fabrics, from simple to more complex. Experimental results demonstrate that the proposed model is robust and yields good overall performance, with high precision and acceptable recall rates.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)
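The residual-based decision can be sketched as follows, assuming trained per-level autoencoders are available as callables and that each pyramid level exactly halves the previous one; the thresholding rule and the OR-style synthesis are simplifications of the paper's segmentation step, not its exact procedure.

    import numpy as np

    def defect_map(pyramid, autoencoders, gamma=3.0):
        """Per-level reconstruction residuals -> thresholded masks ->
        synthesis across resolution channels.

        pyramid:      list of 2-D float images, fine to coarse (each level
                      exactly half the previous size, by assumption).
        autoencoders: list of callables img -> reconstruction, per level.
        """
        masks = []
        for img, ae in zip(pyramid, autoencoders):
            res = np.abs(img - ae(img))             # reconstruction residual
            masks.append(res > res.mean() + gamma * res.std())
        out = masks[0]
        for m in masks[1:]:                         # upsample coarse masks
            reps = (out.shape[0] // m.shape[0], out.shape[1] // m.shape[1])
            up = np.kron(m.astype(np.uint8), np.ones(reps, dtype=np.uint8))
            out = out | up.astype(bool)
        return out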

17 pages, 62953 KiB  
Article
Assessment of Spatiotemporal Fusion Algorithms for Planet and Worldview Images
by Chiman Kwan, Xiaolin Zhu, Feng Gao, Bryan Chou, Daniel Perez, Jiang Li, Yuzhong Shen, Krzysztof Koperski and Giovanni Marchisio
Sensors 2018, 18(4), 1051; https://doi.org/10.3390/s18041051 - 31 Mar 2018
Cited by 37 | Viewed by 4919
Abstract
Although Worldview-2 (WV) images (non-pansharpened) have 2-m resolution, the revisit time for the same area may be seven days or more. In contrast, Planet images are collected by small satellites that can cover the whole Earth almost daily; however, their resolution is 3.125 m. It would be ideal to fuse these two satellites' images to generate high spatial resolution (2 m) and high temporal resolution (1 or 2 days) images for applications, such as damage assessment and border monitoring, that require quick decisions. In this paper, we evaluate three approaches to fusing Worldview (WV) and Planet images: the Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM), Flexible Spatiotemporal Data Fusion (FSDAF), and Hybrid Color Mapping (HCM), all of which have been applied to the fusion of MODIS and Landsat images in recent years. Experimental results using actual Planet and Worldview images demonstrate that the three approaches have comparable performance and can all generate high-quality prediction images.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)
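Of the three, HCM is the simplest to convey in code: fit a mapping from the low-resolution bands to the high-resolution bands on a date where both images exist, then apply it on the prediction date. The global affine least-squares form below is a plain reading of that idea, not the authors' exact formulation (HCM variants can be local or patch-wise).

    import numpy as np

    def hcm_predict(planet_t0, wv_t0, planet_t1):
        """Hybrid Color Mapping sketch.

        planet_t0, planet_t1: (H, W, Bp) Planet images upsampled to the
                              Worldview grid, at dates t0 and t1.
        wv_t0:                (H, W, Bw) Worldview image at t0.
        Returns a predicted (H, W, Bw) high-resolution image at t1.
        """
        H, W, Bp = planet_t0.shape
        ones = np.ones((H * W, 1))
        X = np.concatenate([planet_t0.reshape(-1, Bp), ones], axis=1)
        Y = wv_t0.reshape(H * W, -1)
        A, *_ = np.linalg.lstsq(X, Y, rcond=None)   # affine band mapping
        X1 = np.concatenate([planet_t1.reshape(-1, Bp), ones], axis=1)
        return (X1 @ A).reshape(H, W, -1)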

18 pages, 23024 KiB  
Article
Computationally Efficient Automatic Coast Mode Target Tracking Based on Occlusion Awareness in Infrared Images
by Sohyun Kim, Gwang-Il Jang, Sungho Kim and Junmo Kim
Sensors 2018, 18(4), 996; https://doi.org/10.3390/s18040996 - 27 Mar 2018
Cited by 5 | Viewed by 5291
Abstract
This paper proposes automatic coast mode tracking for centroid trackers in infrared images, to overcome target occlusion. The centroid tracking method, which uses only the brightness information of an image, is still widely used in infrared imaging tracking systems because it is difficult to extract meaningful features from infrared images. However, centroid trackers are likely to lose track because they are highly vulnerable to being screened by clutter or background. Coast mode, one of the tracking modes, maintains the servo slew rate at the tracking rate from right before the loss of track. The proposed automatic coast mode tracking method decides when to enter coast mode by predicting target occlusion, and tries to re-lock the target and resume tracking after the blind time. The algorithm comprises three steps. The first step predicts occlusion by checking both objects that have target-like brightness and objects that may screen the target despite having different brightness. The second step generates inertial tracking commands for the servo. The last step re-locks the target based on a histogram-ratio target model. The effectiveness of the proposed algorithm is demonstrated by experimental results based on computer simulation with various test imagery sequences, compared to published tracking algorithms. The proposed algorithm was also tested in a real environment with a naval electro-optical tracking system (EOTS) and an airborne EO/IR system.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)
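The mode logic can be sketched as a small state machine. The occlusion prediction and the histogram-ratio re-lock test are abstracted into boolean/measurement inputs here, and the blind-time limit is an assumed parameter; this illustrates the control flow, not the paper's implementation.

    def track_step(state, measurement, occluded, max_coast=30):
        """One step of a coast-mode tracking state machine (illustrative).

        state: dict with 'pos', 'vel' (last tracking rate), 'mode',
               and 'coast_n' (frames spent coasting).
        measurement: measured target position, or None while screened.
        occluded: True if occlusion is predicted for this frame.
        """
        if state['mode'] == 'track':
            if occluded or measurement is None:
                state['mode'], state['coast_n'] = 'coast', 0
            else:
                state['vel'] = measurement - state['pos']  # tracking rate
                state['pos'] = measurement
        if state['mode'] == 'coast':
            state['pos'] = state['pos'] + state['vel']     # hold slew rate
            state['coast_n'] += 1
            if measurement is not None:                    # re-lock succeeds
                state['mode'] = 'track'
                state['pos'] = measurement
            elif state['coast_n'] > max_coast:             # blind time over
                state['mode'] = 'lost'
        return state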

19 pages, 13314 KiB  
Article
Convolutional Neural Network-Based Shadow Detection in Images Using Visible Light Camera Sensor
by Dong Seop Kim, Muhammad Arsalan and Kang Ryoung Park
Sensors 2018, 18(4), 960; https://doi.org/10.3390/s18040960 - 23 Mar 2018
Cited by 24 | Viewed by 5208
Abstract
Recent developments in intelligent surveillance camera systems have enabled more research on the detection, tracking, and recognition of humans. Such systems typically use visible light cameras and images, in which shadows make it difficult to detect and recognize the exact human area. Near-infrared (NIR) light cameras and thermal cameras can be used to mitigate this problem; however, such instruments require a separate NIR illuminator or are prohibitively expensive. Existing research on shadow detection in images captured by visible light cameras has utilized object and shadow color features for detection. Unfortunately, various environmental factors, such as illumination change and the brightness of the background, make detection a difficult task. To overcome this problem, we propose a convolutional neural network-based shadow detection method. Experimental results with a database built from various outdoor surveillance camera environments, and with the context-aware vision using image-based active recognition (CAVIAR) open database, show that our method outperforms previous works.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)

15 pages, 4304 KiB  
Article
Automatic Modulation Classification Based on Deep Learning for Unmanned Aerial Vehicles
by Duona Zhang, Wenrui Ding, Baochang Zhang, Chunyu Xie, Hongguang Li, Chunhui Liu and Jungong Han
Sensors 2018, 18(3), 924; https://doi.org/10.3390/s18030924 - 20 Mar 2018
Cited by 91 | Viewed by 10397
Abstract
Deep learning has recently attracted much attention due to its excellent performance in processing audio, image, and video data. However, few studies have been devoted to the field of automatic modulation classification (AMC), one of the most well-known research topics in communication signal recognition, which remains challenging for traditional methods due to complex disturbances from other sources. This paper proposes a heterogeneous deep model fusion (HDMF) method to solve the problem in a unified framework. The contributions include the following: (1) a convolutional neural network (CNN) and a long short-term memory (LSTM) network are combined in two different ways, without prior knowledge involved; (2) a large database, including eleven types of single-carrier modulation signals with various noises and a fading channel, is collected at various signal-to-noise ratios (SNRs) based on a real geographical environment; and (3) experimental results demonstrate that HDMF copes very well with the AMC problem and achieves much better performance than the independent networks.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)

16 pages, 7598 KiB  
Article
Urban Area Detection in Very High Resolution Remote Sensing Images Using Deep Convolutional Neural Networks
by Tian Tian, Chang Li, Jinkang Xu and Jiayi Ma
Sensors 2018, 18(3), 904; https://doi.org/10.3390/s18030904 - 18 Mar 2018
Cited by 32 | Viewed by 4922
Abstract
Detecting urban areas in very high resolution (VHR) remote sensing images plays an important role in the field of Earth observation. Recently developed deep convolutional neural networks (DCNNs), which can extract rich features from training data automatically, have achieved outstanding performance on many image classification databases. Motivated by this fact, we propose a new urban area detection method based on DCNNs. The proposed method mainly includes three steps: (i) a visual dictionary is obtained based on the deep features extracted by pre-trained DCNNs; (ii) urban words are learned from labeled images; and (iii) urban regions are detected in a new image based on the nearest-dictionary-word criterion. Qualitative and quantitative experiments on different datasets demonstrate that the proposed method obtains a remarkable overall accuracy (OA) and kappa coefficient, and also strikes a good balance between the true positive rate (TPR) and false positive rate (FPR).
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)
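The three steps map naturally onto a clustering pipeline. A compact sketch using scikit-learn's KMeans is shown below; the clustering algorithm, the majority-vote word labeling, and all parameters are assumptions about how such a dictionary could be built, not the paper's exact procedure.

    import numpy as np
    from sklearn.cluster import KMeans

    def build_dictionary(deep_feats, n_words=100):
        """Step (i): cluster pre-trained DCNN features into visual words."""
        return KMeans(n_clusters=n_words, n_init=10).fit(deep_feats)

    def learn_urban_words(kmeans, deep_feats, labels):
        """Step (ii): mark a word as urban if most labeled features
        assigned to it come from urban patches (labels in {0, 1})."""
        assign = kmeans.predict(deep_feats)
        return np.array([labels[assign == w].mean() > 0.5
                         if np.any(assign == w) else False
                         for w in range(kmeans.n_clusters)])

    def detect_urban(kmeans, urban_word, feats):
        """Step (iii): a patch is urban iff its nearest word is urban."""
        return urban_word[kmeans.predict(feats)]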

19 pages, 10224 KiB  
Article
A New 3D Object Pose Detection Method Using LIDAR Shape Set
by Jung-Un Kim and Hang-Bong Kang
Sensors 2018, 18(3), 882; https://doi.org/10.3390/s18030882 - 16 Mar 2018
Cited by 11 | Viewed by 6037
Abstract
In object detection systems for autonomous driving, LIDAR sensors provide very useful information. However, problems occur because the object representation is greatly distorted by changes in distance. To solve this problem, we propose a LIDAR shape set that reconstructs the shape surrounding an object more clearly by using the LIDAR points projected on the object. The LIDAR shape set restores object shape edges from a bird's-eye view by filtering LIDAR points projected onto a 2D pixel-based front view. In this study, we use this shape set for two purposes: the first is to supplement the shape set with a LIDAR feature map, and the second is to divide the entire shape set, according to the gradient of depth and density, to create 2D and 3D bounding box proposals for each object. We present a multimodal fusion framework that classifies objects and restores the 3D pose of each object using the enhanced feature maps and shape-based proposals. The network structure consists of a VGG-based object classifier that receives multiple inputs and a LIDAR-based Region Proposal Network (RPN) that identifies object poses. It works in a very intuitive and efficient manner and can be extended to classes other than vehicles. Our method outperforms the latest studies on the KITTI dataset in both object classification accuracy (Average Precision, AP) and 3D pose restoration accuracy (3D bounding box recall rate).
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)

21 pages, 12178 KiB  
Article
Deep Spatial-Temporal Joint Feature Representation for Video Object Detection
by Baojun Zhao, Boya Zhao, Linbo Tang, Yuqi Han and Wenzheng Wang
Sensors 2018, 18(3), 774; https://doi.org/10.3390/s18030774 - 04 Mar 2018
Cited by 19 | Viewed by 5408
Abstract
With the development of deep neural networks, many object detection frameworks have shown great success in the fields of smart surveillance, self-driving cars, and facial recognition. However, the data sources are usually videos, while the object detection frameworks are mostly established on still images and use only spatial information, which means that feature consistency cannot be ensured because the training procedure loses temporal information. To address these problems, we propose a single, fully-convolutional neural network-based object detection framework that involves temporal information by using Siamese networks. In the training procedure, the prediction network first combines a multiscale feature map to handle objects of various sizes. Second, we introduce a correlation loss by using a Siamese network, which provides neighboring frame features. This correlation loss represents object co-occurrences across time and aids consistent feature generation. Since the correlation loss requires track ID and detection label information, our video object detection network is evaluated on the large-scale ImageNet VID dataset, where it achieves a 69.5% mean average precision (mAP).
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)
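One plausible reading of such a correlation loss, sketched in PyTorch: pooled features of the same object (matched by track ID) in neighboring frames should be maximally similar. The pooling, the cosine form, and the box interface here are assumptions, not the paper's exact loss.

    import torch
    import torch.nn.functional as F

    def correlation_loss(feat_t, feat_tp1, boxes_t, boxes_tp1):
        """Penalize temporal feature drift for matched objects.

        feat_t, feat_tp1: (C, H, W) feature maps of frames t and t+1
                          from the Siamese branches.
        boxes_*: lists of (y0, y1, x0, x1) boxes of matched track IDs.
        """
        loss = torch.zeros(())
        for (a0, a1, b0, b1), (c0, c1, d0, d1) in zip(boxes_t, boxes_tp1):
            za = feat_t[:, a0:a1, b0:b1].mean(dim=(1, 2))    # pooled descriptor
            zb = feat_tp1[:, c0:c1, d0:d1].mean(dim=(1, 2))
            loss = loss + (1 - F.cosine_similarity(za, zb, dim=0))
        return loss / max(len(boxes_t), 1)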

15 pages, 4843 KiB  
Article
New Keypoint Matching Method Using Local Convolutional Features for Power Transmission Line Icing Monitoring
by Qiangliang Guo, Jin Xiao and Xiaoguang Hu
Sensors 2018, 18(3), 698; https://doi.org/10.3390/s18030698 - 26 Feb 2018
Cited by 19 | Viewed by 4023
Abstract
Power transmission line icing (PTLI) problems, which cause tremendous damage to power grids, have drawn much attention. Existing three-dimensional measurement methods based on binocular stereo vision were recently introduced to measure ice thickness in PTLI, but fail to meet the requirements of practical applications due to inefficient keypoint matching in complex PTLI scenes. In this paper, a new keypoint matching method is proposed based on local multi-layer convolutional neural network (CNN) features, termed Local Convolutional Features (LCFs). LCFs extract more discriminative features than conventional CNNs. In particular, LCFs exploit a multi-layer feature fusion scheme to boost matching performance. Together with a location constraint method, the correspondence of neighboring keypoints is further refined. Our approach achieves 1.5%, 5.3%, 13.1%, and 27.3% improvements in average matching precision compared with SIFT, SURF, ORB and MatchNet on the public Middlebury dataset, and the measurement accuracy of ice thickness reaches 90.9% compared with manual measurement on the collected PTLI dataset.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)

13 pages, 4534 KiB  
Article
Application of Ground-Penetrating Radar for Detecting Internal Anomalies in Tree Trunks with Irregular Contours
by Weilin Li, Jian Wen, Zhongliang Xiao and Shengxia Xu
Sensors 2018, 18(2), 649; https://doi.org/10.3390/s18020649 - 22 Feb 2018
Cited by 23 | Viewed by 6112
Abstract
To assess the health condition of tree trunks, it is necessary to estimate the layers and anomalies of their internal structure. The main objective of this paper is to investigate the internal part of tree trunks while taking their irregular contours into account. In this respect, we used ground-penetrating radar (GPR) for the non-invasive detection of defects and deterioration in living tree trunks. The Hilbert transform algorithm and the reflection amplitudes were used to estimate the relative dielectric constant. A point cloud data technique was also applied to extract the irregular contours of the trunks. The feasibility and accuracy of the methods were examined through numerical simulations as well as laboratory and field measurements. The results demonstrate that the applied methodology allows for accurate characterization of internal inhomogeneity. Furthermore, the point cloud technique resolved the trunk well by providing high-precision coordinate information, and cross-section tomography provided images with high resolution and accuracy. These integrated techniques thus prove promising for observing tree trunks and other cylindrical objects, and the applied approaches offer great promise for the future 3D reconstruction of tomographic images with radar waves.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)
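The dielectric-constant step follows from the two-way travel time between envelope peaks. A toy one-trace sketch with SciPy's Hilbert transform is given below; the crude two-sample peak picking and the known-thickness assumption are illustrative simplifications, not the paper's processing chain.

    import numpy as np
    from scipy.signal import hilbert

    C = 0.3  # speed of light in vacuum, m/ns

    def dielectric_from_trace(trace, dt_ns, thickness_m):
        """Estimate the relative dielectric constant from one GPR trace:
        the Hilbert envelope localizes the reflections, and the two-way
        travel time t over thickness d gives eps_r = (C * t / (2 d))^2."""
        env = np.abs(hilbert(trace))             # instantaneous amplitude
        peaks = np.argsort(env)[-2:]             # two strongest samples
        # Crude: assumes the two strongest samples lie on the front and
        # back reflections rather than on the same peak.
        t = abs(int(peaks[1]) - int(peaks[0])) * dt_ns
        v = 2.0 * thickness_m / t                # propagation velocity, m/ns
        return (C / v) ** 2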

34 pages, 7966 KiB  
Article
Deep Learning-Based Gaze Detection System for Automobile Drivers Using a NIR Camera Sensor
by Rizwan Ali Naqvi, Muhammad Arsalan, Ganbayar Batchuluun, Hyo Sik Yoon and Kang Ryoung Park
Sensors 2018, 18(2), 456; https://doi.org/10.3390/s18020456 - 03 Feb 2018
Cited by 120 | Viewed by 13092
Abstract
A paradigm shift is required to prevent the increasing number of automobile accident deaths that are largely due to inattentive driver behavior. Knowledge of the gaze region provides valuable information regarding a driver's point of attention, and accurate and inexpensive gaze classification systems in cars can improve safe driving. However, monitoring real-time driving behaviors and conditions presents some challenges: dizziness due to long drives, extreme lighting variations, glasses reflections, and occlusions. Past studies on gaze detection in cars have been based chiefly on head movements, but the margin of error in gaze detection increases when drivers gaze at objects by moving their eyes without moving their heads. To solve this problem, pupil center corneal reflection (PCCR)-based methods have been considered. However, the error in accurately detecting the pupil center and corneal reflection center increases in a car environment due to various lighting changes, reflections on glasses surfaces, and the motion and optical blurring of the captured eye image. In addition, existing PCCR-based methods require initial user calibration, which is difficult to perform in a car environment. To address these issues, we propose a deep learning-based gaze detection method using a near-infrared (NIR) camera sensor that considers driver head and eye movement and does not require any initial user calibration. The proposed system is evaluated on our self-constructed database as well as on the open Columbia gaze dataset (CAVE-DB). The proposed method demonstrates greater accuracy than previous gaze classification methods.
(This article belongs to the Special Issue Sensors Signal Processing and Visual Computing)