Article

Lane Endpoint Detection and Position Accuracy Evaluation for Sensor Fusion-Based Vehicle Localization on Highways

1 Department of Electronic Engineering, Korea National University of Transportation, 50 Daehak-ro, Chungju-si, Chungbuk 27469, Korea
2 School of Intelligent Mechatronics Engineering, Sejong University, 209 Neungdong-ro, Gwangjin-gu, Seoul 05006, Korea
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Sensors 2018, 18(12), 4389; https://doi.org/10.3390/s18124389
Submission received: 14 November 2018 / Revised: 8 December 2018 / Accepted: 10 December 2018 / Published: 11 December 2018
(This article belongs to the Special Issue Sensors Applications in Intelligent Vehicle)

Abstract
Landmark-based vehicle localization is a key component of both autonomous driving and advanced driver assistance systems (ADAS). Landmarks previously used on highways, such as lane markings, lack information on the longitudinal position. To address this problem, lane endpoints can be used as landmarks. This paper proposes two essential components when using lane endpoints as landmarks: lane endpoint detection and its accuracy evaluation. First, it proposes a method to efficiently detect lane endpoints using a monocular forward-looking camera, which is the most widely installed perception sensor. Lane endpoints are detected with a small amount of computation based on the following steps: lane detection, lane endpoint candidate generation, and lane endpoint candidate verification. Second, it proposes a method to reliably measure the position accuracy of the lane endpoints detected from images taken while the camera is moving at high speed. A camera is installed along with a mobile mapping system (MMS) on a vehicle, and the position accuracy of the lane endpoints detected by the camera is measured by comparing their positions with ground truths obtained by the MMS. In the experiment, the proposed methods were evaluated and compared with previous methods based on a dataset acquired while driving on 80 km of highway in both daytime and nighttime.

1. Introduction

Vehicle localization is one of the key components of both autonomous driving and advanced driver assistance systems (ADAS). Accurate localization results can improve the performance of other key components such as perception, planning, and control. The most widely used vehicle localization method is based on global navigation satellite systems (GNSS). This method provides global positions and its error does not accumulate, but it produces inaccurate results when the GNSS signal is reflected or blocked. To alleviate this problem, GNSS has often been fused with dead reckoning (DR). This combination is robust to the condition of the GNSS signal and provides accurate results over short periods, but its error can accumulate over time. Recently, to overcome these drawbacks, localization methods that utilize a perception sensor and a digital map have been widely researched [1]. These methods localize the ego-vehicle by matching the landmarks detected by the perception sensor with the landmarks stored in the digital map.
Various types of landmarks have been used, and road markings are among the most widely used landmarks for vehicle localization. In terms of the perception sensor, road markings are relatively easy to detect because their shapes are predetermined by regulations and their color and reflectivity differ significantly from those of the road surface. In terms of the digital map, road markings can be stored at low capacity because their shapes are relatively simple compared with other road structures. Among road markings, lane markings are most widely used for vehicle localization. Lane markings provide abundant information on the lateral position, but lack information on the longitudinal position. To alleviate this drawback of lane markings, methods that utilize stop lines, crosswalks, arrows, and letters have been suggested. Those road markings are useful in urban situations because they frequently appear on urban roads. However, in highway situations, stop lines and crosswalks rarely exist, and arrows and letters seldom appear. Figure 1 shows the results of analyzing sequential images taken from approximately 40 km of highway from the viewpoint of road markings. First of all, this highway does not include any stop lines or crosswalks. In general, there are no stop lines and crosswalks on Korean highways. Red points in Figure 1 indicate the locations where arrows and letters were captured by the front camera. The arrows and letters were observed in only approximately 3% of the entire set of images. Furthermore, the arrows and letters are concentrated in specific areas such as intersections. The stretches of road where no arrows or letters are observed reach up to 7.8 km. This analysis clearly shows that stop lines, crosswalks, arrows, and letters are of limited use for improving longitudinal positioning accuracy in highway situations.
To overcome the limitations of the abovementioned road markings, lane endpoints can be used as landmarks to improve longitudinal positioning accuracy. A lane endpoint indicates the location where a lane marking starts or ends, so this paper classifies it into a lane starting point and a lane ending point. Figure 2a shows lane endpoints captured on a highway. Red and blue crosses indicate the starting and ending points, respectively. As a landmark for vehicle localization in highway situations, the lane endpoint has the following advantages: (1) Since it represents a specific point, it provides information on both lateral and longitudinal positions; (2) Since it has a simple and distinct shape, it can effectively be detected using a conventional automotive front camera; (3) Since it can be captured at a close distance, its position accuracy is guaranteed; (4) It frequently appears in highway situations. In Figure 1, blue points indicate the locations where the lane endpoints were captured by the front camera while driving on approximately 40 km of highway. The lane endpoints were observed in approximately 88% of the entire set of images, and the stretches of road where no lane endpoints are observed reach at most 1.7 km, in a tunnel. According to Korean regulations, the length of a single dashed lane marking is 8.0 m, the longitudinal gap between two consecutive dashed lane markings is 12.0 m, and the width of the driving lane is at least 3.5 m, as shown in Figure 2b. This means that lane endpoints of the same type (starting or ending point) appear every 20.0 m in the longitudinal direction and every 3.5 m in the lateral direction. Because there is enough space between adjacent lane endpoints, there is little confusion when matching them with those stored in the digital map. The effect of using a specific location of a road marking for vehicle localization has been verified in [2], where arrow endpoints are used instead of lane endpoints. Note that this paper suggests that lane endpoints be used as one of several landmarks, rather than suggesting the use of lane endpoints alone.
This paper proposes two methods concerning the use of the lane endpoints as landmarks for vehicle localization. First, it proposes a method to efficiently detect the lane endpoints using a vehicle-mounted monocular forward-looking (front) camera, which is the most widely installed perception sensor. Second, it proposes a method to reliably measure the position accuracy of the lane endpoints detected from images taken while the ego-vehicle is moving at high speed.
Since the front camera module is responsible for detecting various objects such as lane markings, vehicles, pedestrians, and traffic signs, an algorithm added to this module should require little additional computational cost. Thus, this paper proposes a method that efficiently detects lane endpoints by combining simple algorithms. The proposed method first extracts lane pixels using a top-hat filter, and detects the left and right lanes based on RANSAC. Once the lanes are detected, profiles of the top-hat filter response are generated along the detected lanes. Lane endpoint candidates are generated by finding local minima and maxima in the differentiated profiles. The lane endpoint candidates are verified by a two-class classifier after converting them into bird’s-eye view images. Finally, the positions of the lane endpoints relative to the camera are calculated based on the camera’s intrinsic and extrinsic parameters.
In general, a landmark-based localization system estimates the position of the ego-vehicle by comparing the positions of the landmarks detected from the ego-vehicle with those stored in the digital map. This means the position accuracy of the detected landmarks is directly related to the performance of the vehicle localization system. In addition, it is necessary to know the distribution of the position error of the detected landmarks in order to effectively apply extended Kalman or particle filtering methods. However, it is quite difficult to precisely measure the position accuracy of landmarks detected from images taken while the ego-vehicle is moving on a highway at high speed. According to our literature review, there has been no previous work attempting it. To reliably measure the position accuracy of the landmarks, this paper proposes a method that uses a mobile mapping system (MMS), selected after examining various approaches. Since the MMS consists of high-precision positioning equipment and high-performance LIDARs, it produces highly accurate positions and dense 3D points with reflectivities. If the front camera is attached to the MMS, 3D locations of the lane endpoints detected by the front camera can be accurately obtained while the ego-vehicle is moving at high speed by using the precise positions and dense 3D points produced by the MMS. Thus, this paper reliably measures the position accuracy of the lane endpoints by comparing their positions obtained from the camera with the corresponding positions obtained from the MMS.
In the experiment, the proposed method was evaluated based on a dataset acquired while driving on approximately 80 km of highway. Half of the dataset was acquired during daytime and the other half at night. The proposed lane endpoint detection method shows 96.1% recall and 99.7% precision in daytime, and 94.7% recall and 100% precision at night. As a result of the MMS-based position accuracy evaluation, the detected lane endpoints show 7.8 cm and 21.6 cm average position errors in lateral and longitudinal directions during daytime, and 8.2 cm and 48.2 cm average position errors in lateral and longitudinal directions at night. In terms of execution time, this method requires only 4.35 ms per image so that it can handle 230 frames per second.
The rest of this paper is organized as follows: Section 2 explains related research. Section 3 and Section 4 describe the proposed lane endpoint detection method and the MMS-based position accuracy evaluation method, respectively. Section 5 presents experimental results and analyses. Finally, this paper is concluded with future works in Section 6.

2. Related Research

The perception sensor and digital map-based vehicle localization methods can be categorized based on the information acquired by the perception sensor and stored in the digital map. According to this criterion, previous methods can be categorized into range data-based, feature point-based, and road marking-based approaches. Since the proposed system belongs to the road marking-based approach, the other two approaches are only briefly introduced in this section. Figure 3 shows the taxonomy of the vehicle localization methods.
The range data-based approach acquires range data using perception sensors and matches them with range information stored in the digital map. Since this approach mostly utilizes active sensors such as LIDARs and Radars, it is robust against illumination conditions and textures of surrounding objects. However, its performance can be degraded in cases where few fixed objects and many moving obstacles are present on a broad road. The methods in this approach recognize specific objects (e.g., curbs [3], building facades [4], pole-like objects [5], or guardrails [6]), or directly utilize 3D point clouds [7] to match them with the digital map.
The feature point-based approach extracts feature points mostly from images and matches them with feature points stored in the digital map by comparing their descriptors. Because this approach utilizes a large number of feature points on various surrounding objects, it can achieve high localization accuracy. However, its performance can be affected by appearance and position changes of obstacles. In addition, it requires a large storage volume to store numerous feature points along with their high-dimensional descriptors in the digital map. The methods in this approach utilize various types of interest point detectors and feature descriptors: maximally stable extremal regions (MSER) with moment invariants [8], difference of Gaussians (DoG) with scale-invariant feature transform (SIFT), fast Hessian detector with speeded up robust features (SURF) [9], and DIRD (Dird is an illumination robust descriptor) [10].
The road marking-based approach extracts markings on road surfaces and matches them with road markings stored in the digital map. Since road markings are visually distinctive and under government regulations, they can be reliably detected compared to other objects. However, localization performance can be degraded when the road markings are worn or covered with snow. The road marking-based approach can further be categorized into signal-level, feature-level, and symbol-level approaches according to the level of information being used.
The signal-level approach uses raw data of road surfaces obtained from the perception sensor. Mattern et al. [11] matches intensities of image pixels with a virtual road surface image generated from the digital map. Levinson and Thrun [12] compares infrared reflectivities of a LIDAR with their means and standard deviations stored in the digital map. The feature-level approach utilizes features of the road marking extracted from raw data of the perception sensor. Hata and Wolf [13], Kim et al. [14], and Suganuma and Uozumi [15] find road marking features by comparing infrared reflectivities acquired by LIDARs with neighboring reflectivities or threshold values. Schreiber et al. [16] extracts road marking features by applying the oriented matched filter to free space obtained by a stereo camera. Deusch et al. [17] uses the maximally stable extremal regions to find road marking features from images of a front camera. Jo et al. [18] utilizes the top-hat filter to extract road marking features from bird’s-eye view images. The signal-level and feature-level approaches require less effort to handle the outputs of the perception sensors. However, since the raw data and road marking features have a large amount of information, these approaches require a high computational cost for the matching procedure and a large storage volume for the digital map.
The symbol-level approach recognizes a variety of types of road markings and matches them with those stored in the digital map. Lane markings are the most popularly used symbols for vehicle localization. Nedevschi et al. [19], Jo et al. [20], Lu et al. [21], Gruyer et al. [22], Tao et al. [23], Shunsuke et al. [24], and Suhr et al. [2] utilize a variety of types of cameras to detect lane markings for vehicle localization purposes. In particular, Nedevschi et al. [19] recognize not only the positions of lanes but also their types (e.g., double, single, interrupted, and merge) as additional information. Even though the lane markings provide abundant information on the lateral position, they lack information on the longitudinal position. To complement this, stop lines, crosswalks, arrows, and letters have been utilized as well. Nedevschi et al. [19] recognize stop lines using a stereo camera and Jo et al. [20] recognize crosswalks using a front camera in order to enhance longitudinal localization accuracy. Nedevschi et al. [19] detect five types of arrows using a stereo camera to recognize the driving lane of the ego-vehicle. Wu and Ranganathan [25] recognize ten types of arrows and letters using a stereo camera and extract corner features from them. Suhr et al. [2] recognize nine types of arrows and diamond markings along with their starting points using a single front camera. The symbol-level approach requires a low computational cost for the matching procedure and a small storage volume for the digital map because it simply matches and stores the locations and types of symbols. However, it requires additional computing resources to reliably recognize specific symbols. Thus, the symbol recognition procedure should be designed to be cost-effective.
The lane endpoints are expected to be good landmarks for vehicle localization when applying the symbol-level approach to highway situations. This paper proposes two essential components for its application: a cost-effective method to detect the lane endpoints using a conventional monocular front camera and a reliable method to measure the position accuracy of the lane endpoints based on the MMS. According to our literature review, there has been no previous work that explicitly detects the lane endpoint for vehicle localization purposes and measures its position accuracy.

3. Lane Endpoint Detection

3.1. Lane Detection

For lane departure warning (LDW) and lane keeping assist (LKA) systems, distant lanes and their curvatures should be obtained. However, in this paper, lane detection results are used only to restrict the search range of lane endpoints. For this purpose, it is enough to detect the lanes close to the ego-vehicle. Since a highway has a large turning radius, the lanes close to the ego-vehicle can be approximated as straight lines. Thus, the lanes within 20 m of the front camera are detected by finding a pair of straight lines. To detect the left and right lanes, this paper first designates regions of interest (ROIs) for both lanes by considering the camera’s intrinsic and extrinsic parameters and the regulated lane width of the highway. Two pairs of blue lines in Figure 4a indicate the ROIs for the left and right lanes. Lane candidate pixels are extracted by applying a horizontal top-hat filter to the ROIs shown in Figure 4a [26]. To reduce the computational cost, the top-hat filter is applied not to bird’s-eye view images but to the original images. The horizontal top-hat filter, h, can be described as
h = \left[ -\mathbf{1}_{w/2},\ \mathbf{1}_{w},\ -\mathbf{1}_{w/2} \right]
where 1n is an n-dimensional row vector whose elements are all ones. The top-hat filter can successfully detect lane candidate pixels if w fits the lane marking width. The lane marking width changes according to the vertical coordinate of the image because of the perspective projection. Thus, this paper pre-calculates the lane marking widths at different vertical coordinates based on the camera’s intrinsic and extrinsic parameters and the width of the lane marking in the regulation. In this pre-calculation, it is assumed that the roll and yaw angles of the camera are close to zero and only markings of the ego-lane are considered. The response of the top-hat filter at location (u,v), r(u,v) can be cost-efficiently calculated using only four operations regardless of w when a horizontal integral image, Ih is used as:
r(u,v) = 2\left[ I_h(u + w/2,\ v) - I_h(u - w/2,\ v) \right] - \left[ I_h(u + w,\ v) - I_h(u - w,\ v) \right]
Figure 4b shows the result of applying the top-hat filter to the ROIs in Figure 4a. Once the top-hat filter response is obtained, local maxima are searched for in each row of the image to find the center pixels of the lane markings. Yellow points in Figure 4c indicate the locations of the extracted local maxima whose top-hat filter responses are larger than a predetermined value. Finally, a pair of lines is estimated by separately applying the RANSAC-based line detector to the local maxima found in the left and right ROIs [27]. Two red lines in Figure 4d show the detected left and right lanes. The lane markings are considered to be stably detected if the vanishing point calculated from the pair of lines is located at a similar position for more than N (N = 3 in this paper) consecutive images. Two consecutive vanishing points are considered to be located at a similar position if the distance between them is less than a predetermined value. Once the lane markings are stably detected, the ROIs for the left and right lanes are limited to narrow areas near the lane markings detected in the previous image. Even though this paper detects lanes using a combination of simple algorithms to reduce the computational cost, more sophisticated methods can also be used.
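As a concrete illustration of (1) and (2), the following Python sketch (hypothetical helper names, not the authors' implementation) builds the horizontal integral image once per row and then evaluates the top-hat response at any pixel with four lookups, independent of the marking width w; boundary checks are omitted for brevity:

```python
import numpy as np

def horizontal_integral_image(gray):
    """I_h(u, v): cumulative sum of pixel values along each image row."""
    return np.cumsum(gray.astype(np.float64), axis=1)

def tophat_response(I_h, u, v, w):
    """Top-hat filter response r(u, v) of Eq. (2) for an expected lane-marking
    width of w pixels at row v (w is precomputed per row from the camera
    parameters and the regulated marking width)."""
    center = I_h[v, u + w // 2] - I_h[v, u - w // 2]  # sum over the central band (~w pixels)
    full = I_h[v, u + w] - I_h[v, u - w]              # sum over the full band (~2w pixels)
    return 2.0 * center - full                        # bright center minus darker surroundings
```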

3.2. Lane Endpoint Candidate Generation

The lane endpoint candidates are generated along the detected lanes. To this end, lane profiles are first calculated by following the detected left and right lanes. These profiles are generated not from the brightness values of the image, but from the response of the top-hat filter already obtained in the lane detection procedure. Direct use of the brightness values is sensitive to lighting conditions because even objects of the same color in the real world can have different brightness values in images depending on local lighting conditions. However, since the response of the top-hat filter is calculated based on the brightness difference between surrounding areas, it is relatively robust against lighting conditions. The lane profile, p, can be represented as:
p = \left[\, p(N),\ p(N+1),\ \ldots,\ p(M-1),\ p(M) \,\right]
where p(y) indicates the value of the lane profile at the vertical image location, y. N and M are the starting and ending locations of the lane profile, and they correspond to the road surface of 5 m and 20 m in front of the front camera, respectively. p(y) is calculated from the top-hat filter response as:
p(y) = \max\left\{\, r(L_x(y)-1,\ y),\ r(L_x(y),\ y),\ r(L_x(y)+1,\ y) \,\right\}
where r(x,y) is the value of the top-hat filter response at the image location (x,y), and Lx(y) is the horizontal image location of the lane marking at y. p(y) is calculated not only using r(Lx(y),y) but also using its horizontally neighboring values, r(Lx(y) − 1,y) and r(Lx(y) + 1,y) in order to reduce the influence of the lane detection error on the generation of the lane profile. Figure 5a shows an example of the lane detection result, and black dotted lines in Figure 5b show the lane profiles extracted from the left and right lane markings detected in Figure 5a. It can be seen that the lane profiles have large values at lane marking locations. The extracted lane profiles are clipped to make their maximum and minimum values become predetermined values, and filtered by the median filter to alleviate the noise. Blue solid lines in Figure 5b show the lane profiles after the clipping and filtering. Once the lane profile is extracted and preprocessed, it is differentiated to find the lane endpoints. The differential value at y, d(y) is calculated by subtracting the averages of the lane profile values before and after 1 m from y as:
d(y) = \frac{1}{\,y - m_{y-1}\,} \sum_{i=m_{y-1}}^{y-1} p(i) \;-\; \frac{1}{\,m_{y+1} - y\,} \sum_{i=y+1}^{m_{y+1}} p(i)
where my−1 and my+1 are the vertical image locations before and after 1 m from y, respectively. Figure 5c shows the derivatives of the lane profiles. In the derivative, a peak indicates the location where the lane profile value abruptly increases, which is the starting point of the lane, and a valley is the location where the lane profile value abruptly decreases, which is the ending point of the lane. Thus, local maxima and minima whose absolute values exceed a predetermined value in the derivatives are extracted and designated as candidates of the starting and ending points. To prevent multiple starting or ending points from being generated at similar locations, the non-maxima (or non-minima) suppression is conducted.
Red and blue triangles in Figure 5c show the generated candidates of the starting and ending points, and red and blue crosses in Figure 5d show the corresponding locations in the image. Since the proposed method detects the lane endpoint based on the lane marking detection result, the lane endpoints cannot be detected when both the left and right lane markings are not stably detected.
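The candidate generation step can be summarized by the short sketch below. It assumes the lane profile p has already been clipped and median-filtered, and that the per-row offsets corresponding to 1 m ahead/behind (m_{y−1}, m_{y+1} in (5)) have been precomputed from the camera geometry; the helper names and the simple greedy suppression are illustrative only:

```python
import numpy as np

def lane_profile_derivative(p, m_before, m_after):
    """d(y) of Eq. (5): average profile value over the ~1 m span before y minus
    the average over the ~1 m span after y."""
    d = np.zeros(len(p), dtype=np.float64)
    for y in range(len(p)):
        lo, hi = m_before[y], m_after[y]          # row indices 1 m before/after y
        if lo < 0 or hi >= len(p) or y <= lo or hi <= y:
            continue                              # span falls outside the profile
        d[y] = p[lo:y].mean() - p[y + 1:hi + 1].mean()
    return d

def endpoint_candidates(d, threshold, nms_radius):
    """Peaks of d become starting-point candidates and valleys become ending-point
    candidates (Section 3.2), with greedy non-maxima (non-minima) suppression."""
    starts, ends = [], []
    taken = np.zeros(len(d), dtype=bool)
    for y in np.argsort(-np.abs(d)):              # strongest responses first
        if np.abs(d[y]) < threshold:
            break
        if taken[max(0, y - nms_radius):y + nms_radius + 1].any():
            continue
        (starts if d[y] > 0 else ends).append(int(y))
        taken[y] = True
    return starts, ends
```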

3.3. Lane Endpoint Candidate Verification

In real situations, the candidates of the lane endpoints (including starting and ending points) may include some false detections due to preceding vehicles, stains, shadows, etc. From the viewpoint of vehicle localization, false detections can degrade the localization accuracy more severely than missed detections. If landmarks are missed, the matching between the map and landmarks is omitted and the vehicle localization is performed with DR, which only slightly degrades the localization accuracy. However, if landmarks are falsely detected, the localization accuracy may be significantly degraded because the map is matched with the wrong landmarks [2]. Therefore, this paper verifies the generated lane endpoint candidates in order to eliminate falsely generated candidates.
To this end, an area of 1 m × 2 m around the lane endpoint candidate is first transformed into a bird’s-eye view image. In this transformation, the vertical direction of the bird’s-eye view image is set to coincide with the direction of the detected lane. Crosses in Figure 6a are four lane endpoint candidates generated by the procedure shown in Figure 5, and green rectangles are the 1 m × 2 m areas around the candidates. Figure 6b shows four bird’s-eye view images generated from the green rectangles in Figure 6a. Since this transformation removes the perspective distortion, it can increase the verification performance by reducing the intra-class variation of the correctly generated candidates. After the transformation, feature extraction and classification are conducted on the bird’s-eye view image to determine whether the candidate is correct. The histogram of oriented gradients (HOG) [28] and support vector machine (SVM) [29] are used for the feature extraction and classification, respectively. The HOG-SVM is a long-proven combination and has been widely used for various classification tasks such as vehicle, pedestrian, and traffic sign classification. It has the advantage that high classification performance can be achieved with a small amount of computation when applied to a problem whose degree of difficulty is not high. In this paper, the HOG-SVM was selected after comparing it with the convolutional neural network (CNN), a deep learning approach known to provide the highest classification performance in image classification [30]. Both methods provided similar classification performance, but the HOG-SVM is far ahead in terms of computational cost. This will be explained in detail in the experimental section.
The resolution of the bird’s-eye view image is 48 × 96 pixels, and the parameters of the HOG are as follows: number of orientation bins: 9; cell size: 8 × 8 pixels; block size: 2 × 2 cells; and block overlap: 50%. Thus, the HOG feature dimension is 1980. After extracting the HOG feature, the SVM is used to determine whether the candidate is correct. In order to reduce the computational cost, the linear SVM is selected. In general, the classification function of the SVM is defined as:
f(\mathbf{x}) = \sum_{i=1}^{N} \alpha_i\, y_i\, \Phi(\mathbf{x}_i)^{T} \Phi(\mathbf{x}) + b
where xi is the i-th support vector and N is the number of support vectors. αi is a coefficient of xi, and b is a bias. x is an input feature vector and yi indicates the class of xi, which is either −1 or +1. Φ is a kernel function. In case of the linear SVM, (6) can be simplified as:
f(\mathbf{x}) = \sum_{i=1}^{N} \alpha_i\, y_i\, \mathbf{x}_i^{T} \mathbf{x} + b = \mathbf{w}^{T}\mathbf{x} + b, \qquad \mathbf{w} = \sum_{i=1}^{N} \alpha_i\, y_i\, \mathbf{x}_i
Since αi, yi, and xi can all be determined at the learning stage, w can be calculated in advance. Therefore, the test stage can be simplified by adding the bias, b to the inner product of two vectors, w and x regardless of the number of support vectors. This simplification significantly reduces the computational cost. If a lane endpoint candidate is classified as a correct one based on the HOG-SVM classifier, it is confirmed as a final detection result. Details of the learning and evaluation of the HOG-SVM classifier will be described in the experimental section. Figure 6c shows the lane endpoint verification result. It can be seen that one falsely generated lane endpoint candidate is correctly removed during the lane endpoint verification procedure.
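A minimal sketch of the verification step under the stated HOG/SVM settings is given below. It assumes scikit-image's hog for the 1980-dimensional descriptor (48 × 96 patch, 9 bins, 8 × 8 cells, 2 × 2 blocks, 50% overlap) and that the support vectors, coefficients, and bias come from a separately trained linear model; the collapse of Eq. (7) into a single weight vector is done once after training:

```python
import numpy as np
from skimage.feature import hog  # assumed HOG implementation; any equivalent works

def collapse_linear_svm(alphas, labels, support_vectors):
    """Precompute w = sum_i alpha_i * y_i * x_i (Eq. (7)) so that classification
    reduces to a single inner product plus the bias."""
    return (alphas * labels) @ support_vectors        # shapes: (N,) x (N, 1980) -> (1980,)

def verify_endpoint(patch_96x48, w, b):
    """Classify one bird's-eye view patch (96 rows x 48 columns, grayscale) as a
    correct lane endpoint if w^T x + b > 0."""
    x = hog(patch_96x48, orientations=9, pixels_per_cell=(8, 8),
            cells_per_block=(2, 2))                   # 11 x 5 blocks x 36 values = 1980 dims
    return float(w @ x + b) > 0.0
```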

3.4. Calculation of Lane Endpoint Location

To effectively utilize the detected lane endpoints for vehicle localization, their locations with respect to the camera should be calculated. This paper calculates the location of the lane endpoint based on the camera’s intrinsic and extrinsic parameters, which are obtained during a precalibration stage using planar checkerboards. When calibrating the extrinsic parameters, the longitudinal direction of the vehicle is set to the Y-axis, the ground to the XY-plane, and the direction perpendicular to the ground to the Z-axis in the checkerboard coordinate system. If RB2C and tB2C are the rotation matrix and translation vector that represent the rigid transformation between the checkerboard and camera coordinate systems, an image location, (x,y), and a 2D location in the XY-plane of the checkerboard coordinate system, (XB,YB), are related by a homography, H:
H = K \left[\, \mathbf{r}_1 \;\; \mathbf{r}_2 \;\; \mathbf{t}_{B2C} \,\right], \qquad R_{B2C} = \left[\, \mathbf{r}_1 \;\; \mathbf{r}_2 \;\; \mathbf{r}_3 \,\right]
Since the origin of the 2D location of the lane endpoint should be the origin of the camera, H in (8) is manipulated as:
H' = K \left[\, \mathbf{r}_1 \;\; \mathbf{r}_2 \;\; \mathbf{t}'_{B2C} \,\right], \qquad \mathbf{t}'_{B2C} = \mathbf{r}_3 \mathbf{r}_3^{T}\, \mathbf{t}_{B2C}
Based on (9), the 2D location of the lane endpoint having the camera location as its origin, (XC, ZC) can be calculated from the image location, (x,y) using the homography, H’ as:
X_C = \frac{a}{c}, \quad Z_C = \frac{b}{c}, \qquad \left[\, a \;\; b \;\; c \,\right]^{T} = H'^{-1} \left[\, x \;\; y \;\; 1 \,\right]^{T}
XC and ZC are called the lateral and longitudinal locations of the lane endpoint, respectively.
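The location computation of (8)–(10) can be written compactly as follows; this is a sketch under the stated calibration convention, with illustrative variable names rather than the authors' implementation:

```python
import numpy as np

def endpoint_position(K, R_B2C, t_B2C, x, y):
    """Back-project an image point (x, y) lying on the ground plane to its 2-D
    position (X_C, Z_C) relative to the camera's ground projection, per Eqs. (8)-(10)."""
    r1, r2, r3 = R_B2C[:, 0], R_B2C[:, 1], R_B2C[:, 2]
    t_prime = np.outer(r3, r3) @ t_B2C                 # t'_B2C = r3 r3^T t_B2C  (Eq. (9))
    H_prime = K @ np.column_stack([r1, r2, t_prime])   # H' = K [r1 r2 t'_B2C]
    a, b, c = np.linalg.inv(H_prime) @ np.array([x, y, 1.0])
    return a / c, b / c                                # lateral X_C, longitudinal Z_C
```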

4. Position Accuracy Evaluation

4.1. Candidates for Evaluation Methods

Unfortunately, after extensive literature review, we have not found any previous methods for evaluating the position accuracy of the road markings, which are detected from images taken during high-speed driving. Thus, this section introduces the methods considered as candidates for position accuracy evaluation, and explains the reason why the MMS-based method is finally selected. The following three methods have been considered as candidates: measuring instrument-based, on-vehicle sensor-based, and MMS-based.
The measuring instrument-based method measures the relative position from the camera to the road marking using various measuring devices. Terrestrial laser scanning (TLS) is one of the most widely used instruments, and is mainly used for obtaining high-resolution 3D models of large structures such as dams, mines, and terrain. To evaluate the position accuracy of the road marking using this method, the TLS and the monocular front camera are first installed together and the rigid transformation between them is precalibrated. The ground truth position of the road marking is obtained from the 3D points produced by the TLS, and its position is transformed into the camera coordinate system using the rigid transformation between the TLS and camera. Finally, the position accuracy is measured by comparing the transformed ground truth position with the position calculated from the image of the camera. This method has the advantage of providing a dense and precise 3D shape of the road marking. However, it can be used only in cases where the vehicle is stationary because the TLS takes a long time to scan the surrounding area. For the same reason, it is difficult for the TLS to scan long and narrow areas such as highways. Due to these limitations, it has been concluded that this method is inappropriate for use in highway situations.
The on-vehicle sensor-based method measures the relative position from the camera to the road marking using ranging sensors that are already mounted on commercialized or autonomous vehicles. The most widely used ranging sensors mounted on such vehicles are a stereo camera and a multi-layer LIDAR. This method is similar to the measuring instrument-based method. However, there is a major difference, which is the sensing speed. The sensing speed of the vehicle-mounted ranging sensors is designed to be fast so that they can quickly obtain the ranging data of the surrounding area while the vehicle is moving at high speed. To evaluate the position accuracy of the road marking using this method, the stereo camera or LIDAR along with the monocular front camera are first installed on the same vehicle and the rigid transformation between those sensors is precalibrated. The ground truth position of the road marking is obtained from a dense disparity map of the stereo camera or 3D points produced by the multi-layer LIDAR, and its position is transformed into the camera coordinate system using the rigid transformation between those sensors. Finally, the position accuracy is measured by comparing the transformed ground truth position with the position calculated from the image of the camera. This method has the advantage that it is applicable to situations where the vehicle is moving at high speed. However, in the case of the stereo camera, its ranging accuracy deteriorates severely as the distance to the road marking increases. In the case of the multi-layer LIDAR, it provides high ranging accuracy, but its range data are not dense enough to accurately localize the road marking. Due to these limitations, it has been concluded that this method is also inappropriate for use.
The MMS-based method measures the relative position from the camera to the road marking using dense 3D points and accurate positions obtained by the MMS. The MMS consists of high-precision positioning equipment and high performance LIDARs, and those sensors are usually mounted on a vehicle. Since the MMS can measure dense and accurate 3D points while the vehicle is moving at high speeds, it has an advantage of measuring long and wide areas such as highways. To evaluate the position accuracy of the road marking using this method, the monocular front camera is first attached to the MMS and the rigid transformation between the camera and MMS should be precalibrated. The ground truth position of the road marking is obtained from 3D points and its position is transformed into the camera coordinate system using the precalibrated rigid transformation and accurate positions produced by the MMS. Finally, the position accuracy is measured by comparing the transformed ground truth position and the position calculated by the image of the camera. Compared to the measuring instrument-based method, the MMS-based method has an advantage that it can be used while the vehicle is moving at high speeds. Compared to the on-vehicle sensor-based method, its 3D structure is denser and more accurate. Because of these advantages, this paper has selected the MMS-based method for evaluating the position accuracy of the road marking. A detailed explanation of this method will be presented in the following sections.

4.2. Acquisition of 3D Points Using MMS

The MMS used in this paper consists of a high precision LIDAR as a ranging sensor and a combination of real-time kinematic-global positioning system (RTK-GPS), high precision inertial measurement unit (IMU), and distance measurement instrument (DMI) as a positioning sensor. Table 1 summarizes the specifications of these sensors.
The mapping and positioning sensors are mounted on an off-the-shelf vehicle as shown in Figure 7. The 3D structure surrounding the vehicle is obtained by registering the 3D points acquired by the ranging sensor using the 6D positions produced by the positioning sensor while the vehicle is moving. The obtained 3D points include 3D locations and infrared reflectivities. The monocular camera used to detect the lane endpoints is mounted on the vehicle equipped with the MMS, and the rigid transformation between the camera and MMS is precalibrated. Since the image capture is triggered by the positioning sensor, the precise camera position at the time of capturing each image is stored. Figure 8a,b show an example 3D structure and image captured by the MMS and camera, respectively. The red point and blue arrow in Figure 8a indicate the camera position at the time when the image in Figure 8b was captured.

4.3. Acquisition of Lane Endpoint Ground Truth

To evaluate the position accuracy of the lane endpoint, the ground truth position of the lane endpoint is manually designated using the 3D points obtained by the MMS. Since the manually designated position is in the 3D point cloud coordinate system, it is converted into the camera coordinate system using the precise camera position at the time of capturing the image, which is provided by the MMS. The position accuracy is evaluated by comparing the ground truth position and the position produced by the proposed method.
To manually designate the lane endpoints in the 3D points, roads and lanes should be visually distinguishable. However, since the roads and lanes are located on the same plane, it is hard to distinguish between them using their 3D information. Thus, this paper utilizes the infrared reflectivities provided by the LIDAR of the MMS to visually distinguish the roads and lanes. The 3D points of the lanes have high infrared reflectivities compared with the roads because they are drawn with a highly reflective paint. The positions of the lane endpoints are manually designated by displaying the 3D points based on their infrared reflectivities. Figure 9a shows the displayed 3D points based on their infrared reflectivities. In this figure, the larger the infrared reflectivity, the darker the 3D point is drawn. It can be noticed that the lanes are darker than the roads. This paper randomly selected 100 camera positions from approximately 40 km of highway, and manually designated four lane endpoints from the closest pair of dashed lanes. Figure 9a shows the 3D points around a randomly chosen camera position, which is depicted by a red point. In this figure, two green dashed lines indicate the selected dashed lane pair, which is closest to the camera position. Figure 9b shows an enlargement of one of the two selected lanes. Since the proposed method detects the center of the lane end as shown in Figure 5, the same location should be designated in the 3D points. To this end, the left and right corners of the lane end are manually designated and their center location is used as the ground truth of the lane endpoint. Figure 9c shows an example of the lane endpoint designation. In this figure, two blue points indicate two manually designated locations and a green point indicates the center of the two blue points, which is the ground truth of the lane endpoint.

4.4. Evaluation of Lane Endpoint Position Accuracy

Since the position of the lane endpoint is calculated with respect to the camera position in Section 3.4, its ground truth position obtained in Section 4.3 should be transformed into the camera coordinate system. To this end, the manually designated ground truth, GP in the point cloud coordinate system is first transformed into the camera coordinate system, GC as:
G_C = R_{P2C} \left( G_P - \mathbf{t}_{P2C} \right)
where RP2C and tP2C are the rotation matrix and translation vector that represent a rigid transformation between the point cloud and camera coordinate systems. RP2C and tP2C are obtained from the high precision positioning sensor of the MMS. GC is manipulated as:
\left[\, X_G \;\; Y_G \;\; Z_G \,\right]^{T} = \left( R_{B2C} \right)^{T} \left( G_C - \mathbf{t}_{B2C} \right),
where RB2C and tB2C are the rotation matrix and the translation vector that represent the rigid transformation between the camera and checkerboard coordinate systems, which have already been explained in (8) and (9). XG and ZG are the ground truth position of the lane endpoint in the lateral and longitudinal directions, respectively. The position accuracy of the lane endpoint is calculated by comparing (XG, ZG) in (12) with (XC, ZC) in (10).
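Combining (11) and (12), the ground-truth transformation reduces to two rigid-body steps. The sketch below assumes the MMS provides R_P2C and t_P2C for the image's capture time and reuses the precalibrated R_B2C, t_B2C from Section 3.4 (names are illustrative, not the authors' code):

```python
import numpy as np

def ground_truth_lateral_longitudinal(G_P, R_P2C, t_P2C, R_B2C, t_B2C):
    """Transform a manually designated ground-truth endpoint G_P (point-cloud frame)
    into the lateral/longitudinal frame used by the detector (Eqs. (11)-(12))."""
    G_C = R_P2C @ (G_P - t_P2C)              # point-cloud frame -> camera frame (Eq. (11))
    X_G, Y_G, Z_G = R_B2C.T @ (G_C - t_B2C)  # camera frame -> ground-plane frame (Eq. (12))
    return X_G, Z_G                          # lateral and longitudinal ground truth
```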

5. Experiments

5.1. Experimental Environment

Experiments were conducted using a dataset acquired by the monocular forward-looking (front) camera and the MMS mounted on the roof of the vehicle while driving on highways during the day and at night. The experimental setup is shown in Figure 7. The resolution, horizontal field of view, and acquisition frequency of the front camera are 1280 × 1024 pixels, 60 degrees, and 20 Hz, respectively. The dataset was acquired while driving at approximately 70–80 km/h on average, and it includes a total of 80 km of driving. Half of the dataset was taken during daytime (around 13:00) and half at night (around 21:00). Figure 10a shows the 3D points of three sample locations (①, ②, and ③) included in the test dataset. Figure 10b,c show images taken by the front camera at those three locations in daytime and nighttime, respectively.

5.2. Performance Evaluation and Comparison of Lane Endpoint Verification

This section presents the performance evaluation and comparison results of the lane endpoint verification explained in Section 3.3. To train and test the lane endpoint verifier, this paper collected training and test samples from images that are completely separate from the test dataset introduced in Section 5.1. The collected sample images were manually divided into positive and negative samples. The lane endpoints are categorized into four types: left starting point (LSP), left ending point (LEP), right starting point (RSP), and right ending point (REP), as shown in Figure 11. This paper trains one classifier for each type of lane endpoint so that a total of four classifiers are trained. Table 2 shows the number of sample images used for training and testing the lane endpoint verifiers. Figure 12 shows positive and negative sample images.
This paper compares two methods: the HOG-SVM and the CNN. The CNN-based classifier has been widely used for a variety of applications. For the performance comparison, this paper chose the CNN that has been used for classifying road markings in [30]. The performances of the two methods were evaluated based on the following three criteria: true positive rate (TPR), true negative rate (TNR), and classification accuracy. The better the classification performance, the greater the values of all three criteria:
\text{TPR} = \frac{\text{No. of correctly classified positive samples}}{\text{No. of positive samples}}, \quad
\text{TNR} = \frac{\text{No. of correctly classified negative samples}}{\text{No. of negative samples}}, \quad
\text{Accuracy} = \frac{\text{No. of correctly classified samples}}{\text{No. of all samples}}
The two classifiers were trained on the same training samples and tested on the same test samples shown in Table 2. Table 3 and Table 4 show the performances of the HOG-SVM and the CNN in [30], respectively. The HOG-SVM gives 98.2% TPR, 98.1% TNR, and 98.1% accuracy in Table 3, and the CNN gives 99.5% TPR, 97.4% TNR, and 98.5% accuracy. In terms of classification performance, the two methods are quite similar. However, in terms of computational cost, the HOG-SVM requires 0.25 ms while the CNN requires 3.10 ms to classify a single lane endpoint. This means that if four lane endpoint candidates are generated, the verification stage alone requires 12.40 ms in the case of the CNN. Considering that the total computation time of the proposed method is 4.35 ms, the computational cost of the CNN is too high for it to be used as a part of the proposed method. Based on this performance evaluation, it is found that the HOG-SVM is more cost-effective than the CNN in this application. The computation time of the proposed method and the specification of the computer used will be given in Section 5.5.

5.3. Performance Evaluation of Lane Endpoint Detection

This section presents the performance evaluation and comparison results of the lane endpoint detection, which consists of the lane endpoint candidate generation and verification. To this end, this paper randomly selected 1200 images from the test dataset. Of these, 600 images are from the dataset taken in daytime and 600 images from the dataset taken at night. The daytime and nighttime images include 1504 and 1290 lane endpoints, respectively. This paper manually confirms whether the locations of the detected lane endpoints match the locations of the actual lane endpoints. The performance evaluation and comparison were conducted based on the following three criteria: recall, precision, and F-measure. The better the detection performance, the greater the values of all three criteria:
\text{Recall} = \frac{\text{No. of true positives}}{\text{No. of lane endpoints}}, \quad
\text{Precision} = \frac{\text{No. of true positives}}{\text{No. of true positives} + \text{No. of false positives}}, \quad
\text{F-measure} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}
Table 5 and Table 6 show the performance of the proposed method in daytime and nighttime, respectively. In the case of the daytime performance, this method gives 96.1% recall by correctly detecting 1445 lane endpoints out of 1504, and gives 99.7% precision by producing only four false detections. The F-measure that combines recall and precision is 97.9%. In the case of the nighttime, this method gives 94.7% recall by correctly detecting 1222 lane endpoints out of 1290, and gives 100.0% precision by producing no false detections. The F-measure is 97.3%. These results show that the proposed method successfully detects lane endpoints both in daytime and nighttime and produces very few false detections. Figure 13a,b show the lane endpoint detection results of the proposed method in daytime and nighttime, respectively. As mentioned in Section 3.3, from a vehicle localization perspective, it is a great advantage that the proposed method generates very few false positives, because falsely detected landmarks can cause a large localization error by being mismatched with the landmarks stored in the digital map. As shown in Table 5 and Table 6, the recall at night is slightly lower than that in daytime. This is because illumination in daytime is relatively uniform with distance, whereas at night the areas very near to and far from the vehicle are dark due to the limited reach of the headlights, as shown in Figure 13b, which slightly hinders detection. The precision at night is slightly higher than that in daytime. This is because objects in front of the ego-vehicle have low brightness values in images at night, as shown in Figure 13b, which reduces the possibility of producing false detections.
The proposed method was quantitatively compared with the method suggested in [31], which detects lane endpoints for camera calibration purposes. Table 7 and Table 8 show the performance of the method in [31] in daytime and nighttime, respectively. In the case of the daytime performance, this method gives 97.1% recall and 66.3% precision. In the case of the nighttime performance, it gives 88.3% recall and 60.4% precision. Compared with the proposed method, the method in [31] shows a similar recall, but there is a large difference (30–40%) in precision. This difference is mainly caused by two factors. The first is the way the image brightness difference is measured when detecting the lane endpoints. The proposed method measures the brightness difference using neighboring pixels while adaptively changing the size of the top-hat filter. In contrast, the method in [31] measures the brightness difference using distant pixels in areas of fixed size. Due to the perspective distortion in the images of the front camera, it is advantageous to adaptively change the size of the area used to calculate the brightness difference. In addition, it is preferable to use neighboring pixels when calculating the brightness difference because other road markings such as arrows or letters can degrade the detection performance if distant pixels are used. The second factor behind the large difference in precision is the use of the learning-based lane endpoint verifier. The method in [31] detects the lane endpoints using the image brightness difference and heuristically tuned threshold values, whereas the proposed method combines the image brightness difference with a lane endpoint verifier trained using a machine learning approach. These are the two main reasons why the proposed method produces a higher precision than the method in [31].

5.4. Position Accuracy Evaluation of Lane Endpoint

The position accuracy of the lane endpoints was evaluated using the MMS-based method explained in Section 4. To this end, the ground truths of the lane endpoints were first acquired from the 3D points. This paper randomly selected 100 locations and manually designated four endpoints for each location, so the total number of ground truths is 400. The lane endpoints automatically detected from the front camera images by the proposed method were compared with the corresponding ground truths as explained in Section 4.4. Since the camera captures the same lane endpoint multiple times while the ego-vehicle is moving, each ground truth is detected in multiple images. In this experiment, the 400 ground truths correspond to 3829 lane endpoints detected in the daytime images and 4962 lane endpoints detected in the nighttime images. Thus, a total of 8791 lane endpoints detected by the proposed method were used for the position accuracy evaluation. This paper utilizes three criteria for evaluating the position accuracy: longitudinal error (elon), lateral error (elat), and Euclidean error (eeuc):
e_{lon} = \left| Z_G - Z_C \right|, \quad e_{lat} = \left| X_G - X_C \right|, \quad e_{euc} = \sqrt{ \left( X_G - X_C \right)^{2} + \left( Z_G - Z_C \right)^{2} }
where (XG, ZG) and (XC, ZC) are the locations of the ground truth and detected lane endpoints, respectively. elon indicates the position error in the vehicle traveling direction, elat indicates the position error in the direction perpendicular to the vehicle traveling direction, and eeuc indicates the Euclidean distance between two positions.
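For completeness, the following tiny sketch shows how the per-detection errors of (15) and the per-condition averages reported in Table 9 could be accumulated (array names are illustrative):

```python
import numpy as np

def position_error_stats(gt_xz, det_xz):
    """gt_xz, det_xz: (N, 2) arrays of (X, Z) ground-truth and detected positions.
    Returns the mean longitudinal, lateral, and Euclidean errors (Eq. (15))."""
    e_lat = np.abs(gt_xz[:, 0] - det_xz[:, 0])   # lateral errors  e_lat
    e_lon = np.abs(gt_xz[:, 1] - det_xz[:, 1])   # longitudinal errors  e_lon
    e_euc = np.hypot(e_lat, e_lon)               # Euclidean errors  e_euc
    return e_lon.mean(), e_lat.mean(), e_euc.mean()
```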
Table 9 shows the mean and standard deviation of the position errors for the four types of lane endpoints in daytime and nighttime. In daytime, the proposed method gives 21.6 cm, 7.8 cm, and 24.2 cm for elon, elat, and eeuc, respectively. At night, the proposed method gives 48.2 cm, 8.2 cm, and 49.9 cm for elon, elat, and eeuc, respectively. In the highway situation, the lane width is more than 3.5 m and the distance to the preceding vehicle is several tens of meters. Therefore, it can be said that the longitudinal and lateral errors of the lane endpoints detected by the proposed method are small enough for them to be used as landmarks of the vehicle localization system for highway autonomous driving. According to Table 9, in both daytime and nighttime, the longitudinal error is much larger than the lateral error. This is because, in the case of a front camera whose optical axis is almost parallel to the road surface, the number of pixels per meter decreases much more sharply in the longitudinal direction than in the lateral direction. This table also shows that the longitudinal error at night (48.2 cm) is much larger than that in daytime (21.6 cm). This is because the lane endpoints are clearly captured in daytime due to bright illumination, whereas at night severe motion blur occurs near the lane endpoints due to dark illumination. Figure 14a,b show images of the lane endpoints taken in daytime and nighttime, respectively. It can be clearly seen that the images taken at night are severely contaminated by motion blur compared with those taken in daytime. Unlike the longitudinal errors, the lateral errors measured in daytime and nighttime are quite similar. This is because the motion blur occurs mainly in the moving direction of the ego-vehicle and hardly occurs in the direction perpendicular to it.
Furthermore, at night, the longitudinal errors of the LEP and REP are larger than those of the LSP and RSP by approximately 20 cm. This is due to the nature of the motion blur that occurs at the lane endpoints. When the front camera is moving at high speed and capturing images, the motion blur makes the brightness of the lane starting point (LSP or RSP) slightly dark. Since the lane marking is much brighter than surrounding areas, a slight darkening does not affect the locations of the detected lane starting points. However, in the same situation, the motion blur makes the brightness of the region near the lane ending point (LEP or REP) fairly bright. This makes the proposed method erroneously detect the location brightened by the motion blur and it leads to a relatively larger longitudinal error.
Figure 15 shows the changes in average position errors according to the distance from the camera to the lane endpoint. In Figure 15a, solid and dashed lines indicate the lateral and longitudinal errors in daytime, respectively. In daytime, the lateral error slightly increases and the longitudinal error dramatically increases while the distance to the lane endpoint increases. This is because, in the case of the front camera whose optical axis is almost parallel to the road surface, the number of pixels per meter slightly decreases in the lateral direction and rapidly decreases in the longitudinal direction.
In Figure 15b, solid and dashed lines indicate the lateral and longitudinal errors at night, respectively. Similar to the daytime case, the lateral error slightly increases while the distance to the lane endpoint increases. However, the longitudinal error shows a tendency different from that of the daytime. The longitudinal error becomes large at the location near the front camera. This is because the motion blur more severely occurs on images of objects that are closer to the camera. Objects that are closer to the camera move faster in a captured image compared with those that are far from the camera.

5.5. Execution Time

Table 10 shows the execution times of the main modules of the proposed method. These times were measured on an Intel Core i7-7700 CPU with 16 GB RAM using only a single core. The proposed method requires a total execution time of 4.35 ms, which means that it can process 230 frames per second in real time. Since the proposed method requires only a tiny amount of computation, it can be added as an additional function of an existing multi-functional front camera module. Note that the image acquisition time is not included in the execution time.

6. Conclusions and Future Works

To increase the accuracy of landmark-based vehicle localization, this paper proposes using lane endpoints as landmarks. Two methods are proposed regarding the lane endpoints. One is to efficiently detect the lane endpoints using a conventional front camera, and the other is to reliably measure the position accuracy of the detected lane endpoints. In the experiment, it was found that the proposed detection method can accurately find the lane endpoints in both daytime and nighttime with a small amount of computation and the proposed MMS-based method can reliably measure the position accuracy of the lane endpoints detected from images taken while the vehicle is moving at high speed. In the future, we are planning to develop a method that can reduce the influence of the motion blur when calculating the positions of the lane endpoints at night, and extend the proposed position accuracy evaluation method to other landmarks such as arrows, letters, and traffic signs.

Author Contributions

E.S.J. and J.K.S. developed the algorithm, performed the experiments, and wrote the paper together. H.G.J. developed the evaluation methodology, analyzed the experimental results, and reviewed the paper.

Acknowledgments

This research was supported by a grant (code 18TLRP-B101406-04) from the ‘Development of Cooperative Automated Driving Highway Systems’ Program funded by the Ministry of Land, Infrastructure and Transport of the Korean government.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Kuutti, S.; Fallah, S.; Katsaros, K.; Dianati, M.; Mccullough, F.; Mouzakitis, A. A Survey of the State-of-the-Art Localization Techniques and Their Potentials for Autonomous Vehicle Applications. IEEE Internet Things J. 2018, 5, 829–846.
2. Suhr, J.K.; Jang, J.; Min, D.; Jung, H.G. Sensor fusion-based low-cost vehicle localization system for complex urban environments. IEEE Trans. Intell. Transp. Syst. 2017, 18, 1078–1086.
3. Hata, A.; Osorio, F.; Wolf, D. Robust curb detection and vehicle localization in urban environments. In Proceedings of the 2014 Intelligent Vehicles Symposium, Dearborn, MI, USA, 8–11 June 2014.
4. Wei, L.; Cappelle, C.; Ruichek, Y. Horizontal/vertical LRFs and GIS maps aided vehicle localization in urban environment. In Proceedings of the IEEE International Conference on Intelligent Transportation Systems, The Hague, The Netherlands, 6–9 October 2013.
5. Schlichting, A.; Brenner, C. Localization using automotive laser scanners and local pattern matching. In Proceedings of the 2014 Intelligent Vehicles Symposium, Dearborn, MI, USA, 8–11 June 2014.
6. Lundgren, M.; Stenborg, E.; Svensson, L.; Hammarstrand, L. Vehicle self-localization using off-the-shelf sensors and a detailed map. In Proceedings of the 2014 Intelligent Vehicles Symposium, Dearborn, MI, USA, 8–11 June 2014.
7. Yoneda, K.; Tehrani, H.; Ogawa, T.; Hukuyama, N.; Mita, S. Lidar scan feature for localization with highly precise 3-D map. In Proceedings of the 2014 Intelligent Vehicles Symposium, Dearborn, MI, USA, 8–11 June 2014.
8. Stübler, M.; Wiest, J.; Dietmayer, K. Feature-based mapping and self-localization for road vehicles using a single grayscale camera. In Proceedings of the 2015 Intelligent Vehicles Symposium, Seoul, Korea, 28 June–1 July 2015.
9. Valgren, C.; Lilienthal, A.J. SIFT, SURF & seasons: Appearance-based long-term localization in outdoor environments. Robot. Auton. Syst. 2010, 58, 149–156.
10. Ziegler, J.; Lategahn, H.; Schreiber, M.; Keller, C.G.; Knoppel, C.; Hipp, J.; Haueis, M.; Stiller, C. Video based localization for BERTHA. In Proceedings of the 2014 Intelligent Vehicles Symposium, Dearborn, MI, USA, 8–11 June 2014.
11. Mattern, N.; Schubert, R.; Wanielik, G. High-accurate vehicle localization using digital maps and coherency images. In Proceedings of the 2010 Intelligent Vehicles Symposium, San Diego, CA, USA, 21–24 June 2010.
12. Levinson, J.; Thrun, S. Robust vehicle localization in urban environments using probabilistic maps. In Proceedings of the 2010 IEEE International Conference on Robotics and Automation, Anchorage, AK, USA, 3–8 May 2010.
13. Hata, A.; Wolf, D. Road marking detection using LIDAR reflective intensity data and its application to vehicle localization. In Proceedings of the 2014 IEEE International Conference on Intelligent Transportation Systems, Qingdao, China, 8–11 October 2014.
14. Kim, D.; Chung, T.; Yi, K. Lane map building and localization for automated driving using 2D laser rangefinder. In Proceedings of the 2015 Intelligent Vehicles Symposium, Seoul, Korea, 28 June–1 July 2015.
15. Suganuma, N.; Uozumi, T. Precise position estimation of autonomous vehicle based on map-matching. In Proceedings of the 2011 Intelligent Vehicles Symposium, Baden-Baden, Germany, 5–9 June 2011.
16. Schreiber, M.; Knöppel, C.; Franke, U. Laneloc: Lane marking based localization using highly accurate maps. In Proceedings of the 2013 Intelligent Vehicles Symposium, Gold Coast City, Australia, 23–26 June 2013.
17. Deusch, H.; Wiest, J.; Reuter, S.; Nuss, D.; Fritzsche, M.; Dietmayer, K. Multi-sensor self-localization based on maximally stable extremal regions. In Proceedings of the 2014 Intelligent Vehicles Symposium, Dearborn, MI, USA, 8–11 June 2014.
18. Jo, K.; Jo, Y.; Suhr, J.K.; Jung, H.G.; Sunwoo, M. Precise Localization of an Autonomous Car Based on Probabilistic Noise Models of Road Surface Marker Features Using Multiple Cameras. IEEE Trans. Intell. Transp. Syst. 2015, 16, 3377–3392.
19. Nedevschi, S.; Popescu, V.; Danescu, R.; Marita, T.; Oniga, F. Accurate ego-vehicle global localization at intersections through alignment of visual data with digital map. IEEE Trans. Intell. Transp. Syst. 2013, 14, 673–687.
20. Jo, K.; Chu, K.; Sunwoo, M. GPS-bias correction for precise localization of autonomous vehicles. In Proceedings of the 2013 Intelligent Vehicles Symposium, Gold Coast City, Australia, 23–26 June 2013.
21. Lu, W.; Seignez, E.; Rodriguez, F.S.A.; Reynaud, R. Lane marking based vehicle localization using particle filter and multi-kernel estimation. In Proceedings of the International Conference on Control Automation Robotics & Vision, Marina Bay Sands, Singapore, 10–12 December 2014.
22. Gruyer, D.; Belaroussi, R.; Revilloud, M. Accurate lateral positioning from map data and road marking detection. Expert Syst. Appl. 2016, 43, 1–8.
23. Tao, Z.; Bonnifait, P.; Fremont, V.; Ibanez-Guzman, J. Mapping and localization using GPS, lane markings and proprioceptive sensors. In Proceedings of the International Conference on Intelligent Robots and Systems, Tokyo, Japan, 3–7 November 2013.
24. Shunsuke, K.; Yanlei, G.; Hsu, L.T. GNSS/INS/on-board camera integration for vehicle self-localization in urban canyon. In Proceedings of the IEEE International Conference on Intelligent Transportation Systems, Las Palmas de Gran Canaria, Spain, 15–18 September 2015.
25. Wu, T.; Ranganathan, A. Vehicle localization using road markings. In Proceedings of the 2013 Intelligent Vehicles Symposium, Gold Coast City, Australia, 23–26 June 2013.
26. Veit, T.; Tarel, J.P.; Nicolle, P.; Charbonnier, P. Evaluation of road marking feature extraction. In Proceedings of the 2008 IEEE International Conference on Intelligent Transportation Systems, Beijing, China, 12–15 October 2008.
27. Fischler, M.; Bolles, R. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 1981, 24, 381–395.
28. Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 20–26 June 2005.
29. Burges, C.J. A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Discov. 1998, 2, 121–167.
30. Ahmad, T.; Ilstrup, D.; Emami, E.; Bebis, G. Symbolic road marking recognition using convolutional neural networks. In Proceedings of the 2017 Intelligent Vehicles Symposium, Redondo Beach, CA, USA, 11–14 June 2017.
31. De Paula, M.B.; Jung, C.R.; da Silveira, L.G., Jr. Automatic on-the-fly extrinsic camera calibration of onboard vehicular cameras. Expert Syst. Appl. 2014, 41, 1997–2007.

Figure 1. Analysis of distributions of road markings in highway situations.
Figure 2. (a) Two types of lane endpoints; (b) Installation regulation of dashed lane marking.
Figure 3. Taxonomy of vehicle localization methods.
Figure 4. Procedures for lane detection. (a) Regions of interest for lane detection; (b) Top-hat filtering result; (c) Local maxima locations; (d) Lane detection result.
Figure 5. Lane endpoint candidate generation. (a) Lane detection result; (b) Left and right lane profiles; (c) Derivatives of the lane profiles and candidates of the lane starting and ending points; (d) Candidates of the lane starting and ending points in the image.
Figure 6. Lane endpoint candidate verification. (a) Lane endpoint candidates; (b) Bird’s-eye view images generated from lane endpoint candidates; (c) Lane endpoint verification result.
Figure 7. MMS and camera used in the experiment.
Figure 8. Example 3D structure and image captured by MMS and camera. (a) 3D structure obtained by the MMS; (b) Image captured by the camera. A red point and blue arrow in (a) indicate the camera position at the time when the image (b) was captured.
Figure 9. Ground truth acquisition procedure. (a) 3D points around the selected camera position; (b) Enlargement of dashed lane segment; (c) Lane endpoint ground truth designation.
Figure 10. Highway the vehicle drove when acquiring the test dataset. (a) 3D points at three sample locations; (b) Images taken in daytime; (c) Images taken at nighttime.
Figure 11. Types of lane endpoints.
Figure 12. Sample images. (a) Positive samples; (b) Negative samples.
Figure 13. Lane endpoint detection results of the proposed method in (a) daytime and (b) nighttime.
Figure 14. Images of lane endpoints in (a) daytime and (b) nighttime.
Figure 15. Changes of position errors according to the distance from the camera. (a) Daytime; (b) Nighttime.

Table 1. Sensor specifications of the MMS.
Sensor | Model | Specification
Ranging sensor | Velodyne HDL32E | Measurement range: 5 cm~100 m; Field of view: horizontal 360°, vertical 40°; Accuracy: <2 cm; Angular resolution: vertical 1.25°; Acquisition frequency: 5~20 Hz
Positioning sensor | Applanix POS LV 210 | X, Y position: 0.020 m (RMS); Z position: 0.050 m (RMS); Roll & pitch: 0.020° (RMS); Heading: 0.050° (RMS); Acquisition frequency: 100 Hz

Table 2. Number of samples used for training and testing the lane endpoint verifiers.
Type | Training (Positive) | Training (Negative) | Test (Positive) | Test (Negative)
LSP | 5000 | 5202 | 5000 | 4582
LEP | 5000 | 5202 | 5000 | 4582
RSP | 5000 | 5202 | 5000 | 4582
REP | 5000 | 5202 | 5000 | 4582

Table 3. Performance of the HOG-SVM-based lane endpoint verifier.
Type | No. of Positive Samples | No. of Negative Samples | No. of True Positives | No. of True Negatives | True Positive Rate | True Negative Rate | Accuracy
LSP | 5000 | 4582 | 4810 | 4490 | 96.2% | 98.0% | 97.1%
LEP | 5000 | 4582 | 4905 | 4408 | 98.1% | 96.2% | 97.2%
RSP | 5000 | 4582 | 4955 | 4559 | 99.1% | 99.5% | 99.3%
REP | 5000 | 4582 | 4960 | 4531 | 99.2% | 98.9% | 99.1%
Overall | 20,000 | 18,328 | 19,630 | 17,988 | 98.2% | 98.1% | 98.1%

Table 4. Performance of the CNN-based lane endpoint verifier.
Type | No. of Positive Samples | No. of Negative Samples | No. of True Positives | No. of True Negatives | True Positive Rate | True Negative Rate | Accuracy
LSP | 5000 | 4582 | 4961 | 4368 | 99.2% | 95.3% | 97.4%
LEP | 5000 | 4582 | 4978 | 4469 | 99.6% | 97.5% | 98.6%
RSP | 5000 | 4582 | 4974 | 4485 | 99.5% | 97.9% | 98.7%
REP | 5000 | 4582 | 4984 | 4527 | 99.7% | 98.8% | 99.3%
Overall | 20,000 | 18,328 | 19,897 | 17,849 | 99.5% | 97.4% | 98.5%

Table 5. Performance of the proposed lane endpoint detection method in daytime.
Type | No. of Endpoints | No. of True Positives | No. of False Positives | Recall | Precision | F-Measure
LSP | 364 | 353 | 1 | 97.0% | 99.7% | 98.3%
LEP | 360 | 334 | 1 | 92.8% | 99.7% | 96.1%
RSP | 393 | 383 | 0 | 97.5% | 100.0% | 98.7%
REP | 387 | 375 | 2 | 96.9% | 99.5% | 98.2%
Overall | 1504 | 1445 | 4 | 96.1% | 99.7% | 97.9%

Table 6. Performance of the proposed lane endpoint detection method in nighttime.
Type | No. of Endpoints | No. of True Positives | No. of False Positives | Recall | Precision | F-Measure
LSP | 355 | 335 | 0 | 94.4% | 100.0% | 97.1%
LEP | 385 | 362 | 0 | 94.0% | 100.0% | 96.9%
RSP | 250 | 235 | 0 | 94.0% | 100.0% | 96.9%
REP | 300 | 290 | 0 | 96.7% | 100.0% | 98.3%
Overall | 1290 | 1222 | 0 | 94.7% | 100.0% | 97.3%

Table 7. Performance of the method [31] in daytime.
Type | No. of Endpoints | No. of True Positives | No. of False Positives | Recall | Precision | F-Measure
LSP | 364 | 345 | 204 | 94.8% | 62.8% | 75.6%
LEP | 360 | 343 | 193 | 95.3% | 64.0% | 76.6%
RSP | 393 | 391 | 208 | 99.5% | 65.3% | 78.8%
REP | 387 | 381 | 138 | 98.4% | 73.4% | 84.1%
Overall | 1504 | 1460 | 743 | 97.1% | 66.3% | 78.8%

Table 8. Performance of the method [31] in nighttime.
Type | No. of Endpoints | No. of True Positives | No. of False Positives | Recall | Precision | F-Measure
LSP | 355 | 322 | 165 | 90.7% | 66.1% | 76.5%
LEP | 385 | 323 | 214 | 83.9% | 60.1% | 70.1%
RSP | 250 | 232 | 157 | 92.8% | 59.6% | 72.6%
REP | 300 | 262 | 210 | 87.3% | 55.5% | 67.9%
Overall | 1290 | 1139 | 746 | 88.3% | 60.4% | 71.7%

Table 9. Position accuracy evaluation results in daytime and nighttime [mean (standard deviation)].
Type | Daytime e_lon (cm) | Daytime e_lat (cm) | Daytime e_euc (cm) | Nighttime e_lon (cm) | Nighttime e_lat (cm) | Nighttime e_euc (cm)
LSP | 20.9 (21.1) | 6.9 (4.1) | 23.5 (19.8) | 36.3 (34.9) | 8.5 (6.2) | 38.2 (34.4)
LEP | 22.0 (20.6) | 7.8 (4.5) | 24.7 (19.5) | 55.9 (35.2) | 8.1 (6.2) | 57.4 (34.3)
RSP | 20.9 (18.0) | 8.4 (5.5) | 23.7 (17.2) | 38.7 (29.5) | 8.2 (5.9) | 40.6 (28.7)
REP | 22.5 (18.3) | 8.0 (5.1) | 25.0 (17.5) | 59.4 (33.8) | 8.2 (5.9) | 60.9 (32.6)
Overall | 21.6 (19.5) | 7.8 (4.9) | 24.2 (18.5) | 48.2 (34.9) | 8.2 (6.1) | 49.9 (34.0)

Table 10. Execution time.
Module | Time (ms)
Integral image generation | 1.51
Top-hat filtering | 1.46
RANSAC-based lane detection | 0.35
Lane endpoint candidate generation | 0.08
Lane endpoint candidate verification | 0.95
Overall | 4.35
