3-D Surround View for Advanced Driver
Assistance Systems
Yi Gao, Chunyu Lin, Yao Zhao, Senior Member, IEEE, Xin Wang, Shikui Wei, and Qi Huang
Abstract— As the primary means of transportation in modern society, the automobile is developing toward intelligence, automation, and comfort. In this paper, we propose a more immersive 3-D surround view covering the area around the automobile for advanced driver assistance systems. The 3-D surround view helps drivers become aware of the driving environment and eliminates visual blind spots. The system first uses four fish-eye lenses mounted around a vehicle to capture images. Then, following the pipeline of image acquisition, camera calibration, image stitching, and scene generation, the 3-D surround driving environment is created. To achieve real-time and easy-to-handle performance, we use only one image to complete the camera calibration through a specially designed checkerboard. Furthermore, in the process of image stitching, a 3-D ship model is built as the supporting surface, on which texture mapping and image fusion algorithms are utilized to preserve the real texture information. The algorithms used in this system reduce the computational complexity and improve the stitching efficiency. The fidelity of the surround view is also improved, thereby optimizing the immersion experience of the system while preserving the information of the surroundings.
Fig. 1. Output result of the existing system. (a) 2D surround view. (b) Stacked image.
Index Terms— Fish-eye lens, camera calibration, 3D surround
view, image stitching, driver assistance systems.
Fig. 2. Screenshot of the system.
I. INTRODUCTION

PRESENTLY, autonomous vehicles are a very hot topic
in academia and industry. However, making autonomous vehicles practical is not only a technical problem but also involves safety, legal, and social acceptance aspects, among others [1]. Conversely, advanced driver assistance systems (ADAS) that involve human interaction are more practical for applications. In [2], anomaly detection in traffic scenes
is implemented through spatial-aware motion reconstruction
to reduce traffic accidents resulting from drivers’ unawareness
or blind spots. However, the authors in [2] also note that it is
almost impossible to design a system that can faultlessly detect
Manuscript received February 21, 2017; revised June 2, 2017 and
July 8, 2017; accepted July 30, 2017. This work was supported in part
by the National Training Program of Innovation and Entrepreneurship for
Undergraduates, in part by the National Natural Science Foundation of China
under Grant 61402034, Grant 61210006, and Grant 61202240, and in part
by the National Key Research and Development Program of China under
Grant 2016YFB0800404. The Associate Editor for this paper was Q. Wang.
(Corresponding author: Chunyu Lin.)
Y. Gao, C. Lin, Y. Zhao, X. Wang, and S. Wei are with the Beijing
Key Laboratory of Advanced Information Science and Network, Institute
of Information Science, Beijing Jiaotong University, Beijing 100044, China
(e-mail: [email protected]).
Q. Huang is with Beijing Xinyangquan Electronic Technology Co., Ltd.,
Beijing 100038, China.
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TITS.2017.2750087
all types of abnormal events. A surround view camera system provides a top/bird's-eye view that allows the driver to observe the 360-degree surroundings of the vehicle [3], [4]. On the one hand, the existing surround view assistance systems cannot generate integrated and natural surround images because of the calibration algorithm. On the other hand, such algorithms, as claimed in [3], are either not designed to achieve real-time performance or have not been tested on an embedded platform.
Most importantly, a bird’s-eye view system can only provide a
single perspective from above the vehicle, such as that shown
in Fig. 1 (a), or the images are simply stacked, as shown
in Fig. 1 (b). Both of these images will probably mislead the
driver.
3D surround views for ADAS could solve this problem by providing a considerably better sense of immersion and awareness of the surroundings. A screenshot of our 3D surround view is presented in Fig. 2, from which more information about the vehicle surroundings can be observed. Although Fujitsu declared that its chips will support 3D surround views [5], the details of this technology still need to be worked out. In this paper, we introduce a low-cost 3D surround view system that includes our special fish-eye calibration algorithm, perspective transformation, 3D ship model building, texture mapping and linear fusion.
This type of system can help drivers be aware of the driving environment around the vehicle, eliminate visual blind spots, and prevent all types of hidden dangers.

Fig. 3. The specific position of the lenses and the output of the screen.
Fig. 4. Flow chart of the key steps.
Fig. 5. Ship model.
Fig. 6. Sectional view of 3D ship model.
The remainder of this paper is organized as follows.
In Sec. II, we describe the architecture of our 3D surround view for driver assistance systems. In Sec. III, we introduce the algorithm details. Finally, the performance and the results are presented in Sec. IV.
II. SYSTEM OVERVIEW
The system consists of four fish-eye lenses mounted around
the vehicle and a display screen inside the control panel.
A miniature vehicle model is shown in Fig. 3. The fish-eye
lenses are distributed at the front bumper, the rear bumper and
under the two rear-view mirrors. To reduce costs, HK8067A fish-eye lenses that cost no more than 3 dollars are utilized. These lenses provide a 180-degree wide-angle view to ensure sufficient overlap between adjacent views. Through a series
of image stitching processes using an embedded system with
the Freescale processor, a 3D surround view can be formed on
the vehicle’s central control panel. In the next section, we will
introduce each step in detail.
III. 3D SURROUND VIEW SYSTEM
A flow chart of our 3D surround view system is presented
in Fig. 4, which includes four main steps: camera calibration,
coordinate transformation, texture mapping and image fusion.
Before introducing the details of the algorithm, the proposed
3D model, which is the carrier of the surround image, will be
presented because it is directly related to the effects of the
texture mapping, image fusion and the quality of the surround image.

A. 3D Ship Model

In this algorithm, we decide to construct a ship model.
Compared with the commonly used cylindrical models, such
as those mentioned in [6], the horizontal bottom and the arc-shaped wall are more consistent with the driver's visual habits and help drivers obtain a broader view. The model is
shown in Fig. 5.
The construction of the 3D model consists of connecting the
points in the 3D space into a line, then a plane, and finally
a body. After we store the points in a certain order, we can
draw the 3D model as needed.
Considering the effect of texture fusion, our ship model selects the function Z = R^4 as the ramp function (R is the distance between the projection point and the edge of the bottom, and Z is the arc point's height). The sectional view
of our ship model is presented in Fig. 6. As indicated by
the top view, the model is composed of a ring of ellipses. The points of the model are the intersections of lines extending from the bottom frame with those ellipses.
According to experiments, the slope surface is smoother when
the density of points on the slope is 15. In other words,
15 intersection points on each slope curve are selected to
build the model. However, in the uppermost ellipse, where
the distance between each point is large, we add a number
of straight lines passing through the corners of the bottom
frame and form more intersections to build the model. Then,
we project these points into the bottom of the model and
calculate the distance between the projection point and edge
point of the bottom surface. With the ramp function, the
coordinates of those points on the slope can be calculated.
The resulting slope is smooth, and the images are more natural.
Fig. 7. Model construction.
Fig. 8. Special calibration board.
It not only helps drivers obtain broader vision but also speeds
up the process of image stitching. The time required from
generating model points to completing the model construction
is 17 ms. In addition, construction of the model can be
completed in 3 ms when the model points are already known.
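To make the point-generation step more concrete, the following is a minimal Python sketch of how the slope points could be sampled under the ramp function described above. The 15-point density along each slope curve and the Z = R^4 ramp come from the text; the frame dimensions, the number of sampling directions, and the height scale k are illustrative assumptions, not values from the paper.

```python
import numpy as np

def ship_slope_points(frame_w=4.0, frame_l=6.0, slope_extent=3.0,
                      n_slope=15, n_around=64, k=0.1):
    """Sample points on the ship-model slope.

    For each direction around the rectangular bottom frame, n_slope points
    are placed along the outward slope.  R is the horizontal distance from
    a point's ground projection to the bottom edge, and the height follows
    the ramp function Z = k * R**4 (k is an assumed scale factor).
    """
    pts = []
    for t in np.linspace(0.0, 2.0 * np.pi, n_around, endpoint=False):
        d = np.array([np.cos(t), np.sin(t)])          # outward direction
        # Distance from the centre to the rectangular frame edge along d.
        edge = min(frame_l / 2.0 / max(abs(d[0]), 1e-9),
                   frame_w / 2.0 / max(abs(d[1]), 1e-9))
        for r in np.linspace(0.0, slope_extent, n_slope):
            x, y = (edge + r) * d                      # ground projection
            z = k * r ** 4                             # ramp function Z = R^4
            pts.append((x, y, z))
    return np.asarray(pts)

points = ship_slope_points()
print(points.shape)   # (64 * 15, 3)
```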
Two adjacent curves on the slope constitute a parallel region. We choose one of these regions to illustrate the process, as shown in Fig. 7. The letter K in the figure denotes a point, and L denotes a line; the label in brackets is the index of the point or the line.
First, mark the points on the left curve with odd numbers and
those on the right curve with even numbers. Then, suppose
that there is a point with label K(n). If this point is on the left
curve, connect the points K(n-1), K(n-2), and K(n) in order.
Otherwise, connect the points K(n-2), K(n-1) and K(n) in
order. By addressing the points on all lines using this method,
all triangles will be drawn in the same direction. After all
the points in a certain area have been addressed, the model
construction is complete. Having finished the introduction of
our 3D model, we will now introduce the other steps in the
following subsections.
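The odd/even connection rule above can be illustrated with a short sketch. The 1-based labels follow the description of Fig. 7; the function below only demonstrates the winding rule and is not the system's actual mesh code.

```python
def strip_triangles(num_points):
    """Build triangles between two adjacent slope curves.

    Points on the left curve carry odd labels (1, 3, 5, ...) and points on
    the right curve carry even labels (2, 4, 6, ...).  For every point K(n)
    with n >= 3, one triangle is emitted; the vertex order is swapped
    depending on the curve so that all triangles share the same winding.
    """
    triangles = []
    for n in range(3, num_points + 1):
        if n % 2 == 1:                    # K(n) lies on the left curve
            triangles.append((n - 1, n - 2, n))
        else:                             # K(n) lies on the right curve
            triangles.append((n - 2, n - 1, n))
    return triangles

print(strip_triangles(6))
# [(2, 1, 3), (2, 3, 4), (4, 3, 5), (4, 5, 6)]
```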
B. Calibration of Fish-Eye Lenses Using Collinear
Constraint and Edge Corner Points
Since the employed fish-eye lens can capture a scene with
a wide angle of 180 degrees, the four fish-eye lenses around
the vehicle can ensure that there are no blind spots if a good
surround view can be stitched. However, the wide angle of
the fish-eye lens sacrifices the quality of the captured image.
A fish-eye lens introduces barrel distortion, particularly for light passing through positions far away from the optical center [7], [8]. The distortion makes it difficult to convey the captured image information [9]. Therefore, it is necessary to calibrate the camera, correct the distortion and then rectify the image to meet human visual requirements. Traditional camera calibration methods typically use approximately 20 captured images taken from different positions [10]–[12], which requires substantial human intervention and time cost [13]. In addition, the traditional calibration methods assume a pinhole model, which differs from the fish-eye lens model.
In fact, the lens supplier provides a list of field curvature data that indicate the relation between the real height and the reference height. Using this list, we can rectify the distorted image captured by the fish-eye lens. However, due to variations in the manufacturing process of the lens, offsets of the CCD center, and misalignment between the sensor plane and the lens plane, the provided list may not be accurate. Moreover, the fish-eye
lenses installed in our system are very inexpensive and of low
quality, which may also lead to inaccuracy. Through many
experiments, we found that the optical center is the most
important factor that affects the rectified image. If a good
estimation of the optical center can be obtained, then the list of
field curvature data can still be used. Most importantly, we can
cross rectify the image and estimate the optical center using
a certain rule, thereby obtaining the final rectified image.
Considering that the system is going to be set up in vehicles during manufacturing or in a 4S shop, less human intervention, simple and easy operation, and low time cost are preferred. Therefore, we propose a fish-eye lens calibration algorithm that uses a collinear constraint and edge corner points. The proposed algorithm relies on two important constraints. The first
constraint is that collinear points should be rectified to be
collinear. The second constraint is that the light through the
optical center has less distortion, whereas that through the edge
of the lenses has the largest distortion.
First, we use a special checkerboard that can be printed and
easily set up, as shown in Fig. 8. Using this checkerboard,
we calibrate the camera without moving the checkerboard or
vehicle.
In the distorted image, the corner points close to the lens can
still be accurately searched since these points have relatively
less distortion. In the global coordinates, these points are in the
same row or the same column. As shown in Fig. 8, the corner
points in the rectified image should still be collinear. Suppose
that d(i) is the distance between point i and the fitting line and that μ(i) is the weight of each point. Then, the weighted summation is

$$L = \sum_{i=1}^{M} \mu(i)\, d(i), \tag{1}$$

where M is the total number of corner points.
Considering the special structure of the fish-eye lens, the
farther away the points are from the optical center, the larger
the distortion will be. Therefore, the weight of each point
should be different, and these weights are set considering the
physical distance of the points to the lens.
Fig. 9. The process of the perspective transformation.
Given an optical center value, we can always obtain a
rectified image using the list of field curvature data. Then,
lines can be fitted through the detected corner points. The optical center coordinate is traversed within a given range, and the candidate that yields the smallest L over the fitted lines is selected as the required optical center.
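As an illustration of how this search might be organized, the sketch below computes the weighted collinearity cost of (1) and traverses candidate optical centers. Here rectify and detect_corner_rows are hypothetical placeholders for the rectification with the field-curvature list and the corner grouping described in the text; they are not real library calls.

```python
import numpy as np

def line_fit_cost(points, weights):
    """Weighted sum of distances from points to their best-fit line (Eq. 1)."""
    pts = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float)
    centroid = (w[:, None] * pts).sum(0) / w.sum()
    centered = pts - centroid
    # The last right singular vector is the direction of least (weighted)
    # variance, i.e. the normal of the best-fit line.
    _, _, vt = np.linalg.svd(centered * np.sqrt(w)[:, None])
    d = np.abs(centered @ vt[-1])
    return float((w * d).sum())

def search_optical_center(image, rectify, detect_corner_rows, weight_fn, candidates):
    """Traverse candidate optical centers; for each one, rectify the image with
    the field-curvature list, evaluate L over every row/column of corners that
    should be collinear, and keep the center with the smallest total cost."""
    best, best_cost = None, float("inf")
    for center in candidates:
        rect = rectify(image, center)                  # placeholder step
        cost = sum(line_fit_cost(row, weight_fn(row))
                   for row in detect_corner_rows(rect))  # placeholder step
        if cost < best_cost:
            best, best_cost = center, cost
    return best, best_cost
```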
Due to the special structure of the fish-eye lens, distortion
near the edge of the image is greater. As shown in Fig. 9, the
distortion of the rectangle on the sides of the checkerboard is
larger than that of the middle of the checkerboard. In addition,
the fish-eye lenses are mounted above the ground, and there
will be an inclination angle between the horizontal ground
and the lens, which makes the rectangle appear as a trapezoid
or other irregular space. The aforementioned facts lead to the
result that the side corner points are not easy to detect. If we
perform the calibration using only the middle corner points
on the checkerboard (the smaller checkerboard), then the estimated
optical center will not be accurate, thus affecting the final
rectified image.
To detect the corner points far away from the fish-eye
lens, we design a large square at the edge and perform a
perspective transformation two times. The process of corner
detection can be divided into several steps. First, a binarized
image is obtained through threshold segmentation. In addition,
we employ the adaptive thresholding method considering the
non-uniform brightness of images. The adaptive threshold of
a pixel is determined by the pixel value distribution of its
adjacent points. Second, image dilation (expansion) is used to break the connections between the black squares on the calibration board. This operation requires a structuring element, which can be a square or a circle with a center point. The value of this center point is compared with the value of each pixel of the image, and the larger one is set as the new pixel value. After this process, the white pixels expand; the black quadrilaterals therefore shrink, and the connections between the squares are cut off. Moreover, the number of vertices and the contour outline are used to identify the squares. Ultimately, some restrictive
conditions, such as aspect ratio, perimeter and area, are used
to eliminate interfering figures. After the above steps, the
corner points can be detected. By combining these detected
large square corner points and the small square corner points,
the camera calibration can be accomplished. Fig. 9 shows the
process.
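One possible OpenCV realization of this detection pipeline is sketched below (assuming OpenCV 4.x). The block size, the dilation kernel, and the aspect-ratio/area thresholds are assumed values chosen for illustration, not the ones used in the paper.

```python
import cv2
import numpy as np

def detect_board_squares(gray):
    """Sketch of the corner-detection pipeline described above:
    1) adaptive threshold (handles non-uniform brightness),
    2) dilation so the white background grows and the black squares separate,
    3) contour analysis keeping quadrilaterals with plausible aspect ratio,
       perimeter and area."""
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY, 31, 10)
    binary = cv2.dilate(binary, np.ones((3, 3), np.uint8), iterations=1)

    # Black squares become white blobs after inversion, so find their contours.
    contours, _ = cv2.findContours(255 - binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    squares = []
    for c in contours:
        peri = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.05 * peri, True)
        if len(approx) != 4 or cv2.contourArea(approx) < 100:
            continue                                    # not a plausible square
        x, y, w, h = cv2.boundingRect(approx)
        if 0.5 < w / float(h) < 2.0:                    # plausible aspect ratio
            squares.append(approx.reshape(-1, 2))       # its 4 corner points
    return squares
```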
Some corner points are difficult to detect because they are
far away from the optical center and have large distortion.
Hence, we perform perspective projection on the original
image. Since the checkerboard is placed on the ground, if we
look from above the checkerboard, the rectangle will be
its normal shape. Consequently, we can obtain a bird’s-eye
view image in which we can easily detect the corner points.
The perspective transformation matrix contains 9 parameters:

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.$$

However, the map plane is parallel to the original plane and the formula is homogeneous; therefore, $a_{33} = 1$. The perspective transformation is shown as follows:

$$(x, y, 1) = (u, v, 1) \times \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & 1 \end{pmatrix}. \tag{2}$$
Among them, (u, v) are the pixel coordinates in the distorted
image, and (x, y) are the pixel coordinates in the transformed
image. The transformation matrix encodes the information about scaling, shearing, rotation and translation. With 4 reliable pairs of detected corner points, which are the 4 points
closest to the lens, we can calculate the transformation matrix
parameters. Thus, we can obtain each point in the transformed
image as follows.
$$x = \frac{a_{11} u + a_{21} v + a_{31}}{a_{13} u + a_{23} v + 1}, \tag{3}$$

$$y = \frac{a_{12} u + a_{22} v + a_{32}}{a_{13} u + a_{23} v + 1}. \tag{4}$$
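For illustration, the homography of (2) can be estimated from 4 reliable corner pairs with OpenCV and then applied as in (3)–(4). The coordinates below are placeholders rather than measurements from the paper, and note that OpenCV's convention works on column vectors, i.e., the transpose of the row-vector form in (2).

```python
import cv2
import numpy as np

# Four reliable corner pairs: pixel coordinates (u, v) in the original image
# and their desired positions (x, y) in the bird's-eye view (placeholder values).
src = np.float32([[210, 340], [430, 338], [220, 420], [440, 418]])   # (u, v)
dst = np.float32([[0, 0], [200, 0], [0, 200], [200, 200]])           # (x, y)

# Solve the 8 free parameters of the homography (a33 is fixed to 1).
H = cv2.getPerspectiveTransform(src, dst)

def warp_point(H, u, v):
    """Map a point as in Eqs. (3)-(4): divide by the homogeneous coordinate."""
    x, y, w = H @ np.array([u, v, 1.0])
    return x / w, y / w

print(warp_point(H, 210, 340))   # maps (approximately) to (0, 0)
```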
Subsequently, we perform a second perspective transformation on the bird’s-eye view image. Thus, we can calculate
the pixel coordinates of the large rectangular corner points in
the original image. With all the detected corner points, the
line-fitting criterion in (1) is used to search for the optical center and generate the rectified image. Compared with the
traditional calibration algorithm, the proposed algorithm only
captures one image to complete the calibration; thus, it is
more suitable for the driver assistance system. Note that in
the traditional algorithm, only the close pixels can be rectified,
whereas our algorithm can rectify all the points.
C. Coordinate Transformation Using Virtual
Imaging Surface
After obtaining the optical center and the rectified images,
we need to transform the 2D image into our 3D ship model.
However, the direct relationship between the 2D image and
3D model is difficult to obtain. Hence, a virtual image plane
is established between the 2D image and 3D imaging plane
through perspective transformation and affine transformation.
The lens is located at the apex of the viewing cone. According to the fish-eye lens' position, visual angle, and orientation, the cone matrix can be determined.
$$\left\{\begin{aligned}
&\begin{pmatrix}
\dfrac{2N}{r-l} & 0 & 0 & 0\\
0 & \dfrac{2N}{t-b} & 0 & 0\\
0 & 0 & a & b\\
0 & 0 & -1 & 0
\end{pmatrix},\\
&a = -\frac{F+N}{F-N}, \qquad b = -\frac{2NF}{F-N}.
\end{aligned}\right. \tag{5}$$
Fig. 10. The process of coordinate transformation.
Fig. 11. Two-step mapping.
Fig. 12. O mapping.
N is the distance from the eye to the front clipping plane,
and F is the distance from the eye to the back clipping plane.
r , l, t, and b are the right boundary value, the left boundary
value, the upper boundary value and the lower boundary value
of the projection plane, respectively.
The pixel coordinates of the points on the virtual imaging
plane can be obtained by multiplying the above matrix with
the pixel coordinates of the corner points on the 3D model.
$$\begin{pmatrix} \dfrac{2Nx}{r-l} \\[1ex] \dfrac{2Ny}{t-b} \\[1ex] az+b \\ -z \end{pmatrix}
=
\begin{pmatrix}
\dfrac{2N}{r-l} & 0 & 0 & 0\\
0 & \dfrac{2N}{t-b} & 0 & 0\\
0 & 0 & a & b\\
0 & 0 & -1 & 0
\end{pmatrix}
\times
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}. \tag{6}$$
Then, the perspective transformation matrix can be solved
by combining the pixel coordinates of the corresponding
corner point on the 2D plane. After obtaining the matrix of the
two projection transformations, every point on the 2D image
can be converted to the 3D model using the virtual imaging
plane. The schematic diagram of the entire process is presented
in Fig. 10.
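A minimal sketch of the projection in (5)–(6) is given below. The clipping-plane values are arbitrary examples, and the OpenGL-style convention (camera looking along the negative z axis) is an assumption consistent with the −1 in the fourth row of the matrix.

```python
import numpy as np

def frustum_matrix(N, F, l, r, b, t):
    """Perspective (viewing-cone) matrix of Eq. (5).

    N, F: distances from the eye to the front and back clipping planes;
    l, r, b, t: left/right/lower/upper boundaries of the projection plane.
    """
    a = -(F + N) / (F - N)
    bb = -2.0 * N * F / (F - N)
    return np.array([[2.0 * N / (r - l), 0.0,               0.0,  0.0],
                     [0.0,               2.0 * N / (t - b), 0.0,  0.0],
                     [0.0,               0.0,               a,    bb],
                     [0.0,               0.0,              -1.0,  0.0]])

def project(P, point3d):
    """Eq. (6): multiply a homogeneous model point by the matrix, then divide
    by the last coordinate (-z) to land on the virtual imaging plane."""
    x, y, z = point3d
    hom = P @ np.array([x, y, z, 1.0])
    return hom[:2] / hom[3]

P = frustum_matrix(N=1.0, F=50.0, l=-1.0, r=1.0, b=-0.75, t=0.75)
print(project(P, (0.3, 0.2, -5.0)))   # normalized coordinates on the virtual plane
```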
D. 3D Texture Mapping
After the above steps, the coordinates of each pixel in the
3D ship model are obtained. If only these points are mapped,
then the resulting surround image is not natural and will
cause a loss of image information. Therefore, texture mapping
is employed here. Compared with the traditional mapping method, texture mapping is a more sophisticated graphics technique that is commonly used to express geometric and lighting details through texture. The generated surround image will be more vivid and more natural.
Texture mapping is the process that maps the pixels of a
2D texture plane to a 3D surface. This process is similar to
placing an image onto the surface of a 3D object to enhance
the sense of reality. The core of this method is the introduction of a simple 3D surface as an intermediate medium. The basic process can be accomplished through the following two steps [14], [15]:
$$(u, v) \rightarrow (x', y', z') \rightarrow (x, y, z). \tag{7}$$

(u, v) are the coordinates on the 2D plane, (x', y', z') are the coordinates on the simple 3D object surface, and (x, y, z) are the coordinates on the 3D model.
First, the 2D texture is mapped to a simple 3D object surface, such as a sphere, cube, cylinder and so on.
The following mapping is then established: T(u, v) → T(x', y', z'). This mapping is called the S mapping, which maps the 2D texture to a sphere with a radius of R. The mapping process is shown in Fig. 11.
$$\begin{cases}
x' = R \cos\alpha \sin\beta,\\
y' = R \sin\alpha \sin\beta,\\
z' = R \cos\beta.
\end{cases} \tag{8}$$
P is a point on the sphere, as shown in Fig. 11. By projecting the line OP onto the XOY plane, α is the angle between the projection line and the X axis. The angle between the line OP and the Z axis is β, where 0 ≤ α ≤ 2π, 0 ≤ β ≤ π.
Subsequently, the texture on the surface of the intermediate object is mapped to the surface of the final object, as shown in Fig. 12. The intersection of the intermediate object and the ray that links point O and (x, y, z) is (x', y', z'). This point is regarded as the mapping point. The process described above is the O mapping: T(x', y', z') → O(x, y, z).
Through the above two steps, the texture on the 2D plane
can be mapped to the surface of the 3D ship model through
a spherical surface. The image distortion can also be reduced,
and the image information can be preserved to the greatest extent [14], [15].
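The two-step mapping can be sketched as follows for a single model vertex. The texture size and the linear angle-to-pixel conversion are assumptions used only to make the example concrete.

```python
import numpy as np

def texcoord_via_sphere(vertex, R=1.0, tex_w=512, tex_h=512):
    """Two-step mapping sketch: the ray from the origin O through a model
    vertex meets the intermediate sphere of radius R at (x', y', z') (the
    O mapping); that sphere point is then converted back to the angles
    (alpha, beta) of Eq. (8) and to 2D texture coordinates (the inverse of
    the S mapping)."""
    v = np.asarray(vertex, dtype=float)
    p = R * v / np.linalg.norm(v)                # intersection with the sphere
    x, y, z = p
    alpha = np.arctan2(y, x) % (2.0 * np.pi)     # angle in the XOY plane
    beta = np.arccos(np.clip(z / R, -1.0, 1.0))  # angle from the Z axis
    u = alpha / (2.0 * np.pi) * (tex_w - 1)      # assumed linear mapping
    w = beta / np.pi * (tex_h - 1)
    return u, w

print(texcoord_via_sphere((2.0, 1.0, 0.5)))
```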
E. Image Fusion
Through the calculation of the camera parameters and the texture mapping rules, the relative relationship between adjacent images has been determined. By projecting the 3D ship model onto the first quadrant, we can observe that the views of adjacent lenses have overlapping areas. To reduce the stitching seams in the surround image and make the image more natural, it is necessary to fuse the overlapping regions of the image [16]–[18].
For simplicity, we use alpha fusion here. As shown
in Fig. 13, there are two boundary lines: the front region
segmentation line l and the right visual region segmentation line m.

Fig. 13. The projection of the model in the first quadrant.
Fig. 14. Comparison of the two algorithms. (a) Traditional algorithm. (b) Proposed algorithm.
TABLE I. COMPARISON OF OPTICAL CENTER VALUES.

For the front region, the weight of the pixel value
varies from 1 to 0 from l to m. For the right visual region, the weight of the pixel value varies from 1 to 0 from m to l. For point A in the model, the pixel value can be calculated using the following formula:
$$\begin{cases}
\alpha = \arctan\dfrac{|y| - w/2}{|x| - l/2},\\[1ex]
P_{front} = \dfrac{\alpha - \theta_r}{\theta_o},\\[1ex]
P_{right} = 1 - P_{front},\\
C_A = P_{front} \times C_{front} + P_{right} \times C_{right}.
\end{cases} \tag{9}$$
In the above formula, w is the vehicle's width, and l is the vehicle's length. θ_r and θ_o are shown in the figure. For the fish-eye lens that we used, θ_o of our system is 15°, θ_f is 41.5°, and θ_r is 33.5°. C_front and C_right represent the pixel values of point A in the front region and the right visual region, respectively. P_front and P_right are the weights of C_front and C_right, respectively, with 0 ≤ P_front ≤ 1 and 0 ≤ P_right ≤ 1. This algorithm has a good fusion effect. Moreover, the image output by the system exhibits a naturally connected and smooth transition. It also eliminates the brightness differences between images, and the visual effects are greatly improved [19].
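A per-pixel sketch of the fusion rule in (9) is shown below. The vehicle width and length and the sample pixel values are placeholders, and the clipping of P_front to [0, 1] simply enforces the stated constraint 0 ≤ P_front ≤ 1.

```python
import numpy as np

def fuse_overlap_pixel(x, y, c_front, c_right,
                       veh_w=1.8, veh_l=4.6,
                       theta_r=np.deg2rad(33.5), theta_o=np.deg2rad(15.0)):
    """Alpha fusion of Eq. (9) for a point A = (x, y) in the front/right
    overlap region.  veh_w and veh_l are assumed vehicle dimensions;
    c_front and c_right are the pixel values contributed by the front and
    right cameras, respectively."""
    alpha = np.arctan((abs(y) - veh_w / 2.0) / (abs(x) - veh_l / 2.0))
    p_front = np.clip((alpha - theta_r) / theta_o, 0.0, 1.0)
    p_right = 1.0 - p_front
    return p_front * np.asarray(c_front, float) + p_right * np.asarray(c_right, float)

# Blend an overlap point roughly halfway between the two segmentation lines.
print(fuse_overlap_pixel(3.0, 1.6, c_front=[200, 180, 160], c_right=[180, 170, 150]))
```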
After the above steps, the 3D surround view is completed.
In the next section, experimental results will be presented to
demonstrate the effectiveness of the proposed algorithm.
IV. EXPERIMENTAL RESULTS
In this section, we present some results from our experiments with the aforementioned algorithm and the entire
system.
A. Testing the Proposed Calibration Algorithm With
the Traditional Calibration Algorithm
Before the cars leave the manufacturing process, some
required checks and tests must be completed very quickly.
In this process, less human intervention is preferred. However,
the traditional calibration algorithm [10] needs more than
one image to complete calibration, thus requiring the shifting
of cars or the calibration board a few times. Hence, traditional algorithms are not well suited for this application case.
In contrast, the calibration algorithm proposed in this paper
adopted a special calibration board. It can use only one image
to rectify the distortion of images and obtain a more accurate
value of the optical center. This strength suits our system and other systems with strict real-time requirements well. The following part will
compare the traditional algorithm mentioned in [10] with the
proposed algorithm, and the results will be presented.
We implemented the camera calibration in an indoor environment where the light is mild. This environment helps to
more easily detect corners. The rectified image processed using
our algorithm is shown in Fig. 14(b), and it is compared
with the traditional algorithm in Fig. 14(a). According to the
images, we can clearly observe that the image processed using
the proposed algorithm rectified the distortion well. It not only
detects the corners of the middle checkerboard, but also detects
the corners of the rectangular calibration board at the edge
of the image. However, the traditional algorithm cannot detect
the corners of the rectangular calibration board. Therefore, the
rectified image in Fig. 14(a) also contains distortion at the edge
of the image. Only the central region is well calibrated.
The second advantage of our calibration algorithm is that
it obtains more accurate values of the optical center, which is
the most important parameter in camera calibration. A more
accurate optical center value will provide a better camera
calibration result. The optical center value and some other
parameters of the fish-eye lens were already presented in
the table of camera parameters. The value in the table can
be used as a criterion for evaluating the camera calibration
performance. The results of the proposed algorithm and the
traditional algorithm are presented in Table I and Table II.
There is a reference value of optical center in the parameter
table given by the manufacturer. We use four images to test
the performances of the proposed algorithm and the traditional
algorithm. By carefully comparing and analyzing the data
in Tables I and II, we observe that the results of the two
calibration algorithms are quite close to the reference values.
However, the traditional algorithm [20] always has a relatively
larger deviation compared with the proposed scheme. For the
average deviation, both the x coordinate and the y coordinate of the presented algorithm are smaller than those of the traditional algorithm. Thus, the values of the optical center obtained from the proposed algorithm are more accurate. Employing this algorithm in the system can make the camera calibration results more precise. Most importantly, the proposed algorithm requires less human intervention.

Fig. 15. Images captured by the fish-eye lenses. (a) Front side of the car. (b) Back side of the car. (c) Left side of the car. (d) Right side of the car.
Fig. 16. Surround view in an underground garage. (a) Front view. (b) Rear view. (c) Right view. (d) Left view.
Fig. 17. Surround view from different angles on a highway. (a) Rear view. (b) Left-rear view. (c) Right view. (d) Front view.
TABLE II. ERROR ANALYSIS.
B. Testing the System While the Car is
Static or in Motion
First, a brief introduction to the implementations and contents of the experiments will be given. The embedded platform
of our system is Freescale i.MX6Q, which is a quad-core
processor with 1 MB L2 cache. In addition, the processor
has four shader core 3D graphics acceleration engines and
two 2D graphics acceleration engines. Four fish-eye lenses are
mounted on the front bumper, rear bumper, and on each side
under the mirrors. To improve the imaging result, the output
resolution of the fish-eye lenses is 720×576. If larger sizes are
required, other types of cameras or image resizing algorithms
should be adopted [21]. The checkerboard of the experiment
is 5×7, and the size of each square is 20 cm × 20 cm. The size of the large rectangle is 100 cm × 100 cm, and the distance between the rectangle and the checkerboard is 40 cm. We use
a Honda CRV as the test vehicle in the experiment. Both static
and moving cases are adopted to test the performance of our
system.
We parked the car in an underground garage where the light
is dim. Fig. 15 shows the images that were acquired by the
fish-eye lenses. Using the proposed algorithm to process the
images, the final surround view is presented as follows.
The images captured by the fish-eye lenses all have distortion. The farther a point is from the optical center, the larger the distortion. Meanwhile, there are overlapping regions
in adjacent visual regions. If we only piece those images
together, then the information in the image will mislead
drivers. If the algorithm in this paper is employed, the results
are considerably different. As shown in Fig. 16, the distortion
has already been corrected, and the texture has been mapped
to our 3D model appropriately. The overlapped area has been
greatly improved in terms of color, brightness and so on after
the fusion process. Overall, the 3D surround view system can
help drivers obtain knowledge of their surroundings naturally.
The output of the 3D surround view is on the display screen.
We tested the system on a spacious highway and on a crowded
city road. The pictures of different visual angles of different
environments are shown in Fig. 17 and Fig. 18. We also
provide a test video on the website (https://pan.baidu.com/s/1dELPgrv), which better illustrates the test results. The results are still satisfactory.

Fig. 18. Surround view from different angles on a city road. (a) Rear view. (b) Left-rear view. (c) Right view. (d) Front view.
Fig. 19. Comparison with other systems. (a) Fujitsu. (b) Delphi automotive. (c) Our system.
C. Testing the System With Other Systems
of the Same Type
There have been some reports of 360-degree surround view systems in the literature or on websites. A 360-degree
wrap-around video imaging technology was proposed by
Fujitsu in [5]. The image is shown in Fig. 19(a). Another
360-degree surround view system mentioned in [4] is shown
in Fig. 19(b). These two figures are utilized here to perform
a comparison with ours.
The technology shown in Fig. 19(a) can generate a
panoramic image around the vehicle, but the image is not
natural, and there is a large difference in brightness and color
between the generated image and reality. The system shown
in Fig. 19(b) generates a more natural image, but the scene that
the driver can see above the ground is very limited compared
with our system, as shown in Fig. 19(c). Our algorithm can
generate a more natural image, and drivers can obtain a
broader view. It helps drivers obtain more accurate information about their driving environment and allows them to respond in advance.
We implemented the camera calibration in an indoor environment where the light is well balanced. This environment condition helps to detect corners easily such that we
can perform a highly accurate calibration. However, vehicles
will inevitably encounter shaking while they are in motion.
An important and necessary task is to eliminate the effects of
car turbulence on the fish-eye lenses. For convenience, we provide a mobile application for users. If there are just some
trivial position changes with the fish-eye lenses, users can use
the mobile application connected with the system to adjust
the captured images slightly without a second calibration.
However, when the positions of the lenses encounter considerable changes, the users cannot complete a good 3D displaying process autonomously. Under these circumstances, they
have to recalibrate the lenses through the manufacturer or by themselves. Note that the mounting of the lenses will meet vehicle-grade requirements in production; hence, it is much more robust to vibration in practice.
Real-time efficiency is one of the most important evaluation indicators of the system; thus, the time costs are also
provided here. The multi-threaded camera calibration takes
approximately 10 seconds to calibrate all the cameras, which
could be finished off-line. Another 8 seconds is required to
read the data and load the model. Including other time costs,
it altogether costs approximately 20 seconds to finish the
stitching. This meets the low time-cost requirements of the
system. Considering all the results, we find that both in
the static environment and the moving vehicle, this algorithm
can restore the scene around the vehicle well; preserve the
light, shade and concave-convex information well; and obtain
the natural surround view with almost no traces of splicing.
V. CONCLUSION
This paper presents an advanced driver assistance system
based on 3D surround view technology. With four fish-eye
lenses mounted on a vehicle, we implement the entire framework, which includes the special calibration, 3D ship model
construction, texture mapping and image fusion processes. The
proposed algorithm is very efficient, and it can be applied
in embedded systems. The entire system can adapt well to
changes in the environment and switch to any desired view
angle. The experimental results show that the calibration
algorithm presented in this paper obtains more accurate results
than the traditional algorithm. Moreover, calibration based on
a single image enables this system to not only be used in
advanced driver assistance systems but also in video surveillance and applications where real-time demand is required.
REFERENCES
[1] P. Koopman and M. Wagner, “Autonomous vehicle safety: An interdisciplinary challenge,” IEEE Intell. Transp. Syst. Mag., vol. 9, no. 1,
pp. 90–96, Jan. 2017.
[2] Y. Yuan, D. Wang, and Q. Wang, “Anomaly detection in traffic scenes via
spatial-aware motion reconstruction,” IEEE Trans. Intell. Transp. Syst.,
vol. 18, no. 5, pp. 1198–1209, Mar. 2017.
[3] B. Zhang et al., “A surround view camera solution for embedded systems,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops,
Jun. 2014, pp. 676–681.
[4] M. Yu and G. Ma, “360° surround view system with parking guidance,”
Driver Assist. Syst., vol. 7, no. 1, pp. 19–24, 2014.
[5] “360° wrap-around video imaging technology ready for integration
with fujitsu graphics SoCs,” Fujitsu Microelectron. America, Inc.,
Sunnyvale, CA, USA, Tech. Rep., Feb. 2011. [Online]. Available:
https://www.fujitsu.com/us/Images/360_OmniView_AppNote.pdf
[6] M. Lin, G. Xu, X. Ren, and K. Xu, “Cylindrical panoramic image
stitching method based on multi-cameras,” in Proc. IEEE Int. Conf.
Cyber Technol. Autom., Control, Intell. Syst., Jun. 2015, pp. 1091–1096.
[7] Z. Hu, Y. Li, and Y. Wu, “Radial distortion invariants and lens evaluation
under a single-optical-axis omnidirectional camera,” Comput. Vis. Image
Understand., vol. 126, no. 2, pp. 11–27, 2014.
[8] M. Schönbein, T. Strauß, and A. Geiger, “Calibrating and centering quasi-central catadioptric cameras,” in Proc. Int. Conf. Robot.
Autom. (ICRA), May 2014, pp. 4443–4450.
[9] C. S. Fraser, “Automatic camera calibration in close range photogrammetry,” Photogramm. Eng. Remote Sens., vol. 79, no. 4, pp. 381–388,
2013.
[10] Z. Zhang, “A flexible new technique for camera calibration,” IEEE Trans.
Pattern Anal. Mach. Intell., vol. 22, no. 11, pp. 1330–1334,
Nov. 2000.
[11] Z. Zhang, “Camera calibration with one-dimensional objects,” IEEE
Trans. Pattern Anal. Mach. Intell., vol. 26, no. 7, pp. 892–899, Jul. 2004.
[12] Q. Wang, C. Zou, Y. Yuan, H. Lu, and P. Yan, “Image registration
by normalized mapping,” Neurocomputing, vol. 101, pp. 181–189,
Feb. 2013.
[13] H.-T. Chen, “Geometry-based camera calibration using five point correspondences from a single image,” IEEE Trans. Circuits Syst. Video
Technol., to be published.
[14] E. A. Bier and K. R. Sloan, “Two-part texture mappings,” IEEE Comput.
Graph. Appl., vol. 6, no. 9, pp. 40–53, Sep. 1986.
[15] P. J. Besl, “Geometric modeling and computer vision,” Proc. IEEE,
vol. 76, no. 8, pp. 936–958, Aug. 1988.
[16] M. A. Ruzon and C. Tomasi, “Alpha estimation in natural images,” in
Proc. CVPR, vol. 1. 2000, pp. 18–25.
[17] A. Levin, D. Lischinski, and Y. Weiss, “A closed-form solution to natural
image matting,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 2,
pp. 228–242, Feb. 2008.
[18] M. Salvi and K. Vaidyanathan, “Multi-layer alpha blending,” in Proc.
Meet. ACM SIGGRAPH Symp. Interact. 3D Graph. Games, 2014,
pp. 151–158.
[19] T. Stathaki, Image Fusion: Algorithms and Applications. San Francisco,
CA, USA: Academic, 2008.
[20] Z. Zhang, “Flexible camera calibration by viewing a plane from
unknown orientations,” in Proc. 7th IEEE Int. Conf. Comput. Vis., vol. 1.
Sep. 1999, pp. 666–673.
[21] Q. Wang and Y. Yuan, “High quality image resizing,” Neurocomputing, vol. 131, pp. 348–356, Jan. 2014. [Online]. Available:
http://dx.doi.org/10.1016/j.neucom.2013.09.032
Yao Zhao (M’06–SM’12) received the B.S. degree
from the Radio Engineering Department, Fuzhou
University, Fuzhou, China, in 1989, and the M.E.
degree from the Radio Engineering Department,
Southeast University, Nanjing, China, in 1992, and
the Ph.D. degree from the Institute of Information Science, Beijing Jiaotong University (BJTU),
Beijing, China, in 1996. He became an Associate
Professor with BJTU in 1998, where he became a
Professor in 2001. From 2001 to 2002, he was a
Senior Research Fellow with the Information and
Communication Theory Group, Faculty of Information Technology and
Systems, Delft University of Technology, Delft, The Netherlands. He is
currently the Director of the Institute of Information Science, BJTU. His
current research interests include image/video coding, digital watermarking
and forensics, and video analysis and understanding. He is also leading
several national research projects from the 973 Program, the 863 Program,
and the National Science Foundation of China. He serves on the editorial boards of several international journals, including as an Associate
Editor of the IEEE TRANSACTIONS ON CYBERNETICS, an Associate Editor of the IEEE SIGNAL PROCESSING LETTERS, an Area Editor of the
Signal Processing: Image Communication (Elsevier), and an Associate Editor
of Circuits, System, and Signal Processing (Springer). He was named a
Distinguished Young Scholar by the National Science Foundation of China
in 2010, and was elected as a Chang Jiang Scholar of Ministry of Education
of China in 2013.
Yi Gao was born in Yichang, China, in 1996. She is
currently pursuing the bachelor’s degree in computer
science with Beijing Jiaotong University, China.
She is currently involved in multimedia information
processing with the Institute of Information Science,
Beijing Jiaotong University. Her interests include
image processing and data analysis.
Shikui Wei received the Ph.D. degree in signal
and information processing from Beijing Jiaotong
University (BJTU), Beijing, China, in 2010. From
2010 to 2011, he was a Research Fellow with
the School of Computer Engineering, Nanyang
Technological University, Singapore. He is currently a Professor with the Institute of Information
Science, BJTU. His research interests include computer vision, image/video analysis and retrieval, and
copy detection.
Chunyu Lin was born in Liaoning, China.
He received the Ph.D. degree from Beijing
Jiaotong University, Beijing, China, in 2011.
From 2009 to 2010, he was a Visiting Researcher
with the ICT Group, Delft University of Technology,
Delft, The Netherlands. From 2011 to 2012, he was
a Post-Doctoral Researcher with the Multimedia
Laboratory, Gent University, Gent, Belgium. His
current research interests include image/video compression and robust transmission, 3-D video coding,
panorama, and VR video processing.
Xin Wang was born in Xianghe, China, in 1995.
He is currently pursuing the bachelor’s degree in
computer science with Beijing Jiaotong University,
China. He is currently involved in multimedia
information processing with the Institute of Information Science, Beijing Jiaotong University. His
interests include image processing, deep learning,
and computer vision.
Qi Huang received the master’s degree of communication and information systems from Beijing
Jiaotong University in 2008. He was with China
Mobile (Beijing) Ltd., from 2008 to 2016.
Since 2016, he has been serving as the CEO of
Beijing Xinyangquan Electronic Technology Co.,
Ltd.