Original PDF Flash format AR-Bridge-Builder:-Real-Time-Vectorisation-of-Freehand-Sketched-...  


AR Bridge Builder: Real Time Vectorisation Of Freehand Sketched ...

F. Petzold, J. Pfeil, C. Riechert, R. Green, ‘AR Bridge Builder: Real-Time Vectorisa-
tion of Freehand Sketched Structures for an AR Application’, Proceedings of Image and
Vision Computing New Zealand 2007, pp. 192–197, Hamilton, New Zealand, December
2007.
AR Bridge Builder: Real-time Vectorisation of Freehand
Sketched Structures for an AR Application
F. Petzold1, J. Pfeil1, C. Riechert1 and R. Green1
1Dept. Computer Science, University of Canterbury, Christchurch, New Zealand.
Web: http://arbridgebuilder.grundeis.net
Abstract
In this paper the idea of an augmented reality game involving the overlay of information on freehand bridge
drawings is presented. This is supported by a proposed algorithm which merges vectorisation and corner
detection, followed by a post-processing of the resulting bridge graph. Different implemented approaches
to recognise the sketches are discussed. The Hough transform is found not to be well suited to cope with
the problem. Whereas simple vectorisation gives better results but still has some drawbacks in terms of
precision and reliability. This paper proposes a hybrid algorithm to overcome these drawbacks by using
corner detection in addition to vectorisation.
Keywords: vectorisation, freehand sketches, skeletonisation, hough transform, thinning, augmented reality
overlaid in real time on the video. A selection tool
is used to choose a track that a simulated train then
runs over. This allows the user to playfully discover
more about the statics of the drawn bridge.
As a first step, the elements of the drawn bridge
have to be recognised in the video image. In the
game a bridge only consists of connected straight
lines. Thus, as a final result the bridge sketch
should be represented as a graph containing the
joint positions of the bridge and the trusses be-
tween them.
In this paper several possible algorithms to achieve
this are investigated. The first approach uses the
Figure 1: Recognised bridge
Hough transform to detect the trusses. Due to sev-
1 Introduction
eral draw-backs concerning robustness to noise and
bad recognition of slightly curved lines, the Hough
Sketching is a very straight forward way to ex-
transform approach is found inappropriate. It also
press and test out ideas. Recognised structures
fails to extract the main information about the
and features of sketches can be used to generate
bridge structure, which does not lie in the trusses,
additional information which is overlayed on the
but rather in joint positions and their connectivity.
video image. The idea presented in this paper is
to overlay physics and game information on real
A second approach using thinning and vectorisa-
world drawings. The proposed algorithm forms the
tion was developed, which could directly extract
base of an Augmented Reality (AR) game that can
the joint positions. Thinning and vectorisation
be played on a normal white board or on paper. It
reliably preserve joint connectivity but may alter
could be used for education, giving learners instant
joint positions due to known issues of several thin-
feedback on the validity of their ideas.
ning algorithms, which will be explained. To over-
come the inaccurate extraction of joint positions,
The game is about building bridges that can sup-
we decided to implement a hybrid approach, which
port itself and traffic. A bridge is drawn on a
combines vectorisation with better joint position
whiteboard or on paper. The user points the cam-
recognition using the Harris corner detector.
era at the sketch and starts the simulation – re-
The final algorithm was realised using C++ and
vealing the statics and game information which is
the Open Computer Vision Library (OpenCV) [1].
192

1.1 Related Work
The proposed AR game is heavily inspired by a
PC game called BridgeBuilder [2] which allows the
player to construct a bridge with small straight
trusses which is then physically simulated and tested
by a train running over it.
The illustration of physics information using free-
hand sketches has been demonstrated by Microsoft
Research’s Physics Illustrator [3]. This system needs
a special input device to capture the strokes. It
can either be run on a tablet PC or use a digital
whiteboard combined with a projector.
Algorithms for freehand sketch recognition exist
but usually the interest lies in labelling strokes ac-
Figure 2: Big image: Hough space, Small images:
cording to some model in order to extract meaning
Original image and coarsely recognised bridge
from a sketch (e.g. [4]). For our application the
exact position and connectivity of structures is im-
they can no longer be extracted from the Hough
portant and the underlying model is rather simple.
space. Longer lines outvote shorter ones. Paral-
Also the most robust sketch recognition has been
lel lines which lie close together cannot be distin-
achieved with special input devices capturing the
guished.
process of drawing (e.g. [5, 6]). At least in the
For pre-processing the images, the Canny edge de-
general case we do not have this additional time
tector is used. It enables further processing to be
information which makes reliable detection much
fairly independent of varying lighting conditions
harder.
[11].
Recognition of scanned documents like maps and
The Hough algorithm is very noise prone. The
technical drawings is another area where robust
noise of a fixed camera is enough to make stable
vectorisation is required [7, 8]. But here the reso-
line recognition on live video impossible. Lines
lution is usually high and the lighting is ideal.
disappear due to noise making it necessary to lower
The novelty of the proposed application is the use
the recognition threshold. This results in line bun-
of a simple camera which does not give the same
dles instead of single lines. A clustering algorithm
amount of information as a tablet PC or digital
was implemented which groups lines by their prox-
whiteboard. In order to be able to use it in an AR
imity in the Hough space. From every group the
application the computer vision algorithm has to
line with the highest number of votes is chosen.
be very robust under different lighting conditions,
With this clustering algorithm it was possible to
varying line thickness and discontinuities in the
get a stable real-time recognition of structures con-
drawn lines.
sisting of a small number of lines. These lines were
intersected to obtain a list of candidate junction
2 Hough Transform
points. If a rectangular area of the thresholded
input image between two candidate points contains
The Hough transform [9] can be used for straight
a number of white pixels that is higher than the
line detection. It transforms the image into Hough
minimum number of pixels a line through this area
space where lines are represented as points. The
would have, the two candidate points are consid-
coordinates of every point in the Hough space cor-
ered to be connected by a line in the real image.
respond to the distance and the angle of a nor-
mal representing one line in the original image.
2.1 Problems
The structure that is to be recognised is ideally
As soon as the drawings become more complex or
a straight line. Every white pixel in the original
the variety of line lengths too wide, the algorithm
image votes for every possible line that could go
is no longer able to extract all lines (see fig. 2). The
through it.
problem with the standard Hough transform lies in
There are several variations of the Hough trans-
the fact that pixels vote globally. The more lines
form. The Probabilistic Hough transform, for ex-
are in an image the harder it becomes to extract
ample, uses a selected subset of the input points to
single lines from the Hough space.
speed up the algorithm [10].
A possible solution could be a modified version of
The Hough transform has some known disadvan-
the Hough transform that divides the image into
tages and problems. If there are too many lines,
small regions. Each of these regions is transformed
193

Figure 3: Visualisation of Hough transform applied to
Figure 4: Necking artefact produced by the thinning
image subsections
algorithm
independently. In every of these Hough spaces the
the vectorisation algorithm is a graph representing
global maximum defines a line segment (fig. 3).
the drawing. It obviously contains nodes which
These local line segments could then be joined to-
do not correspond to junctions in the sketch. To
gether. This way, long lines cannot outvote shorter
remove non-critical points a polygonal approxima-
ones.
tion algorithm is applied.
Even if the problems inherent to the Hough trans-
An error function, shown in equation (1), calcu-
form could be overcome, it proved not to be the
lates the cost of removal for every point that has
best possible approach to the problem at hand.
two edges. The cost is the distance between each
Lines drawn are often slightly curved resulting in
node and the line connecting its neighbours. At
imprecise junction point detection. It could also
each iteration the point with the lowest cost is
be desirable to recognise the exact shape of drawn
removed and its neighbours are directly connected.
elements. An alternate approach overcoming the
The algorithm terminates if there are no more points
shortcomings of the Hough transform is discussed
undercutting a certain threshold.
in the next section.
3 Vectorisation
2
(P
C
n − Pn−1) · (Pn+1 − Pn−1)
n = Pn − Pn−1 −
2
Another possible method is to use skeletonisation
Pn+1 − Pn−1
to obtain a thin line which can then be vectorised.
(1)
This also enables the detection of arbitrary curved
Polygonal approximation yields a graph whose nodes
structures. The skeletonisation is done with a thin-
are either junctions or sharp bends in a line (e.g.
ning algorithm proposed by Zhang [12] after apply-
triangle).
ing an adaptive threshold to the original image (see
fig. 6). A great advantage of thinning over other
3.1 Problems
kinds of skeletonisation algorithms is its guaranty
of a connected skeleton.
There are two major drawbacks of this approach,
Since Zhang’s thinning algorithm does not always
concerning the thinning algorithm. Firstly, neck-
result in an 8-connected line, an algorithm to re-
ing artefacts appear at junctions where lines meet
move staircases in diagonal line segments, as pro-
at an acute angle (fig. 4). Secondly, extra line
posed by Holt [13], is applied. This results in a
segments are often created if the underlying shape
thinned 8-connected line which is ideal for vectori-
is not regular. These artefacts result in the intro-
sation. Although thinning is potentially slow, it
duction of additional junction points.
does not pose a problem in our case. The bridge
Furthermore the adaptive threshold used in the
structure does already consist of relatively thin lines, creation of the binary image can result in gaps in
which constrains the maximum number of itera-
the centre of the junction of thick lines (fig. 5).
tions necessary for the algorithm to converge.
This can make correct vectorisation impossible.
The vectorisation is performed by simply adding
Extra line segments introduced by the thinning al-
each line pixel as a node to a graph structure. If
gorithm can be addressed by removing nodes with
two line pixels are adjacent, their corresponding
a single edge, whereas the other artefacts cannot
nodes are connected by an edge. The output of
be as easily dealt with. Necking artefacts and
194

Figure 5: Artefacts caused by adaptive thresholding
Figure 7: Thinned drawing with overlaid feature points
node is created and all edges attached to nodes
Image
within a certain range are transferred to the newly
created node. The now unconnected nodes are
Adaptive Threshold
Feature Point Extraction
removed.
At this state, tightly connected groups of nodes can
Thinning
still exist. They are caused by holes in the lines
between feature points and cannot be removed by
Vectorisation
polygonal approximation, because they have more
than two edges. To break up these clusters a pro-
Polygonal Approximation
cessing step called ”artefact point removal” is ap-
plied. Every node not created by a feature point
is removed if another node is within a certain dis-
Raw Bridge Model
Feature Points
tance. The removed node’s edges are transferred
to the nearest neighbour. After breaking up these
Collect bridge nodes
clusters, the polygonal approximation algorithm is
around feature points
invoked again.
Artefact point removal
4.1 Corner Detection
Harris corner detection is a very common way to
Polygonal Approximation
obtain distinct features in an image. To obtain the
possible corner points the structure tensor T shown
Bridge Model
in equation (2) has to be calculated for each point
in the image. Ix and Iy are the derivatives of pixel
intensity in x and y direction, respectively.
Figure 6: The final algorithm
gaps introduced by adaptive thresholding result
in decreased reliability at junctions. Furthermore
T =
I2x
IxIy
I
(2)
this crude vectorisation algorithm can result in im-
xIy
I2y
precise junction positioning. To overcome these
Corners are detected by searching for local max-
drawbacks, we propose a hybrid solution, which
ima of R which is calculated by equation (3) with
uses Harris Corner detection in addition to vec-
κ ≈ 0.04.
torisation.
4 Final Algorithm
R = det(T ) − κ trace2(T )
(3)
Junctions in the drawing can be recognised very
5 Discussion
reliably using corner detection. The final algorithm
(see fig. 6) uses the Harris corner detector [14] to
The described algorithm was created to solve the
obtain a set of feature points. These features are
special problem of recognising bridge sketches on a
then used to collect bridge nodes in their neigh-
whiteboard. An algorithm suitable for our problem
bourhood (fig. 7). For each feature point a new
195

Figure 8: Reflections
Figure 10: Small structures
effectively prevents the recognition of any struc-
tures smaller than this minimal distance. A marginal
case can be seen in fig. 10.
The mentioned limitations prevent the proposed
algorithm from being a general solution for straight
line sketch recognition. But it works accurately,
reliably and in real time within the constraints of
the project it was built for.
6 Conclusion and Outlook
Figure 9: Thick lines
In this paper we presented the first step in the
does not only have to be fast but also more ro-
design of an augmented reality game involving the
bust than algorithms working on still images from
overlay of statics information on freehand bridge
scanners. Video camera images are usually of low
drawings. This first and most important step was
resolution and contain significant noise. Further-
the development of an algorithm for the recog-
more, fast movements of the camera yield motion
nition of bridge sketches. The Hough transform
blur on the image. These constraints make reliable
proved to be inappropriate to extract the wanted
recognition a challenging problem.
information. Simple vectorisation gives better re-
sults but still has some drawbacks in terms of pre-
To test the algorithm we used a fixed set of still im-
cision and reliability. The presented hybrid algo-
ages exhibiting critical properties like poor lighting
rithm overcomes these drawbacks by using corner
and marginal cases such as short and long lines,
detection in addition to vectorisation.
acute and obtuse angles and thin and thick lines.
In addition we continuously tested with live video.
The proposed solution is able to recognise bridge
sketches for a broad scale of lighting conditions
The algorithm can deal with light reflections on
and distances and works in real time. Further
the whiteboard. But there are also cases like in
improvements can be achieved by adding variable
fig. 8 where reflections are too strong for proper
thresholds and scale spaces. This would allow for
recognition. By decreasing the kernel size of the
a large range of distances, differently sized or even
adaptive threshold the algorithm becomes less af-
mixed-sized structures.
fected by brightness variations. But smaller kernels
can cause holes in thick lines. The kernel size has
The result of the presented algorithm is a model of
to be chosen in correspondence to the thickness of
the sketched bridge which is used by the AR Bridge
the lines and the distance to the drawing plane.
Builder game to simulate the bridge’s physics and
This should be done dynamically, which is not yet
overlay the simulation results on live video.
implemented. As a consequence we can currently
only detect 1-5 pixel wide lines. Fig. 9 shows that
References
thick lines cause the algorithm to fail.
Corner detection fails for small distances to the
[1] Intel, “OpenCV Open Source Computer
drawing plane. This could probably be overcome
Vision
Library,”
http://sourceforge.
by using scale spaces. Furthermore the threshold
net/projects/opencvlibrary/,
visited
on
used to collect nodes around the corner points im-
4/10/2007.
plies a minimal distance between junctions. This
196

[2] Chronic Logic, “Bridge Builder,” http://
[14] C. Harris and M. Stephens, “A combined cor-
www.chroniclogic.com/, visited on 9/9/2007.
ner and edge detector,” in Proceedings of The
Fourth Alvey Vision Conference, Manchester,
[3] Microsoft Research,
“Microsoft Physics
1988, pp. 147–152.
Illustrator,”
http://www.microsoft.
com/downloads/details.aspx?familyid=
56347faf-a639-4f3b-9b87-1487fd4b5a53&displaylang=
en, visited on 9/9/2007.
[4] J. Mahoney and M. Fromherz, “Three
main concerns in sketch recognition and an
approach to addressing them,” in AAAI
Spring Symposium on Sketch Understanding,
2002, pp. 105–112. [Online]. Available:
citeseer.ist.psu.edu/mahoney02three.html
[5] B. Yu, “Recognition of freehand sketches us-
ing mean shift,” in IUI ’03: Proceedings of
the 8th international conference on Intelligent
user interfaces. New York, NY, USA: ACM
Press, 2003, pp. 204–210.
[6] T. M. Sezgin and R. Davis, “Hmm-based
efficient sketch recognition,” in IUI ’05: Pro-
ceedings of the 10th international conference
on Intelligent user interfaces. New York, NY,
USA: ACM Press, 2005, pp. 281–283.
[7] R. Janssen and A. Vossepoel, “Adaptive vec-
torization of line drawing images,” CVIU,
vol. 65, no. 1, pp. 38–56, January 1997.
[8] Jinyang and Yumei, “Automatic extraction
of contour lines from scanned topographic
map,” in Geoscience and Remote Sensing
Symposium, 2004. IGARSS ’04. Proceedings.,
vol. 5. IEEE International, 2004, pp. 2886–
2888.
[9] P. Hough, “Method and means for recognizing
complex patterns,” US Patent No. 3069654,
1962.
[10] J. Matas, C. Galambos, and J. Kittler, “Pro-
gressive probabilistic hough transform.” 1998.
[Online]. Available: http://dblp.uni-trier.de/
db/conf/bmvc/bmvc1998.html#MatasGK98
[11] J. Canny, “A computational approach to edge
detection,” IEEE Trans. Pattern Anal. Mach.
Intell., vol. 8, no. 6, pp. 679–698, 1986.
[12] T. Y. Zhang and C. Y. Suen, “A fast par-
allel algorithm for thinning digital patterns,”
Commun. ACM, vol. 27, no. 3, pp. 236–239,
1984.
[13] C. M. Holt, A. Stewart, M. Clint, and R. H.
Perrott, “An improved parallel thinning algo-
rithm,” Commun. ACM, vol. 30, no. 2, pp.
156–160, 1987.
197