Table of Contents
Preface
1. Introduction 
1.1 Overview 
1.2 Human and computer vision 
1.3 The human vision system 
1.3.1 The eye 
1.3.2 The neural system 
1.3.3 Processing 
1.4 Computer vision systems 
1.4.1 Cameras 
1.4.2 Computer interfaces 
1.5 Processing images 
1.5.1 Processing 
1.5.2 Hello Python, hello images! 
1.5.3 Mathematical tools 
1.5.4 Hello Matlab 
1.6 Associated literature 
1.6.1 Journals, magazines and conferences 
1.6.2 Textbooks 
1.6.3 The web 
1.7 Conclusions 
References 
2. Images, sampling and frequency domain processing 
2.1 Overview 
2.2 Image formation 
2.3 The Fourier Transform 
2.4 The sampling criterion 
2.5 The discrete Fourier Transform 
2.5.1 One-dimensional transform 
2.5.2 Two-dimensional transform 
2.6 Properties of the Fourier Transform 
2.6.1 Shift invariance 
2.6.2 Rotation 
2.6.3 Frequency scaling 
2.6.4 Superposition (linearity) 
2.6.5 The importance of phase 
2.7 Transforms other than Fourier 
2.7.1 Discrete cosine transform 
2.7.2 Discrete Hartley Transform 
2.7.3 Introductory wavelets 
2.7.3.1 Gabor Wavelet 
2.7.3.2 Haar Wavelet 
2.7.4 Other transforms 
2.8 Applications using frequency domain properties 
2.9 Further reading 
References 
3. Image processing 
3.1 Overview 
3.2 Histograms 
3.3 Point operators 
3.3.1 Basic point operations 
3.3.2 Histogram normalisation 
3.3.3 Histogram equalisation 
3.3.4 Thresholding 
3.4 Group operations 
3.4.1 Template convolution 
3.4.2 Averaging operator 
3.4.3 On different template size 
3.4.4 Template convolution via the Fourier transform 
3.4.5 Gaussian averaging operator 
3.4.6 More on averaging 
3.5 Other image processing operators 
3.5.1 Median filter 
3.5.2 Mode filter 
3.5.3 Nonlocal means 
3.5.4 Bilateral filtering 
3.5.5 Anisotropic diffusion 
3.5.6 Comparison of smoothing operators 
3.5.7 Force field transform 
3.5.8 Image ray transform 
3.6 Mathematical morphology 
3.6.1 Morphological operators 
3.6.2 Grey level morphology 
3.6.3 Grey level erosion and dilation 
3.6.4 Minkowski operators 
3.7 Further reading 
References 
4. Low-level feature extraction (including edge detection) 
4.1 Overview 
4.2 Edge detection 
4.2.1 First-order edge detection operators 
4.2.1.1 Basic operators 
4.2.1.2 Analysis of the basic operators 
4.2.1.3 Prewitt edge detection operator 
4.2.1.4 Sobel edge detection operator 
4.2.1.5 The Canny edge detector 
4.2.2 Second-order edge detection operators 
4.2.2.1 Motivation 
4.2.2.2 Basic operators: The Laplacian 
4.2.2.3 The Marr-Hildreth operator 
4.2.3 Other edge detection operators 
4.2.4 Comparison of edge detection operators 
4.2.5 Further reading on edge detection 
4.3 Phase congruency 
4.4 Localised feature extraction 
4.4.1 Detecting image curvature (corner extraction) 
4.4.1.1 Definition of curvature 
4.4.1.2 Computing differences in edge direction 
4.4.1.3 Measuring curvature by changes in intensity (differentiation) 
4.4.1.4 Moravec and Harris detectors 
4.4.1.5 Further reading on curvature 
4.4.2 Feature point detection; region/patch analysis 
4.4.2.1 Scale invariant feature transform 
4.4.2.2 Speeded up robust features 
4.4.2.3 FAST, ORB, FREAK, LOCKY and other keypoint detectors 
4.4.2.4 Other techniques and performance issues 
4.4.3 Saliency 
4.4.3.1 Basic saliency 
4.4.3.2 Context aware saliency 
4.4.3.3 Other saliency operators 
4.5 Describing image motion 
4.5.1 Area-based approach 
4.5.2 Differential approach 
4.5.3 Recent developments: deep flow, epic flow and extensions 
4.5.4 Analysis of optical flow 
4.6 Further reading 
References 
5. High-level feature extraction: fixed shape matching 
5.1 Overview 
5.2 Thresholding and subtraction 
5.3 Template matching 
5.3.1 Definition 
5.3.2 Fourier transform implementation 
5.3.3 Discussion of template matching 
5.4 Feature extraction by low-level features 
5.4.1 Appearance-based approaches 
5.4.1.1 Object detection by templates 
5.4.1.2 Object detection by combinations of parts 
5.4.2 Distribution-based descriptors 
5.4.2.1 Description by interest points (SIFT, SURF, BRIEF) 
5.4.2.2 Characterising object appearance and shape 
5.5 Hough transform
5.5.1 Overview 
5.5.2 Lines 
5.5.3 HT for circles 
5.5.4 HT for ellipses 
5.5.5 Parameter space decomposition 
5.5.5.1 Parameter space reduction for lines 
5.5.5.2 Parameter space reduction for circles 
5.5.5.3 Parameter space reduction for ellipses 
5.5.6 Generalised Hough transform 
5.5.6.1 Formal definition of the GHT 
5.5.6.2 Polar definition 
5.5.6.3 The GHT technique 
5.5.6.4 Invariant GHT 
5.5.7 Other extensions to the HT 
5.6 Further reading 
References 
6. High-level feature extraction: deformable shape analysis 
6.1 Overview 
6.2 Deformable shape analysis 
6.2.1 Deformable templates 
6.2.2 Parts-based shape analysis 
6.3 Active contours (snakes) 
6.3.1 Basics 
6.3.2 The Greedy Algorithm for snakes 
6.3.3 Complete (Kass) Snake implementation 
6.3.4 Other Snake approaches 
6.3.5 Further Snake developments 
6.3.6 Geometric active contours (Level Set-Based Approaches) 
6.4 Shape Skeletonisation 
6.4.1 Distance transforms 
6.4.2 Symmetry 
6.5 Flexible shape models - active shape and active appearance 
6.6 Further reading 
References 
7. Object description 
7.1 Overview and invariance requirements 
7.2 Boundary descriptions 
7.2.1 Boundary and region 
7.2.2 Chain codes 
7.2.3 Fourier descriptors 
7.2.3.1 Basis of Fourier descriptors 
7.2.3.2 Fourier expansion 
7.2.3.3 Shift invariance 
7.2.3.4 Discrete computation 
7.2.3.5 Cumulative angular function 
7.2.3.6 Elliptic Fourier descriptors 
7.2.3.7 Invariance 
7.3 Region descriptors 
7.3.1 Basic region descriptors 
7.3.2 Moments 
7.3.2.1 Definition and properties 
7.3.2.2 Geometric moments 
7.3.2.3 Geometric complex moments and centralised moments 
7.3.2.4 Rotation and scale invariant moments 
7.3.2.5 Zernike moments 
7.3.2.6 Tchebichef moments 
7.3.2.7 Krawtchouk moments 
7.3.2.8 Other moments 
7.4 Further reading 
References 
8. Region-based analysis 
8.1 Overview 
8.2 Region-based analysis 
8.2.1 Watershed transform 
8.2.2 Maximally stable extremal regions 
8.2.3 Superpixels 
8.2.3.1 Basic techniques and normalised cuts 
8.2.3.2 Simple linear iterative clustering 
8.3 Texture description and analysis 
8.3.1 What is texture? 
8.3.2 Performance requirements 
8.3.3 Structural approaches 
8.3.4 Statistical approaches 
8.3.4.1 Co-occurrence matrix 
8.3.4.2 Learning-based approaches 
8.3.5 Combination approaches 
8.3.6 Local binary patterns 
8.3.7 Other approaches 
8.3.8 Segmentation by texture 
8.4 Further reading 
References 
9. Moving object detection and description 
9.1 Overview 
9.2 Moving object detection 
9.2.1 Basic approaches 
9.2.1.1 Detection by subtracting the background 
9.2.1.2 Improving quality by morphology 
9.2.2 Modelling and adapting to the (static) background 
9.2.3 Background segmentation by thresholding 
9.2.4 Problems and advances 
9.3 Tracking moving features 
9.3.1 Tracking moving objects 
9.3.2 Tracking by local search 
9.3.3 Problems in tracking 
9.3.4 Approaches to tracking 
9.3.5 MeanShift and Camshift 
9.3.5.1 Kernel-based density estimation 
9.3.5.2 MeanShift tracking
9.3.5.3 Camshift technique
9.3.6 Other approaches
9.4 Moving feature extraction and description
9.4.1 Moving (biological) shape analysis
9.4.2 Space-time interest points
9.4.3 Detecting moving shapes by shape matching in image sequences
9.4.4 Moving shape description
9.5 Further reading
References
10. Camera geometry fundamentals
10.1 Overview
10.2 Projective space
10.2.1 Homogeneous co-ordinates and projective geometry
10.2.2 Representation of a line, duality and ideal points
10.2.3 Transformations in the projective space
10.2.4 Computing a planar homography
10.3 The perspective camera
10.3.1 Perspective camera model
10.3.2 Parameters of the perspective camera model
10.3.3 Computing a projection from an image
10.4 Affine camera 
10.4.1 Affine camera model 
10.4.2 Affine camera model and the perspective projection 
10.4.3 Parameters of the affine camera model 
10.5 Weak perspective model 
10.6 Discussion 
10.7 Further reading 
References 
11. Colour images 
11.1 Overview 
11.2 Colour image theory 
11.2.1 Colour images 
11.2.2 Tristimulus theory 
11.2.3 The colourimetric equation 
11.2.4 Luminosity function 
11.3 Perception-based colour models: CIE RGB and CIE XYZ 
11.3.1 CIE RGB colour model: Wright-Guild data 
11.3.2 CIE RGB colour matching functions 
11.3.3 CIE RGB chromaticity diagram and chromaticity co-ordinates 
11.3.4 CIE XYZ colour model 
11.3.5 CIE XYZ colour matching functions 
11.3.6 XYZ chromaticity diagram 
11.3.7 Uniform colour spaces: CIE LUV and CIE LAB 
11.4 Additive and subtractive colour models 
11.4.1 RGB and CMY 
11.4.2 Transformation between RGB models 
11.4.3 Transformation between RGB and CMY models 
11.5 Luminance and chrominance colour models 
11.5.1 YUV, YIQ and YCbCr models 
11.5.2 Luminance and gamma correction 
11.5.3 Chrominance 
11.5.4 Transformations between YUV, YIQ and RGB colour models 
11.5.5 Colour model for component video: YPbPr 
11.5.6 Colour model for digital video: YCbCr 
11.6 Additive perceptual colour models 
11.6.1 The HSV and HLS colour models 
11.6.2 The hexagonal model: HSV 
11.6.3 The triangular model: HLS 
11.6.4 Transformation between HLS and RGB 
11.7 More colour models 
References 
12. Distance, classification and learning 
12.1 Overview 
12.2 Basis of classification and learning 
12.3 Distance and classification 
12.3.1 Distance measures 
12.3.1.1 Manhattan and Euclidean Ln norms 
12.3.1.2 Mahalanobis, Bhattacharyya and Matusita
12.3.1.3 Histogram intersection, Chi2 (χ2) and the Earth Mover’s distance
12.3.2 The k-nearest neighbour for classification 
12.4 Neural networks and Support Vector Machines 
12.5 Deep learning 
12.5.1 Basis of deep learning 
12.5.2 Major deep learning architectures 
12.5.3 Deep learning for feature extraction 
12.5.4 Deep learning performance evaluation 
12.6 Further reading 
References