THE REPUBLIC OF TURKEY
BAHCESEHIR UNIVERSITY

IR IMAGE EDGE DETECTION USING NEURAL NETWORK AND CLUSTERING

Master’s Thesis

TALA MOHAMMADZADEH MEYMANDI

ISTANBUL, 2018

THE REPUBLIC OF TURKEY
BAHCESEHIR UNIVERSITY
GRADUATE SCHOOL OF NATURAL AND APPLIED SCIENCES
COMPUTER ENGINEERING

Name of the thesis: IR image edge detection using Neural Network and clustering
Name/Last Name of the Student: Tala Mohammadzadeh Meymandi
Date of the Defense of Thesis: May 13, 2018

The thesis has been approved by the Graduate School of Natural and Applied Sciences.
Assist. Prof. Yücel Batu SALMAN, Graduate School Director

I certify that this thesis meets all the requirements as a thesis for the degree of Master of Science.
Assist. Prof. Tarkan AYDIN, Program Coordinator

This is to certify that we have read this thesis and we find it fully adequate in scope, quality and content, as a thesis for the degree of Master of Science.

Examining Committee Members
Thesis Supervisor: Assist. Prof. Tarkan AYDIN
Thesis Co-supervisor: Assist. Prof. Pınar SARISARAY BÖLUK
Member: Assist. Prof. Mürüvvet Aslı AYDIN

ACKNOWLEDGEMENTS

I wish to express my sincere gratitude to my Supervisor Asst. Prof. Dr. Tarkan Aydin for his advice, encouragement, guidance and continuous feedback throughout this thesis.

I would like to thank my family for their constant support throughout my life.

May, 2018
Tala Mohammadzadeh MEYMANDI

ABSTRACT

IR IMAGE EDGE DETECTION USING NEURAL NETWORK AND CLUSTERING

Tala Mohammadzadeh Meymandi
Computer Engineering
Thesis Supervisor: Assist. Prof. Tarkan Aydin
May 2018, 87 pages

Image processing and feature extraction methods provide important knowledge about images. The first step in identifying objects in an image is extracting the image properties. Edge detection is one of the most common image processing tasks, because edges carry useful information about an image. Although the general public may not deal with infrared images directly, they are widely used in many sciences, so a proper infrared image edge detection method can lead to a more thorough understanding of these images. In this study, infrared images are selected for edge detection because of their application in various technologies, such as medical and military fields and surveillance. Because of the structure of these images, their edges cannot be extracted using common methods. Therefore, a new method is proposed for edge detection of infrared images. In the proposed method, the image is first segmented by a clustering algorithm. Then a Neural Network is used to extract the Region of Interest from among the segmented clusters. In the last step, morphological operators are used to extract the edges from the Region of Interest. For segmentation, two clustering methods, K-means and Mean Shift, are applied separately, and their cluster features are used as the Neural Network inputs. Because Mean Shift clustering determines the number of clusters itself, it may be preferable in many cases. The evaluation results of the proposed method and the comparison with other available methods indicate that the method performs well for infrared image edge detection.

Keywords: Infrared images, K-means clustering, Mean Shift clustering, Neural Network, Region of Interest

ÖZET (ABSTRACT)

EDGE DETECTION IN INFRARED IMAGES USING NEURAL NETWORK AND CLUSTERING

Tala Mohammadzadeh Meymandi
Computer Engineering
Thesis Supervisor: Assist. Prof. Tarkan AYDIN
May 2018, 87 pages

Today, image processing and feature extraction methods provide important information about images. The first step in identifying the objects in an image is to extract the image features. Edge detection is one of the common tasks of image processing, because edges contain useful information about an image. Although the general public does not deal with infrared images directly, this field is widely used in many sciences. Therefore, a suitable edge detection method can lead to a thorough understanding of infrared images. In this study, infrared images were chosen for edge detection because they are applied in various technologies, for example in medical and military fields and for surveillance purposes. Because of the structure of these images, their edges cannot be extracted with common detection methods. Therefore, a new method is proposed for edge detection in infrared images. In the proposed method, the image is first segmented by a clustering algorithm. Then a Neural Network algorithm is used to extract the region of interest from among the segments. In the last step, morphological operators are used to extract the edges from the region of interest. The image is segmented with the K-means and Mean Shift clustering methods, and the cluster features are used as the inputs of the Neural Network. Given the advantage of the Mean Shift clustering algorithm in determining the number of clusters, this method may be suitable in many cases. The evaluation results of the proposed method and the comparison with other available methods show that the method performs well for infrared image edge detection.
Keywords: Infrared images, K-means Clustering, Mean Shift Clustering, Neural Network, Region of Interest

CONTENTS

TABLES
FIGURES
ABBREVIATIONS
SYMBOLS
1. INTRODUCTION
    1.1 INTRODUCTION
    1.2 PROBLEM STATEMENT
    1.3 THESIS OBJECTIVES
    1.4 NEW ASPECTS AND RESEARCH INNOVATION
    1.5 THESIS STRUCTURE
2. LITERATURE REVIEW
    2.1 INTRODUCTION
    2.2 IMAGE PROCESSING
        2.2.1 Image Processing Applications
        2.2.2 Imaging
        2.2.3 Pre-processing
        2.2.4 Feature Extraction from Images
    2.3 THERMAL IMAGING
        2.3.1 Thermal Imaging Components
        2.3.2 Different Generations of Thermal Cameras
        2.3.3 Effective Factors on Image Quality
        2.3.4 Accuracy and Recognition Factors of Images
        2.3.5 Main Effective Factors in Thermal Imaging
        2.3.6 Selection of Wavelength Region for Thermal Cameras
        2.3.7 Atmospheric Effects on Thermal Cameras’ Performance
    2.4 IMAGE SEGMENTATION
        2.4.1 Clustering
            2.4.1.1 K-means clustering
            2.4.1.2 Mean shift clustering
                2.4.1.2.1 Mean shift applications
        2.4.2 Classification
            2.4.2.1 Artificial neural network
    2.5 RESEARCH BACKGROUND
    2.6 CONCLUSION
3. DATA AND METHOD
    3.1 INTRODUCTION
    3.2 APPLIED TOOLS
    3.3 OUTLINE OF THE THESIS
    3.4 PRE-PROCESSING
    3.5 IMAGE SEGMENTATIONS
        3.5.1 K-means Clustering
            3.5.1.2 Optimal number of clusters
        3.5.2 Mean Shift Clustering
    3.6 ROI EXTRACTION
        3.6.1 Feature Extraction from The Image Regions
        3.6.2 Artificial Neural Network
    3.7 EDGE DETECTION
        3.7.1 Post-Processing
        3.7.2 Edge Detection Using Morphological Operators
    3.8 CHAPTER SUMMARY
4. FINDINGS
    4.1 INTRODUCTION
    4.2 EVALUATION
        4.2.1 Dataset
    4.3 EVALUATION RESULT
        4.3.1 Evaluation Method
            4.3.1.2 Confusion matrix
            4.3.1.3 Running time
    4.4 CONCLUSION
5. DISCUSSION
    5.1 COMPARING THE RESULTS WITH OTHER METHODS
    5.2 COMPARING THE RESULTS WITH OTHER WORKS
    5.3 COMPARING BOTH CLUSTERING METHODS
6. CONCLUSION
    6.1 INTRODUCTION
    6.2 THESIS RESULTS
    6.3 PROPOSAL FOR FUTURE RESEARCH
REFERENCES
APPENDICES
    Appendix A.1 Running time
    Appendix A.2 Confusion matrix

TABLES

Table 2.1: Pre-processing categories
Table 2.2: Correspondence between ANN and BNN
Table 3.1: Extracted Quantile
Table 3.2: Extracted features for each region using K-means clustering
Table 3.3: Extracted features for each region using Mean Shift clustering
Table 3.4: Region labeling for training the NN (K-means clustering)
Table 3.5: Region labeling for training the NN (Mean Shift clustering)
Table 4.1: Confusion matrix of Figure 4.1
Table 4.2: Confusion matrix of Figure 4.3
Table 4.3: Total results of all 16 images
Table 4.4: Performance comparison of both methods

FIGURES

Figure 1.1: Prewitt edge detection method
Figure 2.1: Motion tracking
Figure 2.2: Medical image Processing applications
Figure 2.3: Distinction of various tissues from each other
Figure 2.4: Image processing application for ultrasound measurement
Figure 2.5: Computer-assisted surgeries using image processing technique
Figure 2.6: Intercept targets in military applications using image processing
Figure 2.7: Image processing in security systems
Figure 2.8: Image processing in geographic systems for identifying cover crops
Figure 2.9: Separation of the defective fruit surface by image processing
Figure 2.10: Feature extraction steps
Figure 2.11: Electromagnetic waves
Figure 2.12: Object tracking for surveillance
Figure 2.13: Object tracking in a soccer game
Figure 2.14: Structure of the humans’ brain
Figure 2.15: Simple structure of the Neural Network
Figure 3.1: General steps of the proposed method
Figure 3.2: Median Filter
Figure 3.3: Applying Median filter
Figure 3.4: Block diagram explaining the k-means clustering steps
Figure 3.5: Clusters extracted using k-means clustering
Figure 3.6: Elbow criterion
Figure 3.7: Flat kernel
Figure 3.8: Image density structure
Figure 3.9: The resulting clusters by Mean Shift clustering
Figure 3.10: Clusters extracted using Mean Shift Clustering
Figure 3.11: Expected object for edge detection in the IR image
Figure 3.12: Structure of Neural Network in this study
Figure 3.13: Pre-processing
Figure 3.14: Clustering using both methods
Figure 3.15: Binary image rebuilt from the extracted ROI
Figure 3.16: Binary image after Erosion
Figure 3.17: Binary image after Dilation
Figure 3.18: Detected edges
Figure 4.1: Detected edges (b,c) of image img_00003 (a) in folder 0002
Figure 4.2: Detected edges (b,c) of image img_00014 (a) in the folder 0002
Figure 4.3: Detected edges (b,c) of image img_00028 (a) in the folder 0002
Figure 4.4: Detected edges (b,c) of image img_00001 (a) in the folder 0004
Figure 4.5: Detected edges (b,c) of image img_00009 (a) in the folder 0004
Figure 4.6: Detected edges (b,c) of image img_00018 (a) in the folder 0004
Figure 4.7: Detected edges (b,c) of image img_00001 (a) in the folder 0006
Figure 4.8: Detected edges (b,c) of image img_00009 (a) in the folder 0006
Figure 4.9: Detected edges (b,c) of image img_00018 (a) in the folder 0006
Figure 4.10: Detected edges (b,c) of image img_00001 (a) in the folder 0007
Figure 4.11: Detected edges (b,c) of image img_00011 (a) in the folder 0007
Figure 4.12: Detected edges (b,c) of image img_00022 (a) in the folder 0007
Figure 4.13: Detected edges (b,c) of image img_00001 (a) in the folder 0008
Figure 4.14: Detected edges (b,c) of image img_00012 (a) in the folder 0008
Figure 4.15: Detected edges (b,c) of image img_00024 (a) in the folder 0008
Figure 4.16: Detected edges (b,c) of image img_00032 (a) in the folder 0009
Figure 5.1: Comparison of proposed method with the common algorithms
Figure 5.2: Comparison of proposed method with the common algorithms
Figure 5.3: Comparison of proposed methods with CNN_DGA method
Figure 5.4: Comparison of proposed method with CNN_DGA method
Figure 5.5: Comparison of the proposed method with Mix_model
Figure 5.6: Comparison of the proposed method with Mix_model
Figure 5.7: Comparison between k-means and Mean Shift clustering
Figure 5.8: Comparison between k-means and Mean Shift clustering
Figure 5.9: Comparison between k-means and Mean Shift clustering
Figure 5.10: Comparison between k-means and Mean Shift clustering
Figure 5.11: Comparison between k-means and Mean Shift clustering
Figure 5.12: Comparison between k-means and Mean Shift clustering
Figure 5.13: Comparison between k-means and Mean Shift clustering

ABBREVIATIONS

ACO : Ant Colony Optimization
ANN : Artificial Neural Network
BNN : Biological Neural Network
CCD : Charge-Coupled Device array
CNN : Cellular Neural Networks
DGA : Distributed Genetic Algorithms
FOV : Field of View
GPS : Global Positioning System
HgCdTe : Mercury Cadmium Telluride Detectors
HSL : Hue Saturation Intensity
InSb : Indium Antimonide Semiconductors
IR : Infrared Radiation
KDE : Kernel Density Estimation
MLP : Multilayer Perceptron
MRTD : Minimum Resolvable Temperature Difference
NETD : Noise Equivalent Temperature Difference
NN : Neural Network
RGB : Red Green Blue
ROI : Region of Interest
SSE : Standard Square Error
USA : United States of America

SYMBOLS

Centigrade : C
Degree : °
Extracted Region from the image : I_n
micrometer : μm
millimeter : mm
nanometer : nm

1. INTRODUCTION
1.1 INTRODUCTION

Acquiring and analyzing images has a great impact on many different sciences. Because of this influence, a wide range of image acquisition and processing methods is being developed. Each method has its own advantages and disadvantages and is used in its own field. Given the variety of techniques available for image processing, applying a single method is in many cases not enough.

One of the basic uses of image processing is edge detection. Image edges carry useful information that can be very helpful for object detection.

Infrared images are among the most widely used and important image types today. Infrared Radiation (IR) images are widely used in the medical and military industries and in surveillance applications. The temperature of the human body is a strong indicator of a person's well-being, so IR images can provide considerable information about health, and infrared imaging has a significant role in diagnosing many illnesses and disorders. Breast cancer, diabetes, kidney transplantation, dermatology, heart diseases, fever screening and brain imaging are some of the areas in which IR imaging has been used successfully. Many weapons and smart devices also use this technology for capturing and identifying objects. Accordingly, IR images are applied in many different technologies because of their characteristics, and many researchers focus on this subject.

In this study, a new method is proposed for edge detection of infrared images, which is one of the most important tasks performed on them. For this purpose, image segmentation methods are combined with machine learning tools. This chapter first presents the problem statement and the initial definitions of the research tools, and then states the research objectives.
Finally, the new aspects and innovations of the research are described.

1.2 PROBLEM STATEMENT

In recent years, various methods have been proposed for image processing, such as Cellular Neural Networks (CNN), genetic algorithms and wavelet transforms (Wang et al. 2014; Lahiri et al. 2012). One of the most important aspects of image processing is edge detection, which is used to extract edge features. Edges are among the most basic features of an image and can be used to represent it. Edge detection is also a critical step in target tracking, which makes it important for tracking objects in images and video. There are various image types, and the type of image determines the operations that must be performed on it. A brief explanation of infrared images is given next.

All objects with a temperature above absolute zero (0 K, or about -273 °C) emit infrared energy. IR is the part of the electromagnetic spectrum that lies between visible light and radio waves. The IR wavelength range extends from 0.7 μm to 1000 μm (one millimeter). Within this band, wavelengths between 0.7 μm and 20 μm are used for measuring temperature. The imaging sensors of a camera convert this energy into electrical signals, which are displayed on a monitor as a monochromatic thermal image. These images show the different values of heat in the scene (Duarte et al. 2014).

Sir William Herschel discovered infrared radiation in 1800. Sir But invented the first infrared imaging system in the 1940s. Infrared imaging has been used in medical science since the 1960s. Over the past 20 years, there have been significant improvements in the quality of imaging equipment and in the standardization of techniques and clinical imaging protocols (Duarte et al. 2014).

The following can be noted as some of the characteristics of IR images (Zhou et al. 2011):
a. Random changes in the external environment and flaws in thermal imaging systems introduce different kinds of noise, such as thermal noise, into infrared images. This noise reduces the signal quality.
b. An infrared image represents the temperature distribution in the scene. It is a gray-level image, not a color or three-dimensional image, so it appears to the human eye as having low resolution.
c. Because of the structure of thermal images, their imaging systems have a limited ability to distinguish objects. Their spatial resolution is lower than that of a visible-light Charge-Coupled Device (CCD) array, which makes the resolution of infrared images lower than that of other image types.

Infrared images have many uses in the medical sciences (Duarte et al. 2014) and in military technologies (Abdulmunim et al. 2012, Sun 2003). Because IR images have lower contrast and resolution than color images, common image processing algorithms are not suitable for them. The most common edge detection algorithms, namely Prewitt, Canny, Sobel and Roberts, cannot be applied directly (Wang et al. 2014), because most common edge detection methods work by extracting high-frequency signals. Since these operators are sensitive to noise, it is difficult for them to distinguish between image noise and edges. Figure 1.1 shows the result of IR image edge detection using the Prewitt method, a first-order derivative filter.

Figure 1.1: Prewitt edge detection method

Therefore, other edge detection methods are required. One such approach is based on machine learning algorithms and clustering methods. Clustering algorithms segment an image according to its features. K-means and Mean Shift are two popular clustering algorithms, and both are used in this study. Because the two methods have different characteristics, each is applied separately together with the machine learning algorithm, and their results are compared.
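The noise sensitivity of first-order derivative operators described above can be illustrated with a small sketch. It applies the standard Prewitt kernels to a synthetic low-contrast frame, with and without additive Gaussian noise. The frame (`synthetic_frame`), the noise level and the threshold of 15 are hypothetical values chosen only for illustration; they are not taken from the thesis or its dataset.

```python
import numpy as np

# Prewitt kernels: first-order derivative estimates along x and y.
KX = np.array([[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]], dtype=float)
KY = KX.T

def filter3x3(img, kernel):
    """Correlate a 2-D image with a 3x3 kernel (valid region only)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * img[dy:dy + h - 2, dx:dx + w - 2]
    return out

def prewitt_magnitude(img):
    """Gradient magnitude from the horizontal and vertical Prewitt responses."""
    gx = filter3x3(img, KX)
    gy = filter3x3(img, KY)
    return np.hypot(gx, gy)

def synthetic_frame(noise_sigma, seed=0):
    """Hypothetical low-contrast frame: a warm square on a cooler background."""
    rng = np.random.default_rng(seed)
    img = np.full((64, 64), 100.0)
    img[20:44, 20:44] = 110.0  # only 10 gray levels of contrast
    return img + rng.normal(0.0, noise_sigma, img.shape)

THRESHOLD = 15.0  # assumed value, tuned to the clean frame only

clean_edges = prewitt_magnitude(synthetic_frame(0.0)) > THRESHOLD
noisy_edges = prewitt_magnitude(synthetic_frame(8.0)) > THRESHOLD

print("edge pixels without noise:", int(clean_edges.sum()))
print("edge pixels with noise:   ", int(noisy_edges.sum()))
```

On the clean frame only the border of the square is marked. Once noise comparable to the contrast is added, the same threshold also marks a large share of the flat background, which is the difficulty that motivates segmenting the image before extracting edges.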
K-means is an unsupervised algorithm that clusters pixels based on similar features such as gray level. The number of clusters (k) is fixed and must be defined in advance by the user. Some random points are selected as the initial cluster centers (centroids). The algorithm then loops over two steps:
i. Assignment: each point is assigned to the cluster with the closest centroid.
ii. Computation: the centroid of every cluster is recomputed.
The loop continues until convergence (Dhanachandra et al. 2015). Many different distance measures can be used for assigning points and computing centroids.

Mean Shift is a non-parametric, mode-seeking clustering algorithm. Unlike k-means, the algorithm itself determines the number and the location of the clusters. Its main concept is to find the densest regions by computing the mean within a chosen bandwidth. In each iteration, the points that lie within the bandwidth radius of the current mean are found, and then the new mean is computed from them. These two steps are repeated until convergence. In many cases this algorithm is preferable because the number of clusters does not have to be defined in advance. That is a great advantage, since k-means does not work properly with an incorrect number of clusters (Cheng 1995).

After the image is clustered, the Region of Interest (ROI) must be found and extracted using a proper method. A suitable machine learning algorithm can be used for this purpose. In this study, the ROI is extracted using a Neural Network (NN), which is one of the common machine learning algorithms. The NN is inspired by the biological behavior of the neural system of the human brain. The network is a combination of a large number of connected processing elements (neurons). The NN consists of three kinds of layers (Gonzalez et al. 2008):
- Input layers
- Hidden layers
- Output layers

Some of the advantages and disadvantages of Neural Networks are listed below (Gonzalez et al. 2008):

Advantages:
a. Widespread application in many different fields.
b. Very flexible, because the user decides on the structure.
c. Finds complex relationships between inputs and outputs.

Disadvantages:
a. Hard to interpret, and therefore difficult to explain.
b. Limited insight into the underlying connections.
c. Needs to be designed carefully, with accurately pre-processed predictive variables.

In this study, both k-means and Mean Shift clustering are used for IR image segmentation, and a Neural Network is used for ROI extraction, in order to detect the edges.

1.3 THESIS OBJECTIVES

The main aim of this thesis is to provide a new method for the edge detection of IR images. The combinations of a NN with two different clustering methods (k-means and Mean Shift clustering) are studied, and a new method is provided that properly detects the edges of infrared images. The results of this research could be used in the medical industry, in the military and, in general, wherever infrared images are used.

1.4 NEW ASPECTS AND RESEARCH INNOVATION

Although extensive work has been done on infrared image edge detection, no earlier work proposes a method of this kind. Moreover, since both K-means and Mean Shift clustering perform well in image segmentation and Artificial Neural Networks (ANN) also perform well, the proposed method is expected to perform well.

1.5 THESIS STRUCTURE

The structure of this thesis is as follows:
a. The first chapter discusses the generalities of the study, the expected objectives of the research and the aspects of innovation.
b. The second chapter gives a general overview of the available methods in this field. It also explains the algorithms used in this thesis.
c. The third chapter describes the proposed method and the tools used in this study in detail.
d. The fourth chapter evaluates the proposed method.
e. The fifth chapter compares the proposed methods with other methods and works, and with each other.
f. The sixth chapter presents the conclusion and proposals for future work.

2. LITERATURE REVIEW

2.1 INTRODUCTION

Digital images have a great role in today's life. Over recent decades, many industries and applications have come to require special imaging techniques. Each of these techniques needs its own processing tools, and working with those tools requires an understanding of the concepts of image analysis. Infrared imaging is not a general-purpose technique, but it is valuable because of its use in many sensitive technologies. Unfortunately, the tools commonly used for visible and color images cannot extract useful information from infrared images. Other methods are available for processing these images, and one useful family of methods is based on machine learning. This study therefore proposes a method for processing infrared images that is based on machine learning algorithms.

In this chapter, the concepts of infrared images are presented first, together with definitions of image processing techniques. The second part of the chapter explains the image segmentation methods used for the clustering part of this study, and then describes the classification algorithms needed for extracting the ROI. Finally, some of the work done in this area is introduced as the literature of this study.

2.2 IMAGE PROCESSING

There are two types of image processing: analog and digital. Nowadays the term image processing usually refers to digital rather than analog image processing. Likewise, there are two types of images: analog images and digital images.
The digital image processing is a field of computer science working on digital images acquired by digital cameras or scanners. This field includes two branches: Image Enhancement and Computer Vision. A digital image is an input of the functions performed in digital image processing image and it’s usually a two-dimensional image. The output of the function depending on the purpose could be either an image or the extracted feature of the image. Image enhancement advantages the acquisition tools such as filters for eliminating the noises, better visualization and adjusted contrast within an image. On the other hand, computer vision includes the techniques for analyzing and manipulating an image for better perception of the structure and the content of an image. 8 The extracted characteristics could be benefited in different technologies such as Robotics (Gilbert et al. 2005; Gonzalez 2002). There are three main tasks in the image processing: preprocessing, enhancement and displaying the image or the features of an image. The main operations in digital image processing (Marques, 2011) : a. Geometric Transformations: such as resizing, rotation . b. Arithmetic and Logic Operations: The arithmetic operations are used for different purposes such as extracting the differences between images or finding the mean of two images. c. Color Enhancement: Brightness and contrast enhancement and adjustment of the color space. d. Aliasing and Image Enhancement: The aim is to filter the signals with the frequencies above the sampling rate. e. Compression: Compression techniques are used to decrease the size of an image. f. Image Segmentation: Segmenting the image into meaningful parts. 2.2.1 Image Processing Applications Image processing methods have been applied in many different sciences such as industry, medical fields, security and surveillance monitoring. Some of the applications are mentioned in this section briefly (Iscan et al. 
2009):

Pattern Recognition: The goal is to identify and extract a pattern with specified features and to categorize the data. Identifying the letters or numbers of a text or a license plate is a common example of pattern recognition (Bhanu, 2005).

Motion Tracking: There are various ways to track a moving object in a video sequence. One common method uses the correlation between two consecutive frames. In the first frame, one or more points are selected, each with a window around it, and a search window is defined in the second frame. The correlation between the window around each point and the candidate windows inside the search window is then calculated, and the location of the maximum correlation is taken as the new location of the point.

Figure 2.1: Motion tracking

Medical Applications: Image processing is applied in many different medical fields (Garge et al. 2009; Iscan et al. 2009). Some of the common medical applications are listed below:
a. Quality Enhancement of Thermal Images: Figure 2.2 indicates this process.

Figure 2.2: Medical image processing applications
Source: Jambhorka, Sagar et al., 2012 p.310

b. Separating distinct tissues from each other: Because different tissues have distinct characteristics, such as permeability, they can be distinguished with image segmentation techniques. With image processing techniques, identifying cancerous tissues and locating the exact position of brain tumors become practical (Iscan et al. 2009).

Figure 2.3: Distinction of various tissues from each other
Source: Iscan Zafer et al 2009 p. 897

c. Measurements of Sonographic Images: Image processing is used to calculate distance and surface values in ultrasound images.

Figure 2.4: Image processing application for ultrasound measurement
Source: Iscan Zafer et al 2009 p. 895

d.
Computer-assisted Surgeries: By using computers as surgeons' assistants, two- or three-dimensional models of tissues or organs are obtained, and surgeons can be guided throughout operations.

Figure 2.5: Computer-assisted surgeries using image processing technique
Source: http://www.futuretechnology500.com/index.php/future-medical-technology/robotic-surgery-advantages-and-disadvantages/

Military Applications: Currently, many military systems are equipped with cameras and image processing techniques. Some common uses are explained below (Wang et al. 2014; Dimitris et al. 2003):
a. Long-range precision missiles apply image processing techniques together with GPS (Global Positioning System) data.
b. Some systems lock on to targets with predetermined specifications (aircraft, tanks, ...).
c. Unmanned aerial vehicles guided by image processing techniques are used for firing and launching missiles.

Figure 2.6: Intercept targets in military applications using image processing

Industrial Applications: The use of image processing in this field has grown rapidly over the past few years:
a. Control and guidance of manipulators
b. Separation of chemicals with different colors
c. Measurement of leather surfaces
d. Quality control of factory products

Identification and security systems: Fingerprint recognition, face recognition and iris recognition are some of the common image processing applications (Kambli et al. 2010).

Figure 2.7: Image processing in security systems
Source: Kambli Mansi et al., p.920

Remote Sensing Systems: Image processing methods are used to extract meaningful information from satellite images. Separating different geographical zones (sea, land, farms, mountains) is one example (Blaschke 2010).

Figure 2.8: Image processing in geographic systems for identifying cover crops
Source: Blaschke, T.
p.8

Agricultural Image Processing Applications: The food industry is one of the important industries that rely heavily on machine learning algorithms:
a. Classification of agricultural products
b. Separation of defective agricultural products
c. Packaging of agricultural products
d. Identification of plant pests
e. Estimating the amount of agricultural crops

Figure 2.9: Separation of the defective fruit surface by image processing
Source: Dubey, et al. p.8

The main steps for detecting defects on the surface of a fruit are:
a) Imaging
b) Feature extraction (image processing)
c) Applying the extracted features in suitable algorithms

2.2.2 Imaging

Because image processing applications are so varied, each application requires an imaging technique suited to it. Imaging methods:
a) Imaging with conventional cameras (Morcol et al. 2010)
b) Satellite imaging (Blaschke 2010)
c) Imaging using sound waves (Iscan et al. 2009)
d) Imaging using X-rays (Garge et al. 2009)
e) Imaging with infrared cameras (Wang et al. 2011)

2.2.3 Pre-processing

The raw images obtained from an imaging device have many problems. The imperfection of the imaging device is the reason for the errors and the low quality of the images. Pre-processing consists of four general methods, which are used to enhance the visibility of the image. Any pre-processing operation requires information about the image, the camera and the surrounding environment (Gonzalez et al. 2002). The image pre-processing methods, classified into categories, are described in Table 2.1:

Table 2.1: Pre-processing categories
Type            Description
Brightness      Enhancement of the pixel brightness
Local Binary    Considers a small neighborhood around a pixel
Geometric       Corrects the geometric distortion caused by differing coordinates
Global Binary   Uses overall information from the whole image

a. Brightness: This correction concerns the gray levels and the pixel intensities.
b.
Geometric Transformation: In general, there are two types of geometric distortion. One is related to the system, such as the camera angle, while the other is random and may be related to the sensors.
c. Local/Global pre-processing: Local pre-processing operates on each pixel together with its neighboring pixels, whereas global pre-processing requires data from the entire image.

There are other ways to categorize image pre-processing. One of them distinguishes geometric and radiometric transformations. A radiometric transformation is one whose main target is the elimination of atmospheric and sensor noise. Image enhancement is also considered a type of image pre-processing. In general, there are two types of image enhancement: spatial and spectral. Spatial enhancement is filtering, and spectral enhancement is stretching. Noise removal is itself a separate category.

Preprocessing procedures:
a. Data Cleaning: The purpose is to eliminate the existing noise and to remove conflicts between the data.
b. Data Reduction: Because different databases are used, redundant and sometimes duplicate information may appear in the data. This redundancy can be removed by using correlation and clustering algorithms.
c. Data Transformation: The ranges of the attributes may not be the same. For example, the value of one attribute may lie between one and ten while another lies between one and one thousand. Therefore, normalization is required.

2.2.4 Feature Extraction from Images

The most important image processing task is the extraction of the proper features for each application. The steps of working on the image are as follows:

Figure 2.10: Feature extraction steps
Source: Seema, et al, p.50

After the imaging step, the image is sent to the preprocessor to remove unwanted data and noise from the images.
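As one concrete instance of the preprocessing stage, the normalization mentioned under Data Transformation can be sketched as a min-max rescaling. This is a minimal illustration under stated assumptions, not the procedure used in this study: the function name `min_max_normalize` and the sample attribute values are hypothetical and chosen only to mirror the one-to-ten and one-to-one-thousand example.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so that they span [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute has no spread; map every value to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


# Hypothetical attributes with very different ranges (1-10 and 1-1000).
attr_small = [1, 4, 10]
attr_large = [1, 500, 1000]

# After rescaling, both attributes lie in [0, 1] and become comparable.
norm_small = min_max_normalize(attr_small)
norm_large = min_max_normalize(attr_large)
```

Min-max rescaling is only one possible choice; standardizing to zero mean and unit variance is a common alternative when the attributes contain outliers.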
The purpose of extracting attributes is to reduce the image data by using certain properties such as color, texture, or shape. Some of the shape properties used for fruit recognition include the co-occurrence matrix, the fast Fourier transform, and the fast wavelet transform. Some of the color properties include mean, variance, skewness and kurtosis. Textural features are entropy, energy, contrast, and correlation (Mohammad et al. 2016). These features are introduced briefly:

a. Edge-based features: In this method, the map of the image edges determines the objects' features. The advantage of using edges over other features is their stability: edges are resistant to lighting conditions, to color changes of the objects and to the outer texture of the objects, so changes in these do not affect the edges. Edges also define boundaries well. Therefore, feature extraction can be done with great precision, especially in crowded backgrounds with many objects. Among the numerous edge detection algorithms, Canny, Sobel and Roberts are the popular ones (Seema et al. 2015).

b. Morphological features: Morphological features play a great role in classification. The analysis of morphological features starts with detecting the extent of the fruit. There are many morphological features that can be extracted. The fruit border, as a morphological feature, is connected to the fruit dimensions (Seema et al. 2015). The fruits are categorized into two main groups: the first consists of round fruits such as oranges and apples, while the other consists of elongated ones such as bananas and carrots. Area is one of the morphological features. The area is calculated by the following formula:

$$\mathrm{Area} = \sum_{i=1}^{m} \sum_{j=1}^{k} f(i, j) \tag{2.1}$$

where k is the number of columns, m is the number of rows, and the function f is given by (Mercol et al. 2008):

$$f(i, j) = \begin{cases} 1 & \text{if } (i, j) \in \mathrm{In} \\ 0 & \text{otherwise} \end{cases} \tag{2.2}$$

c.
Morphological Image Processing (Morphology): The aim of applying morphology is to reduce the flaws of an image by using shape features (Seema et al. 2015). These algorithms process binary images. A structuring element is moved across all the image pixels. Generally, there are two types of operations that affect the resulting image:
i. Erosion: The erosion function is described by a simple rule: if the structuring element fits the image at the current pixel, the output value is one; otherwise it is zero.

$$g(x) = \begin{cases} 1 & \text{if the structuring element fits the image} \\ 0 & \text{otherwise} \end{cases} \tag{2.3}$$

ii. Dilation: The dilation function is described by a simple rule: if the structuring element hits the image at the current pixel, the output value is one; otherwise it is zero.

$$g(x) = \begin{cases} 1 & \text{if the structuring element hits the image} \\ 0 & \text{otherwise} \end{cases} \tag{2.4}$$

d. Color-based features: Color is one of the basic features that the human eye uses to distinguish objects from each other. Morphological features may cause misinterpretation because of the similarity between fruits in the same group; the similarity between a banana and a carrot is one instance. Therefore, color models such as Hue Saturation Intensity (HSI) and Red Green Blue (RGB) can be used to separate these objects (Seema et al. 2015).

e. Textural features: These features are extracted according to statistical concepts. The main matrix used is the gray-level co-occurrence matrix. In this method, neighboring points with equal gray levels are compared throughout the image (Mercol et al. 2008). Statistical concepts:
i. Contrast: The contrast of an image (often known as variance) measures the contrast level between any point (pixel) and its neighbors. The co-occurrence matrix is used for the calculation.
ii. Correlation: It measures the relation between a pixel and its neighbors.
iii. Energy: Often known as "uniformity" or "angular second moment", it is the sum of the squared elements of the co-occurrence matrix.
iv.
Homogeneity: This value indicates how close the matrix elements are to the main diagonal.
v. Skewness: It measures the level of asymmetry around the mean and is standardized according to the third moment. A value of zero expresses symmetry (no skew), a positive value shows that the distribution is skewed to the right, and a negative value shows that it is skewed to the left.
vi. Kurtosis: It measures how far the data deviate from a normal distribution in terms of tail weight, and its value equals the standardized fourth central moment of the distribution. The kurtosis of a normal distribution is three. A kurtosis greater than three indicates heavier tails and a sharper peak than the normal distribution, and a value less than three indicates lighter tails and a flatter shape.

2.3 THERMAL IMAGING

The temperature of the human body is a valuable indicator of a person's health, first of all because the human body has a specific temperature that differs from its surrounding environment. There are two temperatures related to the body: the inner (core) temperature and the outer body temperature. A change in body temperature (33-42 °C) is an obvious sign of an abnormality. Thermal systems are also applied to create and improve the operational capabilities of military forces in night combat, and to detect and track targets that are visually hidden or camouflaged. Thermal imaging systems are passive systems that operate in the mid-infrared region of the electromagnetic spectrum. All objects emit electromagnetic waves that are directly related to their temperature (Pasagic et al. 2008). An infrared wave is an electromagnetic wave with a wavelength between 700 nm and 1 mm. This radiation lies between microwaves and visible light. According to Planck's law, any object with a temperature above absolute zero (-273 °C) emits energy that can be recorded by thermal cameras as a black and white image (Pasagic et al. 2008).
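To make the link between temperature and emitted radiation concrete, the following sketch evaluates Planck's law for a black body and uses Wien's displacement law, which follows from Planck's law but is not stated in the text, to locate the wavelength of peak emission. It is an illustration under stated assumptions rather than part of the proposed method: the function names and the two example temperatures (310 K for skin, 700 K for a hot target) are chosen here only for the example.

```python
import math

H = 6.62607015e-34       # Planck constant, J*s
C = 2.99792458e8         # speed of light, m/s
K_B = 1.380649e-23       # Boltzmann constant, J/K
WIEN_B = 2.897771955e-3  # Wien displacement constant, m*K


def planck_radiance(wavelength_m, temp_k):
    """Black-body spectral radiance B(lambda, T) in W / (m^2 * sr * m)."""
    prefactor = 2.0 * H * C ** 2 / wavelength_m ** 5
    exponent = H * C / (wavelength_m * K_B * temp_k)
    return prefactor / math.expm1(exponent)


def peak_wavelength_um(temp_k):
    """Wavelength of maximum emission in micrometres (Wien's law)."""
    return WIEN_B / temp_k * 1e6


# Assumed example temperatures, not values taken from the thesis.
body_peak = peak_wavelength_um(310.0)  # about 9.35 micrometres
hot_peak = peak_wavelength_um(700.0)   # about 4.14 micrometres
```

Under these assumptions, a body near skin temperature radiates most strongly in the long-wave infrared, while a much hotter object peaks at shorter infrared wavelengths.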
Figure 2.11: Electromagnetic waves
Source: https://www.scienceabc.com/pure-sciences/why-are-infrared-waves-associated-with-heat.html

Thermal imaging systems are divided into cooled and uncooled cameras. Cooled thermal cameras have higher temperature resolution and higher temperature sensitivity, so their images have better quality than those of uncooled cameras, and they are more expensive. The sensors of uncooled cameras work at room temperature, while the temperature of cooled thermal camera sensors is lowered to cryogenic levels (typically around 77 K, about -196 °C) (Pasagic et al. 2008). The high resolution results from this low temperature of the working unit. Operating at such low temperatures in small dimensions requires high-quality components. In general, thermal cameras need some time to adjust the temperature of their sensors, and the amount of time required depends on the temperature of the environment. Thermal images can be obtained during both day and night. As the earth's temperature stabilizes throughout the night, the distinction between objects is best at earliest dawn. Spain and the United States of America (USA) were the first countries to use these systems, in World War II.

2.3.1 Thermal Imaging Components

a. Composite thermal system: This unit is responsible for collecting the thermal radiation of the object, focusing it at a point, and creating a thermal image. Thermal cameras, like night vision cameras, consist of several lenses and mirrors, but their structures are different. In these cameras, materials transparent to infrared radiation (such as germanium and silicon) are used (Pasagic et al. 2008).
b. Detectors: A detector is an element that absorbs the infrared radiation collected by the lens assembly. By absorbing this radiation, it changes one of its electrical properties (electrical conductivity, resistivity or voltage generation), and this alteration causes the creation of an electrical signal.
After the infrared photons are transformed into electrical signals, these signals are amplified and processed by the camera's electronic components, and then converted into photons at visible wavelengths by devices such as light emitting diodes, liquid crystal displays or micro-monitors. Each detector element can only transform one point of the object into a visible image. Therefore, in order to obtain a high-quality two-dimensional image, the dimensions of these numerous elements and the distances between them must be very small. Because of these tiny element structures, detectors are very difficult to construct, and they are often produced as linear arrays instead of two-dimensional arrays. A linear array can only represent one line of the target, so a scanner is used to obtain a two-dimensional image. Detectors are divided into two groups, thermal detectors and photonic (quantum) detectors, according to how the electrical signal is produced. So far, most military thermal imaging systems have used Mercury Cadmium Telluride (HgCdTe) detectors (8-12 μm) for hot and dry areas and Indium Antimonide (InSb) semiconductors (3.0-3.5 μm) for moist regions such as beaches and seas. Although cooled thermal cameras are highly efficient, because of their high price, size and weight they are preferably replaced by uncooled cameras with the same function (Pasagic et al. 2008).
c. Scanners: In some thermal imaging systems, there is a scanner whose task is to transfer the information of the target plane to the detector. In fact, the scanner transfers the different points of data to the detector over time, line by line.
d. Electrical circuits: The circuits include power supplies, biases, amplifiers, processors and displays.
e. Opto-mechanical (eyepiece) system: The eyepiece system enables the observer to see the image.

2.3.2 Different Generations of Thermal Cameras

Generation zero (Woolfson, 2012): A thermal camera built with a single-element detector, or with a linear array that has a small number of elements, is called generation zero. In this system, two horizontal scanners and one vertical scanner are required.

First generation (Woolfson, 2012): A thermal camera built with a very long linear array is called first generation. In this system only a horizontal scanner is needed, which is why most black and white cameras are manufactured using this technology.

Second generation (Woolfson, 2012): It includes cameras with a long, multi-line linear array. In this system, only a horizontal scanner is required. The image of these cameras is not significantly different from that of generation one.

Third generation (Woolfson, 2012): It refers to a camera with a two-dimensional array of detectors with a high number of elements. This system no longer needs a scanner. This is the latest generation of thermal cameras; it is fully developed in military systems and used in large industrial countries. The distinct image of this generation of cameras and their ability to separate element colors from each other distinguish them from the previous generations.

2.3.3 Effective Factors on Image Quality

Factors such as noise (system noise, background noise, etc.), atmospheric conditions, the technical specifications of the system, distance and dimensions restrict the operation of the camera and therefore make selection and design more complicated (Woolfson, 2012). In general, the main factors that affect the quality of the image can be summarized as follows (Woolfson, 2012):
a. Monitor: The factors that affect the monitor are related to radiation, contrast and distance from the observer.
b. Scene Elements: Factors such as target specifications, background specifications, movements and reflections.
c.
Specifications of the thermal imaging system: Factors such as the resolution, sensitivity, noise and output of the camera.
d. Atmosphere Transmission Factor: Factors such as haze, rain and dust.

It should be noted that, in the discussion of picture quality, much of the research concerns two issues: spatial resolution and temperature sensitivity.

2.3.4 Accuracy and Recognition Factors of Images

The usefulness of an image depends on its quality and on the ability to obtain information from it. For the image and its information, four precision or diagnostic steps are defined (Pasagic et al. 2008):
a. Detection: Sensing or detecting an object that may be a target (usually observing the object as a spot).
b. Orientation: Determining the overall dimensions of the object (length and width detection).
c. Recognition: Diagnosing the target category and determining the group of objects to which it belongs. For example, showing that the object is an airplane or a helicopter.
d. Identification: Identifying and distinguishing the target among the objects belonging to its group (for example, the type of the aircraft).

From the first to the fourth step, the number of pixels covering the target increases and the image quality improves.

2.3.5 Main Effective Factors in Thermal Imaging

As mentioned, thermal imaging systems have different generations, and each generation has its own features that are used in the design of cameras. Some characteristics are also common to system performance, and their result is important for the user. The most important features of thermal cameras are (Woolfson, 2012):
a. MRTD (Minimum Resolvable Temperature Difference): The response of a thermal imaging system depends on its sensitivity and spatial resolution. In order to assess the image quality in terms of sensitivity, resolution, and the dependency and interaction between them, a feature called MRTD is defined.
It is the lowest temperature difference between a black body target and the background that can be measured by the system. The MRTD is limited by the sensitivity of the system, which means that when the temperature difference is less than a minimum value, the object cannot be detected. In plain language, the MRTD is a camera feature that determines the minimum sensitivity (or temperature difference) required at any spatial frequency (Rayleigh Criterion).
b. Resolution: In many cases, spatial resolution is considered to be the only determining factor of image quality. In fact, spatial resolution is the smallest detail the system can resolve. The resolution is sometimes expressed by the instantaneous field of view. The interpretation of resolving power depends on the type of application. On the other hand, spatial resolution includes the effects of target contrast and system noise. There is also a clear difference between spatial resolution (the ability to see detail) and the ability to see anything at all (detection).
c. Sensitivity: Sensitivity is the smallest signal that can be detected by the system or, in other words, a signal that produces a signal-to-noise ratio of one at the output of the system. The sensitivity depends on the light-gathering ability of the optics and on the noise performance of the detectors, and it is independent of resolution.
d. Field of View (FOV): The maximum angular field (horizontally and vertically) that is visible on the display in any functional position is called the field of view. The choice of field of view usually depends on the type of application, the technology, and the detector and scanner properties.
e. Instantaneous FOV: This property is the angular element through which the system receives information, and it determines the system's resolution. A smaller instantaneous field of view results in better resolution, as long as it still provides enough energy for detection. A larger field of view causes a larger instantaneous field of view, and this decreases the resolution of the system.
f.
Noise Equivalent Temperature Difference (NETD): This feature indicates the temperature sensitivity of the system. It is the smallest temperature difference between the target and the background that produces an output signal equal to the system noise. This attribute depends on the detector's specification, the optical and atmospheric transmittance, and the noise of the system.
g. F-Number: The focal number is the ratio of the focal length to the diameter of the lens in an image forming system. In fact, the focal number expresses how much light the lens collects, that is, the speed of the lens.
h. T-number: A T-number expresses the speed of a lens corrected for its actual light transmission, whereas the focal number assumes that the lens transmits all the light emitted from the subject. In practice, various lenses have different T-numbers, so lenses with the same focal number may actually have different speeds. For two lenses with the same T-number, the resulting images have equal brightness.

2.3.6 Selection of Wavelength Region for Thermal Cameras

Given the level of radiation emitted by objects at normal temperatures and the atmospheric constraints, only two regions (3-5 μm and 8-12 μm) can be used for passive imaging. If the targets are hotter than the surrounding environment (such as the exhaust of a missile), their peak emission wavelengths are shorter and the radiation power in that region is high, so it is better to use a camera operating at three to five micrometers. In general, without taking the specific application into account, one region cannot be declared superior to the other. Of course, the wavelength range of eight to twelve microns is a fine range (Woolfson, 2012).

2.3.7 Atmospheric Effects on Thermal Cameras' Performance

For objects on the surface of the earth to be seen, the electromagnetic waves emitted from the surface of the object must pass through the air and reach the camera. Since air is a mixture of different gases, water vapor and particulates, it absorbs and also scatters some of these waves depending on the wavelength.
Because of the different conditions along the emission path, such as the amount of gas, wind, temperature and other atmospheric conditions, assessing these effects is complicated. The contrast (sharpness) between the object and the background is the factor that affects the visibility of objects. Atmospheric conditions reduce the contrast as the distance from the viewing point increases. The atmosphere imposes important constraints on the performance of electro-optical systems. In fact, the environment can be considered one of the most important components of an optical system. Today, because the quality and capabilities of detection systems and radiation sources (such as lasers) have improved significantly, the most important limitation on system performance is usually the atmospheric environment (Woolfson, 2012). The main atmospheric disturbances that affect radiation transmission and the performance of a thermal imaging system are:
a. Attenuation of radiation (absorption, scattering), which has the greatest effect and limits the range of systems
b. Radiation of the environment in the infrared region
c. The deviation of the actual target location
d. Modulation of the radiation

Atmospheric transmittance is not the same for all wavelengths. Objects at ambient temperature emit perceptible radiation of their own, and only in two regions (3-5 and 8-12 microns) is this radiation largely unaffected by atmospheric absorption, which makes these regions suitable for thermal imaging. As infrared radiation is absorbed less than visible light by mist and smoke in the Earth's atmosphere, these cameras can be used in adverse weather conditions. Thermal cameras have different applications in industrial, military and nonmilitary cases, the most important of which are:
a. Observations and operations at night
b. Guided missiles
c. Intelligence and identification operations
d. Helping planes during landing and take-off
e.
Photography during nighttime and adverse weather conditions
f. Photographing camouflaged and hidden objects
g. Fire control systems

It should be noted that all unmanned aerial reconnaissance aircraft that can fly at night and in any type of weather are equipped with thermal cameras for identification purposes. This system is critical for identifying enemy forces on the battlefield during operations such as displacement, expansion, division, camouflage, hiding and so on.

Advantages of thermal cameras over night vision cameras:
a. Ability to create a picture during both night and day.
b. Online image transmission to the receiving data stations (the image is visible to the eye and is simultaneously sent to an external monitor).
c. Not detectable by night vision systems: Some night vision cameras require an auxiliary source in order to see the target (active system), and this source can be seen by night vision systems. A thermal camera does not need an external source; it observes the objects' own radiation.
d. Thermal cameras do not require light to function, therefore even placing an 18,000-watt projector in front of the object of interest does not alter the output image (Pasagic et al. 2008).

2.4 IMAGE SEGMENTATION

Image segmentation has great value in the field of image processing, because segmenting an image into meaningful components helps interpretation. The segmentation methods that are applied in this study are as follows:
a. Clustering
b. Classification

2.4.1 Clustering

Clustering algorithms are unsupervised learning methods in which the data is not labeled. In this type of algorithm, data is grouped so that the objects in each group are similar to each other while they differ from the objects of other groups. The similarity between the objects of a cluster is obtained according to the extracted features.
There are many different clustering algorithms, but only the two clustering methods applied in this study are briefly explained here:

2.4.1.1 K-means clustering

In this clustering method, data objects are grouped according to k, the predefined number of clusters, and the selected initial centroids. The algorithm consists of two steps that are repeated until convergence. First, data objects are assigned to clusters according to the current centroids. Then a new centroid is calculated for each cluster. This loop continues until convergence, which means that the centroids no longer move. The most important problem of this algorithm is that the number of clusters must be defined in advance, which has a great impact on the performance of the algorithm and can strongly affect the results. The other disadvantage of this algorithm is its sensitivity to the selection of the initial centroids, because poorly placed centroids can cause unwanted results (Mohammad et al 2016).

2.4.1.2 Mean shift clustering

Mean Shift is a non-parametric clustering algorithm. Unlike the k-means clustering method, the algorithm itself figures out the number and the location of the clusters. The important parameter that must be determined is the radius around each data point, also referred to as the bandwidth. The radius defines a window around a data point, and the points that fall inside this window are considered together. In the second step, the mean of these data points is calculated as the new center, and a new window is considered around this new point. The process continues until the mean no longer changes, which means that it has converged. As the algorithm proceeds, the mean of each window shifts, that is, the center moves toward its point of convergence, and points whose windows converge to the same location form one cluster (Comaniciu et al. 2002).
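The two clustering procedures described above can be sketched on one-dimensional gray-level values as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function names, the flat (uniform) kernel, the rule for merging nearby modes, and the sample intensities are all choices made here for the example.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """K-means on scalar values; k is the number of initial centroids."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: each centroid moves to the mean of its members.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # converged: the centroids no longer move
        centroids = updated
    return centroids


def mean_shift_1d(values, bandwidth, tol=1e-6, max_iter=100):
    """Mean shift with a flat kernel; returns the cluster centers found."""
    modes = []
    for start in values:
        center = float(start)
        for _ in range(max_iter):
            # Points inside the window of radius `bandwidth`.
            window = [v for v in values if abs(v - center) <= bandwidth]
            shifted = sum(window) / len(window)
            moved = abs(shifted - center)
            center = shifted
            if moved < tol:
                break  # the mean no longer changes
        # Assumed merge rule: centers closer than bandwidth / 2 are one mode.
        if not any(abs(center - m) < bandwidth / 2 for m in modes):
            modes.append(center)
    return sorted(modes)


# Hypothetical gray levels forming a dark and a bright group.
pixels = [10, 12, 11, 13, 200, 202, 198, 205]
kmeans_centers = kmeans_1d(pixels, centroids=[0, 255])
mean_shift_centers = mean_shift_1d(pixels, bandwidth=20)
```

In this sketch k-means needs the two initial centroids to be supplied, whereas mean shift only needs the bandwidth and finds the number of clusters on its own, which mirrors the difference between the two methods noted in the text.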

2.4.1.2.1 Mean shift applications

The Mean Shift algorithm is widely applied in discontinuity-preserving smoothing, segmentation and object tracking, which is very useful in fields such as the military, industry, sports and surveillance. For example, a moving object can be tracked by missiles.

Figure 2.12: Object tracking for surveillance
Source: Comaniciu, D., et. al. 2003. p. 578

Another application is tracking the players on a sports field. In Figure 2.13, the green and blue rectangles indicate the movement of the player in different directions. Mean Shift is performed for all the frames and the centroid is computed (Comaniciu et al. 2003).

Figure 2.13: Object tracking in a soccer game
Source: Comaniciu, D., et. al. 2003. p. 576

2.4.2 Classification

Classification algorithms are supervised learning algorithms in which the data is labeled. Once the proper image features are extracted, the gathered features should be classified properly, so that any instance of the problem space can be placed in the correct group. This step involves methods for matching each of the patterns derived from the feature extraction stage with one of the classes of the problem space. The number of attributes should be kept small, and each input feature vector is assigned according to its closeness to one of the reference vectors. The reference vectors are the training dataset variables, which are extracted beforehand from training samples. The Neural Network is one of the most common classification algorithms. Note that these approaches are not necessarily separable and are sometimes used in combination (Sinngh 2011). The ANN algorithm is explained below:

2.4.2.1 Artificial neural network

In general, ANNs solve complex problems by imitating the way the human brain works. Conventional computational methods instead use an algorithmic approach. They follow a set of preset commands to solve problems.
The processing ability of conventional computers is restricted to problems that can be defined and solved in advance, whereas neural networks are able to find patterns in information that no one knew existed (Sinngh 2011). Neural networks have therefore opened up a new and distinct way of answering problems. Common computational methods follow an algorithm, that is, a set of preset commands used to solve a problem, except in special cases where the computer needs additional information. This limits the processing ability of ordinary computers to problems whose solution is already known. Such algorithms are beneficial for processing a large number of instances, and they increase the analysis rate. An ANN, in contrast, is used to analyze the problem and find the best possible solution for a previously unsolved situation. Neural networks and common computational methods are not in competition; they complement each other. There are tasks that are more suitable for algorithmic methods and tasks that are more suitable for neural networks. Furthermore, some problems require a system that combines both methods to reach high precision (Sinngh 2011).

Artificial Neural Networks provide a different method for processing and analyzing information. It should not be inferred, however, that neural networks can be used to solve all computational problems. Common computational methods continue to be the best option for specific groups of problems such as accounting and warehousing. Neural Networks are moving in a direction in which the tools have the ability to learn and plan. Neural Network structures are capable of solving problems without the help of an expert and without external planning.
Since ANNs are modeled on the human brain, the structure of the biological network is explained first.

Neuron: The neuron is the cell of the human brain and is defined as its main structural element. Figure 2.14 shows a simplified representation of the structure of a neuron. After receiving input signals (in the form of electrical pulses) from other cells, a biological neuron combines these signals and, after performing a further operation on the combined signal, produces its output (Sinngh 2011).

Figure 2.14: Structure of the humans’ brain
Source: https://online.science.psu.edu/bisc004_activewd001/node/1907

As shown in Figure 2.14, a neuron is made up of four main parts: Dendrites, Soma (Cell Body), Axon and Synapse (Nerve Ending). Dendrites are the fibers that spread out from the center of the cell. They play the role of communication channels that transmit electrical signals to the cell center. At the end of the dendrites there is a special biological structure called the synapse, which acts as the connecting gateway of the communication channels. In fact, various signals are transmitted through the synapses and dendrites to the cell center, where they are combined. This combining operation can be described by a simple algebraic operation; even where this is not strictly the case, it can be modeled mathematically as an ordinary summation. A special function is then applied to the combined signal, and the output, in the form of another electrical signal, is transmitted through the axon (and its synapses) to other cells.

The human brain has the most evolved structure among living organisms, for the following reasons:
a. A human brain consists of at least ten to the power of eleven neural cells or neurons.
b. It is one of the biggest brains among all living creatures.
Intelligence is not due to the size of the brain alone; otherwise the elephant or the whale would be smarter than humans. Human intelligence is due to the number of connections between the neurons of the brain.

The neuron is the smallest unit of a Neural Network and determines its function. Each neuron has several parts:
i. Soma or Body: It is modeled as a mathematical function.
ii. Dendrites: Function inputs
iii. Axon: Function output

Artificial neural networks process information in a manner similar to the human brain. They consist of a number of highly interconnected processing elements (the neural cells) that work together in parallel to solve a particular problem. They learn from examples and cannot be programmed to perform a specific task. The examples should be carefully selected; otherwise, useful time is lost, or the network may even work incorrectly. The merit of a Neural Network lies in its ability to solve unknown problems, but its behavior can be unpredictable.

An ANN is a collection of neurons. The most important factors that differentiate the types and applications of Neural Networks are the type of neurons used, the layout or structure, and the input/output ranges. Artificial Neural Networks are built from artificial neurons that are very similar to biological neurons: each takes many inputs with different weights and produces an output that depends on these inputs. A biological neuron either fires or does not fire. The arrangement of the cells in the network is called the network architecture. In the architecture of a network, the number of layers and the connections between them are important. The network inputs are called the "input layer", the network outputs are called the "output layer", and, where present, the layers between these two are called hidden layers (Gonzalez et al. 2008). Figure 2.15 represents a simple structure of the Neural Network.

Figure 2.15: Simple structure of the Neural Network

a. Input layer: This layer receives the inputs and sends the input signal to the next layer according to the strength of its connections with that layer. The strength of the connection between one neuron and another is called the weight of that connection.
b. Middle (hidden) layer: The number of hidden layers and the number of their neurons are arbitrary. The middle layers must be carefully selected to produce the proper output.
c. Output layer: This group of neurons communicates with the outside world through its outputs.

The Neural Network acts like a function that takes as many inputs and produces as many outputs as there are input and output neurons, respectively. Among the different types of neural networks, some of the common ones are (Sinngh 2011):
i. Multi-Layer Perceptron
ii. Hopfield Network proposed by Hopfield (1982)
iii. Kohonen Feature Map (Kohonen 1997)
iv. Adaptive Resonance Theory

The following table shows the correspondence between the Artificial Neural Network and the Biological Neural Network (BNN):

Table 2.2: Correspondence between ANN and BNN

Biological Neural Network    Artificial Neural Network
Soma                         Neuron
Dendrite                     Input
Axon                         Output
Synapse                      Weight

2.5 RESEARCH BACKGROUND

Wei Wang et al. (2011) presented a method for infrared edge detection based on Cellular Neural Networks (CNN) and Distributed Genetic Algorithms (DGA). They trained the CNN with a distributed genetic algorithm. With special modifications, a CNN can be used to process infrared images. The results of their experiments showed that the edges detected by CNN-DGA were highly accurate. Moreover, compared with a CNN trained by the Particle Swarm Optimization algorithm, the speed of the proposed method was significantly improved.

Qingju et al. (2016) provided an edge detection method for infrared images based on an Ant Colony Optimization algorithm. Edge extraction is one of the most important tasks in detecting infrared images.
Ant Colony Optimization algorithms have properties that can enhance the efficiency of an edge detection system, control noise with high precision and extract the right information from the edge. In addition, the authors compared Ant Colony Optimization (ACO) with the classical Canny edge detection algorithm. The results of the experiments showed that ACO had high efficiency for the edge detection of infrared images.

Wang et al. (2011) provided a method based on ACO and a Sobel operator to identify the edges in infrared images. Their method used a Sobel operator to control the initial position of the ants in the ant colony algorithm. This method was able to detect thin edges well and to improve the overall performance of the algorithm. According to the report, the results of the tests indicated a good performance of the proposed method.

Qingju et al. (2016) provided an algorithm for edge detection in infrared imagery using a compound of morphological operators and the Canny detector. Effective extraction of the edge curves of infrared images helped in detecting the geometric properties of defects. The results of the experiments show that the algorithm is highly robust to noise and recognizes the geometric properties of the edge better.

2.6 CONCLUSION

Working with infrared images is very important, and one of the important operations in processing them is edge detection, which is valuable because it provides useful data. Accordingly, in this chapter, infrared images were explained first. Next, the concepts and definitions of image processing were described along with image segmentation. Then, the relevant clustering and classification concepts and algorithms were presented, and last, part of the previous work done in this field was reviewed. The rest of this thesis is organized as follows. In the third chapter, the proposed method will be explained, and the tools used in this study will be described in detail.
In the fourth chapter, the proposed methods will be evaluated against other methods and compared with each other. In the fifth chapter, the results will be discussed and proposals for future work will be presented.

3. DATA AND METHOD
3.1 INTRODUCTION

In the previous chapters, the importance of image processing and infrared images was explained, and some of the major methods used in image processing were considered. After reviewing the relevant methods and identifying the deficiencies of existing research, a new approach is proposed to detect infrared image edges. Therefore, in this chapter, a general research plan is presented. Then the tools used in this research, image segmentation, extraction of the expected area using the Neural Network, and edge detection are discussed. Finally, the way in which the proposed method is evaluated is described.

3.2 APPLIED TOOLS

To implement the proposed method, MATLAB software is used (MathWorks 2015). Based on the demands of the research, its Neural Network, clustering and image processing toolboxes were used. The name of this software is derived from MATrix LABoratory (MATLAB). MATLAB was first designed to give easier access to the matrix software developed by the LINPACK¹ and EISPACK² projects. MATLAB is a high-level language implemented in C. It has a high technical capability for computing and visualization. MATLAB is thus a modern programming environment that includes high-level data structures and error-detection tools, and it also supports object-oriented programming. These features make it a great tool for learning and research purposes. For solving technical problems, this programming language has many advantages over common programming languages such as C. MATLAB is an interactive language whose basic data element is an array that does not require dimensioning. Its software package has been commercially available since 1984 and now serves as a standard tool in many universities and industries around the world. It also provides easy-to-use matrix, computational and functional operations, various algorithms and easy communication with other programming languages.
MATLAB software has a wide range of applications, including image processing, Neural Networks, control design and Artificial Intelligence. As stated, MATLAB is written in C for speed and high performance, but its graphical interface is implemented in Java. One of the main advantages of MATLAB is that it is easy to learn and that extensive documentation is available for learning and use.

¹ MATrix LABoratory
² LINear system PACKage

The other programming language applied in this study is R (R Core Team 2017). R is a programming language for statistical computing and graphics supported by the R Foundation for Statistical Computing.

3.3 OUTLINE OF THE THESIS

In general, working with raw images and identifying and extracting the desirable properties of any image require fundamental step-by-step processes. This study also focuses on image preparation and edge extraction. In the proposed method, the first step is image segmentation, which is performed with two clustering algorithms (K-means and Mean Shift clustering).

First, using K-means clustering, the image is taken from the input unit and segmented into K pieces. Then, using the Multilayer Perceptron (MLP) neural network, which was previously trained with the training dataset, the ROI is extracted from among the k available pieces. The ROI is the cluster that contains the edges. The extracted region is then converted to a binary image and sent to the edge detection unit, which extracts the edges using morphological operators.

The same procedure is repeated using Mean Shift clustering instead of the K-means clustering algorithm. The reason for this choice is a shortcoming of k-means: the number of clusters must be defined in advance, which can be time-consuming, especially in the test part, in which the test image is sent to the Neural Network. Mean Shift clustering has the advantage that it figures out the number of clusters automatically.
This time, after the image is received and pre-processed to eliminate possible noise, it is segmented by the Mean Shift clustering algorithm. Then, using the MLP neural network, which was previously trained with the training dataset, the ROI is extracted from among all available pieces. The extracted region is converted to a binary image and sent to the edge detection unit, which extracts the edges using morphological operators. The diagram of the proposed method is drawn in Figure 3.1:

Figure 3.1: General steps of the proposed method

In this chapter, Figure 3.1 is explained using two images selected from the dataset. One sample image is used to explain the training part and one sample image is used to explain the ROI extraction. All of the above steps and processing methods are presented one by one with the resulting images and tables. Figure 3.3.a is the example image selected to illustrate the training part.

3.4 PRE-PROCESSING

As stated in the previous chapters, infrared (thermal) images have low precision and are prone to various kinds of noise, especially salt and pepper noise. Noise is removed to improve the image quality for better visualization. Removing the noise is also necessary because it is a problem for one of the most basic stages of image processing, namely segmentation. In fact, if the image is noisy, the segmentation step has a much lower success rate. It is therefore best to reduce the noise as much as possible before the segmentation step. In this section, salt and pepper noise is introduced, and then the method for removing or reducing it is explained.

If the image contains salt and pepper noise, black and white spots appear on most parts of the image. These dots fall on the pixels of the original image and lower its quality. Depending on the percentage of noise in the image, the dispersion of these black and white points is low or high.
One of the ways to remove noise from a digital image is to filter the noisy image. With a suitable filter, a processed image with less noise, or in the best case without any noise, is obtained. The filter itself is a small image, usually a square window whose number of rows and columns is odd, for example 3 rows and 3 columns, or 5 rows and 5 columns, and so on. To remove salt and pepper noise from a digital image, a filter called the median filter is used. The median filter arranges all the neighbors of a central pixel in ascending order, selects the middle element of the ordered values and uses it to replace the central pixel. The process of applying the median filter is shown in Figure 3.2:

Figure 3.2: Median Filter
Source: Marques, O. 2011, p.217

The filtering operation is performed as shown in Figure 3.2, and the structure of this filter is based on sorting. In this example a 3 by 3 window is selected, all of its pixel values are sorted, the middle one is selected, and the value 5 located at the center of the window is replaced by the value 8. With this simple idea, much of the noise, such as salt and pepper noise, is removed easily. The reason is that salt and pepper noise consists of isolated outlier values that are unrelated to the image content, so smoothing the image eliminates their effect (Marques et al. 2011).

Figure 3.3 shows the result of applying the median filter to the infrared image. The image on the left side is the original image and the image on the right side is the filtered one.

Figure 3.3: Applying median filter

The applied filter is a three by three matrix.

3.5 IMAGE SEGMENTATIONS

Because of the significance of the segmentation step in this research, the segmentation process of the image is described in this section.
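As a minimal sketch of the 3 by 3 median filter described in Section 3.4, the window sorting can be written with NumPy as follows. The function name `median_filter_3x3` and the border handling (edge pixels are replicated) are assumptions made for this illustration; the thesis itself relies on MATLAB for this step.

```python
import numpy as np

def median_filter_3x3(image):
    """Replace every pixel by the median of its 3x3 neighborhood (sketch)."""
    image = np.asarray(image)
    rows, cols = image.shape
    # Assumption: borders are handled by replicating the edge pixels.
    padded = np.pad(image, 1, mode="edge")
    # Stack the nine shifted copies so that axis 0 holds each pixel's window.
    windows = np.stack(
        [padded[r:r + rows, c:c + cols] for r in range(3) for c in range(3)],
        axis=0,
    )
    # The median is the middle element of the nine sorted window values.
    return np.median(windows, axis=0).astype(image.dtype)

# A flat gray patch with one "salt" pixel and one "pepper" pixel.
noisy = np.full((5, 5), 80, dtype=np.uint8)
noisy[2, 2] = 255
noisy[1, 3] = 0
filtered = median_filter_3x3(noisy)
```

Because an isolated outlier never reaches the middle of the sorted window, both noisy pixels are replaced by the surrounding gray level.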
After the filtering operation (noise removal), the image is ready for segmentation and is divided into several regions. The segmentation in this study is used to separate the different clusters of the image. K-means and Mean Shift clustering methods are used separately for segmentation in this research. For this purpose, two separate programs were developed, one using k-means and the other using the Mean Shift clustering algorithm. The important point is that the same clustering method is applied both for training the Neural Network and for testing the dataset. The basic structure of each program includes two parts: training the neural network, and testing and evaluating the method. The aim of this chapter is to explain the method for training and testing; the results are explained in the next chapter. For this purpose, both algorithms are presented using a sample image. The NN is trained with 29 images selected from the relevant dataset¹.

¹ J.-Davis and M.-Keck, A two stage approach to person detection in thermal imagery

3.5.1 K-means Clustering

The general k-means clustering steps in this study are presented in Figure 3.4. In this research, the k-means clustering algorithm is applied using the Euclidean distance, which is calculated by Formula 3.1:

$$\mathrm{Distance}(X_j, C_k) = \sqrt{\sum_{i=1}^{d} \left(X_{j,i} - C_{k,i}\right)^2} \tag{3.1}$$

In Formula 3.1, j indexes the data points, which in this study are the image pixels, and k indexes the clusters. C represents the centers of the clusters and X is the data. The dimension of the data, d, is equal to one in this study, because the image used in this research is a gray image whose pixels only take a value between zero and 255. It should be noted that the number of clusters in this method must be determined by the user. The optimal number of clusters can be estimated with the elbow criterion.
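For gray levels (d = 1), Formula 3.1 reduces to an absolute difference, so the k-means loop can be sketched as below. This is an illustrative sketch rather than the program used in this study: the function name `kmeans_1d`, the evenly spaced initial centroids and the returned sum of squared errors (used here to show the quantity that the elbow criterion inspects) are assumptions.

```python
import numpy as np

def kmeans_1d(pixels, k, max_iter=100):
    """Cluster gray levels with k-means (illustrative sketch)."""
    pixels = np.asarray(pixels, dtype=float).ravel()
    # Assumption: initial centroids are spread evenly over the gray range.
    centroids = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(max_iter):
        # Assignment step: with d = 1 the Euclidean distance is |x - c|.
        labels = np.abs(pixels[:, None] - centroids[None, :]).argmin(axis=1)
        # Update step: each centroid moves to the mean of its members.
        updated = centroids.copy()
        for c in range(k):
            members = pixels[labels == c]
            if members.size:
                updated[c] = members.mean()
        converged = np.allclose(updated, centroids)
        centroids = updated
        if converged:  # the centroids no longer move
            break
    labels = np.abs(pixels[:, None] - centroids[None, :]).argmin(axis=1)
    sse = float(((pixels - centroids[labels]) ** 2).sum())
    return centroids, labels, sse

gray_levels = [0, 2, 4, 100, 102, 104, 250, 252, 254]
centroids, labels, sse = kmeans_1d(gray_levels, k=3)
# Error for k = 1, 2, 3: the values an elbow plot would show.
sse_by_k = {k: kmeans_1d(gray_levels, k)[2] for k in (1, 2, 3)}
```

Running the function for a range of k values and plotting the resulting errors gives the curve on which the elbow is read off.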
Figure 3.4: Block diagram explaining the k-means clustering steps

In this part, the results of clustering the image with K-means and segmenting it into several regions are presented. Figure 3.5 shows the result of clustering the sample infrared image. In this figure, the upper image (gray image) is the pre-processed image and the remaining images are the resulting clusters.

Figure 3.5: Clusters extracted using k-means clustering

As Figure 3.5 shows, the original image is segmented into six clusters. One point to note is that, among these six resulting images, only one should be used for edge detection, so that the person at the corner of the original image can be detected. The object to be found in this image is marked with a red circle in Figure 3.11. Looking at the images (a to f) in Figure 3.5, cluster e contains the desired object. In Figure 3.5.e the white areas indicate the pixels of the original image that belong to this cluster. Accordingly, the object to be identified is located in cluster e. Unlike humans, computers cannot figure out the right cluster automatically. Therefore, a Neural Network algorithm needs to be applied to enable the computer to extract the correct region.

3.5.1.2 Optimal number of clusters

There are many machine learning algorithms that try to find the right number of clusters. Although this is still a subject of ongoing research, the elbow criterion is applied in this study, and it estimates the number of clusters correctly in most cases. The main idea of the elbow criterion is to choose a maximum number of clusters and then compute the Sum of Squared Errors (SSE) for each number of clusters up to this maximum. Obviously, if every point is considered as its own cluster, the SSE is zero, because each point coincides with its own centroid.
Therefore, the goal is not simply to minimize the SSE. The number of clusters is increased while the SSE decreases, and the elbow point is the value of k after which adding further clusters reduces the SSE only slightly. The slope of the curve before the elbow point is steeper than the slope after this point. This idea may not be useful in all cases, especially for datasets that cannot be clustered well.

$$SSE = \sum_{i=1}^{k} \sum_{Y \in C_i} \left(Y - \bar{Y}_i\right)^2 \tag{3.2}$$

Here $\bar{Y}_i$ is the mean of cluster $C_i$. The number of clusters for the following example is selected as the result of the elbow criterion. The elbow point is 6.

Figure 3.6: Elbow criterion

As stated at the beginning of this chapter, each step is simulated with all the training images as inputs. Therefore, the optimal cluster number of each image is also estimated using the elbow criterion. Although the elbow criterion is applied to find the optimal number of clusters, it may not always be able to find the correct number. Therefore, Mean Shift clustering is also applied in this study.

3.5.2 Mean Shift Clustering

Mean Shift is a non-parametric, mode-seeking clustering algorithm. Unlike the k-means clustering method, the algorithm itself figures out the number and the location of the clusters. The main idea of Mean Shift is based on Kernel Density Estimation (KDE). KDE is used to find the distribution associated with the dataset by means of a kernel with a given bandwidth. The kernel is in fact a weighting function, and different kernels result in different clusters. One of the popular kernels is the Gaussian kernel:

$$K(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2}\frac{\lVert x \rVert^2}{\sigma^2}} \tag{3.3}$$

Another popular kernel is the flat kernel, which is the kernel used in this thesis:

$$k(x) = \begin{cases} 1 & \lVert x \rVert \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{3.4}$$

Figure 3.7: Flat kernel
Source: Yizong 1995. p. 792

Mean Shift considers the feature space as a sample drawn from an underlying probability density function. The aim is to find the densest regions of the feature space; each densest region is a mode, and the modes determine the clusters.
Mean Shift computes the mean of the window considered around each data point, and the center is shifted to the computed mean. This is repeated until the mean no longer changes, that is, until it converges; in each step the centroid moves toward a denser region. The Mean Shift function is as follows:

$$m(x) = \frac{\sum_{i=1}^{n} g\!\left(\frac{x - x_i}{h}\right) x_i}{\sum_{i=1}^{n} g\!\left(\frac{x - x_i}{h}\right)} - x \tag{3.5}$$

where g is obtained from the derivative of the kernel, $k'(x)$, and h is the bandwidth. The kernel density function estimates the density.

After noise reduction, the image should be segmented into the optimal number of clusters; for this purpose the image is sent to the clustering unit. As mentioned in the previous part, the bandwidth is the value that must be initialized. The flat kernel is used for clustering, and the bandwidth is estimated from a Gaussian density estimate of the image. Figure 3.8 shows the density structure of the image.

Figure 3.8: Image density structure

The quantile structure of the image is:

Table 3.1: Extracted Quantile

0%     25%    50%    75%    100%
0      77     84     92     255

This value for the bandwidth is used for clustering. The resulting clusters are shown in Figure 3.9 and Figure 3.10. Figure 3.9 indicates the clusters and the values of their members, which lie between 0 and 255.

Figure 3.9: The resulting clusters by Mean Shift clustering

Figure 3.10.a, Figure 3.10.b and Figure 3.10.c represent the extracted clusters. Of these 3 clusters, only one (c) includes the ROI.

Figure 3.10: Clusters extracted using Mean Shift Clustering

3.6 ROI EXTRACTION

The previous section described the image segmentation, and the target region was identified visually. This section describes how the program identifies it. The Neural Network is used to classify the regions and identify the region of interest, which is the cluster shown in Figure 3.10.c. To apply the neural network, features must be extracted from the regions of the image.
3.6.1 Feature Extraction from The Image Regions

In this study, simple statistical properties are used as the features of the regions. These features are:
i. The minimum pixel value of each cluster
ii. The maximum pixel value of each cluster
iii. The average pixel value of each cluster

These properties are used because they are simple to acquire and separate the clusters well, so that the clusters can easily be distinguished from each other. Obtaining these features is easy and reduces the implementation complexity of the method.

Figure 3.11: Expected object for edge detection in the IR image

K-means clustering: Table 3.2 shows the characteristics of the regions shown in Figure 3.5. As stated above, all the tables and images represent the results for Figure 3.3.a. According to the defined features, each region has three characteristics.

Table 3.2: Extracted features for each region using K-means clustering

Cluster    Min    Mean     Max
a          83     86.97    92
b          40     67.92    73
c          0      11.25    39
d          74     87.67    89
e          135    170.9    212
f          93     97.5     134

As the table indicates, region e has the largest values.

Mean Shift clustering: Table 3.3 shows the characteristics of the regions shown in Figure 3.10, which were clustered by Mean Shift clustering. According to the features defined above, each region has three characteristics. Figure 3.10.c is the desired cluster. Unlike with k-means, this time the desired region does not have the highest values, because Mean Shift clustering is based on density.

Table 3.3: Extracted features for each region using Mean Shift clustering

Cluster    Min    Max    Mean
a          35     133    84.2
b          36     231    83.99
c          36     215    84.47

3.6.2 Artificial Neural Network

After extracting the features, the next step is training the ANN. Unlike humans, computers cannot directly find the ROI among all regions; they need to be trained to identify it.
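The three region features of Section 3.6.1 (minimum, maximum and mean pixel value of each cluster) can be computed directly from the image and its cluster labels. The following is a small sketch under the assumption that the segmentation step provides a label array of the same shape as the image; the function name `cluster_features` is hypothetical.

```python
import numpy as np

def cluster_features(image, labels):
    """Return {cluster id: (min, max, mean)} of the pixel values (sketch)."""
    image = np.asarray(image, dtype=float)
    labels = np.asarray(labels)
    features = {}
    for cluster_id in np.unique(labels):
        values = image[labels == cluster_id]  # pixels of this cluster only
        features[int(cluster_id)] = (
            float(values.min()),
            float(values.max()),
            float(values.mean()),
        )
    return features

# Tiny 2x3 example with two clusters (a dark one and a bright one).
example_image = np.array([[0, 10, 200], [20, 210, 220]])
example_labels = np.array([[0, 0, 1], [0, 1, 1]])
features = cluster_features(example_image, example_labels)
```

Each tuple in the result corresponds to one row of feature values of the kind listed in Tables 3.2 and 3.3.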
The structure of ANNs is based on simulating the human brain, so after training the ANN is able to find the ROI automatically. The Neural Network used in this study is a multilayer perceptron trained by the Error Back Propagation method. Its structure is shown in Figure 3.12. As shown in Figure 3.12, the Neural Network has three layers, which means that it has one hidden layer. Input data is needed in order to build the NN. The inputs are the features extracted from all clusters of each image. The number of inputs of the Neural Network is three, which is equal to the number of extracted properties (the minimum pixel value of each cluster, the maximum pixel value of each cluster and the average pixel value of each cluster). The number of hidden neurons is six, and the network has one output. If this output is greater than one, this means that the area under consideration is the ROI, otherwise the area is not the one with edges.

Figure 3.12: Structure of Neural Network in this study

With the correct data for Neural Network training and the right training method, the ROI of each image can be extracted correctly. For this purpose, the three mentioned features are extracted from all clusters of every image in the training dataset and gathered as inputs for the neural network, while the output states whether the region is the intended one or not, which is done by labelling.

K-means clustering: The result of the K-means clustering method is given in Table 3.4, where the label column gives the label of each region. The labels are used for training the neural network. A label with the value 1 represents the ROI, while a label with the value -1 represents an unintended region. These labels are determined by the user, so that with proper training the ROI of each image is extracted well.
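A minimal sketch of a 3-6-1 multilayer perceptron trained by error back-propagation on the K-means features of Table 3.2 is given below. Several details are assumptions made only for this illustration and are not stated in the thesis: the tanh activations, the scaling of the features by 255, the learning rate, the number of epochs, the random seed, and the labels, which follow the description in the text (cluster e labeled 1, all other clusters labeled -1). The function names `train_mlp` and `predict` are hypothetical.

```python
import numpy as np

def train_mlp(features, labels, hidden_units=6, epochs=2000,
              learning_rate=0.1, seed=0):
    """Train a one-hidden-layer perceptron by back-propagation (sketch)."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, size=(features.shape[1], hidden_units))
    b1 = np.zeros(hidden_units)
    w2 = rng.normal(0.0, 0.5, size=(hidden_units, 1))
    b2 = np.zeros(1)
    for _ in range(epochs):
        # Forward pass (assumption: tanh in both layers).
        hidden = np.tanh(features @ w1 + b1)
        output = np.tanh(hidden @ w2 + b2)
        # Backward pass for the squared error 0.5 * sum((output - labels)^2).
        delta_out = (output - labels) * (1.0 - output ** 2)
        delta_hidden = (delta_out @ w2.T) * (1.0 - hidden ** 2)
        # Gradient-descent updates of weights and biases.
        w2 -= learning_rate * (hidden.T @ delta_out)
        b2 -= learning_rate * delta_out.sum(axis=0)
        w1 -= learning_rate * (features.T @ delta_hidden)
        b1 -= learning_rate * delta_hidden.sum(axis=0)
    return w1, b1, w2, b2

def predict(params, features):
    """Network output in [-1, 1] for each feature row."""
    w1, b1, w2, b2 = params
    return np.tanh(np.tanh(features @ w1 + b1) @ w2 + b2)

# (min, max, mean) of clusters a-f from Table 3.2, scaled to [0, 1].
X = np.array([[83, 92, 86.97], [40, 73, 67.92], [0, 39, 11.25],
              [74, 89, 87.67], [135, 212, 170.9], [93, 134, 97.5]]) / 255.0
# Assumed labels: cluster e is the ROI (1), the others are not (-1).
y = np.array([-1, -1, -1, -1, 1, -1], dtype=float).reshape(-1, 1)

params = train_mlp(X, y)
scores = predict(params, X)
```

In such a sketch a region would be accepted as the ROI when its score exceeds a chosen threshold; the threshold itself is a design choice of the implementation.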
An important point about the labels is the property values of the intended region. As the table indicates, all three features of the intended region are higher than those of the other regions. This is expected, because the ROI is the brightest region.

Table 3.4: Region labeling for training the NN (K-means clustering)

Cluster    Min    Mean     Max    Label
a          83     86.97    92     -1
b          40     67.92    73     -1
c          0      11.25    39     -1
d          74     87.67    89     -1
e          135    170.9    212    1
f          93     97.5     134    -1

Mean Shift Clustering: The same step is also implemented for the Mean Shift algorithm. The result is given in Table 3.5, where the last column gives the label of each region. A label with the value 1 represents the ROI, while a label with the value -1 represents an unintended region. These labels are determined by the user, so that with proper training the ROI of each image is extracted well. An important point about the labels is the property values of the intended region: in this algorithm, unlike k-means, the intended cluster does not have the highest values.

Table 3.5: Region labeling for training the NN (Mean Shift clustering)

Cluster    Min    Max    Mean     Label
a          35     133    84.2     -1
b          36     231    83.99    -1
c          36     215    84.47    1

3.7 EDGE DETECTION

After training the ANN, the next step is testing the algorithm, and finally the edges are detected in this part. The first two steps are the same as in the previous part, but instead of labelling, this time the trained ANN is used for ROI extraction. The selected image is first segmented using a proper clustering method. Then the ROI is extracted with the ANN. The NN is tested with various examples from different datasets. The results are stated in the next chapter; in this part, the structure of the test part is defined. The basic structure of edge detection includes these steps:

i. First, each image is pre-processed using the median filter.

Figure 3.13: Pre-processing

ii. The filtered image is clustered using one of the clustering algorithms.
This means that if K-means clustering is selected as the segmentation method, the image is segmented by this method, and the optimal number of clusters is again estimated by the elbow criterion. For Mean Shift, the bandwidth is calculated first and then the image is segmented by the Mean Shift clustering method. Figure 3.14 shows the clustering results for both methods.

Figure 3.14: Clustering using both methods

iii. The clusters are now sent to the ANN unit, in which the ROI is extracted. This time the region of interest is extracted by simulation with the trained ANN. After ROI extraction, the binary image (answer) is built. Figure 3.15 shows the built image:

Figure 3.15: Binary image rebuilt from the extracted ROI

iv. In the last step, the extracted ROI is sent to the edge detection unit. At this stage, the edges of the infrared image are determined from the ROI that was extracted previously. This section includes two main parts:
a. Post-processing step
b. Edge detection using morphological features

3.7.1 Post-Processing

Once the Region of Interest has been selected from among all regions, it should be sent to the edge detection unit. In this part the image is post-processed to improve its quality for better visualization and better edge extraction. The aim of this step is to delete unwanted points and white noise that were missed in the pre-processing step. The post-processing unit includes two parts:

i. Erosion: After ROI extraction, the ROI must be sent to the edge detection unit. Tiny unwanted points are likely to be present in the ROI, so a post-processing operation is required. This step is performed with the morphological erosion operator, which is used to eliminate points smaller than 3x3 pixels. The SEE matrix is used as the structuring element for this operation.
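The morphological operations used in the post-processing and edge detection steps of Section 3.7 (erosion with a 3x3 element, dilation with a 5x5 element, and the edge taken as the image minus its erosion) can be sketched for binary images as follows. The function names `erode`, `dilate` and `morphological_edge` are hypothetical, and treating everything outside the image as background is an assumption of this sketch.

```python
import numpy as np

def erode(binary, size):
    """Binary erosion with a size x size structuring element of ones."""
    binary = np.asarray(binary, dtype=bool)
    rows, cols = binary.shape
    pad = size // 2
    # Assumption: pixels outside the image count as background (False).
    padded = np.pad(binary, pad, mode="constant", constant_values=False)
    result = np.ones((rows, cols), dtype=bool)
    for dr in range(size):
        for dc in range(size):
            # A pixel survives only if every neighbor in the window is set.
            result &= padded[dr:dr + rows, dc:dc + cols]
    return result

def dilate(binary, size):
    """Binary dilation with a size x size structuring element of ones."""
    binary = np.asarray(binary, dtype=bool)
    rows, cols = binary.shape
    pad = size // 2
    padded = np.pad(binary, pad, mode="constant", constant_values=False)
    result = np.zeros((rows, cols), dtype=bool)
    for dr in range(size):
        for dc in range(size):
            # A pixel is set if any neighbor in the window is set.
            result |= padded[dr:dr + rows, dc:dc + cols]
    return result

def morphological_edge(binary):
    """Edge pixels: the image minus its 3x3 erosion."""
    binary = np.asarray(binary, dtype=bool)
    return binary & ~erode(binary, 3)

# A 5x5 object plus one isolated noise pixel in the corner.
roi = np.zeros((9, 9), dtype=bool)
roi[2:7, 2:7] = True
roi[0, 8] = True
eroded = erode(roi, 3)        # removes the isolated pixel, shrinks the object
cleaned = dilate(eroded, 5)   # grows the remaining object again
edges = morphological_edge(cleaned)
```

With a 3x3 erosion followed by a 5x5 dilation the object ends up slightly larger than it started, which is why the edge is computed on the dilated image.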
The following equation is used to eliminate the unwanted tiny points, where Image is the binary ROI image and SEE is the structuring element for erosion:

$$\text{ErodeImage} = \text{Image} \ominus \text{SEE} = \{\, z \mid (\text{SEE})_z \cap \text{Image}^{c} = \emptyset \,\} \tag{3.6a}$$

$$\text{SEE} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \tag{3.6b}$$

The resulting image is eroded. With this step the white noise is eliminated.

Figure 3.16: Binary image after Erosion

ii. Dilation: The eroded image is then dilated for better visualization, which returns the remaining regions to roughly their original state. The dilation equation, with SED as the structuring element for dilation, is:

$$\text{Image} = \text{ErodeImage} \oplus \text{SED} = \{\, z \mid (\widehat{\text{SED}})_z \cap \text{ErodeImage} \neq \emptyset \,\} \tag{3.7a}$$

$$\text{SED} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix} \tag{3.7b}$$

Erosion makes the regions narrower, so dilation is applied as the inverse operation to widen them again. The resulting image is ready for edge detection.

Figure 3.17: Binary image after Dilation

3.7.2 Edge Detection Using Morphological Operators

After the post-processed image is received, the edges are detected. The obtained image is binary. In this step, instead of common edge operators, edge detection is performed with morphological operators. The morphological equation is given in Equation 3.8:

$$\text{ImageEdge} = \text{Image} - (\text{Image} \ominus \text{SEE}) \tag{3.8}$$

Figure 3.18: Detected edges

3.8 CHAPTER SUMMARY

Edge detection of infrared images is of great importance. In this chapter, building on previous work, a new method for edge detection of infrared images was presented. The overall research process was explained at the beginning of the chapter. Next, the pre-processing of infrared images was described, followed by image segmentation and extraction of the region of interest. Finally, infrared image edge detection based on morphological operators was defined. The proposed method is evaluated in the fourth chapter.

4. FINDINGS
4.1 INTRODUCTION

The previous chapters described the proposed method. This chapter first describes the evaluation methodology and the dataset selected for the research, and then presents the results of the proposed method.

4.2 EVALUATION

To validate and evaluate the proposed method, a suitable IR image dataset is selected. Then, based on the information available for the dataset, the efficiency of the proposed method is evaluated.

4.2.1 Dataset

The dataset used in this study is the OSU Thermal Pedestrian Database, which contains 10 folders of images. The general information about the dataset is provided by Davis and Keck (2005)¹:

Data details:
Pedestrian intersection on the Ohio State University campus
Number of sequences = 10
Total number of images = 284
Format of images = 8-bit grayscale bitmap
Image size = 360 x 240 pixels
Sampling rate = non-uniform, less than 30 Hz
Environmental information for each sequence provided in subdirectories
Ground truth provided in subdirectories as a list of bounding boxes (with approximately the same aspect ratio) around people. For the ground truth data, we selected only those people that were at least 50% visible in the image (i.e., highly occluded people were not selected).

¹ J. Davis and M. Keck, A two-stage approach to person detection in thermal imagery.

4.3 EVALUATION RESULT

In this study, a new method was proposed for edge detection of infrared images. The rest of this chapter describes the evaluation and its results. Given the dataset described above and what is expected of an edge detection method, the proposed method is expected to identify the edges properly.

4.3.1 Evaluation Method

In this part clustering and the ANN are used again. K-means clustering depends on determining the number of clusters, so the elbow criterion is used for this purpose.
For Mean Shift clustering, a proper bandwidth must be defined. The results for both clustering methods are given below. In this part, 16 images are selected randomly from the different folders of the dataset and the results of the proposed method are given. Each image is evaluated with both K-means and Mean Shift clustering, and the results are presented one by one. The green and red ellipses are drawn according to the Ground Truth text file. In every figure, (a) is the original image, (b) shows the edges extracted with K-means clustering and (c) shows the edges extracted with Mean Shift clustering.

Image 1:

Figure 4.1: Detected edges (b,c) of image img_00003 (a) in folder 0002

As the results indicate, both methods detected the edges perfectly. In Figure 4.1.c, however, the edges of the person at the top-right corner are detected with better precision. The reason is that the number of clusters is determined by the program itself, which is the advantage of this method.

Image 2:

Figure 4.2: Detected edges (b,c) of image img_00014 (a) in the folder 0002

The edges of Figure 4.2.a are also detected correctly by both methods. As in the previous example, the edges of two objects are detected better by the Mean Shift method (Figure 4.2.c).

Image 3:

Figure 4.3: Detected edges (b,c) of image img_00028 (a) in the folder 0002

As Figure 4.3 indicates, both methods detected the edges correctly. Both methods also detected an object (the lamp post light) that is not defined as an identifiable object in the ground truth file. This follows from the structure of the program, which recognizes objects based on their infrared radiation. It is therefore not an error of the program; rather, the Ground Truth file defines only humans as recognizable objects.
The same effect can be seen in the following examples.

Image 4:

Figure 4.4: Detected edges (b,c) of image img_00001 (a) in the folder 0004

The edges of Figure 4.4.a are detected better by Mean Shift clustering, and the lamp post light is detected by both methods.

Image 5:

Figure 4.5: Detected edges (b,c) of image img_00009 (a) in the folder 0004

The edges of Figure 4.5.a are detected correctly by both methods. The lamp post light is detected by both methods.

Image 6:

Figure 4.6: Detected edges (b,c) of image img_00018 (a) in the folder 0004

Image 7:

Figure 4.7: Detected edges (b,c) of image img_00001 (a) in the folder 0006

For Figure 4.6 the edges are detected correctly by both methods. Figure 4.7.b shows better results than Figure 4.7.c. Both Figure 4.7.b and Figure 4.7.c include the lamp post light.

Image 8:

Figure 4.8: Detected edges (b,c) of image img_00009 (a) in the folder 0006

For this example, K-means detected the edges better.

Image 9:

Figure 4.9: Detected edges (b,c) of image img_00018 (a) in the folder 0006

Image 10:

Figure 4.10: Detected edges (b,c) of image img_00001 (a) in the folder 0007

Both examples (Figure 4.9 and Figure 4.10) give the same results for both clustering methods.

Image 11:

Figure 4.11: Detected edges (b,c) of image img_00011 (a) in the folder 0007

The results of the K-means method are better, especially for the pedestrian at the top-right corner.

Image 12:

Figure 4.12: Detected edges (b,c) of image img_00022 (a) in the folder 0007

The results of both methods are close.

Image 13:

Figure 4.13: Detected edges (b,c) of image img_00001 (a) in the folder 0008

The results of the K-means method are better, although the lamp post light is not detected by Mean Shift.

Image 14:

Figure 4.14: Detected edges (b,c) of image img_00012 (a) in the folder 0008

The results of both methods for Figure 4.14 are close.
Image 15:

Figure 4.15: Detected edges (b,c) of image img_00024 (a) in the folder 0008

Image 16:

Figure 4.16: Detected edges (b,c) of image img_000032 (a) in the folder 0009

The results of both methods for Figure 4.15 and Figure 4.16 are close.

The images above were processed with the following initial values, obtained with the elbow criterion for K-means and from the quantile of the density for Mean Shift clustering:

i. K-means Clustering Initialization: The number of clusters is eight for the first folder, six for the second folder, five for the fourth folder, six for the sixth folder, seven for the seventh folder, seven for the eighth folder and ten for the ninth folder.

ii. Mean Shift Clustering Initialization: The selected bandwidth is nineteen for the images of the second folder, twenty for the fourth folder, sixteen for the sixth folder, fifteen for the seventh folder, seventeen for the eighth folder and sixteen for the ninth folder.

4.3.1.2 Confusion matrix

In this section the performance of both clustering methods is computed using confusion matrices, with the ground truth of the dataset as the reference. In the confusion matrix, Pixels is the total number of pixels. TP (true positive) is the number of edge pixels that are correctly detected as edges. TN (true negative) is the number of non-edge pixels that are correctly detected as non-edge. FP (false positive) is the number of non-edge pixels that are incorrectly detected as edges. FN (false negative) is the number of edge pixels that are incorrectly detected as non-edge.
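The four counts defined above can be obtained by comparing a detected edge map with a reference edge map pixel by pixel. The following is a minimal sketch in plain Python, not the program used in this study: the function name `confusion_counts` and the toy 3 x 4 maps are hypothetical, and it assumes that both maps are already available as binary images of equal size (how a pixel-level reference is derived from the bounding-box ground truth is not covered here).

```python
def confusion_counts(detected, truth):
    """Count TP, FN, FP and TN for two equally sized binary edge maps.

    Both arguments are lists of rows holding 0 (non-edge) or 1 (edge).
    """
    if len(detected) != len(truth):
        raise ValueError("edge maps must have the same number of rows")
    tp = fn = fp = tn = 0
    for det_row, true_row in zip(detected, truth):
        if len(det_row) != len(true_row):
            raise ValueError("edge maps must have the same number of columns")
        for d, t in zip(det_row, true_row):
            if t and d:
                tp += 1      # edge pixel detected as edge
            elif t:
                fn += 1      # edge pixel missed
            elif d:
                fp += 1      # non-edge pixel reported as edge
            else:
                tn += 1      # non-edge pixel correctly left out
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Hypothetical toy maps (3 x 4 pixels), for illustration only.
truth = [[0, 1, 1, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0]]
detected = [[0, 1, 0, 0],
            [0, 1, 1, 0],
            [0, 0, 0, 0]]

counts = confusion_counts(detected, truth)
print(counts)
```

The four counts always add up to the number of pixels, which gives a quick consistency check for any of the confusion matrices.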
In this part the confusion matrices of all 16 images (Figure 4.1 to 4.16) are calculated. Table 4.1 and Table 4.2 show the confusion matrices of two images (Figure 4.1 and Figure 4.3). The remaining confusion matrices (Table-2 to Table-15) are available in Appendix A.2.

Table 4.1: Confusion matrix of Figure 4.1 (Pixels = 86400)

| Method | Actual | Detected = Yes | Detected = No |
|---|---|---|---|
| K-means Clustering | Yes | TP = 3080 | FN = 0 |
| K-means Clustering | No | FP = 0 | TN = 83320 |
| Mean Shift Clustering | Yes | TP = 3014 | FN = 66 |
| Mean Shift Clustering | No | FP = 0 | TN = 83320 |

Table 4.2: Confusion matrix of Figure 4.3 (Pixels = 86400)

| Method | Actual | Detected = Yes | Detected = No |
|---|---|---|---|
| K-means Clustering | Yes | TP = 638 | FN = 0 |
| K-means Clustering | No | FP = 485 | TN = 85277 |
| Mean Shift Clustering | Yes | TP = 638 | FN = 0 |
| Mean Shift Clustering | No | FP = 312 | TN = 85450 |

Table 4.3 gives the totals over all 16 images (Figure 4.1 to 4.16), obtained from all of the corresponding per-image tables. The tables of the other fourteen images are presented in Appendix A.2 (Table-2 to Table-15).

Table 4.3: Total results of all 16 images (Pixels = 1382400)

| Method | Actual | Detected = Yes | Detected = No |
|---|---|---|---|
| K-means Clustering | Yes | TP = 41915 | FN = 138 |
| K-means Clustering | No | FP = 2962 | TN = 1337385 |
| Mean Shift Clustering | Yes | TP = 40623 | FN = 414 |
| Mean Shift Clustering | No | FP = 2883 | TN = 1338480 |

The following rates are calculated for both clustering methods.
The inputs are taken from Table 4.3.

K-means Clustering:

$$\text{Accuracy} = \frac{TP + TN}{\text{Total}} = \frac{41915 + 1337385}{1382400} = 0.997757523 \tag{4.1}$$

$$\text{Misclassification Rate} = \frac{FP + FN}{\text{Total}} = \frac{2962 + 138}{1382400} = 0.002242477 \tag{4.2}$$

$$\text{Sensitivity/Recall} = \frac{TP}{\text{Actual yes}} = \frac{41915}{41915 + 138} = 0.99671843 \tag{4.3}$$

$$\text{False Positive Rate} = \frac{FP}{\text{Actual no}} = \frac{2962}{2962 + 1337385} = 0.00220988 \tag{4.4}$$

$$\text{Specificity} = \frac{TN}{\text{Actual no}} = \frac{1337385}{2962 + 1337385} = 0.99779012 \tag{4.5}$$

$$\text{Precision} = \frac{TP}{\text{Detected yes}} = \frac{41915}{41915 + 2962} = 0.93399737 \tag{4.6}$$

Mean Shift Clustering:

$$\text{Accuracy} = \frac{TP + TN}{\text{Total}} = \frac{40623 + 1338480}{1382400} = 0.99761502 \tag{4.7}$$

$$\text{Misclassification Rate} = \frac{FP + FN}{\text{Total}} = \frac{2883 + 414}{1382400} = 0.00238498 \tag{4.8}$$

$$\text{Sensitivity/Recall} = \frac{TP}{\text{Actual yes}} = \frac{40623}{40623 + 414} = 0.98991154 \tag{4.9}$$

$$\text{False Positive Rate} = \frac{FP}{\text{Actual no}} = \frac{2883}{2883 + 1338480} = 0.002149306 \tag{4.10}$$

$$\text{Specificity} = \frac{TN}{\text{Actual no}} = \frac{1338480}{2883 + 1338480} = 0.997850694 \tag{4.11}$$

$$\text{Precision} = \frac{TP}{\text{Detected yes}} = \frac{40623}{40623 + 2883} = 0.933733278 \tag{4.12}$$

Table 4.4 shows the comparison between both methods:

Table 4.4: Performance comparison of both methods

| Rate | K-means clustering | Mean Shift clustering | Comparison |
|---|---|---|---|
| Accuracy | 0.997757523 | 0.99761502 | K-means has higher accuracy |
| Misclassification Rate | 0.002242477 | 0.00238498 | K-means has lower misclassification rate |
| Sensitivity | 0.99671843 | 0.98991154 | K-means has higher sensitivity |
| False Positive Rate | 0.00220988 | 0.002149306 | Mean Shift has lower false positive rate |
| Specificity | 0.99779012 | 0.997850694 | Mean Shift has higher specificity |
| Precision | 0.93399737 | 0.933733278 | K-means has higher precision |

As Table 4.4 indicates, both methods reach an accuracy above 0.997 and a precision of about 0.93. The comparison shows that K-means clustering has better accuracy, misclassification rate, sensitivity and precision, while Mean Shift clustering has the better specificity and false positive rate.
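The rates above can be reproduced directly from the four totals of Table 4.3. The sketch below is an illustration in plain Python rather than the code used in this study; the function name `rates` and the dictionary keys are hypothetical, while the input numbers are the totals reported in Table 4.3.

```python
def rates(tp, fn, fp, tn):
    """Derive the evaluation rates from the four confusion-matrix counts."""
    total = tp + fn + fp + tn
    actual_yes = tp + fn      # all true edge pixels
    actual_no = fp + tn       # all true non-edge pixels
    detected_yes = tp + fp    # all pixels reported as edges
    return {
        "accuracy": (tp + tn) / total,
        "misclassification": (fp + fn) / total,
        "sensitivity": tp / actual_yes,
        "false_positive_rate": fp / actual_no,
        "specificity": tn / actual_no,
        "precision": tp / detected_yes,
    }


# Totals over all 16 images, as reported in Table 4.3.
kmeans = rates(tp=41915, fn=138, fp=2962, tn=1337385)
mean_shift = rates(tp=40623, fn=414, fp=2883, tn=1338480)

for name in kmeans:
    print(f"{name:20s} {kmeans[name]:.9f} {mean_shift[name]:.9f}")
```

Note that the false positive rate and the specificity share the denominator "Actual no", so they always sum to one.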
This section assesses the effectiveness of the proposed method itself; the method has not yet been compared with other methods. The simulation results indicate good performance, and the method recognized the edges in most images. An important point about K-means clustering is that it depends on the number of clusters: when the correct number of clusters is selected, the proposed method performs well. In the next chapter, the results of the proposed method are compared with other edge detection methods.

4.3.1.3 Running time

In this part the running time of both methods is measured for some random images from different folders. In each case the final program uses the trained Neural Network together with the named clustering method.

For image img_00002.bmp from the second folder, the running time is 3.600170 seconds with K-means clustering and 2.397038 seconds with Mean Shift clustering.

For image img_00014.bmp from the sixth folder, the running time is 3.200960 seconds with K-means clustering and 4.003568 seconds with Mean Shift clustering.

For image img_00001.bmp from the fourth folder, the running time is 3.545127 seconds with K-means clustering and 3.186451 seconds with Mean Shift clustering.

For image img_00011.bmp from the seventh folder, the running time is 3.545127 seconds with K-means clustering and 4.328679 seconds with Mean Shift clustering.
For image img_00022.bmp from the ninth folder, the running time is 3.919369 seconds with K-means clustering and 2.476575 seconds with Mean Shift clustering.

As the results indicate, the Mean Shift program runs faster than the K-means program for three of the five sample images; the exceptions are the second and fourth samples. The difference is due to the clustering part: in most examples K-means uses more clusters than Mean Shift finds, so the program takes more time to extract the edges. The running times of all images of chapters four and five are listed in Appendix A.1.

4.4 CONCLUSION

IR image edge detection is highly practical in image processing. IR images have many uses in the medical and military fields. Because of their special structure, traditional processing methods are not effective, so other ways of processing these images are needed. In this study, a new method for processing these images was presented. The previous chapters discussed the proposed method and its implementation; in this chapter the results of the two clustering methods were evaluated.

5. DISCUSSION
As mentioned in the previous chapters, IR images are valuable for various applications, and edge detection provides important information about them. In this study a new method is proposed to detect and extract the edges. The previous sections discussed the findings and outputs of this method. To evaluate its applicability, the efficiency of the proposed method is compared with commonly used edge detection operators and with other works, which were briefly introduced in the Literature Review chapter. The results indicate a good performance of the proposed method. In this chapter the performances of the two clustering algorithms are also evaluated and compared with each other on some random IR images selected from different IR datasets.

5.1 COMPARING THE RESULTS WITH OTHER METHODS

In this section the performance of the proposed method is compared with the common edge detection algorithms Prewitt, Canny, Roberts and Sobel. These were selected because of their popularity and high performance for edge detection in visible-light images. The proposed method is also compared with edge detection using the Otsu segmentation algorithm. For this purpose, the same input image is used for both the K-means and the Mean Shift variant. As mentioned before, for each clustering algorithm the ANN is trained using that same clustering method; that is, the same clustering is applied in both the training and the test parts. Both variants show visibly better results than the common methods.

i. K-means clustering:

Figure 5.1: Comparison of proposed method with the common algorithms

ii. Mean Shift clustering:

Figure 5.2: Comparison of proposed method with the common algorithms

As the images indicate, the outputs of both proposed variants are better than those of all the other algorithms.
5.2 COMPARING THE RESULTS WITH OTHER WORKS

In this section, the results of the proposed method are compared with those of the CNN-DGA method proposed by Wang et al. (2014) and with the Mixture Edge Detection Method proposed by Abdulmunim et al. (2012). For this purpose, the images and results reported in these papers are used for the comparison.

i. K-means Clustering:

Figure 5.3: Comparison of proposed method with CNN_DGA method

ii. Mean Shift Clustering:

Figure 5.4: Comparison of proposed method with CNN_DGA method

The CNN-DGA result contains some white noise that does not appear in the output of either proposed variant. The number of clusters selected for K-means clustering is three.

i. K-means Clustering:

Figure 5.5: Comparison of the proposed methods with Mix_model

ii. Mean Shift Clustering:

Figure 5.6: Comparison of the proposed methods with Mix_model

The number of clusters selected for K-means clustering in this comparison is two. As indicated above, the proposed method performs well in detecting edges in infrared images.

5.3 COMPARING BOTH CLUSTERING METHODS

In the previous chapter, some random images were selected and their edges were extracted using both methods. As mentioned there, the problem of the K-means clustering method is finding the correct number of clusters, and Mean Shift is proposed to solve this problem: it has the great advantage of determining the number of clusters automatically. For the following examples the correct number of clusters for K-means could be determined. In this section some examples from different datasets are selected and the edges are detected using both clustering methods. For each example the Neural Network is trained using the same clustering method.

Figure 5.7.a shows the original image; the green and red circles mark the detectable edges.
Two different clustering methods are implemented. The bandwidth for Mean Shift clustering is 20 and the selected number of clusters for K-means clustering is 4.

Figure 5.7: Comparison between k-means and Mean Shift clustering

As the results indicate, both methods detect the edges correctly.

Figure 5.8.a shows the original image; the green and red circles mark the detectable edges. Two different clustering methods are implemented. The bandwidth for Mean Shift clustering is 10 and the selected number of clusters for K-means clustering is 4.

Figure 5.8: Comparison between k-means and Mean Shift clustering

As the results indicate, in this example K-means gives a better result than Mean Shift clustering.

Figure 5.9.a shows the original image; the green and red circles mark the detectable edges. Two different clustering methods are implemented. The bandwidth for Mean Shift clustering is 10 and the selected number of clusters for K-means clustering is 2.

Figure 5.9: Comparison between k-means and Mean Shift clustering

As the results indicate, in this example K-means clustering gives a better result than the Mean Shift method.

Figure 5.10.a shows the original image. Two different clustering methods are implemented. The bandwidth for Mean Shift clustering is 20 and the selected number of clusters for K-means clustering is 2.

Figure 5.10: Comparison between k-means and Mean Shift clustering

In this example both methods perform well.

Figure 5.11.a shows the original image. Two different clustering methods are implemented. The bandwidth for Mean Shift clustering is 8 and the selected number of clusters for K-means clustering is 4.

Figure 5.11: Comparison between k-means and Mean Shift clustering

In this example, if the aim is only to detect the edges of the car, Mean Shift clustering gives better results. In general, both algorithms perform well.
The important point is the initialization step, that is, the number of clusters for K-means clustering and the bandwidth for Mean Shift clustering. Varying these values changes the results.

Figure 5.12.a shows the original image. Two different clustering methods are implemented. The bandwidth for Mean Shift clustering is 8 and the selected number of clusters for K-means clustering is 2.

Figure 5.12: Comparison between k-means and Mean Shift clustering

Both methods detected the edges well.

Figure 5.13: Comparison between k-means and Mean Shift clustering

In this example K-means clustering gives a better result.

6. CONCLUSION
6.1 INTRODUCTION

Infrared images are a special class of images with their own applications. Because of their structure, common methods are not suitable for processing them. Edge detection is one of the most important operations in image processing, since image edges provide important information for identifying the objects in an image. In this study, given the importance of infrared imaging, a new method is proposed for edge detection in infrared images. The first chapter stated the goals and general outline of the research. The second chapter reviewed the literature and research background. The third chapter described the proposed method in detail and explained its implementation. The fourth chapter presented the results of the method and analyzed its performance. The fifth chapter compared the results of the proposed method with common edge detection algorithms and with other studies, and compared the two clustering variants with each other. This chapter states the results of the research and, finally, presents suggestions for future work.

6.2 THESIS RESULTS

In this thesis, a method for IR image edge detection based on machine learning algorithms and image processing methods is presented. The edges are detected by segmenting the image, extracting the Region of Interest using the MLP, and applying morphological operators to obtain the edges. The general procedure in this study is as follows: First, the images were pre-processed with a median filter to remove salt-and-pepper noise. Next, the image was segmented into several clusters using the K-means or the Mean Shift clustering algorithm. Then, the region of interest was extracted using the Neural Network algorithm.
After the region of interest was extracted, morphological operators were used to fill the holes in the image and to delete small unwanted points. Finally, the edges were extracted using morphological operators.

To evaluate the proposed method, a standard dataset was used to assess its efficiency for infrared image edge detection. The results were compared with common edge detection methods. The comparison with the Prewitt, Canny, Roberts and Sobel operators showed that the proposed method extracts the edges of the image well and performs better than these operators. For further evaluation, the results of the proposed method were compared with other works; in each case the proposed method also showed good performance. Since two different clustering methods were applied, they were compared using confusion matrices, running time and different random IR images. In general, the proposed method, based on clustering and extraction of the region of interest, has shown good performance and could be used in different relevant applications.

6.3 PROPOSAL FOR FUTURE RESEARCH

Any research work has advantages and disadvantages, and this research is no exception. The following are suggestions for future work:
a. Applying other image segmentation techniques and methods.
b. Applying other classification methods (simpler neural network methods) to extract the expected area.
c. As stated in the proposed method, K-means clustering is one of the applied segmentation algorithms. Although the elbow criterion and Mean Shift clustering are applied to avoid determining the number of clusters manually, other methods could also be applied to extract the edges.
d. Extending the method to video sequences, especially for surveillance purposes.
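As a closing illustration of the post-processing and edge-extraction steps summarized in Section 6.2 (Equations 3.6 to 3.8), the following minimal sketch applies a 3x3 erosion, a 5x5 dilation and the morphological edge operator to a small binary image. It is written in plain Python for readability and is not the program used in this study; the function names `erode`, `dilate` and `morphological_edge`, the 9x9 toy image, and the choice to treat pixels outside the image as background are all assumptions of this sketch.

```python
def erode(image, size):
    """Binary erosion with a size x size structuring element of ones.

    A pixel stays 1 only if every pixel under the element is 1.
    Pixels outside the image are treated as 0 (assumption of this sketch).
    """
    r = size // 2
    h, w = len(image), len(image[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and image[y + dy][x + dx]
                     for dy in range(-r, r + 1) for dx in range(-r, r + 1)))
             for x in range(w)] for y in range(h)]


def dilate(image, size):
    """Binary dilation: a pixel becomes 1 if any pixel under the element is 1."""
    r = size // 2
    h, w = len(image), len(image[0])
    return [[int(any(0 <= y + dy < h and 0 <= x + dx < w and image[y + dy][x + dx]
                     for dy in range(-r, r + 1) for dx in range(-r, r + 1)))
             for x in range(w)] for y in range(h)]


def morphological_edge(image):
    """Edge = image minus its 3x3 erosion (the form of Equation 3.8)."""
    eroded = erode(image, 3)
    return [[p - e for p, e in zip(row, erow)] for row, erow in zip(image, eroded)]


# Hypothetical ROI: a 5x5 square plus one isolated noise pixel.
roi = [[0] * 9 for _ in range(9)]
for y in range(2, 7):
    for x in range(2, 7):
        roi[y][x] = 1
roi[0][8] = 1  # isolated white-noise pixel

cleaned = dilate(erode(roi, 3), 5)   # post-processing: SEE is 3x3, SED is 5x5
edges = morphological_edge(cleaned)  # one-pixel-wide boundary of the region

for row in edges:
    print("".join("#" if p else "." for p in row))
```

In this toy case the 3x3 erosion removes the isolated pixel, and the 5x5 dilation leaves the square one pixel wider on each side than the input, because the dilation element is larger than the erosion element.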
REFERENCES

Books

Bhanu, B., & Pavlidis, Ioannis. 2005. Computer Vision Beyond the Visible Spectrum. Advances in Pattern Recognition. London: Springer London.
Gonzalez, R., & Woods, Richard E. 2008. Chapter 2: Moving Object Detection & Tracking in Videos. Digital image processing. 3rd edn. Upper Saddle River, N.J.: Prentice Hall. pp. 15-39.
Gonzalez, R., & Woods, Richard E. 2002. Digital image processing. Issue no: 2. Upper Saddle River, N.J.: Prentice Hall.
Iftekharuddin, Khan, & Awwal, Abdul. 2012. Field Guide to Image Processing. SPIE Press.
Kohonen, T. 1997. Self-organizing maps (Springer series in information sciences; 30). Berlin; New York: Springer.
Marques, O. 2011. Practical Image and Video Processing Using MATLAB. Hoboken, NJ, USA: John Wiley & Sons.
Qiu, Peihua. 2005. Wiley Series in Probability and Statistics. Hoboken, NJ, USA: John Wiley & Sons.
Strang, Gilbert. 2005. Linear Algebra and Its Applications. Brooks Cole.
Szeliski, R. 2011. Computer Vision: Algorithms and Applications (Texts in computer science). London: Springer-Verlag London Limited.
Woolfson, M. 2012. The fundamentals of imaging: From particles to galaxies. London; Hackensack, NJ: Imperial College Press; Distributed by World Scientific Publishing.
Zhang, W. 2005. Computational ecology: Artificial neural networks and their applications. Singapore; Hackensack, NJ; London: World Scientific.

Periodicals

Abdulmunim, Matheel, E. M, Suhad, 2012. Propose a Mixture Edge Detection Method for Infrared Image Segmentation. British Journal of Science. 6 (2).
Blaschke, T. 2010. Object based image analysis for remote sensing. ISPRS Journal of Photogrammetry and Remote Sensing, 65 (1), pp. 2-16.
Comaniciu, D., Ramesh, V., & Meer, P. 2003. Kernel-based object tracking. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 25 (5), pp. 564-577.
Comaniciu, Meer, 2002. Mean Shift: a robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 24 (5), pp. 603-619.
Davis, J., Keck, M., 2005. A two-stage approach to person detection in thermal imagery. IEEE OTCBVS WS Series Bench. In Proc. Workshop on Applications of Computer Vision, pp. 364-369.
Dhanachandra, Manglem & Chanum, 2015. Image Segmentation Using K-means Clustering Algorithm and Subtractive Clustering Algorithm. Procedia Computer Science. 54, pp. 764-771.
Dimitris, Manolakis, Marden, David & Shaw, Gary A, 2003. Hyperspectral image processing for automatic target detection applications. Lincoln Laboratory Journal 14 (1), pp. 79-116.
Duarte, Carrão, Espanha, Viana, Freitas, Bártolo, . . . Almeida, 2014. Segmentation Algorithms for Thermal Images. Procedia Technology. 16 (C), pp. 1560-1569.
Dubey, Shiv Ram, Anand, Singh Jalal, 2012. Adapted approach for fruit disease identification using images. International Journal of Computer Vision and Image Processing, 2 (3), pp. 44-58.
Garge, D.M. Bapat, V.N. 2009. A low cost wavelet based mammogram image processing for early detection of breast cancer. Indian Journal of Science and Technology. 2 (9).
Iscan, Yüksel, Dokur, Korürek, & Ölmez. 2009. Medical image segmentation with transform and moment based features and incremental supervised neural network. Digital Signal Processing. 19 (5), pp. 890-901.
Jambhorka, Sagar, G.N.Sarage, 2012. Enhancement of Chest X-Ray images Using Filtering Techniques. International Journal of Advanced Research in Computer Science and Software Engineering 2 (5), pp. 308-312.
Kannan, Ramathilagam, Devi, & Sathya, 2011. Robust kernel FCM in segmentation of breast medical images. Expert Systems With Applications, 38 (4), pp. 4382-4389.
Lahiri, Bagavathiappan, Jayakumar, & Philip. 2012. Medical applications of infrared thermography: A review. Infrared Physics and Technology. 55 (4).
Mohammad, M. B, R.
N, Srujana, A. J. N., Jyothi & P. B. T, Sundari. 2016. Disease Identification in Plants Using K-means Clustering and Gray Scale Matrices with SVM Classifier. International Journal of Applied Sciences, Engineering and Management. 5 (2), pp. 84-88.
Pasagic, V., Muzevic, M., & Kelenc, D. 2008. Infrared Thermography In Marine Applications. Brodogradnja, 59 (2), pp. 123-130.
Qingju, Tang, Chiwu, Bu, Yuanlin, Jiansuo, Liu Zang, & Li Dayong, 2016. Infrared Image Edge Detection Based on Morphology-Canny Fusion Algorithm. International Journal of U- and E- Service, Science and Technology, 9 (3), pp. 259-268.
Seema. Kumar, A. & Gill, G. S, 2015. Computer Vision based Model for Fruit Sorting using K-Nearest Neighbor classifier. Int. Journal of Electrical & Electronics Engg. 2 (1), pp. 49-52.
Singh, S. 2011. Artificial Neural Network. Nature Precedings.
Sun, S, 2003. Automatic target recognition using boundary partitioning and invariant features in forward-looking infrared images. Optical Engineering. 42 (2), pp. 524-533.
Wang, Yang, Xie, & An, 2014. Edge detection of infrared image with CNN_DGA algorithm. Optik - International Journal for Light and Electron Optics. 125 (1), pp. 464-467.
Yizong Cheng. 1995. Mean shift, mode seeking, and clustering. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 17 (8), pp. 790-799.

Other Publications

Dang Hongshe, Song, Jinguo & Guo, Qin, 2010. A Fruit Size Detecting and Grading System Based on Image Processing. Second international conference on intelligent human-machine and cybernetics. 2010 School of Electric and Information Engineering, Shaanxi University of Science and Technology China.
Khuwaja, G., & Tolba, A. 2000. Fingerprint image compression. Neural Networks for Signal Processing X. Proceedings of the 2000 IEEE Signal Processing Society Workshop. 2, pp. 517-526.
MATLAB and Statistics Toolbox Release 2015b, The MathWorks, Inc., Natick, Massachusetts, United States.
Mercol.
Juan Pablo, Gambini María Juliana, & Juan Miguel Santos, 2008. Automatic classification of oranges using image processing and data mining techniques. XIV Congreso Argentino de Ciencias de la Computación. Pennestate Eberly College of Science Human Body, Form & Function 2018, https://online.science.psu.edu/bisc004_activewd001/node/1907 [retrieval date 11 Feb 2018]. Peshin, Akash., 2017, Why Are Infrared Waves Associated With Heat? [online]. Science ABC. www.scienceabc.com/pure-sciences/why-are-infrared-waves- associated-with-heat.html [accessed 4 January 2018]. R Core Team. 2013. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R- project.org/. Robotic surgery-advantages and disadvantages, Future medical technology 2011 http://www.futuretechnology500.com/index.php/future-medical-technology/ robotic-surgery-advantages-and-disadvantages [accessed 5 February 2018]. Seyedarabi, 2010, Image processing, Image processing, Image processing applications. University of Tabriz. Nasri. Shirazi, M., & Morris, B, 2015. Vision-based vehicle queue analysis at junctions. Advanced Video and Signal Based Surveillance (AVSS), 2015 12th IEEE International Conference on, pp. 1-6. 92 Shunyong, Zhou., Yang, Pingxian, 2011. Infrared image segmentation based on Otsu and genetic algorithm. Multimedia Technology (ICMT), 2011 International Conference on. IEEE. 30 August 2011. Hangzhou, China, pp. 5421-5424. Sivaramakrishnan, R., Antani, Sameer. Candemir, Sema. & Xue, Zhiyun, Abuya, Joseph, et al. , 2018. Comparing deep learning models for population screening using chest radiography. Medical Imaging: Computer-Aided Diagnosis. 10575. International Society for Optics and Photonics. February 2018, Houston, Texas, United States, pp. 2-12 Song, Yuheng, Hao, Yan, 2017. Image Segmentation Algorithms Overview. eprint arXiv preprint. Wang, Dong, & Zhang, Jingzhou. 2011. 
Infrared image edge detection algorithm based on sobel and ant colony algorithm. Multimedia Technology (ICMT), 2011 International Conference on, 4944-4947. 93 APPENDICES 94 Appendix A.1 Running time In this section the running time of all images whose edges are extracted in fourth and fifth chapter are calculated: Table-1: Calculated Running Time for Images of fourth and fifth chapter Figure Number Running time of K-means Clustering Running time of Mean Shift Clustering Figure 4.1 3.609888 sec 3.072522 sec Figure 4.2 3.767448 sec 3.644015 sec Figure 4.3 3.502037 sec 3.501053 sec Figure 4.4 3.620084 sec 3.280337 sec Figure 5.5 3.664335 sec 3.243459 sec Figure 4.6 3.817510 sec 3.162980 sec Figure 4.7 3.157077 sec 3.728421 sec Figure 4.8 3.507092 sec 3.221922 sec Figure 4.9 3.180126 sec 3.796464 sec Figure 4.10 3.157077 sec 2.607492 sec Figure 4.11 4.541129 sec 2.475525 sec Figure 4.12 3.936479 sec 2.600794 sec Figure 4.13 3.869416 sec 3.234204 sec Figure 4.14 3.790868 sec 3.138017 sec Figure 4.15 4.501476 sec 3.518253 sec Figure 4.16 4.181246 sec 2.417631 sec Figure 5.1-5.2 3.723121 sec 3.803506 sec Figure 5.3-5.4 3.101607 sec 2.009334 sec Figure 5.5-5.6 2.995475 sec 2.263715 sec Figure 5.7 2.995456 sec 2.293981 sec Figure 5.8 2.953164 sec 2.151352 sec Figure 5.9 2.935648 sec 2.128833 sec Figure 5.10 2.696574 sec 2.008255 sec Figure 5.11 2.991042 sec 2.080714 sec Figure 5.12 3.250820 sec 2.671819 sec Figure 5.13 4.850709 sec 5.616439 sec 95 Appendix A.2 Confusion matrix Following Tables (Table-2 to Table-15) are confusion matrices of chapter fourth images. For all fourteen images confusion matrices are calculated for both methods. Total result of all confusion matrices including Table 4.1 and Table 4.2 are calculated as Table 4.3 . 
Table-2: Confusion matrix of Figure 4.2 (Pixels = 86400)

Method                  Actual   Detected = Yes   Detected = No
K-means Clustering      Yes      TP = 2515        FN = 9
K-means Clustering      No       FP = 0           TN = 83876
Mean Shift Clustering   Yes      TP = 2524        FN = 0
Mean Shift Clustering   No       FP = 0           TN = 83876

Table-3: Confusion matrix of Figure 4.4 (Pixels = 86400)

Method                  Actual   Detected = Yes   Detected = No
K-means Clustering      Yes      TP = 2970        FN = 14
K-means Clustering      No       FP = 405         TN = 83011
Mean Shift Clustering   Yes      TP = 2984        FN = 0
Mean Shift Clustering   No       FP = 521         TN = 82895

Table-4: Confusion matrix of Figure 4.5 (Pixels = 86400)

Method                  Actual   Detected = Yes   Detected = No
K-means Clustering      Yes      TP = 3996        FN = 0
K-means Clustering      No       FP = 308         TN = 82096
Mean Shift Clustering   Yes      TP = 2984        FN = 0
Mean Shift Clustering   No       FP = 521         TN = 82895

Table-5: Confusion matrix of Figure 4.6 (Pixels = 86400)

Method                  Actual   Detected = Yes   Detected = No
K-means Clustering      Yes      TP = 4467        FN = 4
K-means Clustering      No       FP = 308         TN = 81621
Mean Shift Clustering   Yes      TP = 4467        FN = 0
Mean Shift Clustering   No       FP = 521         TN = 81412

Table-6: Confusion matrix of Figure 4.7 (Pixels = 86400)

Method                  Actual   Detected = Yes   Detected = No
K-means Clustering      Yes      TP = 4440        FN = 99
K-means Clustering      No       FP = 308         TN = 81553
Mean Shift Clustering   Yes      TP = 4364        FN = 175
Mean Shift Clustering   No       FP = 168         TN = 81693

Table-7: Confusion matrix of Figure 4.8 (Pixels = 86400)

Method                  Actual   Detected = Yes   Detected = No
K-means Clustering      Yes      TP = 2664        FN = 0
K-means Clustering      No       FP = 308         TN = 83428
Mean Shift Clustering   Yes      TP = 2631        FN = 33
Mean Shift Clustering   No       FP = 168         TN = 83568

Table-8: Confusion matrix of Figure 4.9 (Pixels = 86400)

Method                  Actual   Detected = Yes   Detected = No
K-means Clustering      Yes      TP = 3878        FN = 12
K-means Clustering      No       FP = 168         TN = 82342
Mean Shift Clustering   Yes      TP = 3878        FN = 12
Mean Shift Clustering   No       FP = 168         TN = 82342

Table-9: Confusion matrix of Figure 4.10 (Pixels = 86400)

Method                  Actual   Detected = Yes   Detected = No
K-means Clustering      Yes      TP = 1205        FN = 0
K-means Clustering      No       FP = 0           TN = 85195
Mean Shift Clustering   Yes      TP = 1205        FN = 0
Mean Shift Clustering   No       FP = 0           TN = 85195

Table-10: Confusion matrix of Figure 4.11 (Pixels = 86400)

Method                  Actual   Detected = Yes   Detected = No
K-means Clustering      Yes      TP = 3517        FN = 0
K-means Clustering      No       FP = 0           TN = 82883
Mean Shift Clustering   Yes      TP = 3477        FN = 40
Mean Shift Clustering   No       FP = 0           TN = 82883

Table-11: Confusion matrix of Figure 4.12 (Pixels = 86400)

Method                  Actual   Detected = Yes   Detected = No
K-means Clustering      Yes      TP = 1725        FN = 0
K-means Clustering      No       FP = 0           TN = 84675
Mean Shift Clustering   Yes      TP = 1685        FN = 40
Mean Shift Clustering   No       FP = 0           TN = 84675

Table-12: Confusion matrix of Figure 4.13 (Pixels = 86400)

Method                  Actual   Detected = Yes   Detected = No
K-means Clustering      Yes      TP = 2379        FN = 0
K-means Clustering      No       FP = 168         TN = 83853
Mean Shift Clustering   Yes      TP = 2359        FN = 20
Mean Shift Clustering   No       FP = 0           TN = 84021

Table-13: Confusion matrix of Figure 4.14 (Pixels = 86400)

Method                  Actual   Detected = Yes   Detected = No
K-means Clustering      Yes      TP = 2155        FN = 0
K-means Clustering      No       FP = 168         TN = 84077
Mean Shift Clustering   Yes      TP = 2127        FN = 28
Mean Shift Clustering   No       FP = 168         TN = 84077

Table-14: Confusion matrix of Figure 4.15 (Pixels = 86400)

Method                  Actual   Detected = Yes   Detected = No
K-means Clustering      Yes      TP = 1648        FN = 0
K-means Clustering      No       FP = 168         TN = 84584
Mean Shift Clustering   Yes      TP = 1648        FN = 0
Mean Shift Clustering   No       FP = 168         TN = 84584

Table-15: Confusion matrix of Figure 4.16 (Pixels = 86400)

Method                  Actual   Detected = Yes   Detected = No
K-means Clustering      Yes      TP = 638         FN = 0
K-means Clustering      No       FP = 168         TN = 85594
Mean Shift Clustering   Yes      TP = 638         FN = 0
Mean Shift Clustering   No       FP = 168         TN = 85594
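
As an illustration of how the per-image counts above can be checked and combined, the following minimal Python sketch sums confusion matrices and derives accuracy, precision and recall from them. It assumes the standard textbook definitions of these measures; the evaluation formulas actually used in the thesis are not restated in this appendix, and the class and variable names (ConfusionMatrix, table_2_kmeans, table_3_kmeans) are hypothetical helpers introduced only for this example. The numbers are the K-means rows of Table-2 and Table-3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Pixel counts for one image and one clustering method (hypothetical helper)."""
    tp: int  # Actual = Yes, Detected = Yes
    fn: int  # Actual = Yes, Detected = No
    fp: int  # Actual = No,  Detected = Yes
    tn: int  # Actual = No,  Detected = No

    @property
    def total(self) -> int:
        # Should equal the pixel count stated in the table header (86400 per image).
        return self.tp + self.fn + self.fp + self.tn

    def __add__(self, other: "ConfusionMatrix") -> "ConfusionMatrix":
        # Element-wise sum, used to accumulate counts over several images.
        return ConfusionMatrix(
            self.tp + other.tp,
            self.fn + other.fn,
            self.fp + other.fp,
            self.tn + other.tn,
        )

    def accuracy(self) -> float:
        return (self.tp + self.tn) / self.total

    def precision(self) -> float:
        detected_yes = self.tp + self.fp
        return self.tp / detected_yes if detected_yes else 0.0

    def recall(self) -> float:
        actual_yes = self.tp + self.fn
        return self.tp / actual_yes if actual_yes else 0.0


# K-means rows of Table-2 (Figure 4.2) and Table-3 (Figure 4.4).
table_2_kmeans = ConfusionMatrix(tp=2515, fn=9, fp=0, tn=83876)
table_3_kmeans = ConfusionMatrix(tp=2970, fn=14, fp=405, tn=83011)

# Accumulate the two images into one matrix.
combined = table_2_kmeans + table_3_kmeans

print("Table-2 K-means total pixels:", table_2_kmeans.total)
print("Table-2 K-means recall:", table_2_kmeans.recall())
print("Combined accuracy:", combined.accuracy())
```

Checking that the four cells of a matrix add up to the stated pixel count is a quick way to catch transcription errors in a single table before the counts are accumulated into a total.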