Machine learning algorithms and anomaly detection

Topuz, Mehmet Deniz

URI: http://hdl.handle.net/123456789/1470

Date: 2014-07

Abstract:

Machine learning is a subfield of artificial intelligence that extracts significant behaviours and rules from data, enabling us to make predictions about the future. Over the last 20 years the amount of data in various fields has grown very rapidly, and analysing this data through human effort alone is difficult. Machine learning algorithms follow two basic learning styles: supervised learning and unsupervised learning. In supervised learning, the data are assigned to classes that are known in advance. In unsupervised learning, the classes are not known in advance, and the learning algorithm discovers the distinct structures in the data by itself. This thesis explains widely used machine learning algorithms in detail. Patterns in a data set that do not conform to expected behaviour are called anomalies. The presence of anomalies in a data set can have important consequences. The final part of the thesis explains how the machine learning algorithms and approaches discussed in the preceding part are adapted to the anomaly detection problem.

Machine learning is a subfield of artificial intelligence that finds significant behaviours or functions in data in order to make future predictions. Huge amounts of data have been collected in recent decades, and analysing data on this scale requires intelligent systems. Machine learning enables a computer to learn from example data or past experience. According to their learning style, machine learning algorithms can be categorized into two groups: supervised learning algorithms and unsupervised learning algorithms. The training data of supervised learning algorithms include both the inputs and their labels. An unsupervised learning model is not provided with the correct labels during training. A detailed explanation of leading machine learning algorithms is offered in the first part of this thesis. An anomaly is a pattern in the data that does not conform to expected behaviour. The existence of anomalies in the data is important because they may translate into critical, actionable information. Both supervised and unsupervised machine learning techniques are applied to detect anomalies in different domains. The last part of this thesis provides an overview of the relation between the anomaly detection problem and machine learning approaches.
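As a rough illustration of the definition above, an anomaly as a pattern that does not conform to expected behaviour, the following minimal sketch flags values that lie far from the mean of unlabeled data. It is not taken from the thesis: the z-score rule, the function name `zscore_anomalies`, the threshold of 2.0, and the sample readings are all assumptions chosen for illustration, and the approach is only one simple unsupervised technique among many.

```python
from statistics import mean, stdev


def zscore_anomalies(values, threshold=2.0):
    """Return values lying more than `threshold` sample standard
    deviations from the mean (hypothetical helper, illustration only)."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation; needs >= 2 values
    if sigma == 0:
        # All values are identical, so nothing deviates from the rest.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Made-up sensor readings: nine typical values and one obvious outlier.
readings = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 10.1, 9.9, 10.0, 25.0]
print(zscore_anomalies(readings))
```

No labels are supplied here, so the sketch falls on the unsupervised side of the distinction drawn in the abstract; a supervised variant would instead train a classifier on examples already marked as normal or anomalous.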