Research Outputs | WoS | Scopus | TR-Dizin | PubMed

Permanent URI for this community: https://hdl.handle.net/20.500.14719/1741

Search Results

Now showing 1 - 9 of 9
  • Publication
    A hybrid method for feature selection based on mutual information and canonical correlation analysis
    (2010) Sakar, C. Okan; Kursun, Olcay; Sakar, C. Okan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Kursun, Olcay, Department of Computer Engineering, Istanbul Üniversitesi, Istanbul, Turkey
    Mutual Information (MI) is a classical and widely used dependence measure that can generally serve as a good feature selection algorithm. However, under-sampled classes or rare but certain relations are overlooked by this measure, which can result in missing relevant features that could be very predictive of variables of interest, such as certain phenotypes or disorders in biomedical research, rare but dangerous factors in ecology, intrusions in network systems, etc. On the other hand, Kernel Canonical Correlation Analysis (KCCA) is a nonlinear correlation measure effectively used to detect independence, but its use for feature selection or ranking is limited because its formulation is not intended to measure the amount of information (entropy) of the dependence. In this paper, we propose Predictive Mutual Information (PMI), a hybrid measure of relevance that is not only based on MI but also accounts for the predictability of signals from one another, as in KCCA. We show that PMI has better feature detection capability than MI and KCCA, especially in catching suspicious coincidences that are rare but potentially important, not only for subsequent experimental studies but also for building computational predictive models. This is demonstrated on two toy datasets and a real intrusion detection system dataset. © 2010 IEEE. © 2010 Elsevier B.V., All rights reserved.
  • Publication
    Prediction of protein sub-nuclear location by clustering mRMR ensemble feature selection
    (2010) Sakar, C. Okan; Kursun, Olcay; Şeker, Hüseyin; Gürgen, Fikret S.; Sakar, C. Okan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Kursun, Olcay, Department of Computer Engineering, Istanbul Üniversitesi, Istanbul, Turkey; Şeker, Hüseyin, Department of Informatics, De Montfort University, Leicester, United Kingdom; Gürgen, Fikret S., Department of Computer Engineering, Boğaziçi Üniversitesi, Bebek, Turkey
    In many applications of pattern recognition in the bioinformatics and biomedical fields, input variables are organized into natural partitions that are called views in the literature. Mutual information can be used in selecting a minimal yet capable subset of views. Ignoring the presence of views, dismantling them, and treating their variables intermixed with those of others results, at best, in a complex, uninterpretable predictive system for researchers in these fields. Moreover, it would require measuring or computing the majority of the views. We use the clustering indices of the views and rank the views according to the unique information they have about the target using the minimum redundancy-maximum relevance (mRMR) approach. We also propose an ensemble approach to reduce the random variations in clusterings. © 2010 IEEE. © 2010 Elsevier B.V., All rights reserved.
  • Publication
    A feature selection method based on kernel canonical correlation analysis and the minimum Redundancy-Maximum Relevance filter method
    (2012) Sakar, C. Okan; Kursun, Olcay; Gürgen, Fikret S.; Sakar, C. Okan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Kursun, Olcay, Department of Computer Engineering, Istanbul Üniversitesi, Istanbul, Turkey; Gürgen, Fikret S., Department of Computer Engineering, Boğaziçi Üniversitesi, Bebek, Turkey
    In this paper, we propose a feature selection method based on the recently popular minimum Redundancy-Maximum Relevance (mRMR) criterion, which we call Kernel Canonical Correlation Analysis based mRMR (KCCAmRMR). The method is based on the idea of finding the unique information that a candidate variable possesses about the target variable, i.e. information that is distinct from the set of already selected variables. In simplest terms, for this purpose, we propose using the correlated functions explored by KCCA, instead of the features themselves, as inputs to mRMR. We demonstrate the usefulness of our method on both toy and benchmark datasets. © 2011 Elsevier Ltd. All rights reserved. © 2011 Elsevier B.V., All rights reserved.
  • Publication
    A method for combining mutual information and canonical correlation analysis: Predictive Mutual Information and its use in feature selection
    (2012) Sakar, C. Okan; Kursun, Olcay; Sakar, C. Okan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Kursun, Olcay, Department of Computer Engineering, Istanbul Üniversitesi, Istanbul, Turkey
    Feature selection is a critical step in many artificial intelligence and pattern recognition problems. Shannon's Mutual Information (MI) is a classical and widely used measure of dependence that serves as a good feature selection algorithm. However, as it measures mutual information on average, under-sampled classes (rare events) can be overlooked by this measure, which can cause critical false negatives (missing a relevant feature that is very predictive of some rare but important classes). Shannon's mutual information requires a well-sampled database, which is not typical of many fields of modern science (such as biomedicine), in which there is a limited number of samples to learn from, or at least not all the classes of the target function (such as certain phenotypes in biomedicine) are well sampled. On the other hand, Kernel Canonical Correlation Analysis (KCCA) is a nonlinear correlation measure effectively used to detect independence, but its use for feature selection or ranking is limited because its formulation is not intended to measure the amount of information (entropy) of the dependence. In this paper, we propose a hybrid measure of relevance, Predictive Mutual Information (PMI), which is based on MI and also accounts for the predictability of signals from each other in its calculation, as in KCCA. We show that PMI has better feature detection capability than MI, especially in catching suspicious coincidences that are rare but potentially important, not only for experimental studies but also for building computational models. We demonstrate the usefulness of PMI, and its superiority over MI, on both toy and real datasets. © 2011 Elsevier Ltd. All rights reserved. © 2011 Elsevier B.V., All rights reserved.
  • Publication
    Feature extraction for facial expression recognition by canonical correlation analysis (Kanonik korelasyon analizi ile yüz ifadesinden duygu tanıma için öznitelik çıkarımı)
    (2012) Sakar, C. Okan; Kursun, Olcay; Karaali, Ali; Erdem, Cigdem Eroglu; Sakar, C. Okan, Bahçeşehir Üniversitesi, Istanbul, Turkey; Kursun, Olcay, Istanbul Üniversitesi, Istanbul, Turkey; Karaali, Ali, Bahçeşehir Üniversitesi, Istanbul, Turkey; Erdem, Cigdem Eroglu, Bahçeşehir Üniversitesi, Istanbul, Turkey
    Although several methods have been proposed for fusing different image representations obtained by different preprocessing methods for emotion recognition from the facial expression in a given image, the dependencies and relations among them have not been much investigated. In this study, it has been shown that covariates obtained by Canonical Correlation Analysis (CCA) that extracts relations between different representations have high predictive power for emotion recognition. As high prediction accuracy can be achieved using a small number of features extracted by it, CCA is considered to be a good dimensionality reduction method. For our simulations, we used the CK+ database and showed that covariates obtained from difference-images and geometric-features representations have high prediction accuracy. © 2012 IEEE. © 2012 Elsevier B.V., All rights reserved.
  • Publication
    Combining spatial proximity and temporal continuity for learning invariant representations
    (2012) Kursun, Olcay; Aytekin, Tevfik; Kursun, Olcay, Department of Computer Engineering, Istanbul Üniversitesi, Istanbul, Turkey; Aytekin, Tevfik, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey
    Location and time are two critical aspects of most security-related events, and thus, spatiotemporal data analysis plays a central role in many security-related applications. The human brain has great capabilities of developing invariant representations of objects by taking advantage of both the spatial similarity of features of objects/events and their relative timings (temporal information). The trace learning rule is one well-known solution for this problem of combining temporal relations with spatial proximity in clustering tasks such as the one performed by self-organizing maps. In this work, we investigate a two-stage mechanism: i) finding local clusters using spatial proximity, ii) grouping these clusters as suggested by temporal continuity patterns. We show our experimental results on a movie created from face images. © 2012 IEEE. © 2013 Elsevier B.V., All rights reserved.
  • Publication
    Feature extraction based on discriminative alternating regression
    (Springer Verlag, 2014) Sakar, C. Okan; Kursun, Olcay; Gürgen, Fikret S.; Sakar, C. Okan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Kursun, Olcay, Department of Computer Engineering, Istanbul Üniversitesi, Istanbul, Turkey; Gürgen, Fikret S., Department of Computer Engineering, Boğaziçi Üniversitesi, Bebek, Turkey
    Canonical Correlation Analysis (CCA) aims at measuring linear relationships between two sets of variables (views). Recently, CCA has been used for feature extraction in classification problems with multiview data by means of view fusion. However, the correlated features extracted with CCA may not be discriminative, since CCA does not utilize the class labels in its traditional formulation. Besides, the CCA features are computed based on within-set and between-set sample covariance matrices of the views, which can be very sensitive to representation-specific details and noisy samples of the two views. In this paper, we propose a method, D-AR (Discriminative Alternating Regression), in which the two above-mentioned problems encountered in the application of CCA for feature extraction are addressed: (1) the class labels are incorporated into the proposed feature fusion framework to explore correlated and also discriminative features, and (2) the use of sensitive sample covariance matrices is avoided while fusing the two views. D-AR is a supervised feature fusion approach based on a Multi-layer Perceptron (MLP) implementation of alternating regression. From the neurobiological perspective, the architecture of D-AR is similar to the model of a single neuron in the cerebral cortex, which has the function of discovering and representing one of the hidden factors in its sensory environment. The MLP trained on each view aims to predict the class labels and also the hidden factors that are responsible for the correlation. We show that the features found by D-AR on training sets achieve significantly higher classification accuracies on the test set of an experimental dataset. © Springer International Publishing Switzerland 2014. © 2017 Elsevier B.V., All rights reserved.
  • Publication
    Prediction of level and abrupt changes of ozon concentration (Ozon seviyesi ve ani değişimlerinin kestirimi)
    (IEEE Computer Society, 2014) Develi, Ahmet; Kursun, Olcay; Erdogdu Sakar, Betul; Develi, Ahmet, Istanbul Üniversitesi, Istanbul, Turkey; Kursun, Olcay, Istanbul Üniversitesi, Istanbul, Turkey; Erdogdu Sakar, Betul, Bahçeşehir Üniversitesi, Istanbul, Turkey
    While a high ozone concentration in the stratosphere protects the Earth against ultraviolet radiation, in the lower troposphere it has negative effects on human health and the environment. The goal of this study is to determine the feature groups that are related to abrupt changes in the level of ozone. Linear discriminant analysis and support vector machine methods are used to explore which combinations of features are predictive of abrupt changes in ozone level on the simulation dataset collected in Ankara, Turkey, by an automatic air quality monitoring station operated by the ministry of environment and urban planning. The dataset consists of one year of measurements of air pollutants and meteorological factors. The obtained results showed that particulate matter, nitric oxides and temperature are the most effective parameters in the classification of abrupt rises and falls in the level of ozone. © 2014 IEEE. © 2014 Elsevier B.V., All rights reserved.
  • Publication
    Discriminative feature extraction by a neural implementation of canonical correlation analysis
    (Institute of Electrical and Electronics Engineers Inc., 2017) Sakar, C. Okan; Kursun, Olcay; Sakar, C. Okan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Kursun, Olcay, Department of Computer Engineering, Istanbul Üniversitesi, Istanbul, Turkey
    Canonical correlation analysis (CCA) aims at measuring linear relationships between two sets of variables (views) that can be used for feature extraction in classification problems with multiview data. However, the correlated features extracted by CCA may not be class discriminative, since CCA does not utilize the class labels in its traditional formulation. Although there is a method called discriminative CCA (DCCA), inspired by linear discriminant analysis (LDA), that aims to increase the discriminative ability of CCA, it has been shown that the features extracted with this method are identical to those of LDA up to an orthogonal transformation. Therefore, DCCA is simply equivalent to applying single-view (regular) LDA to each one of the views separately. Besides, DCCA and other similar approaches have generalization problems due to the sample covariance matrices used in their computation, which are sensitive to outliers and noisy samples. In this paper, we propose a method, called discriminative alternating regression (D-AR), to explore correlated and also discriminative features. D-AR utilizes two (alternating) multilayer perceptrons, each with a linear hidden layer, learning to predict both the class labels and the outputs of each other. We show that the features found by D-AR on training sets achieve significantly higher classification accuracies on test sets of facial expression recognition, object recognition, and image retrieval experimental data sets. © 2018 Elsevier B.V., All rights reserved.
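
Several of the abstracts listed above build on canonical correlation analysis between two views and note that it is computed from within-set and between-set sample covariance matrices. As a point of reference only, the following is a minimal sketch of classical linear CCA. It is not the KCCA, PMI, KCCAmRMR, or D-AR methods from the listed papers. The function name `canonical_correlations`, the small ridge term `reg`, and the synthetic two-view data are assumptions introduced for illustration.

```python
import numpy as np


def canonical_correlations(X, Y, reg=1e-8):
    """Canonical correlations between views X (n x p) and Y (n x q).

    Hypothetical helper: classical linear CCA computed from sample
    covariance matrices, with a small ridge term `reg` (an assumption)
    added for numerical stability.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Within-set and between-set sample covariance matrices.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Whiten each view using its Cholesky factor: T = Lx^{-1} Sxy Ly^{-T}.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T
    # Singular values of the whitened cross-covariance are the
    # canonical correlations, sorted in descending order.
    return np.linalg.svd(T, compute_uv=False)


# Synthetic two-view data sharing one hidden factor z (illustrative only).
rng = np.random.default_rng(0)
z = rng.normal(size=500)
X = np.column_stack([z + 0.1 * rng.normal(size=500), rng.normal(size=500)])
Y = np.column_stack([rng.normal(size=500), -z + 0.1 * rng.normal(size=500)])
rho = canonical_correlations(X, Y)
print(rho)
```

In this toy setup the first canonical correlation should be close to 1, because both views carry the shared factor, while the second should be close to 0. Because the computation depends directly on sample covariance matrices, noisy samples and outliers affect it, which is the sensitivity the D-AR abstracts describe.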