Ph.D. Theses
Browsing Ph.D. Theses by Author "Baydoğan, Mustafa Gökçe."
Now showing 1 - 3 of 3
Item Distance-based learning approaches for multiple instance learning (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2022) Sivrikaya, Özgür Emre; Baydoğan, Mustafa Gökçe.
Multiple Instance Learning (MIL) is a weakly supervised approach that focuses on labeling sets of instances (i.e., bags) where the labels of individual instances are generally unknown. Many of the earlier MIL studies rely on certain assumptions about the relationship between the bag and instance labels and devise supervised learning approaches accordingly. Because of the ambiguity in instance labels, these studies fail to generalize to MIL problems with complex structures. To avoid these problems, researchers focus on embedding instance-level information to learn bag representations. In this context, dissimilarity-based representations are known to generalize well. This thesis proposes a novel framework in which each bag is represented by its dissimilarities to a set of prototypes. The framework consists of learning mechanisms that provide fast and competitive results compared to the existing distance-based approaches on extensive benchmark data sets. The first approach is a simple model that generates prototypes from a given MIL data set. We aim to find prototypes in the feature space that map the collections of instances (i.e., bags) to a distance feature space, and simultaneously learn a linear classifier for MIL. The second proposal is a tree-based ensemble learning strategy that avoids complex tuning processes and heavy computational costs without sacrificing accuracy. The framework is enriched with the integration of the methods, a parameter selection strategy, and an ensemble design. Furthermore, the proposed methods are extended to the regression domain, namely Multiple Instance Regression (MIR). MIR is a less commonly studied area in which the bag labels are real-valued instead of classes.
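The prototype-based bag representation described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the prototypes here are fixed by hand rather than learned jointly with a classifier, the bag-to-prototype dissimilarity is assumed to be the minimum Euclidean distance from any instance in the bag to the prototype, and the names `embed_bag` and `embed_bags` and the toy data are hypothetical.

```python
import numpy as np


def embed_bag(bag, prototypes):
    """Map one bag (n_instances x n_features) to a dissimilarity vector.

    Assumption: the dissimilarity between a bag and a prototype is the
    minimum Euclidean distance from any instance in the bag to it.
    """
    # Pairwise differences, shape (n_instances, n_prototypes, n_features)
    diffs = bag[:, None, :] - prototypes[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # Closest instance per prototype, shape (n_prototypes,)
    return dists.min(axis=0)


def embed_bags(bags, prototypes):
    """Stack the bag embeddings into an (n_bags x n_prototypes) matrix."""
    return np.vstack([embed_bag(bag, prototypes) for bag in bags])


# Hypothetical toy data: two hand-picked prototypes, two bags of different sizes
prototypes = np.array([[0.0, 0.0], [5.0, 5.0]])
bags = [
    np.array([[0.0, 1.0], [4.0, 4.0]]),
    np.array([[5.0, 5.0], [9.0, 9.0], [3.0, 4.0]]),
]

features = embed_bags(bags, prototypes)
print(features)
```

Each row of `features` is a fixed-length vector, regardless of how many instances the bag holds, so any standard linear classifier or regressor could be trained on it.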
The experiments show that all proposals outperform the state-of-the-art approaches in the literature.

Item Mathematical programming and statistical learning approaches for multiple instance learning (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2018) Lök, Emel Şeyma; Baydoğan, Mustafa Gökçe; Taşkın, Zeki Caner.
Many real-world applications of classification require flexibility in representing complex objects to preserve the information relevant for class separation. Multiple instance learning (MIL) aims to solve classification problems in which each object is represented by a bag of instances, and class labels are provided for the bags rather than for individual instances. The aim is to learn a function that correctly labels new bags. In this thesis, we propose statistical learning and mathematical optimization methods to solve MIL problems from diverse application domains. We first present bag encoding strategies to obtain bag-level feature vectors for MIL. Simple instance space partitioning approaches are used to learn representative feature vectors for the bags. Our experiments on a large database of MIL problems show that random tree-based encoding is scalable and that its performance is competitive with the state-of-the-art methods. Mathematical programming-based approaches to the MIL problem construct a bag-level decision function. In this context, we formulate the MIL problem as a linear programming model that optimizes bag orderings for correct classification. The proposed formulation combines instance-level scores to return an estimate of the bag label. All problem instances are solved to optimality on various data representations in a reasonable computation time. Finally, we develop a quadratic programming formulation that is superior to previous MIL formulations in terms of its underlying assumptions and computational difficulty.
The proposed MIL framework models the contributions of instances to the bag class labels and provides a bag class decision threshold. Experimental results verify that the proposed formulation enables effective classification in various MIL applications.

Item Multi-objective approaches for multi-target learning (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2020) Adıyeke, Esra; Baydoğan, Mustafa Gökçe.
Multi-target datasets (MTD) require simultaneous prediction of several variables; hence, they are considered more challenging for predictive tasks than single-target datasets. Mining MTD requires handling several problems. For example, scale inconsistencies are widely encountered in the targets. Most existing approaches resolve this issue by transforming the targets to the same scale, yet those operations may change the statistical properties of the dataset. In addition, scale inconsistencies in the features cause problems in semi-supervised learning (SSL) applications, since these require distance-based calculations. Another issue with MTD is exploring alternative ways of including the target relations in learning applications. In this thesis, I develop supervised learning (SL), SSL, and feature ranking (FR) models for MTD to deal with the aforementioned problems. Benefiting from multi-objective optimization concepts, I aim to propose learning strategies that are robust to the type of the variables processed and that utilize the target relations at the same time. Specifically, I propose a multi-objective extension of standard decision trees and a selective classifier chaining strategy for SL tasks. Experimental studies show that the proposed models outperform their benchmark models. Moreover, the multi-objective trees are extended to a semi-supervised version so that the proposed form can achieve competitive performance when the label information is not adequate.
The experiments show a significant improvement of the proposed model over its benchmarks. In addition, since high dimensionality and irrelevant features reduce the effectiveness of a learning model, an embedded feature ranking (FR) procedure for semi-supervised trees is given to address this problem. Applications on several datasets show that the proposed FR procedure enhances predictive performance compared to its benchmark approaches.
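The abstract above does not say how the multi-objective trees compare candidate splits. One generic multi-objective concept that avoids rescaling targets is Pareto dominance, sketched below as an illustration only. It is not the thesis method: the function `pareto_front` and the per-target gain values are hypothetical.

```python
import numpy as np


def pareto_front(gains):
    """Return the indices of non-dominated rows (higher is better per column).

    Row a dominates row b if a is at least as good on every target and
    strictly better on at least one.
    """
    gains = np.asarray(gains, dtype=float)
    keep = []
    for i, g in enumerate(gains):
        dominated = any(
            np.all(other >= g) and np.any(other > g)
            for j, other in enumerate(gains)
            if j != i
        )
        if not dominated:
            keep.append(i)
    return keep


# Hypothetical impurity reductions: rows are candidate splits,
# columns are two targets measured on very different scales
gains = [
    [0.30, 120.0],
    [0.10, 400.0],
    [0.25, 100.0],  # worse than the first row on both targets
    [0.05, 50.0],   # worse than the first row on both targets
]

front = pareto_front(gains)

# Rescaling one target leaves the non-dominated set unchanged
rescaled_front = pareto_front(np.array(gains) * [1.0, 0.001])
print(front, rescaled_front)
```

Because dominance compares each target only against itself, multiplying one target by a positive constant does not change which candidates survive, which is why no common scale is needed in this sketch.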