Ph.D. Theses

Permanent URI for this collection

https://digitalarchive.library.bogazici.edu.tr/handle/123456789/11646

Browse

Now showing 1 - 20 of 115

Using machine learning to improve automated test generation
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2022., 2022) Köroğlu, Yavuz.; Şen, Alper.
Underestimating the value of software testing had catastrophic results in recent history. Automated Test Generation (ATG) is an approach that aims to minimize the manual effort required for testing. This thesis aims to improve the effectiveness and performance of ATG approaches via Machine Learning (ML) based guidance, and focuses on Android Graphical User Interface (GUI) testing using Reinforcement Learning (RL), specifically. We propose four solutions, Q-learning Based Exploration (QBE), Test Case Mutation (TCM), Fully Automated Reinforcement LEArning Driven (FARLEAD), and FARLEAD2 test generators. QBE uses RL to crawl a set of applications and learns an action generation policy while exploring. Then, it uses this learned policy to either detect more unique crashes or cover more activities in new applications. TCM takes the tests QBE generates and replaces the well-behaving actions in those tests with bad-behaving ones to detect even more crashes. FARLEAD uses RL to learn how to verify a functional behavior that is given as a high-level test scenario in the form of a monitorable formal specification. FARLEAD learns by trial-and- error like QBE but it learns app-specific patterns instead of QBE’s app-generic patterns. To the best of out knowledge, FARLEAD is the first engine fully automating the functional testing of GUI applications. Finally, FARLEAD2 improves FARLEAD with Generalized Experience Replay (GER) and human-readable Staged Test Scenario (STS) language. Experimental results show that, QBE outperforms state-of-the-art test generators in crash detection and coverage. Furthermore, executing QBE first and then switching to TCM detects even more unique crashes. FARLEAD and FARLEAD2 expand the scope of automated testing to verifying functional behavior. Overall, these test generators elevate automated GUI testing closer to replacing manual GUI testing.
Algorithms for learning from online human behavior and human interaction with learning algorithms
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2022., 2022) Çapan, Gökhan.; Özgür, Arzucan.; Cemgil, Ali Taylan.
In modern digital systems, algorithms that deliver personalized content shape the user experience and affect user satisfaction, hence long-term engagement with the system. What the system presents also influences the parties providing content to the system since visibility to the user is vital for reachability. Such algorithms learn to deliver personalized content using data on previous user behavior, e.g., their choices, clicks, ratings, etc., interpreted as a proxy for user preferences. In the first part of this work, we review prevalent models for learning from user feedback on content, including our contributions to the literature. As such data is ever-growing, we discuss computational aspects of learning algorithms and focus on software libraries for scalable implementations, including our contributions. The second part is on learning from user interactions with algorithmic personalization systems. Albeit helpful, human behavior is subject to cognitive biases, and data sets comprising their item choices are subject to sampling biases, posing problems to learning algorithms that rely on such data. As users interact with the system, the problem worsens—the algorithms use biased data to compose future content. Further, the algorithms self-reinforce their inaccurate beliefs on user preferences. We review some of the biases and investigate a particular one: the user’s tendency to choose from the alternatives presented by the system, putting the least effort into exploring further. To account for it, we develop a Bayesian choice model that explicitly incorporates in the inference of user preferences their limited exposure to a systematically selected subset of items by an algorithm. The model leads to an efficient online learning algorithm of user preferences through interactions.
Stress measurement and regulation in real-life using affective technologies
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2022., 2022) Chalabianloo, Niaz.; Ersoy, Cem.
Stress has become one of the main contributors to serious mental and physical health issues in today’s world. Existing works in the literature have used Psychophysi ological measures and proposed numerous mechanisms to detect stress and administer feedback to help users regulate it. Unobtrusive wearables’ popularity is increasingly growing, intertwined with digital health notions, making them efficient, inexpensive, and easily accessible affective self-help technologies. This thesis first aims to investigate and implement stress detection mechanisms in the laboratory and everyday environ ments using unobtrusive wearable devices. In this regard, we investigate various sce narios, such as how to design and deploy stress measurement models that can efficiently use multi-modal data coming from different types of wearables used in the laboratory and real- life settings. We also study low-cost and practical methods for emotion regula tion in stressful conditions of everyday life. In the next step, a mixed-methods study is conducted. For this, signals from multiple wearables and users’ subjective opinions re garding different aspects of wearability were analyzed quantitatively and qualitatively. The next step is an in-depth study in cooperation with HCI researchers, in which we demonstrate the effects of haptic feedback on emotion regulation. As a next step for helping users choose the right device, we evaluate several wearables under completely identical conditions to compare the stress detection quality in wearables with differ ent technologies. Finally, we utilize Explainable AI (XAI) to make our models more understandable for the end users, and in particular for the psychology and clinical experts. The results of our studies indicate that an integrated detection, notification, and intervention cycle is required to ensure a reliable system for regulating stress in daily life.
Parallel network flow algorithms
(Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2022., 2022) Kara, Gökçehan.; Özturan, Can.
Network flows is an active area of research that has applications in a wide variety of fields. Several network flow problems are reduced to either the maximum flow problem or the minimum cost flow problem. Maximum flow problem involves finding the maximum possible amount of flow between two designated nodes on a network with arcs having flow capacities. Minimum cost flow problem tries to determine a flow with the minimum cost on a network with supply and demand nodes. In this thesis, we propose two parallel algorithms for the maximum flow and the minimum cost flow problems respectively. First, we present a shared memory parallel push-relabel algorithm for the maximum flow problem. Graph coloring is used to avoid collisions between threads for concurrent push and relabel operations. In addition, excess values of target nodes are updated using atomic instructions to prevent race conditions. The experiments show that our algorithm is competitive for wide graphs with low diameters. Second, we contribute a parallel implementation of the network simplex algorithm that is used for the solution of minimum cost flow problem. We propose finding the entering arc in parallel as it often takes the majority of the execution time. Scanning all arcs can take quite some time, so it is common to consider only a fixed number of arcs which is referred as the block search pivoting rule. Arc scans can easily be done in parallel to find the best candidate as the calculations are independent of each other. We used shared memory parallelism using OpenMP along with vectorization using AVX instructions. We also tried adjusting block sizes to increase the parallel portion of the algorithm. Our experiments show speedups up to 4 are possible, though they are typically lower.
Deep learning-based dependency parsing for Turkish
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2022., 2022) Bilgin, Şaziye Betül.; Özgür, Arzucan; Güngör, Tunga.
Dependency parsing is an important step for many natural language processing (NLP) systems such as question answering and machine translation. Turkish, being a morphologically rich language and having a complex grammar, is challenging for au tomatic processing. Limited NLP tools and resources for Turkish make the task even more challenging. Data-driven deep learning models show promising performance in dependency parsing. Yet, the amount of data to train a data-driven dependency parser directly affects performance, and deep learning-based systems require extensive data to achieve good performance. In this thesis, we focused on Turkish dependency parsing and proposed two solutions to the challenges this task poses. First, we increased the size and quality of labeled data for Turkish dependency parsing. In this respect, we cre ated the BOUN Treebank by annotating 9,761 sentences. In addition, we re- annotated the IMST and PUD treebanks using the same annotation scheme. As a result, we presented the largest collection of Turkish treebanks with consistent annotation. Sec ond, we developed novel state-of-the-art dependency parsing models for Turkish as well as other low-resource languages. As our first parsing approach, we introduced a hybrid dependency parser where Turkish grammar rules and morphological features of words are integrated into the deep learning model. Despite the limited training data, the hybrid parser achieved higher success than the current methods for Turkish dependency parsing. As our second parsing approach, we proposed a deep dependency parser with semi-supervised enhancement. By conducting experiments on a number of low-resource languages besides Turkish, we achieved state-of-the-art results on all datasets. We have shown that deep learning-based models can be improved not only by additional training data, but also by integrating intelligently extracted information.
Extended models of finite automata
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2019., 2019.) Salehi, Özlem.; Say, Ahmet Celal Cem.
Many of the numerous automaton models proposed in the literature can be regarded as a nite automaton equipped with an additional storage mechanism. In this thesis, we focus on two such models, namely the nite automata over groups and the homing vector automata. A nite automaton over a group G is a nondeterministic nite automaton equipped with a register that holds an element of the group G. The register is initialized to the identity element of the group and a computation is successful if the register is equal to the identity element at the end of the computation after being multiplied with a group element at every step. We investigate the language recognition power of nite automata over integer and rational matrix groups and reveal new relationships between the language classes corresponding to these models. We examine the e ect of various parameters on the language recognition power. We establish a link between the decision problems of matrix semigroups and the corresponding automata. We present some new results about valence pushdown automata and context-free valence grammars. We also propose the new homing vector automaton model, which is a nite automaton equipped with a vector that can be multiplied with a matrix at each step. The vector can be checked for equivalence to the initial vector and the acceptance criterion is ending up in an accept state with the value of the vector being equal to the initial vector. We examine the e ect of various restrictions on the model by con ning the matrices to a particular set and allowing the equivalence test only at the end of the computation. We de ne the di erent variants of the model and compare their language recognition power with that of the classical models.
Person detection and tracking using omnidirectional cameras, and rectangle blanket problem
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2019., 2019.) Demiröz, Barış Evrim.; Akarun, Lale.; Salah, Albert Ali.
Person detection and tracking can provide the crucial analysis needed to avoid accidents with autonomous machinery, optimize environments for effciency and assist the elderly. Omnidirectional cameras have a large field of view that allow them to cover more ground at the expense of resolution. Omnidirectional cameras can decrease setup, maintenance and computational costs by reducing the number of cameras and the bandwidth required. Computer vision methods developed for conventional cameras usually fail for omnidirectional cameras due to their di erent image formation geometry. In this thesis, rst, a novel dataset for person tracking in omnidirectional cameras is introduced. The dataset, namely BOMNI, contains 46 videos of persons moving inside a room; where the bounding boxes and the identity of the persons are annotated at every frame. Second, a generative Bayesian framework is developed for coupling person tracking and fall detection. The method is evaluated on BOMNI dataset, producing 93% tracking accuracy and fall detection within a few frames of the event. Third, a similar method for multiple person tracking is developed and evaluated on BOMNI dataset. The method reaches 86% tracking accuracy, increasing a previous approach by 18%. Fourth, a discriminative method for person detection is presented. Also a novel structure called Radial Integral Image that speeds up feature extraction step is introduced. This method achieves state of the art detection performance on IYTE dataset: 4.5% miss rate for one false positive per image. Finally, the problem of representing a shape with multiple rectangles, Rectangle Blanket Problem, is formulated as an integer programming problem and a branch-and-bound scheme is presented along with a novel branching rule to solve it optimally. This problem is encountered in the earlier sections of this thesis, but it is a general problem that is present in the literature.
Utilizing weakly-supervised learning for hashtag segmentation and named entity disambiguation
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2020., 2020.) Çelebi, Arda.; Özgür, Arzucan.
Today’s high-performing machine learning algorithms learn to predict by the supervision of large amounts of human-labeled data. However, the labeling process is costly in terms of time and effort. In this thesis, we design weakly-supervised ap proaches, which are based on automatically labeling raw data, for two different Natural Language Processing (NLP) tasks, namely hashtag segmentation and Named Entity Disambiguation (NED). Hashtag segmentation’s aim is to identify the words in the hashtags, so as to process and understand them better. We propose a heuristic to ob tain automatically segmented hashtags using a large tweet corpus and use these data to train a maximum entropy classifier. State-of-the-art accuracy is achieved for hashtag segmentation without using any manually labeled training data. The target of NED, which is the second task that we address, is to link the named entity (NE) mentions in text to their corresponding records in the Knowledge Base. We hypothesize that the types of the NE mentions may provide useful clues for their correct disambigua tion. The standard approaches for identifying mention types require a type taxonomy and large amounts of mentions annotated with their types. We propose a cluster-based mention typing approach, which does not require a type taxonomy or labeled mentions. This weakly-supervised approach is based on clustering the NEs in Wikipedia by using different levels of contextual information and automatically generating data for train ing a mention typing model. The mention type predictions lead to significant F-score improvement when incorporated to a supervised NED model. This thesis shows that designing weakly-supervised approaches by considering the underlying characteristics of the addressed problem can be an effective strategy for NLP
Distance approximations between high and multi-dimensional structures
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2019., 2019.) Semerci, Murat.; Cemgil, Ali Taylan.
In this thesis, we focus on distance approximation methods between high and multi-dimensional structures and their applications. Two novel methods using distance approximations are proposed and they are applied to anomaly detection in cyber security (Distributed Denial of Service -DDoS- attack and attacker detection) and tensor decomposition in object retrieval (image and video classi cation on scarce data). At rst, we consider an autonomous cyber security system that consists of two components: A monitor for detection of DDoS attacks and a discriminator for detection of users in the system with malicious intents. A novel adaptive real time change-point detection model that tracks the changes in the Mahalanobis distances between sampled feature vectors in the monitored system accounts for possible DDoS attacks. A clustering model that runs over the similarity scores of behavioral patterns between the users is used for segregating the malicious from the innocent. Secondly, we propose a discriminative tensor decomposition with large margin (LMTD), which is a distance based model that nds the projection directions where the nearest neighbor classi - cation accuracy is improved over the projected instances. We experiment the cyber security system in a simulated SIP communication environment. Both the attack and attacker detection components are compared with some competitors in the literature. The tensor decomposition is applied to the image and video retrieval problem, where the data is scarce, and its performance also is compared with other decomposition methods. The experimental results are reported for both applications. It is shown that the proposed methods perform higher accuracy rates than their competitors.
Bayesian methods for network traffic analysis
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2019., 2019.) Kurt, Barış.; Cemgil, Ali Taylan.
Statistical information about tra c patterns help a service provider to characterize its network resource usage and user behavior, infer future tra c demands, detect tra c and usage anomalies, and possibly provide insights to improve the performance of the network. However, the increasingly high volume and speed of data over modern networks make collecting these statistics di cult. Moreover, smarter network attacks require sophisticated detection methods that are able to fuse many network and hardware signals. Fortunately, Bayesian statistical methods are powerful tools that can infer such information under the harsh network environments. In this thesis we apply two Bayesian methods for two speci c network problems. First, we use the Bayesian multiple change models to detect DDoS attacks in SIP networks by fusing the observations coming from the network tra c and the networking hardware. We show that our method is superior to classic DDoS detection methods and using hardware signals improve the detection rate. For this work, we developed a probabilistic SIP network simulator and a monitoring system, and published it as an open-source software. In our second work, we estimated network statistics from a high speed network where we can only observe a fraction of the network tra c. For this problem we develop a generic novel method called ThinNTF, based on non-negative tensor factorization. This method can work with di erent network sampling schemes and recovers original network statistics by detecting the periodic network tra c patterns from the sampled network data and gives better estimates compared to the state of the art.
Urban cellular wireless network planning with 3D geographical grid structures
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2019., 2019.) Özyurt, Murat.; Tuğcu, Tuna.
The largest traffic on the Internet is generated by multimedia content, and the share of wireless mobile access to the multimedia content is ever increasing. The delivery process in the content depends on the type of the content, and whether it requires a strictly time sensitive live session, or it can be delivered on demand, irrespective of time of request. As a general solution in the IP networks, the delivery is taking place as multiple individual transmissions from the source to all receivers, replicating the same content at di erent times, or at the same time if the content is streaming live. In this thesis, we propose an advanced collaborative multicast routing model for delivering bandwidth hungry streaming content such as IPTV to multiple users having low quality wireless connection, with the help of other nearby users having a better connection quality utilizing the wireless mesh network topology. The model reduces the amount of data replication, which causes signifficant overhead in the network transmissions and intermediate computations, improving the overall throughput of the wireless network. An essential issue in wireless mobile networks is where to position the base stations on the network coverage area. We also propose an alternative adaptive mixed path signal propagation estimation model for planning the base station locations in new deployment areas, based on the signal characteristics of existing networks. In doing so, we utilize digital 3D maps of urban areas and signal measurements to train the adaptive model, and to exploit local similarities in cities in order to estimate the signal channel characteristics and calculate the potential base station locations for covering the new area with a speciffic wireless communication technology.
Efficient personalized learning to rank from implicit feedback for time-sensitive recommendations
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2019., 2019.) Yağcı, Arif Murat.; Gürgen, S. Fikret.; Aytekin, Tevfik.
This thesis focuses on the problems at the intersection of time-sensitive recommendations, implicit user feedback, and learning to rank. Major challenges for achieving time sensitivity are distinguished, the importance of handling implicit feedback is emphasized, and an overview of learning to rank methods is presented with an emphasis on the models that can learn from implicit feedback for time-sensitive recommendations. Subsequently, novel and improved personalized learning to rank methods are proposed to handle large-scale implicit feedback datasets and streams as well as to defeat the different challenges for achieving time-sensitive recommendations. These proposals comprise: (i) Mining the user feedback stream for collaborative filtering and the SASCF algorithm, (ii) Parallel personalized pairwise learning to rank and the PLtR family of algorithms, (iii) Improving the efficiency of top-N predictions from matrix factorization models and the MMFNN meta-algorithm, (iv) Learning intention in user sessions and the BRF family of algorithms, and finally (v) Timely push recommendations in a cold start setting and a hybrid learning to rank approach. Theoretical as well as extensive empirical analyses of the proposed methods on real-life data show significant performance and trade-off improvements with respect to ranking accuracy, adaptivity, diversity, efficiency, and scalability.
Text-based machine learning methodologies for modelling drug-target interactions
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2019., 2019.) Öztürk, Hakime.; Özgür, Arzucan.; Özkırımlı, Elif.
The identi cation of novel interactions between proteins and drugs with computational methodologies constitutes a signi cant area of research. Most often, a drug can be re-purposed to target a novel protein which enables machine learning algorithms to learn from existing interactions to predict unknown interactions. The main goal of this thesis is to model the interactions between proteins and ligands (drug candidates) using their textual representations via machine/deep learning techniques. With that aim, we introduce a novel ligand representation approach and a novel protein representation approach as well as two prediction systems for identifying the strengths of the interactions between proteins and compounds (i.e., their binding a nities). The common theme of these studies is the use of textual representations of proteins (i.e., amino-acid sequences) and compounds (i.e., SMILES). A major advantage of textbased representations is that they are experimentally easier to obtain compared to the three-dimensional (3D) representations and therefore there are more protein/ligand text-based representations available than 3D representations. Furthermore, processing text-based representations is computationally less expensive compared to processing two-dimensional (2D) and 3D representations. We hypothesize that, much like natural languages, bio-chemical sequences have their own languages and processing these languages might reveal important insights about their characteristics. The application of Natural Language Processing (NLP) based approaches in tasks such as protein family/super-family clustering and protein-ligand binding a nity prediction achieved state-of-the-art performance. These results indicate that the textual forms of proteins and ligands can be used to formulate e ective solutions to address di erent bioinformatics and cheminformatics problems.
Ontology-based entity tagging and normalization in the biomedical domain
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2019., 2019.) Karadeniz Erol, Zeynep İlknur.; Özgür, Arzucan.
One of the challenges for scientists in the biomedical domain is the huge amount and the rapid growth of information buried in the text of electronic resources. Developing text mining methods to automatically extract biomedical entities from the text of these electronic resources and identifying the relations between the extracted entities is crucial for facilitating research in many areas in the biomedical domain. Two main problems, which have to be solved to accomplish this goal, are the extraction and normalization of entities, and the identi cation of the relations between them from a given text. In this thesis, we proposed two approaches with two di erent perspectives for the extraction and normalization of biomedical named entities. The rst approach makes use of shallow linguistic knowledge to extract entities and normalize them through an ontology. On the other hand, the second approach makes use of word embeddings, which convey semantic information, for the normalization of the entities in a given text. The word-embedding based approach obtained the state-of-the-art results on the BioNLP Shared Task 2016 Bacteria Biotope data set. Both of the proposed methods are unsupervised and can be adapted to di erent domains. We also developed two applications, one of which is a pipeline, which is composed of modules based on the approaches that we proposed in this thesis, for the extraction of bacteria biotope information from scienti c abstracts. The other application is developed for extracting Brucella-host interaction relevant data from the biomedical literature, whose results reveal the importance of using a wider context than a sentence for biomedical relation extraction.
Community detection in complex networks using local methods and information flow
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2019., 2019.) Taşgın, Mürsel.; Bingöl, Haluk.
Many real world systems can be represented using networks or graphs where ele ments of system are denoted by nodes and their relations by edges. Complex networks are special kinds of these networks with their emergent features created by interactions among nodes. One such emergent feature is the community structure. A community is a group of nodes where nodes within same community have more connections (i.e., edges) with each other than with the nodes in the rest of the network. Identifying such communities is the task of community detection that can be used to identify nodes with similar functions or features, densely connected regions in networks, information ﬂow patterns and spreading of a disease or information in a network. In this thesis, we work on community detection on complex networks using lo cal approach and information diﬀusion. We investigate current algorithms and try to understand the limitations of them. We especially focus on high time-complexity of algorithms because of using global approach, i.e., try to optimize a global metric or perform computations regarding the whole network repeatedly. We propose new algo rithms using local approach (i.e., similarity based on common friends) and information diﬀusion (i.e., gossip spreading). Local approach uses locally available or computable information around a node to identify its community. With this, community detection task can be seen as a set of distributed and parallel tasks running simultaneously on diﬀerent parts of the network. We also propose a variant of label propagation algorithm which decreases its overall execution time by eliminating unnecessary steps. During these studies, we develop a community detection framework which simpliﬁes the task of deﬁning, testing and comparing a new community detection algorithm.
Gait analysis and fall risk assessment with wearable inertial sensors
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2019., 2019.) Tunca, Can.; Ersoy, Cem.
The gold standards for gait analysis are instrumented walkways and camera based motion capture systems which are highly accurate, but require costly infrastruc ture and are only available in hospitals and specialized gait clinics. Hence, a mobile and pervasive alternative suitable for non-hospital settings is a clinical necessity. Us ing wearable inertial sensors for gait analysis has been explored in the literature with promising results; however, the majority of the existing work do not consider realis tic conditions where data collection and sensor placement imperfections are imminent. Moreover, some of the typical underlying assumptions are not compatible with patho logical gait, decreasing the accuracy. To overcome these challenges, we propose a wear able inertial sensor-based gait analysis system that builds upon the state-of-the-art gait analysis methods. Our system copes with various data collection diﬃculties and relaxes some of the assumptions invalid for pathological gait. The system is able to extract a rich set of standard gait parameters and produce average stride proﬁle visualizations, easily interpretable by clinicians. To validate the success of our system and highlight its clinical applicability, we collected a gait dataset from more than 60 neurological disorder patients and conducted a feature selection study to identify the key gait pa rameters in neurological disorders. To demonstrate how the extracted gait parameters can be used for a higher level inference problem, we also introduce an automated fall risk assessment solution, exploiting deep learning methods. The achieved classiﬁcation accuracies outperform the existing solutions. As a ﬁnal contribution, we present the design and evaluation of a computing and communication architecture that shows how the fall risk assessment methods can be transformed into a pervasive healthcare service that can handle numerous users concurrently under realistic conditions.
A collaborative multi-robot localization technique for autonomous robots
(Thesis (Ph.D.)-Bogazici University. Institute for Graduate Studies in Science and Engineering, 2007., 2007.) Bağcı, Hatice Köse.; Akın, H. Levent.
This work proposes a novel method for collaborative global localization of a team of soccer playing autonomous robots. It is also applicable to other indoor real-time robot applications in noisy, unpredictable environments, with insufficient perception. A novel solution, Reverse Monte Carlo Localization (R-MCL) is designed to solve single self-localization problem using local perception and action about the surrounding environment for each robot. R-MCL is a hybrid method based on Markov Localization (ML) and Monte Carlo Localization (MCL) where the ML based part finds the region where the robot should be and the MCL based part predicts the geometrical location with high precision by selecting samples in this region. In the multi-robot localization problem, robots use their own local position estimations, and the shared information from other team mates, to localize themselves. To integrate the local information and beliefs optimally, avoid conflicts and support collaboration among team members, a novel collaborative multi-robot localization method called Collaborative Reverse Monte Carlo Localization (CR-MCL), based on R-MCL, is presented. When robots detect each other, they share the grid cells representing this observation. The power of the method comes from its hybrid nature. It uses a grid based approach to handle detections which can not be accurate in real-time applications, and sample based approach in self-localization to improve its success, although it uses lower amount of samples compared to similar methods. Both methods are tested using simulated robots and real robots and results show that they are fast, robust, accurate and cheap in terms of communication, memory and computational costs.
Crowd - labelling for continuosun - valued annotations
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2018., 2018.) Kara, Yunus Emre.; Akarun, Lale.
As machine learning gained immense popularity across a wide variety of domains in the last decade, it has become more important than ever to have fast and inexpensive ways to annotate vast amounts of data. With the emergence of crowdsourcing services, the research direction has gravitated toward putting ‘the wisdom of crowds’ to use. We call the process of crowdsourcing based label collection crowd-labeling. In this thesis, we focus on crowd consensus estimation of continuous-valued labels. Unfortunately, spammers and inattentive annotators pose a threat to the quality and trustworthiness of the consensus. Thus, we develop Bayesian models taking diﬀerent annotator behaviors into account and introduce two crowd-labeled datasets for evaluating our models. High quality consensus estimation requires a meticulous choice of the candidate annotator and the sample in need of a new annotation. Due to time and budget limitations, it is beneﬁcial to make this choice while collecting the annotations. To this end, we propose an active crowd-labeling approach for actively estimating consensus from continuous-valued crowd annotations. Our method is based on annotator models with unknown parameters, and Bayesian inference is employed to reach a consensus in the form of ordinal, binary, or continuous values. We introduce ranking functions for choosing the candidate annotator and sample pair for requesting an annotation. In addition, we propose a penalizing method for preventing annotator domination, investigate the explore-exploit trade-oﬀ for incorporating new annotators into the system, and study the eﬀects of inducing a stopping criterion based on consensus quality. Experimental results on the benchmark datasets suggest that our method provides a budget and time-sensitive solution to the crowd-labeling problem. Finally, we introduce a multivariate model incorporating cross attribute correlations in multivariate annotations and present preliminary observations.
Reception modeling and achievable rate analysis of sphere - to - sphere molecular communication via diffusion
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2018., 2018.) Genç, Gaye.; Tuğcu, Tuna.
As a widely acknowledged information transfer method in the nanonetworking domain, Molecular Communication via Diﬀusion (MCvD) presents many advantages as well as challenges. In order to assess the capabilities and restrictions of MCvD, a thorough understanding of the reception process and the achievable rate holds utmost importance. With a reﬂective spherical transmitter and a fully absorbing spherical receiver, the network setup becomes more realistic, but analytical derivations become increasingly diﬃcult. In this thesis, we propose two novel heavy-tail distributions to statistically model the distribution of the ﬁrst passage time of messenger molecules (MM), conduct Kolmogorov-Smirnov goodness of ﬁt tests for model validation, and examine the modeling performance under diverse deployment parameters. We also in vestigate MM absorption probability, Signal-to-Interference Ratio, and the advantages of using a reﬂective transmitter. Since the heavy-tailed signal causes Inter-Symbol In terference (ISI), the MCvD channel has memory, and Shannon’s capacity formula for memoryless channels is inapplicable. To this end, we propose an accurate ISI-aware model of demodulation and bit error probabilities for Binary Concentration Shift Key ing modulated MCvD, carry out goodness of ﬁt tests, and prove that the literature’s assumption of a single symbol duration memory is overly optimistic. Furthermore, we adapt the general formulation of the achievable rate for ergodic ﬁnite state ISI channels to MCvD and investigate the eﬀect of deployment parameters, demodulation threshold, and input distribution on the achievable rate. We also present preliminary ﬁndings on estimating the achievable rate with a neural network. Finally, we apply the biological concept of protrusions to MCvD, in order to physically reduce ISI and study the eﬀect of protrusion deployment parameters on the achievable rate.
Finite and small - space automata with advice
(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2018., 2018.) Küçük, Uğur.; Say, Ahmet Celal Cem.
Advice is a piece of trusted supplemental information that is provided to a computing device, in advance of its execution in order to extend its power beyond its limits and hence to assist it in its task. The content of this assistance, which is not restricted to be computable, typically depends only on the length, and not the full content of the actual input to the device. Advised computation has been studied on various computational models and in relation with concepts as diverse as complexity, nonuniform computation, formal languages and pseudorandomness. Several models for providing such external assistance to ﬁnite automata have also been studied by various groups. In this research, we introduce two novel models of advised ﬁnite automata: ﬁnite automata with advice tapes and ﬁnite automata with advice inkdots. In the former model advice is provided in the form of a string which is placed on a separate tape accessible independently from the input. In the latter one, we model advice as a set of uniform marks placed on the input tape which are called inkdots. We examine the power and limits of each of these models in a variety of settings where the underlying model of computation is deterministic, probabilistic or quantum and the advice is deterministically or randomly chosen. The roles of increasing amounts of advice as a computational resource are taken into consideration in various forms. The variants of each model are compared with each other and with the previously studied models of advised ﬁnite automata in terms of language recognition power. The main results of this analysis are demonstrated by establishing various separations, collapses and inﬁnite hierarchies of the language classes that can be recognized with diﬀerent models in varying settings.

Browse

Recent Submissions