Ph.D. Theses
Browsing Ph.D. Theses by Title
Now showing 1 - 20 of 115
Item A Bayesian approach to the clustering problem with application to gene expression analysis (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2016., 2016.) Fidaner, Işık Barış.; Cemgil, Ali Taylan. This thesis investigates methods for the extraction of information from gene expression time series data. These time series provide indirect measurements of the underlying biological mechanisms, hence their analysis depends heavily on statistical modelling techniques. One particularly popular analysis approach is clustering genes by the similarity of their expression profiles. However, for scientific data analysis, clustering requires a rigorous methodology, and Bayesian nonparametrics provides a promising framework. In this context, two novel models were developed: the Infinite Multiway Mixture (IMM), which extends the standard infinite mixture model, and the Infinite Mixture of Piecewise Linear Sequences (IMPLS), which assumes a specific structure for its mixture components, tailored towards gene expression time series. In the Bayesian paradigm, the key object for gene analysis is the posterior distribution over partitionings, given the model and observed data. However, a posterior distribution over partitionings is a highly complicated object. Here, we apply Markov Chain Monte Carlo (MCMC) inference to obtain a sample from the posterior distribution of gene partitionings, and cluster genes by a heuristic algorithm. An alternative, novel approach for the analysis of distributions over partitionings, which we call entropy agglomeration (EA), is also developed. We demonstrate the use of EA in a clustering experiment on a literary text, Ulysses by James Joyce. In our bioinformatics application CLUSTERnGO (CnG), the relevance of the resulting clusters is evaluated by applying standard multiple hypothesis testing to compare them against previous biological knowledge encoded in terms of the Gene Ontology.
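The central object above, a distribution over partitionings of the genes, can be illustrated with a toy draw from a Chinese restaurant process prior, the standard prior behind infinite mixture models such as IMM. This is a hypothetical sketch for intuition only, not the thesis code.

```python
import random

def sample_crp_partition(n_items, alpha, rng=random.Random(0)):
    """Draw one partition of n_items from a Chinese restaurant process
    prior with concentration alpha: each item joins an existing cluster
    with probability proportional to its size, or opens a new one with
    probability proportional to alpha. (Illustrative stand-in for the
    infinite mixture priors discussed above.)"""
    clusters = []  # each cluster is a list of item indices
    for i in range(n_items):
        weights = [len(c) for c in clusters] + [alpha]
        r = rng.uniform(0, sum(weights))
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w
            if r <= acc:
                break
        if k == len(clusters):
            clusters.append([i])   # open a new cluster
        else:
            clusters[k].append(i)  # join existing cluster k
    return clusters

partition = sample_crp_partition(10, alpha=1.0)
print(partition)
```

An MCMC sampler like the one used in the thesis explores many such partitions, weighted by how well each explains the observed expression profiles.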
The complete workflow of CnG consists of a four-phase pipeline (Configuration, Inference, Clustering, Evaluation).
Item A collaborative multi-robot localization technique for autonomous robots (Thesis (Ph.D.)-Bogazici University. Institute for Graduate Studies in Science and Engineering, 2007., 2007.) Bağcı, Hatice Köse.; Akın, H. Levent. This work proposes a novel method for the collaborative global localization of a team of soccer-playing autonomous robots. It is also applicable to other indoor real-time robot applications in noisy, unpredictable environments with insufficient perception. A novel solution, Reverse Monte Carlo Localization (R-MCL), is designed to solve the single-robot self-localization problem using each robot's local perception of, and actions in, the surrounding environment. R-MCL is a hybrid method based on Markov Localization (ML) and Monte Carlo Localization (MCL), where the ML-based part finds the region where the robot should be and the MCL-based part predicts the geometric location with high precision by selecting samples in this region. In the multi-robot localization problem, robots use their own local position estimates and the information shared by their teammates to localize themselves. To integrate the local information and beliefs optimally, avoid conflicts and support collaboration among team members, a novel collaborative multi-robot localization method called Collaborative Reverse Monte Carlo Localization (CR-MCL), based on R-MCL, is presented. When robots detect each other, they share the grid cells representing this observation. The power of the method comes from its hybrid nature: it uses a grid-based approach to handle detections, which cannot be accurate in real-time applications, and a sample-based approach in self-localization to improve its success, while using far fewer samples than similar methods.
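The hybrid coarse-then-fine idea behind R-MCL can be sketched in one dimension: a grid scan (the ML-style part) picks the most likely region, and weighted samples (the MCL-style part) refine the estimate inside it. All names and the sensor model here are made up for illustration; this is not the thesis implementation.

```python
import random

def r_mcl_estimate(observe_likelihood, world_size, cell, n_samples,
                   rng=random.Random(1)):
    """Toy 1-D illustration of the R-MCL idea: a coarse grid picks the
    most likely region, then samples drawn only inside that region give
    a precise weighted-mean position estimate."""
    # ML-style part: score each grid cell at its center
    n_cells = world_size // cell
    scores = [observe_likelihood((c + 0.5) * cell) for c in range(n_cells)]
    best = max(range(n_cells), key=scores.__getitem__)
    # MCL-style part: weighted samples drawn only inside the winning cell
    xs = [rng.uniform(best * cell, (best + 1) * cell) for _ in range(n_samples)]
    ws = [observe_likelihood(x) for x in xs]
    return sum(x * w for x, w in zip(xs, ws)) / sum(ws)

true_pos = 3.7
lik = lambda x: 1.0 / (1.0 + (x - true_pos) ** 2)  # fake sensor model
est = r_mcl_estimate(lik, world_size=10, cell=1, n_samples=200)
print(round(est, 1))  # close to the true position, 3.7
```

Because sampling is confined to one cell, far fewer samples are needed than in plain MCL over the whole field, which matches the efficiency claim above.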
Both methods are tested using simulated and real robots, and the results show that they are fast, robust, accurate and cheap in terms of communication, memory and computational costs.
Item A concurrent programming system with a flexible structure (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 1984., 1984.) Şıkoğlu, Tamer.; Balman, Tunç. This dissertation describes the design and implementation of a concurrent programming system called CPS/1100, a modelling language used in systems programming. Processor dedication and program trace capabilities are the distinguishing characteristics of the language. In addition to these features, the implementation scheme used in developing the language is such that it can easily be modified and extended to develop languages for specific applications. This flexible character of the language can be considered the main contribution of the present work. Structurally, CPS/1100 is an extension of a Pascal compiler by prescanner and postscanner routines. The prescanner routines provide syntax and semantic checks of the CPS/1100 statements and the control directives, while the postscanner routine modifies the assembler code produced by the Pascal compiler. Concurrency primitives of Exec8, the system program of Univac 1100 machines, are utilized to maintain task creation, removal and control. This dissertation provides an overview of concurrent programming and multitasking on Univac 1100 systems. The statements, structure and implementation details of CPS/1100 are explained. Typical concurrent programming problems are solved using CPS/1100 to demonstrate its capabilities. Finally, the study is evaluated.
Item A flexible approach for context-aware service selection in agent-mediated e-commerce (Thesis (Ph.D.)-Bogazici University. Institute for Graduate Studies in Science and Engineering, 2008., 2008.)
Şensoy, Murat.; Yolum, Pınar. In this dissertation, we examine the problem of service selection in an e-commerce setting where consumer agents cooperate to identify the providers that would best satisfy their service needs. There are three major challenges related to service selection: (i) representing consumers' past dealings with providers, (ii) handling deceptive information and (iii) managing the evolution of consumers' service needs and semantics. Previous approaches represent consumers' past dealings with providers only as ratings. Even though context is crucial for interpreting ratings correctly, ratings do not contain any contextual information explicitly. A rating merely represents the subjective opinion of a consumer about a provider. Because the satisfaction criteria and expectations of the rater are unknown, it is almost impossible to make sense of a rating. In this dissertation, consumers' experiences with providers are represented with an ontology that can semantically capture the requested service and the received service in detail. When a consumer decides to share its experiences with another consumer, the receiving consumer evaluates the experience using its own context and satisfaction criteria. By sharing experiences rather than ratings, consumers can model providers more accurately and thus select providers that are better suited to their needs. The service selection approach proposed in this dissertation is flexible enough to enable consumers to evolve their service semantics over time; context-aware and consumer-oriented, enabling consumers to use their own satisfaction criteria and context during service selection; and robust to deception, leading to satisfactory service selections even in highly deceptive environments.
Item A scheduling model for centralized cognitive radio networks (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2012., 2012.)
Gözüpek, Didem.; Alagöz, Fatih. In this thesis, we present a scheduling model for centralized cognitive radio networks. Our model consists of a set of schedulers that focus on the data transmission of the secondary users and determine with which frequency, time slot and data rate each secondary user will transmit to the cognitive base station. Common to all the schedulers is that they ensure that the primary users in the service area of the cognitive base station are not disturbed, that no collisions occur among the secondary users, and that reliable communication between the secondary users and the cognitive base station is maintained. Our schedulers differ from each other mainly in their objectives. We propose schedulers that maximize the overall cognitive radio cell throughput, minimize the average scheduling delay of the secondary users, provide max-min, weighted max-min and proportional throughput fairness, maximize the number of secondary users that are satisfied in terms of throughput, and take into account the different delay costs of switching to different frequency bands. In addition to heuristic algorithms and simulation-based studies, we also present a graph-theoretic approach, prove several NP-hardness and inapproximability results, and propose polynomial-time graph algorithms as well as approximation algorithms.
Item ADES: automatic driver evaluation system (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2011., 2011.) Kaplan, Kemal.; Akın, H. Levent. Most traffic accidents, both worldwide and in our country, are caused by drivers. The ADES (Automatic Driver Evaluation System) project aims to present a framework for integrating different applications for driver evaluation. The proposed system can be divided into two main modules.
The first, the data acquisition and processing module, acquires sensor information from the outside world and processes this data to present valuable information to the decision system. The system may benefit from built-in sensors such as cameras or GPS (Global Positioning System) receivers, as well as non-standard devices such as RFID (Radio Frequency Identification) readers. The second module is the inference engine, which processes the information provided by the first module and makes judgments about the actions of the driver. Two sample expert system designs are proposed in the project. The developed solution is tested in a simulation environment and on real video recordings.
Item Algorithms for learning from online human behavior and human interaction with learning algorithms (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2022., 2022) Çapan, Gökhan.; Özgür, Arzucan.; Cemgil, Ali Taylan. In modern digital systems, algorithms that deliver personalized content shape the user experience and affect user satisfaction, and hence long-term engagement with the system. What the system presents also influences the parties providing content to the system, since visibility to the user is vital for reachability. Such algorithms learn to deliver personalized content using data on previous user behavior, e.g., choices, clicks and ratings, interpreted as a proxy for user preferences. In the first part of this work, we review prevalent models for learning from user feedback on content, including our contributions to the literature. As such data is ever-growing, we discuss computational aspects of learning algorithms and focus on software libraries for scalable implementations, including our contributions. The second part is on learning from user interactions with algorithmic personalization systems.
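The idea of treating clicks as a noisy proxy for preference can be illustrated with a minimal Beta-Bernoulli model: each click or skip updates a per-item belief about how much the user likes the item. This is an illustrative sketch of the general setting, not the model developed in the thesis; the item names are made up.

```python
from collections import defaultdict

class BetaPreference:
    """Minimal Beta-Bernoulli model of per-item user preference learned
    from binary click feedback (an illustrative sketch only)."""
    def __init__(self, a=1.0, b=1.0):
        self.a = defaultdict(lambda: a)  # pseudo-count of clicks
        self.b = defaultdict(lambda: b)  # pseudo-count of skips
    def update(self, item, clicked):
        if clicked:
            self.a[item] += 1
        else:
            self.b[item] += 1
    def mean(self, item):
        """Posterior mean probability that the user likes `item`."""
        return self.a[item] / (self.a[item] + self.b[item])

m = BetaPreference()
for _ in range(8):
    m.update("song_a", True)   # consistently clicked
    m.update("song_b", False)  # consistently skipped
print(m.mean("song_a") > m.mean("song_b"))  # True
```

Models of this shape ignore what the user was actually shown, which is exactly the exposure bias the second part of the thesis addresses.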
Though helpful, human behavior is subject to cognitive biases, and data sets comprising item choices are subject to sampling biases, posing problems for learning algorithms that rely on such data. As users interact with the system, the problem worsens: the algorithms use biased data to compose future content, and further self-reinforce their inaccurate beliefs about user preferences. We review some of these biases and investigate a particular one: the user's tendency to choose from the alternatives presented by the system, putting the least effort into exploring further. To account for it, we develop a Bayesian choice model that, when inferring user preferences, explicitly incorporates the user's limited exposure to a systematically selected subset of items chosen by an algorithm. The model leads to an efficient online learning algorithm of user preferences through interactions.
Item Alternative spectrum trading architectures in cognitive radio networks: spectrum exchange, CRM, strict power control (Thesis (Ph.D.)-Bogazici University. Institute for Graduate Studies in Science and Engineering, 2010., 2010.) Alptekin, Gülfem Işıklar.; Bener, Ayşe B. The underutilization of spectrum, coupled with developments in network technologies, has prompted a number of proposals for managing spectrum. Dynamic spectrum access radio technology, which is based on cognitive radio technology, promises to increase spectrum sharing and thus overcome the lack of available spectrum for new communication services. It allows unlicensed secondary systems to share the spectrum with licensed systems. In this dissertation, different architectures are investigated in a cognitive radio environment. The considered network is assumed to consist of multiple primary service providers that have some unutilized bandwidth, and multiple secondary users that require bandwidth.
Secondary users are assumed to pay the primary service providers for short-term usage of their available spectrum bands, which is referred to as spectrum trading. The proposed architectures all aim at establishing a framework in which each type of user is satisfied with the service it receives. As each new entrant secondary user creates interference for the incumbent users, controlling power emission in a cognitive radio network is crucial in spectrum trading. Furthermore, the proposed architectures examine the unit spectrum prices that primary service providers set in a multiple-seller, multiple-buyer environment. Modeling the competitive relationships among network elements as games makes it possible to analyze all elements' behaviors and actions in a formalized way. The existence of various network elements, each wanting to maximize its own profit, makes the problem very complex, with usually conflicting objective functions. Therefore, the proposed pricing models are based on game theory in order to deal with the severe competition in spectrum trading markets.
Item An application aware utility based lifetime quantification framework for wireless sensor networks (Thesis (Ph.D.)-Bogazici University. Institute for Graduate Studies in Science and Engineering, 2009., 2009.) Özgövde, Bahri Atay.; Ersoy, Cem. In this work, we devise a framework for incorporating application dependence into the lifetime measurement process of wireless sensor networks, and thereafter, via extensive experiments, demonstrate the significance of the lifetime metric itself in the quantification process for a variety of application scenarios, including both scalar and video-based wireless sensor networks. We show that lifetime metrics that ignore application dependence fail to solve the network lifetime quantification problem in WSNs. Our proposed framework, weighted cumulative operational time (WCOT), combines two distinct mechanisms for realistic and application-context-aware network lifetime evaluation.
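One minimal reading of a metric like WCOT is lifetime as a utility-weighted sum of operational time, rather than a single cut-off threshold. The form below is an assumption for illustration, not the exact formula from the thesis.

```python
def wcot(utilities, dt=1.0):
    """Weighted cumulative operational time: accumulate each interval's
    operational time weighted by the application-level utility u(t)
    during it. `utilities` holds u(t) in [0, 1] sampled every dt.
    (Assumed form for illustration; see the thesis for the exact metric.)"""
    return sum(u * dt for u in utilities)

# A network whose utility decays gradually scores higher than one that
# fails abruptly, even if both are "half-dead" at the same instant:
gradual = [1.0, 0.9, 0.7, 0.5, 0.3, 0.1, 0.0]
abrupt  = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(round(wcot(gradual), 2), round(wcot(abrupt), 2))  # 3.5 3.0
```

A threshold-based lifetime metric would rate both networks identically if they cross the threshold at the same time; the weighted sum distinguishes how they degrade.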
Firstly, by introducing the utility function, it enables the users of the network to inscribe their own application-level requirements in a formal setting. This clarifies the inherent subjectivity due to the application dependence involved in the WSN network lifetime quantification problem by transforming it into a form that renders further computation possible. The utility function denotes the total cumulative utility (usefulness) offered by the collaboration of the sensor nodes. Secondly, instead of offering a single cut-off threshold value for defining the point after which the network is assumed to be non-functional, the WCOT framework makes use of the gradual change in the utility of the network and records how the network evolves over time in terms of the functionality offered, by keeping the weighted sum of the operational time. Unlike lifetime metrics that focus on a single threshold value, WCOT is able to differentiate network performances that differ in how the network evolves until the utility drops to zero.
Item An efficient evolutionary clustering and prediction model for gene expression time series data (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2014., 2014.) Erdem, Atakan.; Gündem, Taflan. Because manual methods are used in some parts of gene expression experiments, the reliability of the data is low. If this data is used directly as input to a data mining algorithm or a model for evaluating gene expression data, adverse effects on the desired results are inevitable. In order to eliminate these adverse effects and reduce the fuzziness, we represent the data with sample data sets generated using uncertain data management techniques. The sample data approach reduces the percentage of fuzziness, but it also increases the output generation time due to the increase in the amount of processed data, which is directly proportional to the cardinality of the sample data set.
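The sample-data-set representation of uncertain measurements can be sketched as follows: each noisy expression value, modeled here as a (mean, std) pair, is replaced by a set of concrete samples that downstream algorithms can process. The Gaussian noise model and all values are assumptions for illustration, not the thesis technique verbatim.

```python
import random

def sample_sets(measurements, n_samples, rng=random.Random(42)):
    """Represent each uncertain gene-expression measurement (mean, std)
    by a set of concrete samples drawn from an assumed Gaussian noise
    model (illustrative; the thesis uses uncertain data management
    techniques to build such sample sets)."""
    return [[rng.gauss(mu, sd) for _ in range(n_samples)]
            for (mu, sd) in measurements]

noisy = [(2.0, 0.5), (7.5, 1.0)]   # hypothetical expression levels
sets = sample_sets(noisy, n_samples=100)
print(len(sets), len(sets[0]))     # 2 100
```

The cost trade-off noted above is visible here: every downstream pass now touches n_samples values per measurement instead of one.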
In the first part of the study, we introduce an uncertain data clustering algorithm, named M-FDBSCAN, which clusters uncertain data rapidly and runs on multi-core systems in a concurrent fashion. We show that the proposed method yields considerable performance improvement on single-core systems as well. In the second part of the study, the M-FDBSCAN algorithm is converted into an evolutionary clustering algorithm, named E-MFDBSCAN, with which time series data can be processed rapidly and efficiently. This new algorithm enables the generation of global clusters. In the last part of the study, a prediction model is constructed using the time-based evolutionary patterns of the global clusters. The proposed prediction model enables us to predict the patterns and the similarities of a global cluster that will be generated at the next time point.
Item An issue recommender model using the developer collaboration network (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2014., 2014.) Çağlayan, Bora.; Bener, Ayşe B.; Tosun, Oğuz. The assignment of new issues to developers is an important part of software maintenance activities. In this research, we build an issue recommendation model that recommends new issues to developers and highlights the defect-prone software modules to the developer who owns an issue. Existing solutions address the automated issue assignment problem through text mining. We propose a recommender model that uses the collaboration network of developers on software modules and structured issue data as its input. We perform an exploratory analysis using the issue data of two large software systems and observe the trends in issue ownership, issue-defect relations and issue timelines. Our base model estimates developer centrality for each issue category and recommends issues to developers based on their centrality ranking.
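The centrality-ranking idea of the base model can be sketched with degree centrality on a tiny collaboration network: developers who collaborated most within an issue category rank first for new issues of that category. Names, data and the choice of degree centrality are all illustrative assumptions, not the thesis model.

```python
from collections import defaultdict

def recommend(issue_category, collaborations):
    """Rank developers for an issue category by degree centrality in the
    developer collaboration network restricted to that category
    (a simplified sketch of the centrality-based base model)."""
    degree = defaultdict(int)
    for dev_a, dev_b, category in collaborations:
        if category == issue_category:
            degree[dev_a] += 1
            degree[dev_b] += 1
    # developers sorted by descending centrality
    return sorted(degree, key=degree.get, reverse=True)

# (dev, dev, issue category) collaboration edges, made up for the example
edges = [("ann", "bob", "ui"), ("ann", "eve", "ui"), ("bob", "eve", "db")]
print(recommend("ui", edges))  # "ann" ranks first: two ui collaborations
```

Ranking purely by centrality tends to overload central developers, which is exactly the workload-imbalance problem the extended model addresses.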
We test the performance of our recommender using the maintenance data of a globally developed large enterprise software system, with recommendation accuracy and workload imbalance as the performance metrics. We extend the recommender to address (i) the problem of developer workload imbalance, and (ii) the problem of assigning issues to a new group of developers, by using stochastic Kronecker networks to model the future structure of the collaboration network. We change a learning-based defect predictor's output based on recent history to update the predicted defect-prone software modules on a real-time basis. Our empirical analysis shows that: (i) the performance of our recommender model approximates historical trends of issue allocation; (ii) heuristics applied to the model output reduce the issue ownership inequalities and the model approximation performance; (iii) Kronecker networks can be used to estimate future collaboration on the issues, and the model can be extended with them to propose issues to new developer groups; (iv) a real-time defect prediction model can significantly improve the probability of detection over time while not changing false alarm rates.
Item Analysis of reception process for an absorbing receiver in molecular communication via diffusion (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2015., 2015.) Akkaya, Ali.; Tuğcu, Tuna. Nanotechnology is currently being applied to a vast number of fields to overcome the challenges faced by existing technologies that cannot efficiently scale down to the nano level. However, considering the limited processing and memory resources of nano-machines, performing complex tasks requires new communication mechanisms. Communication is one of the important issues to be addressed in nano-scale environments. Inspired by nature, molecular communication via diffusion is a candidate to address this issue.
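The basic simulation setting of molecular communication via diffusion with an absorbing receiver can be sketched as a random walk: molecules emitted at a transmitter diffuse in 3-D and are removed on first contact with a spherical receiver. All parameters here are made up, and this naive single-zone model is exactly the kind of simulation the thesis's dual-zone and HLA-based architectures are designed to speed up.

```python
import random

def simulate_absorbed(n_mol, steps, step_sd, rx_distance, rx_radius,
                      rng=random.Random(7)):
    """Count messenger molecules absorbed by a spherical receiver in a
    toy 3-D random-walk model of diffusion (illustrative sketch only)."""
    absorbed = 0
    for _ in range(n_mol):
        x, y, z = 0.0, 0.0, 0.0  # emitted at the transmitter
        for _ in range(steps):
            x += rng.gauss(0, step_sd)
            y += rng.gauss(0, step_sd)
            z += rng.gauss(0, step_sd)
            # absorbed (and removed) on first contact with the receiver
            if (x - rx_distance) ** 2 + y ** 2 + z ** 2 <= rx_radius ** 2:
                absorbed += 1
                break
    return absorbed

hits = simulate_absorbed(n_mol=500, steps=200, step_sd=0.5,
                         rx_distance=4.0, rx_radius=2.0)
print(hits)  # some, but not all, molecules reach the receiver
```

The fraction absorbed as a function of time is the received signal whose shape the thesis characterizes in closed form.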
Although the reception process of the messenger molecules has a significant impact on the performance of molecular communication via diffusion, the factors that affect the received signal for an absorbing receiver have not been investigated in the literature. In this thesis, we first introduce methods for the efficient simulation of molecular communication via diffusion to enable further analysis. We propose two novel simulation architectures: a dual-zone simulation model that decreases execution time while preserving simulation accuracy, and an HLA-based architecture for distributed simulation of molecular communication via diffusion. Then, we analyse different dimensions of the reception process for an absorbing receiver to derive closed-form formulations. The results presented enable optimizations that will have a direct effect on the production costs of receptors and receivers. Finally, we propose a new approach for the demodulation of information for an absorbing receiver and analyse energy consumption and data rate for the proposed model.
Item Application mapping and optimization for CMP based architectures (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2011., 2011.) Demiröz, Betül.; Tosun, Oğuz.; Topçuoğlu, Haluk Rahmi. Chip Multiprocessors (CMPs) are becoming standard and primary building blocks for personal computers as well as large-scale parallel machines, including supercomputers. In this thesis, our main focus is on performance-aware mapping and optimization of application threads onto multicore architectures. Specifically, we propose three different approaches: a data-to-core mapping methodology, a thread-to-core mapping methodology, and a cache-centric data assignment methodology that includes data-to-thread mapping.
To demonstrate the data-to-core mapping methodology, we propose two novel parallel formulations of the Barnes-Hut method on the Cell Broadband Engine architecture, taking into account the technical specifications and limitations of the Cell architecture. Our experimental evaluation indicates that the Barnes-Hut method performs much faster on the Cell architecture than on the reference architecture, an Intel Xeon based system. To present the thread-to-core mapping methodology, we propose a framework that uses helper threads running in parallel with application threads, which dynamically observe the behavior of the application threads and their data access patterns. These helper threads calculate data sharing among application threads, cluster them for mapping to the available cores, use cache counters to calculate the efficiency of a mapping, and make the mapping decision after considering the execution needs. Our final methodology provides a locality-aware mapping algorithm, which aims to assign computations of an application with similar data access patterns to the same core. Our algorithm divides the computations of the application into chunks to provide load balancing, and sets of chunks with high similarity are grouped into bins to provide data locality. We consider sparse matrix-vector multiplication as the reference application.
Item Application of the 3G PCS technologies to the mobile subsystem of the next generation tactical communications systems (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 1999., 2000.) Çayırcı, E. (Erdal); Ersoy, Cem. The advances in high-speed computation, multimedia and wireless communications promise new opportunities to develop more robust and agile battlefield communications systems. However, there are some major distinctions between military and commercial communications.
The two most basic of these are the hostile battlefield environment and the rapid deployment requirement. In our study, we discuss how to employ emerging and evolving civilian commercial technologies, namely third generation (3G) Personal Communications Services (PCS) techniques, for military communications in spite of these distinctions. We propose an approach called Virtual Cell Layout (VCL), in which the communications area is tessellated with regularly shaped, fixed virtual cells starting from a geographic reference location. The radio resources are managed in a multitier hybrid network by employing both cellular and ad hoc techniques and using VCL. We also propose a simulation approach for the performance evaluation of tactical communications systems. In this approach, the commands entered during military computer-aided exercises are replayed by running a constructive (combat) model that generates mobility, posture and status data for a number of units; these data are then enhanced and drive a simulation which produces the data related to the performance metrics. The evaluated performance of the system shows that the VCL-based architecture satisfies the rapid deployment requirement and gives an acceptable grade of service.
Item Assessing and enhancing machine learning methods in IVF process: predictive modeling of implantation and blastocyst development (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2011., 2011.) Özkaya, Aslı Uyar.; Bener, Ayşe B. In this thesis, we address the decision-making problems in in vitro fertilization treatment from the machine learning perspective, aiming to increase clinical success rates. Initially, we present a comprehensive and comparative analysis of classification techniques in embryo-based implantation prediction.
In parallel, we evaluate the predictive effects of the input features in order to eliminate redundant variables and decide on the optimum feature subset leading to the highest prediction performance. In contrast to the limited relevant literature, our preliminary experiments demonstrate the potential of machine learning classifiers as an automated decision support tool in critical decisions affecting the success of the treatment. Later, we focus on improving the classification performance either through algorithmic enhancements or by improving the information content of the data. First, we handle the problem of imbalanced class distribution and show that decision threshold optimization and re-sampling the training data produce similar results. Second, we propose a frequency-based encoding technique to efficiently transform categorical variables into continuous numeric values. Third, in addition to the patient and embryo characteristics, we investigate the effect of individual physicians as a human factor on the pregnancy outcome. Finally, we apply Bayesian networks to model the embryo growth process with the objective of blastocyst score prediction. We propose a novel approach to adjust the frequency estimates for parameter learning in conditional probability tables. The results of the experiments show that (i) standard machine learning algorithms enable acceptable prediction of implantation and blastocyst score, and (ii) the prediction performance can be improved by using the techniques proposed in this study. From the clinical perspective, our results have practical implications for reducing multiple pregnancies, preventing the waste of embryos and avoiding the cancellation of transfers.
Item Auction and barter models for electronic markets (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2011., 2011.) Özer, Ali Haydar.; Özturan, Can. We propose three auction- and barter-based electronic market models.
Our first model is a direct barter model for the course add/drop process at universities. We model the course add/drop process as a direct barter problem in which add/drop requests can be placed as barter bids. We also introduce a two-level weighting system that enables students to express priorities among their requests while providing fairness among the students. Our second model is the multi-unit differential auction-barter model, which augments the double auction model with barter bids so that, besides the usual purchase and sale activities, bidders can also carry out direct bartering of items. Our model also provides a mechanism for making or receiving a differential money payment as part of the direct bartering of items, hence allowing the bartering of items of different values. Furthermore, a powerful and flexible bidding language is designed which allows bidders to express their complex preferences over purchase, sale and exchange requests. Our third model is the double auction with limited cover money model. In this model, we propose the use of a discrete-time double auction institution for the trading of used goods as well as new ones. Our model allows the declaration of an amount of cover money so that what is spent on purchased items, minus the proceeds of sold items, does not exceed this cover money amount. We also introduce a mechanism by which bidders may place multiple item requests in a single bid and limit the maximum number of items to be purchased. We formally define these three models and formulate the corresponding optimization problems. We propose fast polynomial-time network-flow-based algorithms for optimizing the first and second models, and show that the decision version of the optimization problem for the third model is NP-complete. The performance of our algorithms is also demonstrated on various test cases.
Item Automated query-biased and structure-preserving document summarization for web search tasks (Thesis (Ph.D.)-Bogazici University.
Institute for Graduate Studies in Science and Engineering, 2010., 2010.) Pembe, Fatma Canan.; Güngör, Tunga. With the drastic increase in available information sources on the Internet, people with different backgrounds all over the world share the same problem: locating useful information for their actual needs. Search engines provide a means for users to locate documents on the Web via queries. However, users still have to perform the sifting process by themselves; i.e., to decide the relevance of each document with respect to their actual information needs. At this point, automatic summarization techniques can complement the task of search engines. Currently available search engines, such as Google and AltaVista, show only a limited capability in summarizing Web documents; e.g., displaying as the summary only two or three lines of text fragments consisting of the query words and their surrounding text. In the literature, most research in automatic summarization has focused on creating general-purpose summaries without considering user needs. Also, summarization approaches have mostly seen a document as a flat sequence of sentences and ignored the structure within documents. In the summarization literature, the effects of query-biased techniques and document structure have been considered in only a few studies, and investigated separately. This research is distinguished from previous work by combining these two aspects in a coherent framework. In this thesis, we propose a novel summarization approach for Web search: query-biased and structure-preserving document summarization. The proposed system consists of two main stages. The first stage is the structural processing of Web documents in order to extract their section and subsection hierarchy together with the corresponding headings and subheadings. A document in the system is represented as an ordered tree of headings, subheadings and other text units.
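The ordered-tree document representation and query-biased extraction can be sketched together: a document is a tree of sections, and sentences are scored by overlap with the query. The one-level scoring below is a deliberate simplification of the thesis's two-level (sentence and section) scheme, and the example document is made up.

```python
class Section:
    """A document as an ordered tree of (sub)sections, each with a
    heading, its own sentences, and child sections (sketch only)."""
    def __init__(self, heading, sentences=(), children=()):
        self.heading = heading
        self.sentences = list(sentences)
        self.children = list(children)

def query_score(text, query_words):
    """Count occurrences of query words in the sentence (toy scoring)."""
    words = text.lower().split()
    return sum(words.count(q) for q in query_words)

def best_sentences(sec, query_words, k=2):
    """Query-biased extraction: walk the section tree, score every
    sentence by query overlap, keep the top k."""
    pool, stack = [], [sec]
    while stack:
        s = stack.pop()
        pool.extend(s.sentences)
        stack.extend(s.children)
    return sorted(pool, key=lambda t: -query_score(t, query_words))[:k]

doc = Section("Guide", ["Intro text."],
              [Section("Search", ["Search engines index the web.",
                                  "Queries retrieve documents."]),
               Section("Misc", ["Unrelated sentence."])])
print(best_sentences(doc, ["search", "queries"]))
```

A structure-preserving summarizer would additionally score whole sections and emit the selected sentences under their original headings, rather than as a flat list.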
First, we developed a rule-based approach based on heuristics and HTML Document Object Model tree processing. Then, we developed a machine learning approach based on the tree representation using support vector machine (SVM) and perceptron algorithms. The methods were evaluated based on the accuracy of heading extraction and hierarchy extraction. The second stage of the research is to develop automatic summarization methods by utilizing the document structures obtained in the first stage. In the proposed method, the summary sentences are extracted in a query-biased way based on two levels of scoring: sentence scoring and section scoring. Document structure is utilized both in the summarization process and in the output summaries. The performance of the proposed system has been determined using several task-based evaluations. These include information retrieval tasks in which the summaries are actually used. The results of the experiments on Turkish and English documents show that the proposed system's summaries are superior to Google extracts and to unstructured query-biased summaries of the same size in terms of accuracy, with reasonable judgment times. User ratings verify that query-biased and structure-preserving summaries are also found more useful by the users.
Item Automated reasoning on exceptions in commitment-based multiagent systems(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2012., 2012.) Kafalı, Remzi Özgür.; Yolum, Pınar.Exceptions constitute a significant portion of people's lives. When things do not go as planned, due to environmental reasons or because one does not fulfill his responsibility in a given task, unexpected situations occur. When faced with exceptions, people need to deal with them in a timely fashion in order to restore proper operation. However, dealing with exceptions is not an easy task for people to accomplish.
First, it requires understanding that something has gone wrong (detection). Second, the actual source of the problem needs to be identified (diagnosis). Moreover, in some situations, identifying that an exception will possibly occur in the future helps change the course of previously planned actions in order to avoid the exception (prediction). Accordingly, this thesis proposes to use agents for automating the reasoning on exceptions. We model the problem domains with open multiagent systems, and use commitments to formalize agent interactions. We propose automated methods based on computational logic for detecting, predicting, and diagnosing exceptions. We prove that our methods are sound and complete. We study our methods on two domains, online social networks and e-commerce, which exhibit different characteristics for the exceptions that may arise in them. Our specific contributions in this thesis are threefold. First, we extend the scope of detected exceptions in the literature such that an exception is not limited to a commitment violation. Second, we provide a prediction system based on model checking that identifies exceptions before they even occur. Finally, we investigate the temporal relations among commitments in order to diagnose what has gone wrong during an agent's execution.
Item Automatic synthetic benchmark generation for multicore systems(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2015., 2015.) Deniz, Etem.; Şen, Alper.We present a novel automated benchmark synthesis framework for multicore systems, including CPUs and GPUs, with characterization and generation components to speed up architectural simulation of modern architectures. We first identify a set of important application characteristics for CPUs and GPUs.
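For illustration only, a record of such application characteristics, and a crude check of how closely a synthetic proxy matches the original application, might be sketched as follows; every field name and value here is hypothetical and not the actual feature set used:

```python
from dataclasses import dataclass

@dataclass
class AppCharacteristics:
    """Hypothetical per-application characterization record."""
    thread_count: int     # degree of parallelism observed
    instr_mix: dict       # fraction of instructions per category (sums to 1.0)
    sync_rate: float      # synchronization ops per kilo-instruction

def mix_overlap(a: AppCharacteristics, b: AppCharacteristics) -> float:
    """Overlap between two instruction-mix histograms, in [0, 1]."""
    keys = set(a.instr_mix) | set(b.instr_mix)
    return sum(min(a.instr_mix.get(k, 0.0), b.instr_mix.get(k, 0.0)) for k in keys)

orig = AppCharacteristics(8, {"int": 0.5, "fp": 0.3, "mem": 0.2}, 1.5)
synth = AppCharacteristics(8, {"int": 0.5, "fp": 0.25, "mem": 0.25}, 1.4)

score = mix_overlap(orig, synth)   # close to 1.0 when the mixes match
```

A generation step would aim to emit a small benchmark whose measured record scores near 1.0 against the original's.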
Then, our framework captures these characteristics of original multicore applications and generates synthetic multicore benchmarks from them, where the synthetic benchmarks are miniaturized forms of the applications that allow high simulation speeds and act as proxies for proprietary applications. We use parallel software architectural patterns in capturing important characteristics of CPU applications, applying different machine learning techniques in a novel approach to automatically detect the parallel patterns used in applications. In addition, we compare these techniques in terms of accuracy and speed and demonstrate that detecting parallel patterns is crucial for performance improvements and enables many architectural optimizations. The resulting synthetic benchmarks are small, fast, portable, and human-readable, and they accurately reflect the key characteristics of the original multicore applications. Our synthetic CPU benchmarks use either Pthreads or Multicore Association (message passing and resource management) libraries, and our synthetic GPU benchmarks use the OpenCL library. To the best of our knowledge, this is the first time synthetic OpenCL benchmarks for GPUs have been generated from existing applications. We implement our techniques for CPUs in the MINIME tool and generate synthetic benchmarks. Similarly, we implement our techniques for GPUs in the MINIME-GPU tool and experimentally validate them.
Item Bayesian approaches for privacy preserving data sharing(Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2020., 2020.) Ermiş, Beyza.; Cemgil, Ali Taylan.In this thesis, we focus on the data fusion problem, where we have heterogeneous data collected from different sources and stored in the form of matrices and higher-order tensors, and propose coupled matrix and tensor factorization models to jointly analyze these relational datasets.
This method performs simultaneous factorization of matrices and tensors by extracting the common latent factors from the shared modes. We develop coupled models using various tensor models and cost functions for the missing link prediction problem and report successful empirical results. Most of the time, the data matrices and tensors are distributed between several parties. Sharing information across those parties brings a privacy protection requirement; therefore, the second problem we handle is protecting the privacy of distributed and heterogeneous datasets. We exploit the connection between differential privacy and sampling from a Bayesian posterior to derive an efficient coupled tensor factorization algorithm. We empirically show that our methods are able to provide good prediction accuracy on synthetic and real datasets while providing a provable privacy guarantee. Finally, we propose an approach to preserve the privacy of a neural network's training data, exploiting the connection between tensor factorization and neural networks. We introduce a technique that provides an elegant Bayesian interpretation to dropout, and show that the intrinsic noise added can be exploited to obtain a degree of differential privacy.
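As a loose illustration of the general idea that noise injected into factorization updates can be traded for privacy, consider the minimal sketch below. This is a generic noisy matrix-factorization toy, not the thesis's coupled tensor algorithm or its privacy calibration; the rank, step size, and noise scale are invented for the example:

```python
import random

random.seed(0)

X = [[5.0, 3.0], [4.0, 1.0]]      # toy observed 2x2 matrix
rank, lr, sigma = 1, 0.02, 0.1    # latent rank, step size, noise std (all arbitrary)

U = [[0.1 for _ in range(rank)] for _ in range(2)]   # left factors
V = [[0.1 for _ in range(rank)] for _ in range(2)]   # right factors

def frob_error(U, V):
    """Squared Frobenius error of the current reconstruction of X."""
    return sum(
        (X[i][j] - sum(U[i][k] * V[j][k] for k in range(rank))) ** 2
        for i in range(2) for j in range(2)
    )

before = frob_error(U, V)
for _ in range(500):
    for i in range(2):
        for j in range(2):
            err = X[i][j] - sum(U[i][k] * V[j][k] for k in range(rank))
            for k in range(rank):
                # Gaussian noise injected into each gradient step; a
                # differentially private variant would calibrate sigma
                # to a formal privacy budget rather than fix it ad hoc
                U[i][k] += lr * (err * V[j][k] + random.gauss(0, sigma))
                V[j][k] += lr * (err * U[i][k] + random.gauss(0, sigma))
after = frob_error(U, V)
```

Despite the perturbation, the fit still improves substantially over the iterations, which is the qualitative point: useful accuracy can survive the noise that the privacy guarantee requires.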