Ph.D. Theses
Browsing Ph.D. Theses by Issue Date
Now showing 1 - 20 of 121
Item A concurrent programming system with a flexible structure (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 1984., 1984.) Şıkoğlu, Tamer.; Balman, Tunç.
This dissertation describes the design and implementation of a concurrent programming system called CPS/1100, a modelling language used in systems programming. Processor dedication and program trace capabilities are the distinguishing characteristics of the language. In addition to these features, the implementation scheme used in developing the language allows it to be easily modified and extended to develop languages for specific applications. This flexibility is the main contribution of the present work. Structurally, CPS/1100 is an extension of a Pascal compiler with prescanner and postscanner routines. The prescanner routines provide syntax and semantic checks of the CPS/1100 statements and the control directives, while the postscanner routine modifies the assembler code produced by the Pascal compiler. Concurrency primitives of Exec8, the system program of Univac 1100 machines, are utilized for task creation, removal and control. This dissertation provides an overview of concurrent programming and multitasking on Univac 1100 systems. Statements, structure and implementation details of CPS/1100 are explained. Typical concurrent programming problems are solved using CPS/1100 to demonstrate its capabilities. Finally, this study is evaluated.

Item Qualitative system identification (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 1992., 1992.) Say, Ahmet Celal Cem.; Kuru, Selahattin.
The main contribution of this research in the qualitative reasoning area of Artificial Intelligence is the development of the qualitative system identification algorithm QSI. QSI's input is a description of the qualitative behaviors of the system to be identified. Its output is a constraint model (possibly containing "deep" parameters absent in the input) of that system, in the format of Kuipers' qualitative simulation algorithm QSIM. The QSI approach to qualitative modeling makes no assumptions and requires no knowledge about the "meanings" of the system parameters. QSI is discussed in detail. Other contributions are a new method of eliminating a class of spurious QSIM predictions, and an algorithm for postdiction. Unlike other approaches to spurious behavior reduction, the method presented here does not require restrictive assumptions about the input model. A particular kind of spurious behavior is shown to be caused by pure QSIM's insistence on assigning only point values to "corresponding value tuples" associated with model constraints. The solution put forward here preserves the overall complexity of the algorithm while producing fewer incorrect predictions, as shown by the presented case runs and proofs. Postdiction is the task of finding out the possible pasts of the system under consideration, given the laws of change and the current state. To obtain the algorithm, a different scheme of interpreting the tree built by simulation is imposed, along with a different handling of the "flow" of time. Issues of this reasoning task, which is promising for diagnosis applications, are discussed.
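
To make the constraint-model format concrete, here is a minimal Python sketch of checking a qualitative behavior against QSIM-style constraints; the toy bathtub model, the sign encoding and the two constraint types are illustrative assumptions, not the QSI algorithm itself.

    # Sketch of a QSIM-style constraint model (illustrative, not the QSI algorithm).
    # A qualitative state maps each parameter to (magnitude sign, direction);
    # magnitudes are encoded as -1/0/1, directions as 'inc', 'std', 'dec'.
    SIGN = {'inc': 1, 'std': 0, 'dec': -1}

    def deriv(x, dx):
        # DERIV(x, dx): the direction of x must agree with the sign of dx.
        return lambda s: SIGN[s[x][1]] == s[dx][0]

    def m_plus(x, y):
        # M+(x, y): a necessary condition is that x and y move together.
        return lambda s: s[x][1] == s[y][1]

    # Toy bathtub model: net inflow drives the level, outflow grows with level.
    model = [deriv('level', 'netflow'), m_plus('outflow', 'level')]

    def consistent(behavior, constraints):
        return all(c(state) for state in behavior for c in constraints)

    behavior = [
        {'level': (0, 'inc'), 'netflow': (1, 'inc'), 'outflow': (0, 'inc')},
        {'level': (1, 'inc'), 'netflow': (1, 'dec'), 'outflow': (1, 'inc')},
    ]
    print(consistent(behavior, model))  # True for this toy trace
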
Item Computer processing of Turkish: morphological and lexical investigation (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 1995., 1995.) Güngör, Tunga.; Kuru, Selahattin.
The morphological analysis of Turkish is the subject of this thesis. Turkish belongs to the group of agglutinative languages; because of this agglutinative nature, Turkish morphology is quite complex and includes many exceptional cases. Most recent research on Turkish morphology has limited itself to a partial treatment of the language, concentrating especially on the explanation and representation of the basic rules. The main objective of this thesis is to bring the full morphological structure of Turkish to light and to build its computer representation. Without such an analysis, syntactic or semantic parsing of the language is practically impossible. In this study, we divide the analysis of the morphology into two interrelated parts: morphophonemic analysis and morphotactic analysis. We investigate and define the morphological structure for both, and then combine them in the Augmented Transition Network (ATN) formalism. This forms the formal representation of the Turkish morphological structure, and the proposed structure provides a basis for language applications for Turkish. Among these applications, we design and implement a morphological parser and a spelling checker which incorporates a spelling corrector component. We also perform a statistical analysis of Turkish based on this morphological representation and the implemented programs. This analysis consists of two parts: lexical and morphological analysis, and corpus analysis. The first uses information about the structural parts of the language; the second deals with the daily usage of the language. For this purpose, we form a corpus and run the spelling checker program on it.
Keywords: Computational linguistics, Natural language processing, Morphological analysis, Turkish, Augmented transition networks, Spelling checking, Corpus
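
As a taste of the morphophonemic rules such an analyzer must encode, the sketch below applies two-fold Turkish vowel harmony when attaching the plural suffix -lAr; it is a simplified textbook illustration, not the ATN implementation of the thesis.

    # Simplified Turkish vowel harmony for the plural suffix -lAr (illustrative only).
    FRONT = set('eiöü')
    BACK = set('aıou')

    def plural(stem):
        # The archiphoneme A surfaces as 'e' after front vowels, 'a' after back vowels.
        last_vowel = next((ch for ch in reversed(stem) if ch in FRONT | BACK), None)
        return stem + ('ler' if last_vowel in FRONT else 'lar')

    for word in ('ev', 'kitap', 'göz', 'okul'):
        print(word, '->', plural(word))  # ev -> evler, kitap -> kitaplar, ...
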
Item Application of the 3G PCS technologies to the mobile subsystem of the next generation tactical communications systems (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 1999., 2000.) Çayırcı, E. (Erdal); Ersoy, Cem.
The advances in high-speed computation, multimedia and wireless communications promise new opportunities to develop more robust and agile battlefield communications systems. However, there are some major distinctions between military and commercial communications; the two most basic are the hostile battlefield environment and the rapid deployment requirement. In our study, we discuss how to employ emerging civilian commercial technologies, namely third generation (3G) Personal Communications Services (PCS) techniques, for military communications in spite of these distinctions. We propose an approach called Virtual Cell Layout (VCL) in which the communications area is tessellated with regularly shaped fixed virtual cells starting from a geographic reference location. The radio resources are managed in a multitier hybrid network by employing both cellular and ad hoc techniques using VCL. We also propose a simulation approach for the performance evaluation of tactical communications systems. In this approach, the commands entered during military computer aided exercises are replayed by running a constructive (combat) model which generates mobility, posture and status data for a number of units; these data are then enhanced and drive a simulation which produces the data related to the performance metrics. The evaluated performance of the system shows that the VCL based architecture satisfies the rapid deployment requirement and gives an acceptable grade of service.

Item Distributed simulation framework based on load balanced implementation of standard clock algorithm with web based extensions (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2003., 2003.) Darcan, Osman Nuri.; Kaylan, Ali Rıza.
The Standard Clock approach is used to simulate a number of parametric variants of a single system. In this thesis, a simulation engine based on a distributed implementation of the Standard Clock approach on networks of heterogeneous UNIX workstations is developed. The objective is to examine the scalability of the implementation and study the effect of load balancing. Two heuristic load balancing techniques are proposed: (1) static load balancing based on the estimated cost of each variant, and (2) dynamic load balancing that migrates variants between workstations based on their estimated performance during the simulation. Simple queueing models are used to study the performance. Numerical results obtained from real-time simulations on a network of up to seven workstations are used to investigate the speedup and the efficiency of both the implementation and the load balancing techniques. As more workstations are added to the simulation environment, a sublinear speedup is obtained. In addition, the load balanced distribution produces an improvement of up to 8 per cent compared to random distribution. Secondly, a web based user interface is integrated with this engine to provide an easy-to-use and practical experimentation platform. The complete system consists of a web based graphical user interface that communicates with a powerful server running the engine. The user interface allows creating simulation models of queueing networks, performing simulation experiments using the simulation engine, and performing simple output analysis in a platform independent manner. Finally, a prototype for a web based simulation optimization tool is developed by extending the system with experimental design methodology and the response surface method. This tool thus provides a way of planning which variants to simulate in order to quickly reach the neighborhood of the optimal solution. The applicability of this simulation optimization approach and the features of the interface are illustrated using a two-node Jackson network and a manufacturing system as examples.
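
The core Standard Clock idea, one shared event stream driving several parametric variants in lockstep with common random numbers, can be sketched briefly; the M/M/1 variants and rates below are illustrative assumptions, not the thesis engine.

    # Minimal Standard Clock sketch: one shared event stream drives several
    # M/M/1 variants that differ only in service rate (illustrative example).
    import random

    LAM, RATES, HORIZON = 1.0, [1.5, 2.0, 3.0], 10_000
    top = LAM + max(RATES)                        # master clock covers the fastest variant
    queues = [0] * len(RATES)
    random.seed(7)

    for _ in range(HORIZON):
        u = random.uniform(0.0, top)              # one draw shared by all variants
        for i, mu in enumerate(RATES):
            if u < LAM:                           # arrival event, common to all
                queues[i] += 1
            elif u < LAM + mu and queues[i] > 0:  # departure if feasible here,
                queues[i] -= 1                    # otherwise a "fictitious" event

    print(queues)  # end-of-run queue lengths, one per variant

The shared draw u is what gives common random numbers across variants, which is why distributing the variants over workstations parallelises naturally.
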
Item Tuning model complexity using cross-validation for supervised learning (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2005., 2005.) Yıldız, Olcay Taner.; Alpaydın, Ethem.
In this thesis, we review the use of cross-validation for model selection and propose the MultiTest method, which solves the problem of choosing the best of multiple candidate supervised models. The MultiTest algorithm orders supervised learning algorithms (for classification and regression) taking into account both the results of pairwise statistical tests on expected error and our prior preferences, such as the complexity of the algorithm. To validate the MultiTest method, we compare it with the ANOVA and Newman-Keuls algorithms, which check whether multiple methods have the same expected error. Though ANOVA and Newman-Keuls results can be extended to find a "best" algorithm, this does not always work; our proposed method, on the other hand, is always able to designate one algorithm as the "best". Using the MultiTest method, we try to solve the problem of optimizing model complexity: either we compare all possible models using MultiTest and select the best one, or, if the model space is very large, we perform an effective search of the model space via MultiTest. If all possible models can be searched, MultiTest-based model selection always selects the simplest model whose expected error is not significantly worse than that of any other model. We also propose a hybrid, omnivariate architecture for decision tree induction and rule induction; it contains different models at different places, matching the complexity of the model to the complexity of the data reaching that model. We compare our proposed MultiTest-based omnivariate architecture with the well-known model selection techniques on standard datasets.
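
A minimal sketch of the flavour of such selection, under assumptions: hypothetical cross-validation errors and a plain one-sided paired t-test stand in for the tests used in the thesis. Start from the simplest algorithm (the prior preference) and switch only when a candidate is significantly more accurate.

    # MultiTest-flavoured selection sketch (illustrative, not the thesis algorithm).
    from scipy.stats import ttest_rel

    # Cross-validation error per fold, listed from simplest to most complex
    # algorithm (hypothetical numbers; the ordering is the prior preference).
    errors = {
        'linear':  [0.21, 0.19, 0.22, 0.20, 0.21],
        'tree':    [0.22, 0.18, 0.21, 0.21, 0.19],
        'svm-rbf': [0.15, 0.14, 0.16, 0.15, 0.14],
    }

    def better(a, b, alpha=0.05):
        # One-sided paired t-test: is a's error significantly lower than b's?
        t, p = ttest_rel(errors[a], errors[b])
        return t < 0 and p / 2 < alpha

    best = 'linear'                      # start from the simplest model
    for cand in ['tree', 'svm-rbf']:
        if better(cand, best):           # switch only on significant improvement
            best = cand
    print('selected:', best)             # 'svm-rbf' here; 'tree' is not significant
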
Item Quality-of-service-aware multicast routing for multimedia applications in mobile ad hoc networks (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2006., 2006.) Bür, Kaan.; Ersoy, Cem.
The conceptual shift in the expectations of wireless users towards multimedia and group-oriented computing has a significant impact on today's networks in terms of the need for mobility, quality of service (QoS) and multicast routing. Mobile ad hoc networks can provide users with these features; however, it is imperative for them to combine QoS and multicast routing strategies in order to utilize the wireless medium efficiently. This work defines the ad hoc QoS multicast (AQM) routing protocol, which achieves multicast efficiency by tracking the availability of resources for each node within its neighbourhood. Computation of free bandwidth is based on reservations made for ongoing sessions and the requirements reported by the neighbours. The QoS status is announced at session initiation and updated periodically to the extent of QoS provision. Nodes are prevented from applying for membership if there is no QoS path for the session. When nodes wish to join a session with certain service requirements, a three-phase process ensures that the QoS information is updated and used to select the most appropriate routes. The allowed maximum hop count of the session is taken into account in order to satisfy the delay requirements of the multimedia applications. To cope with the continuous nature of streaming multimedia, AQM nodes check the availability of bandwidth within their neighbourhood not only for themselves but within a virtual tunnel of nodes. Objection queries are issued prior to reservation to avoid excessive resource usage due to allocations made by nodes which cannot detect each other directly. A priority queue determines the transmission order of data packets according to their traffic classes to support even those applications with more stringent QoS requirements. AQM evolves the initial multicast tree into a mesh during data flow to improve robustness. New performance metrics are introduced to evaluate the efficiency of AQM regarding the satisfaction level of session members. Simulation results show that, by applying novel QoS management techniques, AQM significantly improves multicast efficiency for members as well as for sessions.

Item Three dimensional face recognition (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2006., 2006.) Gökberk, Berk.; Akarun, Lale.
In this thesis, we attack the problem of identifying humans from their three dimensional facial characteristics. For this purpose, a complete 3D face recognition system is developed and divided into sub-processes: 1) registration, 2) representation of faces, 3) extraction of discriminative features, and 4) fusion of matchers. For each module, we evaluate the state-of-the-art methods and also propose novel ones. For the registration task, we propose to use a generic face model which speeds up the correspondence establishment process, and we compare the benefits of rigid and non-rigid registration schemes using this generic face model. In terms of face representation schemes, we implement a diverse range of approaches such as point clouds, curvature-based descriptors, and range images. In relation to these, various feature extraction methods are used to determine the discriminative facial features. We also propose to use local region-based representation schemes, which may be advantageous both for dimensionality reduction and for determining invariant regions under several facial variations. Finally, with the realization of diverse 3D face experts, we perform an in-depth analysis of decision-level fusion algorithms. In addition to the evaluation of baseline fusion methods, we propose two novel fusion schemes: the first employs a confidence-aided combination approach, and the second implements a two-level serial integration method. Recognition simulations performed on the 3DRMA and the FRGC databases show that: 1) generic face template-based rigid registration of faces is better than the non-rigid variant, 2) principal curvature directions and surface normals have better discriminative power, 3) representing faces using local patch descriptors can both reduce the feature dimensionality and improve the identification rate, and 4) confidence-assisted fusion rules and serial two-stage fusion schemes have the potential to improve accuracy when compared to other decision-level fusion rules.
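
The confidence-aided combination idea can be illustrated with a small sketch; the expert scores and the margin-based confidence weighting below are assumptions for illustration, not the exact scheme of the thesis.

    # Confidence-aided decision-level fusion sketch: each 3D face expert returns
    # per-identity similarity scores; experts with a larger margin between their
    # top two scores get more weight in the combined decision. Illustrative only.
    import numpy as np

    def fuse(expert_scores):
        combined = np.zeros_like(expert_scores[0], dtype=float)
        for s in expert_scores:
            top2 = np.sort(s)[-2:]
            confidence = top2[1] - top2[0]           # margin-based confidence
            combined += confidence * (s / s.sum())   # normalise, then weight
        return int(np.argmax(combined))

    point_cloud_expert = np.array([0.70, 0.65, 0.10])  # hypothetical scores
    curvature_expert   = np.array([0.20, 0.75, 0.30])
    print('identity:', fuse([point_cloud_expert, curvature_expert]))
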
Item Hardware / software partitioning for custom instruction processors (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2007., 2007.) Atasu, Kubilay.; Özturan, Can.; Dündar, Günhan.
In this thesis, we describe an integer linear programming (ILP) based system called CHIPS for identifying custom instructions given the available data bandwidth and transfer latencies between the base processor and the custom logic. Our approach, which involves a baseline machine supporting architecturally visible custom state registers, enables designers to optionally constrain the number of input and output operands for custom instructions. We describe a comprehensive design flow to identify the most promising area, performance, and code size trade-offs. We study the effect of the constraints on the number of input/output operands and on the number of register file ports. Additionally, we explore compiler transformations such as if-conversion and loop unrolling. Our experiments show that, in most cases, the highest performing solutions are identified when the input/output constraints are removed; however, input/output constraints help our algorithms identify frequently used code segments, reducing the overall area overhead. We provide detailed results for eleven benchmarks covering cryptography and multimedia. We obtain speed-ups between 1.7 and 6.6 times, code size reductions between six per cent and 72 per cent, and area costs that range between 12 adders and 256 adders for maximal speed-up. We demonstrate that our ILP based solution scales well: benchmarks with very large basic blocks consisting of up to 1000 instructions can be solved optimally, most of the time within a few seconds. We show that state of the art techniques fail to find the optimal solutions on the same problem instances within reasonable time limits, and we provide examples of solutions identified by our algorithms that are not covered by the existing methods.

Item Deployment quality measures in surveillance wireless sensor networks (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2007., 2007.) Onur, Ertan.; Ersoy, Cem.
Surveillance wireless sensor networks are deployed at border locations to detect unauthorized intrusions. For deterministic deployment of sensors, the quality of deployment can be determined sufficiently well by analysis in advance of deployment. However, when random deployment is required, determining the deployment quality becomes challenging. To assess the quality of sensor deployment, appropriate measures must be proposed, and determining the required number of sensors to be deployed initially is a critical decision. After deployment, temporal changes in the surveillance quality must be analyzed as the sensors die in time, and the network lifetime definition must consider the surveillance performance of the network. In this thesis, we propose deployment quality measures to analyze the surveillance performance of the network, and we discuss the trade-off between the number of sensors and the deployment quality. We formulate the weakest breach path problem and propose a method to determine the required number of sensors to be deployed. We propose the utilization of watershed segmentation on the iso-sensing graph, which reveals the equally sensed regions of the field of interest in a surveillance application. The watershed segmentation algorithm is applied to the iso-sensing graph to identify the possible breach paths. An algorithm is proposed to convert the watershed segmentation into an auxiliary graph which is then employed to determine the deployment quality. The surveillance quality is verified analytically. The temporal resilience of the surveillance quality is analyzed with a realistic discrete event simulator, and network lifetime definitions based on the deployment quality measures are proposed.
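
The weakest breach path notion can be illustrated as a minimax shortest-path search: an intruder crosses a gridded field, and the weakest path minimises the highest detection probability encountered along the way. The sensing model, decay constant and sensor positions below are illustrative assumptions, not the thesis formulation.

    # Weakest breach path sketch: cross the grid from the left edge to the right
    # edge while minimising the maximum "exposure" (detection probability) met.
    import heapq, math

    N = 20
    sensors = [(4, 6), (10, 14), (16, 5)]          # hypothetical sensor positions

    def exposure(r, c):
        d = min(math.dist((r, c), s) for s in sensors)
        return math.exp(-0.3 * d)                  # detection decays with distance

    # Minimax Dijkstra: the cost of a path is the largest exposure on it.
    pq = [(exposure(r, 0), r, 0) for r in range(N)]
    best = {(r, 0): exposure(r, 0) for r in range(N)}
    heapq.heapify(pq)
    while pq:
        cost, r, c = heapq.heappop(pq)
        if c == N - 1:
            print('weakest breach path peak exposure: %.3f' % cost)
            break
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < N and 0 <= nc < N:
                ncost = max(cost, exposure(nr, nc))
                if ncost < best.get((nr, nc), float('inf')):
                    best[(nr, nc)] = ncost
                    heapq.heappush(pq, (ncost, nr, nc))
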
Item A collaborative multi-robot localization technique for autonomous robots (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2007., 2007.) Bağcı, Hatice Köse.; Akın, H. Levent.
This work proposes a novel method for collaborative global localization of a team of soccer playing autonomous robots; it is also applicable to other indoor real-time robot applications in noisy, unpredictable environments with insufficient perception. A novel solution, Reverse Monte Carlo Localization (R-MCL), is designed to solve the single-robot self-localization problem using each robot's local perception and action in the surrounding environment. R-MCL is a hybrid method based on Markov Localization (ML) and Monte Carlo Localization (MCL), where the ML based part finds the region where the robot should be, and the MCL based part predicts the geometrical location with high precision by selecting samples in this region. In the multi-robot localization problem, robots use their own local position estimates and the information shared by other team mates to localize themselves. To integrate the local information and beliefs optimally, avoid conflicts and support collaboration among team members, a novel collaborative multi-robot localization method called Collaborative Reverse Monte Carlo Localization (CR-MCL), based on R-MCL, is presented. When robots detect each other, they share the grid cells representing this observation. The power of the method comes from its hybrid nature: it uses a grid based approach to handle detections, which cannot be accurate in real-time applications, and a sample based approach in self-localization to improve its success, while using fewer samples than similar methods. Both methods are tested on simulated robots and real robots, and the results show that they are fast, robust, accurate and cheap in terms of communication, memory and computational costs.
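
A one-dimensional sketch of the R-MCL idea, under an assumed toy sensor model: a coarse grid (the ML part) picks the most likely region, then samples (the MCL part) refine the estimate inside it.

    # R-MCL-flavoured sketch (illustrative, 1-D world of 10 unit cells).
    import math, random

    TRUE_POS, CELLS, WIDTH = 6.3, 10, 1.0
    random.seed(1)

    def likelihood(x):
        # Assumed noisy range-to-landmark sensor model.
        return math.exp(-((x - TRUE_POS) ** 2) / 0.5)

    # ML step: score each coarse cell by the likelihood at its centre.
    scores = [likelihood((i + 0.5) * WIDTH) for i in range(CELLS)]
    best_cell = max(range(CELLS), key=scores.__getitem__)

    # MCL step: draw samples only inside the winning cell, weight and average.
    samples = [random.uniform(best_cell * WIDTH, (best_cell + 1) * WIDTH)
               for _ in range(200)]
    weights = [likelihood(s) for s in samples]
    estimate = sum(w * s for w, s in zip(weights, samples)) / sum(weights)
    print('estimated position: %.2f (true %.2f)' % (estimate, TRUE_POS))
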
Item Biologically motivated 3D face recognition (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2007., 2007.) Salah, Albert Ali.; Akarun, Lale.
Face recognition has been an active area of study for both the computer vision and image processing communities, not only for biometrics but also for human-computer interaction applications. The purpose of the present work is to evaluate the existing 3D face recognition techniques and to seek biologically motivated methods to improve them; we especially look to findings in psychophysics and cognitive science for insights. We propose a biologically motivated computational model and focus on its earlier stages, whose performance is critical for the later stages. Our emphasis is on automatic localization of facial features. We first propose a strong unsupervised learning algorithm for flexible and automatic training of Gaussian mixture models and use it in a novel feature-based algorithm for facial fiducial point localization. We also propose a novel structural correction algorithm to evaluate the quality of landmarking and to localize fiducial points under adverse conditions. We test the effects of automatic landmarking under rigid and non-rigid registration methods. For the rigid registration approach, we implement the iterative closest point (ICP) method. The most important drawback of ICP is the computational cost of registering a test scan to each scan in the gallery; by using an average face model in rigid registration, we show that this computational bottleneck can be eliminated. Following psychophysical arguments on the "other race effect", we reason that organizing faces into different gender and morphological groups will help us design more discriminative classifiers. We test this claim by employing different average face models for dense registration, and propose a shape-based clustering approach that assigns faces into groups with nondescript gender and race. Finally, we propose a regular re-sampling step that increases the speed and the accuracy significantly. These components make up a full 3D face recognition system.
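
Rigid registration against an average face model is classically done with ICP; below is a compact, generic ICP sketch with nearest-neighbour correspondences and an SVD-based rigid transform, using random stand-in point clouds rather than real face scans.

    # Minimal ICP sketch (illustrative): rigidly align a probe scan to an average
    # face model, so each probe is registered once instead of to every gallery scan.
    import numpy as np

    def icp(source, target, iters=20):
        src = source.copy()
        for _ in range(iters):
            # 1. Correspondences: nearest target point for each source point.
            d2 = ((src[:, None, :] - target[None, :, :]) ** 2).sum(-1)
            matched = target[d2.argmin(axis=1)]
            # 2. Best rigid transform via SVD of the cross-covariance (Kabsch).
            mu_s, mu_t = src.mean(0), matched.mean(0)
            U, _, Vt = np.linalg.svd((src - mu_s).T @ (matched - mu_t))
            R = Vt.T @ U.T
            if np.linalg.det(R) < 0:    # guard against reflections
                Vt[-1] *= -1
                R = Vt.T @ U.T
            src = (src - mu_s) @ R.T + mu_t
        return src

    rng = np.random.default_rng(0)
    average_face = rng.normal(size=(100, 3))   # stand-in for the average model
    theta = 0.3
    Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
    probe = average_face @ Rz.T + np.array([0.5, -0.2, 0.1])
    aligned = icp(probe, average_face)
    print('max residual (should be small):', np.abs(aligned - average_face).max())
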
Item Multicasting for all-optical multifiber networks (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2007., 2007.) Köksal, Fatih.; Ersoy, Cem.
We propose to use a layered graph approach, previously proposed for unicasting, to obtain a more general, realistic and flexible model of an all-optical multifiber network for multicasting. This representation enables us to state the problem of all-optical multicasting with sparse light splitting and wavelength conversion restrictions as an original Mixed Integer Linear Programming (MILP) formulation. The MILP formulation is solved by CPLEX, which finds the optimal solution within a given precision and also gives a lower bound by relaxing the integrality constraints. However, since the problem is NP-hard, it is possible to solve MILP problems to optimality only for small networks and small numbers of sessions. Therefore, we also propose three different heuristics (LAMA, SLAM and C-FWA) for larger problems and dynamic multicasting requests. Extensive computational experiments demonstrate that LAMA and SLAM perform close to the optimal and better than their competitor (M-ONLY) for all metrics. LAMA and SLAM work better than their alternatives because we jointly optimize the routing and fiber-wavelength assignment phases, whereas the other candidates attack the problem by decomposing the two phases; experiments show that important metrics are adversely affected by the separation of routing and fiber-wavelength assignment. SLAM, the scalable version of LAMA, performs close to or better than LAMA. Finally, we also propose a new fiber-wavelength assignment strategy (Ex-Fit in C-FWA) which uses wavelength and fiber conversion resources more effectively than First Fit.

Item TDMA based wireless sensor network for military monitoring (MILMON) (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2008., 2008.) Bekmezci, İlker.; Alagöz, Fatih.
Wireless sensor networks (WSNs) are a new network family that enables the creation of smart environments. Although WSNs have many application areas, their military applications are of particular interest. In this thesis, a new TDMA based sensor network for military monitoring (MILMON) is proposed. MILMON is developed to operate in large areas for acceptable lifetime periods; its design considerations are energy consumption, delay, and fault tolerance. The main problems of TDMA based systems are time synchronization and time slot distribution. To realize MILMON, a new time synchronization mechanism (SyncHRT), a new distributed time scheduling mechanism (ft_DTSM) and a data indicator slot mechanism (DISM) are proposed. SyncHRT, time synchronization with a high range transmitter, is designed to minimize energy consumption and maximize precision. It assumes the existence of a high range transmitter whose broadcast signal can be received by many of the nodes, so that sensor nodes can be synchronized to a central point. The simulation model shows that SyncHRT can reduce energy consumption and increase precision. Another mechanism proposed for MILMON is ft_DTSM, a delay sensitive, energy efficient and fault tolerant distributed slot assignment algorithm. It uses much less energy than existing slot assignment mechanisms, and it reduces delay under the convergecast traffic assumption, a very common traffic type for WSNs. Its fault tolerant structure helps the network survive single sensor node failures. The last mechanism proposed for MILMON is the data indicator slot mechanism (DISM), which reduces energy consumption especially under low traffic requirements, as in most military monitoring systems. Analysis and simulation show that although there are WSN systems that perform better than our system on energy consumption alone or on delay alone, MILMON realizes an optimization over energy, delay and fault tolerance together.

Item A flexible approach for context-aware service selection in agent-mediated e-commerce (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2008., 2008.) Şensoy, Murat.; Yolum, Pınar.
In this dissertation, we examine the problem of service selection in an e-commerce setting where consumer agents cooperate to identify providers that would satisfy their service needs the most. There are three major challenges related to service selection: (i) representing consumers' past dealings with providers, (ii) handling deceptive information and (iii) managing the evolution of consumers' service needs and semantics. Previous approaches represent consumers' past dealings with providers only as ratings. Even though context is crucial for interpreting ratings correctly, ratings do not contain any contextual information explicitly; a rating merely represents the subjective opinion of a consumer about a provider. Because the satisfaction criteria and the expectations of the rater are unknown, it is almost impossible to make sense of a rating. In this dissertation, consumers' experiences with providers are represented with an ontology that can semantically capture the requested service and the received service in detail. When a consumer decides to share its experiences with a second consumer, the receiving consumer evaluates the experience using its own context and satisfaction criteria. By sharing experiences rather than ratings, consumers can model providers more accurately and thus select providers that are better suited to their needs. The service selection approach proposed in this dissertation is flexible enough to enable consumers to evolve their service semantics over time; context-aware and consumer-oriented, enabling consumers to use their own satisfaction criteria and context during service selection; and robust to deception, leading to satisfactory service selections even in highly deceptive environments.
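
The experience-versus-rating distinction can be shown in a few lines: the same shared experience is judged differently by consumers with different satisfaction criteria. The attributes and predicates below are illustrative assumptions, not the ontology of the thesis.

    # Experience sharing sketch: the receiving consumer re-evaluates a shared
    # experience against its OWN criteria instead of trusting a star rating.
    experience = {                       # what was requested vs. what was received
        'provider': 'StoreA',
        'requested': {'delivery_days': 2, 'brand': 'X'},
        'received':  {'delivery_days': 5, 'brand': 'X'},
    }

    def satisfied(exp, criteria):
        # Each consumer supplies its own predicate per service attribute.
        return all(check(exp['received'][attr])
                   for attr, check in criteria.items())

    patient_buyer = {'delivery_days': lambda d: d <= 7}
    urgent_buyer  = {'delivery_days': lambda d: d <= 2, 'brand': 'X'.__eq__}

    print(satisfied(experience, patient_buyer))  # True: 5 days is fine for them
    print(satisfied(experience, urgent_buyer))   # False: same experience, own context
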
Item Medium access control layer performance issues in wireless sensor networks (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2008., 2008.) Demirkol, İlker Seyfettin.; Ersoy, Cem.
Wireless sensor networks (WSNs) present a promising technology for many applications, providing intelligent and remote observation of a target environment. Among the various potential applications are health monitoring, disaster monitoring, habitat monitoring, precision agriculture and surveillance systems. With the ongoing research both on new sensor types and on hardware for improved computation, communication and power capacities, novel application areas are also expected. Because the power sources of the sensor nodes are limited and generally irreplaceable, WSN research is focused on energy-efficient network operation. This energy concern requires new studies at each networking layer, including the medium access control (MAC) layer. In this thesis, we investigate a number of MAC layer performance issues for WSNs, first presenting a comparative survey of the different MAC protocol schemes proposed in the literature. For a correct performance evaluation of the protocols, one needs to utilize a realistic packet traffic model that reflects the specific features of the WSN application represented. We derive an analytical packet traffic model for surveillance WSNs in which sensor nodes inform the sink of detected intrusion events. The sensor detection model used is probabilistic and parametric, which enables the adaptation of the packet traffic model to the sensor types deployed. One important contribution of this thesis is the optimization of MAC layer contention to minimize the energy consumption or the delay incurred in contention slotted medium access protocols. This is achieved by deriving two separate formulas, for the energy consumption and for the contention delay, and then extracting the contention window size that optimizes the corresponding performance metric. For its practical implementation in the distributed environment of WSNs, a method is proposed which achieves near-optimal performance values. To investigate the effect of the proposed contention optimization on the overall network performance, video sensor networks (VSNs) are studied. VSNs are a special type of WSN in which the sensor nodes are equipped with cameras and send images or video of a target area based on the specifications of the application. First, the network performance of VSNs is investigated via simulations for the currently available hardware technology. Then, by applying the contention optimization proposed in this thesis, we show how the capacity of VSNs can be improved with intelligent contention window size setting.
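
The window-size optimization admits a toy illustration: with N contenders each picking one of W slots, a larger window reduces collisions but adds idle delay, so an expected-cost curve has an interior minimum. The cost model below is an assumed stand-in, not the thesis derivation.

    # Toy contention-window optimization: N nodes each pick one of W slots; the
    # contention is won if the earliest chosen slot has exactly one occupant.
    N = 10
    E_COLLISION, E_SLOT = 5.0, 0.1       # hypothetical per-event cost weights

    def p_success(W):
        # P(unique winner) = sum over winning slot s of
        # P(exactly one node in slot s, all others in later slots).
        total = 0.0
        for s in range(W):
            later = (W - s - 1) / W
            total += N * (1 / W) * later ** (N - 1)
        return total

    def expected_cost(W):
        # Collisions cost energy; a wider window costs idle listening delay.
        return (1 - p_success(W)) * E_COLLISION + W * E_SLOT

    best = min(range(2, 200), key=expected_cost)
    print('best window size:', best, 'cost: %.2f' % expected_cost(best))
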
Item Improving the performance of software defect predictors with internal and external information sources (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2008., 2008.) Turhan, Burak.; Bener, Ayşe B.
In this dissertation, we analyze the software defect prediction problem from a data mining perspective, where software characteristics are represented with static code features and defect predictors are learned from historical defect logs. We observe that straightforward applications of data mining methods for constructing defect predictors have reached a performance limit due to the limited information content in static code features. Therefore, we aim at increasing the information content in the data without introducing new features, since collecting new features may be expensive or impossible in some contexts. We feed data mining methods with data that is richer in information content. For this purpose, we propose the following methods: 1) relaxing the assumptions of data miners, 2) using project data from multiple companies, and 3) modeling the interactions of software modules. For the first method, we use the naive Bayes data miner and remove its i) independence and ii) equal importance of features assumptions. Then we compare the performance of defect predictors learned from local and remote data. Finally, we introduce the call graph technique to model the interactions of modules. Our results on public industrial data show that: 1) relaxing the assumptions of naive Bayes may increase defect prediction performance significantly, 2) predictors learned from remote data have a great capability of detecting defects at the cost of high false alarms, though this cost can be removed with the proposed filtering method, and 3) the proposed way of modeling interactions may decrease the false alarm rates significantly. Our techniques provide guidelines for 1) employing defect prediction using remote information sources when local data are not available, and 2) increasing prediction performance using local information sources.
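
Relaxing the equal-importance assumption of naive Bayes can be sketched by weighting per-feature log-likelihoods; the class statistics, weights and feature names below are hypothetical, not the thesis data or its exact models.

    # Naive Bayes with the "equal importance" assumption relaxed: per-feature
    # log-likelihoods are weighted before summing. Illustrative numbers only.
    import math

    def gauss_logpdf(x, mu, var):
        return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)

    # Hypothetical per-class (mean, variance) for two static code features:
    # loc (lines of code) and v(g) (cyclomatic complexity).
    stats = {
        'defective':     {'loc': (120.0, 900.0), 'v(g)': (14.0, 25.0)},
        'non-defective': {'loc':  (60.0, 400.0), 'v(g)':  (5.0,  9.0)},
    }
    prior = {'defective': 0.2, 'non-defective': 0.8}
    weight = {'loc': 0.5, 'v(g)': 1.5}          # unequal feature importance

    def predict(module):
        def score(cls):
            s = math.log(prior[cls])
            for f, x in module.items():
                mu, var = stats[cls][f]
                s += weight[f] * gauss_logpdf(x, mu, var)
            return s
        return max(prior, key=score)

    print(predict({'loc': 100.0, 'v(g)': 12.0}))  # 'defective' with these numbers
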
Item Vision based sign language recognition: modeling and recognizing isolated signs with manual and non-manual components (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2008., 2008.) Aran, Oya.; Akarun, Lale.
This thesis addresses the problem of vision based sign language recognition and focuses on three main tasks in order to design improved techniques that increase the performance of sign language recognition systems. We first attack the markerless tracking problem during natural and unrestricted signing in less restricted environments. We propose a joint particle filter approach for tracking multiple identical objects, in our case the two hands and the face, which is robust to situations including fast movement, interactions and occlusions. Our experiments show that the proposed approach has robust tracking performance in these challenging situations and is suitable for tracking long durations of signing, with its ability to recover quickly. Second, we attack the problem of recognizing signs that include both manual (hand gestures) and non-manual (head/body gestures) components. We investigate multi-modal fusion techniques to model the different temporal characteristics and propose a two-step sequential belief based fusion strategy. The evaluation of the proposed approach, in comparison to other state of the art fusion approaches, shows that our method models the two modalities better and achieves higher classification rates. Finally, we propose a strategy to combine generative and discriminative models to increase sign classification accuracy. We apply the Fisher kernel method and propose a multi-class classification strategy for gesture and sign sequences. The results of the experiments show that the classification power of discriminative models and the modeling power of generative models are effectively combined with a suitable multi-class strategy. We also present two applications, a sign language tutor and an automatic sign dictionary, developed based on the ideas and methods presented in this thesis.

Item Routing and network mobility management in next generation satellite networks (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2009., 2009.) Korçak, Ömer.; Alagöz, Fatih.
Satellite networks are an attractive option for providing broadband telecommunication services to globally scattered users, due to their extensive geographic coverage, high bandwidth availability and inherent broadcast capabilities. Satellites rotating in geostationary orbit (GEO) are very well suited to broadcast services, but they suffer from high free space attenuation and long delays. On the contrary, non-geostationary (NGEO) systems consisting of Medium Earth Orbit (MEO) and Low Earth Orbit (LEO) satellites offer smaller latency, lower free space loss, and better re-use of available ground-space communication frequencies, which makes them more suitable for most applications, especially those running in real time. However, these advantages come at a price: the footprints of satellites at lower altitudes are smaller, so global coverage requires a higher number of satellites connected to each other with inter-satellite links (ISLs). Moreover, lower orbit satellites move at higher speeds relative to the Earth's surface, resulting in a highly dynamic network topology. The dynamics of the satellite constellation constitute a major challenge in providing efficient routing and quality of service (QoS) for rapidly growing real-time multimedia services. On the other hand, regular NGEO satellite networks have some facilitating features such as periodicity, predictability and a highly symmetric and regular topology. For efficient networking in NGEO satellite networks, all these features should be considered. In this thesis, we clarify the features of satellite systems that differentiate them from their terrestrial counterparts and propose novel routing and network mobility management techniques for NGEO satellite networks. First, we make use of the geometrical properties of the network topology and propose a priority-based adaptive routing (PAR) algorithm. Next, we focus on handling the mobility of the network by utilizing satellites with Earth-fixed footprints, and extend a well-known mobility handling technique called Virtual Node (VN). We propose a Multi-state Virtual Network (MSVN) topology that alleviates the deficiencies of the VN concept, and we clarify the potential advantages of MSVN by developing efficient handover mechanisms and beam management techniques for MSVN-based satellite systems. Finally, we investigate the efficient integration of NGEO satellites with High Altitude Platforms (HAPs) via high-capacity free-space optical links for carrying dense and real-time multimedia traffic. Considering the mobility and resource limitations of satellites, we propose an efficient solution for the optimal link establishment problem between HAPs and satellites.
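
The Virtual Node idea can be sketched in a few lines: routing addresses Earth-fixed virtual nodes, and each is re-bound to whichever satellite currently covers it, so the routed topology looks static even as the satellites move. The constellation geometry below is a hypothetical Iridium-like grid, not the thesis design.

    # Virtual Node sketch (illustrative): fixed virtual-node addresses, moving
    # satellites. Satellites advance one slot within their plane per time step.
    PLANES, PER_PLANE = 6, 11

    def serving_satellite(vn_plane, vn_slot, t):
        # Re-bind the Earth-fixed virtual node to the satellite covering it now.
        return (vn_plane, (vn_slot + t) % PER_PLANE)

    def min_hops(a, b):
        # Hop distance on the ISL mesh (a ring in the along-track direction).
        dp = abs(a[0] - b[0])
        ds = abs(a[1] - b[1])
        return dp + min(ds, PER_PLANE - ds)

    src_vn, dst_vn = (0, 2), (3, 9)      # fixed virtual-node addresses
    for t in (0, 5):
        s, d = serving_satellite(*src_vn, t), serving_satellite(*dst_vn, t)
        print('t=%d route %s -> %s, hops=%d' % (t, s, d, min_hops(s, d)))
    # The hop count is invariant over t, which is exactly what makes the
    # virtual topology appear static to the routing layer.
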
Item Bayesian source modelling for single-channel audio separation (Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2009., 2009.) Dikmen, Onur.; Akarun, Lale.
In many audio processing tasks, such as source separation, denoising or compression, it is crucial to construct realistic and flexible models to capture the physical properties of audio signals. This can be accomplished in the Bayesian framework through the use of appropriate prior distributions. In this thesis, we describe two prior models, Gamma Markov chains (GMCs) and Gamma Markov random fields (GMRFs), to model the sparsity and the local dependency of the energies of time-frequency expansion coefficients. We build two audio models in which the variances of the source coefficients are modelled with GMCs and GMRFs, and the source coefficients are Gaussian conditioned on the variances. The application area of these models is not limited to variance modelling of audio sources; they can be used in other problems where there is dependency between variables, such as Poisson observation models. In single-channel source separation using non-negative matrix factorisation (NMF), we make use of GMCs to model the dependencies in frequency templates and excitation vectors. A GMC model defines a prior distribution for the variance variables such that they are correlated along the time or frequency axis, while a GMRF model describes a non-normalised joint distribution in which each variance variable is dependent on all the adjoining variance variables. In our audio models, the actual source coefficients are independent conditional on the variances and distributed as zero-mean Gaussians. Our construction ensures a positive coupling between the variance variables, so that signal energy changes smoothly over both axes to capture the temporal and/or spectral continuity. The coupling strength is controlled by a set of hyperparameters. Inference on the overall model, i.e., a GMC or GMRF coupled with a Gaussian or Poisson observation model, is convenient because of the conditional conjugacy of all of the variables in the model, but automatic optimisation of the hyperparameters is crucial to obtain better fits. In GMCs, hyperparameter optimisation can be carried out using the Expectation-Maximisation (EM) algorithm, with the E-step approximated by the posterior distribution estimated by the inference algorithm. In this optimisation, it is important for the inference algorithm to estimate the covariances between the inferred variables, because the hyperparameters depend on them. The marginal likelihood of the GMRF model is not available because of the intractable normalising constant; thus, the hyperparameters of a GMRF cannot be optimised using maximum likelihood estimation. There are methods to estimate the optimal hyperparameters in such cases, such as pseudolikelihood, contrastive divergence and score matching, but only contrastive divergence is readily applicable to models with latent variables. We optimised the hyperparameters of our GMRF-based audio model using contrastive divergence. We tested our GMC and GMRF based audio models on denoising and single-channel source separation problems where all the hyperparameters are jointly estimated given only the audio data. Both models provided promising results, but the signals reconstructed by the GMRF model were slightly better and more natural sounding. Our third model makes use of Gamma and GMC prior distributions in an NMF setting for single-channel source separation. The hyperparameters are again optimised during the inference phase, and the model needs almost no other design decisions. This model performs substantially better than the previous two models and is less demanding in terms of computational power. However, it is designed only for source separation; it is not a general audio model like the previous two.
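
The NMF skeleton that the third model builds on can be sketched with standard KL-divergence multiplicative updates; the smoothness (GMC) priors and the Bayesian inference are deliberately left out here, and all sizes and data are stand-ins.

    # NMF separation sketch (illustrative): factorise a magnitude spectrogram
    # V ~ W @ H, then rebuild each source from its own subset of templates.
    import numpy as np

    rng = np.random.default_rng(0)
    F, T, K = 64, 100, 4                 # freq bins, frames, templates
    V = rng.random((F, T)) + 1e-6        # stand-in for a magnitude spectrogram

    W = rng.random((F, K)) + 1e-6        # frequency templates
    H = rng.random((K, T)) + 1e-6        # excitation vectors

    for _ in range(200):                 # KL-divergence multiplicative updates
        WH = W @ H
        H *= (W.T @ (V / WH)) / W.T.sum(axis=1, keepdims=True)
        WH = W @ H
        W *= ((V / WH) @ H.T) / H.sum(axis=1)

    # Wiener-style reconstruction: templates 0-1 for source A, 2-3 for source B.
    WH = W @ H
    source_a = V * (W[:, :2] @ H[:2]) / WH
    source_b = V * (W[:, 2:] @ H[2:]) / WH
    print('source A energy share: %.2f' % (source_a.sum() / V.sum()))
    print('relative residual: %.3f' % (np.abs(V - WH).sum() / V.sum()))
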