M.S. Theses
Recent Submissions
Item Automated assignment and classification of software issues (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Tabak, Büşra.; Aydemir, Fatma Başak.
Software issues contain units of work to fix, improve or create new threads during the development and facilitate communication among the team members. Assigning an issue to the most relevant team member and determining a category of an issue are tedious and challenging tasks. Wrong classifications cause delays and rework in the project and trouble among the team members. This thesis proposes a set of carefully curated linguistic features for shallow machine learning methods and compares the performance of shallow and ensemble methods with deep language models. Unlike state-of-the-art approaches, we assign issues to four roles (designer, developer, tester, and leader) rather than to specific individuals or teams to contribute to the generality of our solution. We also consider the level of experience of the developers to reflect the industrial practices in our solution formulation. We employ a classification approach to categorize issues into distinct classes, namely bug, new feature, improvement, and other. Additionally, we endeavor to further classify bugs based on the specific type of modification required. We collect and annotate five industrial data sets from one of the top three global television producers to evaluate our proposal and compare it with deep language models. Our data sets contain 5324 issues in total. We show that an ensemble classifier of shallow techniques achieves an accuracy of 0.92 for issue assignment and 0.90 for issue classification, which is statistically comparable to the state-of-the-art deep language models.
The contributions include the public sharing of five annotated industrial issue data sets, the development of a clear and comprehensive feature set, the introduction of a novel label set and the validation of the efficacy of an ensemble classifier of shallow machine learning techniques.

Item A multi-class approach to next session and in-session purchase prediction with real-time e-commerce data using machine learning techniques (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Sürhan, Gizem.; Badur, Bertan Yılmaz.
Advances in machine learning yield implications for the rapidly growing e-commerce industry. Retailers are looking for ways to better understand and predict complex customer behavior. Two crucial problems that exist in the domain are predicting customers’ platform engagement and purchase intent. Different techniques are employed in the literature addressing the problems separately, but the two tasks are highly dependent on each other since the ultimate goal for session engagement is purchase. Understanding if a next session will be made with a high purchase motive is critical to diversify the business actions taken. The main aim of the thesis is to develop a multi-class model that successfully distinguishes the next sessions with and without purchase intention. 38 million e-commerce sessions are collected for the specific task. Following the application of state-of-the-art LightGBM and LSTM algorithms, their results are compared, where LightGBM outperformed the latter. Additionally, a simple ensembling technique is used to increase the performance, leading to a 68% F1 score for the predictions of no session, 71% for the predictions of sessions without purchase and 59% for the predictions of sessions with purchase.
Furthermore, an undersampling technique is employed to handle the imbalance differently than the technique used by LightGBM and LSTM, increasing F1 scores to 75%, 72% and 74% respectively.

Item Cluster-based scoring for malicious model detection in federated learning (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Çağlayan, Cem.; Yurdakul, Arda.
Federated learning is a distributed machine learning technique aggregating every client model on a server to obtain a global model. However, some clients may harm the system by poisoning their model or data to make the global model irrelevant to its objective. This thesis introduces an approach for the server to detect adversarial models by coordinate-based statistical comparison and eliminate them from the system prior to aggregation. A new attack type, layer poisoning, where the malicious nodes prefer poisoning selected small-size layers of the model to deceive the detection system, is also introduced. Adaptive thresholding is adopted for preserving the robustness of the detection mechanism for various networks against different attack types. A simulation framework is developed to benchmark and realize tests as a distributed system. Experiments that use random sampling of independent and identically distributed (iid) datasets with different batch sizes have been carried out to show that the proposed method can identify the malicious nodes successfully even if some of the clients learn slower than others or send quantized model weights due to energy limitations. The proposed approach is extensively tested with different malicious-benign client ratios, model types, and datasets to present its versatility. The results show that the proposed system successfully eliminates the malicious models when their generating clients constitute at most 45% of the network.
Comparison with the approaches from the literature shows that the proposed method performs the same as or better than the state-of-the-art solutions.

Item Convolutional ensemble learning for edge intelligence (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Sıkdokur, İlkay.; Yurdakul, Arda.; Baytaş, İnci Meliha.
Deep Edge Intelligence targets the deployment of deep learning algorithms in the edge network. While training deep networks requires computational resources, edge devices frequently lack high computational power. Decentralized learning methods such as federated learning provide a solution for gathering limited information from edge devices and collectively improving prediction performance. However, a drawback of such methods is that they often require multiple rounds of network communication, which increases communication time and the risk of communication errors. Another drawback is that the same model architecture is often used on all edge devices, which makes it mandatory to work with devices above a certain level of computational capacity. This thesis proposes a hybrid learning approach that employs ensemble learning with a convolutional scheme for different edge model architectures, except for a selected fully connected layer of the same dimensionality. Initially, shallow neural networks are trained on edge devices until a certain level of performance is achieved. Next, the feature representations obtained by the shallow models are transferred to an ensemble model. Subsequently, the proposed convolutional ensemble model is trained to boost the prediction performance. This method facilitates the completion of the system training with a one-way data transfer between edge devices and the server. Variational auto-encoders are also utilized to generate feature vectors in case transferring the required representations from the edge devices fails.
Extensive experiments demonstrate that the suggested method outperforms state-of-the-art techniques in terms of accuracy while requiring fewer communications and a lower amount of data in various training scenarios.

Item Structural behavior of non-pneumatic tires using finite element analysis (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Azer, Mesutcan.; Özüpek, Şebnem.
Modern tire technology has revolutionized vehicle performance in terms of safety and comfort by combining improvements in production techniques, materials, and design. However, conventional tires have major drawbacks like punctures, pressure loss, and temperature fluctuations, leading to a reduction in efficiency and an increase in maintenance requirements. In order to overcome these limitations, the development of non-pneumatic tire (NPT) technology has gained importance. NPTs are characterized by their airless design using durable materials such as solid rubber or foam, with major advantages like puncture resistance, minimal maintenance, and excellent load-bearing capability. This study focuses on the structural behavior of an NPT based on the Michelin Unique Puncture-Proof Tire System (UPTIS) using finite element analysis. For static vertical loading, the effects of collapsible spoke thickness, spoke angle, and reinforcement thickness on the vertical displacement, vertical stiffness, contact pressure, and rolling resistance are evaluated. Additionally, the steady-state rolling analysis of the NPT is performed in order to investigate the dynamic characteristics, namely the contact pressure and shear stress distribution on the contact area in braking, free-rolling, and traction states.
As UPTIS and NPTs have been gaining more popularity in the automotive industry, the findings of this study contribute to the understanding of the relationship between design parameters and NPT performance.

Item Modeling volatility dynamics in financial time series : an analysis of econometric models and machine learning methods (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Erce, Hakan.; Işlak, Ümit.; Kaygun, Atabey.
In this thesis we present an analysis of different volatility dynamics in financial and cryptocurrency markets. We focus on the Borsa Istanbul stock exchange index BIST 100 (XU100), NASDAQ index, and Bitcoin/USD exchange rate (BTCUSD). Our aim is to understand the risk profiles of the data, forecast future risk using historical data, and statistically analyze structural characteristics of the volatility in the markets. In our analysis we used logarithmic returns of each financial asset we considered. Our first step of the investigation was to determine the best fitting ARIMA models, and analyze the parameters of these models using tools such as the ADF test, ACF and PACF plots, QQ-plots, and the Kolmogorov-Smirnov test. Our working hypothesis is that if a model successfully explains the behaviour of an asset then the residual signal must be close to white noise. However, we found significant autocorrelations in the residuals, which indicates that the ARIMA models we used did not fully capture all underlying patterns. In the next step, we used GARCH models to analyze the dynamics of the volatility inherent in each asset. While the risk profiles of the datasets varied, the structure coefficients of the models were consistent across all datasets. We finish the thesis by outlining how one might expand our study using endogenous and exogenous data, and different machine learning methods.

Item A framework to improve user story sets through collaboration (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Köse, Salih Göktuğ.; Aydemir, Fatma Başak.
Agile methodologies have become increasingly popular in recent years. Due to their inherent nature, agile methodologies involve stakeholders with a wide range of expertise and require interaction between them, relying on collaboration and customer involvement. Hence, agile methodologies encourage collaboration between all team members so that more efficient and effective processes are maintained. Generating requirements can be challenging, as it requires the participation of multiple stakeholders who describe various aspects of the project and possess a shared understanding of essential concepts. One simple method for capturing requirements using natural language is through user stories, which document the agreed-upon properties of a project. Stakeholders strive for completeness while generating user stories, but the final user story set may still be flawed. To address this issue, we propose SCOUT: Supporting Completeness of User Story Sets, which employs a natural language processing pipeline to extract key concepts from user stories and construct a knowledge graph by connecting related terms. The knowledge graph and different heuristics are then utilized to enhance the quality and completeness of the user story sets by generating suggestions for the stakeholders. We perform a user study to evaluate SCOUT and demonstrate its performance in constructing user stories. The quantitative and qualitative results indicate that SCOUT significantly enhances the quality and completeness of the user story sets. Our contribution is threefold. First, we develop heuristics to suggest new concepts to include in user stories by considering both the individuals’ and other team members’ contributions. Second, we implement an open-source collaborative tool to support writing user stories and ensuring their quality.
Third, we share the experimental setup and materials to evaluate SCOUT.

Item Feature analysis for recommender systems using transformer-based architectures (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Boran, Emre.; Güngör, Tunga.
Recommender systems are technology-based solutions that assist users by suggesting relevant items among millions of items. An item could be anything, such as a movie, a meal, a vacation spot, shoes, or a piece of music. Unlike traditional recommender systems, sequential and session-based recommender systems make recommendations by paying attention to the order of items that users interact with. The advantage of such systems is that they take into account varying tastes. Additionally, due to some legal requirements, the users’ data cannot be collected from some platforms, and the recommender system has to make suggestions from the session’s information alone, without having any previous knowledge. It may have to recommend products according to only a few interactions in that session. These reasons underline the importance of sequential and session-based recommender systems. In this thesis, we have experimented with sequential and session-based recommender systems using the Transformers4Rec framework, which allows us to use transformer architectures in recommender systems. We observed that transformer architectures work better in short interaction sequences than long ones. We showed that additional features enhance the model’s performance, particularly time-based features. Additionally, we observed that the importance of features changes according to the size, shape, and type of data.

Item Evaluation of different downscaling approaches for very-high-resolution climate data (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Demiralay, Zekican.; Kurnaz, M. Levent.
Climate change leads to widespread changes in atmospheric and oceanic conditions, increasing the frequency of climate anomalies and negatively impacting ecosystems and human communities. It is crucial to understand climate change and make accurate predictions about it. Climate change studies focus on tools like General Circulation Models (GCMs); however, GCMs cannot accurately represent local climates, leading to uncertainties due to their coarse resolution. Statistical and dynamical downscaling techniques improve local climate projection accuracy. This study compared statistical and dynamical downscaling techniques for evaluating Turkey’s climate change projections, using the MPI-ESM-MR as the main GCM, the RegCM4.7.0 regional climate model for dynamical downscaling and the spatial delta method for statistical downscaling. Seventeen datasets were analyzed to investigate spatio-temporal correlations at resolutions of 1 km, 5 km, 10 km, and 20 km. The evaluated spatial correlations showed low to moderate coefficients, with negative and near-zero values for precipitation but higher values for temperature. Downscaled data at 10 km and 20 km resolution showed more favorable results. The temporal correlation of precipitation showed superior consistency with reduced standard deviations and improved correlation coefficients. The temporal correlation of temperature exhibited exceptional precision due to its nature and alignment with annual seasonal cycles. This study’s findings will significantly enhance understanding of the optimal methodology for downscaling climate change projections and the impacts of climate change on local communities.

Item Classification of mobile application reviews using deep language models (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Eren, Emre.; Aydemir, Fatma Başak.
User reviews include valuable information for mobile applications such as bug reports, feature requests, and rationale for praising or criticising the application. Manual analysis of the reviews is costly due to the vast number of reviews received for an application. To reduce this manual effort, the literature mainly focuses on shallow machine learning methods, with few studies investigating deep language models to assign labels to the reviews. This thesis (i) defines a new label to distinguish reviews criticising the quality and business strategy of applications, (ii) presents a new manually annotated dataset of 2230 application reviews, and (iii) studies the performance of BERT, RoBERTa, DeBERTa, GPT-3 (ada), and GPT-3 (curie) models for review classification. Our results indicate that GPT-3 (curie) significantly outperforms BERT, yet there is no significant difference among the rest considering the F1-score. Additionally, we extend our pipeline by performing topic extraction to identify and capture common themes and topics from the reviews resulting from the classification pipeline. This additional step allows us to gain deeper insights into the prevalent subjects and discussions within the user feedback.

Item Prediction of pathogen-host interactions with protein sequence embeddings using deep learning (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Oğuzoğlu, Büşra.; Özgür, Arzucan.
Infections caused by pathogens are a significant problem around the world. Determining protein interactions between pathogens and hosts is critical to understanding infection mechanisms and developing prevention and treatment strategies. Wet-lab experiments to identify protein interactions are expensive and time-consuming.
Therefore, computational approaches have been proposed as a promising complementary solution. While 3D structures of proteins contain helpful information about protein functions, with advances in sequencing technology, 1D sequences of proteins are widely available and are often utilized because they are easier to process with less computational power. The main goal of this thesis is to develop a sequence-based approach for predicting pathogen-host protein interactions based on the hypothesis that protein sequences can be viewed as sentences and can therefore be decomposed into chunks, which we refer to as protein words. We first adapt the Byte Pair Encoding (BPE) tokenization method from the field of natural language processing to protein sequences and then apply a graph-based approach using the Metapath2Vec algorithm to learn representations of sequences. The results show that incorporating a word-based representation of proteins improves the performance of the graph-based approach. In addition, two other methods for learning text representations, SeqVec and ProtBERT, are evaluated for predicting pathogen-host protein interactions. The results on three virus-host protein interaction datasets show that the sequence-based protein representation approaches are promising and achieve comparable performance to the state-of-the-art methods.

Item An analysis on dimensionality and architecture on generative models (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Gursu, Ali Emre.; Badur, Bertan Yılmaz.
Deep generative models are a powerful class of machine learning models. However, a significant amount of computing power and technical knowledge is required to conduct the training process. Even searching for hyperparameters requires a high computational cost.
Moreover, there is still ongoing research on methods for evaluating generative models, and owing to the lack of a robust and consistent metric, there are limited comparisons between generative model architectures and algorithms. In this study, we attempted to compare two types of generative model architectures, Generative Adversarial Networks (GANs) and Real-valued Non-Volume-Preserving (NVP) flows, with synthetic datasets as well as with the well-known image dataset MNIST. We evaluate their data capturing ability according to data dimensionality and variability. We propose a Minimum Description Length (MDL) based metric to examine the effect of model complexity, which is measured as the model’s parameter count. We provide estimated Kullback-Leibler (KL) divergence and proposed MDL-based metric results. Our findings indicate that NVP models have the capability to encode more data variability while utilizing fewer parameters when contrasted with GANs for lower dimensional datasets. The proposed MDL-based metric facilitates selecting a suitable architecture in terms of model complexity for a given dataset considering its variability and dimensionality. Keywords: Generative Models, Generative Adversarial Networks, RealNVP, Deep Learning.

Item Future predictions of global hotspots for temperature and precipitation extremes with CMIP6 model (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Bayındır, Elif.; Kurnaz, M. Levent.; An, Nazan.
This study applied a climate change hotspot technique based on population-weighted Standard Euclidean Distance (SED) measurements to assess climate change risks across regions. The analysis shows changes in the world’s climatology for the years 2026-2050 and 2076-2099 under the SSP2-4.5 and SSP3-7.0 scenarios.
In the SSP2-4.5 scenario (2026-2050), equatorial regions like China, Japan, and Indonesia are hotspots that would experience increasing sea levels, extreme weather events, and heat waves because of a changing climate and population density. The Americas, Southern Africa, the Mediterranean, Western Europe, and Asia are considered to be risk zones. Concerns include droughts and unpredictable precipitation, which affect economies and food security. Ecosystems and communities must use customized adaptation strategies to meet these difficulties. Moderate adaptation issues are anticipated under SSP2-4.5, necessitating focused measures. Distant future trends (2076-2099) align with near-future tendencies, which calls for further efforts. Similar changes in the indicators may explain the lack of differences between scenarios in the near future. This study enhances our understanding of hotspots and emphasizes the importance of thorough approaches. Innovations and a variety of situations improve climate resilience and adaptation. Future work should focus on improving assessment techniques, improving the data and models used, and including socioeconomic issues for effective climate plans.

Item Disentangled representation learning in isolated sign language recognition (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Erdoğan, İpek.; Baytaş, İnci Meliha.
Representation learning is an essential part of all deep learning tasks. Achieving good performance in recognition, generation, and classification heavily depends on learning meaningful and reliable representations. It is important to gather informative representations that are not affected by unnecessary details in all cases. Sign Language Recognition is one of the areas where deep learning models have been successfully used. Sign Language Recognition (SLR) is essential to exchange information between those who know sign language and those who do not.
The input of an SLR model is a video in which an individual performs a sign or multiple signs. Therefore, Convolutional Neural Networks (CNN) are commonly a part of deep learning-based SLR frameworks. However, CNN-based recognition frameworks tend to capture the characteristics of the identity in the foreground, such as face attributes, hand and body shape, and skin color. This challenge is often encountered in problems such as face and gait recognition, image manipulation, and person re-identification. In this thesis, a disentangled representation learning framework is proposed to separate the latent factors in the sign and signer representations and eliminate the irrelevant identity information to improve sign recognition performance. Various disentanglement techniques, including regularized adversarial training, are investigated. Experiments are conducted on two isolated Turkish sign language benchmark datasets. The effect of feature disentanglement and its potential to improve recognition performance are discussed with qualitative and quantitative analysis.

Item Automatic question generation for improving low resource question answering performance (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Manav, Yusufcan.; Özgür, Arzucan.; Arısoy Saraçlar, Ebru.
This thesis focuses on employing a question-generation system to improve the performance of question-answering models. We propose a multitask-trained question-generation module that is built on a multilingual encoder-decoder architecture and can produce question-answer pairs over plain text passages. We were able to adapt the question-generation system to several languages by using a multilingual model. First, we created a Turkish Question Answering dataset utilizing the Turkish Wikipedia pages and this question-generation system.
Our experiments revealed that the performance on the Turkish XQuAD set was enhanced by 3% when the generated dataset was combined with the human-annotated dataset for question-answering model training. Second, we extensively test our model in many languages and low-resource environments. We used limited annotated data from question-answering datasets in different languages, such as English, German, French, and Turkish, to train the question-generation model. We then utilized this model to create artificial question-answer pairs from the unannotated paragraphs. Our experiments revealed that, especially in the lower data settings, our augmentation strategy consistently outperformed the baseline question-answering models that are trained on human-annotated data across a range of dataset sizes and languages.

Item Forecasting the energy consumption of sectors under different NGFS scenarios and analyzing the effects on Türkiye's GDP (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2022., 2023) Gökçe, Utku.; Kurnaz, M. Levent.
Physical and transition risks of climate change will have an impact on countries’ economies and sectors. With proper planning and taking the necessary steps, these impacts can be mitigated. Therefore, academic studies and analyses in this field are important. This study examines how Türkiye’s GDP will be affected by physical risks and transition risks under different climate scenarios. In addition, within the scope of these scenarios, the future energy consumption of the sectors in Türkiye is forecasted. In cases where current policies are continued or the necessary measures are not taken at the right time, the impact of climate change on Türkiye’s GDP will be huge. At this point, it is of great importance to limit GHG emissions and to limit the increase in global average temperatures relative to the pre-industrial period.
This is because the increase in the number of extreme weather events or the occurrence of irreversible physical events such as sea level rise can seriously affect the economies. The steps in transitioning to a low carbon economy and combating the effects of climate change will also be a huge burden for the economies. In this context, the use of renewable energy sources should be increased in Türkiye and practices that can reduce emissions, such as a carbon tax, should be introduced. In energy production, fossil fuel consumption should be reduced and alternative energy types should be used. It can be said that the Oil and Gas, Transportation and Automotive sectors will be more affected by this situation. In these sectors, renewable energy types may need to be used more in energy production.

Item Life cycle analysis of different cuisines (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Çatalçekiç, Dalya Nur.; Kurnaz, M. Levent.
The extensive global food system is responsible for approximately 30% of greenhouse gas emissions. While the steps of the global food system such as production, packaging, transportation, distribution, storage, and disposal are followed, environmental effects remain intangible. With the life cycle assessment (LCA), environmental impacts are seen in concrete form at every stage of the system. In this study, the environmental impacts of different cuisines were investigated through life cycle assessment. Three menus have been created, consisting of Turkish, Far Eastern, and Mediterranean cuisines, which are well known and have a wide variety of food. Each menu has been chosen in accordance with the culture of its cuisine. The menus consist of soup, main course, side course and dessert. As a result of the life cycle assessment made on the menus selected for the three cuisines, it has been determined that the environmental impact of the Mediterranean cuisine is quite low.
The reason its environmental impact is very low compared to the Turkish and Far Eastern cuisines is that mainly agricultural foods are included in the Mediterranean cuisine and animal-based meals are not preferred much. On a food basis, the environmental impact of animal-based foods is greater than that of plant foods. As a result of the study, Turkish cuisine, in which animal-based meals are predominant, is the cuisine with the most environmental impact.

Item Interpretation of compound fragments via attentive recursive tree (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Özel, Nural.; Ülgen, Kutlu Ö.; Özgür, Arzucan.
The discovery of new drug-like chemicals with desired properties is a challenging and costly process in the pharmaceutical industry. To facilitate this process in the preclinical phase, many different neural network models have been proposed for different tasks (e.g., drug-target affinity prediction, molecular property prediction, target-specific molecule generation). Despite producing successful results, they usually lack interpretability. To comprehend the significance of each fragment in the relevant compounds, we employed the Attention Recursive Tree (AR-Tree) model. Thanks to its task-specific attention mechanism, AR-Tree highlights the significant fragments of compounds by positioning them closer to the root of the tree structure. In this way, the identified significant fragments can be used to design new compounds with desired properties in future research. We experimented with five different classification and four different regression tasks of MoleculeNet as benchmark tasks. The results of the experiments show that the proposed architecture succeeded in finding chemically meaningful fragments for the corresponding tasks.

Item Drug-target affinity prediction using a graph-based approach enriched with molecule words (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Yılmaz, Cansu Damla.; Özgür, Arzucan.
Wet-lab experiments to predict the affinity of drugs for their targets are costly and time-consuming. Computational methods can provide an alternative to early-stage experiments and guide the research process. Recently, the use of natural language processing techniques to represent molecules has become popular and has led to successful results. In our work, we assume that proteins and ligands, like human languages, have their own languages and that these languages consist of meaningful smaller parts that we call words. We identify protein and ligand words based on their 1D sequences using a subword tokenization method and represent protein-ligand interactions with a heterogeneous graph consisting of four different node types corresponding to proteins, ligands, protein words, and ligand words. A graph-based approach is used to learn embeddings for the nodes in the graph. These embeddings are fed into a deep learning model for predicting protein-ligand binding affinity. We show that using their word embeddings to represent novel proteins and/or ligands not present in the training set improves the results compared to the case where no words are used. Using pre-trained word embeddings for previously unknown molecules is also efficient in terms of complexity, as we do not need to re-train the input graph to learn the embeddings for these new molecules.

Item Adversarial robustness and generalization (Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023., 2023) Serbes, Duygu.; Baytaş, İnci Meliha.
In light of recent discoveries regarding adversarial attacks, the necessity for robust models in deep learning has become increasingly critical. Adversarial training is considered one of the most effective approaches to defending against such attacks.
However, a key challenge of adversarial training is the trade-off between adversarial robustness and generalization. The generalizability of robust models in adversarial training is affected by the diversity of perturbations, as they can overfit if the model only learns a limited attack pattern. Although stronger attacks can enhance robustness, their use may cause performance drops when classifying natural images. This thesis investigates the factors that affect the success of adversarial training and proposes solutions to mitigate some of these factors by utilizing new attack augmentation and generation methods. In that regard, we propose an adversarial training method that enhances adversarial directions by augmenting them from a one-step attack. The proposed framework is inspired by feature scattering adversarial training and generates a principal adversarial direction based on the distance of the inter-sample relationships in a perturbed mini-batch. The principal direction is augmented by sampling new adversarial directions in a 45-degree region from it. The proposed method does not require more backpropagation steps than feature scattering. Experimental results on popular benchmark datasets indicate that the method consistently improves adversarial robustness without sacrificing natural accuracy. Furthermore, in this thesis, we propose integrating generalization-boosting techniques, namely mixup and shift-invariance, into the adversarial training framework. The proposed techniques aim to improve the data representations and robustness of models through convex data augmentation and by making the models invariant to small shifts. The effectiveness of our proposals is evaluated under white-box attacks on benchmark datasets.