Slide notes

Title
Hello, dear professors.
I'll be presenting my master's thesis, titled "Spectral Methods for Outlier Detection in Machine Learning".

First, I'd like to present a short overview of our work.

Next Slide
Overview

Our work deals with the problem of outlier detection.
We argue that spectral methods are valuable for this problem.
We propose combining spectral methods with outlier detection methods.
We evaluate our approach on 20 data sets and discuss the results.

Give a short outline:
First, we analyze the problem of outlier detection and outlier detection methods.
Then we move on to spectral methods and review some of them.

After we present our idea, we move on to experiments, where we evaluate the performance of outlier detection methods and their combination with spectral methods.

So how can we define outliers?
 
Next slide

What is an outlier?
A popular definition by Grubbs:
"An outlying observation, or outlier, is one that appears to deviate markedly from other members of the sample in which it occurs."
For example, this point here is quite different from the others.
However, note that it is not easy to define what makes an instance an outlier.
Are these outliers?

Why is this important?
Outliers convey valuable and actionable information in real life.
Examples: a network attack, a potential fault, a disease, a cancerous tumor.

From the perspective of learning theory, detecting outliers makes it possible to learn better models.

Now we look at how outlier detection differs from classification.

Next Slide

How does it differ from classification?

Availability of labels:
supervised, semi-supervised, unsupervised

Class priors are unbalanced.
Misclassification costs are asymmetric.
Noise is similar to outliers.

Now we look at the different kinds of methods for outlier detection

Next Slide
Classification of Methods

Classification
Learns a discriminative model that separates outliers from typical instances
Requires labeled data
Two-class vs. one-class

Density Estimation
Assumes outliers occur far from typical instances, in low-density regions
High computational complexity and low performance on high-dimensional inputs
Parametric, Semi-Parametric, Non-Parametric

Statistical Methods
Nearest neighbor methods
Clustering methods

Now we review the outlier detection methods we use in our work

Next Slide

Active Outlier

Supervised, one-class method
Reduces the unsupervised outlier detection problem to classification of normal samples against artificially generated outliers
An ensemble of classifiers is trained on selectively sampled subsets of the training data
Requires much less computational power than density estimation and spectral methods

Explain Active Outlier with the 4 figures.
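The reduction above can be sketched in a few lines. This is our minimal illustration, not the paper's exact ensemble procedure: the decision tree, sample sizes, and the uniform bounding box for artificial outliers are our assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# "Normal" unlabeled data: a tight Gaussian cluster.
X_real = rng.normal(loc=0.0, scale=1.0, size=(200, 2))

# Artificially generated outliers: uniform over a box enclosing the data.
lo, hi = X_real.min(axis=0) - 1.0, X_real.max(axis=0) + 1.0
X_art = rng.uniform(lo, hi, size=(200, 2))

# Reduce to binary classification: real sample = 0, artificial outlier = 1.
X = np.vstack([X_real, X_art])
y = np.concatenate([np.zeros(200), np.ones(200)])
clf = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X, y)

# Outlier score of a new point = predicted probability of the outlier class.
scores = clf.predict_proba([[0.0, 0.0], [8.0, 8.0]])[:, 1]
```

A point inside the cluster should score low, and a far-away point high; the full method trains an ensemble of such classifiers on selectively sampled subsets.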

Next Slide

Local Outlier Factor

Nearest-neighbor based
Considers differences in local densities around an instance

It is hard to find an optimal k; it depends on the problem
LOF values are quite sensitive to the $k$ value
Calculate LOF values for $k \in [k_{min}, k_{max}]$ and take the maximum
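The max-over-k heuristic can be sketched with scikit-learn's LOF implementation; the data, the $[k_{min}, k_{max}]$ range, and the planted outlier are illustrative assumptions of ours.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
# 100 typical points plus one planted outlier at the end.
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)), [[6.0, 6.0]]])

# LOF is sensitive to k, so score over [k_min, k_max] and keep the maximum.
k_min, k_max = 10, 30
lof_max = np.full(len(X), -np.inf)
for k in range(k_min, k_max + 1):
    lof = LocalOutlierFactor(n_neighbors=k).fit(X)
    # negative_outlier_factor_ stores -LOF; negate to get LOF itself.
    lof_max = np.maximum(lof_max, -lof.negative_outlier_factor_)
```

Taking the maximum over the range means a point is flagged if it looks outlying at any neighborhood scale.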

Next Slide

Parzen Windows

Non-parametric density estimation method
Determining an optimal bin size (bandwidth) is difficult
A fixed bin size for the whole input space may not work well
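A minimal sketch of the Parzen-window idea using SciPy's Gaussian KDE; the bandwidth and data are illustrative assumptions, and "low estimated density" stands in for "outlier".

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=200)

# Parzen-window (kernel) density estimate; bw_method plays the role
# of the bin size and is fixed over the whole input space.
kde = gaussian_kde(X, bw_method=0.3)

# A point in a low-density region gets a much lower estimate.
density_typical = kde(0.0)[0]
density_far = kde(8.0)[0]
```

The fixed bandwidth is exactly the weakness noted above: one value must serve both dense and sparse regions of the input space.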

Next Slide

One-Class SVM

An algorithm that returns a function f that takes the value +1 in a small region capturing most of the data and -1 elsewhere. The strategy is to map the data into the feature space corresponding to the kernel and to separate it from the origin with maximum margin.

The nu parameter controls the number of outliers and support vectors:
nu is an upper bound on the fraction of outliers
and a lower bound on the fraction of support vectors
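The nu-property can be checked empirically with scikit-learn's OneClassSVM; the data, the nu value, and the tolerances below are our assumptions, not results from the thesis.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(500, 2))

# nu upper-bounds the fraction of outliers and
# lower-bounds the fraction of support vectors.
nu = 0.1
ocsvm = OneClassSVM(kernel="rbf", nu=nu, gamma="scale").fit(X)

pred = ocsvm.predict(X)                  # +1 inside the region, -1 outside
outlier_frac = float(np.mean(pred == -1))
sv_frac = len(ocsvm.support_vectors_) / len(X)
```

On training data both fractions land close to nu, which makes nu a direct, interpretable knob for the expected contamination level.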

Transition to Spectral Methods
Next Slide

Formal Definition
Unsupervised learning techniques that reveal low-dimensional structure in high-dimensional data \cite{Saul:2006}
Use the spectral decomposition of specially constructed matrices to reduce dimensionality and transform the input data into a new space
Linear vs. Nonlinear

Generic approach: find a lower-dimensional representation of the input data such that dot products in the new space match the original similarities as closely as possible. This is also a low-rank matrix approximation problem; its solution is the SVD (spectral decomposition).
Kernel Trick
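The generic approach can be sketched with plain NumPy: on exactly low-rank data, the rank-k SVD embedding preserves all dot products. The data shapes and the rank are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# 20-dimensional data with exact rank-3 structure.
X = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 20))

# Spectral decomposition via the SVD; keep the top-k components.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 3
X_low = U[:, :k] * s[:k]   # k-dimensional embedding

# Dot products in the embedding match the original similarities,
# since the rank-k SVD gives the best rank-k approximation of X.
err = np.linalg.norm(X @ X.T - X_low @ X_low.T) / np.linalg.norm(X @ X.T)
```

With noisy data the match is only approximate, and replacing the Gram matrix X X^T with a kernel matrix gives the nonlinear (kernel-trick) variants.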
