The development of mod_antiCrawl: an anti crawler add-on module for apache web servers

Topgül, Muhammed Oğuzhan.

Files

b1663003.013382.001.PDF (791.41 KB)

b1663003.013401.001.zip (21.56 KB)

Date

2012.

Authors

Topgül, Muhammed Oğuzhan.

Publisher

Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2012.

Abstract

A web crawler can be defined as automated software that extracts a website's map by visiting all the links in the website. The map extraction process can serve as the basis for a web attack, so crawling plays an important role in automated attacks. Most automated vulnerability scanners crawl a site before running vulnerability tests in order to determine its overall map and attack surface. Besides automated scanning, crawlers can also be used for content theft: by visiting every page in an orderly manner, a crawler can copy all the pages and content of a website. Anti-crawling can be defined as a set of mechanisms that prevent websites from being crawled by automated crawlers. In this thesis, a set of anti-crawling mechanisms is combined into an Apache web server module called mod_antiCrawl. mod_antiCrawl is developed in C using the Apache API, and it has crawler detection and inhibition capabilities to protect servers from malicious crawlers. The performance of mod_antiCrawl has also been studied, and our results show that website map discovery by crawlers decreases by at least 70% after mod_antiCrawl is activated. This reduction increases to 90% when different functionalities of the module are enabled.
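
The abstract reports effectiveness as the drop in website map discovery. As a rough illustration of that metric only, the C sketch below crawls a toy in-memory link graph breadth-first and computes the fractional reduction in discovered pages when the crawler is cut off early. It does not reproduce mod_antiCrawl's detection or inhibition logic, which the abstract does not describe. The names `site_graph`, `crawl_discovered`, and `discovery_reduction`, and the `max_fetches` cutoff used to model inhibition, are all assumptions made for this example.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PAGES 8

/* Hypothetical toy site: links[i][j] is true when page i links to page j. */
typedef struct {
    size_t page_count;
    bool links[MAX_PAGES][MAX_PAGES];
} site_graph;

/* Breadth-first crawl starting at page `start`.
 * `max_fetches` is an assumed stand-in for inhibition: the crawler is cut
 * off after fetching that many pages (0 means it is never cut off).
 * Returns the number of distinct pages discovered, i.e. the start page plus
 * every page whose link was seen on a fetched page. */
size_t crawl_discovered(const site_graph *site, size_t start, size_t max_fetches)
{
    bool seen[MAX_PAGES] = { false };
    size_t queue[MAX_PAGES];
    size_t head = 0, tail = 0, fetched = 0;

    if (site->page_count > MAX_PAGES || start >= site->page_count) {
        return 0;
    }

    seen[start] = true;
    queue[tail++] = start;

    while (head < tail) {
        if (max_fetches != 0 && fetched >= max_fetches) {
            break; /* crawler has been cut off */
        }
        size_t page = queue[head++];
        fetched++;
        for (size_t next = 0; next < site->page_count; next++) {
            if (site->links[page][next] && !seen[next]) {
                seen[next] = true;
                queue[tail++] = next; /* each page is queued at most once */
            }
        }
    }
    return tail; /* every queued page counts as discovered */
}

/* Fractional reduction in discovery relative to an unprotected baseline:
 * 0.7 corresponds to a 70% decrease. */
double discovery_reduction(size_t baseline, size_t with_protection)
{
    if (baseline == 0) {
        return 0.0;
    }
    return 1.0 - (double)with_protection / (double)baseline;
}
```

To use the sketch, fill in a `site_graph`, call `crawl_discovered` once with `max_fetches` set to 0 for the baseline and once with a nonzero cutoff, then pass both counts to `discovery_reduction`.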

URI

https://hdl.handle.net/20.500.14908/12210

Collections

M.S. Theses
