Feature Selections for the Machine Learning based Detection of Phishing Websites

Buber E., DEMİR Ö., Sahingoz O. K.

2017 International Artificial Intelligence and Data Processing Symposium (IDAP), Malatya, Turkey, 16 - 17 September 2017

  • Publication Type: Conference Paper / Full Text
  • City: Malatya
  • Country: Turkey


Phishing websites are malicious sites that impersonate legitimate web pages in order to steal users' sensitive information, such as user IDs, passwords, and credit card details. Detecting these sites is challenging because phishing is mainly a semantics-based attack that exploits human vulnerabilities rather than network or system vulnerabilities. Two main software-based detection approaches are widely used: blacklists/whitelists and machine learning. Machine learning solutions can detect zero-hour phishing attacks and adapt better to new types of phishing attacks, so they are generally preferred. To use this type of solution, the input features must be selected carefully, because the overall performance of the solution depends on them. This paper therefore aims to list and identify the important features for machine learning-based detection of phishing websites.
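
As a rough illustration of what input features can look like in this setting, the sketch below turns a URL into a few numeric lexical features that a classifier could consume. The specific features (URL length, dots and hyphens in the host, an "@" symbol, an IP-address host, HTTPS use) and the names `extract_url_features` and `host_is_ip` are assumptions chosen for illustration; they are not the feature list identified in the paper.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip(host: str) -> bool:
    """Return True if the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_url_features(url: str) -> dict:
    """Map a URL to a small dictionary of numeric lexical features.

    Illustrative only: these features are assumptions, not the paper's list.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": int("@" in url),
        "host_is_ip": int(host_is_ip(host)),
        "uses_https": int(parts.scheme == "https"),
    }


if __name__ == "__main__":
    for example in ("https://www.example.com/", "http://192.168.0.1/login@secure"):
        print(example, extract_url_features(example))
```

Each resulting dictionary can be converted to a fixed-order vector and passed to any standard classifier; which features actually carry signal is the question the paper addresses.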