Detecting Phishing URLs using the BERT Transformer Model
Publication Date
1-1-2023
Document Type
Conference Proceeding
Publication Title
Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023
DOI
10.1109/BigData59044.2023.10386782
First Page
2483
Last Page
2492
Abstract
Phishing websites often mimic benign websites with the objective of luring unsuspecting users into visiting them. Visits may be driven by links in phishing emails, links on web pages, or web search results. Although the precise motivations behind phishing websites differ, the common denominator is that unsuspecting users are usually required to take some action, e.g., clicking on a desired Uniform Resource Locator (URL). To identify phishing websites accurately, the cybersecurity community has relied on a variety of approaches, including blacklisting, heuristic techniques, and content-based analysis, among others. These identification techniques are often enhanced with an array of methods, e.g., honeypots, feature recognition, manual reporting, and web crawlers. Nevertheless, a number of phishing websites still escape detection, either because they are not blacklisted, are too recent, or were incorrectly evaluated. It is therefore imperative to develop solutions that mitigate phishing website threats. In this study, the effectiveness of the Bidirectional Encoder Representations from Transformers (BERT) model is investigated as a possible tool for detecting phishing URLs. The experimental results show that the BERT transformer model achieves acceptable prediction results without requiring advanced URL feature selection techniques or the involvement of a domain specialist.
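The key point of the abstract is that BERT consumes the raw URL string as text, so no hand-crafted lexical features are needed. The sketch below illustrates how a BERT-style tokenizer turns a URL into subword pieces via greedy longest-match-first WordPiece segmentation; the toy vocabulary and helper names are illustrative assumptions, not the paper's setup (the authors presumably use a pretrained BERT tokenizer with a ~30k-piece vocabulary):

```python
import re

# Toy WordPiece vocabulary (hypothetical; a real BERT tokenizer ships
# with roughly 30,000 learned subword pieces).
VOCAB = {"pay", "##pal", "secure", "login", "example", "com", "-", "."}

def basic_tokenize(url):
    """Lowercase and split off punctuation, roughly as BERT's
    pre-tokenizer does before WordPiece is applied."""
    return re.findall(r"[a-z0-9]+|[^a-z0-9\s]", url.lower())

def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first subword segmentation (WordPiece).
    Continuation pieces are prefixed with '##'; a word that cannot be
    segmented maps to the unknown token."""
    pieces, start = [], 0
    while start < len(word):
        end, match = len(word), None
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return [unk]
        pieces.append(match)
        start = end
    return pieces

def tokenize_url(url, vocab=VOCAB):
    """Turn a raw URL into BERT-style subword tokens, with no
    hand-crafted features or domain expertise involved."""
    return [p for w in basic_tokenize(url) for p in wordpiece(w, vocab)]

print(tokenize_url("paypal-secure.login.example.com"))
# → ['pay', '##pal', '-', 'secure', '.', 'login', '.', 'example', '.', 'com']
```

A fine-tuned BERT classifier would then consume this token sequence directly, which is why the approach sidesteps the URL feature selection step that the abstract highlights.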
Funding Number
2319802
Funding Sponsor
National Science Foundation
Keywords
BERT Transformer Language Model, Phishing URLs, Social Engineering
Department
Computer Science
Recommended Citation
Denish Omondi Otieno, Faranak Abri, Akbar Siami Namin, and Keith S. Jones. "Detecting Phishing URLs using the BERT Transformer Model" Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023 (2023): 2483-2492. https://doi.org/10.1109/BigData59044.2023.10386782