Publication Date
Summer 2023
Degree Type
Thesis
Degree Name
Master of Science (MS)
Department
Justice Studies
Advisor
Bryce Westlake; Sambuddha Ghatak; Richard Frank
Abstract
The inherent encryption and anonymity of the Dark Web have facilitated its use for a variety of illegal activities, including the dissemination of child sexual abuse material (CSAM). Detection of CSAM has increasingly turned to automated tools; however, these tools’ reliance on databases of known CSAM limits their flexibility and their ability to identify previously unknown CSAM. Currently, it is unknown whether the lack of countermeasures used by CSAM disseminators on the Surface Web translates to the Dark Web. Therefore, this thesis examines the structural and naming patterns of websites hosting CSAM on the Dark Web to identify new, complementary detection methods. Data were collected from 1,197 Dark Web websites using a customized web crawler. In total, 179 unique CSAM hash values, found in 3,640 locations, were analyzed. Structurally, websites on the Dark Web prioritized organization (ease of access) over countermeasures (security). Eighty-five percent of images were located in the first sub-folder level, with default ‘image’ or explicitly named folders being the most common. In contrast, files were frequently innocuously named. However, one countermeasure used was the creation of duplicate, ‘mirrored’ websites. Results are compared to the Surface Web, and recommendations for supplementing automated detection tools with structural and naming patterns are outlined.
Recommended Citation
Guerra, Enrique, "Crawling the Dark Web: Structural Attributes of Child Sexual Abuse Websites on the Dark Web" (2023). Master's Theses. 5443.
DOI: https://doi.org/10.31979/etd.42pa-8qkh
https://scholarworks.sjsu.edu/etd_theses/5443