Westlake, B., Bouchard, M., & Frank, R. (2012). Comparing methods for detecting child exploitation content online. Proceedings of the EISIC – European Intelligence and Security Informatics, Odense, 156-163.

The sexual exploitation of children online is seen as a global issue and has been addressed by both governments and private organizations. Efforts thus far have focused primarily on the use of image hash value databases to find content. Recently, however, researchers have begun to use keywords as a way to detect child exploitation content. In the current study we explore both of these methodologies. Using a custom-designed web crawler, we create three networks using the hash value method, the keywords method, and a hybrid method combining the first two. Results first show that the three million images found in our hash value database were not common enough on public websites for the hash value method to produce meaningful results. Second, the small sample of websites that were found to contain those images had few or no videos posted, suggesting a need for different criteria for finding each type of material. Third, websites with code words commonly known to be used by child pornographers to identify or discuss exploitative content were found to be much larger than others, with extensive visual and textual content. Finally, boy-centered keywords were more commonly found on child exploitation websites than girl-centered keywords, though not at a statistically significant level. Applications for law enforcement and areas for future research are discussed.

