Who’s Calling? Characterizing Robocalls through Audio and Metadata Analysis

Published in USENIX Security, 2020

Paper: pdf

Conference presentation: YouTube

Abstract: Unsolicited calls are one of the most prominent security issues facing individuals today. Despite widespread anecdotal discussion of the problem, many important questions remain unanswered. In this paper, we present the first large-scale, longitudinal analysis of unsolicited calls to a honeypot of up to 66,606 lines over 11 months. From call metadata we characterize the long-term trends of unsolicited calls, develop the first techniques to measure voicemail spam and wangiri attacks, and identify unexplained high-volume call incidences. Additionally, we mechanically answer a subset of the call attempts we receive to cluster related calls into operational campaigns, allowing us to characterize how these campaigns use telephone numbers. Critically, we find no evidence that answering unsolicited calls increases the number of unsolicited calls received, overturning popular wisdom. We also find that we can reliably isolate individual call campaigns, in the process revealing the extent of two distinct Social Security scams while empirically demonstrating the majority of campaigns rarely reuse phone numbers. These analyses comprise powerful new tools and perspectives for researchers, investigators, and a beleaguered public.

@inproceedings{usenix_whoscalling,
 author = {Sathvik Prasad and Elijah Bouma-Sims and Athishay Kiran Mylappan and Bradley Reaves},
 title = {Who{\textquoteright}s Calling? Characterizing Robocalls through Audio and Metadata Analysis},
 booktitle = {29th {USENIX} Security Symposium ({USENIX} Security 20)},
 year = {2020},
 isbn = {978-1-939133-17-5},
 pages = {397--414},
 url = {https://www.usenix.org/conference/usenixsecurity20/presentation/prasad},
 publisher = {USENIX Association},
 month = aug,
}