Abstract
Many smartphone apps collect potentially sensitive personal data and send it to cloud servers. However, most mobile users have a poor understanding of why their data is being collected. We present MobiPurpose, a novel technique that can take a network request made by an Android app and then classify the data collection purposes, as one step towards making it possible to explain to non-experts the data disclosure contexts. Our purpose inference works by leveraging two observations: 1) developer naming conventions (e.g., URL paths) often offer hints as to data collection purposes, and 2) external knowledge, such as app metadata and information about the domain name, provides meaningful cues that can be used to infer the behavior of different traffic requests. MobiPurpose parses each traffic request body into key-value pairs, and infers the data type and data collection purpose of each key-value pair using a combination of supervised learning and text pattern bootstrapping. We evaluated MobiPurpose's effectiveness using a dataset cross-labeled by ten human experts. Our results show that MobiPurpose can predict the data collection purpose with an average precision of 84% (among 19 unique categories).
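As an illustration of the parsing step the abstract describes, the following is a minimal Python sketch that splits a request into its host, URL path, and key-value pairs. It is not the authors' implementation: the function names `request_to_pairs` and `flatten`, the dotted-key convention for nested JSON, and the example URL are all assumptions made for this sketch, and the classification stage (supervised learning and text pattern bootstrapping) is not shown.

```python
import json
from urllib.parse import urlsplit, parse_qsl


def flatten(value, prefix=""):
    """Yield (key, leaf_value) pairs from nested JSON-like data.

    Nested keys are joined with dots (a convention assumed for this sketch).
    """
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}[{index}]")
    else:
        yield prefix, value


def request_to_pairs(url, body=""):
    """Return (host, path, pairs) for one request.

    Pairs come from the URL query string and from the body, which is
    treated as JSON when it parses to an object or array, and as
    form-encoded text otherwise.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    try:
        parsed = json.loads(body) if body else None
    except json.JSONDecodeError:
        parsed = None
    if isinstance(parsed, (dict, list)):
        pairs.extend(flatten(parsed))
    elif body:
        pairs.extend(parse_qsl(body, keep_blank_values=True))
    return parts.hostname, parts.path, pairs
```

The host and path are returned alongside the pairs because the abstract names URL paths and domain information as cues for purpose inference; a downstream classifier could consume all three.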
Bibtex
@article{Jin2018WhyAT,
author = {Jin, Haojian and Liu, Minyi and Dodhia, Kevan and Li, Yuanchun and Srivastava, Gaurav Kumar and Fredrikson, Matthew and Agarwal, Yuvraj and Hong, Jason I.},
year = {2018},
month = {12},
pages = {1--27},
title = {Why Are They Collecting My Data?: Inferring the Purposes of Network Traffic in Mobile Apps},
volume = {2},
journal = {Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies},
doi = {10.1145/3287051}
}
Plain Text
Jin, Haojian & Liu, Minyi & Dodhia, Kevan & Li, Yuanchun & Srivastava, Gaurav Kumar & Fredrikson, Matthew & Agarwal, Yuvraj & Hong, Jason I. (2018). Why Are They Collecting My Data?: Inferring the Purposes of Network Traffic in Mobile Apps. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies. 2. 1-27. 10.1145/3287051.