Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

Bootstrapping Privacy Compliance in Big Data Systems

In this paper, we demonstrate a collection of techniques to transition to automated privacy compliance checking in big data systems. To this end we designed the LEGALEASE language, instantiated for stating privacy policies as a form of restrictions on information flows, and the GROK data inventory that maps low-level data types in code to high-level policy concepts. We show that LEGALEASE is usable by non-technical privacy champions through a user study. We show that LEGALEASE is expressive enough to capture real-world privacy policies with purpose, role, and storage restrictions with some limited temporal properties, in particular those of Bing and Google. To build the GROK data flow graph we leveraged past work in program analysis and data flow analysis. We demonstrate how to bootstrap labeling the graph with LEGALEASE policy datatypes at massive scale. We note that the structure of the graph allows a small number of annotations to cover a large fraction of the graph. We report on our experiences and learnings from operating the system for over a year in Bing.

Shayak Sen (Carnegie Mellon University), Saikat Guha (Microsoft Research, India), Anupam Datta (Carnegie Mellon University), Sriram Rajamani (Microsoft Research, India), Janice Tsai (Microsoft Research, Redmond), and Jeannette Wing (Microsoft Research), Bootstrapping Privacy Compliance in Big Data Systems, IEEE Security and Privacy Symposium 2014, Best Student Paper (1 of 2).
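The abstract's central idea, that a few annotations on a data flow graph can cover much of the graph and then be checked against a policy of flow restrictions, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the node names, the `IPAddress` and `Advertising` labels, the `propagate` and `violations` helpers, and the representation of a policy as denied (datatype, purpose) pairs. It is not LEGALEASE syntax and not how GROK is implemented.

```python
from collections import deque

# Hypothetical data flow graph: each node lists the nodes that consume its output.
EDGES = {
    "raw_logs": ["sessionize"],
    "sessionize": ["ads_ranker", "abuse_detect"],
    "ads_ranker": [],
    "abuse_detect": [],
}

# A single manual annotation (assumed label, not GROK's real vocabulary).
SEED_LABELS = {"raw_logs": {"IPAddress"}}

# Purpose attached to each job (assumed for this sketch).
PURPOSE = {"ads_ranker": "Advertising", "abuse_detect": "AbuseDetection"}

# Toy policy: (datatype, purpose) pairs that are denied.
DENY = {("IPAddress", "Advertising")}


def propagate(edges, seeds):
    """Push datatype labels downstream along edges until nothing changes."""
    labels = {node: set(seeds.get(node, ())) for node in edges}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for succ in edges[node]:
            before = len(labels[succ])
            labels[succ] |= labels[node]
            if len(labels[succ]) != before:
                queue.append(succ)  # labels grew, so revisit this node's consumers
    return labels


def violations(labels, purpose, deny):
    """List (node, datatype) pairs where a labeled datatype reaches a denied purpose."""
    return sorted(
        (node, dtype)
        for node, dtypes in labels.items()
        for dtype in dtypes
        if (dtype, purpose.get(node)) in deny
    )


labels = propagate(EDGES, SEED_LABELS)
print(violations(labels, PURPOSE, DENY))
```

In this sketch one annotation on `raw_logs` ends up labeling all four nodes, and the only flagged flow is the one that reaches the advertising job.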
