Privacy-preserving Machine Learning and Data Analytics


Research Statement

Machine learning has been widely adopted across many applications, a success largely due to a combination of algorithmic breakthroughs, improved computational resources, and especially access to large amounts of diverse training data. However, such massive datasets often contain privacy-sensitive information, such as individuals' medical and financial records. With the rise of ubiquitous sensing, personalization, and virtual assistants, users' privacy is at ever-increasing risk. Can we harness the power and utility of machine learning and data analytics while still protecting users' privacy? How does privacy preservation relate to generalization and robustness in machine learning? Can we design privacy-preserving learning algorithms that guarantee both privacy and high data utility?

We will explore novel techniques, including differential privacy, to enable privacy-preserving machine learning and data analytics in practice. Our long-term goal is both to provide practical, real-world solutions for privacy-preserving machine learning and data analytics and to deepen the theoretical understanding of these problems in the big data era.

Recent Publications

How You Act Tells a Lot: Privacy-Leakage Attack on Deep Reinforcement Learning

Xinlei Pan, Weiyao Wang, Xiaoshuai Zhang, Bo Li, Jinfeng Yi, Dawn Song.

International Conference on Autonomous Agents and Multiagent Systems (AAMAS), May 2019


Towards Efficient Data Valuation Based on the Shapley Value

Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nick Hynes, Bo Li, Ce Zhang, Dawn Song, Costas Spanos.



Get Your Workload in Order: Game Theoretic Prioritization of Database Auditing

Chao Yan, Bo Li, Yevgeniy Vorobeychik, Aron Laszka, Daniel Fabbri, Bradley Malin.

ICDE 2018


Engineering Agreement: The Naming Game with Asymmetric and Heterogeneous Agents

J. Gao, B. Li, G. Schoenebeck, and F. Yu.

In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI 2017).


Iterative Classification for Sanitizing Large-Scale Datasets

B. Li, Y. Vorobeychik, M. Li, and B. Malin.

ICDM 2015