Pioneering Privacy-Preserving Machine Learning (PPML)
Machine learning is one of the most active areas of modern computer science. My central goal is to enable the use of powerful AI and machine learning models on sensitive datasets, such as medical records, financial transactions, or personal communications, without revealing the underlying data to the model provider or any other party. This research addresses the fundamental tension between model utility and data privacy.
Systematization of Knowledge (SoK)

In SoK: Cryptographic Neural-Network Computation [NC@SP23], my student and I delivered a cross-community synthesis of cryptographic approaches to privacy-preserving machine learning (PPML). An SoK at a flagship venue requires deep command of a fast-moving field. We undertook a large-scale study of 53 seminal PPML papers from 2016–2022, dissecting their cryptographic designs and their “love–hate relationships” with machine learning. The paper brings structure and critical insight to a field often seen as fragmented, giving newcomers a clear entry point and helping experts build without needless reinvention. By demystifying cryptography for ML researchers and vice versa, it fosters true interdisciplinary work. We also created a dedicated website to serve as an evolving resource for the community.
Secure and Efficient Transformer Inference
Building on the SoK insights, we targeted one of PPML’s most urgent problems: secure inference for transformer-based models, the foundation of modern large language models such as GPT-3 and BERT. Secure multi-party computation (SMC) is efficient for linear layers but struggles with the non-linear softmax and Gaussian error linear unit (GELU) functions critical to transformers. Prior solutions were too slow and complex for deployment. Our work, SHAFT: Secure, Handy, Accurate, and Fast Transformer Inference [KC@NDSS25], breaks this barrier with two cryptographic firsts: (1) the first constant-round private softmax protocol for transformers, eliminating the logarithmic growth in communication rounds and unlocking scalability; and (2) a high-accuracy Fourier-based GELU protocol far more efficient than existing polynomial methods. SHAFT reduces communication costs by 25-41%, achieves up to 5.3× speedups, and matches the accuracy of non-private models. It earned the Distinguished Artifact Award, recognizing our well-documented, reproducible software, a validation of the strong, practical training my student receives.
SHAFT improves on our earlier secure softmax/sigmoid protocol [ZZCPTLY@ACSAC23] and advances our broader line of work, including GForce [NC@USS21], a GPU-friendly system for oblivious neural network prediction that, as noted in the SoK, remains on the Pareto frontier for selected settings. My group also pioneered Goten [NCWWZ@AAAI21], the first method to securely outsource neural network training to untrusted GPUs, achieving a major performance leap. In addition, we developed sublinear evaluation of multi-client decision trees [MTZC@NDSS21], a model class valued for its interpretability, and exposed vulnerabilities in privately trained models [WMWNC@IJCAI20] at a top AI venue.
Secure inference • Constant-round softmax • Fourier-based GELU
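To give intuition for the Fourier-based GELU idea, here is a plaintext sketch (the interval, harmonic count, and fitting method are illustrative choices, not SHAFT's actual parameters or its secret-sharing protocol): GELU decomposes as x/2 plus an even residual, the linear part is essentially free under linear secret sharing, and a short cosine series approximates the residual far more compactly than a high-degree polynomial.

```python
import numpy as np
from math import erf, sqrt, pi

def gelu(x):
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF.
    return np.array([0.5 * v * (1.0 + erf(v / sqrt(2.0))) for v in x])

L, K = 8.0, 16                      # fitting interval [-L, L], harmonic count (illustrative)
xs = np.linspace(-L, L, 4001)

# GELU(x) = x/2 + h(x), where h(x) = x*erf(x/sqrt(2))/2 is even,
# so a short cosine series approximates h well.
h = gelu(xs) - xs / 2.0

# Least-squares cosine fit: h(x) ~ sum_{k=0..K} c_k * cos(k*pi*x/L).
A = np.stack([np.cos(k * pi * xs / L) for k in range(K + 1)], axis=1)
coef, *_ = np.linalg.lstsq(A, h, rcond=None)

approx = xs / 2.0 + A @ coef        # reassemble the GELU approximation
max_err = np.max(np.abs(approx - gelu(xs)))
print(f"max |error| with {K} cosine terms: {max_err:.3f}")
```

Under multi-party computation, only the fixed-frequency cosines need secure evaluation; the linear term and the weighted sum are local operations on shares.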
Differentially Private Large Language Models (LLMs)

Beyond cryptographic methods, we pursue a complementary approach based on differential privacy (DP), which limits the information leaked by outputs. In DP-Forward [DCWHS@CCS23], we introduced a new DP training paradigm for LLMs. The standard approach, DP-SGD, adds noise to gradients during backpropagation; it is compute- and memory-intensive, especially when fine-tuning massive pre-trained models, and fails to protect inference queries from attacks such as embedding inversion.
DP-Forward shifts the paradigm: it perturbs embedding matrices directly in the forward pass, yielding: (1) Comprehensive privacy: rigorous local DP guarantees for both training data and user inputs at inference. (2) Superior efficiency: roughly 3× savings in time and memory over the fastest DP-SGD implementations. (3) Stronger security and utility: embedding inversion attack success rates up to 88 percentage points lower (an attack DP-SGD fails to mitigate entirely), alongside higher accuracy at equivalent privacy levels.
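The forward-pass idea can be sketched as follows. This is a simplified stand-in, not DP-Forward's exact mechanism (the paper develops a tighter, matrix-valued Gaussian calibration): clip each token embedding's L2 norm, then add Gaussian noise using the textbook (epsilon, delta) formula before the embeddings flow into later layers.

```python
import numpy as np

def perturb_embeddings(emb, epsilon, delta, clip=1.0, rng=None):
    # Illustrative local-DP forward-pass perturbation (NOT DP-Forward's
    # actual mechanism): clip each row to L2 norm <= clip, then add
    # Gaussian noise via the classic (epsilon, delta) calibration.
    rng = np.random.default_rng() if rng is None else rng
    norms = np.linalg.norm(emb, axis=-1, keepdims=True)
    clipped = emb * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    # Replacing one input changes a clipped row by at most 2*clip in L2.
    sigma = 2.0 * clip * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return clipped + rng.normal(0.0, sigma, size=emb.shape)

rng = np.random.default_rng(0)
emb = rng.normal(size=(16, 32))       # (sequence length, hidden dimension)
noisy = perturb_embeddings(emb, epsilon=8.0, delta=1e-5, rng=rng)
```

Because the noise is injected once in the forward pass, the backward pass needs no per-example gradient clipping, which is where DP-SGD's time and memory overheads come from.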
This DP line extends our prior work on natural text sanitization, which replaces sensitive words with semantically similar, non-sensitive alternatives to protect privacy while preserving utility for text analytics [YDWLSC@ACL21], and on sanitizing sentence embeddings (and their labels), enabling the private release of high-dimensional text representations that remain effective for downstream ML tasks [DYCS@WWW23].
We also advance federated learning (FL), developing a highly efficient secure aggregation protocol [BRECPH@ACSAC24] that reduces communication overhead, a key bottleneck in practice, and cryptographic methods that enable differential privacy in FL for complex models such as generative adversarial networks and meta-learning [ZSDCLZW@ADMA23]. Foundational to these collaborations is private set intersection (PSI) [MC@ASIACCS22, ZC@WPES18], which aligns distributed, vertically partitioned data while ensuring that items unique to one dataset remain hidden from other data owners.
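The core intuition behind secure aggregation is pairwise masking (a minimal sketch of the classic idea, not the protocol of [BRECPH@ACSAC24], which differs in how it achieves its efficiency): each pair of clients shares a seed, one adds the seed-expanded noise and the other subtracts it, so the server learns only the sum of all updates.

```python
import numpy as np

n, dim = 4, 8
rng = np.random.default_rng(42)
updates = rng.normal(size=(n, dim))   # each client's private model update

# Pairwise seeds: in a real protocol, clients i and j agree on s_ij
# via key exchange; here they are sampled centrally for illustration.
seeds = [[int(rng.integers(2**32)) if i < j else 0 for j in range(n)]
         for i in range(n)]

def masked_update(i):
    m = updates[i].copy()
    for j in range(n):
        if i == j:
            continue
        s = seeds[min(i, j)][max(i, j)]
        noise = np.random.default_rng(s).normal(size=dim)  # PRG-expanded shared seed
        m += noise if i < j else -noise  # opposite signs cancel across each pair
    return m

masked = np.array([masked_update(i) for i in range(n)])
agg = masked.sum(axis=0)  # server sees only masked vectors and their sum
print(np.allclose(agg, updates.sum(axis=0)))  # True: all masks cancel
```

The communication cost of seed agreement and dropout handling is exactly the bottleneck that efficient secure aggregation protocols target.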
Selected Publications
- [KC@NDSS25] Andes Y. L. Kei, Sherman S. M. Chow.
SHAFT: Secure, Handy, Accurate, and Fast Transformer Inference.
Network and Distributed System Security Symposium (NDSS) 2025.
- [BRECPH@ACSAC24] Rouzbeh Behnia, Arman Riasi, Reza Ebrahimi, Sherman S. M. Chow, Balaji Padmanabhan, Thang Hoang.
Efficient Secure Aggregation for Privacy-Preserving Federated Machine Learning.
Annual Computer Security Applications Conference (ACSAC) 2024.
- [ZZCPTLY@ACSAC23] Yu Zheng, Qizhi Zhang, Sherman S. M. Chow, Yuxiang Peng, Sijun Tan, Lichun Li, Shan Yin.
Secure Softmax/Sigmoid for Machine-learning Computation.
Annual Computer Security Applications Conference (ACSAC) 2023.
- [DCWHS@CCS23] Minxin Du, Xiang Yue, Sherman S. M. Chow, Tianhao Wang, Chenyu Huang, Huan Sun.
DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass.
ACM Conference on Computer and Communications Security (CCS) 2023.
- [ZSDCLZW@ADMA23] Yu Zheng, Wei Song, Minxin Du, Sherman S. M. Chow, Qian Lou, Yongjun Zhao, Xiuhua Wang.
Cryptography-Inspired Federated Learning for Generative Adversarial Networks and Meta Learning.
Advanced Data Mining and Applications (ADMA) 2023.
- [NC@SP23] Lucien K. L. Ng, Sherman S. M. Chow.
SoK: Cryptographic Neural-Network Computation.
IEEE Symposium on Security and Privacy (S&P) 2023.
- [DYCS@WWW23] Minxin Du, Xiang Yue, Sherman S. M. Chow, Huan Sun.
Sanitizing Sentence Embeddings (and Labels) for Local Differential Privacy.
ACM Web Conference (WWW) 2023.
- [MC@ASIACCS22] Jack P. K. Ma, Sherman S. M. Chow.
Secure-Computation-Friendly Private Set Intersection from Oblivious Compact Graph Evaluation.
ACM Asia Conference on Computer and Communications Security (AsiaCCS) 2022.
- [NCWWZ@AAAI21] Lucien K. L. Ng, Sherman S. M. Chow, Anna P. Y. Woo, Donald P. H. Wong, Yongjun Zhao.
Goten: GPU-Outsourcing Trusted Execution of Neural Network Training.
AAAI Conference on Artificial Intelligence (AAAI) 2021.
- [YDWLSC@ACL21] Xiang Yue, Minxin Du, Tianhao Wang, Yaliang Li, Huan Sun, Sherman S. M. Chow.
Differential Privacy for Text Analytics via Natural Text Sanitization.
Findings of the Association for Computational Linguistics (ACL Findings) 2021.
- [NC@USS21] Lucien K. L. Ng, Sherman S. M. Chow.
GForce: GPU-Friendly Oblivious and Rapid Neural Network Inference.
USENIX Security Symposium 2021.
- [MTZC@NDSS21] Jack P. K. Ma, Raymond K. H. Tai, Yongjun Zhao, Sherman S. M. Chow.
Let's Stride Blindfolded in a Forest: Sublinear Multi-Client Decision Trees Evaluation.
Network and Distributed System Security Symposium (NDSS) 2021.
- [WMWNC@IJCAI20] Harry W. H. Wong, Jack P. K. Ma, Donald P. H. Wong, Lucien K. L. Ng, Sherman S. M. Chow.
Learning Model with Error - Exposing the Hidden Model of BAYHENN.
International Joint Conference on Artificial Intelligence (IJCAI) 2020.
- [ZC@WPES18] Yongjun Zhao, Sherman S. M. Chow.
Can You Find The One for Me? Privacy-Preserving Matchmaking via Threshold PSI.
ACM Workshop on Privacy in the Electronic Society (WPES@CCS) 2018.
Get in touch
Email: [firstname]@ie.cuhk.edu.hk