<node id="667011">
  <nid>667011</nid>
  <type>event</type>
  <uid>
    <user id="28475"><![CDATA[28475]]></user>
  </uid>
  <created>1680298322</created>
  <changed>1680298322</changed>
  <title><![CDATA[Ph.D. Dissertation Defense - Chao-Han Yang]]></title>
  <body><![CDATA[<p><strong>Title:</strong> <em>A Perturbation Approach to Differential Privacy for Deep Learning based Speech Processing</em></p>

<p><strong>Committee:</strong></p>

<p>Dr. Chin-Hui Lee, ECE, Chair, Advisor</p>

<p>Dr. Larry Heck, ECE</p>

<p>Dr. Elliott Moore II, ECE</p>

<p>Dr. Pin-Yu Chen, IBM AI</p>

<p>Dr. Sabato Marco Siniscalchi, NTNU</p>
]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[A Perturbation Approach to Differential Privacy for Deep Learning based Speech Processing]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[<p>To deploy high-performance speech applications, deep neural network (DNN) based models are widely used and trained on speech data. New data regulations (e.g., GDPR and CCPA) require service providers to ensure the necessary privacy protections, with measurable privacy guarantees, against query-based malicious attacks. Differential privacy (DP) is a technique that provides identity protection with a mathematical definition for measuring privacy losses under query-based attacks. This dissertation aims to establish a perturbation-based machine learning framework for training DNNs that maintains the high performance of speech models while satisfying DP requirements. We develop specific mechanisms for signal distortion by adding bounded noise to the speech training data. These perturbations can be analyzed through statistical measurements and estimations of the deployed models. Our theoretical studies use both Laplace and Gaussian noise to guarantee the privacy budget under a bounded max-divergence measurement with ensemble learning. We further leverage the Lipschitz continuity of DNNs to provide model robustness estimates under different perturbations in vector space. Speech recognition in both isolated speech commands and large vocabulary continuous speech settings is used to corroborate our theoretical justifications. We have advanced the proposed framework into popular speech and acoustic modeling tasks and conducted preliminary investigations exploring the trade-off between model performance and DP budgets. We introduce two training frameworks to counter model degradation through (i) teacher-ensemble learning and (ii) sub-sampling-based training. Our findings show that even with state-of-the-art speaker protection techniques, speaker identity information can still be leaked in the absence of DP-based data extraction or training methods. For broader applicability, we have incorporated decentralized and federated DNN training pipelines into the proposed privacy-preserving speech processing mechanism to investigate potential solutions for on-device and cloud-based speech applications for end-users.</p>
]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2023-04-12T12:00:00-04:00]]></value>
      <value2><![CDATA[2023-04-12T14:00:00-04:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[CSIP Library and Teams: https://tinyurl.com/dp-speech]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>434381</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[ECE Ph.D. Dissertation Defenses]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1788</tid>
        <value><![CDATA[Other/Miscellaneous]]></value>
      </item>
      </field_categories>
  <field_keywords>
          <item>
        <tid>192484</tid>
        <value><![CDATA[PhD Defense, graduate students]]></value>
      </item>
      </field_keywords>
  <userdata><![CDATA[]]></userdata>
</node>
