<node id="671172">
  <nid>671172</nid>
  <type>event</type>
  <uid>
    <user id="27707"><![CDATA[27707]]></user>
  </uid>
  <created>1700518226</created>
  <changed>1700518410</changed>
  <title><![CDATA[PhD Proposal by Shreyas Malakarjun Patil]]></title>
  <body><![CDATA[<p><span><span><strong><span><span><span><span>Title: </span></span></span></span></strong><span><span><span><span>Leveraging sparsity in deep neural networks for training efficiency, interpretability and transfer learning</span></span></span></span></span></span></p>

<p>&nbsp;</p>

<p><strong>Date: </strong>Thursday, November 30, 2023</p>

<p><strong>Time: </strong>10:30 AM</p>

<p><strong>Physical attendance</strong>: conference room (Midtown) on Coda 12th floor</p>

<p><strong>Virtual attendance</strong>: <a href="https://gatech.zoom.us/j/96615684093" target="_blank">https://gatech.zoom.us/j/96615684093</a></p>

<p>&nbsp;</p>

<p><strong>Shreyas Malakarjun Patil</strong></p>

<p>Machine Learning PhD Student</p>

<p>ECE<br />
Georgia Institute of Technology</p>

<p>&nbsp;</p>

<p><strong>Committee</strong></p>

<p>1. Dr. Constantine Dovrolis (Advisor)</p>

<p>2. Dr. Ling Liu</p>

<p>3. Dr. Zsolt Kira</p>

<p>&nbsp;</p>

<p><strong>Abstract</strong></p>

<p>&nbsp;</p>

<p>Sparse neural networks (NNs) have fewer connections between consecutive layers than dense NNs, and this sparsity has been shown to improve both generalization and computational efficiency. However, the diversity of possible sparse network structures, and the benefits of sparsity beyond efficiency and generalization, remain largely unexplored.</p>

<p>&nbsp;</p>

<p>In this dissertation, I explore sparse network structures and the benefits they provide. First, we propose PHEW, a new method that identifies sparse NNs at initialization without using training data. PHEW produces sparse NNs that learn fast and generalize well, thereby improving training efficiency. Second, we propose Neural Sculpting, which uncovers the hierarchically modular structure of a task within an NN: we iteratively prune units and edges during training and combine this pruning with network analysis to detect modules and infer their hierarchy, thereby improving NN interpretability. Finally, we plan to examine how efficiently hierarchically modular NNs, whose structure reflects that of the task, transfer to new tasks compared to dense NNs. Under the assumption that the new tasks in transfer learning share similarities with the original tasks, our investigation will focus on the degree to which sub-tasks from the original tasks are reused. In summary, this dissertation advances the understanding and capabilities of sparse NNs in terms of training efficiency, interpretability, and transfer learning.</p>
]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[Leveraging sparsity in deep neural networks for training efficiency, interpretability and transfer learning]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[<p>Leveraging sparsity in deep neural networks for training efficiency, interpretability and transfer learning</p>
]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2023-11-30T10:30:00-05:00]]></value>
      <value2><![CDATA[2023-11-30T12:00:00-05:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[conference room (Midtown) on Coda 12th floor]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>221981</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[Graduate Studies]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1788</tid>
        <value><![CDATA[Other/Miscellaneous]]></value>
      </item>
      </field_categories>
  <field_keywords>
          <item>
        <tid>102851</tid>
        <value><![CDATA[Phd proposal]]></value>
      </item>
      </field_keywords>
  <userdata><![CDATA[]]></userdata>
</node>
