THESIS DEFENSE PRESENTATION
Title: Enabling Transfer Learning Across Heterogeneous Domains using Input Alignment
Presenter: Arunavo Dey
Advisor: Dr. Tanzima Islam
Date: Tuesday, November 7, 2023
Time: 9:30-11:00 AM
Location: 310 (COMAL)
Zoom Meeting Link: https://txstate.zoom.us/j/98698827193
Abstract:
As High-Performance Computing (HPC) architectures evolve quickly, gathering data on each architecture to build predictive performance models can be time-consuming. Instead, leveraging existing performance models to predict the performance of a new application, or of an existing application on a new architecture, is a better use of scientists' time. However, such knowledge transfer in HPC can be difficult because of domain heterogeneity arising from (1) data distribution shifts due to differences in architectures and (2) incomparable feature spaces due to differences in the names and numbers of features. Although existing transfer learning techniques can handle the first scenario (distribution shift), the second is unique to HPC and requires attention. We propose a few-shot-learning-based test-time adaptation method for Neural Networks (NNs) to address this gap. The proposed approach aligns target features using a fully connected NN to leverage existing knowledge from a pre-trained source model, and then stacks the source model's outputs with a secondary model to capture the relationships between the unique target features and labels that the source model misses. Our evaluations with both HPC datasets and Machine Learning (ML) benchmarks demonstrate that this approach can adapt an existing source model into a high-fidelity target model with only a few target samples. Additionally, we propose a novel distance measure that easily quantifies the dissimilarity between the source and target datasets, which can help identify and explain the best model to use for knowledge transfer. Our extensive evaluations demonstrate that the proposed few-shot-learning-based test-time adaptation method, augmented with input alignment, achieves the lowest Mean Squared Error (MSE) when predicting a target that differs significantly from the source data.
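To make the alignment-plus-stacking idea in the abstract concrete, the following is a minimal PyTorch sketch of one way such a pipeline could be structured: a frozen pre-trained source model, a fully connected alignment network that maps target features into the source feature space, and a secondary stacking model that combines the source prediction with target-only features. The feature dimensions, layer sizes, and random placeholder data are assumptions for illustration only; this is not the presenter's actual implementation.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: the source model expects 16 features, the target
# domain exposes 10 alignable features plus 4 target-only features.
SOURCE_DIM, TARGET_DIM, UNIQUE_DIM = 16, 10, 4

# Pre-trained source performance model (kept frozen during adaptation).
source_model = nn.Sequential(nn.Linear(SOURCE_DIM, 32), nn.ReLU(), nn.Linear(32, 1))
for p in source_model.parameters():
    p.requires_grad = False

# Input-alignment network: maps target features into the source feature space.
aligner = nn.Sequential(nn.Linear(TARGET_DIM, 32), nn.ReLU(), nn.Linear(32, SOURCE_DIM))

# Secondary (stacking) model: combines the source model's prediction with the
# target-only features the source model has never seen.
stacker = nn.Sequential(nn.Linear(1 + UNIQUE_DIM, 16), nn.ReLU(), nn.Linear(16, 1))

def predict(x_target, x_unique):
    aligned = aligner(x_target)          # target -> source feature space
    source_pred = source_model(aligned)  # reuse pre-trained knowledge
    return stacker(torch.cat([source_pred, x_unique], dim=1))

# Few-shot adaptation: fit only the aligner and stacker on a handful of
# target samples (random placeholders here).
x_t, x_u = torch.randn(8, TARGET_DIM), torch.randn(8, UNIQUE_DIM)
y_t = torch.randn(8, 1)
opt = torch.optim.Adam(list(aligner.parameters()) + list(stacker.parameters()), lr=1e-3)
loss_fn = nn.MSELoss()
for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(predict(x_t, x_u), y_t)
    loss.backward()
    opt.step()
```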