Abdul Waheed

I'm a grad student at School of Computer Science, Carnegie Mellon University in Pittsburgh, PA.

Email  /  Scholar  /  Twitter/X  /  Github

profile photo

Research

I am interested in robust and interpretable machine learning. Please find some of my work below and visit my google scholar for more details.

Updates

  • Oct 2024 - Excited to release three new preprints of our work on Unsupervised Data Filtering For Knowledge Distillation, Investigating What Speech Foundation Models Learn About Speech, and Synthetic Data for LLM Pretraining.
  • August 2024 - Joined LTI@CMU as graduate student.
  • June 2024 - 1 main + 1 workshop papers accepted at ACL'2024.
  • Old updates are archived.
  • Publications

    * denotes equal contribution

    Please visit scholar page for more details.

    uDistil-Whisper On the Diversity of Synthetic Data and its Impact on Training Large Language Models
    Hao Chen, Abdul Waheed, Xiang Li, Yidong Wang, Jindong Wang, Bhiksha Raj, Marah I. Abdin
    Preprint, 2024
    Paper

    uDistil-Whisper What Do Speech Foundation Models Not Learn About Speech?
    Abdul Waheed, Hanin Atwany, Bhiksha Raj, Rita Singh
    Preprint, 2024
    Paper

    uDistil-Whisper uDistil-Whisper: Label-Free Data Filtering for Knowledge Distillation in Low-Data Regimes
    Abdul Waheed, Karima Kadaoui, Muhammad Abdul-Mageed
    Preprint, 2024
    Paper

    Robust Knowledge Distillation To Distill or Not to Distill? On the Robustness of Robust Knowledge Distillation
    Abdul Waheed, Karima Kadaoui, Muhammad Abdul-Mageed
    ACL, 2024
    Paper | Models

    Zero-Shot TTS Towards Zero-Shot Text-To-Speech for Arabic Dialects
    Khai Duy Doan, Abdul Waheed, Muhammad Abdul-Mageed
    ArabicNLP conference co-located with ACL , 2024
    Paper

    LaMini-LM LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
    Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed, Alham Fikri Aji
    EACL, 2024
    Paper | Models | Data

    GPTaraeval GPTAraEval: A Comprehensive Evaluation of ChatGPT on Arabic NLP
    Md Tawkat Islam Khondaker, Abdul Waheed, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed
    EMNLP, 2023
    Paper

    Tarjamat TARJAMAT: Evaluation of Bard and ChatGPT on Machine Translation of Ten Arabic Varieties
    Karima Kadaoui*, Samar M. Magdy*, Abdul Waheed*, Md Tawkat Islam Khondaker, Ahmed Oumar El-Shangiti, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed
    ArabicNLP conference co-located with EMNLP, 2023
    Paper

    Whisper Arabic N-Shot Benchmarking of Whisper on Diverse Arabic Speech Recognition
    Bashar Talafha*, Abdul Waheed*, Muhammad Abdul-Mageed
    Interspeech, 2023
    Paper | Code

    Dialogue Act Classification Speaker and Time-aware Joint Contextual Learning for Dialogue-act Classification in Counselling Conversations
    Ganeshan Malhotra, Abdul Waheed, Aseem Srivastava, Md Shad Akhtar, Tanmoy Chakraborty
    ACM International Conference on Web Search and Data Mining (WSDM), 2022
    Paper

    Domain Robustness of Pretrained Models Analyzing the Domain Robustness of Pretrained Language Models, Layer by Layer
    Abhinav Ramesh Kashyap, Laiba Mehnaz, Bhavitvya Malik, Abdul Waheed, Devamanyu Hazarika, Min-Yen Kan, Rajiv Shah
    Domain Adaptation for NLP workshop co-located with EACL, 2021
    Paper


    Fork from Jon Barron's website.