cv
Basics
| Name | Vaibhavi Nitin Lokegaonkar |
| Label | Graduate Student, Computer Science |
| Email | vlokegao@umd.edu |
| Phone | +1 (240) 726 9240 |
| URL | https://vaibhavi1707.github.io/ |
| Summary | Graduate student at the University of Maryland, College Park, working in the PIRL and GAMMA Labs under the guidance of Prof. Dinesh Manocha and Prof. Ramani Duraiswami. Research interests: Multimodal AI, Music Understanding, and Intelligibility. |
Work
-
2025.02 - Present Research Assistant
GAMMA Lab, University of Maryland, College Park
Curated a dataset of 1,000+ instances of non-speech sounds and music paired with multiple-choice questions that require audio reasoning skills for DCASE2025. Currently working on music intelligibility and understanding.
- Dataset creation for non-speech sounds and music
- Audio reasoning QA for DCASE2025
- Music intelligibility research
-
2024.01 - 2024.06 Master’s Thesis Student
Multimodal Perception Lab, IIIT-Bangalore
Proposed a novel video-audio embedding scheme to capture implicit expressions and dialogue intent in acting auditions. Improved state-of-the-art performance by 12%.
- Weakly supervised multimodal learning
- Sequence analyzer for audition evaluation
-
2023.05 - 2023.08 Machine Learning Research Intern
Adobe Research
Developed algorithms for textual hierarchy estimation in design documents. Built GPT-based pipelines for generating contextual prompts to populate design templates.
- Algorithm for document hierarchy estimation
- LLM-based design prompt generation
- 80% human acceptance in evaluations
-
2022.04 - 2022.12 Machine Learning Research Intern
E-Health Research Center, IIIT-Bangalore
Built an attention-based video model for identifying self-stimulatory actions from raw videos, with 30% lower inference latency and 81% accuracy. Authored a paper published at IEEE Healthcom 2023.
- Attention-based video modeling
- Self-stimulatory behavior detection
- IEEE Healthcom 2023 publication
Education
-
2025.08 - 2027.05 College Park, Maryland, USA
-
2019.08 - 2024.07 Bangalore, Karnataka, India
B.Tech + M.Tech
International Institute of Information Technology, Bangalore
Computer Science (Specialization: AI & ML)
Publications
-
2026.01.01 MMAU-Pro: A Comprehensive Benchmark for Holistic Evaluation of Audio General Intelligence
AAAI
Comprehensive benchmark for evaluating general audio intelligence across multiple tasks. Submitted to AAAI 2026 Main Track.
-
2023.09.01 Introducing SSBD+ Dataset with a Convolutional Pipeline for Detecting Self-Stimulatory Behaviours
IEEE Healthcom
Proposed a convolutional pipeline for detecting self-stimulatory behaviours from videos, introducing the SSBD+ dataset.
Projects
-
GPT-3 Implementation
Implemented the GPT-3 architecture from scratch in PyTorch and trained it on the Tiny Shakespeare dataset.
- Self-attention trick
- Pretraining experiments
-
Van Gogh Me!
Generated Van Gogh-style celebrity portraits using a CycleGAN trained on Van Gogh paintings and the CelebA dataset.
- CycleGAN
- Image-to-image translation
Skills
| Programming Languages | |
| Python | |
| C/C++ | |
| Java | |
| Haskell | |
| SQL | |
| JavaScript | |
| Libraries & Frameworks | |
| NumPy | |
| Pandas | |
| Matplotlib | |
| Seaborn | |
| OpenCV | |
| PyTorch | |
| TensorFlow | |
| scikit-learn | |
| Flask | |
| Django | |
| ReactJS | |
| NodeJS | |
| ThreeJS | |
| WebGL | |
| SpringBoot | |
| Technologies | |
| Linux | |
| GitHub | |
| Jenkins | |
| CircleCI | |
| LaTeX | |
| Docker | |
| Kubernetes | |
| Developer Tools | |
| VS Code | |
| Eclipse | |
| Android Studio | |