Ali Vosoughi
阿力
PhD Candidate, advised by Prof. Axel Wismueller and Prof. Chenliang Xu
University of Rochester
🤖 Agentic AI Systems 🎵 Computer Audition 🧠 Multimodal Reasoning 🎬 Multimodal Generation 🥽 Immersive Computing 🔍 Reasoning Verification 🎯 Reinforcement Learning 🚀 Large Action Models 🔊 Audio Generation 📹 Video Generation
📧 ali.vosoughi@rochester.edu
📍 CS Department, Wegmans Hall 3211
🎵 Smule AI
Research Scientist Intern
Spatial Audio Generation
Jun–Sep 2025
🏢 Microsoft Research
Research Intern
Audiovisual LLM
May–Aug 2024
🚗 Bosch AI Research
Research Intern
Audio LLM
Apr–Jul 2023
🛡️ DARPA PTG
Graduate Researcher
Autonomous AR Copilot
2022–present
🏆 First counterfactual audio methods
ICASSP’24 + US Patent Application
🤝 Autonomous multimodal copilot
Real-time AR demonstrations (DARPA)
📊 VERIFY benchmark
Reasoning verification framework

Recent News & Updates

10/2024
🎤 Presented at SANE 2024, DeepMind Boston
10/2024
📄 ACM Multimedia 2024 paper accepted
08/2024
💼 Research presentation at Microsoft, Seattle
03/2024
📄 NAACL 2024 paper accepted
02/2024
📄 IEEE Transactions on Multimedia paper
08/2023
🎯 Two ICCV 2023 papers accepted

Publications

VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity

[Paper][Website][🤗 Hugging Face]

EAGLE: Egocentric AGgregated Language-video Engine
ACM MM’24
[Paper][Code][Website]

Cross Modality Bias in Visual Question Answering: A Causal View with Possible Worlds VQA
IEEE Transactions on Multimedia’24
[Paper][Code][Website]

OSCaR: Object State Captioning and State Change Representation
NAACL’24
[Paper][Code]

Video Understanding with Large Language Models: A Survey
[Paper][Code]

Learning Audio Concepts from Counterfactual Natural Language
ICASSP’24
[Paper][Code]

Separating Invisible Sounds Toward Universal Audiovisual Scene-Aware Sound Separation
ICCV’23: ICCV AV4D Workshop
[Paper]

MISAR: A Multimodal Instructional System with Augmented Reality
ICCV’23: ICCV AV4D Workshop
[Paper][Code][Video]

Relation Discovery in Nonlinearly Related Large-scale Settings
ICASSP’22
[Paper][Code]

Leveraging Pre-Images to Discover Nonlinear Relationships in Multivariate Environments
EUSIPCO’21
[Paper]

Large-scale Nonlinear Granger Causality for Inferring Directed Dependence from Short Multivariate Time-series Data
Scientific Reports’21
[Paper][Code]

