Press Kit

Ali Vosoughi

PhD researcher, University of Rochester. Unified multimodal generation + understanding.

Bio

Ali Vosoughi builds unified multimodal AI systems in which LLMs plan and diffusion/codec models generate audio and video. He is a PhD researcher at the University of Rochester with industry experience at Microsoft Research, Bosch AI, and Smule. His work has been published at ICCV, NAACL, NeurIPS, ACM MM, IEEE TMM, ICASSP, EUSIPCO, and Nature.

Key Facts

Affiliation: University of Rochester, Multimodal Vision & Language Lab
Research: Unified multimodal generation + understanding (audio, video, text)
Industry: Research internships at Microsoft Research, Bosch AI, and Smule
Patent: US application 18/379,518, pending (counterfactual audio learning)
Notable Work: EUSIPCO 2025 unified audiovisual encoder, SoundCLIP framework, VERIFY benchmark

Professional Photos

Professional Headshot
High-resolution version available for media use

Resources

Research Demo

Publications

Code

Media Contact

Email: ali.vosoughi at rochester.edu
Response Time: 24 to 48 hours for media inquiries
Available For: Technical interviews, research discussions, conference presentations

Image Usage

Professional photographs available for editorial and academic use with attribution. For high-resolution versions or commercial licensing, contact ali.vosoughi at rochester.edu.