
Ali Vosoughi
阿力
Ali Vosoughi is a Ph.D. student in the Electrical and Computer Engineering Department at the University of Rochester. He is a scholar in the NSF’s Augmented and Virtual Reality project.
I am researching with two fantastic groups:
My research interests include multimodal information processing with deep learning, multisensory perception, video and scene understanding, and audio, image, video, and natural language processing.
News
- Presentations at SPIE 2023
- A paper has been accepted in Nature, digital medicine. Congratulations to the team!
- Congratulations on being nominated for the 2022 Donald M. and Janet C. Barnard Fellowship!
- One paper has been accepted to ICASSP 2022, Machine Learning track. Thanks to our team!
- Two papers accepted to Defense and Commercial Sensing, 2022! Congrats!
- Four papers have been accepted to SPIE. Congratulations to our team!
- Our participation in the NSF Research program on AR/VR
- Code for our Nature paper
- A new paper on causal learning has been accepted for publication in EUSIPCO 2021
- New paper accepted to Scientific Reports, Nature
Projects

Audio-Visual Deep Learning
The work is currently under double-blind review…

Visual Question Answering
The work is currently under double-blind review…

Cross-Modal Language and Vision Models for Radiology
Deep learning models can be applied successfully to real-world problems; however, training most of these models requires massive amounts of data. Recent methods combine language and vision, but they rely on datasets that are usually not publicly available. Here we pave the way for further research in the multimodal language-vision domain for radiology. In this paper, we train a representation learning method that uses local and global representations of language and vision through an attention mechanism, based on the publicly available Indiana University Radiology Report (IU-RR) dataset. We then use the learned representations for the downstream task of diagnosing five lung pathologies: atelectasis, cardiomegaly, edema, pleural effusion, and consolidation. Finally, we use both supervised and zero-shot classification to extensively analyze the performance of the representation learning on the IU-RR dataset.
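The zero-shot classification step described above can be illustrated with a minimal sketch. It assumes that an image and a text prompt for each pathology have already been mapped into a shared embedding space; the three-dimensional embeddings, the function names, and the use of cosine similarity are illustrative assumptions, not details taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_label(image_embedding, prompt_embeddings):
    """Return the label whose prompt embedding is most similar to the image."""
    return max(
        prompt_embeddings,
        key=lambda label: cosine(image_embedding, prompt_embeddings[label]),
    )

# Toy embeddings, made up for illustration only (hypothetical values).
prompts = {
    "atelectasis": [1.0, 0.0, 0.0],
    "cardiomegaly": [0.0, 1.0, 0.0],
    "edema": [0.0, 0.0, 1.0],
}
image = [0.1, 0.9, 0.2]
predicted = zero_shot_label(image, prompts)
print(predicted)
```

No labeled training example of the target class is needed at prediction time in this setup; the class is described only through its text prompt.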

Point Clouds and Deep Learning
Visual information processing using point cloud (PC) data has attracted considerable attention during the last decade. Although PC data live in Euclidean space and can be described with analytic geometry, their use in deep neural networks faces challenges related to analyzing point sets. Because a point cloud is an unordered, permutation-invariant set of points, many of the advantages of deep learning (DL) architectures designed for computer vision tasks are diminished. In this study, we use a Siamese neural network (SNN) for tracking and shape completion of 3D objects. Our contributions include a modified objective function and non-random sampling methods. In experiments on the KITTI self-driving dataset, the proposed method improves the success and precision of 3D object tracking from 41.3 and 57.9 percent to 68.3 and 78.1 percent, respectively, when using a Kalman filter search space.
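To make the permutation-invariance point concrete, here is a small sketch showing that a symmetric function of the points, in this case a coordinate-wise maximum, gives the same result for any ordering of the cloud. Symmetric pooling of this kind is a common ingredient in point-cloud networks; it is shown here only as an illustration and is not claimed to be the architecture used in this study. The function name and the toy points are hypothetical.

```python
import random

def max_pool_feature(points):
    """Coordinate-wise maximum over a set of 3D points (a symmetric function)."""
    return tuple(max(p[i] for p in points) for i in range(3))

# A toy point cloud with three points (x, y, z); values are made up.
cloud = [(0.0, 1.0, 2.0), (3.0, -1.0, 0.5), (1.5, 4.0, -2.0)]

# Reorder the same points with a fixed seed.
shuffled = cloud[:]
random.Random(0).shuffle(shuffled)

feature = max_pool_feature(cloud)
feature_shuffled = max_pool_feature(shuffled)
print(feature == feature_shuffled)
```

A layer that depends on the order of its inputs, such as a convolution over a pixel grid, would not have this property, which is why point sets need dedicated treatment.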

Neural Networks and Causality
Causal queries provide essential insight for many scientific discoveries. Unfortunately, data observed at a massive number of sensors are typically complex and nonlinearly related. At the same time, such data may contain information from millions of sensors with only a few decades of temporal samples, which leads to the curse of dimensionality in large-scale systems. Despite a rich literature on causal discovery, the problem remains challenging for large-scale datasets. We propose a novel method that utilizes a radial basis function (RBF) to tackle the curse of dimensionality in complex systems. The proposed method is probabilistic, encompasses nonlinear relations, and handles large-scale data in two steps. Extensive simulations on synthetic data of different sizes and on real-world climatology data show that our method outperforms the other methods evaluated when nodal observations are temporally scarce.
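As background for the radial basis function mentioned above, the sketch below evaluates a Gaussian RBF, one common choice of radial basis function. The text does not specify which RBF the method uses or how its centers are chosen, so the Gaussian form, the `width` parameter, and the centers are assumptions made only for illustration.

```python
import math

def gaussian_rbf(x, center, width=1.0):
    """Gaussian RBF: exp(-||x - center||^2 / (2 * width^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def rbf_features(x, centers, width=1.0):
    """Map a point to one nonlinear feature per center."""
    return [gaussian_rbf(x, c, width) for c in centers]

# Hypothetical centers; in practice these would be derived from the data.
centers = [(0.0, 0.0), (1.0, 1.0)]
features = rbf_features((0.0, 0.0), centers)
print(features)
```

The value depends only on the distance between the input and the center, which is what makes the function radial and lets a small number of centers capture nonlinear relations.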

Anomaly Detection with Autoencoders
Labeled data are necessary for supervised neural network (NN) training. However, supervised learning does not scale favorably, because human intervention for labeling large datasets is expensive. Here, we propose a method that introduces interventions on the training set and enables NNs to learn features in a self-supervised learning (SSL) setting. The method intervenes in the training data by randomly changing image contrast and removing input image patches, thus creating a significantly augmented training dataset. This is fed into an autoencoder (AE) network, which learns how to reconstruct input images given variable contrast and missing patches of pixels. The proposed technique enables few-shot learning of the most relevant image features by forcing NNs to exploit context information in a generative model. Here, we focus on a medical imaging application, where large labeled datasets are usually not available. We evaluate our proposed algorithm for anomaly detection on a small dataset with only 23 training and 35 test images of T2-weighted brain MRI scans from healthy controls (training) and tumor patients (test). We find that the image reconstruction error for healthy controls is significantly lower than for tumor patients (Mann-Whitney U-test, p < 10^-10), which can be exploited for anomaly detection of pathologic brain regions by human expert analysis of reconstructed images. Interestingly, this still holds for conventional AE training without SSL, although reconstruction error distributions for healthy/diseased subjects appear to be less dissimilar (p < 10^-7). We conclude that the proposed SSL method may be useful for anomaly detection in medical imaging, thus potentially enhancing radiologists’ productivity.
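The two interventions described above, a random contrast change and the removal of an image patch, can be sketched as follows, together with a mean squared reconstruction error. This is a minimal illustration under stated assumptions: images are nested lists of floats in [0, 1], a single square patch is zeroed, and contrast is scaled around mid-gray. The function names, the patch size, and the contrast range are hypothetical and are not taken from the paper.

```python
import random

def intervene(image, rng, patch=2, contrast_range=(0.5, 1.5)):
    """Return a copy of `image` with random contrast and one zeroed patch."""
    height, width = len(image), len(image[0])
    gain = rng.uniform(*contrast_range)
    # Scale contrast around mid-gray, then clip to the valid range [0, 1].
    out = [
        [min(1.0, max(0.0, 0.5 + gain * (px - 0.5))) for px in row]
        for row in image
    ]
    # Remove one randomly placed square patch by setting it to zero.
    top = rng.randrange(height - patch + 1)
    left = rng.randrange(width - patch + 1)
    for r in range(top, top + patch):
        for c in range(left, left + patch):
            out[r][c] = 0.0
    return out

def mse(a, b):
    """Mean squared error between two images of the same shape."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

rng = random.Random(0)
# A toy 4x4 "image" with values between 0.25 and 0.40.
clean = [[0.25 + 0.05 * ((r + c) % 4) for c in range(4)] for r in range(4)]
corrupted = intervene(clean, rng)
print(mse(clean, corrupted))
```

In a setup like this, the autoencoder would receive `corrupted` as input and be trained to reproduce `clean`; at test time, the per-image reconstruction error would serve as the anomaly score.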

Information Theory for Hardware Security
Information theory for security

CMOS-Based Optimization Accelerators
Optimization using mixed-signal analog Ising machines [website]

Camera Sensor Design
Terahertz (lensless) camera sensor design [website]

CMOS and Electronics
Noise analysis and sensor interface design [website]
Personal Gallery


Why can't Persian be written right-to-left on this website?