5/22 (Thur) 16:00-17:30 Professor Chung, Joon Son (KAIST EE)

1. Speaker: Professor Chung, Joon Son (KAIST EE)
2. Title: Giving Voice and Face to AI
3. Date and Time: May 22nd (Thurs.), 4 pm – 5:30 pm
4. Venue: Room L409
* Abstract
Deep neural networks excel in domains such as speech and image recognition, yet humans learn by combining multiple senses. We emulate this by leveraging the natural alignment of sight and sound (e.g., a guitarist’s image and the accompanying audio) to train models without labels. The resulting representations support tasks like sound-source localization and retrieval. Moving beyond monaural settings, our framework integrates binaural cues, enabling it to disentangle scenes where visual and auditory signals misalign or overlap, and to unify semantic and spatial reasoning.
