Apple is sponsoring the International Conference on Acoustics, Speech and Signal Processing (ICASSP), which is taking place in person from April 14 to 19 in Seoul, South Korea. ICASSP is the IEEE Signal Processing Society’s flagship conference on signal processing and its applications.
Schedule
Below is the schedule of Apple-sponsored workshops and events at ICASSP 2024. Stop by the Apple booth (Booth D1) in the COEX Convention Center Exhibition Hall from April 16 to 19, 8:20 AM to 6:00 PM UTC.
Monday, April 15
Wednesday, April 17
- LUNCHEON
- Women in Signal Processing (WiSP)
- 11:40 AM – 1:40 PM UTC, Room 402
- Panos Georgiou and Clara Borrelli will be representing Apple at the Women in Signal Processing Luncheon.
- ORAL PRESENTATION
- Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization
- 1:10 PM – 1:30 PM UTC, Room 102
- Zhihong Lei, Ernie Pusateri, Michael Han, Leo Liu, Mingbin Xu, Tim Ng, Zhen Huang, Ruchir Travadi, Darien Zhang, Mirko Hannemann, Man-Hung Siu
Thursday, April 18
- JOB FAIR
- Student and Young Professionals Luncheon
- 12:00 – 2:00 PM UTC, Room E5 – E6
- Kisun You, Evan Yamasaki, and Alex Acero will be representing Apple at the Student Job Fair and Luncheon.
Friday, April 19
- ORAL PRESENTATION
- Dialog modeling in audiobook synthesis
- 2:10 – 2:30 PM UTC, Room 104
- Cheng-Chieh Yeh, Reza Shirani, Weicheng Zhang, Tuomo Raitio, Ramya Rasipuram, Ladan Golipour, David Winarsky
Accepted Papers
Corpus Synthesis for Zero-shot ASR Domain Adaptation using Large Language Models
Hsuan Su (National Taiwan University), Ting-Yao Hu, Hema Koppula, Raviteja Vemulapalli, Rick Chang, Karren Yang, Gautam Varma Mantena, Oncel Tuzel
Dialog modeling in audiobook synthesis
Cheng-Chieh Yeh, Reza Shirani, Weicheng Zhang, Tuomo Raitio, Ramya Rasipuram, Ladan Golipour, David Winarsky
Flexible Keyword Spotting based on Homogeneous Audio-Text Embedding
Kumari Nishu, Minsik Cho, Paul Dixon, Devang Naik
Alexandre Bittar (Ecole Polytechnique Fédérale de Lausanne, Switzerland), Paul Dixon, Mohammad Samragh Razlighi, Kumari Nishu, Devang Naik
Leveraging Large Language Models for Exploiting ASR Uncertainty
Pranay Dighe, Yi Su, Daniel Zheng, Yunshu Liu, Vineet Garg, Xiaochuan Niu, Ahmed Tewfik
Gautam Krishna, Sameer Dharur, Oggi Rudovic, Pranay Dighe, Saurabh Adya, Ahmed Hussen Abdelaziz, Ahmed Tewfik
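Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization
Zhihong Lei, Ernie Pusateri, Michael Han, Leo Liu, Mingbin Xu, Tim Ng, Zhen Huang, Ruchir Travadi, Darien Zhang, Mirko Hannemann, Man-Hung Siu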
Streaming Anchor Loss: Augmenting Supervision with Temporal Significance
Oggy Sarawgi, Jack Berkowitz, Vineet Garg, Arnav Kundu, Minsik Cho, Sai Srujana Buddi, Saurabh Adya, Ahmed Tewfik
Towards a World-English Language Model
Rricha Jalota, Lyan Verwimp, Markus Nussbaum-Thom, Amr Mousa, Arturo Argueta, Youssef Oualil
A Multimodal Approach to Device-Directed Speech Detection with Large Language Models
Dominik Wagner (FAU), Alex Churchill, Siddharth Sigtia, Panos Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi
Investigating Salient Representations and Label Variance in Dimensional Speech Emotion Analysis
Vikramjit Mitra, Jingping Nie, Erdrin Azemi
Resource-constrained stereo singing voice cancellation
Clara Borrelli, Dogac Basaran, Matthias Mauch, Matthew McVicar, James Rae, Mehrez Souden
Workshop Accepted Papers
Multichannel Voice Trigger Detection based on Transform-average-concatenate
Takuya Higuchi, Avamarie Brueggeman (The University of Texas at Dallas), Masood Delfarah, Stephen Shum
Acknowledgements
Daniele Giacobello is a member of the ICASSP 2024 Organizing Committee.
Takaaki Hori, Daniele Giacobello, and Yi Su are ICASSP 2024 session chairs.
Vikram Mitra is an Affiliate SLTC Member.
Yi Su, Aswin Sivaraman, Takaaki Hori, Daniele Giacobello, Vineet Garg, Jack Berkowitz, and Vikram Mitra are reviewers for ICASSP 2024.