International Conference on Learning Representations (ICLR) 2025

Apple is presenting new research at the annual International Conference on Learning Representations (ICLR), which takes place in person in Singapore from April 24 to 28. We are proud to again sponsor the conference, which brings together the scientific and industrial research communities in deep learning. Below is an overview of Apple’s participation at ICLR 2025.

Schedule

Stop by the Apple booth (#C03) in the Singapore EXPO during exhibition hours:

Thursday, April 24: 09:30 – 17:30
Friday, April 25: 09:30 – 17:30
Saturday, April 26: 09:30 – 17:30

All times are listed in GMT+8 (Singapore time).

Thursday, April 24

POSTER
How to Verify Any (Reasonable) Distribution Property: Computationally Sound Argument Systems for Distributions
10:00 – 12:30, #30266, Poster Session 1, Hall 3 + Hall 2B
Tal Herman (Weizmann Institute of Science), Guy Rothblum

POSTER
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo
10:00 – 12:30, #29208, Poster Session 1, Hall 3 + Hall 2B
Shengyu Feng (CMU), Xiang Kong, Shuang Ma, Aonan Zhang, Dong Yin, Chong Wang, Ruoming Pang, Yiming Yang (CMU)

POSTER
SeedLM: Compressing LLM Weights through Seeds of a Pseudo-Random Generator
15:00 – 17:30, #28000, Poster Session 2, Hall 3 + Hall 2B
Rasoul Shafipour, David (MIND) Harrison, Max Horton, Jeff Marker, Houman Bedayat, Sachin Mehta (Meta), Mohammad Rastegari (Meta), Mahyar Najibi, Saman Naderiparizi

Friday, April 25

POSTER
EC-DIT: Scaling Diffusion Transformers with Expert Choice Routing
10:00 – 12:30, #29721, Poster Session 3, Hall 3 + Hall 2B
Haotian Sun (Georgia Institute of Technology), Tao Lei, Bowen Zhang, Yanghao Li, Haoshuo Huang, Ruoming Pang, Bo Dai (Georgia Institute of Technology), Nan Du

POSTER
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms
10:00 – 12:30, #32099, Poster Session 3, Hall 3 + Hall 2B
Zhangheng Li, Keen You, Haotian Zhang, Di Feng, Harsh Agrawal, Xiujun Li, Mohana Prasad Sathya Moorthy, Jeff Nichols, Yinfei Yang, Zhe Gan

POSTER
Large-Scale Image-Caption Data in Improving Multimodal Foundation Models
10:00 – 12:30, #29536, Poster Session 3, Hall 3 + Hall 2B
Jeff Lai, Vasileios Saveris, Chen Chen, Hong-You Chen, Haotian Zhang, Bowen Zhang, Wenze Hu, Juan Lao Tebar, Zhe Gan, Peter Grasch, Meng Cao, Yinfei Yang

POSTER
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
10:00 – 12:30, #30222, Poster Session 3, Hall 3 + Hall 2B
Haotian Zhang, Mingfei Gao, Zhe Gan, Philipp Dufter, Nina Wenzel, Forrest Huang, Dhruti Shah, Xianzhi Du, Bowen Zhang, Yanghao Li, Sam Dodge, Keen You, Zhen Yang, Aleksei Timofeev, Mingze Xu, Hong-You Chen, Jean-Philippe Fauconnier Biard, Jeff Lai, Haoxuan You, Zirui Wang, Afshin Dehghan, Peter Grasch, Yinfei Yang

POSTER
MMEgo: Towards Building Egocentric Multimodal LLMs for Video QA
10:00 – 12:30, #30907, Poster Session 3, Hall 3 + Hall 2B
Hanrong Ye (Hong Kong University of Science and Technology (HKUST)), Haotian Zhang, Erik Daxberger, Lin Chen, Zongyu Lin (UCLA), Yanghao Li, Bowen Zhang, Haoxuan You, Jiasen Lu, Dan Xu (HKUST), Zhe Gan, Yinfei Yang

POSTER
Theory, Analysis, and Best Practices for Sigmoid Self-Attention
10:00 – 12:30, #29205, Poster Session 3, Hall 3 + Hall 2B
Jason Ramapuram, Federico Danieli, Eeshan Gunesh Dhekane, Floris Weers, Dan Busbridge, Pierre Ablin, Tatiana Likhomanenko, Jagrit Digani, Zijin Gu, Amitis Shidani, Russ Webb

POSTER
TIS-DPO: Token-level Importance Sampling for Direct Preference Optimization
10:00 – 12:30, #28368, Poster Session 3, Hall 3 + Hall 2B
Law Liu (Tsinghua University), Felix Bai, Zhiyun Lu, Yanchao Sun, Xiang Kong, Simon Wang, Jiulong Shan, Lijie Wen (Tsinghua University), Philip S. Yu (University of Illinois at Chicago), Meng Cao

SOCIAL
Women in Machine Learning (WiML)
12:30 – 14:00, Conference GHJ
Helen Zhou and Nandita Bhaskhar will represent Apple at the WiML social.

POSTER
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation
15:00 – 17:30, #29151, Poster Session 4, Hall 3 + Hall 2B
Jiatao Gu, Shuangfei Zhai, Yuyang Wang, Qihang Zhang (The Chinese University of Hong Kong), Yizhe Zhang, Dinghuai Zhang (Mila), Navdeep Jaitly, Josh Susskind

POSTER
Do LLMs Know Internally When They Follow Instructions?
15:00 – 17:30, #28257, Poster Session 4, Hall 3 + Hall 2B
Juyeon Heo (University of Cambridge), Christina Heinze-Deml, Oussama Elachqar, Shirley Ren, Udhay Nallasamy, Andy Miller, Kwan Ho Ryan Chan (University of Pennsylvania), Jaya Narain

SOCIAL
LatinX in AI
17:00 – 18:30, Conference GHJ
Alejandro Newell and Miguel Sarabia del Castillo will represent Apple at the LatinX in AI social.

Saturday, April 26

POSTER
A Formal Framework for Understanding Length Generalization in Transformers
10:00 – 12:30, #29490, Poster Session 5, Hall 3 + Hall 2B
Xinting Huang (Saarland University), Andy Yang (University of Notre Dame), Yash Sarrof (Saarland University), Mark Rofin (Saarland University), Satwik Bhattamishra (University of Oxford), Andreas Krebs (University of Tübingen), Hattie Zhou (MILA), Preetum Nakkiran, Michael Hahn (Saarland University)

POSTER
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
10:00 – 12:30, #28659, Poster Session 5, Hall 3 + Hall 2B
Shansan Gong (HKU), Shivam Agarwal (UIUC), Yizhe Zhang, Lin Zheng (HKU), Jiacheng Ye (HKU), Mukai Li (HKU), Chenxin An (HKU), Hao Peng (UIUC), Lingpeng Kong (HKU)

POSTER
RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data
15:00 – 17:30, #28603, Poster Session 6, Hall 3 + Hall 2B
Max Xu (UIUC), Jaya Narain, Greg Darnell, Hyewon Jeong (MIT), Haraldur Hallgrimsson, Darren Forde, Richard Fineman, James M. Rehg (UIUC), Karthik Jayaraman Raghuram, Shirley Ren

SOCIAL
Queer in AI
17:00 – 18:30, Conference GHJ
Azim Yusoff, Kevin Miao, and Nate True will represent Apple at the Queer in AI social.

Sunday, April 27

Monday, April 28

Technical Demos

Visit Apple’s booth (#C03) at the Singapore EXPO to see our technical demos during exhibition hours:

DEMO
FastVLM
FastVLM is a family of mobile-friendly vision language models. These on-device models use a mix of CNN and transformer encoding techniques and are designed specifically for on-device applications such as chatbots, captioning, and image search. Together, they optimize the balance between accuracy and speed.

DEMO
Depth Pro
Depth Pro performs zero-shot monocular depth estimation from images without needing to know anything about the camera during training. It can generalize to a wide variety of images, including in-the-wild internet photos as well as low-light, text, and motion-blurred images from a smartphone. It uses a query-based architecture to offer state-of-the-art vision transformer modeling, and it works with both RGB and depth at multiple scales. Results show that Depth Pro has unmatched out-of-domain generalization and accuracy across all kinds of photos, and that it provides absolute depth cues in each local region.

Accepted Papers

A Formal Framework for Understanding Length Generalization in Transformers
Xinting Huang (Saarland University), Andy Yang (University of Notre Dame), Yash Sarrof (Saarland University), Mark Rofin (Saarland University), Satwik Bhattamishra (University of Oxford), Andreas Krebs (University of Tübingen), Hattie Zhou (MILA), Preetum Nakkiran, Michael Hahn (Saarland University)

How to Verify Any (Reasonable) Distribution Property: Computationally Sound Argument Systems for Distributions
Tal Herman (Weizmann Institute of Science), Guy Rothblum

Large-Scale Image-Caption Data in Improving Multimodal Foundation Models
Jeff Lai, Vasileios Saveris, Chen Chen, Hong-You Chen, Haotian Zhang, Bowen Zhang, Wenze Hu, Juan Lao Tebar, Zhe Gan, Peter Grasch, Meng Cao, Yinfei Yang

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
Haotian Zhang, Mingfei Gao, Zhe Gan, Philipp Dufter, Nina Wenzel, Forrest Huang, Dhruti Shah, Xianzhi Du, Bowen Zhang, Yanghao Li, Sam Dodge, Keen You, Zhen Yang, Aleksei Timofeev, Mingze Xu, Hong-You Chen, Jean-Philippe Fauconnier Biard, Jeff Lai, Haoxuan You, Zirui Wang, Afshin Dehghan, Peter Grasch, Yinfei Yang

MMEgo: Towards Building Egocentric Multimodal LLMs for Video QA
Hanrong Ye (Hong Kong University of Science and Technology (HKUST)), Haotian Zhang, Erik Daxberger, Lin Chen, Zongyu Lin (UCLA), Yanghao Li, Bowen Zhang, Haoxuan You, Jiasen Lu, Dan Xu (HKUST), Zhe Gan, Yinfei Yang

Scaling Diffusion Language Models via Adaptation from Autoregressive Models
Shansan Gong (HKU), Shivam Agarwal (UIUC), Yizhe Zhang, Lin Zheng (HKU), Jiacheng Ye (HKU), Mukai Li (HKU), Chenxin An (HKU), Hao Peng (UIUC), Lingpeng Kong (HKU)

Theory, Analysis, and Best Practices for Sigmoid Self-Attention
Jason Ramapuram, Federico Danieli, Eeshan Gunesh Dhekane, Floris Weers, Dan Busbridge, Pierre Ablin, Tatiana Likhomanenko, Jagrit Digani, Zijin Gu, Amitis Shidani, Russ Webb

Workshop Accepted Papers

Acknowledgements

Alexander Toshev and Ronan Collobert are Senior Area Chairs.

Chen Huang, Chong Wang, Eugene Ndiaye, Harsh Agrawal, Pau Rodriguez, Preetum Nakkiran, Stephan Richter, Yizhe Zhang, and Zhe Gan are Area Chairs.

Arno Blaas is a Workshop Co-Organizer, and Nicholas Apostoloff and Niv Sivakumar are Workshop Reviewers for “I Can’t Believe It’s Not Better: Challenges in Deep Learning (ICBINB) 2025”.

Agni Kumar, Andrew Szot, Arno Blaas, Barry Theobald, Bhuwan Dhingra, Devon Helm, Fartash Faghri, Hadi Pour Ansari, Haoxuan You, Huangjie Zheng, Iman Mirzadeh, Juri Minxha, Kunal Talwar, Lin Chen, Louis Bethune, Luca Zappella, Maartje ter Hoeve, Max Horton, Michael Kirchhof, Nicholas Apostoloff, Pavan Kumar Anasosalu Vasu, Philipp Krähenbühl, Pierre Ablin, Rasoul Shafipour, Raviteja Vemulapalli, Rin Metcalf Susa, Rupchen (Esther) Zhao, Santhosh Kumar Ramakrishnan, Vimal Thilak, Xavier Suau Cuadros, Xiaoming Zhao, and Zakhar Shumaylov are Reviewers.