
Chulhee "Charlie" Yun

 

This website is no longer maintained. Please visit my new homepage.

Hi!

My name is Charlie, and I am a postdoctoral Research Specialist in the Laboratory for Information and Decision Systems (LIDS) at the Massachusetts Institute of Technology, where I recently completed my Ph.D. Hosted by my awesome Ph.D. advisors, Prof. Ali Jadbabaie and Prof. Suvrit Sra, I work on optimization and machine learning. Before joining MIT, I was a master's student in Electrical Engineering at Stanford University, where I had the great fortune to work with Prof. John Duchi. I completed my undergraduate studies in Electrical Engineering at KAIST, South Korea.

I will be joining the KAIST Kim Jaechul Graduate School of AI as an assistant professor starting in Spring 2022. Unfortunately, I do not have any openings left for new master's and Ph.D. students for Spring and Fall 2022. In general, admissions to graduate programs at KAIST AI are determined at the department level, so I am not in a position to give any confirmation regarding admissions.

[Google Scholar]

Research Interests

  • Convergence analysis of without-replacement optimization algorithms
  • Algorithm trajectory analysis of optimization in deep learning
  • Explaining generalization in deep learning
  • Federated/distributed learning
  • Expressive power of neural networks
  • Optimization landscape of neural networks
  • Fundamental limits and lower bounds for optimization algorithms
  • … and any interesting topics in OPT/ML, including applications

Contact


News

[08/2021] After defending my Ph.D. in late July, I finally submitted my doctoral thesis!

[06/2021] Two publications to be presented at COLT 2021: “Provable Memorization via Deep Neural Networks using Sub-linear Parameters” (Main Track) and “Open Problem: Can Single-Shuffle SGD be Better than Reshuffling SGD and GD?” (Open Problems Track).

[06/2021] I decided to join KAIST Graduate School of AI as an assistant professor! 

[01/2021] Two papers accepted to ICLR 2021: “Minimum Width for Universal Approximation” (Spotlight) and “A Unifying View on Implicit Bias in Training Linear Neural Networks” (Poster).

[09/2020] Two papers accepted to NeurIPS 2020: “SGD with shuffling: optimal rates without component convexity and large epoch requirements” (Spotlight) and “O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers” (Poster).

[09/2020] Our results on the expressive power of deep and narrow networks (partly based on a recent preprint) are to be presented as a contributed talk at the DeepMath 2020 conference!

[06/2020] Started my second summer internship at Google Research, hosted by Hossein Mobahi and Shankar Krishnan.

[Older News]


Publications

* indicates alphabetical order or equal contribution.

Preprints

Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond [arXiv]
Chulhee Yun, Shashank Rajput, Suvrit Sra

Conference/Workshop Papers

Open Problem: Can Single-Shuffle SGD be Better than Reshuffling SGD and GD? [long version]
Chulhee Yun, Suvrit Sra, Ali Jadbabaie
Conference on Learning Theory (COLT) 2021

Provable Memorization via Deep Neural Networks using Sub-linear Parameters [arXiv]
Sejun Park, Jaeho Lee, Chulhee Yun, Jinwoo Shin
Conference on Learning Theory (COLT) 2021
Presented as a contributed talk at DeepMath 2020

A Unifying View on Implicit Bias in Training Linear Neural Networks [arXiv]
Chulhee Yun, Shankar Krishnan, Hossein Mobahi
International Conference on Learning Representations (ICLR) 2021
NeurIPS 2020 Workshop on Optimization for Machine Learning: OPT 2020

Minimum Width for Universal Approximation [arXiv]
Sejun Park, Chulhee Yun, Jaeho Lee, Jinwoo Shin
International Conference on Learning Representations (ICLR) 2021 (Spotlight)
Presented as a contributed talk at DeepMath 2020

SGD with shuffling: optimal rates without component convexity and large epoch requirements [arXiv]
Kwangjun Ahn*, Chulhee Yun*, Suvrit Sra
Neural Information Processing Systems (NeurIPS) 2020 (Spotlight)

O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers [arXiv]
Chulhee Yun, Yin-Wen Chang, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
Neural Information Processing Systems (NeurIPS) 2020

Low-Rank Bottleneck in Multi-head Attention Models [arXiv]
Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
International Conference on Machine Learning (ICML) 2020

Are Transformers universal approximators of sequence-to-sequence functions? [arXiv] [paper] [slides]
Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
International Conference on Learning Representations (ICLR) 2020
NeurIPS 2019 Workshop on Machine Learning with Guarantees [short paper] [poster]
NYAS Machine Learning Symposium 2020 Poster Awards – Honorable Mention

Are deep ResNets provably better than linear predictors? [arXiv] [paper] [slides] [poster]
Chulhee Yun, Suvrit Sra, Ali Jadbabaie
Neural Information Processing Systems (NeurIPS) 2019

Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity [arXiv] [paper] [slides] [poster]
Chulhee Yun, Suvrit Sra, Ali Jadbabaie
Neural Information Processing Systems (NeurIPS) 2019 (Spotlight)

Efficiently testing local optimality and escaping saddles for ReLU networks [arXiv] [paper]
Chulhee Yun, Suvrit Sra, Ali Jadbabaie
International Conference on Learning Representations (ICLR) 2019

Small nonlinearities in activation functions create bad local minima in neural networks [arXiv] [paper]
Chulhee Yun, Suvrit Sra, Ali Jadbabaie
International Conference on Learning Representations (ICLR) 2019

Minimax Bounds on Stochastic Batched Convex Optimization [paper]
John Duchi*, Feng Ruan*, Chulhee Yun*
Conference on Learning Theory (COLT) 2018

Global optimality conditions for deep neural networks [arXiv] [paper]
Chulhee Yun, Suvrit Sra, Ali Jadbabaie
International Conference on Learning Representations (ICLR) 2018
NIPS 2017 Workshop on Deep Learning: Bridging Theory and Practice [short paper]

Face detection using local hybrid patterns [paper]
Chulhee Yun, Donghoon Lee, Chang D. Yoo
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015


Talks

  • Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond. University of Wisconsin-Madison MLOPT Idea Seminar. Nov 2021.
  • Open Problem: Can Single-Shuffle SGD be Better than Reshuffling SGD and GD? COLT 2021 Open Problem Talk. Aug 2021.
  • A Unifying View on Implicit Bias in Training Linear Neural Networks. ICLR 2021 Social: ML in Korea. May 2021.
  • Towards Bridging Theory and Practice in Deep Learning & Optimization. Google PhD Tech Talk. Apr 2021.
  • Bridging Theory and Practice in Deep Learning & Optimization. KAIST Graduate School of AI Special Seminar. Apr 2021.
  • Bridging Theory and Practice in Deep Learning & Optimization. POSTECH Computer Science and Engineering & Graduate School of AI Special Seminar. Apr 2021.
  • Implicit bias in neural network optimization: a unifying approach. KAIST Stochastic Analysis & Application Research Center AI Seminar. Apr 2021.
  • SGD with shuffling: optimal rates without component convexity and large epoch requirements. NeurIPS 2020 Spotlight Talk. Dec 2020.
  • SGD with shuffling: optimal convergence rates and more. SNU CSE Seminar. Dec 2020.
  • Theory of optimization in deep learning. KAIST Graduate School of AI Fall 2020 Colloquium. Oct 2020.
  • On the optimality and memorization in deep learning. Invited Talk at Harvard CRISP Group Meeting. Mar 2020.
  • Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity. The 25th LIDS Student Conference. Jan 2020.
  • Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity. NeurIPS 2019 Spotlight Talk. Dec 2019. [slides]
  • On the Global and Local Optimality of Deep Learning. KAIST ISysE Seminar. Feb 2019.
  • Small nonlinearities in activation functions create bad local minima in neural networks. The 24th LIDS Student Conference. Feb 2019.
  • Spurious Local Minima in Neural Networks: A Critical View. The 2018 INFORMS Annual Meeting. Nov 2018.
  • Global optimality conditions for deep linear neural networks. The 23rd LIDS Student Conference. Feb 2018. [slides]

Awards

I am grateful that my studies have been supported by numerous scholarships and awards. To name a few:

  • Conference Travel Awards: ICLR 2018–20, NeurIPS 2019, COLT 2018
  • Doctoral Study Abroad Program by Korea Foundation for Advanced Studies. Sep 2016–June 2021.
  • Samsung Scholarship. Sep 2014–June 2016.

Services

  • Reviewer/Program Committee: ICLR 2019–22, ICML 2019–21, COLT 2020–21, NeurIPS 2018–20, AISTATS 2019, CDC 2018, JMLR, SIAM Journal on Mathematics of Data Science, Annals of Statistics, IEEE Transactions on Neural Networks and Learning Systems, IEEE Transactions on Information Theory
  • Co-Organizer: The 24th Annual LIDS Student Conference 2019
  • Co-Organizer: The LIDS & Stats Tea Talk Series, Fall 2019–Spring 2020

Last update: 11/24/2021
