Academic Seminar

[40th Anniversary Academic Events] Academic Seminar No. 47: The Heavy-Tail Phenomenon in SGD

Posted: 2023-06-26 16:45

Speaker: Associate Professor Lingjiong Zhu (朱凌炯), Florida State University
Lecture time: June 29, 2023, 10:00-11:00
Lecture venue: Room 1420, Liberal Arts Building (文科楼)

School of Mathematics and Statistics Academic Seminar [2023] 047

(High-Level University Development Lecture Series, No. 818)


Title: The Heavy-Tail Phenomenon in SGD

Speaker: Associate Professor Lingjiong Zhu (朱凌炯), Florida State University

Time: June 29, 2023, 10:00-11:00

Venue: Room 1420, Liberal Arts Building (文科楼)

Abstract: Heavy-tail phenomena in stochastic gradient descent (SGD) have been reported in several empirical studies. Experimental evidence in previous works suggests a strong interplay between the heaviness of the tails and the generalization behavior of SGD. To address this empirical phenomenon theoretically, we establish novel links between the tail behavior and the generalization properties of SGD through the lens of algorithmic stability. We show that the generalization error decreases as the tails become heavier, as long as the tails are lighter than a threshold. Moreover, we investigate the origins of the heavy tails in SGD. We show that even in a simple linear regression problem with independent and identically distributed data whose distribution has finite moments of all orders, the iterates can converge to a stationary distribution that is heavy-tailed with infinite variance. We further characterize the behavior of the tails with respect to the algorithm parameters, the dimension, and the curvature. We then translate our results into insights about the behavior of SGD in deep learning. We support our theory with experiments conducted on synthetic data and on fully connected and convolutional neural networks.
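As a rough illustration of the linear regression claim in the abstract, the sketch below simulates one-dimensional SGD with batch size 1 on Gaussian data. The setup is an assumption made for illustration and is not taken from the talk: the function names, the standard Gaussian features, the noise level, and the step sizes 0.1 and 1.0 are all hypothetical choices. In this setting the error e = x - x_star obeys e_next = (1 - eta * a^2) * e + eta * a * noise, and the random multiplier has second moment 1 - 2*eta + 3*eta^2, which exceeds 1 once eta > 2/3, so a finite stationary variance is no longer possible.

```python
import numpy as np


def second_moment_of_multiplier(eta, sigma2=1.0):
    """E[(1 - eta * a^2)^2] for a ~ N(0, sigma2), using E[a^4] = 3 * sigma2^2."""
    return 1.0 - 2.0 * eta * sigma2 + 3.0 * (eta * sigma2) ** 2


def simulate_sgd_errors(eta, n_chains=20000, n_steps=200,
                        x_star=1.0, noise_std=1.0, seed=0):
    """Run independent 1-D SGD chains on least squares; return x - x_star.

    Each step draws a fresh sample (a, y) with y = a * x_star + noise,
    which mimics sampling i.i.d. data with batch size 1 (an assumption).
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_chains)
    for _ in range(n_steps):
        a = rng.standard_normal(n_chains)
        y = a * x_star + noise_std * rng.standard_normal(n_chains)
        grad = a * (a * x - y)  # gradient of 0.5 * (a * x - y)^2 in x
        x = x - eta * grad
    return x - x_star


def tail_ratio(errors, q=0.999):
    """Ratio of an extreme quantile of |error| to its median.

    Larger values indicate heavier tails; this is a crude diagnostic,
    not an estimator of the tail index.
    """
    abs_err = np.abs(errors)
    return np.quantile(abs_err, q) / np.median(abs_err)


if __name__ == "__main__":
    for eta in (0.1, 1.0):  # illustrative step sizes only
        errs = simulate_sgd_errors(eta)
        print(f"eta={eta}: E[M^2]={second_moment_of_multiplier(eta):.2f}, "
              f"tail ratio={tail_ratio(errs):.1f}")
```

The data here are Gaussian, so every moment of the data is finite, yet the larger step size should produce a far larger tail ratio than the smaller one. The quantile ratio is only a qualitative check; a proper study would estimate the tail index.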

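Speaker bio: Lingjiong Zhu received his undergraduate degree from the University of Cambridge, UK, in 2008 and his Ph.D. in mathematics from New York University, USA, in 2013. He subsequently worked at Morgan Stanley in New York and at the University of Minnesota, and joined Florida State University in 2015. His research interests include applied probability, data science, financial engineering, and operations research. He has published more than fifty papers in journals and conferences including AAP, SPA, Bernoulli, FS, ICML, IME, INFORMS JoC, JMLR, NeurIPS, OR, POM, QF, Queueing Systems, RESTAT, and SIFIN. He has received the Kurt Friedrichs Prize, the Developing Scholar Award, and the Graduate Faculty Mentor Award, and has led three U.S. national foundation grants.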

 

All interested faculty and students are welcome to attend!

Host: Nian Yao (姚念)

School of Mathematics and Statistics

June 26, 2023