School of Mathematical Sciences Academic Seminar
A Memory Efficient Randomized Subspace Optimization Method for Training Large Language Models
Kun Yuan (袁坤)
(Peking University)
Time: Tuesday, November 11, 2025, 10:30-11:30 AM
Venue: Room E806, Shahe Campus
Abstract: The memory challenges associated with training Large Language Models (LLMs) have become a critical concern, particularly when using the Adam optimizer. To address this issue, numerous memory-efficient techniques have been proposed, with GaLore standing out as a notable example designed to reduce the memory footprint of optimizer states. However, these approaches do not alleviate the memory burden imposed by activations, rendering them unsuitable for scenarios involving long context sequences or large mini-batches. Moreover, their convergence properties are still not well understood in the literature. In this work, we introduce a Randomized Subspace Optimization framework for pre-training and fine-tuning LLMs. Our approach decomposes the high-dimensional training problem into a series of lower-dimensional subproblems. At each iteration, a random subspace is selected, and the parameters within that subspace are optimized. This structured reduction in dimensionality allows our method to simultaneously reduce memory usage for both activations and optimizer states. We establish comprehensive convergence guarantees and derive rates for various scenarios, accommodating different optimization strategies to solve the subproblems. Extensive experiments validate the superior memory and communication efficiency of our method, achieving performance comparable to GaLore and Adam.
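To make the subspace idea concrete, the following is a minimal, illustrative sketch (not the speaker's implementation): at each outer iteration a random low-dimensional subspace is drawn, the subproblem restricted to that subspace is approximately solved with a few plain gradient steps, and the result is mapped back to the full parameter space. The toy least-squares loss, the QR-based random projection, the step sizes, and the dimensions are all assumptions chosen for readability; the talk's method, its treatment of activations, and its Adam/GaLore comparisons may differ.

```python
# Illustrative sketch of randomized subspace optimization on a toy least-squares problem.
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 200, 100, 8                 # data size, full parameter dimension, subspace rank
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

def loss(w):
    return 0.5 * np.mean((A @ w - b) ** 2)

def grad(w):
    return A.T @ (A @ w - b) / m

w = np.zeros(n)
for outer in range(50):
    # Draw an orthonormal basis P (n x r) of a random r-dimensional subspace.
    P, _ = np.linalg.qr(rng.standard_normal((n, r)))
    z = np.zeros(r)                   # low-dimensional variable: optimizer state is only r-dimensional
    for inner in range(20):           # approximately solve the subproblem  min_z loss(w + P z)
        z -= 0.1 * (P.T @ grad(w + P @ z))
    w = w + P @ z                     # map the subspace solution back to the full space

print(f"final loss: {loss(w):.4f}")
```

The memory saving in this sketch comes from the inner loop operating only on the r-dimensional variable z, so any optimizer state (e.g., momentum or Adam moments, omitted here for brevity) would be stored in dimension r rather than n.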
Speaker Bio: Kun Yuan is an Assistant Professor, Researcher, and doctoral advisor at the Academy for Advanced Interdisciplinary Studies (前沿交叉研究院), Peking University, and a Boya Young Scholar of Peking University. He received his Ph.D. from the University of California, Los Angeles in 2019, and from 2019 to 2022 he was a Senior Algorithm Expert at the Alibaba DAMO Academy research center in Seattle, USA. His research focuses on distributed optimization and its applications to large models. He received the IEEE Signal Processing Society Young Author Best Paper Award in 2018. His research results have been integrated into Alibaba's MindOpt (敏迭) optimization solver and NVIDIA's official DeepStream library.
Host: Jiaxin Xie (谢家新)