Abstract: Training machine learning models often involves solving high-dimensional stochastic optimization problems, where stochastic gradient-based algorithms are hindered by slow convergence.