Repeat After Me: Transformers are Better than State Space Models at Copying • Paper • 2402.01032 • Published Feb 1, 2024
Escaping saddle points in zeroth-order optimization: the power of two-point estimators • Paper • 2209.13555 • Published Sep 27, 2022