
Oğuz Kaan Yüksel
I am a fourth-year Ph.D. candidate at the Theory of Machine Learning Lab, EPFL, advised by Nicolas Flammarion. I focus on understanding AI from first principles, using mathematical modeling and rigorous theory to explain how and why modern learning systems work. My work spans generalization, optimization, and the structure of data, with a particular interest in language models and autoregressive processes.
Background. M.Sc. in Data Science (minor in Mathematics) and Ph.D. at EPFL. B.Sc. in Computer Engineering & Mathematics at Boğaziçi University.
Selected Publications
News
Two papers, [Incremental Learning of Sparse Attention Patterns in Transformers](/publications/yuksel2026incremental) and [Induction Heads Interpolate N-Grams](/publications/dangelo2026induction), have been accepted to ICML 2026.
Presented a talk on [Incremental Learning of Sparse Attention Patterns in Transformers](/slides/prigm-2025/) at PriGM, EurIPS 2025.
Presented two posters, [Incremental Learning of Sparse Attention Patterns in Transformers](/publications/yuksel2025incremental) and [Generalization Bounds for Autoregressive Processes and In-Context Learning](/publications/yuksel2025generalization), at PriGM, EurIPS 2025.
Presented [On the Sample Complexity of Next-Token Prediction](/publications/yuksel2025sample) at AISTATS 2025.
Presented [Long-Context Linear System Identification](/publications/yuksel2025long) at ICLR 2025.
Presented [First-order ANIL provably learns representations despite overparametrization](/publications/yuksel2024first) at ICLR 2024.
Presented [First-order ANIL provably learns representations despite overparametrization](/publications/yuksel2024first) at NeurIPS 2023.