Milad Aghajohari
Affiliations. Mila, Quebec Artificial Intelligence Institute

I am Milad Aghajohari. I have two research directions:
I work on improving the reasoning capabilities of LLMs using RL principles. I am a co-first author of VinePPO, where we fixed the credit assignment problem that silently affects RL tuning of LLMs. Fixing this gave us higher accuracy faster. This was an amazing project, and I continue to work in this area.
I also work on multi-agent RL (MARL) in general-sum games, designing algorithms that solve these games where naive RL fails badly.
In the past, I also worked on likelihood models and wrote the Riemannian diffusion model paper.
I have expertise in both likelihood models and RL. I think the combination of these two will be the key to AGI.
news
Dec 09, 2024: Presenting VinePPO at the NeurIPS 2024 MATH-AI workshop.
selected publications
- arXiv preprint arXiv:2410.01679, 2024