Milad Aghajohari

I am Milad Aghajohari. I have two research directions:

I work on improving the reasoning capabilities of LLMs using RL principles. I am a co-first author of VinePPO, where we fixed a credit assignment problem that silently pervades RL tuning of LLMs. Fixing it gave us higher accuracy, faster. This was amazing work, and I continue to work in this domain.

I also work on multi-agent RL (MARL) in general-sum games, where we design algorithms for settings in which naive RL fails badly.

In the past, I also worked on likelihood models and co-authored the Riemannian Diffusion Models paper.

I have expertise in both likelihood models and RL. I think the combination of these two will be the key to AGI.

news

Dec 09, 2024	Presenting VinePPO at NeurIPS 2024 MATH-AI workshop.

selected publications

VinePPO: Unlocking RL’s Potential for LLM Reasoning Through Refined Credit Assignment

Amirhossein Kazemnejad^*, Milad Aghajohari^*, Eva Portelance, and 4 more authors

arXiv preprint arXiv:2410.01679, 2024

Riemannian Diffusion Models

Chin-Wei Huang, Milad Aghajohari, Joey Bose, and 2 more authors

NeurIPS 2022, 2022

Advantage Alignment Algorithms

Juan Agustin Duque, Milad Aghajohari, Tim Cooijmans, and 4 more authors

arXiv 2024, 2024