Milad Aghajohari

Affiliations. Mila, Quebec Artificial Intelligence Institute


I am Milad Aghajohari. I have two research directions:

I work on improving the reasoning capabilities of LLMs using RL principles. I am a co-first author of VinePPO, where we fixed the credit assignment problem that silently affects RL tuning of LLMs. Fixing it gave us higher accuracy, faster. This was an amazing project, and I continue to work in this domain.

I also work on multi-agent RL (MARL) in general-sum games, where naive RL fails badly, and we design algorithms that solve these games.

In the past, I also worked on likelihood models and co-authored the Riemannian diffusion model paper.

I have expertise in both likelihood models and RL. I think the combination of these two will be the key to AGI.

news

Dec 09, 2024 Presenting VinePPO at the NeurIPS 2024 MATH-AI workshop.

selected publications

  1. vineppo.png
    Amirhossein Kazemnejad*, Milad Aghajohari*, Eva Portelance, and 4 more authors
    arXiv preprint arXiv:2410.01679, 2024
  2. rdm.png
    Chin-Wei Huang, Milad Aghajohari, Joey Bose, and 2 more authors
    NeurIPS, 2022
  3. adalign.png
    Juan Agustin Duque, Milad Aghajohari, Tim Cooijmans, and 4 more authors
    arXiv, 2024