Rishav
I’m a graduate student at Mila, broadly interested in building reliable machine learning systems that work at scale. At Mila, my research has focused on RL: mainly offline RL (explainability and adaptive regularization), real-time RL, and benchmarking SSL methods on Atari games.
My larger goal is to develop safe, interpretable, real-time, and sample-efficient agents, motivating my interests in areas like mechanistic interpretability, world models, and real-time systems.
Before Mila, I co-founded Offside (scaled to 100k users), spent ~2 years at DFKI in Germany developing real-time vision algorithms for precision farming (blog post), and earned a BEng (Thesis) in Computer Science from BITS Pilani in 2020.
Beyond Research
Outside of research, I enjoy reading about ancient civilizations, listening to classic rock, trekking, and strength training. I also write blog posts reflecting on projects and life lessons. Check them out here.
News
| Date | News |
|---|---|
| Oct 18, 2025 | I wrote a blog post while learning distributed training with JAX, covering the main points of confusion I ran into (they may well be common!). Have a look: https://rish-av.github.io/blog/2025/jax_distributed/. |
| Aug 20, 2025 | Behavior discovery and attribution for explainable RL accepted to TMLR 2025. |
| Aug 5, 2025 | I was at RLC, presenting “Behavioral Suite Analysis of Self-Supervised Learning in Atari” at the RLVG workshop. |
| Jun 20, 2025 | Our blog on Real-time RL is up on Mila, check it out: Real‑time Reinforcement Learning — Mila. |
| Mar 20, 2025 | I’ve started a series of posts on CUDA programming, with the end goal of accelerating DQN using CUDA. The very first blog post is now live: link. |
| Jan 22, 2025 | Handling delays in RL accepted at ICLR 2025. |
| Nov 15, 2024 | KD-LoRA accepted at NeurIPS ENLSP Workshop. |