
Machine Learning

A short note on DeepSeek v3.2 Dec 10, 2025

Attention is a differentiable lookup Oct 26, 2025

A systems level understanding of LLM inference process Oct 24, 2025

Initial results from the Reward Hacking Benchmark Jul 20, 2025

Environments are everything Jun 20, 2025

The Human Evaluator's Goodbye Jun 13, 2025

Why I'm working on reward hacking research Jun 13, 2025
