How Did One RTX 5090 Run Gemma at ~600 Tok/s?

A new research paper made it possible. A regular developer made it real. The implications are bigger than either piece on its own.

May 18, 2026

Hope you guys are having a good Monday. Here’s an insane letter for you…

Every breakthrough in AI looks like one moment, but it usually comes from two separate groups of people who never talk to each other.

The first group writes papers and ships open source code. They are usually employed by a research lab, they publish on arXiv, and the work is freely a…

Subscribe to The Gug Letter to keep reading this post and get 7 days of free access to the full post archives.