The Gug Letter

How Did One RTX 5090 Run Gemma at ~600 Tok/s?

A new research paper made it possible. A regular developer made it real. The implications are bigger than either piece on its own.

Joe Guglielmucci
May 18, 2026

Hope you guys are having a good Monday. Here’s an insane letter for you…

Every breakthrough in AI looks like one moment, but it usually comes from two separate groups of people who never talk to each other.

The first group writes papers and ships open source code. They are usually employed by a research lab, they publish on arXiv, and the work is freely a…
