Gemma 4: 96 Tokens/Second on Dual RTX Cards, Fixing My Kubernetes Bugs by Lunch
96 tokens per second. That's Gemma 4 chewing through Kubernetes bug reports on my dual RTX setup. Google's open model just turned 'wait and hope' into 'deploy and debug now.'
Originally reported by dev.to