What is Asthenosphere on AMD NPU?

Asthenosphere is an AI inference system running full transformer pipelines on AMD Ryzen's Phoenix XDNA NPU, hitting 3866 effective tok/s with zero CPU/GPU load.

How fast is Asthenosphere inference?

It averages 3866 effective tokens per second, at 83 ms per 64.7-token message and a 91.8% speculation acceptance rate, which is fast for an edge device.

Does Asthenosphere replace GPUs for AI?

For efficient, low-power inference


3866 Tokens/Second: Asthenosphere Unleashes AMD NPU's Full Fury

Picture this: an AMD Ryzen NPU churning out AI responses at 3866 effective tokens per second, no CPU or GPU in sight. Asthenosphere just turned your laptop into a speculative decoding beast.

DevTools Feed · Apr 03, 2026 · 4 min read

Asthenosphere performance logs showing 3866 tok/s on AMD Phoenix NPU tiles

⚡ Key Takeaways

Asthenosphere achieves 3866 effective tok/s on AMD NPU with zero CPU/GPU usage.
Full 12-tile transformer pipeline enables speculative decoding at 91.8% acceptance.
Edge AI shift incoming: NPUs like Phoenix XDNA redefine on-device inference.
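
For readers unfamiliar with the term, the acceptance figure in the takeaways refers to speculative decoding: a cheap draft model proposes several tokens, and the larger target model keeps the ones it agrees with. The source does not describe how Asthenosphere implements this, so the following is only a generic sketch of greedy verification in plain Python. Every name here (`speculative_step`, `generate`, `toy_draft`, `toy_target`) is hypothetical, and the toy models are stand-ins chosen so the example runs on its own.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function from the tokens so far to the next token.
NextToken = Callable[[List[int]], int]


def speculative_step(ctx: List[int], draft: NextToken, target: NextToken,
                     k: int) -> Tuple[List[int], int]:
    """Run one draft-then-verify round.

    Returns (new tokens, number of draft tokens the target accepted).
    """
    # The draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    for _ in range(k):
        proposed.append(draft(ctx + proposed))

    # The target model checks each proposal in order (greedy verification).
    new_tokens: List[int] = []
    for tok in proposed:
        expected = target(ctx + new_tokens)
        if tok != expected:
            # First mismatch: keep the target's token and discard the rest.
            new_tokens.append(expected)
            return new_tokens, len(new_tokens) - 1
        new_tokens.append(tok)

    # All k proposals accepted: the target's own next token is added as well.
    new_tokens.append(target(ctx + new_tokens))
    return new_tokens, k


def generate(draft: NextToken, target: NextToken, n_tokens: int,
             k: int = 4) -> Tuple[List[int], float]:
    """Generate n_tokens and report accepted / proposed draft tokens."""
    out: List[int] = []
    accepted = proposed = 0
    while len(out) < n_tokens:
        new, n_ok = speculative_step(out, draft, target, k)
        out.extend(new)
        accepted += n_ok
        proposed += k
    return out[:n_tokens], accepted / proposed


# Toy stand-ins (assumptions for illustration, not real models):
def toy_target(ctx: List[int]) -> int:
    return (len(ctx) * 7) % 10


def toy_draft(ctx: List[int]) -> int:
    guess = toy_target(ctx)
    # The draft is deliberately wrong at every sixth position.
    return (guess + 1) % 10 if len(ctx) % 6 == 5 else guess


tokens, rate = generate(toy_draft, toy_target, n_tokens=20)
print(tokens, f"acceptance={rate:.1%}")
```

The property worth noticing is that the output always matches what the target model would have produced alone; a higher acceptance rate only means fewer wasted draft tokens per round.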

Published by

DevTools Feed


#AI Inference #AMD NPU #Asthenosphere #Speculative Decoding


Originally reported by dev.to
